The most time-consuming part of learning a natural language or reading literary works is, more often than not, consulting a dictionary. Over the course of my studies I went through three stages: paper dictionaries, online dictionaries and Google Chrome’s search engines. The time required to look up one entry has dropped from a few minutes to a few dozen seconds, but I’m still not satisfied with this speed.
It has been about thirteen months since I moved from Vim to Emacs, which means that I now handle all text-based files in Emacs. Naturally, I looked for a dictionary package for Emacs. Unsurprisingly, the author of Ivy, Oleh Krehel (a.k.a. abo-abo), has written an easy-to-use package named define-word for this purpose. In short, on each request it parses an HTML page with regular expressions, retrieves the word’s definition and displays the result in the minibuffer. A request can be made from a selected region or from a string entered at a prompt.
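To make that parse-and-display flow concrete, here is a minimal sketch in Python. It is not define-word’s actual code (which is Emacs Lisp): the sample markup, the `definition` class name and both function names are assumptions made up for illustration, and a real dictionary page would need its own pattern.

```python
import re

# Hypothetical markup: a real dictionary page is laid out differently,
# so the pattern below would have to be adapted to it.
SAMPLE_PAGE = """
<html><body>
<h1>chat</h1>
<ul>
  <li class="definition">Small domesticated feline.</li>
  <li class="definition">Informal conversation.</li>
</ul>
</body></html>
"""

# Non-greedy capture of whatever sits inside each definition item.
DEFINITION_RE = re.compile(r'<li class="definition">(.*?)</li>', re.DOTALL)


def extract_definitions(page, limit=10):
    """Return at most `limit` definition strings found in `page`."""
    return [match.strip() for match in DEFINITION_RE.findall(page)[:limit]]


def format_for_echo(word, definitions):
    """Build the short block of text that a minibuffer-like area would show."""
    lines = [f"{word}:"]
    lines += [f"  {i}. {text}" for i, text in enumerate(definitions, 1)]
    return "\n".join(lines)


print(format_for_echo("chat", extract_definitions(SAMPLE_PAGE)))
```

Regex-based extraction like this is quick to write but brittle: it only keeps working as long as the page layout does not change.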
As I had just started working on my first ever package (org-glaux, to be precise), I read the source code and learned how easy it is to write a new parser. Of course, that is thanks to abo-abo’s design[1]. The first online dictionary I wanted to parse was Wiktionary[2], but because of its wiki nature it has complex markup and hyperlinks, so I gave up straight away. Then I moved on to a Franco-Chinese dictionary I use from time to time. It worked, but there were still some sporadic issues: for example, looking up a conjugated verb form redirects to the verb’s morphology rather than to its definition. Also, the HTML layout was barely consistent, so it was not really worth parsing for what it provides.
My thoughts then turned to the Larousse dictionary. Apart from the problem that one word may be spread over several pages, this French-French dictionary has few defects.
After adding some utilities to the source code, such as decoding of HTML special characters, the Larousse parser I wrote finally has highlighting for lexical classes, examples and etymology (see figures 3 & 4).
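For readers unfamiliar with the entity-decoding step, the following Python sketch shows the general idea using the standard library’s `html.unescape`. It is an illustration only, not the Emacs Lisp utility from the fork; the function name `clean_fragment` and the sample fragment are assumptions.

```python
import html
import re

# Matches any HTML tag so that only the text content is kept.
TAG_RE = re.compile(r"<[^>]+>")


def clean_fragment(fragment):
    """Strip tags, decode HTML entities, then collapse runs of whitespace."""
    text = TAG_RE.sub("", fragment)   # tags go first ...
    text = html.unescape(text)        # ... so a decoded "&lt;" is never taken for a tag
    return " ".join(text.split())


# Hypothetical fragment of the kind a French dictionary page might contain.
print(clean_fragment('<span class="ex">L&#39;&oelig;il &amp; le c&oelig;ur</span>'))
```

Without this step, a French entry would be shown with raw sequences such as `&oelig;` instead of the accented or ligature characters.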
If I spent more time on this, I would certainly be able to add some cool features, like the ability to search across multiple dictionaries, or even to combine it with org-drill[3] to make flashcards. It would also be good to add a few encyclopedias and synonym dictionaries in my spare time.
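One possible shape for the multi-dictionary idea is a table that pairs each service with a URL template and a parser, then queries all of them in turn. The Python sketch below is purely hypothetical: the service names, the example.org URLs, the toy parsers and the injected `fetch` function are all placeholders, and none of this exists in the fork.

```python
from typing import Callable, Dict, List, Tuple

Parser = Callable[[str], List[str]]
Fetcher = Callable[[str], str]


def parse_lines(page: str) -> List[str]:
    """Toy parser: one definition per non-empty line."""
    return [line.strip() for line in page.splitlines() if line.strip()]


def parse_semicolons(page: str) -> List[str]:
    """Toy parser: definitions separated by semicolons."""
    return [part.strip() for part in page.split(";") if part.strip()]


# Hypothetical services: names and URLs are placeholders, not real endpoints.
SERVICES: Dict[str, Tuple[str, Parser]] = {
    "dict-a": ("https://example.org/a/{word}", parse_lines),
    "dict-b": ("https://example.org/b/{word}", parse_semicolons),
}


def lookup_all(word: str, fetch: Fetcher) -> Dict[str, List[str]]:
    """Query every registered service and group the definitions by service."""
    results = {}
    for name, (url_template, parser) in SERVICES.items():
        page = fetch(url_template.format(word=word))
        results[name] = parser(page)
    return results


# Offline stand-in for an HTTP client, so the sketch runs without a network.
FAKE_PAGES = {
    "https://example.org/a/chat": "feline\nconversation\n",
    "https://example.org/b/chat": "cat; talk",
}

print(lookup_all("chat", FAKE_PAGES.__getitem__))
```

Passing the fetcher in as an argument keeps the lookup logic testable offline, and adding a dictionary comes down to adding one entry to the table.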
Looking up a word now takes barely one second of overhead[4]. I’m quite happy with the result, but of course still open to a faster solution!
You can find my fork implementing the Larousse dictionary here and test it yourself.
[3] And sync it with org-agenda by scheduling it. ↩︎
[4] It’s a matter of pressing C-c d on a word. ↩︎