Emacs Lisp: Chinese Character Reference Linkify

Perm url with updates: http://xahlee.org/emacs/elisp_chinese_char_linkify.html

Xah Lee, 2011-07-27

Another quick elisp that enhances my productivity 10-fold!


I write many blogs. In one of them, i often need to create links to several dictionaries. For example, if the cursor is on the Chinese char 魂, then on pressing a button i want it to become this:

<div class="cdict">
<a href="http://en.wiktionary.org/wiki/魂">Wiktionary 魂</a> ◇
<a href="http://translate.google.com/#zh-CN|en|魂">Google 魂</a> ◇
<a href="http://www.chineseetymology.org/CharacterEtymology.aspx?submitButton1=Etymology&amp;characterInput=魂">history 魂</a>
</div>

Which appears in the browser like this:

Wiktionary 魂 ◇ Google 魂 ◇ history 魂

I've already written ~20 elisp functions like this over the past years. (See: Emacs Lisp Power! Transform Text Under Cursor.) So, today's problem is simply a 5-minute job of copy-pasting.

Here's the code for today's case:

(defun chinese-linkify ()
  "Make the current Chinese character into several Chinese dictionary links.
If there's a text selection, use that for input."
  (interactive)
  (let (ξchar p1 p2 templateStr resultStr)

    ;; Use the text selection if there is one, else the char after the cursor.
    (if (region-active-p)
        (progn
          (setq p1 (region-beginning) )
          (setq p2 (region-end) ) )
      (progn
        (setq p1 (point) )
        (setq p2 (1+ (point)) ) ) )

    (setq ξchar (buffer-substring-no-properties p1 p2))

    ;; 獵 is only a placeholder. It gets replaced by the actual char below.
    (setq templateStr
          "<div class=\"cdict\">
<a href=\"http://en.wiktionary.org/wiki/獵\">Wiktionary 獵</a> ◇
<a href=\"http://translate.google.com/#zh-CN|en|獵\">Google 獵</a> ◇
<a href=\"http://www.chineseetymology.org/CharacterEtymology.aspx?submitButton1=Etymology&amp;characterInput=獵\">history 獵</a>
</div>
" )

    (setq resultStr (replace-regexp-in-string "獵" ξchar templateStr))

    (delete-region p1 p2)
    (insert resultStr) ))

The code is pretty simple. If you are not familiar, see: Emacs Lisp Idioms (for writing interactive commands).

This is truly a time saver. Before, i had to go to the website, type in the Chinese char to search, copy the url, go back to emacs and paste, then call an emacs function to make the link. Do this for 3 other sites. Now, i just press one button in emacs.
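The "one button" part needs a key binding. A minimal sketch, assuming the F8 key is free in your setup (the key choice is an assumption, not something the post specifies):

```elisp
;; Assumption: <f8> is not used for anything else. Any key will do.
;; Binding to the symbol works even if the command is defined later.
(global-set-key (kbd "<f8>") 'chinese-linkify)
```

Put the line in your emacs init file after the function definition, then press F8 with the cursor on a Chinese char.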

The added advantage is that the links will all have a consistent form. So if later on i want to add one more dictionary to my dictionary list, or use a different HTML tag/format, i can use a truly regular regular expression in find & replace to do it site-wide.
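If adding dictionaries becomes frequent, one possible variation is to keep the sites in a list and build the HTML from it, so a new dictionary is one new line. This is only a sketch, not the code used on this site. The names my-cdict-sites, my-cdict-html and my-cdict-linkify are hypothetical.

```elisp
;; Hypothetical names throughout. Each entry is (LABEL . URL-FORMAT),
;; where %s in URL-FORMAT stands for the Chinese char.
(defvar my-cdict-sites
  '(("Wiktionary" . "http://en.wiktionary.org/wiki/%s")
    ("Google" . "http://translate.google.com/#zh-CN|en|%s")
    ("history" . "http://www.chineseetymology.org/CharacterEtymology.aspx?submitButton1=Etymology&amp;characterInput=%s"))
  "Dictionary sites to link to, as (LABEL . URL-FORMAT) pairs.")

(defun my-cdict-html (char)
  "Return the HTML block of dictionary links for the string CHAR."
  (concat "<div class=\"cdict\">\n"
          (mapconcat
           (lambda (site)
             ;; One <a> element per site: the URL, then \"LABEL CHAR\" as text.
             (format "<a href=\"%s\">%s %s</a>"
                     (format (cdr site) char)
                     (car site)
                     char))
           my-cdict-sites
           " ◇\n")
          "\n</div>\n"))

(defun my-cdict-linkify ()
  "Replace the char after the cursor (or the text selection) by dictionary links."
  (interactive)
  (let* ((p1 (if (use-region-p) (region-beginning) (point)))
         (p2 (if (use-region-p) (region-end) (1+ (point))))
         (char (buffer-substring-no-properties p1 p2)))
    (delete-region p1 p2)
    (insert (my-cdict-html char))))
```

To try it without touching a real file, evaluate (with-temp-buffer (insert "魂") (goto-char (point-min)) (my-cdict-linkify) (buffer-string)). The design choice here is to separate the string building from the buffer editing, so the first part can be tested alone.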

Note: technically, by spec, you should not have Chinese chars in a URL. They should be URL encoded (See: URL Percent Encoding and Unicode), using the bytes of the char's UTF-8 encoding. For example, the UTF-8 encoding of the char 魂 is 3 bytes, in hexadecimal: E9 AD 82. So, this url:

http://en.wiktionary.org/wiki/魂

should be:

http://en.wiktionary.org/wiki/%E9%AD%82

However, i think the situation of percent encoding is an abomination. (See: Problems of Symbol Congestion in Computer Languages (ASCII Jam; Unicode; Fortress).) I decided to not botch my Chinese chars in URLs. Today's browsers will automatically do the encoding for you when the user clicks on the link.
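For anyone who does want the encoded form, emacs can compute it. A small sketch, using url-hexify-string from the url-util library that comes with emacs. The helper name my-utf8-hex-bytes is hypothetical, and the letter case of the hex digits may differ between emacs versions.

```elisp
(require 'url-util) ; provides `url-hexify-string'

(defun my-utf8-hex-bytes (str)
  "Return the UTF-8 bytes of STR as a list of 2-digit hexadecimal strings.
Hypothetical helper, for illustration only."
  (mapcar (lambda (byte) (format "%02X" byte))
          (encode-coding-string str 'utf-8)))

;; The 3 bytes mentioned above.
(my-utf8-hex-bytes "魂") ; ("E9" "AD" "82")

;; Percent-encode only the Chinese char, not the whole URL,
;; because `url-hexify-string' would also encode the : and / chars.
(concat "http://en.wiktionary.org/wiki/" (url-hexify-string "魂"))
```

The last form returns the url with 魂 replaced by its 3 percent-encoded bytes.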

The weird ξ you see in my elisp code is the Greek letter xi. I use a unicode char in variable names for experimental purposes. You can just ignore it. (See: Programing Style: Variable Naming: English Words Considered Harmful.)

Most of the time my work in emacs is about HTML. So, the vast majority of my elisp examples are about processing HTML. Though, i think not that many people deal with raw HTML these days. I think most of you who use emacs do C, C++, PHP, Python, etc in your day jobs. I've been thinking of writing some elisp tutorials for the needs of working with these langs, but since i don't work with them that much these days, it's hard to come up with examples. What are some applications of interactive elisp code that might work for you? I'd love to hear about them.
