Emacs Lisp Tutorial: Replacing Title Brackets to HTML Tags

Perm url with updates: http://xahlee.org/emacs/elisp_replace_title_tags.html

Xah Lee, 2010-12-26


In my blogs, I use angle brackets for book titles and article titles. For example:

• 〈The Rise of “Worse is Better”〉 (1991) ...
• 《The Unix-Hater's Handbook》 (1994) ...

When a page is a book list, I'd prefer the book names to be colored so the list is more readable. So, I want HTML tags like this:

• <span class="atlt">The Rise of “Worse is Better”</span> (1991) ...
• <span class="bktl">The Unix-Hater's Handbook</span> (1994) ...

With proper CSS, the titles will be colored, and the brackets added too. Here's what it looks like:

  • The Rise of “Worse is Better” (1991) ...
  • The Unix-Hater's Handbook (1994) ...

The CSS would look like this:

span.bktl, span.atlt {color:#006400}

(See: HTML/CSS Tutorial.)

It is very tedious to do the replacement by hand, even with the Emacs keyboard macro feature. I'd like to have the changes done by just pressing a button.


I wrote this elisp code on the spot. (It took about 15 min.)

(defun title-bracket-to-html-tag ()
  "Replace all 《...》 to <span class=\"bktl\">...</span> in current buffer.
Also, replace 〈...〉 to <span class=\"atlt\">...</span>.
The bracket 《》 is used for book titles.
The bracket 〈〉 is used for other titles (article, film, music...).

See: 〈Intro to Chinese Punctuation with Computer Language Syntax Perspectives〉
URL `http://xahlee.org/Periodic_dosage_dir/bangu/chinese_punctuation.html'"
  (interactive)
  (let (changedItems)
    (setq changedItems '())

    (save-excursion
      ;; book titles
      (goto-char (point-min))
      (while (search-forward-regexp "《\\([^》]+?\\)》" nil t)
        (setq changedItems (cons (match-string 1) changedItems))
        ;; the t means: do not change the letter case of the replacement
        (replace-match "<span class=\"bktl\">\\1</span>" t))

      ;; other titles
      (goto-char (point-min))
      (while (search-forward-regexp "〈\\([^〉]+?\\)〉" nil t)
        (setq changedItems (cons (match-string 1) changedItems))
        (replace-match "<span class=\"atlt\">\\1</span>" t)))

    ;; report the changed titles, last found first
    (with-output-to-temp-buffer "*changed items*"
      (mapc
       (lambda (myTitle)
         (princ myTitle)
         (princ "\n"))
       changedItems))))

The code is not that complex. If you know Emacs Lisp Basics, then you can understand it.

You can try the code. Put the following content in a buffer:

• 〈Defective C++〉 (2007), by Yossi Kreinin. At: yosefk.com.
• 《The Unix-Hater's handbook》 (1994), by Simson Garfinkel, Daniel Weise, Steven Strassmann, and Don Hopkins. The entire book is available at mit.edu. (ℤ local copy)
• 〈The Rise of “Worse is Better”〉 (1991), by Richard P Gabriel. At: dreamsongs.com
Richard Gabriel is a well known figure in the lisp community, the starter of what's now known as XEmacs. He's the recipient of ACM's 1998 Fellows Award and the 2004 Allen Newell Award.
〈The Rise of “Worse is Better”〉 is probably the first article that analyzed the strategy of software success from an evolutionary biology perspective.
• 〈Extreme Programming Explained〉 (2008), by Yossi Kreinin. At: yosefk.com
• 〈Java: Slow, ugly and irrelevant〉 (2001-01-08), by Simson Garfinkel. At: salon.com (ℤ local copy)
• 〈Optimization: Your Worst Enemy〉, (1999), by Joseph M Newcomer. At: flounder.com (ℤ local copy)
• 〈Will it rot my students' brains if they use Mathematica?〉 (2002-05), by Theodore W Gray. At: theodoregray.com (ℤ local copy)
Theodore is the author of Mathematica frontend. The article discusses educational math software, video games, and violence.
• 〈Go To Statement Considered Harmful〉 (1968), by Edsger W Dijkstra. Source; (ℤ local copy)
• 〈Skin Cancer〉 (2000), by Greg Knauss. At: suck.com. (ℤ Local copy)
A satire on Netscape browser and the “Skin” phenomenon.
• 〈Censorzilla〉 (2004), by Jamie Zawinski. At: jwz.org (ℤ local copy)
Jamie is a notorious programmer of XEmacs and the Netscape web browser. He has written a webpage that contains code from the Netscape browser before its Open Source release. Note the profanity-laden comments and what they say. It gives an indication of the pain and fucked-up-ness of the computing industry.
• 〈Let's Make Unix Not Suck〉 (1999), by Miguel De Icaza. At: primates.ximian.com
Miguel de Icaza is the man behind Linux's Gnome project and Mono project. This article was written in an era when unixes did not really have a desktop or any concept of a coherent development framework. It was controversial.
• 《Code Complete: A Practical Handbook of Software Construction》, by Steve C McConnell amazon.
Throw away all your Design Patterns or eXtreme Programming books. If you want a scientific book on software development analysis, read this book instead.
Steve McConnell. «an author of many software engineering textbooks including Code Complete, Rapid Development, and Software Estimation. In 1998, McConnell was named as one of the three most influential people in the software industry by Software Development Magazine, along with Bill Gates and Linus Torvalds.»

Then call “title-bracket-to-html-tag”. It will show all the changed items in a separate window. Here's the output:

Let's Make Unix Not Suck
Censorzilla
Skin Cancer
Go To Statement Considered Harmful
Will it rot my students' brains if they use Mathematica?
Optimization: Your Worst Enemy
Java: Slow, ugly and irrelevant
Extreme Programming Explained
The Rise of “Worse is Better”
The Rise of “Worse is Better”
Defective C++
Code Complete: A Practical Handbook of Software Construction
The Unix-Hater's handbook

Showing the changed items is important, because your text may have a mismatched bracket. You can glance at the output and see whether something is incorrect. This is also why keyboard macros aren't a good solution here.
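
A rough automatic check can go together with the visual one. The following is only a sketch: “my-bracket-counts-match-p” is a made-up name and is not part of the command above. It counts the opening and closing brackets in the current buffer and compares the two numbers. It does not catch brackets that are in the wrong order, so the report is still worth reading.

```elisp
;; Hypothetical helper, for illustration only.
(defun my-bracket-counts-match-p (open close)
  "Return t if strings OPEN and CLOSE occur equally often in current buffer."
  (= (how-many (regexp-quote open) (point-min) (point-max))
     (how-many (regexp-quote close) (point-min) (point-max))))
```

For example, (my-bracket-counts-match-p "《" "》") can be called before doing the replacement.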

Here's a short explanation of the code.

  • Do a repeated find replace using “while”, and “search-forward-regexp” and “replace-match”.
  • For each occurrence, also put the title into a list. (so later on we can do a report on changed items)
  • Then, use a temp buffer to print the changed item. (use “with-output-to-temp-buffer”)
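
The steps above can also be sketched as one generic helper. This is an illustration under assumptions, not the code used for this site: the name “my-wrap-bracketed-titles” and its arguments are made up for the example, and it returns the list of titles instead of printing them, so the caller decides how to report.

```elisp
;; Hypothetical helper; name and arguments are assumptions for this sketch.
(defun my-wrap-bracketed-titles (open close class)
  "Replace OPEN...CLOSE in current buffer with a span tag of CSS class CLASS.
OPEN and CLOSE are single-character strings, such as \"《\" and \"》\".
Return the list of changed titles, in buffer order."
  (let ((titles '())
        (regex (concat (regexp-quote open)
                       "\\([^" close "]+?\\)"
                       (regexp-quote close))))
    (save-excursion
      (goto-char (point-min))
      ;; step 1: repeated find and replace
      (while (re-search-forward regex nil t)
        ;; step 2: remember each title
        (push (match-string 1) titles)
        ;; insert the replacement literally, without changing letter case
        (replace-match
         (concat "<span class=\"" class "\">" (match-string 1) "</span>")
         t t)))
    ;; step 3: give the list to the caller, in the order found
    (nreverse titles)))
```

Calling (my-wrap-bracketed-titles "《" "》" "bktl") and then (my-wrap-bracketed-titles "〈" "〉" "atlt") does the same two passes as the command above.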

All the functions in this code are very basic and are frequently used for text processing jobs. You can use this function as a template to write your own.

(The book list above is from: The Tech Geekers and Software Engineering.)


CSS Compressor

This find and replace technique is useful in many situations. For example, webmasters often need to compact JavaScript or CSS code, so that the files are smaller and pages load faster. There are many libraries and tools that do this for JS and CSS. With emacs, you can have a lisp function that does simple code compacting by just find and replace. For example, here's my CSS compactor:

(defun compact-css-region ()
  "Remove unnecessary whitespaces of CSS source code in region.
CSS is Cascading Style Sheet.
WARNING: not bullet proof. Only does a simple compression by find replace."
  (interactive)
  (let (mystr p1 p2)
    (setq p1 (region-beginning))
    (setq p2 (region-end))
    (setq mystr (buffer-substring p1 p2))

    ;; collapse runs of spaces into one space
    (setq mystr (replace-regexp-pairs-in-string mystr '(["  +" " "])))
    ;; literal find/replace pairs
    ;; (assumed to be passed as a quoted list, like in the call above)
    (setq mystr (replace-pairs-in-string mystr
                                         '(["\n" ""]
                                           [" /* " "/*"]
                                           [" */ " "*/"]
                                           [" {" "{"]
                                           ["{ " "{"]
                                           ["; " ";"]
                                           [": " ":"]

                                           [";}" "}"]
                                           ["}" "}\n"])))

    (delete-region p1 p2)
    (insert mystr)))

With that, a single button press can compact your code. (See: How to Define Keyboard Shortcuts in Emacs.) You can easily modify the code so that it un-compacts, works on the whole buffer, or outputs to a new file.
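
A minimal sketch of such a key binding follows. The key F8 is only an example choice, not what is used on this site. Put the line in your emacs init file.

```elisp
;; Example key choice only; any free key will do.
(global-set-key (kbd "<f8>") 'compact-css-region)
```

After that, select some CSS text and press F8.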

The code above uses an elisp library “xfrp_find_replace_pairs.el” so that the elisp code itself is more compact and readable (See: How to Replace Multiple String Pairs in Emacs Lisp Buffer.), but the library is not necessary. You can just use the “while” loop with “search-forward-regexp” like before.
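
For those without the library, here is a rough sketch that uses only built-in functions. It works on a string with “replace-regexp-in-string” rather than a “while” loop in a buffer, which is a swapped-in variation of the same idea. The name “my-compact-css-string” is made up for this example. The pairs are the same as in compact-css-region, but they are applied one after another here, which may differ in detail from what the library does.

```elisp
;; Sketch only; the function name is an assumption for this example.
(defun my-compact-css-string (css)
  "Return string CSS with some whitespace removed, by plain find and replace.
Not bullet proof, same as `compact-css-region'."
  (let ((result (replace-regexp-in-string "  +" " " css)))
    ;; apply each literal pair, in order
    (dolist (pair '(("\n" . "")
                    (" /* " . "/*")
                    (" */ " . "*/")
                    (" {" . "{")
                    ("{ " . "{")
                    ("; " . ";")
                    (": " . ":")
                    (";}" . "}")
                    ("}" . "}\n")))
      (setq result
            (replace-regexp-in-string (regexp-quote (car pair))
                                      (cdr pair)
                                      result t t)))
    result))
```

To use it on a region, get the text with “buffer-substring”, pass it to this function, then “delete-region” and “insert” the result, the same way as in compact-css-region.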

For another application, see: Multiple Find/Replace Pairs on File with Emacs Lisp.

A different use of essentially the same find and replace technique is to turn a plain text table into an HTML table. See: Emacs Lisp: How to Write a make-html-table Command.

Emacs is beautiful!
