2010-06-09

Emacs nxml-mode Fontification Changes

Perm url with updates: http://xahlee.org/emacs/emacs_23.2_nxml_fontification_changes.html

Xah Lee, 2010-06-06

Apparently, the fontification in nxml-mode changed between emacs 23.1 and 23.2.

In emacs 23.1, all tag names have the face “function-name”, and attributes have “variable-name”.

In emacs 23.2, the fontification is more complex. Most tag names have the face “nxml-element-local-name”. Attributes have “nxml-attribute-local-name”. Attribute values have “nxml-attribute-value”. Double quotes have “nxml-attribute-value-delimiter”. The slash in “/>” is “nxml-tag-slash”. Angle brackets are “nxml-tag-delimiter”.

Normally you wouldn't notice this, but you would if you regularly use htmlize to turn a buffer into html code. Then you'll see that each face name gets turned into a span tag, and 23.2's version is much more verbose.

Here's an example. Suppose you have this xml code:

 <entry>
   <title>Tarsier</title>
   <id>tag:xahlee.org,2010-06-09:144817</id>
   <updated>2010-06-09T07:48:17-07:00</updated>
   <summary>photo</summary>
   <content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<div class="img">
<img src="../creatures/i/Tarsius_Syrichta_250px.jpg" alt="Tarsius Syrichta 250px" width="250" height="375" />
<p class="cpt">Tarsier</p>
</div>
</div>
   </content>
  <link rel="alternate" href="http://xahlee.org/creatures/tarsier.html"/>
 </entry>

Here's what the htmlized version looks like in emacs 23.2:

 <span class="nxml-tag-delimiter">&lt;</span><span class="nxml-element-local-name">entry</span><span class="nxml-tag-delimiter">&gt;</span>
   <span class="nxml-tag-delimiter">&lt;</span><span class="nxml-element-local-name">title</span><span class="nxml-tag-delimiter">&gt;</span><span class="nxml-text">Tarsier</span><span class="nxml-tag-delimiter">&lt;</span><span class="nxml-tag-slash">/</span><span class="nxml-element-local-name">title</span><span class="nxml-tag-delimiter">&gt;</span>
   <span class="nxml-tag-delimiter">&lt;</span><span class="nxml-element-local-name">id</span><span class="nxml-tag-delimiter">&gt;</span><span class="nxml-text">tag:xahlee.org,2010-06-09:144817</span><span class="nxml-tag-delimiter">&lt;</span><span class="nxml-tag-slash">/</span><span class="nxml-element-local-name">id</span><span class="nxml-tag-delimiter">&gt;</span>
   <span class="nxml-tag-delimiter">&lt;</span><span class="nxml-element-local-name">updated</span><span class="nxml-tag-delimiter">&gt;</span><span class="nxml-text">2010-06-09T07:48:17-07:00</span><span class="nxml-tag-delimiter">&lt;</span><span class="nxml-tag-slash">/</span><span class="nxml-element-local-name">updated</span><span class="nxml-tag-delimiter">&gt;</span>
   <span class="nxml-tag-delimiter">&lt;</span><span class="nxml-element-local-name">summary</span><span class="nxml-tag-delimiter">&gt;</span><span class="nxml-text">photo</span><span class="nxml-tag-delimiter">&lt;</span><span class="nxml-tag-slash">/</span><span class="nxml-element-local-name">summary</span><span class="nxml-tag-delimiter">&gt;</span>
   <span class="nxml-tag-delimiter">&lt;</span><span class="nxml-element-local-name">content</span> <span class="nxml-attribute-local-name">type</span>=<span class="nxml-attribute-value-delimiter">"</span><span class="nxml-attribute-value">xhtml</span><span class="nxml-attribute-value-delimiter">"</span><span class="nxml-tag-delimiter">&gt;</span>
<span class="nxml-tag-delimiter">&lt;</span><span class="nxml-element-local-name">div</span> <span class="nxml-namespace-attribute-xmlns">xmlns</span>=<span class="nxml-namespace-attribute-value-delimiter">"</span><span class="nxml-namespace-attribute-value">http://www.w3.org/1999/xhtml</span><span class="nxml-namespace-attribute-value-delimiter">"</span><span class="nxml-tag-delimiter">&gt;</span>
<span class="nxml-tag-delimiter">&lt;</span><span class="nxml-element-local-name">div</span> <span class="nxml-attribute-local-name">class</span>=<span class="nxml-attribute-value-delimiter">"</span><span class="nxml-attribute-value">img</span><span class="nxml-attribute-value-delimiter">"</span><span class="nxml-tag-delimiter">&gt;</span>
<span class="nxml-tag-delimiter">&lt;</span><span class="nxml-element-local-name">img</span> <span class="nxml-attribute-local-name">src</span>=<span class="nxml-attribute-value-delimiter">"</span><span class="nxml-attribute-value">../creatures/i/Tarsius_Syrichta_250px.jpg</span><span class="nxml-attribute-value-delimiter">"</span> <span class="nxml-attribute-local-name">alt</span>=<span class="nxml-attribute-value-delimiter">"</span><span class="nxml-attribute-value">Tarsius Syrichta 250px</span><span class="nxml-attribute-value-delimiter">"</span> <span class="nxml-attribute-local-name">width</span>=<span class="nxml-attribute-value-delimiter">"</span><span class="nxml-attribute-value">250</span><span class="nxml-attribute-value-delimiter">"</span> <span class="nxml-attribute-local-name">height</span>=<span class="nxml-attribute-value-delimiter">"</span><span class="nxml-attribute-value">375</span><span class="nxml-attribute-value-delimiter">"</span> <span class="nxml-tag-slash">/</span><span class="nxml-tag-delimiter">&gt;</span>
<span class="nxml-tag-delimiter">&lt;</span><span class="nxml-element-local-name">p</span> <span class="nxml-attribute-local-name">class</span>=<span class="nxml-attribute-value-delimiter">"</span><span class="nxml-attribute-value">cpt</span><span class="nxml-attribute-value-delimiter">"</span><span class="nxml-tag-delimiter">&gt;</span><span class="nxml-text">Tarsier</span><span class="nxml-tag-delimiter">&lt;</span><span class="nxml-tag-slash">/</span><span class="nxml-element-local-name">p</span><span class="nxml-tag-delimiter">&gt;</span>
<span class="nxml-tag-delimiter">&lt;</span><span class="nxml-tag-slash">/</span><span class="nxml-element-local-name">div</span><span class="nxml-tag-delimiter">&gt;</span>
<span class="nxml-tag-delimiter">&lt;</span><span class="nxml-tag-slash">/</span><span class="nxml-element-local-name">div</span><span class="nxml-tag-delimiter">&gt;</span>
   <span class="nxml-tag-delimiter">&lt;</span><span class="nxml-tag-slash">/</span><span class="nxml-element-local-name">content</span><span class="nxml-tag-delimiter">&gt;</span>
  <span class="nxml-tag-delimiter">&lt;</span><span class="nxml-element-local-name">link</span> <span class="nxml-attribute-local-name">rel</span>=<span class="nxml-attribute-value-delimiter">"</span><span class="nxml-attribute-value">alternate</span><span class="nxml-attribute-value-delimiter">"</span> <span class="nxml-attribute-local-name">href</span>=<span class="nxml-attribute-value-delimiter">"</span><span class="nxml-attribute-value">http://xahlee.org/creatures/tarsier.html</span><span class="nxml-attribute-value-delimiter">"</span><span class="nxml-tag-slash">/</span><span class="nxml-tag-delimiter">&gt;</span>
 <span class="nxml-tag-delimiter">&lt;</span><span class="nxml-tag-slash">/</span><span class="nxml-element-local-name">entry</span><span class="nxml-tag-delimiter">&gt;</span>

The original is 492 chars; the htmlized version is 5627 chars, about 11 times as long.
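As a quick check on the two sizes quoted above, here is a minimal Python sketch of the arithmetic. The function name `expansion_ratio` is made up for illustration; only the two character counts come from the example. The exact ratio works out to a little over 11.

```python
def expansion_ratio(original_chars: int, htmlized_chars: int) -> float:
    """Return how many times longer the htmlized text is than the original."""
    return htmlized_chars / original_chars


# Character counts quoted in the text for the <entry> example.
ratio = expansion_ratio(492, 5627)
print(f"{ratio:.1f}x")  # prints 11.4x
```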

Every left angle bracket “<” becomes:

<span class="nxml-tag-delimiter">&lt;</span>

Right angle bracket “>” becomes:

<span class="nxml-tag-delimiter">&gt;</span>

Every double quote becomes:

<span class="nxml-attribute-value-delimiter">"</span>

And the slash “/” in closing and self-closing tags becomes:

<span class="nxml-tag-slash">/</span>

In emacs 23.1, the angle brackets are simply “&lt;” and “&gt;”, and the slash and double quote are left as is.
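If the extra spans around punctuation bother you, one possible workaround is to post-process the htmlize output and unwrap them. The following is only a sketch under assumptions, not something htmlize or nxml-mode provides. The function name `strip_punctuation_spans` is hypothetical, the class names are the ones that appear in the 23.2 output above, and it assumes the spans are written exactly in the form `<span class="NAME">…</span>` with no nesting.

```python
import re

# Class names copied from the emacs 23.2 htmlize output shown earlier.
PUNCTUATION_CLASSES = (
    "nxml-tag-delimiter",
    "nxml-tag-slash",
    "nxml-attribute-value-delimiter",
    "nxml-namespace-attribute-value-delimiter",
)

# Matches one punctuation span and captures its content (e.g. "&lt;" or "/").
_PUNCTUATION_SPAN = re.compile(
    r'<span class="(?:%s)">(.*?)</span>'
    % "|".join(re.escape(name) for name in PUNCTUATION_CLASSES)
)


def strip_punctuation_spans(html: str) -> str:
    """Replace each punctuation span with its bare content.

    Spans with other classes (element names, attribute names, values,
    text) are left untouched.
    """
    return _PUNCTUATION_SPAN.sub(r"\1", html)


sample = (
    '<span class="nxml-tag-delimiter">&lt;</span>'
    '<span class="nxml-tag-slash">/</span>'
    '<span class="nxml-element-local-name">p</span>'
    '<span class="nxml-tag-delimiter">&gt;</span>'
)
print(strip_punctuation_spans(sample))
# prints &lt;/<span class="nxml-element-local-name">p</span>&gt;
```

This only brings the punctuation back to roughly what 23.1 produced. The element, attribute, and value spans keep their new 23.2 class names, so any css written for “function-name” and “variable-name” would still need updating.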


So, this means that if you have a lot of xml tutorial web pages showing xml code in color, they'll become huge.

Overall, though, I'm not sure this is a problem. The fontification in emacs simply improved by being more exact and elaborate. When you turn buffer fontification into html/css code, you naturally get this huge number of span tags, and that's just the design of html. The web today has video streaming all over, so a few extra kilobytes is normal, and parsing them takes the browser no more time than, say, loading a single image file.

If you look at the html source code of any of the top sites, most have not been human-readable since about the mid-2000s.