Programming Language Design: Syntax Sugar Problem: Irregularity vs Convenience

Perm URL with updates: http://xahlee.info/comp/syntax_irregularity_vs_convenience.html

One of the idiocies of the HTML spec is that the “pre” tag discards the first blank line, that is, a line break that comes right after the opening tag.

For example, if you have:

<pre style="border:solid thin red">
x = 3
</pre>

Here's how your browser renders it:

x = 3

The first blank line is ignored. However, only the FIRST blank line is ignored. If you have 2 blank lines at the beginning, the result is rendered with 1 blank line:

<pre style="border:solid thin red">

x = 3
</pre>

x = 3
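The rule above can be written down as a tiny function. This is only a sketch of that one rule, not a real HTML parser; `pre_rendered_text` is a hypothetical helper name made up for illustration.

```python
def pre_rendered_text(raw: str) -> str:
    """Return the text shown for the raw content of a "pre" element.

    Models only the rule discussed here: a single line break right after
    the opening tag is dropped. Hypothetical helper, not a real parser.
    """
    if raw.startswith("\r\n"):
        return raw[2:]
    if raw.startswith("\n"):
        return raw[1:]
    return raw


print(repr(pre_rendered_text("\nx = 3\n")))    # one leading line break: dropped
print(repr(pre_rendered_text("\n\nx = 3\n")))  # two: only the first is dropped
print(repr(pre_rendered_text("x = 3\n")))      # none: unchanged
```

Note that the first and third inputs give the same result, so a program reading the rendered text cannot tell which one the author wrote.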

They do this because it's convenient for the coder: one likes to see the “pre” content aligned to the left in raw HTML.

For example, you would rather write it this way:

<pre>
1
2
3
</pre>

than

<pre>1
2
3</pre>

This is an idiocy because it mixes convenience with syntax.

The problem comes when you have programs that deal with code. That's why, in programming and computing tech, there are a hundred exceptions and irregularities, and thus bugs and headaches. The worst offender is unix shell syntax. 〔☛ Unix Shell Syntax Irregularities Galore〕

At first, syntax conveniences like these are nice. The rules are lax, and you use them without problems. But once the language grows, and you deal with many languages, you find exceptions and special rules everywhere. You can't remember which rule they thought was convenient at the time, and there is no simple systematic rule behind them. Each one becomes an ad hoc syntax soup of hell.

For an example of the bad consequences of the “pre” tag, see: CSS “pre” Problem: No Linebreak After Tag. And syntax coloring tools that color program source code in HTML have to work around the problem by wrapping “span” tags with line breaks at unnatural places. 〔☛ Emacs Lisp: Syntax Color Source Code in HTML〕
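Here is a minimal sketch of how the rule bites a program that generates HTML. It is not the span-wrapping workaround mentioned above, just a simpler illustration of the same point: the generator must know the special case. The names `naive_to_pre`, `to_pre` and `read_pre` are hypothetical, and `read_pre` models only the "drop one leading line break" rule.

```python
import html


def naive_to_pre(text: str) -> str:
    # Looks regular, but silently loses a line break at the start of text.
    return "<pre>" + html.escape(text, quote=False) + "</pre>"


def to_pre(text: str) -> str:
    # Always emit one sacrificial line break after the opening tag.
    # The reader drops it, so the text itself survives untouched.
    return "<pre>\n" + html.escape(text, quote=False) + "</pre>"


def read_pre(markup: str) -> str:
    # Simplified reader: assumes markup is exactly <pre>...</pre>.
    body = markup[len("<pre>"):-len("</pre>")]
    if body.startswith("\n"):
        body = body[1:]
    return html.unescape(body)


source = "\nx = 3\n"  # source code that begins with a blank line
print(read_pre(naive_to_pre(source)) == source)  # False, first line break lost
print(read_pre(to_pre(source)) == source)        # True
```

With a regular syntax, the naive version would be correct and there would be nothing to remember.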

Almost all languages have this problem, to various degrees. C language syntax is the worst. It has basically no design. Most of the syntax “design” is based on the user's typing convenience at the time. 〔☛ Programing: Why I Hate C〕 Even lisp didn't escape this problem. 〔☛ Programing Language: Fundamental Problems of Lisp〕

Another major problem of HTML irregularity is letting users omit ending tags. A big offender is Google, whose HTML style guide tells users to omit optional tags. The consequence is that people will omit ending tags that cannot be omitted, and we are back to syntax-soup quirks-mode hell.

How to Solve the Syntax Sugar Problem?

This problem should be solved by a clear separation of issues. For example, XML takes the regularity approach, and you can have editors, or structural editors, that present the data to the user in the most easy-to-read format. Another approach is Mathematica's, where you have a systematic syntax layer. At the bottom layer, the syntax is purely nested like XML and LISP, but without irregularities. Another layer on top supports all the syntax warts we humans have got used to, as in traditional math notation and infix notation. Yet there are simple, regular, systematic transformation rules that convert between these two layers easily.
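The two-layer idea can be sketched in a few lines. This example borrows Python's own parser (the standard `ast` module) as the convenient infix layer and prints a purely nested form as the regular layer. The head names `Plus`, `Times` and `Power` only imitate the style of Mathematica's nested notation; the mapping table and the function names are assumptions made for illustration, and only one direction of the transformation is shown.

```python
import ast

# Assumed mapping from Python operator nodes to nested-form head names.
HEADS = {ast.Add: "Plus", ast.Mult: "Times", ast.Pow: "Power"}


def full_form(node: ast.AST) -> str:
    """Render a parsed infix expression as a purely nested Head[arg, arg] form."""
    if isinstance(node, ast.Expression):
        return full_form(node.body)
    if isinstance(node, ast.BinOp) and type(node.op) in HEADS:
        head = HEADS[type(node.op)]
        return f"{head}[{full_form(node.left)}, {full_form(node.right)}]"
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Constant):
        return repr(node.value)
    raise ValueError(f"unsupported syntax: {ast.dump(node)}")


def infix_to_full_form(source: str) -> str:
    return full_form(ast.parse(source, mode="eval"))


print(infix_to_full_form("1 + 2 * x"))     # Plus[1, Times[2, x]]
print(infix_to_full_form("(a + b) ** 2"))  # Power[Plus[a, b], 2]
```

Precedence and parentheses exist only in the infix layer. The nested layer has one rule, so any program that processes it needs no special cases.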

Instead of syntax sugar, you should have a 100% regular syntax, or a layer with systematic rules, and let the editor deal with it and present the code to the user in a different layer.
