What's the Most Readable Computer Language Syntax?

Perm url with updates: http://xahlee.org/comp/whats_most_readable_comp_lang_syntax.html

Xah Lee, 2011-01-05

On 2010-12-31, girosenth <girose...@india.com> wrote:

How to improve the readability of (any) LISP or any highlevel
functional language to the level of FORTH ?

There are many people who have trivia complaints about parens in lisp,
but I dont.

LISP is a prefix notation.

Please don't call it “prefix”, because that's misleading. Calling it “nested notation” would be practical and not incorrect.

When you say “prefix” or “postfix” notation, the term implies the use of operators, with an implied ordering between each operator and its adjacent symbols (operands).

Lisp syntax does not use operators, or rather, it primarily uses just one single operator (the match-fix parenthesis). And for a match-fix operator, the word “prefix” is misleading, because that word is primarily for operators used in a linear (non-nested) way.

For example, * + (2 4) 3 is a form of prefix syntax, but it has very different properties from lisp's “prefix” (* (+ 2 4) 3). You might also call the traditional math notation f(x,y) prefix.
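
To illustrate the difference, here is a rough Python sketch of two evaluators. It rests on assumptions of my own: the linear form is written with no brackets at all, as the token list `* + 2 4 3`, and every operator in it is strictly binary; the nested form is modeled as Python lists. The names `eval_linear_prefix` and `eval_nested` are made up for this illustration.

```python
import operator
from functools import reduce

# Assumption: only these two operators, for illustration.
BINARY = {"+": operator.add, "*": operator.mul}

def eval_linear_prefix(tokens):
    """Evaluate a flat prefix token list such as ['*', '+', 2, 4, 3].

    Grouping is recovered purely from the fixed arity (2) of each operator.
    """
    def parse(pos):
        tok = tokens[pos]
        if tok in BINARY:
            left, pos = parse(pos + 1)   # first operand follows the operator
            right, pos = parse(pos)      # second operand follows the first
            return BINARY[tok](left, right), pos
        return tok, pos + 1              # a plain number
    value, end = parse(0)
    if end != len(tokens):
        raise ValueError("leftover tokens after a complete expression")
    return value

def eval_nested(expr):
    """Evaluate a nested list such as ['*', ['+', 2, 4], 3].

    Grouping is explicit in the brackets, so an operator may take
    any number of arguments (one or more).
    """
    if not isinstance(expr, list):
        return expr
    head, *args = expr
    return reduce(BINARY[head], (eval_nested(a) for a in args))
```

In the linear form the brackets can be dropped only because the arity is fixed. In the nested form the brackets carry the grouping, so something like `['+', 1, 2, 3, 4]` is fine.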

See: The Concepts and Confusions of Prefix, Infix, Postfix and Fully Nested Notations.

sequence of operations would look like this on operands (ops) :

(f ops (g ops (h ops (j ops (k ops (l ops ))...))))

How do you make it readable ?
How do you home to the center or centers ?

(f (g (h (j (k (l ops)))...)))

is easy to read or

ops l k j h g f

???

Which is linear reading from L->R ? LISP or FORTH ?

AND, if I must break the nested function structure, will I not be
visiting the forbidden territory of imperative programming ?

(setq L (l ops))
(setq K (k L  ))
....
....
(setq F (f G  ))

If I use setq, I am using globals, atleast in elisp.

If I use let*, I have limited options as I am constrained inside the
rigid structure of let*

(let*
  ((L (l ops))
   (K (k L  ))
   ....
   (F (f G  )))

some more
)

Is there a postfix functional language that also gets rid of parens
and is not as primitive as FORTH or POSTSCRIPT ?

You might get some tips about this from this article: What's Point-free Programing? (point-free function syntax).

Lisp and XML both use match-fix operators, which often result in deep nesting that is hard to read, inconvenient to type, and prone to collapse if a single bracket is missing. So, recently I've had some thoughts about a syntax that does not use any match-fix operators (nesting of symbols) whatsoever. For example, in Haskell and OCaml you don't need parens around a function's arguments, and that form makes function sequencing quite convenient, similar to the unix shell's pipe. But my current conclusion is that:

  • Strictly no nesting whatsoever may not be a desired property.
  • When done to a large extent yet not 100% (e.g. APL and derivatives), you sacrifice several advantages in syntax and also some semantic possibilities in the lang. (e.g. you can't literally represent a tree data structure, or other inherently nested structures such as a matrix. Humm, actually you can do like python, using indentation for level.)
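
The function sequencing mentioned above can be sketched in Python. This is only an illustration: `pipe` and the three step functions are hypothetical names, standing in for the l, k, j of the question.

```python
from functools import reduce

def pipe(value, *functions):
    """Apply functions left to right: pipe(x, f, g) is g(f(x))."""
    return reduce(lambda acc, fn: fn(acc), functions, value)

# Hypothetical steps, standing in for l, k, j.
def double_all(xs):
    return [x * 2 for x in xs]

def add_one(xs):
    return [x + 1 for x in xs]

def total(xs):
    return sum(xs)

ops = [1, 2, 3]

# Nested style: read from the inside out.
nested = total(add_one(double_all(ops)))

# Sequenced style: read left to right, no nesting of calls.
linear = pipe(ops, double_all, add_one, total)
```

Both forms compute the same value. The second one reads in the order the operations happen, which is the property the question is asking for.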

Also note that the reduction or elimination of match-fix operators in so-called stack-based langs such as Forth does not really qualify as a syntactical solution. It gets rid of match-fix by a semantic solution (i.e. there's implicit hiding of arguments; you have this non-syntax concept of pushing them onto a “stack”). Another way to view this: when we look at Forth (which I'm not familiar with), or the HP-28S Advanced Scientific Calculator's language (which I am familiar with), or so-called “reverse polish notation” (RPN), the RPN is not a complete syntax system in its own right, but relies on a language system... (Not sure if anyone sees what I mean here. I need to do a lot more thinking to express this in some “formal” way, so as to clearly indicate the property that's categorically different.)
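
To make the “implicit stack” point a bit more concrete, here is a minimal RPN evaluator sketch in Python. The name `eval_rpn` and the operator table are my own assumptions for illustration, not the behavior of any real Forth or of the HP-28S.

```python
import operator

# Assumption: four binary operators only.
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}

def eval_rpn(source):
    """Evaluate a whitespace-separated RPN string such as '2 4 + 3 *'."""
    stack = []
    for tok in source.split():
        if tok in OPS:
            # The operands are not named in the source text at all;
            # they are whatever happens to be on top of the stack.
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[tok](left, right))
        else:
            stack.append(float(tok))
    if len(stack) != 1:
        raise ValueError("expression did not reduce to a single value")
    return stack[0]
```

The source text `2 4 + 3 *` contains no brackets. Which operands belong to which operator is decided by the stack discipline in the evaluator, not by anything visible in the notation.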

What are the syntax advantages of ERLANG, ML, CAML, OCAML, HASKELL, PROLOG, RUBY over LISP?

That's a loaded question of course.

But my own pet peeve is that there is to date no major general purpose comp lang that actually has a lexical grammar. Awareness of the concept of a grammar for syntax is rare among programmers. Each lang basically creates a bunch of ad hoc syntax that is neither consistent nor well defined. (What we have are just tools that help define and parse.)

The only close exception is XML, but of course it is just a markup lang.

How does one improve readability so that the code is self-commenting?

It's a common mistake among programmers to think that a certain lang's syntax is so clear that it is “self-documenting”. For example, lisp coders said it in the 1970s or earlier, when sexp and the idea of a lang that reflects math directly were new. Mathematica literature said it in the 1990s, because pattern matching is a good fit for symbolic manipulation. Haskellers have said it because they think Haskell code is a direct mirror of traditional math notation for functions (which is quite laughable when compared to Mathematica). And Ruby coders said it because they feel the syntax mirrors programming algorithms so clearly and concisely.

Perl mongers, to various degrees, also think that of their lang, because idiomatic Perl allows the omission of many syntactical details and is so malleable that they think it reflects the way humans use natural languages like English.

So, sometimes in discussion, someone shows you a line of code without any comment. You are perplexed at what the code does, but when you ask, you find that they are genuinely surprised, because they think the code's meaning is so plain and obvious that any additional explanation would actually complicate it.

Part of all these feelings is due to the fact that when you are familiar with a lang, it becomes part of your thinking, a written language to express your thoughts. Especially so if you don't know much of other langs. You are too familiar with the lang to realize that different languages have always been a barrier to communication. The more expert you are with a lang, and the fewer other langs you actually work with in depth, the more likely you are to forget that different langs are just different. What's obvious to you, even just a short line, is really just a string of gibberish symbols mixed together in a weird way to another lang's user.

So, your question “How does one improve readability so that the code is self-commenting?” has many answers, depending on what you really want. Comp lang syntax readability is a subject in the fields of psychology, linguistics, and cognition. Programmers in comp lang forums know nothing of it, and what you read there is usually utter refuse. But to take a general programming practitioner's point of view: Python, for example, is considered very readable, and the primary reason is actually just code formatting (i.e. short lines, all neatly indented). And the reason python code is well formatted is that the formatting is worked into the language's syntax.

To take a completely different perspective, there's Mathematica. For example, what would you think if comp lang source code could be like traditional math notation, the so-called 2-dimensional notation? e.g. you have x^3 with the 3 raised and smaller and without the extra symbol “^”, you have 1/2 with 1 on top of a bar and 2 below the bar, etc. And when the expression gets complex, e.g.

(-b + Sqrt[b^2-4 a c])/(2 a)

it becomes much easier to read in traditional math notation. Mathematica's syntax system is such that it can display the source code in 2-dimensional notation automatically, if you want. You can see some examples here, also in PDF format: Math Typesetting, Mathematica, MathML.

In lisp style, the same expression looks like this:

(/ (+ (- b) (sqrt (+ (^ b 2) (- (* 4 a c))))) (* 2 a))

Is it readable? Many lispers insist that it's the most readable syntax.
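
Whatever one thinks of its readability, the nested form does map directly onto a tree data structure. Here is a small Python sketch of that, modeling the lisp expression above as nested tuples. The names `EXPR`, `to_sexp`, and `evaluate`, and the small operator set, are assumptions made for this illustration only.

```python
import math

# The lisp expression above, as a nested tuple (a literal tree).
EXPR = ("/",
        ("+", ("-", "b"),
              ("sqrt", ("+", ("^", "b", 2), ("-", ("*", 4, "a", "c"))))),
        ("*", 2, "a"))

def to_sexp(e):
    """Print the tree back out in lisp style."""
    if isinstance(e, tuple):
        return "(" + " ".join(to_sexp(x) for x in e) + ")"
    return str(e)

def evaluate(e, env):
    """Evaluate the tree, looking up variable names in env."""
    if isinstance(e, str):
        return env[e]
    if not isinstance(e, tuple):
        return e
    head, *args = e
    vals = [evaluate(a, env) for a in args]
    if head == "sqrt":
        return math.sqrt(vals[0])
    if head == "-":
        # One argument means negation, as in lisp.
        return -vals[0] if len(vals) == 1 else vals[0] - sum(vals[1:])
    if head == "+":
        return sum(vals)
    if head == "*":
        return math.prod(vals)
    if head == "/":
        return vals[0] / vals[1]
    if head == "^":
        return vals[0] ** vals[1]
    raise ValueError("unknown operator: " + str(head))
```

For instance, `evaluate(EXPR, {"a": 1, "b": -3, "c": 2})` gives one root of x^2 - 3x + 2, and `to_sexp(EXPR)` reproduces the lisp text.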

See also: What's Function, What's Operator?.
