TeX Pestilence

Perm url with updates: http://xahlee.org/cmaci/notation/TeX_pestilence.html

The TeX Pestilence

Xah Lee, 2004-08

The following essay is an edited version of my email messages on the harm of TeX.

Problems of TeX

TeX is detrimental because it harbors ignorance of the structural content embodied in most math notations in most fields. What TeX does is typesetting, as opposed to math expression encoding. In other words, what TeX does is pretty-printing.

The language is designed in a way that botches any structural info in a math expression. As such, even though TeX is a full-fledged computer language capable of great programing, it understands zilch of math expressions and encodes zilch of them. That is an egregious error for a computer language purporting to express mathematics, and more so because it is the product of a mathematician who should've known better.

LaTeX mended TeX by turning a pretty-printing system into a structured documentation system. However, since TeX at its core is a pretty-picture system, no amount of fixing can correct that, short of discarding it completely. LaTeX fixed nothing about TeX's botching of the structural info in math notations.

As a testament to TeX's shortcomings, TeX botches the structural info in math expressions so badly that no program whatsoever, short of human-level artificial intelligence, will be able to convert from TeX into another system of math notation.
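
A small illustration of this loss of information, as a Python sketch. The class names (`Sym`, `Add`, `Mul`, `Apply`) and the `to_tex` function are made up for this example and belong to no real library. The sketch builds two structurally different expressions, "f times (x+a)" and "f applied to (x+a)", and shows that a simple renderer produces the same TeX string for both, so the string alone cannot say which was meant.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sym:
    name: str


@dataclass(frozen=True)
class Add:
    left: object
    right: object


@dataclass(frozen=True)
class Mul:
    left: object
    right: object


@dataclass(frozen=True)
class Apply:
    func: object
    arg: object


def to_tex(e):
    """Render a structured expression as a TeX-style string (toy renderer)."""
    if isinstance(e, Sym):
        return e.name
    if isinstance(e, Add):
        return f"{to_tex(e.left)}+{to_tex(e.right)}"
    if isinstance(e, Mul):
        # Multiplication is written by juxtaposition; a sum on the right
        # gets parentheses.
        right = to_tex(e.right)
        if isinstance(e.right, Add):
            right = f"({right})"
        return f"{to_tex(e.left)}{right}"
    if isinstance(e, Apply):
        # Function application is written with parentheses too.
        return f"{to_tex(e.func)}({to_tex(e.arg)})"
    raise TypeError(f"unknown expression type: {type(e).__name__}")


x_plus_a = Add(Sym("x"), Sym("a"))
product = Mul(Sym("f"), x_plus_a)        # f times (x+a)
application = Apply(Sym("f"), x_plus_a)  # f applied to (x+a)

print(to_tex(product))
print(to_tex(application))
print(product == application)
```

Both `print(to_tex(...))` calls emit `f(x+a)`, while the two trees compare unequal. Going from tree to TeX is a few lines of code; going back requires guessing the author's intent.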

As a pretty-printing system, Mathematica is no match for TeX. As a knowledge representation system that is filled with the need of math notations old and new, Mathematica is superior to TeX/LaTeX by far in just about any aspect of this endeavor.

TeX's Damage to Society

TeX has done significant damage to the math community by getting people to focus on and love pretty-printing, which is what typesetting is mainly about. The key to seeing this is that TeX has absolutely no concept of the semantic content or structure of math expressions. As to alternative systems, as I've argued, there is Mathematica, and subsequently MathML, and I'm sure there are others based on semantic content and automatic formatting that I'm not aware of, or that were killed by TeX before they could have a chance.

• TeX is a system for typesetting. Typesetting is primarily concerned with esthetics. As an art form, typesetting is insignificant. Typesetting, apart from its role in facilitating reading, is in general of little serious utility.

• TeX, as a knowledge representation system used by scientists, has seduced a vast number of scientists into the rather wasteful activity of appearance doodling.

• Because TeX is a pure typesetting system and is not aware of any structural info embedded in math notations, it destroys this information whenever math is written in TeX. Even notational systems using plain, one-dimensional ASCII text, such as Mathematica before version 3 (~1997) or other computer algebra languages, maintain structural info. (By necessity, because they have to actually work in some sense as a math formalism system.)

• Because TeX is free, it halts the progress of competing systems and ideas. If TeX had not been invented and given out free, systems that preserve structure (e.g. MathML, Mathematica) for the purposes of displaying 2-dimensional math notations would have been invented earlier or become more widespread.

Free software acts as a virus. Free systems have the potency to wipe out any other protocol or design, including any superior ones (unless they are also free). An example is the various Unix systems and protocols, which have done huge, irreversible damage to society.

• As a computer language, the design of TeX's syntax and semantics is arguably bad. The syntax is rather inconsistent and complex, the style is imperative (as opposed to functional programing), and in general the language is difficult to use for controlling typesetting or document structure. It spreads bad ideas and experiences to non-programers (e.g. mathematicians).

Any computer language, however bad its quality, will attract users once they are exposed to it, and may become addictive, and its users will feel empowered. It's like the jigsaw puzzle in comparison to the Rubik's Cube. In the context of puzzles, the jigsaw is a complete waste of time as a game with respect to mathematical value, yet it turns people into fans. An example of this in computing is the language Perl.

The above ideas are not new; however, they are hardly ever heard. All one hears every day, everywhere, are chants of the greatness of TeX by ignorant computing geeks and mathematicians. The former know coding, the latter know abstract specialization, and both in general are ignorant of the history, psychology, and linguistic theories of knowledge presentation and symbolic systems. This must be stopped.

Downfall of TeX

I do think TeX's popularity is waning because of its shortcomings. They are:

  • Not based on a graphical user interface.
  • Not interactive. (Its usage requires “compile” cycles.)
  • Inconsistent syntax.
  • Bad language semantics. (It is difficult to use TeX as an embedded language for typesetting programing or document structure programing. Compare: Microsoft Word with Visual Basic, Emacs with elisp.)
  • A system based on appearance, not on math expression structure.

As of today, structural and semantic info is receiving greater and greater awareness, as opposed to formatting or display aspects. Witness today's thoughts on the Semantic Web, and a slew of technologies oriented toward maintaining structure, such as SGML/HTML/XML/MathML, DOM, RSS, Mathematica. Even CSS is structure-based, in that it removed formatting issues from HTML, and today this is moving to XSL for XML.

In part, TeX isn't to blame for its own fall, because it set out to do typesetting and it does that well. The thing is, the entire typesetting business itself is largely a waste of humanity's time. As communication tech progresses, typesetting as it is understood (i.e. concerns about em/en-space, ligatures, typefaces) is going to be extinct. (See: The Moronicities of Typography.)

The Structure in Math Expressions

Math notation does not always have well-defined structure or meaning. However, it is important to consider the preservation of semantics in building a math notation representation language. For example, if we have x^2+Sqrt[x], the system should know that it is an operation applied to two things, each of which is some particular operation on the symbol x. Similarly for the traditional notation of matrices, subscripts, sets, absolutes/norms, summation, derivative, integral, etc. For any new math notation that does not have a clear structure (say, some bunch of arrows pointing at symbols in some homological algebra), we can for example have a tag that marks up that part of the expression to indicate that it has no meaning and is for 2D-typeset display purposes only.

In TeX, every math expression is just a sequence of structurally meaningless but micro-position-aware symbols.
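
The x^2+Sqrt[x] example can be sketched in Python as a nested structure of a head followed by its arguments, loosely in the spirit of a computer algebra system's internal form. This is only an illustrative sketch: the tuple encoding, the head names `Plus`, `Power`, `Sqrt`, and the functions `evaluate` and `render_tex` are assumptions made for this example, not the representation used by Mathematica or MathML. The renderer is deliberately minimal and does not, for instance, parenthesize compound bases.

```python
import math

# Nested tuples: (head, arg1, arg2, ...). Leaves are symbol names or numbers.
expr = ("Plus", ("Power", "x", 2), ("Sqrt", "x"))


def evaluate(e, env):
    """Compute the numeric value of an expression given symbol values."""
    if isinstance(e, str):
        return env[e]
    if isinstance(e, (int, float)):
        return e
    head, *args = e
    vals = [evaluate(a, env) for a in args]
    if head == "Plus":
        return sum(vals)
    if head == "Power":
        return vals[0] ** vals[1]
    if head == "Sqrt":
        return math.sqrt(vals[0])
    raise ValueError(f"unknown head: {head}")


def render_tex(e):
    """Produce a TeX string from the structure (display is derived, not stored)."""
    if isinstance(e, (str, int, float)):
        return str(e)
    head, *args = e
    parts = [render_tex(a) for a in args]
    if head == "Plus":
        return "+".join(parts)
    if head == "Power":
        return f"{parts[0]}^{{{parts[1]}}}"
    if head == "Sqrt":
        return f"\\sqrt{{{parts[0]}}}"
    raise ValueError(f"unknown head: {head}")


print(expr[0])                    # the top-level operation
print(len(expr) - 1)              # how many things it is applied to
print(render_tex(expr))           # the typeset form, generated on demand
print(evaluate(expr, {"x": 4}))   # the same structure can be computed with
```

The point of the design is that the typeset form is an output of the structure. The same data answers "what is the outer operation?", can be evaluated, and can be pretty-printed, whereas the TeX string supports only the last of these.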

TeX and Microsoft Equation Editor Considered Equal

TeX, being a pretty-printing system, can be considered in the same class as Microsoft Word's Equation Editor. The difference lies only in their mode of operation. Specifically, TeX works by compile and batch operation, like a typical computer language, and the Equation Editor works by using a mouse to click menus and buttons in a graphical user interface. The heart of both, as far as math notation is concerned, is the doodling of an em space or en dash. All of the math notation's semantic structure is lost.

LaTeX vs Microsoft Word for Structured Document

Consider LaTeX as a structured documentation system. Microsoft Word's Outline feature does better, plus it has background spell checking, tab settings, embedded version control, voice annotations, and so on (provided the person knows how to properly use MS Word).

TeX's Place in the World of Typesetting

In order to evaluate TeX aside from its massive brainwashing of the math community, the question we have to ask is to what degree TeX has made an impact strictly in the typesetting and publishing community. I agree TeX as a typesetting system has made a major impact in this community; however, as a technology, it is far from taking a leading role. Consider QuarkXPress, FrameMaker, PDF: all are in a similar market and are not free. (The PDF document creation software made by Adobe is not free.)


“So you know more about this than all of us professional mathematicians who have been working with it all our lives?”

Successful and professional writers will in general not know much about linguistics or writing systems. Similarly, musicians are typically ignorant of the history and design of musical notation systems.

Likewise, although professional mathematicians have used math notation all their lives, few have studied or thought about the history of math notation, writing systems, symbolic logic systems, the syntax of computer algebra systems or theorem proving systems, the linguistics of computer languages, cognitive science, or the psychology of perception, all of which are related to math notation systems.

“In the opinion of just about everyone who knows about such things, TeX is the best markup language where precise and beautiful typography is essential AND for typesetting mathematical formulas and equations. The fact is that ALL professional mathematicians learn TeX as graduate students and write their thesis in TeX and from then on are so hooked on it that they not only write their mathematics in TeX but also usually their letters and other documents. Something written in MS Word just looks ugly by comparison, particularly if it contains formulas using the brain-dead MS equation editor. BTW, just about all physicists use TeX for the same reason.”

I'm very well acquainted with the history of TeX and its position in society, how scientists receive it, how ubiquitous it is, and its position as a standard. I read extensively about TeX a decade ago. I'm very well aware of how it compares to something like MS Word or other related WYSIWYG equation editors.

On the whole, my thesis is that TeX, although an extremely successful and well-done tool that has satisfied a niche, may in fact be a disservice to humanity in that it has massively misled people in the wrong direction. If TeX did not exist, typesetting would have remained in the realm of the professional typesetting and printing community, while mathematicians and scientists would not have spent their energy on typesetting, but instead put their focus and energy into coming up with a syntax that makes mathematics readable as well as meaningful, based on the immense modern knowledge of symbolic logic, computer algebra, and the linguistics of computer languages.

Consider typesetting for a moment. What is it? It is no more than pretty-printing. It has some element that facilitates reading, but only a bit. The bulk of its meticulousness is about appearances. Typical typesetting concerns are things like en-dash, em-dash, ligatures, small-caps, typeface design, serif, sans-serif, micro positions, kerning, etc. Typesetting is a cultural development. Even if we suppose that the esthetics of Western typesetting are universal, its esthetic value in the context of artistic endeavors is dismal. (For example, contrast it with calligraphy, painting, sculpture, etc.)

TeX, introduced the way it is designed, encroached on the symbolic language of mathematics with pretty-printing, devalued the system of symbolic communication used by mathematicians, and subtly derailed what mathematicians do best.

If TeX as it is designed had not been invented, then today we might already have an alternative system, such as the proprietary Mathematica or MathML (very much influenced by Wolfram Research Inc.), that takes into consideration that the symbolic language of mathematics is more than just pretty glyphs arranged in a special way. It has a not-well-understood but undeniable structure and relation to mathematics, to such a degree that it can influence where mathematics is going, and a design with these thoughts in mind helps us actually advance computational mathematics.


In recent years, some software has taken steps to address TeX's shortcomings. Here are examples:

  • GNU TeXmacs. A word-processor-like program that renders 2D math formulas and also functions as a front-end to several math packages such as Mathematica. TeXmacs borrows ideas from TeX and Emacs but is not dependent on them.
  • LyX. A word-processor-like front-end to LaTeX, designed for ease of use.
  • Content Oriented LaTeX (http://tug.ctan.org/tex-archive/macros/latex/contrib/cool/). A LaTeX package that tries to retain the math expression's structure.
  • XeLaTeX. A TeX engine that supports Unicode. That is, special characters such as Greek letters, Chinese characters, or text in different languages can be entered directly without using markup.


“Mathematica Notation: Past and Future” (2000-10-20), by Stephen Wolfram, at http://www.stephenwolfram.com/publications/recent/mathml/index.html.

A great article on math notation. This article should teach those coding sophomorons and idiotic authors in the computer language design community who harbor the notion that syntax is not really important, a notion picked up by all the elite i-reddit & twittering & “Hacker News” am-hip dunces.

Personally, one particularly interesting thing I learned is that, for all my trouble in the past decade expressing the problems of traditional math notation, his article supplies this single-phrase summary: “traditional math notation lacks a grammar”.

The article is somewhat disappointing though. I was expecting he'd go into some details about the science of math notations, or, as he aptly put it, the “linguistics of math notations”. However, he didn't touch the subject, except to say that it hasn't been studied.

There are some errors in his article. On this page, http://www.stephenwolfram.com/publications/recent/mathml/mathml2.html, he mentions the Plimpton 322 tablet. It is widely taught in math history books that this tablet lists Pythagorean triples. However, recent academic publications (2002) suggest that it is not a list of Pythagorean triples, but rather “a list of regular reciprocal pairs”, serving as a teacher's solutions to exercises for students. See Plimpton 322.
