The Moronicities of Typography

Perm url with updates: http://xahlee.org/Periodic_dosage_dir/bangu/typography.html

Xah Lee, 2007-10, 2010-08-13

This article discusses some issues in typography, especially those related to dashes and quotation marks.

I've had some interest in typography since the early 1990s, the Mac's desktop publishing era. Basically, i avidly read books about fontography in libraries and Mac magazines such as MacUser and Macworld, played with fonts and math typesetting in software such as Microsoft Word and Mathematica, read Knuth's book on typography and used his TeX system, and read about font technology such as TrueType. So, i am generally acquainted with the concepts and issues of typography, though i have never worked in any professional area related to it.

I'll have to say, the entire typographical effort and establishment is largely a waste of time, in the same sense that some “artistic” circles chalk up photography as high art, or that grammarians and pedants have voluminous and vociferous writing style guides and guilds.

Some of the most fartful things the typography-sensitive crowd discuss or distinguish are: hyphen, en-dash, em-dash, ligature, kerning, font “design”.

In general, the function of typography is mainly about issues in printing with respect to the facilitation of reading. So, the major issues involved are: line length, line spacing, serif and sans serif fonts, margins, and font sizes, and that is pretty much it. But since how things are rendered on paper does create differences in esthetics, sometimes rather pronounced ones, typography does indeed have some esthetic elements. However, this is blown out of proportion to stupendous profundity.

Hyphen and Dashes

Look at these guilded morons go:

Traditionally an em dash—like so—or spaced em dash — like so — has been used for a dash in running text. The Elements of Typographic Style recommends the more concise spaced en dash – like so – and argues that the length and visual magnitude of an em dash "belongs to the padded and corseted aesthetic of Victorian typography". The spaced en dash is also the house style for certain major publishers (Penguin, Cambridge University Press, and Routledge among them). However, some longstanding typographical guides such as The Chicago Manual of Style still recommend unspaced em dashes for this purpose. The Oxford Guide to Style (2002, section 5.10.10) acknowledges that ...

The above is from the Wikipedia article “Dash”.

Here's my own rule regarding the use of the dash: There are 2 kinds: the short dash and the long dash. For the short one, press the “-” key on your keyboard. For the long one — as a punctuation mark for embedded thought — press it twice. That's it. Simple and functional. (Personally, in my writings published on my site, i replace the double dash by an em-dash “—” only because it is prettier, but i don't consider it important.)
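As a rough sketch of that last step (replacing the typed double dash with the prettier glyph), something like the following would do. The function name is made up for illustration, and the sketch assumes that every "--" in the text is meant as a long dash.

```python
EM_DASH = "\u2014"  # written as an escape so this source stays plain ASCII

def prettify_dashes(text: str) -> str:
    """Replace each typed double hyphen with an em dash; single hyphens are untouched."""
    return text.replace("--", EM_DASH)

sample = "A well-known rule -- simple and functional -- is enough."
print(prettify_dashes(sample))
```

Note that a run of three or more hyphens (for example a plain-text horizontal rule) would also be touched, so a real converter would need to decide what to do with those.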

The character “-” you type on your keyboard is ASCII 45. The character is named “hyphen” in the ASCII standard, but “hyphen-minus” in Unicode, because Unicode now has proper code points for hyphen, figure-dash, en-dash, em-dash, (math) minus, and quite a few others.

As to the typographer's senses and sensibilities about how the figure-dash should be used for numbers, the en-dash for ranges, the em-dash for punctuation, the hyphen for word-breaking ... etc, i regard them pretty much all as trifles produced by morons whose brains are inadequate to sense or tackle the depth of logic and mathematics of languages and structures, but who fell into a niche of diddling and went on to promote their efforts to heighten themselves among human animals.
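The official Unicode names of these characters can be looked up with Python's standard unicodedata module. This is a small sketch; the list of code points is my own selection of the characters mentioned above, and the helper name is hypothetical.

```python
import unicodedata

# Code points for the characters mentioned above: hyphen-minus, hyphen,
# figure dash, en dash, em dash, and the math minus sign.
DASH_CODE_POINTS = [0x002D, 0x2010, 0x2012, 0x2013, 0x2014, 0x2212]

def describe(code_point: int) -> str:
    """Return 'U+XXXX (decimal) NAME' for one code point."""
    name = unicodedata.name(chr(code_point))
    return f"U+{code_point:04X} ({code_point}) {name}"

for cp in DASH_CODE_POINTS:
    print(describe(cp))
```

The first line printed is "U+002D (45) HYPHEN-MINUS", which is the key on the keyboard.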

Hyphen and Narrow Columns

As for the hyphen, as in “breaking a word near the margin”, my general advice is to abolish the practice. But what to do in a narrow column of text? My general advice is to abolish the practice of layout using very narrow columns. A related concept here is typographical justification. My general advice here is to abolish the practice of justification entirely. (Leave it jagged at one end; that is actually esthetically superior, and factually functionally superior with regard to reading-facilitation.)

The typographic conventions of ligatures (as in adjoining certain letter combinations such as “fi”) should also be abolished.

Quotation Marks

Related here is the quotation mark. If you read Wikipedia Quotation mark, non-English usage, you'll see that there are huge variations. Here are some sample characters used for quotations and their Unicode names.

  • left/right-pointing double angle quotation mark: «»
  • single left/right-pointing angle quotation mark: ‹›
  • left/right double quotation mark: “”
  • left/right single quotation mark: ‘’
  • left/right white lenticular bracket: 〖〗
  • left/right black lenticular bracket: 【】
  • left/right corner bracket: 「」
  • left/right white corner bracket: 『』
  • left/right angle bracket: 〈〉
  • left/right double angle bracket: 《》
  • double high-reversed-9 quotation mark: ‟
  • double low-9 quotation mark: „
  • single low-9 quotation mark: ‚

Here's a list of conventions of using the double curly quotes:

  • „…“ German, Bulgarian, Croatian, Czech, Estonian, Icelandic, Lithuanian, Romanian, Serbian, Slovak, Slovene, Sorbian...
  • “…” English, American, Irish ...
  • „…” Dutch, Polish, Hungarian, ...
  • ”…” Swedish, Finnish ...

Ain't it bizarre?

For some languages, such as Chinese, it is rational how it developed into using symbols that are different from European languages' curly quotation marks (e.g. 『』「」《》〈〉【】〖〗〔〕). However, among European langs, there is extreme diversity in the use of the curly quotation marks. Even the Americans and the English reverse the purpose of the single and double quotes. Some langs reverse the semantics of the left/right pair, some position the mark at the bottom instead of the top, some place them in opposite corners (as opposed to both on top), and some use the same symbol on both sides to enclose the quoted text.

One interesting thing about the curly double quotation mark pair is that the two symbols are not bilaterally symmetric, but rotationally symmetric. That is, if you rotate the left one 180 degrees, you get the right one. Most other matching pair chars “([{«〖《” are bilaterally symmetric (i.e. there is a horizontal line of mirror reflection). The fact that the curly quotes have rotational symmetry only must have contributed significantly to the weird diversity in their roles as opening/closing marks and in whether to position them level or in facing corners. (Note that the Chinese brackets 「」『』 also lack bilateral symmetry; however, their box-corner shape intuitively and uniquely defines their placement.)

Combinatorial Possibilities

This glyph “ (Unicode 8220) points upper-right. It can be mirrored across a vertical line or a horizontal line to create matching variations, for a total of 4 possibilities (think of p q b d).

Here are the different pointing curly quotes from Unicode: “ ” ‟

In Unicode, i couldn't find one that points upper-left. This is somewhat curious. I created an image of one just for illustration: [image: double curly quote upleft.png]

The quotation mark can be placed on the upper baseline of the text (as in English convention) or lower baseline (as in the beginning quotation mark in German convention), a total of 2 possibilities.

So, 4 choices of glyph orientation and 2 possible positions make 8 possibilities for the opening quote. Same for the closing quote. So, the total number of possible quotation conventions using the double curly quote is 8×8=64.
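The count can be checked by brute-force enumeration. This is only a sketch: the orientation and position labels are my own names for the 4 mirror images and the 2 baselines described above.

```python
from itertools import product

# The 4 mirror images of the glyph (think of p q b d); labels are illustrative.
ORIENTATIONS = ["upper-right", "upper-left", "lower-right", "lower-left"]
# The 2 places a mark can sit relative to the text.
POSITIONS = ["top", "bottom"]

# Every (orientation, position) choice for one mark.
single_mark = list(product(ORIENTATIONS, POSITIONS))
# Every (opening mark, closing mark) convention.
conventions = list(product(single_mark, single_mark))

print(len(single_mark), len(conventions))  # 8 64
```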

It is a good thing that this hasn't been exploited.

How it should be

The function of quotation marks is to demarcate text, and as such delimiters, they should be a matching pair such as ()[]{}, and they should have no more than bilateral symmetry, to reflect the natural one-dimensional (left and right) flow of written text (or up/down in Asian langs).

If we could rewrite convention or restart history, i'd say we should all just use simple left/right pairs such as ()[]{}<>. Since these already have a purpose, we could use ‹›«»〈〉《》【】〖〗. The French are actually the most sensible here: their quotation convention uses ‹›«». (Though other countries using these glyphs for quotation also reverse the direction, or use the same glyph for both opening and ending. This is idiocy gone berserk.)

But since we cannot restart history, nor do we want to break convention radically because we'd create confusion, what i do today personally, in writings published on my website, is to use the most ubiquitous convention, the American convention “like this”. (I experimented with using the French convention of «double angle brackets» thru-out, but that turned out to be too in-your-face for English readers.)

Straight Quotation Marks

It is unfortunate that, thru the historical development of the typewriter, the computer keyboard, and ASCII, our keyboard doesn't have the proper matching curly quotes, but instead has the straight quotes. Here are the symbols and their Unicode names:

  • quotation mark (ASCII 34): "
  • apostrophe (ASCII 39): '

This creates a problem because it forces us to use the same symbol for a purpose that naturally calls for a matching pair. Using a single symbol is harder to read. Further, it causes global damage when one is missing (e.g. caused by typo, transmission error).
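The “global damage” can be illustrated with a small sketch. With a single straight symbol, a program can only treat each mark as a toggle, so one lost mark flips the inside/outside sense of everything after it. With a matching pair, a stray mark can be detected and skipped. The function names and the sample sentences are made up for illustration.

```python
import re

def straight_quoted_spans(text: str) -> list[str]:
    """Treat every straight double quote as a toggle; odd-numbered chunks are 'inside'."""
    return text.split('"')[1::2]

def curly_quoted_spans(text: str) -> list[str]:
    """Pair each opening curly quote with the next closing curly quote."""
    return re.findall("\u201c([^\u201c\u201d]*)\u201d", text)

# Intact text, then the same text with the first opening quote lost.
print(straight_quoted_spans('He said "yes" and she said "no" today.'))
print(straight_quoted_spans('He said yes" and she said "no" today.'))

print(curly_quoted_spans("He said \u201cyes\u201d and she said \u201cno\u201d today."))
print(curly_quoted_spans("He said yes\u201d and she said \u201cno\u201d today."))
```

In the damaged straight-quote text, the extracted “quotations” become the text between the real quotations. In the damaged curly-quote text, only the one broken quotation is lost and the second one is still found.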

It would've been better if the typewriter had been designed with a matching ‘single curly quote’. This way, the matching problem is solved, and double quotes can be created by typing the key twice.

A lot of documents in the computing world stick with the convention of using the back tick (ASCII 96) ` for the left single curly quote and ASCII 39 ' for the right single curly quote, and repeating them for the double version. So, it's like ``this'' and `this'. In particular, this style is used by the Free Software Foundation in their GNU Project.

grave accent (ASCII 96):`
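Since the back tick convention does keep the opening and closing marks distinct, it can be converted to real curly quotes mechanically. Here is a minimal sketch using regular expressions; the function name is hypothetical, and it assumes back ticks appear in the text only as opening quotes (so text containing shell command substitutions, for example, would be mangled).

```python
import re

def gnu_to_curly(text: str) -> str:
    """Convert ``double'' and `single' ASCII quoting to curly quotes."""
    # Handle the doubled form first, so that ``x'' is not consumed as single quotes.
    text = re.sub(r"``(.+?)''", "\u201c\\1\u201d", text)
    text = re.sub(r"`(.+?)'", "\u2018\\1\u2019", text)
    return text

print(gnu_to_curly("So, it's like ``this'' and `this'."))
```

The apostrophe in “it's” is left alone, because the patterns only match a span that starts with a back tick.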

Although this workaround solves a semantic problem in a technical writing context, i think it is rather unnecessary and ugly. For a workaround within the constraint of ASCII for a matching quote, i would have adopted something more symmetric such as ('this') maybe, or {'this'} or -'this'-. But the problem with GNU is that even today, in 2007, when curly quotes have been widely available in word processors for over a decade (and Unicode has been practical and widely available for at least 5 years), they are still using plain ASCII hacks. (In general, GNU and the Open Source morons have like a 5 to 10 year lag in adopting technology, for reasons that are inadvertent or intentional, or because they are simply incapable.)

No Ending Quotation Marks in Long Paragraphs

There is a very stupid convention used in novel printing. In novels, often a long paragraph is entirely a character's dialog. So, logically, the whole paragraph would be enclosed in matching quotes, and if there is a series of such paragraphs, each and every one should be enclosed in matching quotes. However, this is not done, because it is considered repetitious. The typographic convention is to omit the ending quote on every paragraph but the last when the quoted text runs across several paragraphs. So, we'd have a series of paragraphs that each start with an opening quote, of which only the last is ever closed.

This is another moronicity of the typographers. Such irregular tampering starts to show its problems in the computing era. Generally speaking, it makes the text difficult to process and creates ambiguities, both for humans and for machines.
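To make the machine side of the problem concrete, here is a sketch of the simplest possible check, which counts opening and closing marks per paragraph. The function name and the sample paragraphs are invented for illustration. Text that follows the novel convention trips the check even though it is “correct” by the typographers' rules.

```python
OPEN, CLOSE = "\u201c", "\u201d"  # left and right double curly quotes

def unclosed_paragraphs(paragraphs: list[str]) -> list[int]:
    """Return the indexes of paragraphs with more opening than closing quotes."""
    return [i for i, p in enumerate(paragraphs) if p.count(OPEN) > p.count(CLOSE)]

paragraphs = [
    "\u201cI left the house at dawn.",                 # dialog, opened but not closed
    "\u201cThe road was empty for hours.",             # dialog, opened but not closed
    "\u201cFine,\u201d she said, and closed the book.",  # ordinary balanced paragraph
]
print(unclosed_paragraphs(paragraphs))  # [0, 1]
```

A program cannot tell from the marks alone whether paragraphs 0 and 1 follow the convention or have simply lost their closing quotes to a typo.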

Curly vs Straight Apostrophe

Another moronicity in our subject is the choice of glyph for the apostrophe as a punctuation mark in English writing. For example «I'd», «he's», «James'». This is a rather big subject to tackle, dragging in the bag of grammarians and stylists and their guilds and guides and rules and exceptions, but i'll just focus on the typographical aspect of whether to use the straight quote or the curly one «I’d», «he’s», «James’».

  • RIGHT SINGLE QUOTATION MARK (Unicode 8217): ’
  • APOSTROPHE (ASCII/Unicode 39): '
  • PRIME (Unicode 8242): ′

Typically, the issue is that people were using the straight version because the curly one isn't available on the keyboard. However, in my opinion, we should not use the curly version for the apostrophe, because the single curly quote already has logical and conventional semantics: it is used as a matching pair for nested quotes. Using the same character for both apostrophe and closing quote confounds the meaning and increases the cost of computation on texts. (e.g.: «“i said: ‘he’s’.”») Also, the nature of the apostrophe in no way calls for a slanted glyph.

The reason curly became the convention is that we actually wanted a slanted apostrophe; however, the slanted version of the apostrophe, the Unicode char named “Prime”, is not conveniently available, while most word processors today have curly quotes. We wanted a slanted one because that's how we write it by hand. We write it slanted by hand because that's easier, since most people are right-handed, and because a vertically straight one is too easily confused with I or 1. This is why, in print or on-screen, the curly one became the convention for the apostrophe.

The gist of this is that if we want to demarcate text, the symbol used should be a matching pair, and if the semantics does not require a matching pair, we should not be using a matching pair. Further, preferably, each symbol should not be used for multiple purposes.
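The cost on computation can be shown with the example above. Here is a sketch of a naive nesting check that treats the curly marks purely as matching pairs; the function name is made up for illustration. It accepts the sentence when the apostrophe is the straight ASCII one, and rejects the very same sentence when the apostrophe is curly.

```python
PAIRS = {"\u201c": "\u201d", "\u2018": "\u2019"}  # opening mark -> closing mark
CLOSERS = set(PAIRS.values())

def quotes_balanced(text: str) -> bool:
    """Check that curly quotes nest properly, treating them strictly as pairs."""
    stack = []
    for ch in text:
        if ch in PAIRS:
            stack.append(PAIRS[ch])      # remember which closer we expect
        elif ch in CLOSERS:
            if not stack or stack.pop() != ch:
                return False             # closer with no matching opener
    return not stack

with_straight_apostrophe = "\u201ci said: \u2018he's\u2019.\u201d"
with_curly_apostrophe = "\u201ci said: \u2018he\u2019s\u2019.\u201d"

print(quotes_balanced(with_straight_apostrophe))  # True
print(quotes_balanced(with_curly_apostrophe))     # False
```

To handle the curly apostrophe, a program has to guess from the surrounding letters whether each mark is an apostrophe or a closing quote, and cases such as «James’» at the end of a quotation cannot be decided from the characters alone.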
