what is gamma in digital images


perm url with updates: http://xahlee.org/img/what_is_gamma_correction.html

What Is Gamma In Digital Images

Xah Lee, 2010-02-24

Been quite familiar with the technology of bitmapped image, the mathematical gist of their formats, since about 1995. For example, the diff between gif, jpeg, png. There's RGB color model , but also HSL and HSV, and CMYK.

In one sentence, a image is made up by a rectangular grid of pixels, and a pixel has 3 sets of values, for Red, Green, Blue, together they specify a color for the pixel, and all the pixels form the image. In early 2000s, Alpha channel became important. It allows you to have semi-transparent images. Transparency is important for images on computers, because for example, you might want a image partially transparent so users can see what's behind.

One element of digital image i never understood is Gamma.

Going to Wikipedia's gamma article, but it is unreadable. But then i found this: Gamma FAQ - Frequently Asked Questions about Gamma (2002-12-16), by Charles Poynton Source. Skipping to the “What is gamma?” section, it began with this sentence:

The luminance generated by a physical device is generally not a linear function of the applied signal.

That explains it! How simple and potent explanation.

So, gamma, isn't a intrinsic quality of a image. Rather, it is a value to compensate the quirks of various display devices.

Reading other parts of the article by Charles, i find it excellent. Apparantly, it is written by a scientist, and explain things precisely, relpying on a physics background. In contrast, the hundreds of articles i read in the past 20 years about color, are mostly wishy-washy, or filled with technical arcana that indicates the writer doesn't really understand it. Going over to his home page http://www.poynton.com/index.html, he turns out to be a notable specialist in digital image formats.

Reading the whole gamma faq article, it is truely illuminating. It reminds me how complex is the science of color, and technologies related to color, in particular, digital image formats and image processing. In practice, this complexity is experienced in getting images right, from camera, video, computer displays, screenshots, scanners.

After reading Charle's article, Wikipedia article about gamma now becomes understable.

Another article on Gamma is this: PNG (Portable Network Graphics) Specification, Version 1.2, section “13. Appendix: Gamma Tutorial”, at http://www.libpng.org/pub/png/spec/1.2/PNG-Contents.html.

The whole thing about the gamma complexity, is of 2 main causes:

  • (1) Human perception of the intensities of color or brightness, does not have a linear relationship, with respect to the values of intensity of light waves in physics.
  • (2) The brightness of image display devices, such as CRT, does not have a linear relation of its voltage input.

These are the main reasons. The real complexity is a lot more than that, involving human understanding of nature (physics of color), psychology of perception, technology and engineering limitations (camera and display devices, image and video formats). Starting from the moment you create a digital image either by camera or camcorder, which needs to capture light intensities, thru lens, onto some light-sensitive media (CCD, Photographic film), which eventually needs to be converted to some file format when stored. And when you display it, the software needs to read the data, interpret it appropriately, and eventually convert the bits to voltage in your screen.

One fact surprising to me i learned is, that video formats such as mpeg, video broadcast signals such as NTSC, or image formats (jpeg, png), they actually store the adjusted light intensities (gamma corrected), and not the unprocessed light intensities.

The reason also seems reasonable. For example, for TV, it much economic to do the gamma correction processing once and send the gamma corrected signals, than having every TV receiver do the gamma correction.

Triangular-Grid Based Image Formats And Display

Another thing i wondered is that, whether there are triangular/hexagonal based grid for image format or display system. (i.e. like honeycomb) We know that a image is just a rectangular array of pixels. And monitors are more or less that too. CRT's display unit is basically lines Flat panel displays are all rectangular grids.

(The CRT tech basically scan a beam of electrons horizontally, but part of the CRT tech uses Shadow mask, which may have a triangular grids, but CRT's nature with respect to a grid is still array of lines.)

Note that packing dots by a rectangular array is not optimal, with respect to density. The optimal one is a triangular grid. (see: Circle packing ) So, with respect to better resolution and the display grid, for a display technology, a triangular grid is superior to rectangular. So, i wonder if there are display devices based on triangular grid, or image formats that is based on triangular grid. I'm sure they exist in some specialized niche.

Gamma Error In Picture Scaling

What made me look into gamma today is when reading this article: Gamma error in picture scaling (2007-08-30), by Eric Brasseur. Source. The article details a defect that almost all image processing software have. When you scale some images, the algorithm used in these software has a defect, so the result scaled image is not optimal. The site gives one particular example input image. When you scale the image, the result is very bad. I tested it with ImageMagick, and verified that it also have this defect.


Web Share Widgets

perm url with updates: http://xahlee.org/js/share_widgets.html

Web Share Widgets

Xah Lee, 2010-02-22

These days, there are lots of websites that have tiny icons that allow readers to add the article to blog sites, or bookmark sites, social networking sites, such as facebook, blogger, twitter, delicious, digg, stumble upon, etc. Sites that includes such share widgets includes times.com, nytimes.com, cnn.com, usatoday.com, washingtonpost.com. You can just go to http://news.google.com/ and click on news articles to arrive at a site and look at their share widgets. For tech sites that have share widgets, there's http://blog.wolfram.com/, http://arstechnica.com/, http://slashdot.org/, http://tomshardware.com/, etc.

Is there a sharing widget plug-in for web masters?

If you look at their sharing widgets on major sites, looks like most are home cooked.

Adding A Share Widget To Your Website

I spend 10 min and found 2. One of them is addthis.com, and the other is sharethis.com. The sharethis.com site requires you to register. I bypassed it at that step. The addthis.com is quite simple. In 1 min you get this code:

<a class="addthis_button"
<img src="http://s7.addthis.com/static/btn/v2/lg-share-en.gif"
 width="125" height="16" alt="Bookmark and Share" style="border:0"/></a>
<script type="text/javascript"

It shows up like this:

This seems something i can use for my website.

Though, actually it is easy to create your own links. All the little icons you can pull. Most if not all these social networking sites have API that lets you submit sites, pretty much just like this: “http://bookmarkcenter.org/add.php?url=http://articles.com/nice_article.html”, though the url after the question mark needs to be Percent encoded.

Should Your Site Have A Share Widget?

One decision i have to make is whether my site should have such share widgets. It might be just a temporary fad for teens. Blogs began with such a perception, but today they are important tool in communication with major world impact. Web syndication (RSS/Atom) had such a perception too, when it began in about 1997, but is today a widely used standard.

Majority of people who uses the web probably do not use any of the social networking or blog or bookmark sites. But a minority, probably in hundreds of thousands, do. Having a link right there to let them talk about a article is convenient. So, from a tool point of view, it is really good to have such a share widget. A important benefit for the web site owner is that it certainly does increase your traffic, when your article is discussed about in some site.

Mac OS X Changes On Unix

perm url with updates: http://xahlee.org/UnixResource_dir/writ/mac_os_x_unix_changes.html

Mac OS X Changes On Unix

Xah Lee, 2010-02-15

This blog is some random notes on Mac OS X.

Recently, i was helping my friend, professor Richard Palais install and fix some svn repsotiory issues at his os x server at uci. I've been doing senior unix sys admin work during 1998-2002 on Solaris. But I haven't really been doing any serious sys admin since. It is a pleasure to dive into the unix quagmire again.

As discussed in my The Unix Pestilence page, the best resource for unix admin is not some commercial books, but rather, the documentations published by the unix maker itself.

The current documentation for Mac OS X 10.6 (Snow Leopard) is at: Mac OS X Server Guides Source. They are excellent.

Changes To Unix

Mac OS X, made many fundamental changes to the incompetent unix in the past decade since it adopted unix. Here's some of them as far as i know:

The login account management and infrastructure has changed. No more “useradd” or “groupadd”. It was replade by Netinfo, but NetInfo got replaced by Lightweight Directory Access Protocol mechanism in 10.5 (Leopard, 2007). My unix expertise is quite still classical, so i don't really understand NetInfo or LDAP.

OS X's Launchd, replaces several unix's start up processes and background processes. Here's Wikipedia quote (slightly edited):

launchd is a unified, open source service management framework for starting, stopping and managing daemons, programs and scripts. It was introduced with Mac OS X v10.4/Darwin v8.0, and is licensed under the Apache License.

The launchd daemon is essentially a replacement for init, rc, the init.d and rc.d scripts, SystemStarter (Mac OS X), inetd and xinetd, atd, crond and watchdogd (Mac Os X).

Of these replaced tech, i particular know about inetd and xinetd. The xinetd replaces the incompetences of inetd. The unix story is all about these replacements. (See: The Unix Pestilence: Tools and Software.)

SSH Session Drops Idle Connection

One annoying thing about OS X is that, if you connect to it thru ssh, after some short period of inactivity, it logs you out. This is a pain in the ass for sys admin. This is actually not caused by OS X, nor ssh settings, but somewhere in the network some software or device auto disconnect when the connection is idle for some period of time.

Here's a trick to solve this problem.

Edit the file “/etc/sshd_config”, so you have the lines:

ClientAliveInterval 280
ClientAliveCountMax 3

The ClientAliveInterval means the number of seconds to check if client is alive, by sending client a encrypted message. Every n seconds, sshd will check if client is active. If client does not respond, it adds a count. When this count reaches ClientAliveCountMax, sshd disconnect the client. You can read more by “man sshd_config”.

You may need to restart ssh. You can do it by going to Control Panel, Sharing, uncheck the “Remote Login” line, then check again.

In traditional unix, you make sshd reread the config file by sending SIGHUP to ssh. (see “man sshd”). For example: “ps auwwx | grep ssh” then “kill -s HUP ‹pid›”. However, i don't think that works in OS X, because when there is no connection, sshd isn't running. When there is a connection, and if you do “ps auwwx | grep ssh”, you'll see that it starts with “-i” option, which means it is running thru inetd. This means, sshd by default in os x is actually not running as a dedicated server, but only launched when there is a connection.

mds Process Problem

On his server, i noticed that the mds process is hogging cpu 100%. Apparently, something is wrong. A seach on the web found many posts about it. Basically, the problem is solved by making Spotlight to re-index the hard drives. (mds is one of the process of Spotlight)

To have Spotlight reindex drives, go to Control Panel, Spotlight, then Privacy Tab. Drag drives to the window. Wait for some 30 seconds for it to take effect. This will delete Spotlight's indexed files (i presume). Then, Remove the drives by clicking the minus sign.


While is was doing sys admin work 8 years ago, i use vi occasionally. That means few times a week, when i have to login to remote servers that doesn't have emacs installed. Long story short, today, i have to use vi again, so have to brush up some commands i forgot. So, i extended by vi tutorial here: Emergency vi.

The reason i have to use vi is a bit complex. Mac OS X has emacs bundled. I have 2 machines. My main machine is a PC running Windows Vista, my secondary machine, which sits on the side, is a PPC Mac running OS X 10.4.x. I can access the server either by ssh from my Windows, or from my Mac, or using the remote desktop software Timbuktu (software) on my Mac. Using the remote desktop software is not a luxury, because i actually need to access the Mac GUI for several reasons. For example, i need to access the various OS X's sys admin GUI utilities. Also, i need to give instructions about using various GUI software on the server. Also, when i use Timbuktu, he can watch while i work on the server, it helps communication. (we also are chatting in voice on Skype)

Normally, if i need to read some doc on the web, or do command line work, i use my main machine, which is Windows. However, due to the Mac ssh idle drop out problem, it makes ssh from Windows to OS Server a problem. The ssh program on my Windows is PuTTY, not that good either, and i haven't spend time to setup a alias to the server etc, so everytime i got disconnected, i have to type the damn long server name and login and password and cd to the dir i was working. A pain. So, for command line work, i work on my Mac thru Timbuktu.

To make things clear, let me repeat: I use my Mac, and Timbuktu, to connect to the server. So, i see the server's screen inside a window on my Mac. Then, i start Terminal there. While in Terminal, i could start emacs in the command line, or, start the GUI emacs Carbon Emacs. But it's not a good option. Because, i've been using emacs for 12 years, and have developed a complete custome designed set of basic shortcuts (see: ErgoEmacs Keybinding.), as well as over a thousand lines of custome elisp code. (See: Xah Lee's Emacs Customization File.) On the remote server, i don't remember what state of my emacs init files there are. Remember, the key confusion would be very complex, because i'm using a PC keyboard, connected to my Mac, going thru Timbuktu, then running Terminal on the server, then running Emacs. Also, my Mac is setup to use the Dvorak layout, and so is my account on the server. When i type a key combination on my keyboard, the keystrokes goes thrue several layers of translation to reach the emacs on the server. Adding the fact that i'm used to a different shortcut set that is different than the emacs default. So, if i just want to make some simple edit in some simple file once, it's easier to just start vi. (or, TextEdit)

Learning Notes On Goto, Continuation, Coroutine

perm url with updates: http://xahlee.org/comp/goto_continuation_coroutine.html

Learning Notes On Goto, Continuation, Coroutine

Xah Lee, 2010-02-05, 2010-02-05

Some learning notes on Goto, Continuation, Coroutine, in programing languages.

I realized today that these concepts are not about implementation, hardware, efficiency, nor algorithm, but rather, they are a particular construct, choice of design, in the semantics of computer languages. For example, to express a repeatition, some languages has “for” loop, some has “while”, some has “do ... until”, some relies on “Nest”, “Fold”, “FixedPoint”, most languages have a combination of these, but any single one of them is sufficient with respect to expressing a algorithm. Same thing can be said for constructs like “case”, “which”, “switch”, “if then elseif ... elseif ...”, all can be expressed by just having a simple “if else” construct. Continuations and coroutine, are a bit more complex, but they are basically a form of GOTO statement.

More specifically, i realized, that any language without any of goto, continuation, coroutine, does not necessarily mean the language is less powerful, or less expressive, or require more lines of code in practice. In fact, i think a language with them is probably not desirable. I think these are mostly useful in concurrency applications, and languages designed for concurrency have better models or constructs than any of Goto, Continuation, Coroutine.

These language design issues are quite interesting. As far as i know, there is not any practical, rigorous mathematical foundation about this.

When you code in some lang, suppose it doesn't have “do ... until”, but has “while”, it is trivial to code around it. However, in many situations, it is not easy to code one construct using another. If you have coded in several languages, you'll know this. Often the difficulty is because you are not familiar with the lang, and you are coding in lang X with lang Y idioms familiar to you. However, it does happen sometimes that you know what you are doing and certain construct is optimal for your task but lang X doesn't have it.

For example, when you code in a lang using nested for loop construct and need conditional exit in several places in inner loops, and if the lang does not have any “break” or “continue”, usually it is a pain to code around it. Also, in some functional langs that doesn't have any sort of iterations but only map and nest constructs, sometimes a iteration type of construct is the most clear and optimal. Likewise, sometimes all you need is a Map or Fold, or passing functions as a expression as a argument, but most imperative langs don't have it.

It's important to note here few issues:

  • In online forums, most programer's complaint about some lang lacking some construct is due to their inexperience in the lang or the particular style of programing of that lang's domain.
  • There are situations that some construct is optimal and not in the lang, even for a well designed lang.
  • There is not a PRACTICAL, rigorous, mathematical model (algebra), that can tell us what construct should be in a lang, or even how to measure expressibility of a lang.

I've been doing functional programing for 15 years. I consider myself to be among the world's top 500 expert in Mathematica. Over the years i learned a little bit of other functional langs: Scheme, Haskell, OCaml. But never obtained any practical coding expertise in these (except Emacs Lisp). In argument in rowdy discussion forums such as newsgroup comp.lang.scheme, often some tech geekers will throw out terms like continuation and coroutine and taunt you for ignorance in a heated debate. Sometimes i challenge them to explain what these really means. I want them to, tell me, in mathematical terms, what these are. I was never able to get any answer. Of course, today i know, that a mathematical answer is impossible, because there is no math model to speak of it, and continuation and coroutine are rather complex to explain because they involve passing program state, which, inevitably involves implementation issues such as stacks that is not exactly a language semantics issue. However, it is possible to give a clear context of what these construct means, or their significance, in the domain of language semantics and design. However, to do that is beyond their ability, either in actually understanding it, or expressing it in words.

So, what's continuation in one sentence? We can say that a continuation is a controlled form of goto statement. Instead of simply moving execution point from one place to another, it also passes the current program state (e.g. local vars) alone.

So, what's coroutine in one sentence? We can say that a coroutine is a subroutine/function in which you can call other subroutine in the block and passing all local state info to the called routine. In a sense, in is a local form of continuation. So, suppose you have subroutine f and g. f can call g in its definition, and when it calls g, it passes all state info to g, and f is simply done/exit, and all control is now in g. In particular, f does not wait for g to finish and return control to f. Likewise, g can call f the same way. So, 2 subroutines can mutually call each other and form a recursion, but without incurring the non-ending resource growth problem.

The following explains in more detail.

Understanding Goto

For modern programers, to understand goto, continuation, coroutine, a easy way to understand them is by first understanding goto as used in 1960 or 1970's languages. First, read Wikipedia Goto.

Unstructured vs Structured Programing

Essentially, older languages, like BASIC, at that time, does not have much of a concept like today's subroutines, code blocks such as “while”, “for” etc, and of course no libraries or module concept. Here's a code example from Wikipedia that illustrate this:

10   INPUT "What is your name: ", U$
20   PRINT "Hello "; U$
30   INPUT "How many stars do you want: ", N
40   S$ = ""
50   FOR I = 1 TO N
60   S$ = S$ + "*"
70   NEXT I
80   PRINT S$
90   INPUT "Do you want more stars? ", A$
100  IF LEN(A$) = 0 THEN GOTO 90
110  A$ = LEFT$(A$, 1)
120  IF A$ = "Y" OR A$ = "y" THEN GOTO 30
130  PRINT "Goodbye "; U$
140  END

Note that basically a source code is just one single file of hundreds of lines. There's not much of a code block or subroutine, not much of local variables, and the program basically just execute lines sequentially, and GOTO is used to jump to different lines, as primary means of aptly-called “control flow” for program logic.

Now look at the Wikipedia article on GOTO again, you'll understand it. Now, you can imagine this unstructured way is complex and is a problem. The use of GOTO as flow control turns your code logic into spaghetti. There is no clarity of logical units. So, in the 1970s or 1980s, languages evolved into what's called structure programing. This is familiar to programers today. Regardless what language you use, we are all doing structured programing. So, what's structured programing? You can see from this example of improved BASIC language:

INPUT "What is your name: ", UserName$
PRINT "Hello "; UserName$
  INPUT "How many stars do you want: ", NumStars
  Stars$ = STRING$(NumStars, "*") 
  PRINT Stars$
    INPUT "Do you want more stars? ", Answer$
  LOOP UNTIL Answer$ <> ""
  Answer$ = LEFT$(Answer$, 1)
LOOP WHILE UCASE$(Answer$) = "Y"
PRINT "Goodbye "; UserName$

You can see that now there's more clear concept of code blocks or code units, and GOTO is not used here. This is more of a Structured programming, which is familiar to all of us. But more interesting is that this obvious switch from unstructured to structured languages isn't so obvious in the 1970s. You can read Structured program theorem, and see back then it was in debate, and in fact the theoretical background was just come into being.

At this point, it is worth to remember that whether the language is unstructured or allows structured code, both are equivalent with respect to the possible algorithms they can express. More precisely, here's how Wikipedia puts it:

[Structured program theorem] states that every computable function can be implemented in a programming language that combines subprograms in only three specific ways. These three control structures are

  • 1. Executing one subprogram, and then another subprogram (sequence)
  • 2. Executing one of two subprograms according to the value of a boolean variable (selection)
  • 3. Executing a subprogram until a boolean variable is true (iteration)

Understanding Continuation

Continuations is in a sense a controlled form of goto. Scan these articles:

The last one is most easy to understand. By understanding continuation passing style with actual code, you might get some idea about continuation.

Suppose we define a function named “times”, that takes 2 arguments and multiply them together. However, we make it to have 3 parameters. The 3rd parameter will be a function with 1 argument. (the purpose of the 3rd parameter will be explained soon) The “return value” of times is the return value of the f applied to “x * y”.

def f (x y g) {
g(x * y);
(define (times x y f)
  (f (* x y)))

So, to compute “x * y” using our “times” function, we give it a third argument of a “identity” function, which takes one argument and returns it. (it doesn't do anything)

; define the “identity” function
(define (identity x) x)

; compute “3 * 4” using our function “times”
(times 3 4 identity)

Following is a code in Scheme Lisp that defines a function “pyth” that computes Sqrt[x^2+y^2].

(define (pyth x y)
  (sqrt (+ (* x x) (* y y))))

Following is the same function written in Continuation Passing Style.

(define (pyth x y k)
  (* x x (lambda (x2)
           (* y y (lambda (y2)
                    (+ x2 y2 (lambda (x2py2)
                               (sqrt x2py2 k))))))))

You can see that, basically it is like a style of piping, or called filtering or function sequencing.

Am not clear what is the significance of this continuation passing. Or, what's the level or context of this style. Is it implementation of a language (compiler)? Syntax design? Language semantic design? or just different style of coding of a given language for a given problem? If the last, then it doesn't make much sense, since it is better or easier to write it simply as nested function or function sequence. Then, perhaps with continuation passing style, the compiler creates efficient code... but then that is not necessarily so due to the style, since there's no reason a function sequencing or nesting style wouldn't produce the same background for creating efficient machine code.

Don't fully understand continuation yet, will have to dig later.

Understanding Coroutine

Here's Wikipedia quote:

In computer science, coroutines are program components that generalize subroutines to allow multiple entry points for suspending and resuming execution at certain locations. Coroutines are well-suited for implementing more familiar program components such as cooperative tasks, iterators, infinite lists and pipes.

Here's a sample pseudo-code:

var q := new queue

coroutine produce
        while q is not full
            create some new items
            add the items to q
        yield to consume

coroutine consume
        while q is not empty
            remove some items from q
            use the items
        yield to produce

As with Continuation, will need to have hands on coding experience of a lang with that feature to really understand fully.

Coroutine As A Model Of Instruction Set For Abacus

Was reading about Eve online, a massive multiplayer online spaceship game. Here's a interesting quote:

Both the server and the client software for Eve Online are developed in Stackless Python, a variant of the Python programming language. Stackless Python allows a relatively large number of players to perform tasks without the overhead of using the call stack used in the standard Python distribution. This frees the game developers from performing some routine work and allows them to apply changes to the game universe without resetting the server.[52]

Humm... Stackless Python. And then on this article, it mentioned that Second Life is also using it. The source of this claim is here http://wiki.secondlife.com/w/index.php?title=Eventlet&oldid=51543.

Central to the issue here is Coroutine, which i never understood. From reading the Wikipedia article, it appears to be a language feature that allows mutual loop. For example, 2 functions f and g, each calling the other at the end. Normally, that would quickly run out of memory or such. hum... still not sure i fully understand. How is the mathematical nature of this feature?

When thinking about computers, often i model it to manipulating abacus. (you could use Turing Machine to model but more complicated to think about) So, a subroutine, is then a reusable part of algorithm, or, reusable part of instruction on how to manipulate the abacus. So, normally, subroutine is expected to terminate, meaning, given a unit of abacus manipulation instruction set, the way you use this is to expect it to reach the end of the instruction set, then follow another instruction set. But coroutine, is then such a instruction set that is not expected to reach a end. Rather, you expect that in the middle of the instruction to follow some other instruction set (kinda like Goto). This all makes much sense now.

Some Wikipedia Reading On Compiler Optimization

Some easy to understand compiler optimization techniques: Constant folding, Dead code elimination.

Also, to facilitate the above optimizations , modern compilers uses a Static single assignment form for their intermediate form.

On reading these, gave me some thoughts about how to write programs today. For example, in the past, especially functional programing, you avoid defining variables, so that your code runs faster. For example, say you want a constant “delayTime” to be 2 hours, and you may write “delayTime = 60*60*2” which is easier to understand than “delayTime = 7200”. However, ten years ago you may want to do that instead because your program avoids computing some multiplication. But today, compiler technology has advanced a lot. The bottom line is basically the same: Always write your code in the MOST human readable way, and never optimize unless in the final stage and if you really have a NEED. AND, when u optimize, the most important step is to analyze at the algorithmic level. That is, your approach to the problem and the overall algorithm chosen. Then, bottle necks. The aspect of optimization you should have the least concern, is about code diddling, such as trying to use more “idioms”. The reasons for this principle are few:

  • (1) Hardware speed increases faster than typical software have to perform, in general.
  • (2) Compiler technology optimizes away mathematical equivalences in a way more able than human ever possibly can.
  • (3) Premature optimization is harmful and wasteful. It reduces your code's flexibility, and human time is some thousand times more expensive than cpu time. Tech geeking programers often have a unhealthy infatuation with optimization by diddling their code (e.g. perlers, and otherwise “idiom” lovers, or “pythonic” fuck.).

More items to study:

mathematica optimization

Perm url with updates: http://xahlee.org/UnixResource_dir/writ/Mathematica_optimization.html

A Mathematica Optimization Problem

Xah Lee, 2008-12-05

Recently, in a discussion in newsgroup “comp.lang.python”, Dr Jon Harrop challenged me for a Mathematica optimization problem. In summary, he posted a Mathematica code in which he badmouths Mathematica. I challenged for him to pay me $5 paypal and i'll post code to show how lousy his code is. Then, someone named Thomas M Hermann offered me $20. This page discusses the code.

Here's the full thread:

Newsgroups: comp.lang.lisp, comp.lang.functional,
 comp.lang.perl.misc, comp.lang.python, comp.lang.java.programmer
From: Xah Lee
Date: Sun, 30 Nov 2008
Subject: Mathematica 7 compares to other languages

In the following, i'll just post the technical details of this optimization problem.

Optimizing A Raytrace Code

Here's Jon's original code, to be run in Mathematica 6 (notebook: raytrace_v6_original.nb.gz):

delta = Sqrt[$MachineEpsilon];
RaySphere[o_, d_, c_, r_] := Block[{v, b, disc, t1, t2}, v = c - o;
    b = v.d;
    disc = Sqrt[b^2 - v.v + r^2];
    t2 = b + disc;
    If[Im[disc] ≠ 0 || t2 ≤ 0, ∞, t1 = b - disc;
      If[t1 > 0, t1, t2]]]

Intersect[o_, d_][{lambda_, n_}, Sphere[c_, r_]] :=
  Block[{lambda2 = RaySphere[o, d, c, r]},
    If[lambda2 ≥ lambda, {lambda, n}, {lambda2,
        Normalize[o + lambda2 d - c]}]]

Intersect[o_, d_][{lambda_, n_}, Bound[c_, r_, s_]] :=
  Block[{lambda2 = RaySphere[o, d, c, r]},
    If[lambda2 ≥ lambda, {lambda, n}, Fold[Intersect[o, d], {lambda, n}, s]]]

neglight = N@Normalize[{1, 3, -2}];
nohit = {∞, {0, 0, 0}};

RayTrace[o_, d_, scene_] :=
  Block[{lambda, n, g, p}, {lambda, n} = Intersect[o, d][nohit, scene];
    If[lambda == ∞, 0, g = n.neglight;
      If[g ≤ 0,
        0, {lambda, n} =
          Intersect[o + lambda d + delta n, neglight][nohit, scene];
        If[lambda < ∞, 0, g]]]]

Create[level_, c_, r_] :=
  Block[{obj = Sphere[c, r]},
    If[level == 1, obj,
      Block[{a = 3*r/Sqrt[12], Aux},
        Aux[x1_, z1_] := Create[level - 1, c + {x1, a, z1}, 0.5 r];
          3 r, {obj, Aux[-a, -a], Aux[a, -a], Aux[-a, a], Aux[a, a]}]]]]

scene = Create[1, {0, -1, 4}, 1];
Main[level_, n_, ss_] :=
  Block[{scene = Create[level, {0, -1, 4}, 1]},
          RayTrace[{0, 0, 0},
            N@Normalize[{(x + s/ss/ss)/n - 1/2, (y + Mod[s, ss]/ss)/n - 1/2,
                  1}], scene], {s, 0, ss^2 - 1}]/ss^2, {y, 0, n - 1}, {x, 0,
        n - 1}]]

AbsoluteTiming[Export["image.pgm", Graphics@Raster@Main[9, 512, 4]]]

Since my version of Mathematica is v4, so i have to make a few changes:

  • AbsoluteTiming is not available in v4. Timing is used instead.
  • Normalize not in Mathematica version 4 neither. I have to code my own. (for detailed explanation on the code, see: A Example of Mathematica's Expressiveness.)

Here's the modified reference implementation in v4: raytrace_v4_reference_imp.nb.gz

Here's my optimized version: raytrace_v4_Xah_Lee.nb.gz.


Image produced by the raytracer with input “Main[3, 200, 3.]”. (converted to png format)

Analysis and Explanation

If we rate that piece of mathematica code on the level of: Beginner Mathematica programer, Intermediate, Advanced, where Beginner is someone who has less than than 6 months of coding Mathematica, then that piece of code is Beginner level.

Here's some basic analysis and explanation.

The program has these main functions:

  • RaySphere
  • Intersect
  • RayTrace
  • Create
  • Main

The Main calls Create then feeds its output to RayTrace. Create calls itself recursively, and basically returns a long list of a repeating element, each of the element differ in their parameter.

RayTrace calls Intersect 2 times. Intersect has 2 forms, one of them calls itself recursively. Both forms calls RaySphere once.

So, the core loop is with the Intersect function and RaySphere. Some 99.99% of time are spent there.

The Main[] function calls Create. The create has 3 parameters: level, c, and r. The level is a integer for the number of spheres in the raytraced image. The c is a vector for sphere center. The r is radius of the sphere. His input has c and r as integers, and this in Mathematica means computation with exact arithmetics (and automatic kicks into infinite precision if necessary). Changing c and r to float immediately reduced the timing to 0.3 of original.

What a incredible sloppiness! and he intended to use this code in a language speed comparison?

Now, back to the core loop. The RaySphere function contain code calls Im, which is the imaginary part of a complex number!! and if so, it returns the symbol Infinity! The possible result of Infinity is significant because it is used in Intersect to do a numerical comparison in a If statement. So, here in these deep loops, Mathematica's symbolic computation is used for numerical purposes!

So, first optimization at the superficial code form level is to get rid of this symbolic computation.

In RaySphere, instead of checking whether his “disc = Sqrt[b^2 - v.v + r^2]” has imaginary part, one simply check whether the argument to sqrt is negative. After getting rid of the symbolic computation, i made the RaySphere function to be a Compiled function Here's the result:

RaySphere =
    Compile[{o1, o2, o3, d1, d2, d3, c1, c2, c3, r},
      Block[{v = {c1 - o1, c2 - o2, c3 - o3},
          b = d1*(c1 - o1) + d2*(c2 - o2) + d3*(c3 - o3),
          discriminant = -(c1 - o1)^2 - (c2 - o2)^2 + (d1*(c1 - o1) +
                    d2*(c2 - o2) + d3*(c3 - o3))^2 - (c3 - o3)^2 + r^2, disc,
          t1, t2},
        If[discriminant < 0., myInfinity, disc = Sqrt[discriminant];
          If[(t1 = b - disc) > 0., t1,
            If[(t2 = b + disc) ≤ 0., myInfinity, t2]]]]];

Note that RaySphere is now a function with 10 parameters (3 vectors and a radius). Before, RaySphere is of the form “RaySphere[o_, d_, c_, r_]” where each of the o, d, c, are vectors. So, this means i'll also have to change how RayTrace is called in Intersect.

I stopped my optimization at this step. The result timing is about 0.2 of the original.

The above are some fundamental things any dummy who claims to code Mathematica for speed should know. Jon has written a time series Mathematica package that he's selling commercially. So, either he got very sloppy with this Mathematica code, or he intentionally made it look bad, or that his Mathematica skill is truely beginner level. Yet he dares to talk bullshit in this thread.

Besides the above basic things, there are several aspects that his code can improve in speed. For example, he used rather complicated pattern matching to do intensive numerical computation part. Namely:

Intersect[o_, d_][{lambda_, n_}, Bound[c_, r_, s_]]
Intersect[o_, d_][{lambda_, n_}, Sphere[c_, r_]]

Note that the way the parameters of Intersect defined above is a nested form. The code would be much faster if you just change the forms to:

Intersect[o_, d_, {lambda_, n_}, Bound[c_, r_, s_]]
Intersect[o_, d_, {lambda_, n_}, Sphere[c_, r_]]

or even just this:

Intersect[o_, d_, lambda_, n_, c_, r_, s_]
Intersect[o_, d_, lambda_, n_, c_, r_]

Also, note that the Intersect is recursive. Namely, the Intersect calls itself. Which form is invoked depends on the pattern matching of the parameters. However, not only that, inside one of the Intersect it uses Fold to nest itself. So, there are 2 recursive calls going on in Intersect. Reducing this recursion to a simple one would speed up the code possibly by a order of magnitude.

Further, if Intersect is made to take a flat sequence of argument as in “Intersect[o_, d_, lambda_, n_, c_, r_, s_]”, then pattern matching can be avoided by making it into a pure function “Function”. And when it is a “Function”, then Intersect or part of it may be compiled with Compile. When the code is compiled, the speed should be a order of magnitude faster.

Also, he used “Block”, which is designed for local variables and the scope is dynamic scope. However the local vars used in this are local constants. A proper code would use “With” instead.

Jon Harrop's site about this raytracing code in several languages: http://www.ffconsultancy.com/languages/ray_tracer/.

Advice For Mathematica Optimization

Here's some advice for mathematica optimization, roughly from most important to less important:

  • Any experienced programer knows, that optimization at the algorithm level is far more important than at the level of code construction variation. So, make sure the algorithm used is good, as opposed to doodling with your code forms. If you can optimize your algorithm, the speed up may be a order of magnitude. (for example, various algorithm for sorting algorithms illustrates this.)
  • If you are doing numerical computation, always make sure that your input and every intermediate step is using machine precision. This you do by making the numbers in your input using decimal form (e.g. use “1.”, “N[Pi]” instead of “1”, “Pi”). Otherwise Mathematica may use exact arithmetics.
  • For numerical computation, do not simply slap “N[]” into your code. Because the intermediate computation may still be done using exact arithmetic or symbolic computation.
  • Make sure your core loop, where your calculation is repeated and takes most of the time spent, is compiled, by using Compile.
  • When optimizing speed, try to avoid pattern matching. If your function is “f[x_]:= ...”, try to change it to the form of “f=Function[x,...]” instead.
  • Do not use complicated patterns if not necessary. For example, use “f[x_,y_]” instead of “f[x_][y_]”.

Discovered a blog F# For Scientists Misses the Boat On Mathematica Performance (2008-08-26), by Sal Mangano. (at Source) Sal talks about how Joh Harrop bad mouth Mathematica. Jon is the guy who perpetually peddles his F# book while bad mouth every other langs in a inciting manner, in lisp, java, functional programing newsgroups, and just about every online forum and blogs he can post. Apparently he uses some media monitoring service (e.g. Google Alert) that automatically notifies him any site mentioning his name or his book or Ocaml/F#, because every such blog seem to be infested with his comments.

Apparently, Jon did bad mouth in Mathematica forum before, with this very code about raytracing. (See: Source) Here's a summary: On 2008-11-19, Mathematica 7 release is announce in the Mathematica newsgroup (aka MathGroup). MathGroup is a moderated forum. Someone asked a follow up question on parallel computing. Jon side track it and show cases his Mathematica code on raytracing, in the same inciting manner. Daniel Lichtblau, a long time Wolfram employee and a major implementor of Mathematica, posted a relpy that essentially educated him about Mathematica optimization. The content pretty much finding the same problems i found in Jon's Mathematica code.

Jon's interpretation of Daniel's free tips for him on Mathematica optimization, as a comment on Sal's blog, is this:

FWIW, Wolfram Research themselves already took up my challenge with Daniel Lichtblau producing the fastest Mathematica implementation ...


nsa mass surveillance

perm url with updates http://xahlee.org/comp/nsa_mass_surveillance.html

NSA Mass Surveillance

Xah Lee, 2010-02-23

(some blog comment about mass surveillance in USA)

I watched parts of the movie The Taking of Pelham 123 (2009 film) and reading Wikipedia about it. It has this quote in the Critical Recption section:

In a review for MSNBC, Alonso Duralde was critical of John Travolta's performance in the film, comparing it to his roles in Swordfish and Battlefield Earth: "Travolta remains singularly unbelievable as a villain. In movies like this and 'Swordfish' and, let's not forget, 'Battlefield Earth,' the actor strives for malice but generally can’t get much darker than playground-bully meanness."[26]

Swordfish is a crime thriller about computer hacking, but i haven't seen it. Heard it isn't so good. Reading Wikipedia, it has this quote:

Stanley Jobson (Jackman) is an elite hacker who infected the FBI's Carnivore program with a potent computer virus, ...

Swordfish isn't worth watching, however, Carnivore (software) gets interesting. Quote:

Carnivore was a system implemented by the Federal Bureau of Investigation that was designed to monitor email and electronic communications. It used a customizable packet sniffer that can monitor all of a target user's Internet traffic. Carnivore was implemented during the Clinton administration with the approval of the Attorney General.

After prolonged negative coverage in the press, the FBI changed the name of its system from "Carnivore" to the more benign-sounding "DCS1000." DCS is reported to stand for "Digital Collection System"; the system has the same functions as before. The Associated Press reported in mid-January 2005 that the FBI essentially abandoned the use of Carnivore in 2001, in favor of commercially available software, such as NarusInsight.[8]

NarusInsight. Quote:

Narus is a US private company which produces mass surveillance systems. It was founded in 1997 by Ori Cohen, who had been in charge of technology development for VDONet, an early media streaming pioneer.[1]

It is notable for being the creator of NarusInsight, a supercomputer system which is used by the NSA and other bodies to perform mass surveillance and monitoring of citizens' and corporations' Internet communications in real-time, and whose installation in AT&T's San Francisco Internet backbone gave rise to a 2006 class action lawsuit by the Electronic Frontier Foundation against AT&T, Hepting v. AT&T. [2]

Hepting vs. AT&T. Quote:

Hepting v. AT&T is a United States class action lawsuit filed in January 2006 by the Electronic Frontier Foundation (EFF) against the telecommunications company AT&T, in which the EFF alleges that AT&T permitted and assisted the National Security Agency (NSA) in unlawfully monitoring the communications of the United States, including AT&T customers, businesses and third parties whose communications were routed through AT&T's network, as well as Voice over IP telephone calls routed via the Internet.

The case is separate from, but related to, the NSA warrantless surveillance controversy, in which the federal government agency bypassed the courts to monitor U.S. phone calls without warrants. Hepting v. AT&T does not include the federal government as a party.

In July 2006, the United States District Court for the Northern District of California – in which the suit was filed – rejected a federal government motion to dismiss the case. The motion to dismiss, which invoked the State Secrets Privilege, had argued that any court review of the alleged partnership between the federal government and AT&T would harm national security.

The case was immediately appealed to the Ninth Circuit, where it has been argued and awaits a decision.

See also: NSA warrantless surveillance controversy.