2010-09-25

Short Intro of Mathematica For Lisp Programers (list processing example)

Perm url with updates: http://xahlee.org/UnixResource_dir/writ/notations_mma.html

Short Intro of Mathematica For Lisp Programers (list processing example)

Xah Lee, 2008-08

The following is a exposition on the how programing language Mathematica is used for a list processing problem. The essay particulars details the nature of nested syntax in comparison to lisp. This article is originally posted to “comp.lang.lisp” newsgroup.

Cortez wrote:

I need to traverse a list of lists, where each sublist is labelled by a number, and collect together the contents of all sublists sharing the same label. So if I have the list -

((0 a b) (1 c d) (2 e f) (3 g h) (1 i j) (2 k l) (4 m n) (2 o p) (4 q r) (5 s t))

where the first element of each sublist is the label, I need to produce -

((a b) (c d i j) (e f k l o p) (g h) (m n q r) (s t))

I do this with the following -

(defun test (list)
  (loop for j in list
          for index = (first j)
          for k = (rest j)
          with indices = nil
          if (not (member index indices))
            do (pushnew index indices)
            and collect k into res
          else
            do (nconc (nth index res) k)
          finally (return res)))

I suspect that there is a more efficient and elegant way of doing this, however. Any suggestions welcome.

Brief background: this is part of a program I've written for reading data from SDIF files, a binary format which stores sound description data. The labelled lists represent partials in spectral analysis data (partial-index, time, frequency).

Here's how one'd do it in Mathematica.

define the list:

mylist={0[a,b],1[c,d],2[e,f],3[g,f],1[i,j],2[k,l],4[m,n],2[o,p],4[q,r],5[s,t]}

then do this:

Sort@mylist //. {f___,x_[a__],x_[b__],l___} -> {f,x[a,b],l}

output is:

{0[a, b], 1[c, d, i, j], 2[e, f, k, l, o, p], 3[g, f], 4[m, n, q, r], 5[s, t]}

if you want the result cleaned up so that the integer labels are removed, do like this

result /. _Integer[b___] -> {b}

output:

{{a, b}, {c, d, i, j}, {e, f, k, l, o, p}, {g, f}, {m, n, q, r}, {s, t}}

Explanation:

The sort@mylist is syntactically equivalent to Sort[mylist]. It just sorts it. The result is:

{0[a, b], 1[c, d], 1[i, j], 2[e, f], 2[k, l], 2[o, p], 3[g, f], 4[m, n], 4[q, r], 5[s, t]}

The //. {f___,x_[a__],x_[b__],l___} -> {f,x[a,b],l} means use a pattern matching so that if ajacent elements has the same head, merge them into one.

The shortcut syntax for structural transformation used above is this:

myExpr //. myPattern -> myNewForm

The “myExpr” is any expression. The “myPattern” is any structural pattern (i.e. like regex except it work on list structures and datatypes). The “myNewForm” is like the regex's replacement string.

The syntax myExpr //. myPattern -> myNewForm can also be written in purely nested form, like this:

ReplaceRepeated[myExpr, Rule[myPattern, myNewForm]]

Now, here's some explanation on the the shortcut syntax used to match patterns:

  • _ means any single symbol. FullForm syntax is Blank[].
  • __ means any 1 or more symbols. FullForm syntax is BlankSequence[].
  • ___ means any 0 or more symbols. FullForm syntax is BlankNullSequence[].

x_ is a syntax shortcut for Pattern[x,Blank[]], meaning it is a pattern, to be named “x”, and the pattern is Blank[]. We name the pattern so later you can refer to the captured element as “x”.

f___ is a syntax shortcut for Pattern[f,BlankNullSequence[]]. It means a pattern that matchs 0 or more elements, and any expression that matches can be later refered to as “f”.

The x_[a__] means basically a expression with 1 or more elements. The head we named “x”, and its elements we named “a”. FullForm syntax is Pattern[x, Blank[]][Pattern[a, BlankSequence[]]]. Similar for x_[b__].

So, all together, the {f___,x_[a__],x_[b__],l___} just matches a list and capture neighbors that has the same head.

The {f,x[a,b],l} just means the new form we want. Note the f,x,a,b,l are just names we used for the captured pattern.

So now, Sort@mylist //. {f___,x_[a__],x_[b__],l___} -> {f,x[a,b],l} gives us the desired result.

Now to clean up the head that function as integer labels, we do:

result /. _Integer[b___] -> {b}

which is another expression transformation with pattern. The _Integer[b___] just means any list who's head is of Integer type, and its element we name “b”. Any expression matching it is replaced by a list of its elements, expressed as {b}.


Now, all the above may seem like weired syntax. But actually it has a purely nested form.

For example, this whole expression:

Sort@mylist //. {f___, x_[a__], x_[b__], l___} -> {f, x[a, b], l}

is syntactically equivalent to this:

ReplaceRepeated[
 Sort[mylist], 
 Rule[
  List[
       Pattern[f, BlankNullSequence[]], 
       Pattern[x, Blank[]][Pattern[a, BlankSequence[]]], 
       Pattern[x, Blank[]][Pattern[b, BlankSequence[]]], 
       Pattern[l, BlankNullSequence[]]], 
  List[f, x[a, b], l] ] ]

In a lisp form, it'd be:

(ReplaceRepeated
 (Sort mylist)
 (Rule
  (List
   (Pattern f (BlankNullSequence))
   ((Pattern x (Blank)) (Pattern a BlankSequence))
   ((Pattern x (Blank)) (Pattern b BlankSequence))
   (Pattern l (BlankNullSequence)))
  (List f (x a b) l) ) )

That's some power of fully nested, REGULAR, syntax.

Now, Qi support pattern matching in common lisp. I wonder if the above can be translated to Qi.

2010-09-25

Sourav Mukherjee supplies this purely functional solution in R5RS Scheme lisp. Sourav_Mukherjee_sourav.work_gmail.scm