Emacs Golf: Align and Sort

Perm url with updates: http://xahlee.org/emacs/emacs_align_and_sort.html

Emacs: Aligning Text and Sorting by Fields

2011-11-02

This page shows a practical example of using {“align-regexp”, “sort-regexp-fields”, “sort-columns”}.

Problem Description

Today I was trying to understand the sizes of hectare and km². I got this list:

California 423,970 km²
Taiwan 36,008 km²
Japan 377,944 km²
Germany 357,021 km²
Iraq 438,317 km²
Iran 1,648,195 km²
Korea (North+South) 219,140 km²
Mexico 1,964,375 km²

But I want it aligned and sorted by area, in this form:

Taiwan                 36,008 km²
Korea (North+South)   219,140 km²
Germany               357,021 km²
Japan                 377,944 km²
California            423,970 km²
Iraq                  438,317 km²
Iran                1,648,195 km²
Mexico              1,964,375 km²

How would you do it? I do not have a solution.

Recently, Jon Snader (jcs), Mickey Petersen, and Tim Visher have all been doing Emacs golf on their blogs. It's interesting.

Solution

Jon Snader and “jm” provided the following solution.

Align Text with “align-regexp”

First, we align the text. Select the text, press 【Ctrl+u】, then call “align-regexp” with the regexp .* \([0-9,]+\).* and answer -1 for group, 1 for spacing, and n for repeat.

Here's what it means. “align-regexp” lets you align a region by a regex in complex ways.

  • The regex .* \([0-9,]+\).* matches a whole line (you can add ^ at the beginning and $ at the end if you like, but it is not necessary). The pattern \([0-9,]+\) captures the number part.
  • At the prompt “Parenthesis group to modify (justify if negative):”, we answer “-1”, because we want the first captured group to be used for alignment, and we want it to be justified to the right (meaning, align to the right of the text captured by our pattern).
  • At the prompt “Amount of spacing (or column if negative): ”, we answer 1.
  • At “Repeat throughout line?” we answer “n”.
  • 【Ctrl+u】 is necessary for “align-regexp” to prompt you for the various parameters (though “align-regexp”'s inline doc does not mention it).
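
The same answers can also be passed to “align-regexp” from Emacs Lisp. The following is only a minimal sketch of that: the function name my-align-numbers is made up for illustration, and the arguments simply mirror the prompts described above. Note that backslashes must be doubled inside a Lisp string.

```elisp
;; Sketch only: `my-align-numbers' is a hypothetical name, not part of Emacs.
(defun my-align-numbers (beg end)
  "Align the number column in the region between BEG and END."
  (interactive "r")
  ;; Arguments are: BEG END REGEXP GROUP SPACING REPEAT.
  ;; GROUP -1   = use group 1, and justify (the negative sign).
  ;; SPACING 1  = the answer to the spacing prompt.
  ;; REPEAT nil = the answer \"n\" to \"Repeat throughout line?\".
  (align-regexp beg end ".* \\([0-9,]+\\).*" -1 1 nil))
```

With the list selected, M-x my-align-numbers should then behave like the interactive call with 【Ctrl+u】.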

The result is this:

California             423,970 km²
Taiwan                  36,008 km²
Japan                  377,944 km²
Germany                357,021 km²
Iraq                   438,317 km²
Iran                 1,648,195 km²
Korea (North+South)    219,140 km²
Mexico               1,964,375 km²

Using “sort-regexp-fields” and “sort-columns”

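To sort it, there are 2 methods. One is using “sort-regexp-fields”. It prompts for a regex that matches each record, and then for the key within the record. For the record, use ^.*\([0-9 ,]\{9\}\) km²$ and for the key, use \1 (the captured group, which is the 9-character field holding the right-justified number).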
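
As a hedged sketch, here is how the “sort-regexp-fields” call might look in Emacs Lisp. The function name my-sort-by-area is hypothetical; the key \1 is the first captured group of the record regex. It assumes the text is already aligned, so that each number plus its leading spaces fills 9 characters.

```elisp
;; Sketch only: `my-sort-by-area' is a hypothetical name, not part of Emacs.
(defun my-sort-by-area (beg end)
  "Sort lines between BEG and END by the 9-character field before \" km²\"."
  (interactive "r")
  ;; Arguments are: REVERSE RECORD-REGEXP KEY-REGEXP BEG END.
  ;; Each record is one whole line; "\\1" selects the captured group as key.
  ;; Keys are compared as strings, which works here because the numbers are
  ;; right-justified and space sorts before the digits.
  (sort-regexp-fields nil "^.*\\([0-9 ,]\\{9\\}\\) km²$" "\\1" beg end))
```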

Another method is to simply use “sort-columns”. This command sorts lines using a vertical column of text as the sort key. The column is specified by the positions of the mark and the cursor. So, place the cursor at the upper right, set the mark, then move to the lower left of our numbers, like this:

California             423,970 km²▮
Taiwan                  36,008 km²
Japan                  377,944 km²
Germany                357,021 km²
Iraq                   438,317 km²
Iran                 1,648,195 km²
Korea (North+South)    219,140 km²
Mexico              ▮1,964,375 km²

Then call “sort-columns”. We get our desired result:

Taiwan                  36,008 km²
Korea (North+South)    219,140 km²
Germany                357,021 km²
Japan                  377,944 km²
California             423,970 km²
Iraq                   438,317 km²
Iran                 1,648,195 km²
Mexico               1,964,375 km²
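
For completeness, “sort-columns” can also be called from Emacs Lisp, where the two corners of the column are given as buffer positions. The sketch below is an illustration under assumptions: my-sort-by-columns and its column arguments are hypothetical, END is assumed to lie on the last line to be sorted, and the text must contain no tab characters.

```elisp
;; Sketch only: `my-sort-by-columns' is a hypothetical helper, not part of Emacs.
(defun my-sort-by-columns (beg end col-start col-end)
  "Sort lines from BEG to END, using columns COL-START to COL-END as key.
END is assumed to be on the last line to sort."
  (let (key-beg key-end)
    (save-excursion
      ;; One corner of the column: COL-START on the first line.
      (goto-char beg)
      (move-to-column col-start)
      (setq key-beg (point))
      ;; The opposite corner: COL-END on the last line.
      (goto-char end)
      (move-to-column col-end)
      (setq key-end (point)))
    ;; Arguments are: REVERSE BEG END.
    (sort-columns nil key-beg key-end)))
```

In the aligned table as printed here, the numbers appear to span columns 21 to 30 (counting from 0), so a call might look like (my-sort-by-columns (region-beginning) (region-end) 21 30).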

All these commands, {“align-regexp”, “sort-regexp-fields”, “sort-columns”}, will be quite useful when you need them. (Big thanks to Jon Snader and “jm” for the excellent solutions.)

Note: there are also these simpler commands, if you don't know them already:

  • “sort-fields”
  • “sort-lines”
  • “sort-numeric-fields”
  • “sort-pages”
  • “sort-paragraphs”

For working with text columns, “*rectangle*” commands are very useful. See: Emacs: Manipulate Column Text, string-rectangle, ASCII-Art.
