Emacs lisp Lesson: Hash Table

Emacs lisp Lesson: Hash Table

Xah Lee, 2008-01

This page shows you how to use hash table in emacs lisp. If you don't know elisp, first take a gander at Emacs Lisp Basics.

(for HTML version with colors and links, see:
http://xahlee.org/emacs/elisp_hash_table.html
)

---------------------------------------------
What Are Hash Tables

Often, you need to have a list with elements being key-value pairs. For example, the key may be a person's name, and the value can be their age. Or, in a more complex case, the key can be the person's ID number, and the value can be a data of name, age, sex, phone number, address, etc. You might have thousands of such entries, and you want to be able to know that, given a ID, find out whether it exists in your data. You also want to be able to add or delete, modify entries in the data.

From a interface point of view, such a keyed list looks like this: “((key1 value1) (key2 value2) (key3 value3) ...)”. If you want to know a particular key exist, you can write a loop thru the list to test if any first element of a element matches your given key. Similarly, you can add a entry by appending to the list, or delete and modify the list using your language's list-manipulation operators or functions. All this functionality is provided in elisp, called Association List.

Reference: Elisp Manual: Association-Lists.

However, this method will be too slow to the point of unusable if your have thousands of entries and frequently need to check, add, delete, or modify the data. This is because most languages are not smart enough to analyze your use of list to determine how best to compile it in the language. The solution provided in most high-level languages of today is to provide a particular data type (called a hash table), so that when you want to use a list of huge number of pairs with fast lookup, you use that data type.

For example, here's how hash table looks like in some high-level languages:

# perl
%mydata = ("mary"=> 4, "jane"=> 5, "vicky"=>7);

# PHP
$mydata = array("mary"=> 4, "jane"=> 5, "vicky"=> 7);

# Python
mydata = {"mary":4, "jane":5, "vicky":7}

(see Keyed List in Python and Perl, PHP: Keyed List) Similarly, elisp provides a hash table datatype.

Note: Hash table is also known as hash, keyed-list, dictionary, map. It's called hash table for historical reasons. Basically, a technique to create fast-lookup of a data, is by using a so-called hash-function, that converts the key into a unique (randomish) number. So that, instead of going thru the huge list to check every item, you just check one particular memory address to see it exists. Such a function is called “hash-function” because its key-conversion operation is analogous to processing a type of food dish made by disorderly and irreversibly chopping up and mixing up veggies or meat. (See Hash function↗, Hash (food)↗, Hash browns↗. )

(If your data needs more entries than that, the proper solution is to use a Relational Database↗.)

---------------------------------------------
Hash Table In Elisp

To create a hash table, use “(make-hash-table :test 'equal)”.

Here's a complete code that shows how to creat a hash table, then add, change, remove, entries:

(let (myhash val)

;; create a hash table
(setq myhash (make-hash-table :test 'equal))

;; add entries
(puthash "mary" "19" myhash)
(puthash "jane" "20" myhash)
(puthash "carrie" "17" myhash)
(puthash "liz" "21" myhash)

;; modify a entry's value
(puthash "jane" "16" myhash)

;; remove a entry
(remhash "liz" myhash)

;; get a entry's value
(setq val (gethash "jane" myhash))

(message val) ; print it
)

In the above example, both the entry's keys are datatype “strings”. However, elisp's hash table can have any lisp object as the key or value. When we create a hash table, we used “:test 'equal” to let elisp know that the function “equal” should be used when testing whether a key exists. If your keys are other lisp data type, you need to specify which equality functon emacs should use to test if a key exist. The available value for “:test” are “'eql”, “'eq”, “'equal”. (You can also define your own equality test. See elisp manual for detail. (linked at bottom of this page))

To clear all entries in a hash table, use “(clrhash «myhash»)”.

-------------------
Check If A Key Exists

To check if a key exists, use the “(gethash «mykey» «myhash»)”. If it exists, its value is returned, else “nil” is returned. Example:

(let (myhash)
(setq myhash (make-hash-table :test 'equal))
(puthash 'mary "19" myhash)

(gethash 'vicky myhash) ; returns nil
)

“gethash” has a optional third parameter “default”. If the key does not exist, “default” is returned.

-------------------
Apply A Function To All Key-Value Pairs

To apply a function to all entries in a hash table, use “(maphash «myfun» «myhashtable»)”. The function “myfun” must take 2 arguments, key and value. For example, if you want a list of all keys, you can write a function that puts the key into a list, then print out the list. We show several examples below.

-------------------
Get All Keys

To get all keys into a list, we can use a maphash with a function that pulls the keys into a list. Here's a example:

;; example of getting all keys from a hash table
(let (myhash allkeys)

;; add items to the hash
(setq myhash (make-hash-table :test 'equal))
(puthash "mary" "19" myhash)
(puthash "jane" "20" myhash)
(puthash "carrie" "17" myhash)
(puthash "liz" "21" myhash)

;; set to a empty list
(setq allkeys '())

(defun pullkeys (kk vv)
"prepend the key “kk” to the list “allkeys”"
(setq allkeys (cons kk allkeys))
)

;; get all the keys
(maphash 'pullkeys myhash)

;; sort and print it out
(sort allkeys 'string<)
)

In the above example, “pullkeys” is used only once. Since it is used only once, we don't need to define it separately, but just use lambda construct. So, the maphash line can be just like this: “(maphash (lambda (kk vv) (setq allkeys (cons kk allkeys))) myhash)”.

You can define a “keys” function that returns all keys of a given hash table. Like this:

(defun keys (hashtable)
"Return all keys in hashtable."
(let (allkeys)
(maphash (lambda (kk vv) (setq allkeys (cons kk allkeys))) hashtable)
allkeys
)
)

With this function, you can use it like this: “(keys myhash)”.

-------------------
Get All Values

Getting all values is just like getting all keys above. Here's a function that does it.

(defun hash-values (hashtable)
"Return all keys in hashtable."
(let (allvals)
(maphash (lambda (kk vv) (setq allvals (cons vv allvals))) hashtable)
allvals
)
)

-------------------
Print All Entries Sorted By Key

Here's a example how you can print out all entries in a hash table by sorted key.

First, we define a function that turns a hash table into a list.

(defun hash-to-list (hashtable)
"Return a list that represent the hashtable."
(let (mylist)
(maphash (lambda (kk vv) (setq mylist (cons (list kk vv) mylist))) hashtable)
mylist
)
)

;; example of turning a hash table into list then sort it
(let (myhash mylist)

;; add items to the hash
(setq myhash (make-hash-table :test 'equal))
(puthash "mary" "19" myhash)
(puthash "jane" "20" myhash)
(puthash "carrie" "17" myhash)
(puthash "liz" "21" myhash)

;; get the hash table into a list
(setq mylist (hash-to-list myhash))

;; sort and print it out
(sort mylist (lambda (a b) (string< (car a) (car b))))
)

;; prints (("carrie" "17") ("jane" "20") ("liz" "21") ("mary" "19"))


-------------------
Hash Table With Lisp Objects

Elisp's hash table's key or value can be any lisp object. Also, you can define your own equality test, as well as providing few other parameters when creating a hash table. For detail, see elisp manual.

Reference: Elisp Manual: Hash-Tables.

Emacs is beautiful!

Xah
xah@xahlee.org
∑ http://xahlee.org/

Popular posts from this blog

11 Years of Writing About Emacs

does md5 creates more randomness?

Google Code shutting down, future of ErgoEmacs