Updating Atom/RSS with Elisp
Xah Lee, 2009-01-21
This page describes a real world example of using emacs to update a web syndication (RSS/Atom) page. If you don't know elisp, see: Emacs Lisp Basics.
I want to write a command so that, when invoked, the currently selected text is added as an entry in an RSS/Atom file.
This lesson shows you how to write a command that grabs the region text, switches buffers, searches for a string to locate the insertion position, inserts the text, and updates the date field in a file.
I run a website “xahlee.org”. The site is hosted by a website service provider. Typically, i create or edit my site on local disk. Then i upload by switching to a shell with “Alt+a sh” and typing “trsync ”, which is automatically expanded to:
rsync -z -av --exclude="*~" --exclude=".DS_Store" --delete --rsh="ssh -l xyz" ~/web/ email@example.com:~/
This will update my website on the server.
You can define your own keyboard shortcut, alias, and abbreviation like this:
(global-set-key (kbd "M-a") 'execute-extended-command) ;; easier typing
(defalias 'sh 'shell) ;; shorter command name

;; save typing
(define-abbrev-table 'global-abbrev-table '(
  ("trsync" "rsync -z -av --rsh=\"ssh -l xyz\" ~/web/ firstname.lastname@example.org:~/" nil 0)
  ))
One of my site's pages is a blog. I write the blog page and update my site daily using the above mechanism. But i also want to create an RSS feed so that readers don't have to keep visiting the blog just to see if there's a new entry. They can subscribe to the feed and use an RSS reader in the browser, or another service that notifies them of new entries or sends them email.
Basically, an Atom file is an XML file, like this:
<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:base="http://xahlee.org/Periodic_dosage_dir/">

 <title>Xah's Blog</title>
 <subtitle>Ethnology, Ethology, and Tech Geeking</subtitle>
 <link rel="self" href="http://xahlee.org/Periodic_dosage_dir/pd.xml"/>
 <link rel="alternate" href="http://xahlee.org/Periodic_dosage_dir/pd.html"/>
 <updated>2006-09-11T02:35:33-07:00</updated>
 <author>
   <name>Xah Lee</name>
   <uri>http://xahlee.org/</uri>
 </author>
 <id>http://xahlee.org/Periodic_dosage_dir/pd.html</id>
 <icon>http://xahlee.org/siteicon.png</icon>
 <rights>© 2006 Xah Lee</rights>

 <entry>
   <title>Batman thoughts</title>
   <id>tag:xahlee.org,2006-09-09:015218</id>
   <updated>2006-09-08T18:52:18-07:00</updated>
   <summary>Some notes after watching movie Batman.</summary>
   <content type="xhtml">
     <div xmlns="http://www.w3.org/1999/xhtml">
       <p>I watched Batman today ...</p>
       <!-- more xhtml here -->
     </div>
   </content>
   <link rel="alternate" href="pd.html"/>
 </entry>

</feed>
The file's header contains standard info such as: blog title, author info, copyright info, blog url, (unique) id for this blog. Then, the main body is made of several “entry”. Each entry has a title, id, timestamp, summary, perm link url, and full content (optional).
What i want my emacs script to do is grab the currently selected text and insert it as an entry in the Atom file, and also update the “updated” tag in the header with a new timestamp.
Here's the solution. The following code grabs the current text selection, inserts it as an entry in the Atom file at the right location, and updates the Atom file's “updated” tag with a new timestamp.
(defun make-pdxml-entry (begin end)
  "Insert current region as an Atom RSS entry to file “pd.xml”.
Detail: create a new Atom entry in pd.xml, with the current region
as its content, update the Atom file's update date."
  (interactive "r")
  (let ((meat (buffer-substring-no-properties begin end)))
    (find-file "~/web/Periodic_dosage_dir/pd.xml")
    ;; go to the line of the first existing entry
    (goto-char (point-min))
    (re-search-forward "<entry>" nil t)
    (move-beginning-of-line 1)
    (insert-atom-entry)
    ;; move to just after the new entry's opening div tag
    (re-search-backward "<div xmlns=\"http://www.w3.org/1999/xhtml\">" nil t)
    (re-search-forward ">" nil t)
    (insert meat)
    (update-pdxml-date)
    ;; go to the title placeholder of the new entry
    (find-file "~/web/Periodic_dosage_dir/pd.xml")
    (goto-char (point-min))
    (re-search-forward ">ttt" nil t)))
The above code works by first grabbing the current text selection and saving it to the variable “meat”. Then it opens the Atom file using “find-file”. It finds the location to insert a new entry by searching for the first “<entry>” and moving to the beginning of that line. Then it calls “insert-atom-entry” to insert a new entry template. Then it searches backward for the template's opening “<div>” tag and places the cursor just after it, which is where the content goes. The line “(insert meat)” inserts the selected text. Then “(update-pdxml-date)” is called to update the “updated” tag in the feed header. Finally, the file is opened again (because update-pdxml-date might have closed it), and the cursor is moved to the “ttt” placeholder in the new entry's title, for me to type a title or summary.
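One detail of the search step is worth a sketch. Because “re-search-forward” is called with “nil t”, a feed with no “<entry>” yet makes the search return nil silently, and the new entry then lands at the very top of the file. The helper below is hypothetical (the name “my-insert-before-first-entry” is not part of the original code) and only illustrates the locate-and-insert step with an explicit check, on a scratch buffer rather than the real file.

```elisp
;; Hypothetical helper, an assumption for illustration only.
(defun my-insert-before-first-entry (text)
  "Insert TEXT at the start of the line holding the first <entry>.
Works on the current buffer.  Return t if an <entry> was found,
nil (leaving the buffer unchanged) otherwise."
  (goto-char (point-min))
  (when (search-forward "<entry>" nil t)
    (beginning-of-line)
    (insert text)
    t))

;; Try it on a scratch copy of a tiny feed instead of the real file.
(with-temp-buffer
  (insert "<feed>\n <entry>old</entry>\n</feed>\n")
  (my-insert-before-first-entry " <entry>new</entry>\n")
  (buffer-string))
```

The return value lets a caller decide what to do with an empty feed, for example inserting before “</feed>” instead.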
This code can be improved in many ways. For example, right now it is hard-coded to update one specific Atom file. What if you have more than one Atom feed? Also, the title and summary tags are not automatically generated. What if you also want an RSS format? What if you want the feed automatically sent to the server? To fix these, you'll have to go into designing a general system for dealing with RSS, but right now it just works for me. When i need more flexibility, i can easily modify my code to adapt. This is the beauty of emacs.
The following are supplementary functions called by make-pdxml-entry.
(defun insert-atom-entry ()
  "Insert a blank Atom RSS entry template."
  (interactive)
  (insert (concat
" <entry>\n <title>ttt</title>\n <id>"
(format-time-string "tag:xahlee.org,%Y-%m-%d:%H%M%S" (current-time) 1)
"</id>\n <updated>"
(concat
 (format-time-string "%Y-%m-%dT%T")
 ((lambda (x) (concat (substring x 0 3) ":" (substring x 3 5))) (format-time-string "%z")) )
"</updated>
 <summary>ttt</summary>
 <content type=\"xhtml\">
 <div xmlns=\"http://www.w3.org/1999/xhtml\">
 </div>
 </content>
 <link rel=\"alternate\" href=\"http://xahlee.org/Periodic_dosage_dir/pd.html\"/>
 </entry>\n\n"
)))
Note that the Atom spec requires each entry to have a world-wide unique id string, and this string must be in URI format. There are several methods discussed on the web about how to generate such an id. The method i used is a combination of domain name and timestamp, adopted from an online suggestion. You can search the web using “atom, entry, id” for these suggestions.
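As a minimal sketch of the domain-plus-timestamp scheme, the id can be built by a small function of its own. The name “my-atom-tag-id” and its arguments are assumptions for illustration, not part of the original code. It formats the time in UTC, as the “1” argument to “format-time-string” does in “insert-atom-entry”, so the id does not depend on the local time zone.

```elisp
;; Hypothetical helper, an assumption for illustration only.
(defun my-atom-tag-id (domain &optional time)
  "Return a tag: URI built from DOMAIN and TIME (default: now), in UTC."
  ;; The domain is joined with `concat' rather than put into the format
  ;; string, because %s in `format-time-string' means seconds since
  ;; the epoch.
  (concat "tag:" domain ","
          (format-time-string "%Y-%m-%d:%H%M%S" time t)))

;; '(0 0) is the epoch in Emacs's (HIGH LOW) time format.
(my-atom-tag-id "xahlee.org" '(0 0))
;; → "tag:xahlee.org,1970-01-01:000000"
```

Two entries created within the same second would get the same id, which is acceptable for a hand-edited blog feed but not for bulk generation.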
(defun update-pdxml-date ()
  "Update the Atom RSS updated tag in pd.xml.\n
That is, the first occurrence of:
<updated>2006-10-10T22:58:42-07:00</updated>"
  (interactive)
  (find-file "~/web/Periodic_dosage_dir/pd.xml")
  (goto-char (point-min))
  (let (x1)
    (setq x1 (re-search-forward "<updated>" nil t))
    ;; 25 is the length of a timestamp like 2006-10-10T22:58:42-07:00
    (delete-region x1 (+ x1 25))
    (insert-date-time)))
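The function above deletes exactly 25 characters after “<updated>”, which is right as long as the old timestamp has the form 2006-10-10T22:58:42-07:00. It also signals an error if the file has no “<updated>” tag, because the search then returns nil. A more defensive variant, sketched below under the assumption that the element contains no nested tags, replaces whatever text sits between the opening and closing tag. The name “my-replace-first-updated” is hypothetical.

```elisp
;; Hypothetical helper, an assumption for illustration only.
(defun my-replace-first-updated (stamp)
  "Replace the text of the first <updated> element with STAMP.
Works on the current buffer.  Return t on success, nil if no
<updated>...</updated> pair is found."
  (save-excursion
    (goto-char (point-min))
    (when (re-search-forward "<updated>\\([^<]*\\)</updated>" nil t)
      ;; Replace only subexpression 1, literally and without case changes.
      (replace-match stamp t t nil 1)
      t)))

;; The old value here is 20 characters long, not 25.
(with-temp-buffer
  (insert "<updated>2006-09-11T02:35:33Z</updated>\n")
  (my-replace-first-updated "2009-01-21T10:00:00-08:00")
  (buffer-string))
```

Only the first match is touched, so the per-entry “updated” tags further down the feed stay as they are.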
(defun insert-date-time ()
  "Insert current date-time string."
  (interactive)
  (insert
   (concat
    (format-time-string "%Y-%m-%dT%T")
    ((lambda (x) (concat (substring x 0 3) ":" (substring x 3 5)))
     (format-time-string "%z")))))
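The lambda in the code above exists because “%z” yields an offset like -0700, while Atom timestamps need -07:00, so a colon is spliced in after the hour part. Here is the same idea as a function that returns the string instead of inserting it, which makes it easy to check with a fixed time. The name “my-rfc3339-timestamp” and its optional TIME and ZONE arguments are assumptions for illustration.

```elisp
;; Hypothetical helper, an assumption for illustration only.
(defun my-rfc3339-timestamp (&optional time zone)
  "Return TIME (default: now) as a string like 2006-10-10T22:58:42-07:00.
ZONE is passed on to `format-time-string'; t means UTC, nil means
local time."
  (let ((offset (format-time-string "%z" time zone)))   ; e.g. "-0700"
    (concat (format-time-string "%Y-%m-%dT%T" time zone)
            (substring offset 0 3)                      ; sign and hours
            ":"
            (substring offset 3 5))))                   ; minutes

;; '(0 0) is the epoch in Emacs's (HIGH LOW) time format.
(my-rfc3339-timestamp '(0 0) t)
;; → "1970-01-01T00:00:00+00:00"
```

The result is always 25 characters long, which is the number hard-coded in “update-pdxml-date”.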
With the above code, i write my blog html file as usual, then i select the region of text, then press “Alt+x make-pdxml-entry”, then i'm switched to the Atom file with the entry inserted. I edit the Title, Summary, and the perm url for that entry if any. Save file. Then i'm done, and can “Alt+x sh Enter trsync Enter”, then my web server is updated with blog and RSS.
Emacs is flexible!
For a simple intro to Atom, and links for Atom validation, tutorial, spec, sample Atom file, see: Intro to Atom.