Emacs: Form Feed and Source Code Section Paging Commands

Perm url with updates: http://xahlee.org/emacs/emacs_form_feed_section_paging.html

Xah Lee, 2011-04-25

This page discusses some issues about the Form Feed character (^L) used in emacs, and commands to page thru source code sections.

Problem of “forward-paragraph” and “backward-paragraph”

Emacs has commands “forward-paragraph” 【Ctrl+↓】 and “backward-paragraph” 【Ctrl+↑】. The problem with these is that the definition of “paragraph” in emacs is not fixed; it depends on what mode you are in. For example, in “text-mode”, a paragraph is basically a text block separated by blank lines. But in “html-mode”, it moves in some weird way. (Try viewing the source of this page, open it in emacs, and move.) Technically, this is because the notion of “paragraph” in emacs is mode dependent: each major mode can set the variables “paragraph-start” and “paragraph-separate” its own way. (A related mode-dependent piece is emacs's syntax table, an elementary system that categorizes characters into a list of semantic categories. From my experience in the past few years, it is not a good system, and i think emacs could do without it.)
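
To see the mode dependence concretely, here is a minimal sketch that returns the paragraph variables a given mode sets. The helper name “my-paragraph-settings” is made up for this example; “paragraph-start” and “paragraph-separate” are the standard variables that the paragraph commands consult.

```elisp
;; Sketch only: `my-paragraph-settings' is a hypothetical helper name.
(defun my-paragraph-settings (mode)
  "Return a list (paragraph-start paragraph-separate) as set up by MODE.
MODE is a major mode function symbol, such as `text-mode'."
  (with-temp-buffer
    ;; Turn on the mode in a scratch buffer so its settings take effect.
    (funcall mode)
    (list paragraph-start paragraph-separate)))

;; Evaluate each form and compare the regexps the two modes use.
(my-paragraph-settings 'text-mode)
(my-paragraph-settings 'html-mode)
```

If the two results differ, that difference is why the same key moves the cursor differently in the two modes.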

“forward-block” and “backward-block”

I much prefer a key that always pages by text block, as defined by newline sequences. So i wrote the following code:

(defun forward-block ()
  "Move cursor forward to next occurrence of double newline char.
In most major modes, this is the same as `forward-paragraph', however,
this function behaves the same in any mode.
`forward-paragraph' is mode dependent, because each mode can
define “paragraph” differently."
  (interactive) ; needed so the function can be bound to a key
  (skip-chars-forward "\n")
  ;; A block ends at a newline, optional blanks, then another newline.
  (when (not (search-forward-regexp "\n[[:blank:]]*\n" nil t))
    (goto-char (point-max)) ) )

(defun backward-block ()
  "Move cursor backward to previous occurrence of double newline char.
See: `forward-block'"
  (interactive) ; needed so the function can be bound to a key
  (skip-chars-backward "\n")
  (when (not (search-backward-regexp "\n[[:blank:]]*\n" nil t))
    (goto-char (point-min)) ) )

You can assign a key like this:

(global-set-key (kbd "<C-up>") 'backward-block) ; Ctrl+↑
(global-set-key (kbd "<C-down>") 'forward-block) ; Ctrl+↓

This way, when you press a key to page by “paragraph”, the cursor movement is expected and predictable.
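
A small sketch of what the block regex matches. It uses the same pattern as “forward-block”; the function name “my-block-regex-demo” and the sample text are made up for illustration. Note that a line holding only spaces or tabs also counts as a block separator.

```elisp
;; Sketch only: `my-block-regex-demo' is a hypothetical name.
(defun my-block-regex-demo ()
  "Return the text of the line where the first block search lands."
  (with-temp-buffer
    ;; The third line contains only two spaces and a tab.
    (insert "first block\nstill first\n  \t\nsecond block\n\nthird")
    (goto-char (point-min))
    ;; Same regex as in `forward-block': newline, optional blanks, newline.
    (re-search-forward "\n[[:blank:]]*\n" nil t)
    (buffer-substring (point) (line-end-position))))

(my-block-regex-demo) ; returns "second block"
```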

Problems of Emacs's “forward-page” and “backward-page”

In Emacs Lisp convention, the Form Feed character (ASCII 12) is an indicator of a code section in source code. There are builtin commands “backward-page”【Ctrl+x [】 and “forward-page”【Ctrl+x ]】 to page thru sections. In many emacs lisp source code files, you'll see lots of these chars, displayed as “^L”.

This convention has several problems:

  • It is a convention from the 1970s or 1980s, and it's outdated. The Form Feed char was used as a page break marker in printer protocols. Today's printers don't use that char anymore, the vast majority of programmers don't know what ^L is, and hardly any language community other than lisp uses it.
  • The ^L is hard to read. It should be displayed as a line, like html's “hr” tag. There is an elisp package (PrettyControlL) that makes emacs display that char as a line. (See: Emacs Form Feed (^L) Display Suggestion and Tips) However, it is not bundled in GNU Emacs.
  • Even with the PrettyControlL package, it causes problems with emacs's whitespace-mode.

For the compatibility problem with whitespace-mode, see:

Newsgroups: gnu.emacs.help, comp.emacs
From: Xah Lee
Date: Tue, 29 Mar 2011 16:26:20 -0700 (PDT)
Local: Tues, Mar 29 2011 4:26 pm
Subject: compatibility problem of whitespace-mode & PrettyControlL?
Source groups.google.com

So, in the end, using the form feed as section indicator causes a few problems. One is technical. If GNU Emacs displayed it as a line by default, that would solve half of the problem. It would certainly solve the compatibility problem with whitespace-mode. But i'm not sure GNU Emacs people are willing to fix this. It's a pity, because a single character as a universal markup for source code section breaks is quite elegant and useful. (You can type the character by pressing 【Ctrl+q Ctrl+l】. See: Emacs Line Return and Windows, Unix, Mac, All That ^M ^J ^L) The second problem is the outdated convention. Today, source code shies away from using any invisible glyphs other than line break; even tab chars are frowned upon. (The reason for this evolution is probably that invisible glyphs are not intuitive to use.)
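
On displaying the form feed as a line: a minimal sketch using a buffer-local display table. This is not how PrettyControlL is implemented (that is not shown here); the helper name “my-show-form-feed-as-line” and the width of 40 are assumptions made for this example. Since it changes the buffer's display table, it may run into the same kind of conflict with whitespace-mode described above.

```elisp
(require 'disp-table) ; provides `make-display-table'

;; Sketch only: `my-show-form-feed-as-line' is a hypothetical helper name,
;; and the width 40 is an arbitrary choice.
(defun my-show-form-feed-as-line ()
  "Display the form feed char (^L) as a row of ─ in the current buffer."
  (interactive)
  ;; Reuse the buffer's display table if it has one, else create one.
  (let ((table (or buffer-display-table
                   (setq buffer-display-table (make-display-table)))))
    ;; Map the form feed char to a vector of 40 box-drawing glyphs.
    (aset table ?\f (vconcat (make-list 40 (make-glyph-code ?─))))))

;; Possible usage: turn it on for emacs lisp buffers.
(add-hook 'emacs-lisp-mode-hook #'my-show-form-feed-as-line)
```

Only the display changes; the buffer text still contains the single ^L char.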

So, overall, i think i'm going to move away from using form feed char in my elisp source code.

Unfortunately, there's no good or standard replacement. Usually, people just add a bunch of hyphens like this “-----” or equal signs “=====” or underscores “_____” or just a sequence of the comment char, whatever it is in the language. These are hackish means. The problem with these approaches is that you cannot have a key to do paging, because the page marker varies.

So, a workaround i came up with is to use the unicode section sign “§” together with a bunch of dashes “-----”. The section sign is meant as a proper marker to indicate a code section. The line is meant to be a visual display, for lack of a better alternative at the moment. (Better would be to have the editor automatically display the section char with a line after it.) For drawing lines, a more proper unicode char than the dash is BOX DRAWINGS LIGHT HORIZONTAL (U+2500), like this “─────”. The problem is that it's harder to input. So, my section break in elisp looks like this:

; § ────────────────────
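
A possible alternative, if you'd rather keep the builtin “forward-page” and “backward-page”: they search for the regex in the variable “page-delimiter”, so that variable can be set buffer-locally to also match the § line. A minimal sketch; the helper name “my-use-section-sign-as-page-delimiter” and the exact regex are assumptions for this example.

```elisp
;; Sketch only: `my-use-section-sign-as-page-delimiter' is a hypothetical name.
(defun my-use-section-sign-as-page-delimiter ()
  "Make the paging commands stop at form feed or at a “; §” comment line."
  ;; At line start, match either a form feed char, or one or more
  ;; semicolons, optional spaces, then the section sign.
  (setq-local page-delimiter "^\\(?:\f\\|;+ *§\\)"))

;; Possible usage: apply it in emacs lisp buffers.
(add-hook 'emacs-lisp-mode-hook #'my-use-section-sign-as-page-delimiter)
```

With this, 【Ctrl+x ]】 and 【Ctrl+x [】 page by § lines in those buffers, and files that still use ^L keep working.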

“forward-section” and “backward-section”

So i wrote “forward-section” and “backward-section” to page by the “§” markup.

(defun forward-section ()
  "Move cursor forward to next occurrence of the SECTION SIGN § char (unicode 167)."
  (interactive) ; needed so the function can be bound to a key
  (when (not (search-forward-regexp "§" nil t))
    (goto-char (point-max)) ) )

(defun backward-section ()
  "Move cursor backward to previous occurrence of the SECTION SIGN § char (unicode 167)."
  (interactive) ; needed so the function can be bound to a key
  (when (not (search-backward-regexp "§" nil t))
    (goto-char (point-min)) ) )

Here are the keys i assigned for them:

(global-set-key (kbd "<s-prior>") 'backward-section) ; Win+PageUp
(global-set-key (kbd "<s-next>") 'forward-section) ; Win+PageDown

(global-set-key (kbd "<C-M-prior>") 'backward-page) ; Ctrl+Alt+PageUp
(global-set-key (kbd "<C-M-next>") 'forward-page) ; Ctrl+Alt+PageDown

For inputting the § char, press 【Ctrl+x 8 S】 (See: Emacs and Unicode Tips.). Or, you can define:

(define-key key-translation-map (kbd "H-s") (kbd "§")) ; SECTION SIGN. Hyper+s

Or, you can buy my Emacs Unicode Math Symbols Input Mode (xmsi-mode).

See also: Emacs: How to define Hyper & Super Keys.

(thanks to Miura Masahiro for the tip on inputting §.)
