2011-07-17

Little Parser Problem Challenge: Matching Pairs Validation

Permanent URL with updates: http://xahlee.org/comp/validate_matching_brackets.html

Lisp, Python, Perl, Ruby Code to Validate Matching Brackets

Xah Lee, 2011-07-21

This is a preliminary report on scripts in several languages that validate matching brackets.

Problem Description

The problem is to write a script that checks a directory of text files (including all subdirectories) and reports any file that has mismatched brackets.

  • The files will be UTF-8 encoded (unix style line endings).
  • If a file has mismatched brackets, the script displays the file name, and the line number and column number of the first or last instance where a mismatched bracket occurs (or just the char position, as in emacs's “point”). Exactly which position counts as the “first” or “last” doesn't matter much, as long as it reports a char that breaks the nesting of matching pairs.
  • The matching pairs are all single Unicode chars. They are these and nothing else: () {} [] “” ‹› «» 【】 〈〉 《》 「」 『』. Note that the ‘single curly quote’ is not considered a matching pair here.
  • Your script must be standalone. It must not use parser tools, but it can call any lib that's part of the standard distribution of your lang.

Here are some examples of mismatched brackets: ([)], (“[[”), ((, 】, etc. (And yes, the brackets may be nested. There is usually text between these chars.)
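
The rules above suggest a stack-based check. The following is a minimal Python 3 sketch, not one of the submitted solutions: the names PAIRS, first_mismatch and check_tree are made up for illustration, and it assumes every file under the directory is valid UTF-8 text. For an unclosed opener it reports the innermost one still pending at end of file, which is one of several positions the problem statement allows.

```python
import os

# Opening bracket -> closing bracket, exactly the pairs listed in the problem.
PAIRS = {
    "(": ")", "{": "}", "[": "]", "“": "”", "‹": "›", "«": "»",
    "【": "】", "〈": "〉", "《": "》", "「": "」", "『": "』",
}
CLOSERS = set(PAIRS.values())


def first_mismatch(text):
    """Return (line, column, char) of the first offending bracket, or None.

    Line and column are 1-based. An unclosed opener is reported at the
    position of the innermost opener still pending at end of text.
    """
    stack = []  # entries: (opener, line, column)
    line, col = 1, 0
    for ch in text:
        if ch == "\n":
            line, col = line + 1, 0
            continue
        col += 1
        if ch in PAIRS:
            stack.append((ch, line, col))
        elif ch in CLOSERS:
            # A closer must match the most recently opened bracket.
            if not stack or PAIRS[stack[-1][0]] != ch:
                return (line, col, ch)
            stack.pop()
    if stack:
        opener, oline, ocol = stack[-1]
        return (oline, ocol, opener)
    return None


def check_tree(root):
    """Yield (path, line, column, char) for each file under root with a mismatch."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Assumption: every file is UTF-8 text with unix line endings.
            with open(path, encoding="utf-8", newline="\n") as f:
                result = first_mismatch(f.read())
            if result is not None:
                yield (path, *result)
```

With the examples above, first_mismatch("([)]") returns (1, 3, ")") and first_mismatch("((") returns (1, 2, "("). Text with properly nested pairs returns None.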

I'll be writing an emacs lisp solution and will post it in 2 days. I welcome implementations in other langs, in particular Perl, Python, PHP, Ruby, Tcl, Lua, Haskell, OCaml. I'll also be able to eval Common Lisp (clisp), Scheme Lisp (scsh), and Java. Other langs such as Clojure, Scala, C, C++, or any others are all welcome, but I won't be able to eval them. A JavaScript implementation will be very interesting too, but please indicate which command line version to use and where to install it from.

I hope you'll find this an interesting “challenge”. This is a parsing problem. I haven't studied parsers except some Wikipedia reading, so my solution will probably be naive. I hope to see and learn from your solution too.

I hope you'll participate. Just post your solution here. Thanks.

Solutions

Emacs Lisp

For a detailed explanation, see “Emacs Lisp: Batch Script to Validate Matching Brackets”.

Python

This report is incomplete. So far Raymond Hettinger's Python 3 code is the only working code. None of the following works on my machine.

For the original post of this problem and the discussion, see “a little parsing challenge ☺” (2011-07-17) at groups.google.com.

Thanks to the many who have written code and made helpful comments. I may come back to clean this up in the coming weeks. If you can correct one of the following programs, please comment.

Pending Solutions

Python

Ruby

Perl

Common Lisp