Converting XML to Lisp via Perl

Gene Michael Stover

created Monday, 5 May 2003
updated Tuesday, 6 May 2003

Copyright © 2003 by Gene Michael Stover. All rights reserved. Permission to copy, transmit, store, & view this document unmodified & in its entirety is granted.

There are things in this world that shouldn't exist but that irresistibly attract us & demand our morbid attention: elephantiasis & toe jam are two examples. Unright though they are, such curios from Doctor Caligari's Cabinet are worthy of study. Or maybe they could be good for a five-minute diversion from work. So without further delay, I present a program to convert XML to Lisp ... via Perl.

The program is xml-to-lisp. It requires the XML::Parser module for Perl. XML::Parser is distributed on CPAN rather than with core Perl, though some Perl distributions bundle it.

It converts each XML node into a list whose first item is the element type, whose second item is an association list of attributes, & whose third & following items are either strings or other XML nodes. For example, with this XML input:

<noodle composition="rice" disposition="wet">yowza!</noodle>

the resulting Lisp expression would be (NOODLE ((COMPOSITION . "rice") (DISPOSITION . "wet")) "yowza!").

Here's another one. The outer XML element has no attributes, & it contains some other XML elements.

<bowl>Contains three things:
  <grain ofwhat="rice">johnathan</grain>
  <grain ofwhat="sand">swift</grain>
and
  <grain ofwhat="boulder">sea gull</grain>
</bowl>

That snippet of XML converts to Lisp as

(BOWL ()
  "Contains three things:\n  "
  (GRAIN ((OFWHAT . "rice")) "johnathan")
  "\n  "
  (GRAIN ((OFWHAT . "sand")) "swift")
  "\nand\n  "
  (GRAIN ((OFWHAT . "boulder")) "sea gull")
  "\n")

Notice that newlines & other white-space are preserved. I've shown newlines as "\n" in that Lisp form only to make them visible; in the real output they are literal newlines inside the strings.
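The mapping described above can be sketched in a few lines of Perl. This is only an illustration, not the code of xml-to-lisp itself: the node layout (an array reference holding the name, a list of attribute pairs, & then the children) & the names lisp_string & node_to_lisp are assumptions made up for the example. It also assumes that a backslash & a double quote are the only characters that need escaping inside a Lisp string.

```perl
use strict;
use warnings;

# Write a Perl string as a Lisp string literal. Only backslash and
# double quote are escaped; newlines are kept as literal newlines.
sub lisp_string {
    my ($text) = @_;
    $text =~ s/(["\\])/\\$1/g;
    return qq("$text");
}

# Hypothetical node layout: [ $name, [ [$attr, $value], ... ], @children ]
# where each child is either a plain string or another node.
sub node_to_lisp {
    my ($node) = @_;
    my ($name, $attrs, @children) = @$node;

    # Each attribute becomes a dotted pair in the association list.
    my $alist = join ' ',
        map { sprintf('(%s . %s)', uc($_->[0]), lisp_string($_->[1])) } @$attrs;

    my @items = (uc($name), "($alist)");
    push @items,
        map { ref($_) ? node_to_lisp($_) : lisp_string($_) } @children;
    return '(' . join(' ', @items) . ')';
}

my $noodle = [ 'noodle',
               [ [ composition => 'rice' ], [ disposition => 'wet' ] ],
               'yowza!' ];
print node_to_lisp($noodle), "\n";
```

An element with no attributes gets an empty association list, (), which is why the BOWL form above has () as its second item.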

So how did this program come to exist? I wrote it by accident; I swear.

A few weeks ago at work, I was asked to write a program to mangle a new data file format. The new type of file was encoded in XML. I had already played with XML a little, but not much & none in Perl. So I did a little exploratory programming with Perl's XML parser.

One of the programs I wrote while playing around reconstructed the input file after parsing all tokens, nodes, attributes, & cdata. As I looked at that program & wondered what I would do next, I realized that, with ever so slight changes to the printing functions, the program would print lists. It would convert XML to Lisp.
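To give a rough idea of what "printing functions" means here, this is a minimal sketch of an event-driven converter built on XML::Parser's Start, Char, & End handlers. It is not the actual xml-to-lisp source. The handler names, the $out & $pending variables, & the choice to buffer character data until the next tag are all assumptions made for the example; the buffering is there because the parser may deliver one run of text in several pieces.

```perl
use strict;
use warnings;
use XML::Parser;

my $out     = '';   # Lisp text built up so far
my $pending = '';   # character data not yet emitted

sub lisp_string {
    my ($text) = @_;
    $text =~ s/(["\\])/\\$1/g;      # escape backslash and double quote
    return qq("$text");
}

# Emit any buffered character data as a single Lisp string.
sub flush_text {
    return unless length $pending;
    $out .= ' ' . lisp_string($pending);
    $pending = '';
}

# Start tag: open a list, then write the element name and attribute alist.
sub on_start {
    my ($expat, $element, @attrs) = @_;
    flush_text();
    my @pairs;
    while (my ($name, $value) = splice(@attrs, 0, 2)) {
        push @pairs, sprintf('(%s . %s)', uc($name), lisp_string($value));
    }
    $out .= ' ' if length $out;     # separate a child from what precedes it
    $out .= '(' . uc($element) . ' (' . join(' ', @pairs) . ')';
}

# Character data: just collect it until the next tag.
sub on_char { my ($expat, $text) = @_; $pending .= $text; }

# End tag: write any trailing text, then close the list.
sub on_end { flush_text(); $out .= ')'; }

my $parser = XML::Parser->new(
    Handlers => { Start => \&on_start, Char => \&on_char, End => \&on_end },
);
$parser->parse('<noodle composition="rice" disposition="wet">yowza!</noodle>');
print "$out\n";
```

Swapping the three handlers for ones that print angle brackets instead of parentheses would give back something close to the original XML, which is the "ever so slight change" in the other direction.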