
Topic: expat

The expat XML parser by James Clark is fairly small, written in C, runs on Unix or W32, and is, well, not very well documented. But boy, does it work great! It's also the meat of the Perl XML module. I'm using it in the open-source workflow toolkit (among other things).

The expat parser consists of two levels, like most parsers. The low level, xmltok, is an XML tokenizer, meaning that it simply looks at the file and tosses you all the pieces which have meaning in XML. The parser itself wraps a layer around that token stream and calls handlers that you write, one for each kind of event (a start tag, an end tag, a run of character data, and so on). For a description of how expat programs are written, see my little article on using expat.
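
Here is a minimal sketch of that handler layer. To keep it short it assumes Python's standard-library binding to expat (xml.parsers.expat) rather than the C API the rest of this page is about; the function name collect_events and the sample memo document are made up for illustration. You write the handlers, hand them to the parser, and the parser calls them as it walks the token stream.

```python
import xml.parsers.expat


def collect_events(xml_text):
    """Parse xml_text and return the handler calls expat made, in order."""
    events = []

    # Handlers that we write; the parser calls them as it finds each piece.
    def start_element(name, attrs):
        events.append(("start", name, dict(attrs)))

    def end_element(name):
        events.append(("end", name))

    def char_data(data):
        # Skip whitespace-only runs so the event list stays readable.
        if data.strip():
            events.append(("text", data))

    parser = xml.parsers.expat.ParserCreate()
    # Ask the binding to hand over contiguous text in one call where it can.
    parser.buffer_text = True
    parser.StartElementHandler = start_element
    parser.EndElementHandler = end_element
    parser.CharacterDataHandler = char_data
    parser.Parse(xml_text, True)  # True marks this as the final chunk
    return events


if __name__ == "__main__":
    for event in collect_events('<memo id="1"><to>Ann</to></memo>'):
        print(event)
```

In C the same pattern is expressed with XML_ParserCreate, XML_SetElementHandler, XML_SetCharacterDataHandler and XML_Parse.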

Under Windows, the parser and tokenizer are wrapped up in two DLLs; under Unix, you could probably make a .so or something, but it's just as quick to link statically. The libraries are not horribly large.

The expat documentation is basically nonexistent. There is sketchy usage information in the header files, and there is a sample program which prints an indented tag structure. You can also study the code of the wf utility, which checks XML for well-formedness, but I haven't yet. I have written some command-line XML tools which are elaborations of the sample application, and what I learned along the way is the content of the how-to I mentioned above. If and when I have motivation to use the tokenizer, I will document that level similarly. In the meantime, I am in the process of documenting the expat API -- it's slow going and I have a lot to do, so it's not done yet. I'll keep plugging.
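
To give an idea of what a program that prints an indented tag structure looks like, here is a rough sketch of the same idea. It is not the sample program from the distribution; it assumes Python's standard-library expat binding (xml.parsers.expat), and the function name outline and the two-space indent are my own choices for illustration.

```python
import xml.parsers.expat


def outline(xml_text, indent="  "):
    """Return the element names in xml_text, one per line, indented by depth."""
    lines = []
    depth = 0

    def start_element(name, attrs):
        nonlocal depth
        lines.append(indent * depth + name)
        depth += 1  # children of this element sit one level deeper

    def end_element(name):
        nonlocal depth
        depth -= 1  # back out to the parent's level

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start_element
    parser.EndElementHandler = end_element
    parser.Parse(xml_text, True)
    return "\n".join(lines)


if __name__ == "__main__":
    print(outline("<a><b><c/></b><d/></a>"))
```

All the state the handlers need is a depth counter: bump it on every start tag, drop it on every end tag.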

The expat parser is not a validating XML processor. What this means is that you can use it to write an XML processor, and it will complain about XML that is not well-formed, but it has no smarts whatsoever about how to use a DTD to make sure that the XML it is reading conforms to the DTD. Which is fine -- there are plenty of times where you simply don't have the DTD formalized, and XML is still useful. If and when I write a validating XML processor on top of expat, I'll link to it. (Might actually happen, and I'd get paid for it. I could tell you the details, but then I'd have to kill you.)
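
The difference is easy to see in a small sketch. Again this assumes Python's standard-library expat binding (xml.parsers.expat) instead of the C API, and the helper name is_well_formed and the note document are invented for the example. The document below breaks the rules of its own DTD (the DTD says a note contains a to element, but it contains a from element), yet expat accepts it because it is well-formed. A mismatched tag, on the other hand, is rejected.

```python
import xml.parsers.expat


def is_well_formed(xml_text):
    """Return True if expat parses xml_text without raising an error."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(xml_text, True)
    except xml.parsers.expat.ExpatError:
        return False
    return True


# Well-formed, but invalid against its own internal DTD subset.
DOC_VIOLATING_ITS_DTD = """<!DOCTYPE note [
  <!ELEMENT note (to)>
  <!ELEMENT to (#PCDATA)>
]>
<note><from>Bob</from></note>"""

# Not well-formed: <b> is never closed before </a>.
DOC_MISMATCHED_TAGS = "<a><b></a>"

if __name__ == "__main__":
    print(is_well_formed(DOC_VIOLATING_ITS_DTD))
    print(is_well_formed(DOC_MISMATCHED_TAGS))
```

A validating processor built on top of expat would have to record the DTD declarations itself and check each element against them.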

LINKS
  • expat distro at Clark's site
    Go get your own copy! This contains everything you need.
  • expat FAQ at Clark's site
    Some basic questions and answers.
  • xmltools
    My handy-dandy little command-line XML toolkit. The tools are written with expat, and they run beautifully under W32 and Solaris, anyway.
  • The expat API
    My ongoing effort to HTMLize the API description. It'll probably be done in another week or two. (I said this on May 10, 2000, just for the record and for the sake of humor.)





Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.