Topic: expat
The expat XML parser by James Clark is fairly small, written in C, runs on Unix or Win32, and is, well, not
very well documented. But boy, does it work great! It's also the meat of the Perl XML module. I'm using it
in the open-source workflow toolkit (among other things).

The expat parser consists of two levels, like most parsers. The low level, xmltok, is an XML tokenizer, meaning that it simply reads the file and hands you all the pieces which have meaning in XML. The parser itself wraps a layer around that token stream and calls handlers which you write. For a description of how expat programs are written, see my little article on using expat.

Under Windows, the parser and tokenizer are wrapped up in two DLLs; under Unix, you could probably build a .so or something, but it's just as quick to link statically. The libraries are not horribly large.

The expat documentation is basically nonexistent. There is sketchy usage information in the header files, and there is a sample program which prints an indented tag structure. You can also study the code of the wf utility, which checks XML for well-formedness, but I haven't yet. I have written some command-line XML tools which are elaborations of the sample application, and what I learned along the way is the content of the how-to I mentioned above. If and when I have motivation to use the tokenizer, I will document that level similarly. In the meantime, I am in the process of documenting the expat API. It's slow going and I have a lot to do, so it's not done yet. I'll keep plugging.

The expat parser is not a validating XML processor. What this means is that you can use it to write an XML processor, but it has no smarts whatsoever about using a DTD to make sure that the XML it is reading conforms to that DTD. Which is fine: there are plenty of times when you simply don't have the DTD formalized, and XML is still useful. If and when I write a validating XML processor on top of expat, I'll link to it. (Might actually happen, and I'd get paid for it. I could tell you the details, but then I'd have to kill you.)
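
To make the handler model a little more concrete, here is a minimal sketch of the same idea as the sample program (an indented tag outline) plus a bare-bones well-formedness check. It is not the C code that ships with expat: it uses Python's standard xml.parsers.expat module, which is a binding to expat, simply because that is the shortest way to try the handler style out. The function names outline and is_well_formed are my own, made up for illustration.

```python
import xml.parsers.expat


def outline(xml_text):
    """Return one line per start tag, indented two spaces per nesting level."""
    lines = []
    depth = 0

    def start_element(name, attrs):
        # Called by the parser for every start tag it encounters.
        nonlocal depth
        lines.append("  " * depth + name)
        depth += 1

    def end_element(name):
        # Called by the parser for every end tag (and for empty elements).
        nonlocal depth
        depth -= 1

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start_element
    parser.EndElementHandler = end_element
    parser.Parse(xml_text, True)  # True: this is the final chunk of input
    return lines


def is_well_formed(xml_text):
    """Report whether expat accepts the text; no DTD validation is done."""
    try:
        xml.parsers.expat.ParserCreate().Parse(xml_text, True)
        return True
    except xml.parsers.expat.ExpatError:
        return False


if __name__ == "__main__":
    print("\n".join(outline("<a><b/><c><d/></c></a>")))
    # The DTD says <a> must contain <b>, but the parser does not validate,
    # so this document is still accepted as well-formed.
    doc = "<!DOCTYPE a [<!ELEMENT a (b)> <!ELEMENT b EMPTY>]><a><c/></a>"
    print(is_well_formed(doc))
    print(is_well_formed("<a><b></a>"))  # mismatched tags: not well-formed
```

The last two checks show the non-validating behavior in practice: a document that breaks its own DTD still parses, while one with mismatched tags is rejected.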