This documentation is intended to stand alone, even though it's part of the wftk documentation set. As I've ported (parts of) the XMLAPI into Perl so that I can work with the same API, and as that has nothing whatsoever to do with wftk, I think it makes sense to arrange things this way. So if you're reading this and haven't heard of the wftk, don't panic, it's just the project during which I started working on an XML API, which later became the XMLAPI. See? But if you're figuring out how to work with the wftk, don't panic when I don't talk about it much.
Anyway, besides reading and writing to files, there are a whole lot of things you can do with these XML heap structures. I've broken these down into categories so they're not completely overwhelming.
xml_read reads a file handle and returns an XML structure. If any error is encountered, it bails, returning NULL. If you want error feedback, use xml_read_error. (I realized this need after the fact, and in fact early versions of xml_read actually wrote the error message to stdout, which wasn't very friendly behavior at all.) The xml_read_error function is a perfectly normal XML reader, but instead of returning NULL in case of error, it returns an error element of the form
<xml-error code="200" message="some error message" line="40"/>
The third parsing function, xml_parse, simply takes a string buffer and parses that. The buffer must be null-terminated. It also returns an error element in case of error (not NULL).
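Since xml_parse wants a null-terminated buffer, here is a minimal sketch of one way to get such a buffer out of a stream. The helper name slurp_stream is hypothetical and not part of the XMLAPI; it only uses the standard C library.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (NOT part of the XMLAPI): read everything from `in`
   into a malloc'd, null-terminated buffer. Returns NULL on failure.
   The caller frees the result. */
char *slurp_stream(FILE *in)
{
    size_t cap = 4096, len = 0;
    char *buf;

    assert(in != NULL);
    buf = malloc(cap);
    if (!buf) return NULL;

    for (;;) {
        /* Always leave one byte free for the terminator. */
        size_t got = fread(buf + len, 1, cap - len - 1, in);
        len += got;
        if (got == 0) break;              /* end of file or read error */
        if (len == cap - 1) {             /* buffer full: double it */
            char *bigger = realloc(buf, cap * 2);
            if (!bigger) { free(buf); return NULL; }
            buf = bigger;
            cap *= 2;
        }
    }
    if (ferror(in)) { free(buf); return NULL; }
    buf[len] = '\0';                      /* the terminator xml_parse needs */
    return buf;
}
```

The returned string could then be handed to xml_parse and freed afterwards. Note that a file containing embedded NUL bytes would look shorter than it is to any string-based parser.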
Writing is done fairly simply by walking the XML tree structure. Pretty printing (breaking lines) isn't supported; if you want pretty XML, you have to insert your own line breaks when building the XML in the first place.
Use xml_writecontent to write only the subelements of the XML passed in, without writing the enclosing element or its attributes.
XML * xml_read (FILE * in);
XML * xml_read_error (FILE * in);
xml_read returns NULL in case of error; xml_read_error returns an error structure (as above).
XML * xml_parse (const char * in);
void xml_write (FILE * out, XML * xml);
void xml_writecontent (FILE * out, XML * xml);
li, for instance, doesn't need to be closed. That sort of thing.
void xml_writehtml (FILE * out, XML * xml);
void xml_writecontenthtml (FILE * out, XML * xml);
Use xml_createtext to create a non-element XML. The xml_createtextf function works the same, except that it takes a printf-like format (at the moment it understands only %s and %d).
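To make "understands only %s and %d" concrete, here is a standalone sketch of a formatter with exactly that restriction. It is not the XMLAPI's code; the names sd_expand and sd_format are hypothetical, and the treatment of "%%" and of unknown specifiers (copied through unchanged) is an assumption, since the library's behavior there isn't documented above.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, NOT the XMLAPI implementation. Expands only %s and %d.
   Assumption: "%%" yields '%', anything else after '%' is copied literally.
   With out == NULL it only measures the length needed. */
static size_t sd_expand(char *out, const char *fmt, va_list ap)
{
    size_t n = 0;
    char num[32];

    for (const char *p = fmt; *p; p++) {
        const char *piece = p;            /* default: one literal character */
        size_t len = 1;

        if (p[0] == '%' && p[1] == 's') {
            piece = va_arg(ap, const char *);
            if (!piece) piece = "";
            len = strlen(piece);
            p++;
        } else if (p[0] == '%' && p[1] == 'd') {
            snprintf(num, sizeof num, "%d", va_arg(ap, int));
            piece = num;
            len = strlen(num);
            p++;
        } else if (p[0] == '%' && p[1] == '%') {
            piece = "%";
            p++;
        }
        if (out) memcpy(out + n, piece, len);
        n += len;
    }
    if (out) out[n] = '\0';
    return n;
}

/* Returns a malloc'd string; the caller frees it. */
char *sd_format(const char *fmt, ...)
{
    va_list ap, ap2;
    size_t n;
    char *out;

    assert(fmt != NULL);
    va_start(ap, fmt);
    va_copy(ap2, ap);                     /* second copy for the second pass */
    n = sd_expand(NULL, fmt, ap);         /* pass 1: measure */
    va_end(ap);
    out = malloc(n + 1);
    if (out) sd_expand(out, fmt, ap2);    /* pass 2: write */
    va_end(ap2);
    return out;
}
```

The practical upshot for callers of the format-taking functions is to stick to %s and %d and to pre-format anything fancier (widths, floats) into a string first.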
Use xml_delete to delete a subelement from a parent element; xml_delete calls xml_free on the deleted element. Call xml_free directly if the element isn't a child element (it won't clean up the dangling child pointer in a parent).
XML * xml_create (const char *name);
XML * xml_createtext (const char *content);
XML * xml_createtextf (const char *format, ...);
XML * xml_createtextlen (const char *content, int len);
The xml_string* functions return a buffer allocated with malloc. Thus you have to free the result when you're done with it. The *content functions, like xml_writecontent, don't write the enclosing element or its attributes, just its subelements. The *html functions convert the given XML into an HTML-like form while writing. (The given XML is of course unchanged.)
char * xml_string (XML * xml);
char * xml_stringcontent (XML * xml);
char * xml_stringhtml (XML * xml);
char * xml_stringcontenthtml (XML * xml);
The xml_attrval function returns a pointer to the named attribute's value, or a pointer to an empty string if the attribute isn't found, and yes, I know the function is named inconveniently. The pointer returned is a const pointer directly into the XML structure, so don't write to it. Use xml_attrvalnum if you just need an integer representation of the attribute's value.
The xml_set* functions set attributes, of course. If the attribute is already present, it will be replaced; otherwise, a new attribute will be created. The xml_setf function takes a format similar to printf's format; it recognizes only %s and %d at the moment.
The xml_set function makes a copy of the string value given it (and the attribute name, of course); if you don't want the copy to be made, then use xml_set_nodup. Caution: xml_set_nodup may only be called with malloc'd strings! Otherwise, when you attempt to free the XML element, Bad Things will happen.
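The ownership rules above are easier to see in a toy model. The sketch below is not the XMLAPI source; the toy_* types and functions are hypothetical stand-ins that mimic the documented behavior: set replaces or creates, the plain setter copies, the nodup setter takes ownership, and a missing attribute reads as an empty string. Using atoi for the numeric form is an assumption.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-ins, NOT the real XML / XML_ATTR structures. */
typedef struct toy_attr {
    char *name;
    char *value;
    struct toy_attr *next;
} toy_attr;

typedef struct { toy_attr *attrs; } toy_elem;

static char *toy_dup(const char *s)
{
    char *c = malloc(strlen(s) + 1);
    if (c) strcpy(c, s);
    return c;
}

/* Takes ownership of `value`, which must come from malloc: it is
   later passed to free(). A string literal here is the "Bad Things" case. */
void toy_set_nodup(toy_elem *e, const char *name, char *value)
{
    toy_attr **link = &e->attrs;
    for (; *link; link = &(*link)->next) {
        if (strcmp((*link)->name, name) == 0) {
            free((*link)->value);         /* replace the existing value */
            (*link)->value = value;
            return;
        }
    }
    toy_attr *a = malloc(sizeof *a);      /* not found: append a new attribute */
    char *name_copy = toy_dup(name);
    if (!a || !name_copy) { free(a); free(name_copy); free(value); return; }
    a->name = name_copy;
    a->value = value;
    a->next = NULL;
    *link = a;
}

/* Copying variant: the caller keeps ownership of `value`. */
void toy_set(toy_elem *e, const char *name, const char *value)
{
    toy_set_nodup(e, name, toy_dup(value));
}

/* Never returns NULL: a missing attribute reads as "". */
const char *toy_attrval(const toy_elem *e, const char *name)
{
    for (const toy_attr *a = e->attrs; a; a = a->next)
        if (strcmp(a->name, name) == 0)
            return a->value ? a->value : "";
    return "";
}

int toy_attrvalnum(const toy_elem *e, const char *name)
{
    return atoi(toy_attrval(e, name));    /* assumption: atoi-style conversion */
}

void toy_free_attrs(toy_elem *e)
{
    while (e->attrs) {
        toy_attr *a = e->attrs;
        e->attrs = a->next;
        free(a->name);
        free(a->value);                   /* why nodup values must be malloc'd */
        free(a);
    }
}
```

Because the lookup never returns NULL, callers can pass the result straight to strcmp or printf without a guard, which is presumably why the real xml_attrval behaves the same way.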
const char * xml_attrval (XML * xml, const char * name);
int xml_attrvalnum (XML * xml, const char * name);
void xml_set (XML * xml, const char * name, const char * value);
void xml_setf (XML * xml, const char * name, const char * format, ...);
void xml_setnum (XML * xml, const char * name, int value);
void xml_set_nodup (XML * xml, const char * name, const char * value);
Use xml_prepend to insert a child element before any other subelement; use xml_append to insert the child after all others. The xml_replace function will replace the given element in its own parent, while xml_replacecontent first deletes the given element's children, then appends the new child (thus effectively replacing the element's children).
The xml_copyinto function is a little weird. I've only had one use for it, really. What it does is take all the attributes and subelements of the source XML and write/append them, respectively, into the target XML. If there are duplicate attributes, the source values overwrite. Any other existing information is preserved.
void xml_prepend (XML * parent, XML * child);
Inserts child as first subelement of parent.
void xml_append (XML * parent, XML * child);
Inserts child as last subelement of parent.
void xml_replace (XML * target, XML * source);
Removes target and inserts source in its place (assumes that target is already the subelement of a parent element).
void xml_replacecontent (XML * parent, XML * child);
Deletes the children of parent and inserts child as new content.
void xml_copyinto (XML * target, XML * source);
Appends the children of source to target, preserving all existing children. Does the same with attributes, except like-named values are overwritten.
The following examples all work with this XML structure:
<record>
  <field name="field1" value="4"/>
  <comment>This is a comment, maybe a memo or something.</comment>
  <field name="field2" value="3"/>
</record>
Then the locator record.field(1) will find the second field element: <field name="field2" value="3"/>. Note that we work from a base of 0, and note that the intervening "comment" field has no effect. Neither would any intervening plain text.
The locator record.comment will find the comment field. (Note that omission of an element number means to return the first you find.) You can search by ID or name, too: the locator record.field[field2] will return the second field as well.
If no matching field is found, the xml_loc function returns a NULL pointer.
The xml_locf function works the same as xml_loc except that it can build your locator for you using a formatting scheme much like the printf function. It only understands %s and %d at the moment, but that's probably all you'll need in locators anyway.
Both loc functions can ignore the element name of the topmost element if you simply omit it; thus the locator .field will find the first data field in the example XML.
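The locator rules above (zero-based index, match by ID or name in brackets, text and other elements ignored, optional root name) can be modeled in a few lines. This is a toy re-implementation for illustration only, not the library's xml_loc; toy_node, toy_match and toy_loc are hypothetical names, and the single key field stands in for an element's id or name attribute.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for an XML node, NOT the real XML structure. */
typedef struct toy_node {
    const char *name;              /* NULL for plain text */
    const char *key;               /* stand-in for an id/name attribute, may be NULL */
    struct toy_node *first_child;
    struct toy_node *next;
} toy_node;

#define TOY_MAX 64

/* Resolve one segment ("name", "name(n)" or "name[key]") among siblings. */
static toy_node *toy_match(toy_node *first, const char *seg, size_t len)
{
    char name[TOY_MAX], key[TOY_MAX];
    int index = 0, has_key = 0;
    size_t n = strcspn(seg, "([.");        /* length of the element name */

    if (n > len) n = len;
    if (n >= TOY_MAX) return NULL;
    memcpy(name, seg, n);
    name[n] = '\0';

    if (n < len && seg[n] == '(') {
        index = atoi(seg + n + 1);         /* zero-based position */
    } else if (n < len && seg[n] == '[') {
        size_t k = strcspn(seg + n + 1, "]");
        if (k >= TOY_MAX) return NULL;
        memcpy(key, seg + n + 1, k);
        key[k] = '\0';
        has_key = 1;
    }
    for (toy_node *c = first; c; c = c->next) {
        /* Text nodes and differently named elements never count. */
        if (!c->name || strcmp(c->name, name) != 0) continue;
        if (has_key) {
            if (c->key && strcmp(c->key, key) == 0) return c;
        } else if (index-- == 0) {
            return c;
        }
    }
    return NULL;
}

/* Toy limitation: keys containing '.' are not supported. */
toy_node *toy_loc(toy_node *root, const char *loc)
{
    toy_node *cur = root;
    size_t len;

    assert(loc != NULL);
    if (!root) return NULL;
    len = strcspn(loc, ".");
    /* An empty first segment means "whatever the root is called". */
    if (len > 0 && (!root->name || strlen(root->name) != len ||
                    strncmp(root->name, loc, len) != 0))
        return NULL;
    loc += len;
    while (cur && *loc == '.') {
        loc++;
        len = strcspn(loc, ".");
        cur = toy_match(cur->first_child, loc, len);
        loc += len;
    }
    return cur;
}
```

Built over the example record (with a text node thrown in between the fields), this model resolves record.field(1) and record.field[field2] to the same node, .field to the first field, and anything unmatched to NULL.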
The xml_getloc and xml_getlocbuf functions find a locator for the given XML; since each XML element knows its parent, the getloc functions can simply trace up the tree and find a full locator. The xml_getloc function requires that you supply a buffer; in this case, the locator may fill the buffer. Check its length carefully upon return. The xml_getlocbuf function allocates the buffer for you, but you must either free the buffer when you're finished with it, or use xml_set_nodup to pass it back into an attribute for later cleanup with an element. The pattern xml_set_nodup (xml, "myloc", xml_getlocbuf(xml)) will probably be useful here and there.
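As a rough picture of "trace up the tree", here is a toy model that builds a locator into a caller-supplied buffer and truncates when the buffer is too small. It is not the library's xml_getloc: the toy_* names are hypothetical, the name(index) output format is an assumption, and unlike the real function (which returns void) this sketch returns the length the full locator would need so the truncation is easy to detect.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Toy stand-in for an XML node that knows its parent. */
typedef struct toy_node {
    const char *name;              /* NULL for plain text */
    struct toy_node *parent;
    struct toy_node *first_child;
    struct toy_node *next;
} toy_node;

/* Zero-based position of n among its same-named siblings. */
static int toy_index(const toy_node *n)
{
    int i = 0;
    if (!n->parent) return 0;
    for (const toy_node *c = n->parent->first_child; c && c != n; c = c->next)
        if (c->name && strcmp(c->name, n->name) == 0) i++;
    return i;
}

/* Append piece at offset `used`, truncating to fit; returns the new
   untruncated length. */
static size_t toy_append(char *buf, size_t len, size_t used, const char *piece)
{
    size_t plen = strlen(piece);
    if (len > 0 && used < len - 1) {
        size_t room = len - 1 - used;
        size_t n = plen < room ? plen : room;
        memcpy(buf + used, piece, n);
        buf[used + n] = '\0';
    }
    return used + plen;
}

/* Writes the locator for element n into buf (at most len bytes including
   the terminator). A return value >= len means the result was truncated. */
size_t toy_getloc(const toy_node *n, char *buf, size_t len)
{
    char piece[96];
    size_t used = 0;

    assert(n != NULL && n->name != NULL);  /* elements only, no text nodes */
    if (len > 0) buf[0] = '\0';
    if (n->parent) {
        used = toy_getloc(n->parent, buf, len);   /* ancestors first */
        used = toy_append(buf, len, used, ".");
        snprintf(piece, sizeof piece, "%s(%d)", n->name, toy_index(n));
    } else {
        snprintf(piece, sizeof piece, "%s", n->name);
    }
    return toy_append(buf, len, used, piece);
}
```

With the real xml_getloc there is no return value to compare, so the equivalent check would be on the length of the string in the buffer against the buffer size; when in doubt, xml_getlocbuf sidesteps the problem.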
void xml_getloc (XML * xml, char * loc, int len);
char * xml_getlocbuf (XML * xml);
XML * xml_loc (XML * xml, const char * loc);
XML * xml_locf (XML * xml, const char * format, ...);
The functions ending in elem skip over any plain text children and return only XML elements. This is useful because plaintext elements have a NULL name, so if you get one by mistake and try to compare its name with something, you will regret it. This happened to me so often I modified the API. Ha. Anyway, you can start an iteration at the beginning of the child list or at the end, and you can move forward or backward in the list. If you get a NULL pointer back, you know you've reached the end (or beginning) of the list. A useful pattern for iteration is thus:
    marker = xml_firstelem (parent);
    while (marker) {
        /* do something */
        marker = xml_nextelem (marker);
    }
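To show why the elem variants are the safer default, here is a small standalone model of the two iterator families. The toy_* names are hypothetical and this is not the library's code; it only mirrors the described behavior, where text nodes carry a NULL name and the elem iterators step over them.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for an XML node; plain text has a NULL name. */
typedef struct toy_node {
    const char *name;
    struct toy_node *first_child;
    struct toy_node *next;
} toy_node;

/* Raw iterators: return every child, text included. */
toy_node *toy_first(toy_node *parent) { return parent ? parent->first_child : NULL; }
toy_node *toy_next(toy_node *child)   { return child ? child->next : NULL; }

/* Advance past any text nodes starting at n. */
static toy_node *toy_skip_text(toy_node *n)
{
    while (n && !n->name) n = n->next;
    return n;
}

/* Element-only iterators: never return a node with a NULL name. */
toy_node *toy_firstelem(toy_node *parent) { return toy_skip_text(toy_first(parent)); }
toy_node *toy_nextelem(toy_node *child)   { return toy_skip_text(toy_next(child)); }

/* The iteration pattern from the text, used to count children by name. */
int toy_count_named(toy_node *parent, const char *name)
{
    int count = 0;
    toy_node *marker;

    assert(name != NULL);
    marker = toy_firstelem(parent);
    while (marker) {
        /* Safe: marker->name cannot be NULL with the elem iterators. */
        if (strcmp(marker->name, name) == 0) count++;
        marker = toy_nextelem(marker);
    }
    return count;
}
```

The same loop written with the raw iterators would need an explicit NULL check on the name before every strcmp.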
XML * xml_first (XML * parent);
XML * xml_last (XML * parent);
XML * xml_next (XML * child);
XML * xml_prev (XML * child);
next and prev take the current iterated child and get its neighbor, while first and last take the parent element.
XML * xml_firstelem (XML * parent);
XML * xml_lastelem (XML * parent);
XML * xml_nextelem (XML * child);
XML * xml_prevelem (XML * child);
You can only scan attributes from the front of the list to the back (they're singly linked, so you don't want to scan backwards anyway). Given an attribute, you can retrieve its name or its value. The similarity of xml_attrvalue (which gets the value of an attribute structure) and xml_attrval (which gets an attribute value from an element, treating it as a hash) is regrettable and you have my abject apologies, but I didn't want to go back and change all my existing code. We'll all just have to live with my poor planning.
XML_ATTR * xml_attrfirst (XML * xml);
XML_ATTR * xml_attrnext (XML_ATTR * xml);
const char * xml_attrname (XML_ATTR * xml);
const char * xml_attrvalue (XML_ATTR * xml);