This code and documentation are released under the terms of the GNU license. They are additionally copyright (c) 2001, Vivtek. All rights reserved except those explicitly granted under the terms of the GNU license. This presentation was prepared with LPML. Try literate programming. You'll like it.
py_ tacked on the front.
xml_ prefix from all the functions, but I'm used to it. So I left it.
Anyway, this lets me do "from xmlapi import *" instead of just "import xmlapi".
    >>> from xmlapi import *
    >>> f = xml_parse ('<test><element attr="this"></test>')
    >>> xml_string (f)
    '<test><element attr="this"></test>'
    >>> xml_set (f, "testattr", "new value")
    >>> xml_string (f)
    '<test testattr="new value"><element attr="this"></test>'
    >>> f
    <PyCObject object at 00764F20>

It'd be nice for the string conversion to happen automatically; to do that, though, we have to define a new type, because only a dedicated type object has slots for the functions to handle things like string representation. As it happens, we get some other nice benefits from maintaining our own type, too, like an arbitrary number of fields in the struct used to store the object; this provides enough flexibility to deal with the garbage collector without having to change the XMLAPI itself. (Yet.)

Now that XMLAPI is actually working from Python, I'm more disturbed about garbage collection, so let's implement xml_loc and play around with internal elements, to see what happens when a parent goes out of scope. Here's a naive xml_loc (warning: this is not the active version!)

    static PyObject *py_xml_loc(PyObject *self, PyObject *args)
    {
       PyObject * xml_obj;
       XML * found;
       char * loc;

       if (!PyArg_ParseTuple(args, "Os", &xml_obj, &loc))
          return NULL;
       if (!PyCObject_Check (xml_obj)) {
          PyErr_SetString (PyExc_TypeError, "arg not XML object");
          return NULL;
       }

       found = xml_loc ((XML *) PyCObject_AsVoidPtr (xml_obj), loc);
       if (found)
          return PyCObject_FromVoidPtrAndDesc ((void *) found, (void *) Py_None, py_xml_cleanup);
       else
          return NULL;
    }

This naive try simply called xml_loc on the locator and built a new object on the returned pointer. But the following test shows the problem with the naive approach:

    >>> f=xml_parse ('<test><element id="me"/></test>')
    >>> c=xml_loc (f, ".element(0)")
    >>> xml_string (c)
    '<element id="me">'
    >>> f=xml_parse ('<blargh>')
    >>> xml_string (c)

Boom. Python dumps core on attempting to reference 'c', because when 'f' goes out of scope it's diligently cleaned up, and 'c' was after all simply a reference into 'f'. If we do things the other way around, when 'c' goes out of scope it'll leave 'f' in an inconsistent state, and equally dire consequences will follow.

So, just as I'd suspected at the outset, we have to be cleverer than this; internal references should not attempt to clean up, but should instead just work with the parent's reference count. Fortunately, CObjects have a handy place to stash an extra void pointer, and we can use that for the root of any tree; if NULL, this simply means that the current object is itself the root. This stash pointer is called the "descriptor" and is set by creating the object using a different API, PyCObject_FromVoidPtrAndDesc (instead of PyCObject_FromVoidPtr).

This is somewhat awkward, as it's going to create problems when we use it for xml_append, which takes an XML object with no parent and inserts it into an existing tree. The CObject is still going to be hanging there with a reference to the XML object. The best solution is to use a special cleanup function which refuses to clean up trees with parents, probably.
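To make the ownership rule concrete, here is a minimal pure-Python model of it. This is only a sketch of the idea, not the wrapper's actual C code: Tree, Handle, loc and the freed list are all hypothetical names invented for illustration, and the example leans on CPython's reference counting to run __del__ as soon as the last reference goes away.

```python
import gc

# Pure-Python model of the ownership rule. None of these names come from
# xmlapi; they are stand-ins invented for this sketch.
freed = []  # names of the trees that the simulated cleanup has released


class Tree:
    """Stand-in for a C-side XML tree that must be freed exactly once."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)


class Handle:
    """Stand-in for the Python-side wrapper object around one node."""

    def __init__(self, node, root=None):
        self.node = node
        # root is None for the top of a tree (like a NULL descriptor).
        # Otherwise it is a strong reference that keeps the root alive.
        self.root = root

    def loc(self, index):
        # An internal handle always points back at the outermost root.
        owner = self if self.root is None else self.root
        return Handle(self.node.children[index], owner)

    def __del__(self):
        # The cleanup refuses to free anything that has a parent.
        if self.root is None:
            freed.append(self.node.name)


f = Handle(Tree("test", [Tree("element")]))
c = f.loc(0)
f = Handle(Tree("blargh"))  # the "test" root survives because c refers to it
survived = list(freed)      # snapshot taken while c is still alive
del c                       # last reference gone, so the root is released
del f
gc.collect()
```

Rebinding f here is the same move that crashed the naive version. In this model the old root is released only once c is gone, and the internal handle itself never frees anything.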
for x in list construct.
The XMLAPI has two iterator categories: one for an element's attributes, and one for its
subelements. The latter has two flavors: one which includes text subelements, and one which
excludes them. For the attribute iterator, I think it's more appropriate to return a list to
Python, so I'm not going to expose the full set of iterator functions. Technically, this is
a loss of functionality, since duplicate attributes can't be read. However, since the XMLAPI
exposes no way to create duplicate attributes, and since they aren't supported by XML
anyway, this isn't really a weakness, in my humble opinion.
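The trade-off can be sketched in pure Python. Everything below is an assumption made for illustration: attr_first, attr_next, attr_names and attr_get are hypothetical stand-ins, not the XMLAPI's real iterator functions, and an element is modeled as a plain dict holding (name, value) pairs.

```python
# Hypothetical stand-ins for a C-style attribute iterator. These are not the
# XMLAPI's real functions; an element is modeled as a plain dict.

def attr_first(element):
    """Return a cursor for the first attribute, or None if there are none."""
    return 0 if element["attrs"] else None


def attr_next(element, cursor):
    """Advance the cursor, returning None once the attributes run out."""
    cursor += 1
    return cursor if cursor < len(element["attrs"]) else None


def attr_names(element):
    """Drain the iterator into a plain list of attribute names."""
    names = []
    cursor = attr_first(element)
    while cursor is not None:
        names.append(element["attrs"][cursor][0])
        cursor = attr_next(element, cursor)
    return names


def attr_get(element, name):
    """Look up a value by name; only the first match is reachable."""
    for key, value in element["attrs"]:
        if key == name:
            return value
    return None


# A deliberately malformed element with a duplicated "id" attribute.
element = {"attrs": [("id", "me"), ("attr", "this"), ("id", "shadowed")]}
names = attr_names(element)
first_id = attr_get(element, "id")
```

With only a list of names and a by-name lookup, the second "id" shows up in the list but its value can never be fetched. That is the lost functionality described above, and it only matters for input that was not valid XML in the first place.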