Topic: XML -- How to write a DTD

Defining an XML document type (i.e. writing a DTD) consists of the following steps, not
necessarily in order:
As you can see from that, the process of defining an XML document type is that of designing a set
of elements. So what kinds of things can you do with elements? An element is an object that
contains data. It can contain data in two ways: first, it has content; second, it has attributes.
Using the HTML link as an example, the element's name is "a" and its most useful attribute
is "href". A definition of that much of the HTML spec could look like this:
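A minimal sketch of such a definition, assuming we only care about the href attribute and plain text content (the real HTML DTD allows a good deal more, and spells the attribute type differently):

```xml
<!-- Sketch only: a pared-down "a" element, not the real HTML DTD -->
<!-- Content: text only -->
<!ELEMENT a (#PCDATA)>
<!-- One optional attribute holding the link target -->
<!ATTLIST a
    href CDATA #IMPLIED
>
```

The ELEMENT declaration gives the element's name and what it may contain; the ATTLIST declaration lists its attributes, each with a type (CDATA, i.e. any string) and a default (#IMPLIED, i.e. optional with no default value).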
See that #PCDATA thing in there? It stands for "parsed character
data" and it effectively means that this element can contain some text that you should pass on to the
application. (A note on that phrase "pass on to the application": XML is defined in terms of
the XML parser. The parser is a module which is assumed to be built into, or at least somehow
called by, an application which does the actual work of whatever the application does. When
you're writing a DTD, you're talking to the parser; when you write a document, you're using the
parser to talk to the application.)
You'll notice that the definition of the

That much is easy. But the real usefulness of XML is how well it expresses arbitrary structure.
To express arbitrary structure we need to have elements within elements, and that's done in pretty
much the same way. Let's define something really simple, a tree structure. We'll use the obvious
name for a node: "node". Each node may be named and may contain other nodes. And in fact let's
make it a binary tree, i.e. each node may have a maximum of two children. Let's define:
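A sketch of a definition matching that description. The attribute names "name" and "data" are assumptions chosen for illustration, not necessarily what the original listing used:

```xml
<!-- A binary tree node: an optional node, then another optional node, nothing else -->
<!ELEMENT node (node?, node?)>
<!-- Hypothetical attributes: both are optional (#IMPLIED) -->
<!ATTLIST node
    name CDATA #IMPLIED
    data CDATA #IMPLIED
>
```

One caveat worth knowing: the XML spec asks content models to be deterministic, and some validating parsers complain about (node?, node?) because a single child could match either slot. If yours does, (node, node?)? accepts exactly the same documents without the ambiguity.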
Look at the content model in the ELEMENT
definition (that's the stuff after the name, remember). I wrote it as (node?, node?)
to signify that the content of a node may be an optional node followed by another optional
node, and nothing else. I also tossed in some optional node data for good measure.
So now we can write a little XML chunk representing a binary tree:
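For instance, something along these lines, assuming each node carries its name in a "name" attribute (the attribute name and the values are made up for illustration):

```xml
<!-- Every node has zero, one, or two child nodes -->
<node name="root">
  <node name="left">
    <node name="left-left"/>
  </node>
  <node name="right"/>
</node>
```

The whitespace between the nested elements is harmless here: since a node's content model allows only other nodes, a validating parser treats that whitespace as ignorable.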
What if I wanted a general tree? I'd write the content model like this: (node*)

Entities come in two flavors: regular entities (like the &lt; kind of entities we know and love
from HTML) and parameter entities, which can be used in the DTD definition itself. Parameter
entities are prefixed with a percent sign (%) and are dandy for sequences of element content
specifications which get reused throughout a DTD. I used a parameter entity in the wftk DTD to
represent the types of elements which are considered actions or action-like ... things.

By far the best way to discover what I'm talking about at this point is to go read the wftk DTD
yourself and use these little tips to figure out what you're seeing. I've still got some work to do
to make this topic more informative, but this will get you started. As always, if I'm missing
something, ask.
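To make the parameter entity idea above a little more concrete, here is a minimal sketch. Every name in it is hypothetical and it is not taken from the wftk DTD. It also assumes the declarations live in a standalone DTD file, since parameter entity references inside other declarations are not allowed in a document's internal subset:

```xml
<!-- Hypothetical names throughout; assumes an external DTD file -->
<!-- Define the reusable chunk of content model once -->
<!ENTITY % action "task | notify | script">

<!-- Reuse it wherever "any action-like thing" is allowed -->
<!ELEMENT sequence (%action;)*>
<!ELEMENT parallel (%action;)*>

<!ELEMENT task   (#PCDATA)>
<!ELEMENT notify (#PCDATA)>
<!ELEMENT script (#PCDATA)>
```

Adding a new kind of action then means changing the entity in one place rather than hunting down every content model that mentions the list.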