Within that directory is the repository definition, a single XML file which defines the data resources (and other things) used by that repository. When you start up a conversation with a wftk system, the first thing the wftk does is to read the repository definition so it knows the context in which you will be working.
For instance, when starting the command-line repmgr (repository manager) under Windows, you might use the following command:
    C:\projects\sites\test>repmgr -r site.opm
    +000: Repository open.
    repmgr v1.0 listening: type 'help' for help.
    ++done++
Note the use of the -r parameter to name the file containing the repository definition.
That file is opened and read into an XML structure, which is then kept for the duration of the session.
If you are talking to a SOAP system, the file is read when the SOAP server starts; under AOLserver, it is read when AOLserver starts.
When using the C or Python API, you start by opening a repository, and you pass the open repository structure into all subsequent calls to the API.
No matter how you end up addressing the repository, the definition file is always in the same format, and the explanation of that format is the purpose of this document.
Here is how this document is structured:
A repository's basic component is a set of lists. Each list corresponds to a set of uniquely keyed objects made up of fields. (Objects can also contain things other than fields, but we'll worry about that later.) The simplest system possible, then, is a repository with a single list. To make things very simple, we'll store the list as tab-delimited fields in a file.
    <site>
      <list id="mylist" storage="delim:mylist.txt">
        <field id="field1"/>
        <field id="field2"/>
      </list>
    </site>
I'm going to assume you understand the basics of how XML works (if not, this document and the wftk are both going to be tough going).
Even if you don't, though, this is a pretty simple file. The key is the <list> tag, which defines the list within the repository. The id attribute identifies the list as mylist, and the storage attribute defines where mylist will be stored.
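To make the structure concrete, here is a minimal sketch that reads the same definition with Python's standard XML parser and pulls out each list's id, storage spec, and field ids. It is not the wftk API; the function name describe_lists is made up for illustration.

```python
import xml.etree.ElementTree as ET

# The single-list definition discussed above, inlined so the sketch is self-contained.
SITE_XML = """
<site>
  <list id="mylist" storage="delim:mylist.txt">
    <field id="field1"/>
    <field id="field2"/>
  </list>
</site>
"""

def describe_lists(definition_xml):
    """Return {list id: (storage spec, [field ids])} for every <list> in the definition.

    Hypothetical helper for illustration only; the wftk does its own parsing.
    """
    site = ET.fromstring(definition_xml)
    result = {}
    for lst in site.findall("list"):
        field_ids = [field.get("id") for field in lst.findall("field")]
        result[lst.get("id")] = (lst.get("storage"), field_ids)
    return result

lists = describe_lists(SITE_XML)
print(lists)
```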
If you look more closely at the storage spec, you'll see that it consists of an adaptor name (delim), followed by a colon, followed by a filename. This tells the wftk that the data in this list is managed using the LIST_delim adaptor, and the adaptor then knows that the file to store the data in should be mylist.txt. Each adaptor knows how to handle its own storage specs; some may also use additional attributes on the list definition to store additional information about the storage of the data.
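The split into adaptor name and adaptor-specific remainder can be sketched in a few lines of Python. This is only an illustration of the "name, colon, rest" shape described above; split_storage_spec is a hypothetical name, and how the wftk itself treats a spec without a colon is not covered here.

```python
def split_storage_spec(spec):
    """Split a storage spec such as 'delim:mylist.txt' at its first colon.

    Hypothetical helper: the first part names the adaptor, and everything
    after the colon is left untouched for the adaptor to interpret.
    """
    adaptor, colon, rest = spec.partition(":")
    if not colon:
        # Assumption for this sketch only: a spec without a colon is rejected.
        raise ValueError("storage spec has no adaptor prefix: %r" % spec)
    return adaptor, rest

adaptor, target = split_storage_spec("delim:mylist.txt")
print(adaptor, target)
```

Splitting only at the first colon means the remainder is passed along whole, whatever characters it contains.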
The delimited text adaptor LIST_delim is a simple way to store simple objects. This adaptor stores each record in the list as a single line in the text file named. To illustrate this usage, let's assume the mylist.txt file contains the following, where [tab] is replaced by actual tab characters:
    # This is a comment line.
    First line [tab] second field
    Second line [tab] blah blah
    Third line [tab] blorgh
    # Maybe another comment
    Here is another [tab] hahaha
In this file, note that blank lines are ignored, as are comment lines (those beginning with #). Any line not a comment line or a blank line is assumed to be a record in mylist.
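The reading rule just described (skip blank lines, skip lines beginning with #, treat everything else as a tab-separated record) looks roughly like this in Python. It is a sketch of the rule, not the LIST_delim adaptor's actual code; the blank line in the sample data is added here to exercise the rule.

```python
# Sample contents of mylist.txt, with real tab characters between fields.
SAMPLE = (
    "# This is a comment line.\n"
    "First line\tsecond field\n"
    "Second line\tblah blah\n"
    "\n"
    "Third line\tblorgh\n"
    "# Maybe another comment\n"
    "Here is another\thahaha\n"
)

def read_records(text):
    """Return one list of field values per record line.

    Blank lines and lines beginning with '#' are skipped; every other
    line is split on tab characters.
    """
    records = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        records.append(line.split("\t"))
    return records

records = read_records(SAMPLE)
for record in records:
    print(record)
```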
So if we run repmgr from the command line, we can poke around a little. The following transcript is from Windows.
    C:\projects\sites\test>repmgr -r site.opm
    +000: Repository open.
    repmgr v1.0 listening: type 'help' for help.
    ++done++
    list
    +100: OK, data follows.
    1 key(s) found:
    mylist
    +000: OK
    ++done++
    list mylist
    +100: OK, data follows.
    4 key(s) found:
    first_line
    second_line
    third_line
    here_is_another
    +000: OK
    ++done++
    get tabs here_is_another
    +200: OK, XML follows.
    <rec id="here_is_another" key="here_is_another" list="mylist">
    <field id="field1">Here is another</field>
    <field id="field2">hahaha</field>
    <field id="linenum">3</field>
    </rec>
    >>
    bye
    +000: Ciao ragazzo.
    ++done++
mylist.txt, but sometimes there may be a lot more involved.
The delim: adaptor has a pretty strictly delineated object structure; other adaptors are free to return whatever XML may be appropriate, but since delim: is building the XML for you from the line of text it sees, there's not a lot of flexibility. But for all adaptors, "get" is always the same: give a key, get an object.
delim: always delivers XML that is a <rec> element, containing a list of <field> elements precisely as defined in the list definition. It then tacks another field on the end with the line number; this can be used for any purpose.
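Here is a rough Python sketch of how such a <rec> element could be assembled from the list definition's field ids and one line's values. Two things are assumptions read off the transcript rather than documented behavior: that the key comes from the first field, lowercased with spaces turned into underscores, and that the line number is simply passed in by the caller. The names make_key and build_rec are hypothetical.

```python
import xml.etree.ElementTree as ET

def make_key(first_field):
    # Assumption inferred from the transcript ("Here is another" -> "here_is_another").
    return first_field.strip().lower().replace(" ", "_")

def build_rec(list_id, field_ids, values, linenum):
    """Build a <rec> element: one <field> per defined field, then a linenum field."""
    key = make_key(values[0])
    rec = ET.Element("rec", {"id": key, "key": key, "list": list_id})
    for field_id, value in zip(field_ids, values):
        ET.SubElement(rec, "field", {"id": field_id}).text = value
    # The extra field tacked onto the end, as described above.
    ET.SubElement(rec, "field", {"id": "linenum"}).text = str(linenum)
    return rec

rec = build_rec("mylist", ["field1", "field2"], ["Here is another", "hahaha"], 3)
print(ET.tostring(rec, encoding="unicode"))
```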
The wftk allows you not only to read this data, but also to modify it and add new entries. That will be covered in the data storage portion of the User's Manual.
Although delim: is a very easy and flexible way to slap some data into a file and get the wftk to use it, it has some limitations, so it isn't the default storage adaptor. The default is the localdir: adaptor (local directory), which stores each object as a named XML file in a subdirectory of the repository's local directory. The localdir: adaptor can also store attachments as separate files in that same directory, a capability used for versioned document management and for storage of process definitions for workflow.
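The "one object, one named XML file" idea can be sketched in Python as follows. The directory layout used here (a subdirectory named after the list, a file named after the key with an .xml extension) is purely an assumption for illustration, not the wftk's documented on-disk format, and store_object and load_object are hypothetical names.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path

def store_object(repo_dir, list_id, key, element):
    """Write one object to <repo_dir>/<list_id>/<key>.xml (assumed layout)."""
    list_dir = Path(repo_dir) / list_id
    list_dir.mkdir(parents=True, exist_ok=True)
    path = list_dir / (key + ".xml")
    ET.ElementTree(element).write(str(path), encoding="utf-8")
    return path

def load_object(repo_dir, list_id, key):
    """Read the object back from its file and return the root element."""
    path = Path(repo_dir) / list_id / (key + ".xml")
    return ET.parse(str(path)).getroot()

# Round trip inside a throwaway directory standing in for the repository.
with tempfile.TemporaryDirectory() as repo:
    note = ET.Element("note", {"author": "me"})
    note.text = "Some object content"
    saved = store_object(repo, "notes", "first_note", note)
    loaded = load_object(repo, "notes", "first_note")
    roundtrip_ok = (loaded.tag, loaded.get("author"), loaded.text) == (
        "note", "me", "Some object content")

print(saved.name, roundtrip_ok)
```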
The localdir: adaptor's XML storage is completely arbitrary. It can be anything at all. Let's look at an example.
All fine and well, but let's face it, most data in the world is in relational databases, and for a very good reason: they work really well. The wftk has adaptors for a number of different databases (ODBC under Windows, Oracle, and MySQL) and writing new database code is quite easy, so if you have a favorite database in your system environment, you can get the wftk to talk to it without a lot of hassle.
At any rate, we'll use MySQL to illustrate database connectivity, with the assurance that any other database will work in the exact same way.
The wftk can't create tables in your database, so for a list stored in a database the list definition only describes the storage; it doesn't establish it. This example will use a MySQL table created using the following code. (If you use a different database, of course, you'll modify this correspondingly.)
    create table mylist (
      field1 varchar(50),
      field2 varchar(50)
    );
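Since you have to create the table yourself, the columns need to line up with the fields of the list. As a rough sketch, the statement above can be derived from a list's fields in Python and checked against an in-memory SQLite database, which stands in for MySQL here only so the example runs anywhere. The helper name create_table_sql is hypothetical, the varchar(50) column type is simply taken from the example, and the <list> element deliberately omits its storage attribute.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Only the id and fields matter for this sketch; the storage attribute is left out.
LIST_XML = '<list id="mylist"><field id="field1"/><field id="field2"/></list>'

def create_table_sql(list_xml, column_type="varchar(50)"):
    """Build a 'create table' statement with one column per <field>.

    Names are interpolated directly, so only use this with definitions you trust.
    """
    lst = ET.fromstring(list_xml)
    columns = ", ".join(
        "%s %s" % (field.get("id"), column_type) for field in lst.findall("field")
    )
    return "create table %s (%s)" % (lst.get("id"), columns)

ddl = create_table_sql(LIST_XML)
print(ddl)

# Run the statement against SQLite (a stand-in, not MySQL) and read the columns back.
conn = sqlite3.connect(":memory:")
conn.execute(ddl)
column_names = [row[1] for row in conn.execute("pragma table_info(mylist)")]
conn.close()
print(column_names)
```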
Given that table, we can now define a list in the repository to access it, as follows: