wftk tutorial - 02 Data

The first chapter in the wftk "book" is about data for a very important reason, one I missed for the entire first year I worked with the wftk back at the turn of the century: without an organizing principle for the data sources a workflow system uses, the system can express very little. The better you can describe a variety of data sources, the more useful your workflow system will be.

Data in the wftk is organized into lists. A list is more or less equivalent to an SQL table, with a few differences: first, every record in a wftk list must have a unique key; second, the individual records are not restricted to having the same fields; and third, documents may be attached to any arbitrary record. Still, I find the SQL table to be a useful abstraction when thinking of lists, and of course the session actually exposes a DBI handle if you just want to work with SQL from the get-go.

But a list can also represent any arbitrary data structure. By implementing subclasses of Workflow::wftk::Data, we can make nearly anything addressable by the workflow system. Is your data stored as rows in a file? Elements in an XML file? Individual files in a subdirectory? Actual rows in a MySQL or ODBC table? No problem; once you've defined its structure, the workflow system can see that data and manipulate it directly. The wftk can even copy data from one source to another with a single command, or use index tables in one storage form for data in a second storage form. It also understands lists of data nested within the records of another list. The idea is that any data you can describe can take part in a workflow.
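To make the adapter idea concrete, here is a minimal sketch of one interface with two storage forms behind it, plus a copy between them. It is written in Python purely as illustration; DataSource, MemorySource, DirectorySource and copy_list are hypothetical names of my own, not the Workflow::wftk::Data interface, and storing each record as a JSON file is an assumption made for the example.

```python
import json
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path


class DataSource(ABC):
    """Hypothetical adapter interface: one subclass per storage form."""

    @abstractmethod
    def keys(self): ...

    @abstractmethod
    def get(self, key): ...

    @abstractmethod
    def put(self, key, record): ...


class MemorySource(DataSource):
    """Records held in an in-memory dict."""

    def __init__(self):
        self._rows = {}

    def keys(self):
        return sorted(self._rows)

    def get(self, key):
        return dict(self._rows[key])

    def put(self, key, record):
        self._rows[key] = dict(record)


class DirectorySource(DataSource):
    """Each record is one JSON file in a directory, named after its key."""

    def __init__(self, directory):
        self._dir = Path(directory)
        self._dir.mkdir(parents=True, exist_ok=True)

    def keys(self):
        return sorted(p.stem for p in self._dir.glob("*.json"))

    def get(self, key):
        return json.loads((self._dir / f"{key}.json").read_text())

    def put(self, key, record):
        (self._dir / f"{key}.json").write_text(json.dumps(record))


def copy_list(source, target):
    """Copy every record from one source to another, whatever the storage."""
    for key in source.keys():
        target.put(key, source.get(key))


memory = MemorySource()
memory.put("t1", {"title": "Order paper"})
memory.put("t2", {"title": "Fix printer", "priority": 2})

files = DirectorySource(tempfile.mkdtemp())
copy_list(memory, files)
```

Because copy_list only talks to the shared interface, it never needs to know which storage form sits on either side; that is the property the subclassing approach is after.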

Each item in a list is a record. A record is essentially just a hash of named values. The different records in a list will often, perhaps usually, all have the same fields -- but that is not required.
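As a rough mental model, a list behaves like a keyed collection of hashes. The sketch below is in Python for illustration only; RecordList and its methods are hypothetical names, not part of the wftk, and it shows just two of the properties described above: keys must be unique, and records need not share the same fields.

```python
# Conceptual model only: these names are hypothetical, not the wftk API.
class RecordList:
    """A keyed collection of records; each record is a plain dict."""

    def __init__(self, name):
        self.name = name
        self._records = {}  # key -> dict of named values

    def add(self, key, **fields):
        # Every record needs a unique key.
        if key in self._records:
            raise KeyError(f"duplicate key: {key!r}")
        self._records[key] = dict(fields)

    def get(self, key):
        return self._records[key]

    def keys(self):
        return list(self._records)


tasks = RecordList("tasks")
tasks.add("t1", title="Order paper", owner="ann")
# A second record with a different set of fields is still valid.
tasks.add("t2", title="Fix printer", priority=2, due="2024-05-01")
```

Adding a second record under the key "t1" would raise KeyError here, which is the one constraint an SQL table does not necessarily impose.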

In addition to field-based data, the record may also store historical data about changes made to it, actions performed on it, or events involving it. We'll see this later, when the facility is used to store the enactment of a workflow process, but it is available to any record in the wftk. The only requirement is that you define where this historical information is stored, if the storage mode you're using doesn't do it for you. For instance, if a list is stored in a MySQL table, you'll have to put historical data elsewhere, perhaps in another MySQL table or in a separate log file.
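The split between a record's fields and its history can be sketched like this. Again this is Python used for illustration, with hypothetical names (HistoryLog, TrackedList); the assumption is simply that history lives in a separate store from the fields, as it would when the list itself is a plain table.

```python
from datetime import datetime, timezone


class HistoryLog:
    """Hypothetical stand-in for wherever history lives (a table, a log file)."""

    def __init__(self):
        self.entries = []

    def append(self, key, event, detail):
        stamp = datetime.now(timezone.utc).isoformat()
        self.entries.append(
            {"key": key, "when": stamp, "event": event, "detail": detail}
        )

    def for_record(self, key):
        return [e for e in self.entries if e["key"] == key]


class TrackedList:
    """Records hold only current field values; changes go to the log."""

    def __init__(self, history):
        self._records = {}
        self._history = history

    def update(self, key, **changes):
        record = self._records.setdefault(key, {})
        record.update(changes)
        self._history.append(key, "update", dict(changes))

    def get(self, key):
        return self._records[key]


log = HistoryLog()
orders = TrackedList(log)
orders.update("o1", status="new")
orders.update("o1", status="approved", approver="ann")
```

The record itself ends up holding only the latest values, while log.for_record("o1") returns both changes in the order they were made.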

The wftk also understands document management. To any record, you can attach any set of arbitrary documents (although a given list may restrict your ability to do so with wanton abandon). These attachments can be anything -- from incoming faxes to source code to ... whatever. The wftk will handle version control for you if necessary. Document management can also simply be a descriptive system to track files managed externally, for instance the code files in a programming project.
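One way to picture attachments with version control is a store keyed by record and document name that keeps every version it is given. This Python sketch is illustrative only; Attachments and its methods are hypothetical, and keeping whole copies of each version in memory is an assumption chosen for brevity, not a description of how the wftk stores documents.

```python
class Attachments:
    """Hypothetical per-record document store that keeps every version."""

    def __init__(self):
        self._versions = {}  # (record key, document name) -> list of bytes

    def attach(self, key, name, content):
        versions = self._versions.setdefault((key, name), [])
        versions.append(content)
        return len(versions)  # version number, starting at 1

    def fetch(self, key, name, version=None):
        # With no version given, return the most recent one.
        versions = self._versions[(key, name)]
        return versions[-1] if version is None else versions[version - 1]

    def names(self, key):
        return sorted(n for (k, n) in self._versions if k == key)


docs = Attachments()
docs.attach("t1", "quote.txt", b"draft")
latest = docs.attach("t1", "quote.txt", b"final")
docs.attach("t1", "fax.tif", b"scanned page")
```

Attaching a document under an existing name adds a version instead of overwriting, so earlier contents stay retrievable by number.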
