I never really officially released wftk 1.0, of course (the magnitude of
the task simply grew and grew and I became less and less certain of my
approach -- and then the recession happened). But I've been thinking a lot
about a more reasoned approach lately, and maybe it's time to reboot the wftk
project and start more or less "from scratch".
I see the modules in this new approach more or less as the following:
- Data management
This is the basic list-and-record aspect that the repository manager
started out addressing. Now, of course, there is SQLite. So a principled
workflow toolkit would start by using SQLite for local tables, and add
"external tables" (for which the new SQLite has an API) defined in what
SAP now calls the "system landscape". It's amazing, by the way, how much
of my thinking over the past few years I see reflected in what SAP is doing
lately in their NetWeaver stuff.
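A minimal sketch of the list-and-record idea on top of SQLite, using Python's standard sqlite3 module. The table name, columns, and helper functions are made up for illustration and are not part of wftk; external tables are not covered here.

```python
import sqlite3

def open_repository(path=":memory:"):
    """Open a SQLite database and make sure the (hypothetical) 'task' list exists."""
    conn = sqlite3.connect(path)
    conn.row_factory = sqlite3.Row  # lets us read records by column name
    conn.execute(
        "CREATE TABLE IF NOT EXISTS task ("
        " id INTEGER PRIMARY KEY,"
        " title TEXT NOT NULL,"
        " status TEXT NOT NULL DEFAULT 'open')"
    )
    return conn

def add_record(conn, title):
    """Add one record to the list and return its key."""
    cur = conn.execute("INSERT INTO task (title) VALUES (?)", (title,))
    conn.commit()
    return cur.lastrowid

def get_record(conn, record_id):
    """Fetch a single record as a dict, or None if the key is unknown."""
    row = conn.execute("SELECT * FROM task WHERE id = ?", (record_id,)).fetchone()
    return dict(row) if row is not None else None

def list_records(conn, status=None):
    """Return the whole list, optionally filtered by status."""
    if status is None:
        rows = conn.execute("SELECT * FROM task ORDER BY id")
    else:
        rows = conn.execute(
            "SELECT * FROM task WHERE status = ? ORDER BY id", (status,)
        )
    return [dict(r) for r in rows]
```

Everything above the SQL is just a thin convenience layer; the point is that the "repository" is nothing more than a SQLite file.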
- Document management
Document management, as I see it, consists of: (1) actual central storage
and versioning of unstructured data; (2) storage of metadata about documents;
(3) parsing and indexing of unstructured data to produce structured data
elsewhere in the system. The document manager should be able to work well
either in situations where it controls storage (and thus can initiate action
whenever anything is changed) or where it merely indexes storage that can
be changed externally -- the latter might be, for instance, management of
a Website's files in the file system. Or just your system files on a Windows
machine. Periodically, the document manager could check in and see whether
things had been changed, and if so, trigger arbitrary action.
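The periodic check-in on externally changed storage could look roughly like this. It is only a sketch: it assumes files are small enough to hash in one read, and the names snapshot, diff_snapshots, check_in, and the on_change callback are hypothetical.

```python
import hashlib
import os

def snapshot(root):
    """Map each file path (relative to root) to a SHA-256 digest of its contents."""
    result = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            with open(full, "rb") as f:
                digest = hashlib.sha256(f.read()).hexdigest()
            result[os.path.relpath(full, root)] = digest
    return result

def diff_snapshots(old, new):
    """Return sorted lists of added, removed, and changed paths."""
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    changed = sorted(p for p in set(old) & set(new) if old[p] != new[p])
    return added, removed, changed

def check_in(root, old, on_change):
    """Compare storage against the last snapshot and trigger an action per change."""
    new = snapshot(root)
    added, removed, changed = diff_snapshots(old, new)
    for path in added:
        on_change("added", path)
    for path in removed:
        on_change("removed", path)
    for path in changed:
        on_change("changed", path)
    return new  # becomes the baseline for the next check-in
```

The caller keeps the returned snapshot (in a table, say) and passes it back in on the next run, so the document manager never needs to control the storage itself.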
- "Action" management
A central script and code repository defines the actions that can be taken
by a system. I consider this to include versioning and some kind of
change management and documentation system, including literate programming
and indexing of the code snippets. The build process should also be managed
here, and should be capable, for instance, of taking algorithms written in
C, compiling them into DLLs or .so dynamic load libraries, and calling them
from Perl, say. Ultimately.
Actions, documents, and data would have a nested structure, by the way; there
would be global actions, application actions (a given case or project could
be an instance of an application), and project/instance actions, and the same
applies to data and documents, perhaps. Originally I'd thought of doing the
same for users or organizational units, but I really think that if you're
defining a common language of actions and data, it should be organized into
applications and, perhaps, subapplications or something. But it shouldn't
differ by user! (I might be wrong, of course.)
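One way to picture the nesting of global, application, and instance actions is a layered lookup in which the innermost scope wins. A small sketch using Python's collections.ChainMap; the action names and the "helpdesk" application are invented for the example.

```python
from collections import ChainMap

# Three hypothetical scopes, from outermost to innermost.
global_actions = {
    "notify": lambda: "global notify",
    "archive": lambda: "global archive",
}
application_actions = {
    "notify": lambda: "helpdesk notify",  # overrides the global action
}
instance_actions = {
    "close": lambda: "close ticket 42",   # exists only for this case
}

# ChainMap searches its maps left to right: instance, application, global.
actions = ChainMap(instance_actions, application_actions, global_actions)

def run(name):
    """Look up an action by name in the nested scopes and execute it."""
    return actions[name]()
```

Here run("notify") picks the application-level version, while run("archive") falls through to the global one. The same layering would work for data and documents.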
The above three modules together allow a data-flow-oriented processing
system, but we're still
- Outgoing interfaces
This includes publishing of HTML pages, outgoing mail notifications, other
notifications such as SMS or ... whatever. Logged, all of it. It includes
report generation into the document management system or the file system,
generation of PDFs, etc.
- Incoming interfaces
Given the parsing power of the document management module, this is more an
organizational module. The system should be able to receive email, parse
it, and take action. Conversational interfaces are covered here as well,
from SMTP- and IMAP-like state machines to chatbot NLP interfaces. And
of course form submission from Websites also falls into this bucket.
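As a rough illustration of "receive email, parse it, and take action", here is a sketch built on Python's standard email package. The convention that the first word of the subject names the command, and the handlers mapping, are assumptions made for the example.

```python
from email import message_from_string
from email.policy import default

def handle_incoming(raw, handlers):
    """Parse a raw message and dispatch on the first word of its subject.

    handlers maps a lowercase command word to a callable taking
    (sender, argument, body). Returns the handler's result, or None
    if no handler is registered for the command.
    """
    msg = message_from_string(raw, policy=default)
    subject = msg["Subject"] or ""
    command, _, argument = subject.strip().partition(" ")
    handler = handlers.get(command.lower())
    if handler is None:
        return None
    # Prefer the plain-text part; fall back to an empty body.
    part = msg.get_body(preferencelist=("plain",))
    body = part.get_content() if part is not None else ""
    return handler(str(msg["From"]), argument, body)
```

A form submission or a chat message could be funneled into the same dispatch once it has been reduced to a sender, a command, and a body.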
- Scheduling
Whether running on Unix with cron and at, or Windows with ... whatever the
hell Windows offers, the system should have a single unified way of dealing
with time in a list of scheduled tasks.
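A single list of scheduled tasks could be as simple as one SQLite table plus a function that runs whatever is due. This is a sketch under that assumption; the table layout and function names are hypothetical, and the platform's own scheduler would only need to call run_due periodically.

```python
import sqlite3

def init_schedule(conn):
    """Create the (hypothetical) table holding all scheduled tasks."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS scheduled_task ("
        " id INTEGER PRIMARY KEY,"
        " due REAL NOT NULL,"      # Unix timestamp
        " action TEXT NOT NULL,"   # name of the action to run
        " done INTEGER NOT NULL DEFAULT 0)"
    )

def schedule(conn, due, action):
    """Add a task to the list and return its key."""
    cur = conn.execute(
        "INSERT INTO scheduled_task (due, action) VALUES (?, ?)", (due, action)
    )
    conn.commit()
    return cur.lastrowid

def run_due(conn, now, actions):
    """Run every pending task whose due time has passed; return the names run."""
    rows = conn.execute(
        "SELECT id, action FROM scheduled_task"
        " WHERE done = 0 AND due <= ? ORDER BY due, id",
        (now,),
    ).fetchall()
    ran = []
    for task_id, name in rows:
        actions[name]()  # look up the action by name and execute it
        conn.execute("UPDATE scheduled_task SET done = 1 WHERE id = ?", (task_id,))
        ran.append(name)
    conn.commit()
    return ran
```

Passing the current time in as an argument, rather than reading the clock inside run_due, keeps the logic identical across platforms and easy to test.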
- Users, groups, roles, and permissions
This module would be in charge of keeping track of who is performing a
given action and whether they're allowed to do so. The original wftk already
provided a really nice mechanism which would still be nice here: when judging
permissions, any action can get the answers "yes, it's allowed", "no, it's
not allowed," and "it's allowed subject to approval." That last invokes
workflow for any arbitrary action and that would be a powerful
abstraction for nearly any system. It's essentially transaction management
on a much more abstract scale.
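The three possible answers can be sketched as an enumeration plus a queue of actions waiting for approval. The roles, action names, and rule table below are invented for illustration; this is not the original wftk mechanism, only one way to model it.

```python
from enum import Enum

class Verdict(Enum):
    ALLOWED = "yes"
    DENIED = "no"
    NEEDS_APPROVAL = "subject to approval"

# Hypothetical rule table: (role, action) -> verdict. Anything unlisted is denied.
RULES = {
    ("clerk", "view_invoice"): Verdict.ALLOWED,
    ("clerk", "pay_invoice"): Verdict.NEEDS_APPROVAL,
    ("manager", "pay_invoice"): Verdict.ALLOWED,
}

def judge(role, action):
    """Answer whether a role may perform an action."""
    return RULES.get((role, action), Verdict.DENIED)

def perform(role, action, do_it, pending):
    """Run the action, park it for approval, or refuse it."""
    verdict = judge(role, action)
    if verdict is Verdict.ALLOWED:
        return do_it()
    if verdict is Verdict.NEEDS_APPROVAL:
        pending.append((role, action, do_it))  # this is where workflow starts
        return "pending"
    raise PermissionError(f"{role} may not {action}")

def approve(pending, index=0):
    """An approver releases a parked action, which then actually runs."""
    _role, _action, do_it = pending.pop(index)
    return do_it()
```

In a real system the pending list would live in the database, so that approval could arrive days later.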
And finally, the icing on the cake,
- Workflow
The two components which make workflow workflow are a task list (tasks are
hierarchical in nature and so a task can have subtasks as a separate project)
and a workflow process definition language. The new wftk should be able to
work with any workflow formalism -- after all, the process definitions are
considered scripts in the versioned script document repository. The existing
wftk engine will almost certainly fit in here with little modification.
The primary benefit of workflow is that it allows dissociation over time.
A running workflow process isn't active on the machine for the weeks or months
it might require -- it's simply a construct in the database that gets
resurrected as required. There are a boatload of applications for this in general
programming, but nobody sees them as workflow because everybody "knows"
workflow is a business application. The wftk was to have changed that, and
I think the potential's still there.
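To make "dissociation over time" concrete: the process below exists only as a row in SQLite between steps, and each call to advance resurrects it, moves it one step, and puts it back. This is a sketch with a hypothetical three-step linear process; it is not the wftk engine or its process definition language.

```python
import json
import sqlite3

STEPS = ["draft", "review", "publish"]  # hypothetical linear process definition

def init(conn):
    """Create the table that holds dormant processes."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS process ("
        " id INTEGER PRIMARY KEY,"
        " step INTEGER NOT NULL,"
        " data TEXT NOT NULL)"
    )

def start(conn, data):
    """Store a new process at its first step and return its key."""
    cur = conn.execute(
        "INSERT INTO process (step, data) VALUES (0, ?)", (json.dumps(data),)
    )
    conn.commit()
    return cur.lastrowid

def advance(conn, pid, result):
    """Resurrect a process, record the result of its current step, store it again.

    Returns the name of the next step, or "done" when the process is finished.
    """
    row = conn.execute(
        "SELECT step, data FROM process WHERE id = ?", (pid,)
    ).fetchone()
    if row is None:
        raise KeyError(pid)
    step, data = row[0], json.loads(row[1])
    if step >= len(STEPS):
        return "done"  # already finished; nothing to resurrect
    data[STEPS[step]] = result
    step += 1
    conn.execute(
        "UPDATE process SET step = ?, data = ? WHERE id = ?",
        (step, json.dumps(data), pid),
    )
    conn.commit()
    return STEPS[step] if step < len(STEPS) else "done"
```

Nothing stays in memory between calls, so weeks can pass (or the machine can reboot) between one advance and the next.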
There's also a case to be made for a module for
- Knowledge management
This portion of my thinking is a little less organized. I'd kind of like
to lump some kind of concept database in here, perhaps a semantic parser
or something. Originally I'd thought that AI would go in here, but I
actually think that Prolog might just be another action script language.
This is definitely a blurry line in its native habitat, and crikey, he's
not happy to see me here!
But the point of a blog is to write this stuff down as it occurs. So there
you have it: this would sit on top of the workflow. Think of it as a way
to build smart agents into your data/document/action/workflow management
system.
And there you have it -- my plan to wrap up the thought and work of eight years.
Oh, and this time I'm not bothering with licensing requirements. Like SQLite,
wftk 2.0 will be in the public domain. I don't really care if I get credit
or not for every little thing, because frankly, anybody who counts will figure
it out. And have you noticed how everything these days uses SQLite?
It's because -- well, primarily because it works, but also because you don't
have to worry about legal repercussions of using the code.
That's where wftk document management should be, where wftk workflow should
be. Simple, easy to use, and ubiquitous.