Repository manager: CGI interface

The repository manager is the data handling portion of the wftk open-source workflow toolkit. It organizes objects (sometimes I call them entries) into lists. Such a collection of lists is called a repository. An additional feature of a repository is that it may have a Web view; the layout of this site is the other half of the repository manager's functionality. Pages are organized into a site map, and publishing links are established between lists and pages; when data changes, static pages can be written to the Website, and those pages can be served up very quickly and easily.

All this functionality works rather well behind the scenes with the repmgr's command-line interface, but when it comes to adding and editing data, the command-line interface offers no particularly good solution, unless you genuinely enjoy typing long XML object files without any backspace. Thus a Web interface is a very handy addition to the toolkit; this CGI interface is the first such front end.

Since the repmgr already has a lot of HTML-formatting functionality built in, the CGI interface is designed to be integrated right into the Website it manages. This means that pages can be formatted in the same way as the overall site with no need to duplicate the formatting work between different systems. Moreover, since the repmgr is integrated with the overall wftk workflow system, it is very easy to set up a system to allow anonymous users (or non-anonymous users) to submit entries to lists, subject to an approval process involving responsible administrators. This is a powerful feature, and a quick-and-easy CGI interface to it means you can set up interactive Websites very easily. (Caveat: this is still alpha-quality software. So by "very easily" I may well mean something you don't find very easy. Give it some time, though, and it will be very easy.)

The CGI interface needn't provide the same level of functionality as the command-line interface. At first, all I really want to do is to provide facilities to display blank forms and add new objects, display edit forms and modify existing objects, list objects, and delete objects. And even listing objects will be the second phase of development -- the minimal level of functionality will rely on a functioning display Website to provide navigation for the editing scripts.

So. Down to business. First order of the day is to include our repmgr and CGI functionality, and get the CGI environment. I'm using my XMLCGI interface to collect CGI environment information; what a blast from the past! I haven't looked at it for two years or so, since I wrote it to support the prototype procdef editor. (Note that the procdef editor will soon be revived, based on the repository management functionality now finally working.) This framework below can be seen as the canonical framework for any CGI program. In the spirit of coming up with a universal template, I'm trying to build it as generally as possible.

#include <stdio.h>   /* printf, fopen */
#include <string.h>  /* strcmp, strncmp */
#include "repmgr.h"
#include "../xmlapi/xmlcgi.h"

XML * cgi;
XML * env;
XML * query;

XML * headers; /* TODO: in case I decide to build a general CGI framework, this would be useful. */
XML * retval = NULL;

See Global variable definitions

int handle_error (const char * message);
int main(int argc, char* argv[]) {
   cgi = xmlcgi_init();
   env = xml_loc (cgi, ".environment");
   query = xml_loc (cgi, ".query");

   xml_set (cgi, "mimetype", "text/html");

   See Opening the repository definition
   See Authenticating the user
   See Figuring out what to do

   if (*xml_attrval (cgi, "redirect")) printf ("Location: %s\n", xml_attrval (cgi, "redirect"));
   printf ("Content-type: %s\n\n", xml_attrval (cgi, "mimetype"));
   if (retval) {
      xml_writehtml (stdout, retval);
      printf ("\n\n");
   } else {
      printf ("<h2>No result from script</h2><hr>No result was returned from this script.\n"); /* TODO: configuration value. */
   }

   printf ("<br><br><hr width=30%%><font size=-1 color=gray><center>Site managed brilliantly with <a href=http://www.vivtek.com/wftk/doc/code/repmgr/repmgr_cgi.html>repmgr CGI front end</a></center></font>\n\n");

   xml_free (cgi);
   return 0;
}

See Handling error display
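The header logic at the end of main can be looked at in isolation. What follows is only a minimal sketch, not part of repmgr: format_cgi_headers is a hypothetical helper that writes into a buffer instead of stdout, and it assumes (as the code above does) that the webserver turns a Location header into a redirect on its own.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper, not part of repmgr or xmlcgi.
   Formats the CGI response headers into buf: an optional Location line,
   then the Content-type line, then the blank line that ends the headers.
   Returns what snprintf returns, i.e. the length the full header text needs. */
int format_cgi_headers(char *buf, size_t size, const char *redirect, const char *mimetype)
{
    assert(buf != NULL && mimetype != NULL);

    if (redirect != NULL && redirect[0] != '\0') {
        return snprintf(buf, size, "Location: %s\nContent-type: %s\n\n", redirect, mimetype);
    }
    return snprintf(buf, size, "Content-type: %s\n\n", mimetype);
}
```

Calling it with an empty redirect yields just the Content-type header, which is the normal case for the form-display modes; the add and mod handlers are the ones that set a redirect.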

Global variable definitions
Here are the global variables used by the rest of the program.

FILE * file;
XML * repos = NULL;
XML * list;
XML * layout;
XML * form;
XML * defn;
XML * obj;
XML * holder;
XML * mark;
const char * username;
const char * mode;
XML * action;

Opening the repository definition
The first order of business is of course to open the repository definition so that we can do everything else in the context of the repository. This is much simpler than in the case of the command-line interface, because there's no particularly nice way to pass command-line flags to a CGI program. So we just open the current directory's copy of site.opm. No muss, no fuss.

file = fopen ("site.opm", "r");
if (!file) {
   return handle_error ("No site definition file was found.  Please notify the site administrator.");
}
repos = xml_read (file);
fclose (file); /* The definition has been read into memory, so the handle can go. */
if (!repos) {
   return handle_error ("The site definition file is corrupt.  Please notify the site administrator.");
}
repos_open (repos, NULL, "cgi");

Authenticating the user
This is going to be to-do for a little while. The repository manager library doesn't yet really have user-auth functions, and it'd be silly to implement it twice. Naturally, the wftk core has the user adaptor class, but I haven't looked at it in repmgr yet.

The first level of authentication will be a browser-level authentication; soon after that, however, I'll look at creating user sessions in the repository, which will be indexed via cookie. Then we can store authentication information in the session and skip the browser authentication dialog, which typically confuses most naive users a great deal.

The zeroth level of authentication will simply be to leave it to the webserver; in this case we can just look at the auth information the server passes in through the CGI environment (REMOTE_USER).

username = xml_attrval (env, "REMOTE_USER");
list = repos_defn (repos, "_users");
if (list) {
   /* Here's where we'll do repmgr-local user authentication. TODO: do it. */
}
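For the cookie-indexed sessions mentioned above, the first step would be fishing a session id out of the Cookie header. Here is a rough sketch of that step only, under stated assumptions: cookie_value is a hypothetical helper, the cookie name "repmgr_session" in the usage note is made up for illustration, and nothing here touches the repository or the planned _sessions list.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of repmgr or xmlcgi.
   Looks for "name=value" in a Cookie header such as "a=1; sid=xyz" and
   copies the value into out. Returns 1 if the cookie was found and fits,
   0 otherwise (out is then left as an empty string). */
int cookie_value(const char *header, const char *name, char *out, size_t size)
{
    size_t name_len;
    const char *p;

    assert(name != NULL && out != NULL && size > 0);
    out[0] = '\0';
    if (header == NULL) return 0;

    name_len = strlen(name);
    p = header;
    while (*p != '\0') {
        const char *end;
        size_t len;

        while (*p == ' ' || *p == ';') p++;       /* skip separators */
        end = strchr(p, ';');
        len = end ? (size_t)(end - p) : strlen(p); /* length of this pair */

        if (len > name_len && strncmp(p, name, name_len) == 0 && p[name_len] == '=') {
            size_t value_len = len - name_len - 1;
            if (value_len >= size) return 0;       /* value would not fit */
            memcpy(out, p + name_len + 1, value_len);
            out[value_len] = '\0';
            return 1;
        }
        p += len;
    }
    return 0;
}
```

In the CGI program the header would presumably come from the HTTP_COOKIE environment variable, e.g. cookie_value (getenv ("HTTP_COOKIE"), "repmgr_session", sid, sizeof sid), and the resulting id would then be looked up as a key in the session list.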

Figuring out what to do
Before we can do anything, we have to figure out what it is we want to do. This is governed by the "mode" variable. The default mode is probably going to be "list" -- but since the list handler is going to be in the second phase, we'll just not do a whole lot in the absence of a mode.

Useful modes (actions) will be as follows. In general, GET will be used for retrieval of information, and POST will be used wherever a change to data is being proposed. This allows a blanket user authentication to be imposed by the webserver itself in simple installations, and lets us gain some control over our data even if we haven't implemented any user authentication in the repository itself. These modes are roughly in order of priority of development.

Mode	Method	Action
new	GET	Presents blank HTML form for adding an object
add	POST	Adds an object from said blank form
edit	GET	Presents a filled HTML form for editing an object
copy	GET	Presents a filled HTML form for copying an object
mod	POST	Edits an existing object using form input
del	POST	Deletes an object
list	GET	Lists keys from a given list
info	GET	Gets installation information or a set editor home page.
attach	POST	Uploads file attachment
retrieve	GET	Downloads file attachment
checkout	POST?	Downloads for modification (i.e. locks the attachment)
search?	GET	Search objects -- I've got some pretty detailed plans for searching

And I suppose we'll take it further from there. It's pretty obvious that the document-management aspects of the CGI interface will be pretty exciting stuff.
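The GET/POST split in the table could be enforced before dispatching on the mode. The following is only a sketch of one way to do it, assuming the method arrives as the usual REQUEST_METHOD string; mode_table and mode_allows_method are hypothetical names, and the two modes marked with a question mark in the table (checkout and search) are left out because their details aren't settled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table mirroring the mode list above: which HTTP method
   each mode is expected to arrive with. */
struct mode_entry {
    const char *mode;
    const char *method;
};

static const struct mode_entry mode_table[] = {
    { "new",      "GET"  },
    { "add",      "POST" },
    { "edit",     "GET"  },
    { "copy",     "GET"  },
    { "mod",      "POST" },
    { "del",      "POST" },
    { "list",     "GET"  },
    { "info",     "GET"  },
    { "attach",   "POST" },
    { "retrieve", "GET"  }
};

/* Returns 1 if the mode is known and the method matches the table,
   0 if the mode is known but the method is wrong,
   -1 if the mode is not in the table at all. */
int mode_allows_method(const char *mode, const char *method)
{
    size_t i;

    assert(mode != NULL && method != NULL);
    for (i = 0; i < sizeof mode_table / sizeof mode_table[0]; i++) {
        if (strcmp(mode_table[i].mode, mode) == 0) {
            return strcmp(mode_table[i].method, method) == 0 ? 1 : 0;
        }
    }
    return -1;
}
```

The point of a check like this is that a webserver rule requiring authentication for POST then really does cover every data-changing mode, since a "mod" or "del" arriving by GET would be refused.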

2004-12-15: If the form has an _action in it, we should use that for the mode. There are also now some _action standards which we need to support: an action of "state-*" is a "mod" mode which is setting the state, i.e. "state=*", where "*" is replaced with whatever the actual state is. Unfortunately for us, this information is not in the query, but on stdin in the object returned from the form. Sigh. What a tangled web we weave. Anyway, this means that the "mod" handler will end up handling more than just modification. Doesn't make much sense, but at least there's historical justification for it. Or something.

mode = xml_attrval (query, "mode");
if (!*mode) mode = "info";

repos_log (repos, 5, 0, NULL, "CGI", "Call with mode %s", mode);

       if (!strcmp (mode, "new")) {
   See Displaying a blank form
} else if (!strcmp (mode, "add")) {
   See Adding an object from form input
} else if (!strcmp (mode, "edit")) {
   See Displaying a filled form for edit
} else if (!strcmp (mode, "copy")) {
   See Displaying a filled form for adding
} else if (!strcmp (mode, "mod")) {
   See Updating object content from form input
} else if (!strcmp (mode, "del")) {
   See Deleting an object
} else if (!strcmp (mode, "list")) {
   See Listing the members of a list
} else {
   xml_setf (cgi, "error", "Mode '%s' not supported.\n", mode);
   return handle_error (xml_attrval (cgi, "error"));
}

Displaying a blank form
The "new" mode requires a list name. If it doesn't get one, it throws an error.

2004-12-15: This fix has been a long time coming, but the repository manager now adds submit buttons by default in the edit form it returns for objects. If they're there, we should refrain from adding our own.

if (!*xml_attrval (query, "list")) { return handle_error ("No list specified in 'new' mode."); }

form = repos_form (repos, xml_attrval (query, "list"), NULL, "new");
if (!form) {
   xml_setf (env, "error", "List '%s' is not defined in the system.", xml_attrval (query, "list"));
   return handle_error (xml_attrval (env, "error")); /* Bail out here; form is NULL and can't be used below. */
}

/* Build form for output. */
retval = xml_create ("form");
xml_setf (retval, "action", "repmgr.cgi?mode=add&list=%s", xml_attrval (query, "list")); /* TODO: parametrize the URL */
xml_set (retval, "method", "POST");
if (xml_search (form, "input", "type", "file")) xml_set (retval, "enctype", "multipart/form-data");
xml_append_pretty (retval, form);

if (!xml_search (form, "input", "type", "submit")) {
   form = xml_create ("input");
   xml_set (form, "type", "submit");
   xml_set (form, "value", "Add object"); /* TODO: decorate with label of object. */
   xml_append_pretty (retval, xml_create ("br"));
   xml_append_pretty (retval, form);
}
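One thing the form-building code above glosses over is that the list name goes into the action URL verbatim, so a name containing a space or an ampersand would produce a broken query string. A minimal sketch of percent-encoding a query value follows; url_encode is a hypothetical helper, not something xmlapi provides as far as this page shows.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of repmgr or xmlapi.
   Percent-encodes src into dst for use as a query-string value.
   Unreserved characters (letters, digits, '-', '.', '_', '~') pass
   through unchanged; every other byte becomes %XX.
   Returns 0 on success, -1 if dst is too small. */
int url_encode(const char *src, char *dst, size_t size)
{
    static const char hex[] = "0123456789ABCDEF";
    static const char unreserved[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~";
    size_t used = 0;

    assert(src != NULL && dst != NULL);
    for (; *src != '\0'; src++) {
        unsigned char c = (unsigned char) *src;

        if (strchr(unreserved, c) != NULL) {
            if (used + 1 >= size) return -1;   /* keep room for the NUL */
            dst[used++] = (char) c;
        } else {
            if (used + 3 >= size) return -1;   /* "%XX" plus the NUL */
            dst[used++] = '%';
            dst[used++] = hex[c >> 4];
            dst[used++] = hex[c & 0x0F];
        }
    }
    if (used >= size) return -1;
    dst[used] = '\0';
    return 0;
}
```

The encoded value would then be what gets handed to xml_setf for the "action" attribute in place of the raw query value.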

Displaying a filled form for edit
Displaying an edit form is almost identical to displaying a blank form, except that we require a key.

2004-12-15: Same fix as for the "new" action above.

if (!*xml_attrval (query, "list")) { return handle_error ("No list specified in 'edit' mode."); }
if (!*xml_attrval (query, "key")) { return handle_error ("No key specified in 'edit' mode."); }

form = repos_form (repos, xml_attrval (query, "list"), xml_attrval (query, "key"), "edit");
if (!form) {
   xml_setf (env, "error", "Key '%s' in list '%s' can't be found.", xml_attrval (query, "key"), xml_attrval (query, "list"));
   return handle_error (xml_attrval (env, "error")); /* Bail out here; form is NULL and can't be used below. */
}

/* Build form for output. */
retval = xml_create ("form");
xml_setf (retval, "action", "repmgr.cgi?mode=mod&list=%s&key=%s", xml_attrval (query, "list"), xml_attrval (query, "key")); /* TODO: parametrize. */
xml_set (retval, "method", "POST");
xml_append_pretty (retval, form);
if (xml_search (form, "input", "type", "file")) xml_set (retval, "enctype", "multipart/form-data");
if (!xml_search (form, "input", "type", "submit")) {
   form = xml_create ("input");
   xml_set (form, "type", "submit");
   xml_set (form, "value", "Modify object"); /* TODO: decorate with label of object. */
   xml_append_pretty (retval, xml_create ("br"));
   xml_append_pretty (retval, form);
}

Displaying a filled form for adding
And then copying an object is basically identical to editing it, except that the URL on the form is different.

if (!*xml_attrval (query, "list")) { return handle_error ("No list specified in 'copy' mode."); }
if (!*xml_attrval (query, "key")) { return handle_error ("No key specified in 'copy' mode."); }

form = repos_form (repos, xml_attrval (query, "list"), xml_attrval (query, "key"), "edit");
if (!form) {
   xml_setf (env, "error", "Key '%s' in list '%s' can't be found.", xml_attrval (query, "key"), xml_attrval (query, "list"));
   return handle_error (xml_attrval (env, "error")); /* Bail out here; form is NULL and can't be used below. */
}

/* Build form for output. */
retval = xml_create ("form");
xml_setf (retval, "action", "repmgr.cgi?mode=add&list=%s", xml_attrval (query, "list")); /* TODO: parametrize the URL */
xml_set (retval, "method", "POST");
xml_append_pretty (retval, form);
if (xml_search (form, "input", "type", "file")) xml_set (retval, "enctype", "multipart/form-data");

if (!xml_search (form, "input", "type", "submit")) {
   form = xml_create ("input");
   xml_set (form, "type", "submit");
   xml_set (form, "value", "Add object"); /* TODO: decorate with label of object. */
   xml_append_pretty (retval, xml_create ("br"));
   xml_append_pretty (retval, form);
}

Adding an object from form input
When adding, we're processing an HTTP POST, meaning that our data is on stdin. After adding, we want to return a page which makes sense -- probably the processed added object makes the most sense, working on the assumption that the CGI interface is integrated with the published site.

if (!*xml_attrval (query, "list")) { return handle_error ("No list specified in 'add' mode."); }

defn = repos_defn (repos, xml_attrval (query, "list"));
if (!defn) {
   xml_setf (cgi, "error", "List '%s' undefined.", xml_attrval (query, "list"));
   return (handle_error (xml_attrval (cgi, "error")));
}
obj = xmlcgi_readstdin (cgi, defn);

repos_add (repos, xml_attrval (query, "list"), obj);

mark = xml_search (repos, "page", "displays", xml_attrval (query, "list"));
if (mark) {
   xml_set (cgi, "redirect", xml_attrval (mark, "page"));
} else {
   xml_setf (cgi, "redirect", "repmgr.cgi?mode=list&list=%s", xml_attrval (query, "list"));
}

retval = xml_parse ("<h2>Change made</h2>Your addition was made.  The server should redirect you to the main list page.");
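Since "our data is on stdin" carries most of the weight in this handler, here is a rough idea of what reading a POST body involves at the lowest level. This is a sketch under assumptions, not a description of what xmlcgi_readstdin actually does: read_post_body is a hypothetical function, and the size cap is my own addition.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper, not part of xmlcgi.
   Reads exactly `length` bytes of request body from `in`.
   Returns a malloc'd, NUL-terminated buffer that the caller must free,
   or NULL if the body is larger than max_length, the read comes up
   short, or memory runs out. */
char *read_post_body(FILE *in, size_t length, size_t max_length)
{
    char *body;

    assert(in != NULL);
    if (length > max_length) return NULL;      /* refuse oversized posts */

    body = malloc(length + 1);
    if (body == NULL) return NULL;

    if (fread(body, 1, length, in) != length) { /* short read: give up */
        free(body);
        return NULL;
    }
    body[length] = '\0';
    return body;
}
```

Under CGI the length would come from the CONTENT_LENGTH environment variable and the stream would be stdin; the server does not signal end-of-file on stdin reliably, which is why the sketch reads by count rather than until EOF.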

Updating object content from form input
Again, editing looks a whole lot like adding. 2004-12-15: Except for the fact that it now handles complex actions including state changes, and (unfortunately) deletes. Or will. I frankly don't need deletes yet, so I'm not going to implement them. I have a mortgage to pay. What I do need is the ability to intercept state change requests from complex action submit buttons.

if (!*xml_attrval (query, "list")) { return handle_error ("No list specified in 'mod' mode."); }

defn = repos_defn (repos, xml_attrval (query, "list"));
if (!defn) {
   xml_setf (cgi, "error", "List '%s' undefined.", xml_attrval (query, "list"));
   return (handle_error (xml_attrval (cgi, "error")));
}
obj = xmlcgi_readstdin (cgi, defn);

action = xml_search (obj, "field", "id", "_action");
if (action) {
   for (mark = xml_firstelem (action); mark; mark = xml_nextelem (mark)) {
      repos_log (repos, 6, 0, NULL, "CGI", "Action code %s", xml_attrval (mark, "id"));
      if (!strncmp (xml_attrval (mark, "id"), "state-", 6)) xmlobj_set (obj, defn, "state", xml_attrval (mark, "id") + 6);
   }
}

repos_merge (repos, xml_attrval (query, "list"), obj, xml_attrval (query, "key"));

mark = xml_search (repos, "page", "displays", xml_attrval (query, "list"));
if (mark) {
   xml_set (cgi, "redirect", xml_attrval (mark, "page"));
} else {
   xml_setf (cgi, "redirect", "repmgr.cgi?mode=list&list=%s", xml_attrval (query, "list"));
}

retval = xml_parse ("<h2>Change made</h2>Your modification was made.  The server should redirect you to the main list page.");
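The "state-*" convention handled in the loop above boils down to a prefix test. Pulled out on its own it might look like the sketch below; state_from_action is a hypothetical name, and unlike the loop above it also rejects a bare "state-" with nothing after it, which is my assumption about what would be wanted rather than existing behavior.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of repmgr.
   If action_id has the form "state-NAME" with a non-empty NAME, returns
   a pointer to NAME inside action_id; otherwise returns NULL. */
const char *state_from_action(const char *action_id)
{
    static const char prefix[] = "state-";
    const size_t prefix_len = sizeof prefix - 1;

    assert(action_id != NULL);
    if (strncmp(action_id, prefix, prefix_len) != 0) return NULL;
    if (action_id[prefix_len] == '\0') return NULL;   /* bare "state-" */
    return action_id + prefix_len;
}
```

So an action id of "state-approved" gives "approved", which is the value the handler writes into the object's state field before merging.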

Deleting an object
Not implemented yet. The handler is empty for now, so a "del" request falls through to the "No result from script" page.

Listing the members of a list
Now, OK, granted this doesn't scale well. I'm going to want to punt once I get something that needs to scale. But the full "list" command lists keys under the command line; under CGI it should naturally link to editors, and have deletion checkboxes. Later, as we expand into workflow, there may be other status changers than just deletion. We'll burn those bridges when we come to them.

defn = repos_defn (repos, xml_attrval (query, "list"));
if (!defn) {
   xml_setf (cgi, "error", "List '%s' undefined.", xml_attrval (query, "list"));
   return (handle_error (xml_attrval (cgi, "error")));
}

list = xml_create ("list");
xml_set (list, "id", xml_attrval (query, "list"));
repos_list (repos, list);

The scalable solution will be to specify a search object framework for lists which require it; this framework will then present a search object edit form as the initial list presentation.
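To make the intended CGI listing a little more concrete, here is a sketch of what one row might look like: a deletion checkbox plus a link to the editor for that key. Everything about the markup is an assumption on my part (the checkbox name "del", the list-item wrapper), format_list_row is a hypothetical helper, and it assumes the list name and key are already safe to drop into HTML and URLs.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper, not part of repmgr.
   Formats one row of a list page into buf: a checkbox carrying the key
   and a link to the edit form for that key.
   Assumes list and key contain no characters needing escaping.
   Returns what snprintf returns, i.e. the length the full row needs. */
int format_list_row(char *buf, size_t size, const char *list, const char *key)
{
    assert(buf != NULL && list != NULL && key != NULL);

    return snprintf(buf, size,
        "<li><input type=\"checkbox\" name=\"del\" value=\"%s\"> "
        "<a href=\"repmgr.cgi?mode=edit&amp;list=%s&amp;key=%s\">%s</a></li>\n",
        key, list, key, key);
}
```

A list handler would call this once per key returned by the repository and wrap the rows in a form posting to the "del" mode.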

Handling error display
Error handling is somewhat complicated by the fact that one of the error situations we have to handle is the absence of a repository, meaning that we have to make up a layout out of whole cloth.

If there is an "_error" page defined, handle_error will use that layout to generate the error page; otherwise, the default layout is used and the error message, properly formatted, is filled in for the "content" field.

TODO: implement this with layout.

int handle_error (const char * message) {
   printf ("Content-type: text/html\n\n");
   printf ("<html><head><title>Error</title></head>\n");
   printf ("<body><h2>Error</h2><hr>%s</body></html>\n", message);

   return 0;
}
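Note that handle_error prints its message verbatim, and several messages above include the mode or list name straight from the query string, so markup in the URL would end up in the page. A minimal sketch of escaping the message first follows; html_escape is a hypothetical helper, not an xmlapi function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of repmgr or xmlapi.
   Copies src into dst, replacing the characters that are special in
   HTML text and attribute values with entities.
   Returns 0 on success, -1 if dst is too small. */
int html_escape(const char *src, char *dst, size_t size)
{
    size_t used = 0;

    assert(src != NULL && dst != NULL);
    for (; *src != '\0'; src++) {
        const char *rep;

        switch (*src) {
        case '&':  rep = "&amp;";  break;
        case '<':  rep = "&lt;";   break;
        case '>':  rep = "&gt;";   break;
        case '"':  rep = "&quot;"; break;
        case '\'': rep = "&#39;";  break;
        default:   rep = NULL;     break;
        }

        if (rep == NULL) {
            if (used + 1 >= size) return -1;      /* keep room for the NUL */
            dst[used++] = *src;
        } else {
            size_t len = strlen(rep);
            if (used + len >= size) return -1;
            memcpy(dst + used, rep, len);
            used += len;
        }
    }
    if (used >= size) return -1;
    dst[used] = '\0';
    return 0;
}
```

With something like this in place, handle_error would escape into a local buffer and print that, falling back to a fixed generic message if the escaped text doesn't fit.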

Taking it further
There is plenty of stuff that the CGI interface (or any interface) should do to make the repository manager a really powerful tool. Please feel free to suggest some features. As I get more ideas, I'll include them here. The current rough to-do list is in something like this order:

Repmgr-based user authentication using _users list. (Note that this list serves as an interface for the user adaptor, so that plugging OpenLDAP in will be a fairly high priority.)
Session-based authentication using _sessions list.
"List" and "info" modes.
Document management facilities. Having a useful interface will also make it easier for me to further develop said facilities.

Note that ports of the CGI interface to other environments are not the same as features, but they are interesting ideas for further development. Some Web environments which should be supported will be: AOLserver, Zope, PHP, Perl/CGI using "use Workflow;", and so forth.

This code and documentation are released under the terms of the GNU license. They are copyright (c) 2002-2004, Vivtek. All rights reserved except those explicitly granted under the terms of the GNU license. This presentation was prepared with LPML. Try literate programming. You'll like it.