Mail reader/mail parser


The mail reader implemented in this file is more accurately a mail parser. As such, it should be useful for a number of different purposes, not least of which will be to parse entries in an mbox-format repository list. (There is in fact already an mbox list adaptor under development.) However, the primary motivator as of this writing (26 July 2002) is to enable a wftk installation (either wftk-bare or a repmgr-based installation) to accept email as input. In this scenario, the incoming email arrives on stdin, and the handler is invoked to create a list entry or a workflow process which includes the basic starting data from the mail.

So basically what we want is a function which will accept a stream (or a string in memory) and return a convenient xmlobj object with values set appropriately. To make things really convenient, we'll set this up to accept one buffer at a time, along with the XML object currently being built. That way we can use the same function for stream access or for string parsing, without the double overhead of first reading an entire (possibly large) mail into memory and then parsing it in memory.
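To make the buffer-at-a-time idea concrete, here is a minimal sketch in Python rather than in the wftk code itself. It swaps in the standard library's `email.parser.FeedParser` as the incremental parser; the function names `parse_mail_stream` and `parse_mail_string` and the `chunk_size` parameter are made up for this illustration and are not part of wftk.

```python
import io
from email.parser import FeedParser


def parse_mail_stream(stream, chunk_size=4096):
    """Parse a mail from a text stream, feeding one buffer at a time.

    `stream` is any object with a read(n) method returning str,
    for example sys.stdin or an io.StringIO.
    """
    parser = FeedParser()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:          # empty string means end of input
            break
        parser.feed(chunk)     # chunks need not end on line boundaries
    return parser.close()      # returns the completed message object


def parse_mail_string(text):
    """Parse a mail held in memory by reusing the stream function."""
    return parse_mail_stream(io.StringIO(text))
```

The same function serves both cases: pass `sys.stdin` when the handler is invoked on incoming mail, or wrap a string in `io.StringIO` when the mail is already in memory.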

The output will be of the form:
<mail>
  <from>bob@mymail.com</from>
  <from-name>Bob</from-name>
  <to>customerservice@mycompany.com</to>
  <to-name/>
  <subject>I have a problem</subject>
  <body-text>...</body-text>
</mail>
Those are the minimum fields; we also want this functionality to handle multipart/alternative MIME formatting, and of course we want the ability to handle attachments.
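As a rough illustration of the mapping from mail headers to the fields above, the following Python sketch builds the `<mail>` element with the standard library's `xml.etree.ElementTree` instead of xmlobj. The helper names `mail_to_xml` and `first_text_part` are hypothetical. Choosing the first text/plain part without a filename as the body is an assumption about how multipart/alternative mails and attachments might be treated, not a description of the actual parser.

```python
import xml.etree.ElementTree as ET
from email import message_from_string
from email.utils import parseaddr


def first_text_part(msg):
    """Return the first text/plain part that is not a named attachment."""
    for part in msg.walk():  # walk() also visits a non-multipart message itself
        if part.get_content_type() == "text/plain" and part.get_filename() is None:
            payload = part.get_payload(decode=True)  # bytes, transfer encoding undone
            # Assumption: fall back to UTF-8 when no charset is declared.
            charset = part.get_content_charset() or "utf-8"
            return payload.decode(charset, errors="replace")
    return ""


def mail_to_xml(msg):
    """Build the minimal <mail> element from a parsed message."""
    root = ET.Element("mail")
    for header, tag in (("From", "from"), ("To", "to")):
        name, addr = parseaddr(msg.get(header, ""))
        ET.SubElement(root, tag).text = addr
        # An empty display name leaves the element empty, like <to-name/>.
        ET.SubElement(root, tag + "-name").text = name or None
    ET.SubElement(root, "subject").text = msg.get("Subject", "")
    ET.SubElement(root, "body-text").text = first_text_part(msg)
    return root
```

A parsed message, for instance one returned by `message_from_string`, can be passed to `mail_to_xml` and the result serialized with `ET.tostring(root, encoding="unicode")`. Attachments are simply skipped here; a fuller version would have to decide how to represent them in the XML.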

At first, I thought it would also be a good idea to include some sort of template parser to allow formatted data submission via mail, or maybe a command language parser a la majordomo. But after a little thought, I realized that these should be handled by either repmgr or by the appropriate procdef after submission, as script or action invocations. That allows an installation to be built as flexibly as possible, and it keeps the complexity of the front-end mail handler down.

Links to the pieces:

This code and documentation are released under the terms of the GNU license. They are copyright (c) 2002, Vivtek. All rights reserved except those explicitly granted under the terms of the GNU license. This presentation was prepared with LPML. Try literate programming. You'll like it.