Applying the wftk: lartmeister.com

LART tool central

The idea for LARTmeister has been one of my main motivations for building wftk over the last two years. I first came up with it while active in online spam and scam fighting at the MMF Hall of Humiliation. To clarify the slang: LART stands for "Loser Attitude Readjustment Tool", and in practice it comes down to finding out who to notify about a particular Net abuser so that the relevant accounts can be terminated.

Finding the culprit in Net abuse can be an entertaining pastime, but I found that it became tedious in short order. There are well-known tools, such as host, whois and traceroute, for determining Internet configuration and who is responsible for various net blocks and domain names. There are Web pages to check (and cache in case they're removed), diagnostics to run, and so forth, all entertaining enough. But if you really want to fight spam, you have to report the spam and keep track of what you've done, and that requires organization. And organization is boring.

So back in 1999 or so I thought it'd be really nice to have a tool to help me organize my LARTs, and from there it was a natural step to extend that notion to a free service available to the masses, which would both help combat Net abuse and provide a way to track it and identify trends. I sketched out some ideas, implemented some of the tools in online form, and ... got distracted with other things, like building the Despammed.com free mail filtration service. But of course, I get a lot of abuse reports from Despammed.com users, and boy it'd be nice to organize all that. Wouldn't it?

So what I need is of course a workflow system. And thus it's no coincidence that I have written one. But until the last few months, I wasn't sure where to start building the LARTmeister, because it entails more than just a simple case object. Now, I think there is enough basic work done on wftk and the repository manager that the LARTmeister should be something approaching child's play. I'll be working it out here and documenting the time I spend, so we'll see how easy wftk makes a rather complex web site.

Overview

So it seemed to me that the logical place to start with such a system design would be to get an idea of the objects involved, the people who use the system, the processes which get things done, and the interfaces to other systems required. This is set out below in the methodology section, but first, let's muse a little on some of the conceptual universe that LARTmeister needs to understand.

First and foremost, the LARTmeister is driven by reports of abuse incidents by submitters. An incident is a case, in the workflow sense, and is the center of attention for actual activity in the system. The abuse incident may be an email spam, a newsgroup spam, or a website which promulgates a scam of some sort.

However, the case draws on a large and somewhat complex database of known information, such as a list of known spammers and known MMFools (people who participate in chain letter scams). The system also tracks ISPs and whitehats (known effective abuse fighters) for abuse reporting. It may utilize external resources such as abuse.net to manage this information.
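To make the objects mentioned so far a little more concrete, here is a minimal sketch in Python. It is not wftk's schema or the eventual LARTmeister data model; the class and field names (Incident, Party, abuse_contact and so on) are assumptions made up for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class IncidentKind(Enum):
    EMAIL_SPAM = "email spam"
    NEWSGROUP_SPAM = "newsgroup spam"
    SCAM_WEBSITE = "scam website"


class PartyKind(Enum):
    SPAMMER = "spammer"
    MMFOOL = "MMFool"
    ISP = "ISP"
    WHITEHAT = "whitehat"


@dataclass
class Party:
    """A known entity in the database (hypothetical fields)."""
    name: str
    kind: PartyKind
    abuse_contact: str = ""


@dataclass
class Incident:
    """One reported abuse incident, i.e. one workflow case."""
    incident_id: int
    kind: IncidentKind
    submitter: str
    evidence: str
    parties: list = field(default_factory=list)
    is_open: bool = True

    def report_targets(self):
        """Parties worth notifying: ISPs and whitehats with a contact address."""
        wanted = (PartyKind.ISP, PartyKind.WHITEHAT)
        return [p for p in self.parties if p.kind in wanted and p.abuse_contact]
```

The point of the sketch is only that the incident is the case, while spammers, MMFools, ISPs and whitehats are reference data the case links to.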

Once an incident has been reported, there are various standard steps which can be taken. These mostly take the form of execution of standard tools. In the case of email spam, for instance, headers can be analyzed in an attempt to determine the source of the mailing or any relays abused on the way. In the case of a website, whois and host can be used to determine the owner of the site, and traceroute can be run to determine the site's hosting and upstream. Once responsible parties have been identified (if possible), then reports can be dispatched. Each report should have a return address uniquely tied to the incident, so that any and all replies or related correspondence (even by BCC) can be attached to the incident with no further ado. (Once a case is closed, its attached email address is disabled and can even be used as a spamtrap!)
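Two of the steps above, header analysis and the per-incident return address, can be sketched with nothing but the Python standard library. This is an illustration under assumptions, not LARTmeister code: the function names, the incident-N address scheme and the lartmeister.example domain are all hypothetical. Keep in mind that Received headers added before the first relay you trust can be forged, so the extracted addresses are leads rather than proof.

```python
import re
from email import message_from_string

# Relays usually record the connecting host's address in square brackets.
IPV4_IN_BRACKETS = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")


def relay_ips(raw_message):
    """Return bracketed IPv4 addresses from Received headers, newest hop first."""
    msg = message_from_string(raw_message)
    ips = []
    for header in msg.get_all("Received", []):
        ips.extend(IPV4_IN_BRACKETS.findall(header))
    return ips


def incident_address(incident_id, domain="lartmeister.example"):
    """Build a return address tied to one incident (hypothetical scheme)."""
    return f"incident-{incident_id}@{domain}"


def incident_from_address(address):
    """Map a reply's recipient address back to its incident id, or None."""
    match = re.fullmatch(r"incident-(\d+)@.+", address.strip().lower())
    return int(match.group(1)) if match else None
```

Any mail arriving at such an address can be filed under its incident by calling incident_from_address on the recipient; once the case is closed, the same lookup tells you the message hit a retired address.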

If the abuse concerns a webpage (even if it is only a page spamvertised in an email), then the system can automatically monitor the page in question: track when and if it is disabled, check that it is not re-enabled, and notify the submitter if the page changes in any way. Over time this builds up a record of known scam pages, and later incidents can be screened against it.
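The monitoring logic itself is small once fetching is set aside. The sketch below, in Python, compares each new fetch of a page against the previous snapshot and names what happened. The function names, the event labels and the snapshot layout are assumptions for illustration, and treating only HTTP status 200 as "up" is a deliberate simplification.

```python
import hashlib


def fingerprint(content):
    """Digest of the page body (bytes), used to detect any change."""
    return hashlib.sha256(content).hexdigest()


def classify(previous, status, content=b""):
    """Compare a new fetch against the previous snapshot.

    previous is None on the first check, otherwise a dict
    {"up": bool, "digest": str} as returned by an earlier call.
    Returns (event, snapshot).
    """
    up = status == 200  # simplification: anything else counts as down
    if up:
        digest = fingerprint(content)
    else:
        # Keep the last known digest so a later re-enable can be compared.
        digest = previous["digest"] if previous else ""
    snapshot = {"up": up, "digest": digest}

    if previous is None:
        event = "first-seen"
    elif previous["up"] and not up:
        event = "disabled"
    elif not previous["up"] and up:
        event = "re-enabled"
    elif up and digest != previous["digest"]:
        event = "changed"
    else:
        event = "unchanged"
    return event, snapshot
```

A scheduler would fetch the page, call classify with the stored snapshot, store the new one, and notify the submitter on anything other than "unchanged".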

I can go on. I have reams of this kind of idea. If you've done any spam fighting, you do, too. But this is meant to be a general overview, a sketch of what is doable. So I'll stop, but I will challenge you to one thing: if you're a programmer, try to get an idea of the programming involved if you were starting from scratch using, say, CGIs and a database. Please email me with your estimate just so I can get an idea. I know this is the kind of project I characteristically underbid, so I don't trust my own judgment, except to note that until I got this far with the wftk, I didn't even start implementing this for fear it'd consume me.

I'll be logging actual time spent on the project. This overview and some other notes took me approximately 1.5 hours to create, and I started on it on Wednesday, January 16, 2002. Wish me luck.

Methodology

Not that I'm a real expert on this, but here's the methodology I'm going to follow while working on the LARTmeister example. I'll change it as I work, and link from the headings to the actual documents as they're created.

Write some use cases. (19 Jan 2002)
The use cases follow individual users through interactions with the system. By identifying the important use cases, we can identify the objects, pages, and interfaces needed to implement them. Extensions to the system can be seen as new use cases or modifications to existing ones.
At the same time, start planning the website layout.
Since the LARTmeister is going to be at lartmeister.com, it needs supporting content and a consistent look and feel. This effort can progress completely in parallel with development of the active portions, of course.
Define objects and enumerate roles.
Complete object definitions will probably be long in coming, but initial layout of the schema will give us a way to toss in some test data and test our intuitions about the use cases.
Define and implement interfaces.
Since data management is already implemented as soon as the objects and roles are defined, the logical next step is to implement the ways that the system will actually be able to take action.
Define and implement processes and tools.
Tools (such as wrappers for whois and traceroute) may turn out to be the most complicated part of this system. I'm not sure yet.
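As a rough idea of what one such tool wrapper might look like, here is a sketch in Python that runs the system whois client and pulls e-mail addresses out of its output. It assumes a whois command is installed and on the PATH; the function names and the abuse-first ordering are assumptions, and real whois output varies enough between registries that this is a starting point at best.

```python
import re
import subprocess

EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
SAFE_QUERY = re.compile(r"[A-Za-z0-9][A-Za-z0-9.:-]*")


def run_whois(query, timeout=30):
    """Run the system whois client and return its text output."""
    # Refuse anything that is not a plain domain name or IP address,
    # so user input can never be read as a command-line option.
    if not SAFE_QUERY.fullmatch(query):
        raise ValueError(f"refusing suspicious whois query: {query!r}")
    result = subprocess.run(
        ["whois", query],        # argument list, so no shell is involved
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.stdout


def contact_addresses(whois_text):
    """Collect unique e-mail addresses from whois output, abuse contacts first."""
    found = []
    for address in EMAIL.findall(whois_text):
        address = address.lower()
        if address not in found:
            found.append(address)
    # sorted() is stable, so order of appearance is kept within each group.
    return sorted(found, key=lambda a: not a.startswith("abuse"))
```

Keeping the parsing separate from the subprocess call means the parser can be tested against saved whois output without touching the network.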

What I'm only now starting to see (even though I'd always planned it) is that construction of a system using wftk is ideally no more than defining it. In other words, a careful description of a complex system of data interactions should itself be that system, because all the necessary logic is already implemented in wftk. That's the idea behind 4GL (fourth-generation languages) of course: you tell the system what to do, not how to do it, and it takes it from there. Let's hope that wftk will really make this as easy as I think.
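The "definition is the system" idea can be shown in miniature. In the toy Python sketch below, a process is nothing but data, and one small generic function carries it out. This is not wftk's process definition format or its engine; the step layout, action names and run function are all made up for illustration.

```python
# A process described purely as data: each step names an action and
# says which earlier value it takes as input.
PROCESS = [
    {"step": "analyze", "action": "count_words", "input": "evidence"},
    {"step": "flag", "action": "is_long", "input": "analyze"},
]

# The generic toolbox the engine draws on (hypothetical actions).
ACTIONS = {
    "count_words": lambda text: len(text.split()),
    "is_long": lambda count: count > 3,
}


def run(process, case, actions=ACTIONS):
    """Execute each step in order, storing its result under the step name."""
    results = dict(case)
    for step in process:
        action = actions[step["action"]]
        results[step["step"]] = action(results[step["input"]])
    return results
```

Changing what the system does means editing PROCESS, not run, which is the property I'm hoping wftk delivers at full scale.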