File tagger

This little ditty is the second I've written in the App-a-week framework; the idea is to take small, interesting application ideas from the Internet community and implement them in a well-documented way. Each one is supposed to take a couple of days, but in practice this second application has dragged out for three weeks, thanks to the Christmas holiday and to the time I've spent working through my always-in-progress Python GUI framework. It was a good excuse to do something with that, but the reason I've never finished it is that it's hard work. Ha.

You can see the straight Python of this program here, and you can download a compiled version for Windows here. If you want to develop with this framework, um, drop me a line and I'll get more motivated with an easily downloadable version, but it's still pretty immature (although this program demonstrates that it is capable of supporting at least one application!)

Anyway, you can see the framework here at the wxpywf page, and this application is a special-purpose modification of the PyPop general-purpose application wrapper for the wxpywf framework classes. The wxpywf module relies heavily on various XML-oriented work I've done in the context of the wftk open-source workflow toolkit, with an underlying C library for XML manipulation, a Python import wrapper for that, an object-oriented Python wrapper on that, and finally the wxpywf library. I just keep getting closer to nirvana with this thing; eventually the wrappers will be stacked twenty high and I'll just say, "Computer, organize files." Yeah.

In the wxpywf framework, the UI is described with XML, and the code is all placed into so-called CLIs (command-line interfaces), which bundle actions taken into command-addressable chunks. The nice implication of the command-line framework is that actions can be logged, scripted, and performed in consistent ways through different interfaces. I find it a nice abstraction, and since this was the first application of anything like real size that I've actually "finished" with it, I discovered to my pleasant surprise that it is convenient.

So let's get down to business by importing our modules and describing our UI.

 
import sys      # needed for sys.argv below
import wftk
import wxpywf
import string
import os.path
from wxPython.wx import *

ui = """
<ui>

<list id="files">
<field id="path" label="Full path"/>
<field id="name" label="Name or nickname"/>
<field id="description" label="Description"/>
<field id="tags" label="List of tags"/>
</list>

<menubar id="main">
  <menu label="&amp;File">
     <item cmd="add" label="&amp;Add a file" help="Add a file to the database"/>
     <item cmd="mod" label="&amp;Edit file info" help="Edit a file's information in the database"/>
     <item cmd="del" label="&amp;Delete a file" help="Delete a file from the database"/>
     <separator/>
     <item cmd="exit" label="E&amp;xit" help="Quit the program"/>
  </menu>
</menubar>

<frame id="main" menu="main" title="File tagger" onfiledrop="add %s">
  <tabset edge="bottom" field="tabshown">
    <tab label="Cloud">
       <html field="html"/>
    </tab>
    <tab label="Files">
       <splitter split="vertical" minpanesize="40" sashpos="80">
         <panel>
           <box id="box1" dir="vertical"/>
           <radio-group label="Display:" dir="vertical" field="selection" box="box1" box-weight="0">
              <radio value="all" label="All tags"/>
              <radio value="some" label="Some tags"/>
           </radio-group>
           <listbox field="tags" box="box1" box-weight="1">
              <value value="test1"/>
              <value value="test2"/>
           </listbox>
           <button label="Show" box="box1" box-weight="0" cmd="update_list"/>
           <button label="test" box="box1" box-weight="1" cmd="test blah blargh"/>
         </panel>
         <list list="files" field="filelist">
           <col label="Name" field="name"/>
           <col label="Tags" field="tags"/>
           <col label="Description" field="description"/>
         </list>
       </splitter>
    </tab>
  </tabset>
</frame>

<dialog id="add" title="Adding files" h="300" w="300">
  <box id="box1" dir="vertical"/>
  <static box="box1" format="yes">You are adding [numfiles] files.</static>
  <static box="box1">Tags to apply to all files (separate with spaces):</static>
  <text box="box1" field="tags" multiline="yes"/>
  <static box="box1">Description for all files:</static>
  <text box="box1" field="description" multiline="yes"/>
  <checkbox box="box1" field="edit_each" label="Do you want to edit each file separately?"
             on="yes" off="no"/>
  <box id="box2" box="box1" dir="horizontal"/>
  <button box="box2" label="Add" value="ok"/>
  <button box="box2" label="Cancel" value="cancel"/>
</dialog>

<dialog id="mod" title="Editing file info" h="300" w="300">
  <box id="box1" dir="vertical"/>
  <static box="box1" format="yes">Path: [path]</static>
  <box id="box1a" box="box1" dir="horizontal"/>
  <static box="box1a">Nickname:</static>
  <text box="box1a" field="name"/>
  <static box="box1">Tags (separated by spaces):</static>
  <text box="box1" field="tags" multiline="yes"/>
  <static box="box1">Description:</static>
  <text box="box1" field="description" multiline="yes"/>
  <box id="box2" box="box1" dir="horizontal"/>
  <button box="box2" label="Update" value="ok"/>
  <button box="box2" label="Cancel" value="cancel"/>
</dialog>



</ui>
"""

defn = wftk.xml (ui)

Now let's interpret the command-line parameters. The general form of the invocation is:

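   filetagger1 -fcst[tag] <file> [command arg arg arg]
     -f : show file tab first
     -c : show cloud tab first
     -s : run silently (execute command and exit)
     -t[tag] : show only files with this tag (must be the last flag; the rest of the flag string is the tag)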

The initial values of the flags are stashed in the UI definition after loading, and will be processed when the frame is being built.

 
defn['command-line'] = sys.argv
sys.argv.pop(0)
if len(sys.argv) > 0 and sys.argv[0][0] == '-':
   flags = sys.argv.pop(0)[1:]
   while flags != '':
      if flags[0] == 'f': defn['show'] = 'Files'
      if flags[0] == 'c': defn['show'] = 'Cloud'   # must match the tab label in the UI definition
      if flags[0] == 's': defn['silent'] = 'yes'
      if flags[0] == 't':
         defn['selection'] = 'some'
         defn['tags'] = flags[1:]
         break
      flags = flags[1:]
try:
   defn['datafile'] = sys.argv.pop(0)
except IndexError:
   defn['datafile'] = 'default.ftg'
if len(sys.argv):
   args = []
   for a in sys.argv[1:]:
      if ' ' in a:   # quote arguments containing spaces (str.index would raise ValueError if there were none)
         args.append ('"%s"' % a)
      else:
         args.append (a)
   defn['startcmd'] = sys.argv[0] + ' ' + ' '.join(args)   # I just love this groovy syntax!
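
As a rough, self-contained sketch of the same flag handling (not the program's actual code), the parsing can be written as a function that returns a plain dict instead of writing into defn. The name parse_args and the dict keys are made up for this illustration; the flag letters, the default.ftg fallback and the quoting of arguments that contain spaces are taken from the listing above, and the sketch uses the tab label "Cloud" as it appears in the UI definition.

```python
def parse_args(argv):
    """Parse [-flags] [datafile] [command arg ...] into a plain dict.

    Hypothetical helper for illustration; argv excludes the program name.
    """
    args = list(argv)
    opts = {'show': '', 'silent': '', 'selection': '', 'tags': '',
            'datafile': 'default.ftg', 'startcmd': ''}
    if args and args[0].startswith('-'):
        flags = args.pop(0)[1:]
        for pos, flag in enumerate(flags):
            if flag == 'f':
                opts['show'] = 'Files'
            elif flag == 'c':
                opts['show'] = 'Cloud'
            elif flag == 's':
                opts['silent'] = 'yes'
            elif flag == 't':
                # Everything after 't' is the tag, so 't' must come last.
                opts['selection'] = 'some'
                opts['tags'] = flags[pos + 1:]
                break
    if args:
        opts['datafile'] = args.pop(0)
    if args:
        # Re-quote arguments containing spaces so the command line survives
        # being joined back into a single string.
        quoted = ['"%s"' % a if ' ' in a else a for a in args[1:]]
        opts['startcmd'] = ' '.join([args[0]] + quoted)
    return opts


example = parse_args(['-fstwork', 'my.ftg', 'add', 'a b.txt', 'c.txt'])
```

Returning a dict keeps the sketch testable without the GUI; the real program stores the same values as attributes on defn so that the frame can pick them up later.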

Next, let's open the data file, if it exists. If it doesn't exist or if the XML is corrupted, we'll just build a standard template. On every change to the data file, we're going to save it back out to disk -- that might be a little expensive for a larger database, but this little app isn't intended to scale all that well. If we want it to scale, we're going to have to rebuild the data access portions, a task for a later date.

Once the data file is open, we'll add it to the defn structure -- that will make it easily accessible to everything in the app.

 
try:
   infile = open(defn['datafile'])
   data = wftk.xmlobj()
   data.read (infile)
except IOError, message:
   data = wftk.xmlobj (str="""
<data>
<files/>
<tags/>
<cloud/>
</data>
""")
except wftk.ParseError, message:
   # TODO: delve into the many layers between here and expat, and format better messages; py_xmlapi isn't doing the best it could.
   defn['datafile-error'] = "%s is corrupted XML (%s).  Quit before making any changes if you think you can save it." \
                             % (defn['datafile'], message)
   data = wftk.xmlobj (str="""
<data>
<files/>
<tags/>
<cloud/>
</data>
""")

defn.append_pretty (data)
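
For readers without the wftk library, here is a minimal sketch of the same load-or-fall-back idea using the standard library's xml.etree.ElementTree as a stand-in. The function load_data and its (root, error message) return value are assumptions made for this illustration and are not part of wftk; only the template and the two failure cases come from the listing above.

```python
import xml.etree.ElementTree as ET

# The same empty template the program falls back to.
TEMPLATE = "<data><files/><tags/><cloud/></data>"


def load_data(path):
    """Return (root element, error message or None).

    Hypothetical helper: a missing file silently yields the template,
    while corrupted XML yields the template plus a warning to show the user.
    """
    try:
        return ET.parse(path).getroot(), None
    except IOError:
        # No data file yet: start from the empty template.
        return ET.fromstring(TEMPLATE), None
    except ET.ParseError as err:
        message = "%s is corrupted XML (%s)." % (path, err)
        return ET.fromstring(TEMPLATE), message
```

The point of returning the message rather than showing it right away is the same as in the program: at this stage there is no window yet, so the warning has to wait until the application object exists.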

Now let's define a little index into the file list, just for ease later, and the same for tags.

 
class g: pass
dataindex = g()
dataindex.filelist = {}
dataindex.tags = []
dataindex.tag_counts = {}
def reindex():
   dataindex.filelist = {}
   dataindex.tags = []
   dataindex.tag_counts = {}
   for f in data.loc('.files').elements():
      dataindex.filelist[f['path']] = f
      for tag in f['tags'].split():
         if tag in dataindex.tag_counts:
            dataindex.tag_counts[tag] = dataindex.tag_counts[tag] + 1
         else:
            dataindex.tags.append(tag)
            dataindex.tag_counts[tag] = 1
reindex()
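
To see what the index ends up holding, here is a small stand-alone sketch that runs the same bookkeeping over plain dicts instead of wftk XML elements. The function build_index and the sample records are invented for the illustration.

```python
def build_index(files):
    """Index file records by path and count how often each tag is used.

    Hypothetical stand-in for reindex(); each record is a dict with
    'path' and 'tags' keys, where 'tags' is a space-separated string.
    """
    filelist = {}
    tags = []          # tags in order of first appearance
    tag_counts = {}
    for f in files:
        filelist[f['path']] = f
        for tag in f['tags'].split():
            if tag in tag_counts:
                tag_counts[tag] += 1
            else:
                tags.append(tag)
                tag_counts[tag] = 1
    return filelist, tags, tag_counts


sample = [
    {'path': '/docs/a.txt', 'tags': 'work python'},
    {'path': '/docs/b.txt', 'tags': 'python'},
]
filelist, tags, tag_counts = build_index(sample)
```

With these two records, tags comes out as ['work', 'python'] and tag_counts as {'work': 1, 'python': 2}, which is exactly the information the tag listbox and the cloud need.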

The next step is to define the commands which will be called by the UI above. In a WxPyWf program, most of the code is stored in a "command-line interface", or CLI, which is attached to the frames and dialogs of the UI. The commands work with the widgets and controls defined in the UI, and any external datasources defined; the commands are thus the actual program itself.

Since most of the application logic of the file tagger application is right here in this CLI, it makes a lot of sense for me to document it in separate sections, so as to avoid potential unwieldiness. So here, you simply see the "glue", as it were, which defines the functions to be used, initializes the application, and so forth. Follow the links to the definitions of the various actual commands which do interesting things.

 
class mycli(wftk.cli):
   def __init__ (self, defn):
      self.defn = defn
      self.mode = 0
      self.commands={}
      self.commands['initialize']   = ['initialize',   self.initialize,   0,  0, '']
      self.commands['update_list']  = ['update_list',  self.update_list,  0,  0, '']
      self.commands['find_tags']    = ['find_tags',    self.find_tags,    0,  0, '']
      self.commands['update_cloud'] = ['update_cloud', self.update_cloud, 0,  0, '']
      self.commands['save']         = ['save',         self.save,         0,  0, '']
      self.commands['show_tag']     = ['show_tag',     self.show_tag,     1,  1, '[tag]']
      self.commands['add']          = ['add',          self.add,          0, -1, '[list of files]']
      self.commands['mod']          = ['mod',          self.mod,          0, -1, '[file] [changes]']
      self.commands['del']          = ['del',          self.delete,       0, -1, '[list of files]']
      self.commands['test']         = ['test',         self.test,         0, -1, '']


   def initialize(self, context, action, obj):
      self.frame.do (context, 'find_tags')
      if defn['show']      != '': self.frame.do (context, "set tabshown %s" % defn['show'])
      if defn['selection'] != '': self.frame.do (context, "set selection %s" % defn['selection'])
      if defn['tags']      != '': self.frame.do (context, "set tags %s" % defn['tags'])
      self.frame.do (context, 'update_list')
      self.frame.do (context, 'update_cloud')
      if defn['startcmd']  != '': self.frame.do (context, defn['startcmd'])
      if defn['silent'] == 'yes': self.frame.do (context, 'exit')  # TODO: should this be standard WxPyWf behavior?

   def save(self, context, action, obj):
      data.write(defn['datafile'])

   def test(self, context, action, obj):
      wxpywf.notify_user (self.frame['html'])

   def show_tag(self, context, action, obj):
      self.frame.do (context, "set selection some")
      self.frame.do (context, "set tags %s" % action['parm(0)'])
      self.frame.do (context, "set tabshown Files")
      self.frame.do (context, "update_list")

   See Updating the list control
   See Finding the list of tags and updating the listbox
   See Updating the cloud HTML
   See Adding a file or files
   See Modifying the data associated with a file
   See Deleting a file or files
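
The command table is easier to picture with a toy dispatcher. The sketch below is not how wftk.cli works internally; it assumes that the two numbers in each entry are the minimum and maximum argument counts, with -1 meaning "no upper limit" (the listing doesn't say so explicitly), and the class MiniCLI and its methods are invented for this illustration.

```python
import shlex


class MiniCLI(object):
    """Toy command dispatcher; a hypothetical stand-in for wftk.cli."""

    def __init__(self):
        self.commands = {}
        self.log = []      # every accepted command line, in order

    def register(self, name, handler, min_args, max_args, usage):
        # Same five-slot layout as the entries in the listing above.
        self.commands[name] = [name, handler, min_args, max_args, usage]

    def do(self, line):
        words = shlex.split(line)   # honours "quoted arguments"
        if not words:
            return None
        name, args = words[0], words[1:]
        if name not in self.commands:
            raise ValueError('unknown command: %s' % name)
        _, handler, min_args, max_args, usage = self.commands[name]
        if len(args) < min_args or (max_args >= 0 and len(args) > max_args):
            raise ValueError('usage: %s %s' % (name, usage))
        self.log.append(line)
        return handler(*args)
```

Because every action goes through do() as a string, the same command can come from a menu item, a button, a link in the cloud HTML, or the program's own command line, and logging or scripting it costs nothing extra.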

Finally, we conclude by providing the minimum necessary Python wrapper code to bootstrap all the above into a WxPyWf program. Well, almost minimum -- this is the handiest place to notify the user about anything funky with the command line.
 
class App(wxApp):
   def OnInit(self):
      frame = wxpywf.frame(None, defn, defn.search ('frame', 'id', 'main'), cli_list=[mycli(defn)])
      self.SetTopWindow(frame)
      if defn['datafile-error']: wxpywf.notify_user(defn['datafile-error'])
      return true

app = App(0)
app.MainLoop()

And that, in a nutshell, is a WxPyWf program. It really couldn't be less complicated. Now let's define the commands which do all the heavy lifting.

Updating the list control
The list updater is the first place we actually use any data from the application. Note that the UI of the Files tab is laid out with a list control and a panel. The panel contains controls whose current settings show up in the XML structure passed as the context parameter to the command. For instance, the radio group "selection" takes the value "all" or "some" depending on which radio button is selected, and the value of "tags" is whichever tag is selected in the list box.

This command is called when the "Show" button is clicked. To determine the list of files to display, we first look at the radio button selection. If it's "all", we display everything, easily done with a list comprehension. If not, we scan the list of files in the data structure and build the list tuple by tuple on the basis of the tags field of each file. Either way, we give the list control the list of tuples to display, and there you have it. Done.
 
   def update_list(self, context, action, obj):
      if context['selection'] == 'all':
         file_list = [(f, f['path']) for f in data.loc('.files').elements()]
      else:
         file_list = []
         for file in data.loc('.files').elements():
            keys = file['tags'].split()
            if context['tags'] in keys:
               file_list.append ((file, file['path']))
      self.frame.getboundfield('filelist').reload(file_list)
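
Stripped of the GUI, the selection logic is small enough to try out on its own. In this sketch the records are plain dicts rather than wftk elements, and select_files is a made-up name for the illustration.

```python
def select_files(files, selection, tag):
    """Return (record, path) tuples for the files that should be listed.

    Hypothetical stand-in for the filtering in update_list: 'all' keeps
    everything, anything else keeps only files carrying the given tag.
    """
    if selection == 'all':
        return [(f, f['path']) for f in files]
    # Split on whitespace so that 'python' does not match 'pythonic'.
    return [(f, f['path']) for f in files if tag in f['tags'].split()]


records = [
    {'path': '/a', 'tags': 'work python'},
    {'path': '/b', 'tags': 'home'},
    {'path': '/c', 'tags': 'pythonic'},
]
only_python = select_files(records, 'some', 'python')
```

Here only_python contains just the record for /a; matching whole tags rather than substrings is why the tags string is split first.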


Finding the list of tags and updating the listbox
While the list control itself is bound to the field for the list control above, the listbox for the tags, as a simple control, is embedded in a panel instead. So we have to tell the panel which of its fields should be updated.
 
   def find_tags(self, context, action, obj):
      keys = dataindex.tags
      keys.sort()
      self.frame.getboundfield('tags').new_listbox_values('tags', keys)


Updating the cloud HTML
 
   def update_cloud(self, context, action, obj):
      if len(dataindex.tags) == 0:
         self.frame['html'] = "There are no files in the database.<br>Add some by switching to the Files tab below."
         return true
      sm = 3.0
      lg = 8.0
      delta = lg - sm
      links = []
      maxcount = 0
      for key in dataindex.tags:
         if dataindex.tag_counts[key] > maxcount:
            maxcount = dataindex.tag_counts[key]
      for key in dataindex.tags:
         weight = dataindex.tag_counts[key] * 1.0 / maxcount
         font = int(sm + delta * weight)
         links.append ('<a href="cmd:show_tag %s"><font size="%d">%s</font></a>' \
                       % (key, font, key))
      self.frame['html'] = '\n'.join(links)
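
The font-size arithmetic is the only subtle part of the cloud, so here it is in isolation. This is a sketch under the same constants as the listing (sizes from 3 to 8, scaled linearly by each tag's count relative to the most-used tag); the function name cloud_font_sizes is invented for the illustration.

```python
def cloud_font_sizes(tag_counts, smallest=3.0, largest=8.0):
    """Map each tag to an HTML font size between smallest and largest.

    Hypothetical helper: the most frequent tag gets the largest size and
    the others are scaled linearly by count / max count, then truncated.
    """
    if not tag_counts:
        return {}
    maxcount = max(tag_counts.values())
    delta = largest - smallest
    sizes = {}
    for tag, count in tag_counts.items():
        weight = count * 1.0 / maxcount
        sizes[tag] = int(smallest + delta * weight)
    return sizes


sizes = cloud_font_sizes({'python': 4, 'work': 2, 'home': 1})
```

With counts of 4, 2 and 1 the sizes come out as 8, 5 and 4: the weights are 1.0, 0.5 and 0.25, giving 8.0, 5.5 and 4.25 before truncation. Note that no tag ever gets size 3 itself, since every tag has a count of at least one.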


Adding a file or files

The add handler is actually a little involved; it's essentially a wizard.

First, we get a list of files to be added from the (XML) action structure given us. If there are no files specified, then we'll pop up a standard files dialog to get the user to select one or more.

One way or the other, we now have a list of zero or more files to add. If there are zero, our action is clear. If there's one, we'll pop up a modification dialog (see below). If there's more than one, then we'll get clever: first, we present a dialog allowing the user to specify tags and a general description, then we'll let the user check a box to edit each file's information separately afterwards.

 
   def add(self, context, action, obj):
      files = [action['parm(%s)' % i] for i in range(int(action['parms']))] # Note: this syntax is scarily beautiful.  See below.
      if len(files) == 0:
         dlg = wxFileDialog(self.frame, "Choose a file or files to add to the database", ".", "", "*.*", wxOPEN|wxMULTIPLE)
         if dlg.ShowModal() == wxID_OK:
            files.extend(dlg.GetPaths())
         dlg.Destroy()

      if len(files) == 0:
         return true
      elif len(files) == 1:
         filerec = wftk.xmlobj (str="")
         filerec['path'] = files[0]
         filerec['name'] = os.path.basename(filerec['path'])
         if context['selection'] == 'some': filerec['tags'] = context['tags']
         if wxpywf.call_dialog(defn.search('dialog', 'id', 'mod'), self.frame, self.frame, \
                               rec=filerec, title='Add file to database'):
            data.loc('.files').append_pretty(filerec)
      else:
         filesrec = wftk.xmlobj()
         filesrec['numfiles'] = str(len(files))
         if context['selection'] == 'some': filesrec['tags'] = context['tags']
         filesrec['edit_each'] = 'yes'
         if defn['silent'] == 'yes' or \
            wxpywf.call_dialog(defn.search('dialog', 'id', 'add'), self.frame, self.frame, \
                               rec=filesrec, title='Add multiple files to database'):
            for file in files:
               filerec = wftk.xmlobj (str="")
               filerec['path'] = file
               filerec['name'] = os.path.basename(filerec['path'])
               for field in ('tags', 'description'): filerec[field] = filesrec[field]
               if filesrec['edit_each'] == 'yes':
                  if not wxpywf.call_dialog(defn.search('dialog', 'id', 'mod'), self.frame, \
                                            self.frame, rec=filerec, \
                                            title='Adding file to database'):
                     continue   # dialog cancelled: skip this file and move on to the next
               data.loc('.files').append_pretty(filerec)
      reindex()
      self.frame.do (context, 'update_list')
      self.frame.do (context, 'find_tags')
      self.frame.do (context, 'update_cloud')
      self.frame.do (context, 'save')

Note on the elegant syntax of files = [action['parm(%s)' % i] for i in range(int(action['parms']))]: this is what Python calls a list comprehension, and it's one of those write-only things that make LISP or Perl so hard to work with but normally don't trouble us when working with Python. But it's just so beautiful... Suffice it to say that the action object tells us how many parameters it's giving us (in "parms"), and they're all in "parm" elements. This single line extracts the contents of each "parm" element into a nice list of strings.

Granted, this is not the first list comprehension in this presentation. It is, however, the first I wrote. Then it started to get addictive.
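
To make the comprehension less mysterious, here it is run against a plain dict that imitates the action object. The dict layout (a "parms" count stored as a string plus "parm(0)", "parm(1)", ... entries) mirrors how the listing reads the action; the dict itself and the name extract_parms are invented for the illustration.

```python
def extract_parms(action):
    """Collect parm(0) .. parm(n-1) into a list, where n = action['parms'].

    Hypothetical helper; 'action' is any mapping that behaves like the
    XML action object in the listing (all values are strings).
    """
    count = int(action['parms'])
    return [action['parm(%s)' % i] for i in range(count)]


# A stand-in for the action produced by: add a.txt "b c.txt"
fake_action = {'parms': '2', 'parm(0)': 'a.txt', 'parm(1)': 'b c.txt'}
files = extract_parms(fake_action)
```

The int() call matters because the count arrives as a string; with a count of "0" the result is simply an empty list, which is what sends the add command off to the file dialog.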



Modifying the data associated with a file
Modification of a file requires a file to be selected in the list control, unless one is specified in the command itself (for instance, if it's invoked from the command line). Either way, the modification dialog is displayed. If the dialog is cancelled, nothing is saved or updated.
 
   def mod(self, context, action, obj):
      try:
         rec = dataindex.filelist[context['filelist']]
      except KeyError:
         wxpywf.notify_user("There is no file selected in the file list.")
         return true

      if wxpywf.call_dialog(defn.search ('dialog', 'id', 'mod'), self.frame, self.frame, rec):
         reindex()
         self.frame.do (context, 'update_list')
         self.frame.do (context, 'find_tags')
         self.frame.do (context, 'update_cloud')
         self.frame.do (context, 'save')


Deleting a file or files
 
   def delete(self, context, action, obj):
      try:
         rec = dataindex.filelist[context['filelist']]
      except KeyError:
         wxpywf.notify_user("There is no file selected in the file list.")
         return true
      if wxpywf.ask_user("Delete file %s from the database?\n(This will not delete the file itself.)" \
                         % context['filelist'], title="Deleting file") == wxID_YES:
         rec.delete()
         reindex()
         self.frame.do (context, 'update_list')
         self.frame.do (context, 'find_tags')
         self.frame.do (context, 'update_cloud')
         self.frame.do (context, 'save')





Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.