Keyword blogmeta

2006-11-27 blogmeta

Welcome to the new blog. This is an idea I've been kicking around since the day I heard the term "web log" (we hadn't abbreviated it), but I just haven't had the time until now. You know how it is.

The idea of the blog is simple. I've been programming for a very long time now, and I like to write about it. So when I discovered that after a significant hiatus from daily programming, my Muse had reawakened, I resolved that from this point on, I would spend at least a small amount of time each day programming something. The results have been gratifying; I've completed a number of small tasks I'd been wanting to resolve for a while.

And hence this blog. This is the place where I intend to present said small tasks, on a daily basis, for your perusal and, dare I hope, your amusement. I hope you enjoy it; I know I'm going to.

Some of the things I want to cover in the near future are, in no particular order,

  • A few Word macros I've written to fix things up after TRADOS macros break them
  • My experiences getting Python to read and write TTX files (a file format used by TRADOS -- I earn my money with translation nowadays, mostly, so that's why translation tools come up so often)
  • This very blog, which is a freaking Rube Goldberg contraption of Perl scripts and Makefiles
  • The amazing Toon-o-Matic, which I use to spin out graphics for my Web cartoon -- remember, the Toon-o-Matic is the work of art; the strip is just a by-product, like hot dogs
  • GUI builders for wxPython, something I've been hacking around on for a while without discernible progress (but I hope that will be changing)
  • Workflow applications, of course.

That ought to keep me in posting fodder for a few months, eh?

Today's fun task was writing a bit of prototype code to format the tag cloud for the drop handler project. I did it in the context of this blog, and so first I had to get my keywords functional. I already had a database column for them, but it turned out my updater wasn't writing them to the database. So that was an easy fix.

Once I had keywords attached to my blog posts, I turned my attention to formatting them into keyword directories (the primary motivation was to enable Technorati tagging, on which more later). And then once that was done, I had all my keywords in a hash, so it occurred to me that I was most of the way towards implementing a tag cloud formatter anyway.

Here's the Perl I wrote just to do the formatting. It's actually amazingly simple (of course) and you can peruse the up-to-the-minute result of its invocation in my blog scanner on the keywords page for this blog.

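sub keyword_tagger {
   # Only argument: the maximum post count over all tags.
   my $max_count = shift @_;
   my $weight;
   my $font;
   my $sm = 70;          # smallest font size, in percent
   my $lg = 200;         # largest font size, in percent
   my $del = $lg - $sm;
   my $ret = '';
   # %kw_count is a global hash: tag => number of posts with that tag
   foreach my $k (sort keys %kw_count) {
      $weight = $kw_count{$k} / $max_count;
      $font = sprintf ("%d", $sm + $del * $weight);
      $ret .= "<a href=\"/blog/kw/$k/\" style=\"font-size: $font%;\">$k</a>\n";
   }
   return $ret;
}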

This is generally not the way to structure a function, because it works with global hashes, but y'know, I don't follow rules too well (and curse myself often, yes). The assumptions:

  • The only argument passed is the maximum post count for all tags, determined by an earlier scan of the tags while writing their index pages.
  • $sm and $lg are effectively configuration; they determine the smallest and largest font sizes of the tag links (in percent).
  • The loop runs through the tags in alphabetical order; they are all assumed to be in the %kw_count global hash, which stores the number of posts associated with each tag (we build that while scanning the posts).
  • For every tag, we divide its post count from the %kw_count hash by the maximum count, and use that ratio to pick a percentage between $sm and $lg (the most-used tag gets $lg; a tag with half as many posts lands halfway between) -- then format the link with that font size. This is a rather overly hardwired approach (the link should obviously be a configurable template) but as a prototype and for my own blogging management script, this works well.
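
The earlier scan those assumptions lean on (the one that fills the count hash and finds the maximum) might look something like this in Python. This is only a sketch: the shape of `posts` (a list of dicts with a `keywords` list) and the name `count_keywords` are made up for illustration and are not how the blog's database is actually laid out.

```python
from collections import Counter


def count_keywords(posts):
    """Return (kw_count, max_count) for an iterable of posts.

    Each post is assumed to be a dict with a 'keywords' list; that
    shape is hypothetical, chosen only to keep the example short.
    """
    kw_count = Counter()
    for post in posts:
        # A set counts each keyword once per post, even if listed twice.
        kw_count.update(set(post.get("keywords", [])))
    # default=0 keeps max() from raising when there are no keywords at all.
    max_count = max(kw_count.values(), default=0)
    return dict(kw_count), max_count


posts = [
    {"title": "Welcome", "keywords": ["blogmeta"]},
    {"title": "Tag cloud", "keywords": ["blogmeta", "perl"]},
    {"title": "TTX files", "keywords": ["python", "trados"]},
]
kw_count, max_count = count_keywords(posts)
```

With the three sample posts, `blogmeta` ends up with a count of 2 and everything else with 1, so the maximum handed on to the formatter is 2.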

For our file cloud builder, we'll want to do this very same thing, but in Python (since that's our target language). But porting is cake, now that we know what we'll be porting.
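
A first cut at that port might look like the following. It is a sketch, not the finished file cloud code: the counts are passed in as a dict instead of read from a global, the `sm`/`lg` defaults mirror the Perl, and the escaping of the tag name is an extra that the Perl prototype doesn't bother with.

```python
from html import escape
from urllib.parse import quote


def keyword_tagger(kw_count, max_count, sm=70, lg=200):
    """Format a tag cloud as HTML links sized between sm% and lg%."""
    if max_count <= 0:
        return ""  # nothing to scale against
    delta = lg - sm
    parts = []
    for k in sorted(kw_count):  # alphabetical, like Perl's sort keys
        weight = kw_count[k] / max_count
        font = int(sm + delta * weight)  # truncates like sprintf("%d")
        parts.append(
            f'<a href="/blog/kw/{quote(k)}/" '
            f'style="font-size: {font}%;">{escape(k)}</a>\n'
        )
    return "".join(parts)


cloud = keyword_tagger({"perl": 1, "blogmeta": 2}, max_count=2)
```

Here `blogmeta` has the maximum count and comes out at 200%, while `perl`, with half as many posts, lands at 135%.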

Thus concludes the sermon for today.

Just so I don't forget how this blog thang works, I want to assure each and every one of you bots who reads this blog that I've been doing lots of cool programming-type stuff... Well, OK, actually, the family and I went sledding every day for a week and a half while the weather was right, and I've been scrambling to finish up a hueueueuge job that I was neglecting during that time. So ... no excuse at all.

On the translation front, I've discovered that hotstrings (e.g. Word's AutoCorrect feature) can really speed up my typing in cases where I'm doing repetitive texts. Even the time savings of typing "s" instead of "SAP" can really add up over time. But Word's AutoCorrect has a problem -- it saves different lists for each style, and in some instances it triggers without waiting for me to finish the word; if one of my hotstrings is a prefix of another word, that can truly suck. Some Googling got me to AutoHotKey -- and AutoHotKey truly and totally rocks. A lot of what I wanted to do with PyPop is already done and ready for me to use, actually. So I'm going to start bundling AHK with PyPop for Windows systems. It's that good.
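
To make the prefix problem concrete, here's a little Python sketch of whole-word expansion. It is not how AutoHotkey or Word does it internally (both work as you type; this runs over finished text), and apart from "s" for "SAP" the table entries are invented for the example.

```python
import re

# "s" -> "SAP" is from the post; "tr" is a made-up extra entry.
HOTSTRINGS = {"s": "SAP", "tr": "TRADOS"}


def expand_hotstrings(text, table=HOTSTRINGS):
    """Expand abbreviations only where they stand alone as whole words."""
    def replace(match):
        word = match.group(0)
        return table.get(word, word)  # leave unknown words untouched

    # \b\w+\b matches complete words, so "s" never fires inside "sales".
    return re.sub(r"\b\w+\b", replace, text)


result = expand_hotstrings("the s system handles sales orders")
```

The lone "s" becomes "SAP", but "system" and "sales" are left alone because the match only fires on a complete word.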

Another nice discovery this week has been SQLite -- which is not just open-source, it's actually public domain. It implements pretty much all of the SQL92 standard (except the permission model) for a lightweight local database for single-user use. Websites have been built on it with impressive performance. And the key is -- you can bundle it into anything. Anything. So it's definitely going into the wftk. Man. I'd been kind of moving towards building my own SQL parser and so on -- what's the point? It's already been done! And beautifully!
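
Just to show how little ceremony is involved, here's a sketch using the `sqlite3` module from Python's standard library. The `post_keyword` table and its contents are invented for the example; nothing here is the wftk's actual schema.

```python
import sqlite3

# An in-memory database: no server, no config file, nothing to install.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE post_keyword (post_id INTEGER, keyword TEXT)")
conn.executemany(
    "INSERT INTO post_keyword VALUES (?, ?)",
    [(1, "blogmeta"), (2, "blogmeta"), (2, "perl")],
)

# Plain SQL does the counting that the tag cloud needs.
rows = conn.execute(
    "SELECT keyword, COUNT(*) FROM post_keyword "
    "GROUP BY keyword ORDER BY keyword"
).fetchall()
conn.close()
```

Pointing `connect` at a filename instead of `":memory:"` gives you the same thing persisted to a single local file.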

So there's life in me yet, never fear.

On that note, it's back to work for me.

I'm not dead! Just really busy.

So I've been wanting to put up some information about organic produce, the digestive system, bioplastics, and so on, but (as usual) don't have the time. And then today, xkcd comes up with this. Ha! Definitely one of the best toons out there.

More later. Really. In the meantime, if you want meaningless, content-free remixes of my pearls of wisdom, I refer you to the Markov version of my blog, my Ulysses in the Caribee. It's different, I'll give it that.

2011-01-01 blogmeta

There were no posts between 2009 and 2014 for reasons I explain in the state of the site post.

2014-04-09 blogmeta

I've finally got the blog publisher working, so all the historical posts and keyword assignments now appear where they should (mostly; still have a couple of weirdnesses to figure out).

It all gets compiled and built on my local machine, written to static HTML, and pushed via git to GitHub, which as you know, Bob, now serves my static content. (This statement implies that my non-static content is hosted elsewhere, which is technically true - it's still on my old box, but doesn't actually work right now. That is a can of worms for another day.)

I had a lot of fun setting things up, and eventually I intend to post about the new site publication system. But in the meantime, there are lots of other things I also want to write about, and my sabbatical week is already half over just on the blog publishing system alone. (Sigh.)

Over the past two years, I've trained myself to use a note-taking system of my own design to track programming work and ideas, and I've augmented that system to publish some notes to the blog. This is the first of those notes. I hope to rebuild the habit of technical writing now that my translation productivity has risen - which should at least potentially free up some time. Right?

Anyway. There's lots to do. I'm going to get to it.

2014-04-12 blogmeta

Between 2009 and 2014, a lot happened to me, but none of it shows in this blog. Mostly that's because I wanted to blog the house renovation and my blogging code of the day was a horrendous, unmaintainable mess that never seemed to do anything I wanted - so I switched to Blogger. (You can see the house blog here, but as we left the country again two years ago and unloaded the house last summer, it's gotten kinda boring.)

In November of that year, I started thinking harder about what programming actually means. And that turned into the semantic programming blog, also at Blogger. Over the next two years, that effort resulted in a kind of neat declarative programming system for Perl, which collapsed under its own technical debt shortly thereafter.

I had family stress for the year after that, culminating in our leaving the country for Budapest in 2012 after our daughter graduated high school. The summer after that we sold the house and I discovered that my blood pressure could best be measured in psi. And now things with family, health, and finances are actually pretty good, and my thoughts wander back to writing.

The Blogger system is great for short-to-medium text and pictures that can be written quickly and don't need any internal structure. It was a fantastic medium for the house blog. It is not fantastic for code given its rigid three-column format, difficulty comprehending monospace fonts, and refusal to handle indentation, and is also less than great for any presentation longer than a few paragraphs. So to write real technical articles, I needed to revitalize the Vivtek site.

Back in the day, the site was hosted on AOLserver for historical reasons, and elements like the sidebar menu and a lot of the other bits and pieces were handled dynamically. The content was compiled on the server. But as that system aged and the box was put to additional uses, the cracks in its structure became apparent. Anything exposed to Internet input became an instantaneous spam archive. MediaWiki, in a different site I was hosting there, was a relentless processor hog. The Despammed spam filter made sure that if there was any resource problem it would be magnified tenfold as the queue backed up behind the kink. Between all that and the accreted weirdness of twelve years of haphazard Perl scripts running behind the scenes, I was forced to move the site's static content to GitHub hosting. The dynamic content was left to hang in the weather. I had more important fish to fry.

So to get the site back into working shape, I had to reinvent my content handling code on my laptop and integrate with the static site. And there simply hasn't been enough time to think about that - especially given my penchant, when given any programming task, to think about how it should be possible to code it at a higher semantic level. I can talk to myself at length about just finishing this damn script today and worrying about entire new programming paradigms at some later date, but it doesn't help.

But at the end of last month, after working two months' worth of jobs in March, I decided that what I needed in the first week of April was a sabbatical. I wanted to write at least one article. (Actually I have a whole page of one-line ideas of things to do this week. And instead, it's Saturday night, day 9 of the sabbatical, work already queued up for Monday, and I haven't even written the article yet - but I did do the research for it, and wrote a kick-ass tool for my daily work, and that's really the point of this week.)

And lo! I have the site compiling again - and with a tool I'm not even ashamed of! One that will hopefully soon grow into a more mature writing environment, including a notion I've been calling "code exegesis", about which more later. Now that the site builder works well, I hope I can keep things moving with less than a full week off work, and build up some momentum. I have a lot of things I want to write about, so that's not a problem. Historically, family and financial stress have been my greatest impetus sinks, and those (knock on wood) are at low ebb currently, may they stay that way forever.

And that's why the blog below goes from 2014 to 2009 with no visible transition. I am cautiously hopeful that things will start moving again now.

2014-09-06 blogmeta

Traditionally, our family has made use of summers to visit whichever of our two countries we're not currently in. For twenty years, that meant a pilgrimage to Budapest; now that we live in Budapest it means road tripping in the States. This year was no exception - from June 14 to August 30, according to my calculations, I drove a total of 4927 miles, mostly shuttling various family members around Indiana for this event and that.

Even at half-power for the paying work, the result was absolutely zero mental capacity for anything resembling coding, writing, writing about coding, coding related to translation - a complete and utter void. I didn't even take any meaningful notes, beyond the occasional recap of The Plan. (The Plan is a multi-threaded thing that starts with better file handling, goes through literate exegesis of codebases and declarative accounting structures, visits machine translation on the way, and culminates in Hofstadterian generality at its most tenuous reaches - I want to say it's been a constant companion in my life for many years, except that it is utterly mutable. It's actually been a very inconstant companion in my life.)

Anyway, now I'm back in Budapest, Facebook is blocked, I have nobody to visit, no class reunions, all known at-risk family members have already died, I have no need or means to drive people around the countryside, and I'm over the jetlag and the virus I caught in August - in short, session has resumed.

Hold onto your butts.

This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.