Keyword python


OCR in Python. There isn't any, to speak of. While there do exist a few open-source OCR projects (Conjecture seems to have a great deal of promise!), none of them play well with Python. I may want to rectify that at some point.

Anyway, having grown bored with simply writing AutoHotkey scripts to play Tower Defense, I quickly realized that I really needed a tool to start the game for me and to track the score and other stats for later analysis.

The first part was no big deal; I whipped out a PyPop applet that could launch a URL. Since I wanted a window sized to fit the Flash object, that required putting together a local HTML file that runs some JavaScript to pop up the window I wanted, then closes itself afterwards. I'll document that when I have time (I propose the abbreviation wIht to save me the time of typing that phrase).
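Until the real write-up exists, here is a rough sketch of the launcher idea in plain Python, without the PyPop part. The URL, the window size, and the names build_launcher_html and launch are placeholders made up for illustration; they are not taken from the actual applet. Modern browsers may also block the popup or refuse to let the page close itself, so treat this as the shape of the trick rather than a guaranteed recipe.

```python
import json
import tempfile
import webbrowser
from pathlib import Path

# Hypothetical placeholder; the post does not give the real game URL.
GAME_URL = "http://example.com/tower-defense.swf"


def build_launcher_html(url, width, height):
    """Return an HTML page that pops up a sized window, then closes itself."""
    features = "width=%d,height=%d,resizable=no,scrollbars=no" % (width, height)
    # json.dumps produces double-quoted strings that are valid JavaScript literals.
    return (
        "<html><body><script>\n"
        "window.open(%s, 'game', %s);\n"
        "window.close();\n"
        "</script></body></html>\n"
    ) % (json.dumps(url), json.dumps(features))


def launch(url=GAME_URL, width=600, height=500):
    """Write the launcher page to a temp file and open it in the default browser."""
    path = Path(tempfile.gettempdir()) / "launcher.html"
    path.write_text(build_launcher_html(url, width, height), encoding="utf-8")
    webbrowser.open(path.as_uri())
    return path
```

Calling launch() writes the page and hands it to the default browser; build_launcher_html is kept separate so the generated page can be inspected without opening anything.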

Well. That was fun, and it worked, but I really wanted something to monitor the score for me, and timing, and ... stuff. Which meant that I would have to read the actual graphical screen, because there's no handy-dandy textual output on that Flash app.

You'd think that would be trivial in 2007. But you'd be wrong.

Getting a snapshot of the screen was easy enough. I wrapped win32gui to get a window handle by the title (I'll document this later: Idtl), then installed PIL to grab the actual graphical data and manipulate it. To warm up with that, I set a timer to grab a four-pixel chunk of the screen so I could see whether the Flash had started or not (Tower Defense goes through an ad screen, then a splash screen, and only then does the game start). That took a little putzing around, but the result was gratifying: my little utility could tell me when the game was ready to play. As long as the window was on my primary monitor, anyway (turns out PIL is not good with multiple monitors -- who knew?). So I had to move the window onto the primary monitor before any of that would work.
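In lieu of the promised documentation, a minimal sketch of that warm-up might look like the following. It assumes pywin32 and PIL (or Pillow) are installed on Windows; the function names, the 2x2 chunk shape, the pixel offset, and the tolerance are all invented here, not lifted from my utility. The Windows-only imports are done lazily so the comparison logic can be tried anywhere.

```python
def window_rect(title):
    """Return (left, top, right, bottom) for the window with this exact title."""
    import win32gui  # pywin32, Windows only, hence the lazy import

    hwnd = win32gui.FindWindow(None, title)
    if not hwnd:
        raise LookupError("no window titled %r" % title)
    return win32gui.GetWindowRect(hwnd)


def grab_chunk(rect, offset, size=(2, 2)):
    """Grab a small block of pixels at `offset` inside the window rectangle."""
    from PIL import ImageGrab  # lazy import, screen grabbing is platform-specific

    left, top = rect[0] + offset[0], rect[1] + offset[1]
    image = ImageGrab.grab(bbox=(left, top, left + size[0], top + size[1]))
    image = image.convert("RGB")
    # Flatten to a row-major list of (r, g, b) tuples.
    return [image.getpixel((x, y)) for y in range(size[1]) for x in range(size[0])]


def chunk_matches(pixels, expected, tolerance=8):
    """True if every channel of every pixel is within `tolerance` of `expected`."""
    if len(pixels) != len(expected):
        return False
    return all(
        abs(a - b) <= tolerance
        for got, want in zip(pixels, expected)
        for a, b in zip(got, want)
    )
```

A timer would then call grab_chunk(window_rect(title), offset) every so often and compare the result against a chunk recorded from the running game; the small tolerance is there only as a guess at how much the rendered colors might wobble.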

And then I could grab the sections of the screen with the numbers on them for score, bonus, lives, timer, and money... And then it all screeched to a halt, because there are no open-source Python OCR libraries. At all. And clearly I don't have the time to adapt something -- hell, I don't even have time to do all this. I don't even have time to write this blog entry.

So of course I did the natural thing. I wrote my own special-purpose OCR, because clearly that would be saving time. I saved four hours before I started falling asleep, and it still can't tell 8 from 0 (but it does a fine job on the rest of the digits.) It was a lot of fun, actually. Idtl.
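For flavor, here is one way a special-purpose digit reader can be sketched: nearest-template matching on tiny binarized glyphs. To be clear, this is not the code I wrote that night. The 3x5 toy font, the names TEMPLATES, hamming, classify, and read_number, and the whole approach are assumptions chosen because they fit in a few lines.

```python
# Toy 3x5 font, one string of 15 cells per digit, '#' = ink, '.' = background.
# These shapes are invented for illustration, not taken from the game.
TEMPLATES = {
    "0": "###" "#.#" "#.#" "#.#" "###",
    "1": ".#." "##." ".#." ".#." "###",
    "2": "###" "..#" "###" "#.." "###",
    "3": "###" "..#" "###" "..#" "###",
    "4": "#.#" "#.#" "###" "..#" "..#",
    "5": "###" "#.." "###" "..#" "###",
    "6": "###" "#.." "###" "#.#" "###",
    "7": "###" "..#" "..#" "..#" "..#",
    "8": "###" "#.#" "###" "#.#" "###",
    "9": "###" "#.#" "###" "..#" "###",
}


def hamming(a, b):
    """Count the cells where two equal-length glyph strings differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def classify(glyph):
    """Return (digit, distance) for the template closest to a 5-row glyph."""
    flat = "".join(glyph)
    if len(flat) != 15:
        raise ValueError("expected a 3x5 glyph")
    scored = ((digit, hamming(flat, t)) for digit, t in TEMPLATES.items())
    return min(scored, key=lambda pair: pair[1])


def read_number(glyphs):
    """Classify a sequence of glyphs and join the digits into a string."""
    return "".join(classify(g)[0] for g in glyphs)
```

In this toy font, 0 and 8 differ in exactly one cell (the middle of the center row), so one misread pixel is enough to swap them. That may hint at why the pair is a nuisance, though whether it is the same reason my own reader trips on them is a guess.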

So. Proposal. It would be nice to do some work with Conjecture, and produce the following: (1) a Python binding that can work in memory with PIL bitmaps, (2) a Web submitter for test graphics, and (3) an online tester and test database reporter. That would be really cool. Of course, only wIht.


2009-03-20 wftk python perl ruby

So I had this really, really stupid idea a couple of days ago, but I just can't shake it. See, I'm rewriting the wftk in Perl in tutorial form, something that I've planned for a really long time.

Well, here's the thing. The Muse picked Perl, essentially because WWW::Modbot is an OOification of the original modbot stuff I wrote in Perl. And the Term::Shell approach to the modbot turned out to resonate so well with what I wanted to do, that I just ... transitioned straight from the modbot back to wftk in the same framework. But Perl -- even though I love Perl -- is not something I'm utterly wedded to, you know?

And now, I'm working in a unit-testing paradigm for the development. I've carefully defined the API in each subsection, tested it, and know where I'm going.

So here's the stupid idea. It just won't let go of me. Why stick to Perl?

Why not take each class, each unit test, and do that in selected other languages? It would be a fascinating look at comparative programming between the languages, wouldn't it? And the whole point of the wftk is not to be restrictive when it comes to your existing infrastructure -- wouldn't one facet of that unrestrictiveness be an ability to run natively in Python? Ruby? Java? C? Tcl? LISP?

It just won't let go.







Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.