So I have gotten sucked into this cool game, Tower Defense -- apparently there's a whole genre, but this is the one I've been playing with the past couple of days.
It's a Flash game, and the premise is simple. You have a grid with two incoming gates and two outgoing gates. Enemies come in the incoming gates, and you have to kill them before they get to the outgoing gates. You lose one health point for every enemty that gets through. You have 20 health points and no way to get more.
So -- the point of the game is to build defense towers which run automatically. Each enemy you kill nets you a small amount of money, which you can use to build new towers, or upgrade existing ones. (You need to upgrade to keep up with the steadily increasing toughness of the enemies, of course.)
It's fun. I thought of a fun way to do some really cool programming with it. AutoHotKey can find the Flash window, and can scan the screen for whatever you need. It can click things, too. So my notion is to write an AHK front-end to play the game, with an IP connection to a strategy server. The strategy server would then use evolutionary programming to come up with board arrangements, and over time (perhaps a great deal of time, who knows?) you could watch the strategizer get really good at Tower Defense.
But it would be fun to try it.
My notion is that the instructions for a given strategy would consist of a series of builds at given coordinates, intermixed with upgrades to the towers built. Each instruction would only execute after there was enough money available for it. So you'd have a string of instructions which would express a given board setup.
To combine strategies, you'd just ... well, I guess you'd just cross the streams, so to speak, taking two (or more?) successful strategies, cut them at a random spot, and splice them together. You might want to have some cleanup rules (you can't upgrade squirter #3 if you didn't build three squirters, for instance.)
And then you'd let it run for a few weeks. You could get other people to download your script and run strategies, too.
It could be very, very cool...
So it turns out that AutoHotKey can indeed play Tower Defense; I recorded a script to set up my favorite opening last night. It occurs to me that it would be quite easy to set up a simple "strategy-description language" which would be a series of commands to be executed in order. Translating that higher-level language into an AHK script would be trivial.
So at least it would be possible to track strategies and compare them. That's step one in an evolutionary approach.
Couple that with a Python front-end to make it all more convenient, and we're off to the races!
OCR in Python. There isn't any, to speak of. While there do exist a few open-source OCR projects (Conjecture seems to have a great deal of promise!), none of them play well with Python. I may want to rectify that at some point.
Anyway, growing bored of simply writing AutoHotKey scripts to play Tower Defense, I quickly realized that I really needed a tool to start the game for me, and track the score and other stats for later analysis.
Well. That was fun, and it worked, but I really wanted something to monitor the score for me, and timing, and ... stuff. Which meant that I would have to read the actual graphical screen, because there's no handy-dandy textual output on that Flash app.
You'd think that would be trivial in 2007. But you'd be wrong.
Getting a snapshot of the screen was easy enough. I wrapped win32gui to get a window handle by the title (I'll document this later: Idtl), then installed PIL to grab the actual graphical data and manipulate it. To warm up with that, I set a timer to grab a four-pixel chunk of the screen so I could see whether the Flash had started or not (Tower Defense goes through an ad screen, then a splash screen, and only then does the game start.) That took a little putzing around, but the result was gratifying: my little utility could tell me when the game was ready to play. As long as the window was on my primary monitor, anyway (turns out PIL is not good with multiple monitors -- who knew?) So it turned out I had to move the window before all that.
And then I could grab the sections of the screen with the numbers on them for score, bonus, lives, timer, and money... And then it all screeched to a halt, because there are no open-source Python OCR libraries. At all. And clearly I don't have the time to adapt something -- hell, I don't even have time to do all this. I don't even have time to write this blog entry.
So of course I did the natural thing. I wrote my own special-purpose OCR, because clearly that would be saving time. I saved four hours before I started falling asleep, and it still can't tell 8 from 0 (but it does a fine job on the rest of the digits.) It was a lot of fun, actually. Idtl.
So. Proposal. It would be nice to work some with Conjecture, and produce the following: (1) a Python binding that can work in memory with PIL bitmaps, (2) a Web submitter for test graphics, and (3) an online tester and test database reporter. That would be really cool. Of course, only wIht.