My Secret Life as a Spaghetti Coder
home | about | contact | privacy statement
Dave Mark raises some interesting questions about artificial intelligence in games over at First, he explains that although we're seeing more and better AI in games, a common complaint heard from gamers runs along the lines of "why can't they combine such and such AI feature from game X in game Y." Then, Dave poses the questions for developers to answer:
We can only cite limited technological resources for so long.
Perhaps, from a non-business standpoint... that of simply an AI developer, we should be asking ourselves what the challenges are in bringing all the top AI techniques together into the massive game environments that are so en vogue. What is the bottleneck? Is it money? Time? Hardware? Technology? Unwillingness? Unimaginativeness? A belief that those features are not wanted by the gamer? Or is it simply fear on the part of AI programmers to undertake those steps necessary to put that much life into such a massive world?
Let me first admit that I'd wager Dave Mark knows a lot more about this stuff than me. That's how he makes a living, after all. My experience in developing game AI comes from choose your-own-adventure text-based games as a kid (where the algorithm was very deterministic, with few options), making villagers walk around in Jamaicanmon!,

and making spaceships run away from you instead of seeking you out in Nebulus: Ozone Riders.

I even asked Glitch, Wes, and Jonathan (teammates on the project) to remind me of some simple vector math and draw it out on the wet erase board for Nebulus. And I still made them go the wrong direction (which ended up being pretty cool, actually).

In other words, I haven't had much experience with AI as it's typically implemented in games, and what little experience I have had is limited to things I wouldn't (personally) classify as AI to begin with.

Still, I have had some experience in what I might call "classical" AI (perhaps "academic" is a better term). Stuart Russell and Peter Norvig wrote the Microsoft of books on Artificial Intelligence (90% market share for AI textbooks), and I've read through a fair bit of it. I've implemented a couple of algorithms, but mostly I've just been introduced to concepts and skimmed the algorithms. In addition, I've been through Ethem Alpaydin's book, Introduction to Machine Learning, which would have comparatively fewer ideas applicable to games.

I guess what I'm trying to say is: although I have some knowledge of AI, consider the fact that Dave's experience dwarfs my own when I disagree with him here: It is precisely the fact that we don't have enough processing power that gets in the way of more realistic AI in our games. Or, put more accurately, the problems we're trying to solve are intractable, and we've yet to find ways to fake all of them.

I'm not convinced you can separate the failures of AI from the intractability of the problems, or the inability to design and implement a machine to run nondeterministic algorithms for those problems in NP.

Compared to deciding how to act in life, deciding how to act in Chess is infinitely more simple: there are a finite set of states, and if you had the time, you could plot each of the states and decide which move gets you the "closest" (for some definition of close) to where you'd like to be N moves from the decision point. Ideally N would get you to the end, but even for Chess, our ability is limited to look ahead only a small number of moves. Luckily for Deep Blue (and others), it turns out that's enough to beat the best humans in the world.

Even though Chess is a complex problem whose number of game states prevent us from modeling the entire thing and deciding it, we can cheat and model fewer states - when we can make an informed decision that a particular path of the decision tree will not be followed, we can forgo computation of those nodes. Still yet, the problem will be huge.

There are other ways of "faking" answers to the AI problems that face us in development. Approximation algorithms can get us close to the optimal solutions - but not necessarily all the way there. In these cases, we might notice the computer doing stupid things. We can tweak it - and we can make special case after special case when we notice undesired behavior. But we are limited in the balance between human-like and rational. Humans are not rational, but our programs (often) are made to be.

Presumably, they give the policeman in GTA 4 a series of inputs and a decision mechanism, and he's thinking purely rationally: so sometimes the most rational thing for him to do based on the decision model we gave him is to run around the front of the car when he's being shot at. Sometimes he jumps off buildings. (Video via Sharing the Sandbox: Can We Improve on GTA's Playmates?)

It may not be smart, but it is rational given the model we programmed.

You can make the decision model more complex, or you can program special cases. At some point, development time runs dry or computational complexity gets too high. Either way, game AI sucks because the problems we're trying to solve have huge lower bounds for time and space complexity, and that requires us to hack around it given the time and equipment we have available. The problems usually win that battle.

Pac-man game screen shot Game AI has come a long way since Pacman's cologne (or maybe he just stunk), and it will get better, especially as we move more gaming super-powerful servers. Still, it's far from ideal at the moment.

What do you have to say? (Especially the gamer geeks: programmers or players)

Hey! Why don't you make your life easier and subscribe to the full post or short blurb RSS feed? I'm so confident you'll love my smelly pasta plate wisdom that I'm offering a no-strings-attached, lifetime money back guarantee!

It's a small step, but emcee-3PO can now identify the staves in an image of sheet music for my single test case of "My Darling Clementine." I need to include hundreds more test cases, and I plan to when I implement code to make the tests mark the sheet music with what emcee3po detected so I can visually inspect the accuracy.

Ichiro Fujinaga's "Optical Music Recognition using Projections" (PDF) explains the process in detail, but it turns out to be relatively simple.

To locate the staves:
  1. Do a y-projection on the image.
    A projection just reduces the number of dimensions in an image. In this case, we just take the number of dark-colored pixels in a row of the image. It's similar in theory to 3D projection, but instead of projecting three dimensions onto a plane, we're projecting a plane onto a line.

    I used a threshold of 50% to determine if a pixel was dark enough to include in the projection. So, if R+G+B < (FF+FF+FF) / 2, I count the pixel as dark.

  2. Find the local maxima.
    We want to find the places where the number of dark pixels in a row is highest - those will indicate the horizontal lines on the staff. To do that, we find all the places where the number of pixels stops growing and starts getting smaller -- or where the slope changes from positive to negative. To ignore noise, we set a threshold as Fujinaga suggests at the average of each row, so we don't include anything less than that in our collection of local maxima.

  3. Find the tightest groups of 5.
    We want to find all the places where 5 local maxima are the smallest distance apart, which should indicate the 5 lines in a staff. This part is accomplished by examining each 5-element window in the array of local maxima, and finding the one with the smallest distance between its points. Then you can remove all the windows that include any of those points, and continue until there are no more windows.

  4. Expand those indexes to collect the places where the notes fall outside the staff lines.
    I don't remember Fujinaga mentioning this in the paper I linked to above, but I'm thinking it must be in there. Essentially, since the local maxima get us only what's in between the 5 lines of the staff, we need to expand it a bit so we can get the notes that don't fall directly between the 5 lines. Right now, I've used 1/4 of the average of the rows in the projection, but I think it will need to be an even smaller threshold because I'm still not reliably getting all of the notes.
Up next: reading the notes on the staves. That's going to be cool.

Even though they have a special hero(ish) status, it's a popular pastime (some might say cliché) to complain about medical doctors making so much money when nurses and other supporting cast in the medical industry "do all the work."


The Turing Test was designed to test the ability of a computer program to demonstrate intelligence. (Here is Alan Turing's proposal of it.) It is often described as so: if a computer can fool a person into believing it too is a person, the computer has passed the test and demonstrated intelligence. That view is a simplified version.

Quoth Wikipedia about the rules:
A human judge engages in a natural language conversation with one human and one machine, each of which try to appear human; if the judge cannot reliably tell which is which, then the machine is said to pass the test.
Specifically, the test should be run multiple times and if the judge cannot decipher which respondent is the human about 50% of the time, you might say he cannot tell the difference, and therefore, the machine has demonstrated intelligence and the ability to hold a conversation with a human.

I bring this up because recently Giles Bowkett said that poker bots pass the test, and pointed to another post where he said the Turing test was beaten in the 1970s by a paranoid repeater named PARRY.

I suppose the idea of poker bots passing the test comes about because (presumably) the human players at the table don't realize they are playing against a computer. But if that is the case, even a losing bot would qualify - human players may think the bot player is just an idiot. More...

The following was generated using a 7th order Markov chain and several of my blog posts as source text:


A couple of weeks ago the UH Code Dojo embarked on the fantastic voyage that is writing a program to solve Sudoku puzzles, in Ruby. This week, we continued that journey.

Though we still haven't completed the problem (we'll be meeting again tenatively on October 15, 2007 to do that), we did construct what we think is a viable plan for getting there, and began to implement some of it.

The idea was based around this algorithm (or something close to it): More...

A couple of days ago the UH Code Dojo met once again (we took the summer off). I had come in wanting to figure out five different ways to implement binary search. The first two - iteratively and recursively - are easy to come up with. But what about three other implementations? I felt it would be a good exercise in creative thinking, and pehaps it would teach us new ways to look at problems. I still want to do that at some point, but the group decided it might be more fun to tackle to problem of solving any Sudoku board, and that was fine with me.

Remembering the trouble Ron Jeffries had in trying to TDD a solution to Sudoku, I was a bit weary of following that path, thinking instead we might try Peter Norvig's approach. (Note: I haven't looked at Norvig's solution yet, so don't spoil it for me!) More...

For background and history on partial order planners, see What is Partial Order Planning?, Selected History of POP Part 1, and POP History Part 2. Or, you can read the entire thing in PDF format.

Our goal is to give commands to the partial order planner, telling it what the goal is, the initial state (if it exists), and actions it can perform. The actions contain the name of the action, any preconditions that must be fulfilled before that action can be performed, and a set of effects the action has on the world state. After giving this information to the planner, it should output a plan if one exists.

For simplicity's sake, I've used a STRIPS-like notation, without the complexity of existentially or universally quantified variables, among other things. Further, only one possible plan is returned, rather than attempting to find all plans. The one returned is not guaranteed to be optimal, though (inadequate) tests have shown that it is correct. Plans are to improve these limitations in the future, moving to a less restrictive ADL-style syntax, and adding support for returning multiple plans. More...

If you need more context (rather than jumping straight into this), check out What is Partial-Order Planning? and Selected History of Partial Order Planning, Part 1 if you haven't already.

* * * *

As David Chapman noted in 1985, "planners of the most promising ('nonlinear') sort have been complicated, heuristic, ill-defined AI programs, without clear conditions under which they work." And since the time of NOAH, INTERPLAN, and NONLIN (and others I've left out for space reasons), there have been various improvements in the realm of partial-order planning. Chapman's program, TWEAK, is one of them, and was followed by the UCPOP and RePOP planners. More...

It was apparent as far back as 1975 that "linear planning" (or totally ordered planning, as described in What is Partial-Order Planning?) was not sufficient. Russell and Norvig relay the story of Allen Brown's experiment: that it could not solve simple problems such as the Sussman anomaly, where given 3 blocks labeled A, B, and C, with block B on the table and C on top of A which is on the table, get to the goal state of A on top of B on top of C (Russell, 410, 414) (See Figure 1). Around that time, as part of his Ph.D., Austin Tate released a paper which described INTERPLAN, a system to solve the problem of interleaving shown by Brown using Sussman's HACKER program (the Sussman Anomaly) (Tate [A] / Russell 410). More...

Last night I had the good fortune to be in attendance at a talk given by Ray Kurzweil entitled "The Web Within Us: When Minds and Machines Become One." For those unfamiliar with Ray, part of his bio as given in the program distributed at the presentation reads
Ray Kurzweil has been described as "the restless genius" by the Wall Street Journal and "the ultimate thinking machine" by Forbes. Moreover, Inc. magazine ranked him eighth among entrepreneurs in the United States, calling him the "rightful heir to Thomas Edison." ...

As one of the leading inventors of our time, Kurzweil was the principle developer of the first CCD flat-bed scanner, the first omni-font optical character recognition system, the first print-to-speech reading machine for the blind, the first text-to-speech synthesizer, the first music synthesizer capable of recreating the sound of a grand piano and other orchestral instruments, and the first commercially marketed large vocabulary speech recognition system.

For our case, we'll explore partial-order planning in a classical planning environment. Such an environment is fully observable (as opposed to only partially so) and deterministic (as opposed to having randomness, or being stochastic). Further, the space is finite and static in nature - it does not change in the middle of deliberation. Finally, the environment is "discrete (in time, action, objects, and effects)," as opposed to continuous along any of these axes (Russell, 375. For further reading on the characteristics of environments, see Russell page 41-42). More...

Wow, six posts in one day. I'm exhausted. My last link to InfoQ today comes in thanking them for the timely post Ruby Domain Specific Languages Buzz. There, I got that out of the way.

It is timely for me, because a couple of weeks ago I decided I was going to try to implement a Partial-Order Planner DSL in Ruby. I haven't yet started, nor have I decided on a full course of action. But, I do have a vague strategy outlined in my head, and while I have yet to decide if I will be using the code provided by Reginald Braithwaite in his post about "an Approach to Composing Domain-Specific Languages in Ruby," the content will probably prove helpful. Another link they put was to Creating DSLs with Ruby posted by Jim Freeze to Artima's Ruby Code and Style.

I'll let you know how my progress goes. I'll probably start a little survey of what Partial-Order Planning is - similar to my paper on k-means - and cover some research about it first and post piece by piece to build my paper this time, rather than waiting to post it all at once. I'm not married to that approach, so you might see some code first ... but I'm leaning that way.

Just finished writing a survey on some of relatively current literature on k-means, focusing on introducing it, some practical applications of it, some difficulties in it, and how to find k, the number of clusters. I'm still new to the area, so don't expect much groundbreaking to be done.

The second half focuses on my own experiment, trying to find k using two similar, but slightly different techniques. I failed, but if you'd like to go over it and either laugh at me, or perhaps figure out what I've done wrong, you are free to. =)

Obviously, this isn't going to interest many people, so I didn't take time to mark it up - it's just available as a DOC (I had planned on having a PDF version, but my PDF writer has taken a crap on me). If you don't have Word or Open Office, and would like to read it, contact me and I'll try to get the PDF for you in some way or another.

Anyway, the DOC is here if you want to read it. It's over 3600 words, so beware!

I'm interested to know if anyone has built any machine learning libraries or done anything with machine learning in Coldfusion? My immediate thought is "no way!" because I don't think Coldfusion has the performance for it. But, I wouldn't know, since I haven't tried it. Have you? What's been your experience? Drop me a line if you care to.


Picture of me

.NET (19)
AI/Machine Learning (14)
Answers To 100 Interview Questions (10)
Bioinformatics (2)
Business (1)
C and Cplusplus (6)
cfrails (22)
ColdFusion (78)
Customer Relations (15)
Databases (3)
DRY (18)
DSLs (11)
Future Tech (5)
Games (5)
Groovy/Grails (8)
Hardware (1)
IDEs (9)
Java (38)
JavaScript (4)
Linux (2)
Lisp (1)
Mac OS (4)
Management (15)
MediaServerX (1)
Miscellany (76)
OOAD (37)
Productivity (11)
Programming (168)
Programming Quotables (9)
Rails (31)
Ruby (67)
Save Your Job (58)
scriptaGulous (4)
Software Development Process (23)
TDD (41)
TDDing xorblog (6)
Tools (5)
Web Development (8)
Windows (1)
With (1)
YAGNI (10)

Agile Manifesto & Principles
Principles Of OOD
Ruby on Rails

RSS 2.0: Full Post | Short Blurb
Subscribe by email:

Delivered by FeedBurner