November 04, 2010

Why Not Learn?

The question seems to come up again and again, usually once or twice a year... why can’t we use machine learning to simply analyze a pile of player data, and learn the AI for us? Certainly doing “manual knowledge elicitation” (a fancy term for “building your AI by hand”) has its own drawbacks. It’s challenging. It’s time consuming. As the complexity of the problem space grows, the complexity of the AI typically grows in a worse-than-linear fashion. And machine learning is all the rage in academic circles. It automates all that difficult work, and simply extracts what you need from a pile of data. So why aren’t we using it more widely? Well, there are lots of reasons. Here are a few:

Reason #1: Making It Work

Machine Learning algorithms typically require four things in order to do their work:

  1. You need to define the features that the algorithm should track.
  2. You need to specify the output parameters that the algorithm needs to return.
  3. You need to define the fitness function, which tells the algorithm how good or bad a particular example is.
  4. You need a huge pile of example data for the algorithm to sift through.

So, for example, if you’re writing a learning algorithm that’s going to learn how to drive a car on the highway, the features might be things like:

  • The distance from the lines to your left and right
  • The presence or absence of cars to your left and right
  • The distance from the car in front of you
  • Your speed
  • The speed of the car in front of you
  • Brake lights of the car in front of you

And so on. The fitness function would return high scores when you have a good following distance, your speed roughly matches that of the car in front of you, you’re centered in your lane, etc. The output parameters would be settings for the brake, gas, and steering wheel. Finally, you’d need a whole bunch of data from people actually driving cars, so that the learning algorithm could plug through and figure out how to do it well. For a game, all of these can be tough to find. In particular, the fitness function and the example data can be tricky.
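To make the driving example concrete, here is a minimal sketch of what the features, outputs, and fitness function might look like in code. Everything in it is an assumption made for illustration: the names (HighwayFeatures, Controls, fitness), the units, the two-second target gap, and the equal weighting of the three terms. It is not taken from any real driving system.

```python
from dataclasses import dataclass


@dataclass
class HighwayFeatures:
    """Inputs the learner observes; names and units are illustrative."""
    dist_left_line_m: float
    dist_right_line_m: float
    car_on_left: bool
    car_on_right: bool
    gap_ahead_m: float
    speed_mps: float
    lead_speed_mps: float
    lead_brake_lights: bool


@dataclass
class Controls:
    """Outputs the learner must produce, each normalized."""
    brake: float      # 0.0 .. 1.0
    gas: float        # 0.0 .. 1.0
    steering: float   # -1.0 (full left) .. 1.0 (full right)


def fitness(f: HighwayFeatures, target_gap_s: float = 2.0) -> float:
    """Score one observed state in [0, 1]; higher is better.

    Averages three hand-picked terms: following distance, speed match,
    and lane centering. The thresholds and weights are assumptions.
    """
    # Following distance: full marks at or beyond the target time gap.
    target_gap_m = max(f.speed_mps * target_gap_s, 1.0)
    gap_score = min(f.gap_ahead_m / target_gap_m, 1.0)

    # Speed match: penalize the relative difference from the lead car.
    denom = max(f.lead_speed_mps, 1.0)
    speed_score = max(0.0, 1.0 - abs(f.speed_mps - f.lead_speed_mps) / denom)

    # Lane centering: 1.0 when equidistant from both lines.
    lane_width = f.dist_left_line_m + f.dist_right_line_m
    offset = abs(f.dist_left_line_m - f.dist_right_line_m)
    center_score = 1.0 - offset / lane_width if lane_width > 0 else 0.0

    return (gap_score + speed_score + center_score) / 3.0
```

Even in this toy version, every term is a measurable quantity. That is exactly what is missing when the thing you want to score is "fun" or "clever."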

For the fitness function, we can’t even define “fun,” much less measure it -- how are we going to go about encoding a fitness function for it? What about “clever,” “humorous,” or “tricksy?” The characteristics that give our characters their personality, that make them memorable, that deliver the experience we want the player to have, are not things that are easy to encode.

Once we get past that, the next hurdle is that we need learning data from the game in its final state. But the game isn’t ever in its final state. Even after you ship, there will be patches, and DLC, and expansion packs, and player generated content... Every time one of those hits, you need to run the machine learning all over again -- which means you need to gather all of that data all over again. That’s not cheap. And if we make a change to the game -- even a small one -- and don’t relearn the AI, then the AI isn’t going to be able to accommodate that change.

You might think that you can just throw some learning data together quickly, but in reality gathering the learning data can be one of the hardest parts of the task. The output of your learner will only be as good as the data you feed in, so it's important that you have a large data set which covers all possible cases. Any situation not included in the data won't be learned. What's more, the data set needs to be built in such a way that the AI won't over learn, or learn the wrong thing.

Finally, that late in the project there’s just not time to iterate. As a result, your machine learning needs to be absolutely reliable -- that is, it needs to produce top quality results with no flaws the first time you run it, every time, no matter what. That’s not a realistic expectation of any code, much less of code that’s so reliant on correctly defining the features, outputs, and fitness function, and then gathering a well balanced set of learning data -- rather arcane tasks which are the responsibility of a mere human.

Reason #2: Rigid Behavior

Learning algorithms are designed to find the single best solution that they can. There are two problems with this: the words “single” and “best.” If there is only a single solution, then every time you hit a particular situation, the AI will take the same action. This results in behavior that is rigid, repetitive, and, in the worst case, exploitable. Furthermore, the learning algorithms try to find the “best,” which is to say optimal, solution. The absolute shortest path. The perfect, precise motion. The result looks robotic, which shouldn’t be surprising because it is robotic.

Real human behavior is almost never truly optimal, and humans almost never do exactly the same thing twice in a row. Humans are full of small imperfections, small variations -- and those are a huge part of making your character feel real, organic, and alive.

Could we build a learning algorithm that keeps a history and intentionally adds some variation? Or could we add small imperfections as a post-processing step? Probably, with sufficient expertise in the domain -- but it’s no longer a straightforward application of machine learning at this point. Either we have a much more complex learning problem, or we have this hacked on post-processing step that may need to be updated and adjusted each time we run the learning. Either way, it significantly increases the complexity of the problem. Remember what we said above -- the learning algorithm needs to be absolutely reliable. Complexity is the enemy of reliability in this case.
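As a rough illustration of the "keep a history and add variation" idea, here is a minimal sketch of a post-processing step. It assumes the learned AI hands back a score per candidate action. The class name VariedSelector, the softmax sampling, and the temperature and repeat_penalty parameters are all hypothetical choices for this example, not a recommended design.

```python
import math
import random
from collections import deque


class VariedSelector:
    """Picks among scored actions with randomness and a short memory."""

    def __init__(self, temperature=0.5, history_len=3, repeat_penalty=1.0, seed=None):
        self.temperature = temperature      # must be > 0; higher means more random
        self.repeat_penalty = repeat_penalty
        self.history = deque(maxlen=history_len)
        self.rng = random.Random(seed)

    def choose(self, scores):
        """scores: dict mapping action name -> learned score (higher is better)."""
        actions = list(scores)
        # Lower the score of each action once per recent use.
        adjusted = [scores[a] - self.repeat_penalty * self.history.count(a)
                    for a in actions]
        # Softmax weights; subtract the max for numerical stability.
        top = max(adjusted)
        weights = [math.exp((s - top) / self.temperature) for s in adjusted]
        choice = self.rng.choices(actions, weights=weights, k=1)[0]
        self.history.append(choice)
        return choice
```

Note that this adds three knobs (temperature, history length, repeat penalty) that did not come out of the learning process. Somebody has to tune them by hand, and possibly retune them after every relearn, which is the extra complexity described above.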

Reason #3: Monolithic, Immutable AI with a Lack of Creative Control

Learning algorithms tend to be black box solutions. That is, they give you some mapping from inputs to outputs. The job of the AI, then, is to sense the inputs, look up the appropriate set of outputs, and apply them. It is generally impossible to make any changes to that mapping -- or, in many cases, even to examine the mapping and understand why it is the way it is. If you want to change the AI, the only way to do so is to change the input parameters (i.e. the features, outputs, fitness function, and learning data), and then run the algorithm again.

There are a few implications to this. First and foremost, it takes creative control away from the designers. If they want the characters to exhibit some particular sort of behavior -- perhaps something to make them realistic in some way, or something that gives them some personality -- the only way to do that is to write a fitness function for it. As I said, what is the fitness function for “cool?” Even if it’s possible to encode that, it’s going to take an expert -- which means that the designers are absolutely reliant on experts to do their tuning for them.

Second, the result is monolithic. It is impossible to change just one part of it. If there’s a localized bug -- perhaps a special case where the behavior isn’t quite right -- there’s no easy way to fix that single issue in isolation. Sure, you can hack in a hand-coded fix that handles that one case, but that isn’t very extensible, nor is it likely to hold up well the next time that you relearn the AI. At the end of the day, if you make too many of those hand-coded changes you might as well just have written your whole AI by hand to start with.

Finally, every time you make any change at all -- even a tiny little one -- you have to retest the entire game. Even if the change is not to the AI itself, if it could impact the AI then you need to generate a new data set, relearn, and retest. This is a huge task which makes QA essentially impossible.

Reason #4: Nobody’s Ever Done It

Back in 2002, at the AI Roundtable at GDC, we took a survey of the room to see what people thought was going to be the most exciting upcoming technology over the next 5 to 10 years. This was back when most people were using non-hierarchical FSMs or simple scripting, we were getting at most 10% of the CPU, and most games didn’t even have a dedicated AI programmer (to say nothing of a full AI team). Almost unanimously -- there were a few dissenters, including myself -- people said “machine learning.” It was the bright new technology that was going to change everything for us.

Last year, again at the AI Roundtable at GDC, somebody said “well, what about using machine learning to create your AI?” The whole room groaned -- and then we proceeded to have a discussion much like this blog post.

The point is, all that hope and excitement has worn off. After 7+ years of trying, we still have no good examples of learned AI. Sure, there’ve been a few niche examples of learning in games. Creatures. Black & White. More recently, Galactic Arms Race. But in all of those cases it was a specialized use of machine learning -- it wasn’t the core algorithm.

It’s not for lack of trying. Academia has attempted it. Game companies have attempted it and then abandoned it before shipping. There is even a middleware company (AiLive), populated mostly by PhDs -- very smart folks -- dedicated to building machine learning techniques for games. They have had some success with motion recognition, particularly on the Wiimote, but I don’t know of any games that shipped with their learning approach as the core AI.

Reason #5: It’s Not Easier... So What’s The Point?

The whole point of machine learning is to save effort and cut costs -- and yet, a recurring theme through all of the above is “even if you could do it, it would be tricky.” If it’s cheaper and easier to build the AI by hand, then that pretty much takes the wind out of the sails on machine learning. In my experience, given all the challenges, it is absolutely cheaper and easier to build the AI by hand. And until that changes -- until somebody not only manages to pull off a learned AI, but also demonstrates that it actually saves development time and money -- machine learning is going to be dead in the water.

Posted by kdill4 at November 4, 2010 02:09 PM

This. My final year CS project was to work with a couple of other guys and build a machine learning AI given a stack of player data for a major online RTS (can't say more I'm afraid).

As the group AI geek I said right out at the first meeting: "this is not going to be easy, and in fact to do what they have in their heads is impossible"

Nevertheless, we did some good research, got some really interesting results from analyzing the data, applying some naive techniques (like K-means clustering) to discover high-level strategic decisions. We wrote a 60 000 word paper on why "this is not easy," and still when we presented our paper, one of the lecturers said "so wait, you never actually made an AI?"

Posted by: sconzey at November 4, 2010 03:07 PM

I give this article 5 stars out of 5.

Thanks for posting this, Kevin ... This will be very useful to link to whenever I get into discussions about ML in games in the future.

Posted by: Paul T at November 6, 2010 03:45 PM

I think one more remaining big issue is how to connect the learned AI with the game's balance, which is sometimes more a matter of perception than of sheer balance.

Posted by: Raul Aliaga at November 8, 2010 04:00 PM

I don't disagree with many points raised in this post. However I must take up point 2 about rigid behaviour.

One of the main disciplines in a machine learning approach is to avoid "overfitting" by finding potentially sub-optimal but more general solutions on the set of training data.

Also 3 is debatable, as some machine learning techniques output symbolic solutions rather than "sub-symbolic" ones, which you would be able to go and modify afterwards.

However I tend to agree that doing full agents from learning would be running before you could walk, or even crawl. There are plenty of places to be able to use machine learning within game AI though.

Posted by: Iain at November 15, 2010 07:46 AM