Deep learning for chess

157 points by mlla 11 years ago | 73 comments

thomasahle 11 years ago |

As the author of sunfish, I'd like to point out something about learning in chess, which is a topic that interests me a great deal.

When sunfish (with just 111 lines of Python chess logic) 'evaluates' a position, it uses perhaps the simplest known effective method: a piece-square table. The method ignores any interplay between pieces on the board and calculates the sum of Table[coord][type] for each piece on the board. E.g. a white knight on a1 may be worth 73 'units', and on f3 it may be worth 98 'units'. That's all there is to it. Any program that evaluates more precisely than this, and searches equally precisely, should be able to beat sunfish.
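
A minimal sketch of the piece-square-table idea described above, in Python. Only the two knight values (73 on a1, 98 on f3) come from the comment; the flat default values, the names `TABLE`, `square_index` and `evaluate`, and the `(piece, square)` input format are assumptions made for illustration, not sunfish's actual code or numbers. The table is indexed as `TABLE[type][square]` here, which is equivalent to the Table[coord][type] lookup mentioned above.

```python
# Piece-square-table evaluation sketch (illustrative values only).

FILES = "abcdefgh"

def square_index(name):
    """Map a square name such as 'f3' to an index 0..63 (a1 = 0, h8 = 63)."""
    return (int(name[1]) - 1) * 8 + FILES.index(name[0])

# One 64-entry table per piece type. The flat defaults are placeholders.
TABLE = {
    "N": [80] * 64,  # knight
    "P": [20] * 64,  # pawn
}
# The two values quoted in the comment above.
TABLE["N"][square_index("a1")] = 73
TABLE["N"][square_index("f3")] = 98

def evaluate(pieces):
    """Sum the table entry for every (piece_type, square_name) pair.

    No interaction between pieces is considered: each piece contributes
    a value that depends only on its type and its square.
    """
    return sum(TABLE[piece][square_index(sq)] for piece, sq in pieces)
```

For example, `evaluate([("N", "f3"), ("P", "e4")])` adds the knight's f3 entry to the pawn's e4 entry and nothing else.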

The above may sound naive - and it is - but most of the advanced ideas used in chess evaluation functions can be generalized from this method. "Rook connection" is just a measure that includes two pieces instead of one, and "pawn shield" is the generalization to three pieces. Experiments with grandmasters reveal they "recall" positions in 'chunks' of connected pieces, and this memory is what they use to guide their search. (See papers like 'Perception in chess' and lots of newer research.)

So, the role of machine learning in modern engines is to tune the parameters for evaluation and search pruning (deciding what positions are worth examining deeper). For the actual decision of which piece to move to where, you still need search algorithms to crunch millions of positions per second.

erikbern 11 years ago | |

Sunfish is really impressive work. From my (brief) understanding of Sunfish, the evaluation function is essentially equivalent to a hardcoded 1 layer network in Deep Pink.

You're right that, everything else being equal, a better evaluation function should lead to a better chess engine. However, in practice I think a better evaluation function means a slower evaluation function, so there's a really interesting trade-off there. I doubt humans evaluate more than a few thousand positions, so it seems like a slow but more accurate evaluation function could play chess pretty well.

thomasahle 11 years ago | | |

One interesting line of research, I think, is using 1- or 2-layer networks to 'simulate' more complex evaluation functions. If you could train such a network to get within a 10% error of Stockfish's evaluation, then you might be able to distil that network into a faster evaluator to plug back into Stockfish for an even stronger engine. As you say, one hard problem is probably finding actually interesting positions to sample for the training.

Anyhow, it's fun to see how engines like these battle it out. It may also be that your approach can yield a more 'fun to play' engine for us mortals.

tonetheman 11 years ago | |

Thanks for sunfish, great work!

halfcat 11 years ago |

This has been tried many times before, with better-but-still-lackluster results. Sunfish is impressive because it's written in Python and in a tiny number of lines while still being readable. I LOVE Sunfish, but it is among the weakest chess engines in existence. That deep learning could not break even against Sunfish seems rather unimpressive.

The author seems to have a not-very-deep understanding of computer chess. Some examples:

>Better search algorithm. I’m currently using Negamax with alpha-beta pruning, whereas Sunfish uses MTD-f

MTD-f is not better, just a different way to accomplish more-or-less the same thing. It homes in on the minimax value with a series of zero-window alpha-beta searches, much like a binary search over the score. In fact, naively switching to MTD-f will probably result in worse playing ability. It takes some time to get it tuned right, and even then it is not objectively better.
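
To make the "same thing, different route" point concrete, here is a rough Python sketch that runs plain negamax with alpha-beta pruning and an MTD-f driver on a toy tree and arrives at the same value. Everything here is an assumption for illustration: the nested-list tree format, the function names, and the integer leaf scores are hypothetical, and this is neither Deep Pink's nor sunfish's code. A real MTD-f also depends on a transposition table to avoid repeating work across passes, which this sketch leaves out.

```python
INF = float("inf")

def negamax(node, alpha, beta):
    """Fail-soft negamax with alpha-beta pruning.

    A node is either an int (leaf score from the point of view of the
    side to move at that leaf) or a list of child nodes.
    """
    if isinstance(node, int):
        return node
    best = -INF
    for child in node:
        score = -negamax(child, -beta, -alpha)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:
            break  # cutoff: the opponent will avoid this line
    return best

def mtdf(root, first_guess=0):
    """Converge on the minimax value using zero-window searches only."""
    g = first_guess
    lower, upper = -INF, INF
    while lower < upper:
        beta = g + 1 if g == lower else g
        g = negamax(root, beta - 1, beta)  # zero-width window
        if g < beta:
            upper = g  # search failed low: g is an upper bound
        else:
            lower = g  # search failed high: g is a lower bound
    return g

# Toy two-ply tree: the root player picks a branch, the opponent picks a leaf.
TREE = [[3, 5], [2, 9], [7, 1]]
```

On `TREE`, both `negamax(TREE, -INF, INF)` and `mtdf(TREE)` return the same minimax value; the difference lies only in how many nodes each visits, which is where the tuning mentioned above comes in.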

>Better evaluation function...By generating “harder” training examples (ideally fed from mistakes it made) it should learn a better model

This is what every beginning chess programmer on the Computer Chess Club message boards and rec.games.chess.computer has wanted to try for the last 20+ years. It has been empirically demonstrated that for best results, the evaluation function should remain simple and fast. Improving evaluation rarely fixes "dumb mistakes". That's what search is for. Efficient search makes up for a multitude of evaluation mistakes.

>Faster evaluation function: It might be possible to train a smaller (but maybe deeper) version of the same neural network

If the evaluation function was reduced to literally take zero time to execute, it would not help significantly. It's a linear improvement being thrown at an exponential problem.

I would LOVE if there was a new approach to computer chess, but the current "smart brute force" approach is so far advanced and successful, it is hard to imagine another approach being competitive.

maaaats 11 years ago |

> Still, even an amateur player probably makes near-optimal moves for most time.

This is far from true. An AI that only looks one or two moves ahead to make sure it doesn't hang a piece or allow mate in one would beat many amateur players (at least with ~5min time control). That's essentially what Sunfish, which he's comparing against, does. Note that Sunfish isn't a particularly "good" AI; it would be more interesting to compare against a "proper" chess AI.

maaaats 11 years ago | |

But with that said, I think the concept here is cool. The fact that it doesn't really know the rules of chess but can still play is interesting. I just think that it maybe should have selected a different database for its games: a master database instead of one filled with amateur games. Of course, there are far more amateur games, so such a master database is much smaller, which may be a problem.

bainsfather 11 years ago |

It would be nice to know how strong the program was. What was its grade?

Saying it can beat a terrible player doesn't mean much.

Saying it can sometimes beat Sunfish (a Python program with a grade of maybe Elo 1100, i.e. not at all strong), when it has a time advantage, is not impressive.

I'd really like to know how much better (if at all) the evaluation function is - e.g. can the program beat itself, if one side uses a 'standard' evaluation function?

Machine Learning is big on measuring outcomes. It is odd that the one outcome that is important here is not measured!

Some caveats: I realise this is someone's hobby project - I do not mean to rubbish it. I'm just saying that the work and writeup could have been much improved by adding this information.

wodenokoto 11 years ago | |

The problem here is very much effectiveness per time unit, as the author writes. It would probably need to be re-implemented in optimized C code in order to truly test it against other optimized engines.

I think the takeaway here is how relatively easy it is to make a decent AI for a complex game using neural networks.

bainsfather 11 years ago | | |

I do not think that is correct. With chess, you face exponential branching that quickly overwhelms your language-dependent speedup.

Is your Python code e.g. 10 times slower than C code? With a branching factor of e.g. 10, that would mean you only search to depth 9 ply whilst C would search to 10 ply [a]. You are not losing that much! A ballpark Elo grade would not vary qualitatively between Python and C.

[a] With alpha-beta search, maybe it is 8 ply vs 10 ply. Also the branching factor is larger than 10. But the point remains valid.
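
The arithmetic behind this can be sketched in a few lines of Python. The function name `plies_lost` is hypothetical, and the model is a simplifying assumption: node count grows as branching_factor ** depth, so a constant-factor slowdown costs a logarithmic number of plies. For the alpha-beta case, the sketch uses the textbook best-case estimate of an effective branching factor of about sqrt(b).

```python
import math

def plies_lost(slowdown, branching_factor):
    """Search depth (in plies) given up by an engine `slowdown` times slower.

    Assumes the number of nodes searched grows as branching_factor ** depth,
    so a fixed time budget buys log(budget) / log(branching_factor) plies.
    """
    return math.log(slowdown) / math.log(branching_factor)

# Plain minimax with branching factor 10: a 10x slowdown costs one ply.
one_ply = plies_lost(10, 10)

# Alpha-beta in the best case: effective branching factor ~ sqrt(10),
# so the same slowdown costs about two plies (the "8 ply vs 10 ply" case).
two_plies = plies_lost(10, math.sqrt(10))

# A larger branching factor (35 is an assumed, commonly quoted figure for
# chess) makes the loss smaller still.
chess_like = plies_lost(10, 35)
```

Under these assumptions the loss stays between roughly half a ply and two plies, which is the footnote's point.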

ogrisel 11 years ago | | |

It's using theano that generates CUDA code. Before reimplementing I would start with profiling the generated code to spot the main computational bottlenecks.

no_gravity 11 years ago |

A little nitpicking: Every time the author writes "infinite", a more accurate word would be "enough". For example:

    if you had infinite computing capacity,
    you could actually solve chess.

The statement is correct. But you do not necessarily need infinite capacity to solve chess. Just enough capacity.

Would be interesting to estimate how much capacity.

_fizz_buzz_ 11 years ago | |

I think "infinite" was used in the engineering-sense of the word and not in the stricter mathematical-sense. For me, as an electrical engineer, infinity is often somewhere beyond 1 ms.

howeman 11 years ago | |

People write that chess has ~ 10^42 board positions. If you read the paper about solving checkers (which you should because it's excellent), they evaluated about sqrt(N) positions, and the final solution took cuberoot(N) board positions. IF that extrapolates to chess, it means that the weak solution would take 10^21 evaluations and 10^14 storage. That's big, but in the realm of feasibility.
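
A quick Python check of that extrapolation. The 10^42 figure is the one quoted above; the evaluation rate of 10^9 positions per second and the machine counts are purely assumed numbers to give a feel for scale, and `years_to_evaluate` is a hypothetical helper name.

```python
N_EXPONENT = 42  # chess positions ~ 10**42, as quoted in the comment above

# sqrt(N) and cuberoot(N) just halve and third the exponent.
search_exponent = N_EXPONENT / 2   # evaluations needed: 10**21
storage_exponent = N_EXPONENT / 3  # positions stored:   10**14

SECONDS_PER_YEAR = 365.25 * 24 * 3600

def years_to_evaluate(exponent, evals_per_second, machines=1):
    """Years needed to evaluate 10**exponent positions at the given rate."""
    total_rate = evals_per_second * machines
    return 10 ** exponent / total_rate / SECONDS_PER_YEAR

# Assumed rate: 1e9 evaluations per second per machine.
single_machine_years = years_to_evaluate(search_exponent, 1e9)
million_machine_years = years_to_evaluate(search_exponent, 1e9, machines=10**6)
```

With these assumed rates, one machine would need tens of thousands of years, while a million machines would need a couple of weeks, so "big, but in the realm of feasibility" holds only if the checkers scaling really carries over.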

maaku 11 years ago | |

More than you would have if the entire universe were transformed into computonium.

So effectively infinite.

no_gravity 11 years ago | | |

Not sure the universe is incapable of doing this.

The number of states a chess square can have is 13. It's either empty or has one of the 6 different pieces in black or white on it.

So 13^64 is an upper limit for the number of positions.

We could solve chess if we could put these 13^64 positions into a tree, right?

13^64 ≈ 10^71.

The number of atoms in the observable universe is estimated to be 10^80.

So even the observable universe might be big enough to form this tree. Even if we use big clunky objects like atoms.
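
The numbers above are easy to check in Python, which handles integers of this size exactly. The variable names are illustrative, and 10^80 is the atom estimate quoted above, not a precise figure.

```python
import math

SQUARES = 64
STATES_PER_SQUARE = 13  # empty, or one of 6 piece types in 2 colours

# Crude upper bound: every square independently takes any of the 13 states.
upper_bound = STATES_PER_SQUARE ** SQUARES

# Express the bound as a power of ten.
exponent = math.log10(upper_bound)

ATOMS_IN_OBSERVABLE_UNIVERSE = 10 ** 80  # estimate quoted in the comment

# How many atoms would be available per counted position.
atoms_per_position = ATOMS_IN_OBSERVABLE_UNIVERSE // upper_bound
```

The exponent comes out slightly above 71, so the bound is a bit under 2 × 10^71, leaving several hundred million atoms per position under this very loose count.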

We do not have any idea of the size of the unobservable universe.

And I don't know how many states an atom can have. Who knows, maybe a single atom can solve chess if it's programmed correctly? According to quantum theory, pretty small objects like electrons can store and process an amazing (infinite?) amount of information in certain ways.

sytelus 11 years ago |

It's surprising that it is able to win a third of the time. The problem here is that the input does not lie in any continuous space. I mean, you may have 1 billion board states in your training set, but is it possible to interpolate the values of other states from them? For example, for a vector representing a certain board state, even a slight change may lead to a completely different outcome. I would think most learning methods, including deep learning, excel when there is some sort of interpolatable continuity in the inputs on which prediction is desired. The challenge, therefore, would be to transform the discontinuous space of board states into a more continuous one.

phreeza 11 years ago | |

Doesn't this hold true for other domains where deep learning seems to be successful, for example natural language processing? My intuition would be that initial layers of a deep network actually learn smooth-ish representations of the input space.

V-2 11 years ago | |

My thoughts exactly. As an exercise it's surely interesting, but a neural network approach is inherently unsuitable for chess.

Chess requires 100% accuracy, and just because two positions are similar doesn't mean that the best moves in those positions are in any way similar too.

On the other hand, it sort of mimics the way a human player thinks, in terms of recognizing certain patterns. After all, even grandmasters do not brute-force their way through all possible combinations. We use a hybrid approach: recognize certain strategic patterns first (to drastically narrow down the search tree), and perform calculations on top of that.

Chess engines can wipe the floor with any player where tactics are involved; the trick to beating a computer is to close the game and take advantage of the fact that it's not able to formulate a long-term PLAN (whose consequences are beyond its horizon).

See how Nakamura repeatedly beat Rybka in blitz games a few years ago, e.g. http://www.chessgames.com/perl/chessgame?gid=1497429 - very instructive :)

sushirain 11 years ago | | |

> just because positions are similar, doesn't mean that best moves in these positions have to be in any way similar too.

To alleviate this, one can add more abstract/heuristic information about the position to the input (indicators for complex relations between several pieces). This kind of high-dimensional vector would be more robust to small changes, and would make the objective function smoother. Perhaps the non-linearities introduced by the three layers cannot do this as effectively.

leeber 11 years ago |

I wrote a program to solve chess once. After I realized that it would take a massive amount of computing resources to finish in my lifetime, I abandoned the project.

Most interesting to me is that it really isn't that hard to create a program to solve chess (i.e. the logic behind it), it just would take too much time/money to actually do it.

It's much more difficult to create AIs and approximations like this.

Kinda weird once you realize that fact...approximating a solution to chess is much more difficult, logic wise, than actually solving chess.

Though I wouldn't be surprised if chess is solved in the next couple decades or so.

tmalsburg2 11 years ago |

I really like this work but the performance of the network against Sunfish is not particularly informative. What I'd like to know is whether this evaluation function captures any non-trivial properties of the board. If it only captures simple heuristics such as "more pieces are better," that's not very interesting. I think it would be worth trying to find out what is actually captured in the network.

If the evaluation function is really smart, i.e. capturing non-trivial properties of the position, it could guide a much more focused and thus more efficient search. This is basically what humans do. That, however, would require a modified version of the network that has a continuous output value telling how promising a position is compared to the alternatives.

If the evaluation function doesn't play well, that may be really interesting, too, if the mistakes are psychologically plausible. It seems at least possible that this is the case because the network was trained on data sets containing human errors. In general, I think the value of this approach lies in the potential for investigating and replicating human performance rather than developing a stronger chess engine. The problem of playing strong is pretty much solved. What's more interesting now is to develop chess engines that play badly but in psychologically plausible ways.

sehugg 11 years ago | |

Well, the problem of playing strong is pretty well solved if you have lots of computing resources. The problem of playing very strongly on a battery-powered mobile device (for example) is not yet solved. This is where an insanely-accurate evaluation function would come in handy.

tmalsburg2 11 years ago | | |

Ok, agreed. However, I'd guess that engines like Stockfish and Shredder beat most of the chess playing population even when running on an iPhone. Advanced players may not be impressed as they are familiar with methods specifically developed to beat engines but that's not relevant for the majority of players. These people want engines that play at their level without making ridiculous artificial mistakes. Playing against current engines is completely pointless for beginning and intermediate players. Much progress could be made there.

jeremysalwen 11 years ago |

It seems like you could get some improvements by simply training the evaluation function on the output of the entire Deep Pink system including the negamax search.

This would be a very easy way of getting more training data, and is actually very nice theoretically. Assuming enough training time and a complex enough evaluation function, etc, you'd eventually solve chess.

I may check out the code and try this myself...

svantana 11 years ago |

Interesting work; however, training on data seems unnecessary. Chess would be perfect for unsupervised learning - initially it could be trained against an existing chess program, but as the models improve, they could start competing against each other. Although one would probably need some way of scoring any given board position (compare with DeepMind's Atari playing).

kylebrown 11 years ago | |

If you input a score ("{-1,0,1} on final positions") it's effectively a label, which makes the training supervised rather than unsupervised. See [1] for good reasons to be skeptical of unsupervised learning in general.

See [2] for a twist on the DeepMind Atari player. They use Monte Carlo Tree Search (MCTS of automated Go playing fame) to generate training data. By feeding that more carefully generated gameplay data into the deep q-learning net, they exceed DeepMind's (non-MCTS-coupled) performance.

1. http://karpathy.github.io/2014/07/03/feature-learning-escapa...

2. http://www-personal.umich.edu/~rickl/pubs/guo-singh-lee-lewi...

tmmm 11 years ago | |

Chess's too big for this.

toolslive 11 years ago |

The Houdini (and I think Rybka too) evaluation function is tweaked by letting the engine play zillions of micro games against itself in a tournament. One such micro game lasts a few seconds, and each of the players has a different setting of the parameters for the evaluation function (material, position, ...). You could apply the same meta strategy here.

tkirby 11 years ago | |

Stockfish has a distributed network to allow anyone to donate computer time to test new patches. Currently running nearly 400 games/minute.

http://tests.stockfishchess.org/tests

mattxxx 11 years ago |

Super cool; training game players is super appealing to me as a math-software-engineer-person.

My babble: if DeepPink can gauge its uncertainty on a move, it'd be cool to see a hybrid system in action. Plus, "DeepFish" has a cool name.

Either way - nice! And thanks for putting the source on GitHub; I will have a goof with it!

jfoster 11 years ago |

A naive brute force of Chess wouldn't end. Consider the case where both players make moves that perpetuate the game rather than working toward an ending.

jefffoster 11 years ago | |

Chess has a 50-move draw rule (http://en.wikipedia.org/wiki/Fifty-move_rule) that prevents this happening.

skc 11 years ago |

Silly question, why does the author only approximate the number of possible positions in chess?

tromp 11 years ago | |

It's not like Go where you can actually count the number of reachable positions.

Deciding whether a single chess position is reachable can be really hard. There is a whole class of so called "retrograde" chess problems focussed on that.

For example, is the following position reachable?

White: Kc3 Ba4 Black: Kd1 Rb5 Bd5

joelthelion 11 years ago | |

Because the real number is unknown?

mooneater 11 years ago |

Loved it, and very impressed with how concise the source code is.

yangzx 11 years ago |

A neat and elegant neural architecture for learning chess.

heroku 11 years ago |

Chess AI is a myth, bruteforce is the only way to win.

kylebrown 11 years ago | |

How do you know that humans don't use intelligent bruteforce approximations to win?

anonfunction 11 years ago |

Site is down. I highly recommend using CloudFlare or another caching solution to avoid server overload in times of high traffic.