Top Chess Engine Championship(tcec-chess.com) |
They are currently playing the lowest league, the qualification league. There will be several leagues culminating in a superfinal: 100 matches between the two best engines.
It will be exciting to see how the conventional tree search, neural network and NNUE engines compare this year.
As in, real time to the computation? Why though.
The SC2 bots don't compete in real time, many of them can't, most of those that can are clearly hampered when playing against a human (necessarily real time) versus a machine (not real time), but you just watch the matches as if they'd happened in real time after the game is finished.
Why not just have a fixed pace unrelated to how long the moves "really" took to calculate, with a (no longer real time) clock shown to indicate how the machines used their allotted time?
Why not? What have SC2 bots (or their abilities) to do with chess bots?
> Why not just have a fixed pace unrelated to how long the moves "really" took to calculate, with a (no longer real time) clock shown to indicate how the machines used their allotted time?
You can't show moves that haven't been played yet, so either your pace is much too slow, or you have to wait until the game is finished before displaying it. And it still wouldn't make sense to display the moves at a fixed pace when the pace (and time pressure) is in fact part of the game (see also the recent human world championship, where blunders were often played just before the 40th move because of time pressure).
But if you're really interested in the TCEC as a competitive sporting event, you might prefer to watch the drama in real time. How many people record the Superbowl and play it back at 100% speed the next day?
As for the SC2 bots, it sounds like they aren't very good. Chess bots, on the other hand, are, and these ones are playing a standard classical time control that is often used by human players. Time management is a super important part of this (indeed there are often separate neural networks that are used to determine how much time should be spent thinking about a given move). The game inherently takes place live over a given window of time, and so long as that's true, why not stream it live too?
Basically, you're asking why they're not playing blitz instead of classical - it's just fundamentally a different format.
From a coding perspective - I think it could get really interesting to see people try to make super small engines. I've seen a couple and am blown away that a pretty small program w/o an ML model can destroy me at chess.
It'd be cool to see how good super simple programs could get.
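For a sense of how little code the core of a "conventional tree search" engine needs, here is a minimal sketch of negamax with alpha-beta pruning. It is not taken from any real engine: the game tree is a hypothetical nested list, where a number is a leaf score from the point of view of the side to move at that leaf. A real chess engine would add move generation and an evaluation function on top of this.

```python
import math

def negamax(node, alpha=-math.inf, beta=math.inf):
    """Best score for the side to move at `node` (toy tree, not a chess position)."""
    # Assumption: a leaf is a plain number, an inner node is a list of children.
    if not isinstance(node, list):
        return node
    best = -math.inf
    for child in node:
        # The opponent's best score, negated, is our score for this move.
        best = max(best, -negamax(child, -beta, -alpha))
        alpha = max(alpha, best)
        if alpha >= beta:
            break  # the opponent already has a better option elsewhere; prune
    return best

# Two moves for us, each answered by two replies from the opponent.
toy_tree = [[3, 5], [2, 9]]
print(negamax(toy_tree))
```

On this toy tree the second reply in the second branch is never examined, which is the whole point of the pruning.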
We know supercomputers can outplay any human; nowadays any smartphone can. But what about a 286? An 8080?
This would be a whole lot more engaging with more explanation, at least hover tips over all the fields and so forth.
Generally, their audience is "people who play chess" - and those people can be assumed to know this. If you don't play chess, watching a broadcast on Twitch or YouTube will be more enjoyable.
[1] How AlphaZero Completely CRUSHED Stockfish https://www.youtube.com/watch?v=8dT6CR9_6l4
LC0 is the open source implementation of AlphaZero's approach: Monte Carlo tree search plus neural nets for static evaluation.
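To make the "tree search + neural net" combination a bit more concrete, here is a rough sketch of the PUCT selection rule described in the AlphaZero paper, which decides which child of a node to explore next. This is not LC0's code: the dictionary layout, the field names and the `c_puct` value are all assumptions for illustration. `prior` stands in for the policy output of the network, and `value_sum / visits` for the averaged value estimates.

```python
import math

def puct_select(children, c_puct=1.5):
    """Pick the child maximizing Q + U (AlphaZero-style PUCT; illustrative only)."""
    total_visits = sum(c["visits"] for c in children)

    def score(c):
        # Q: mean value seen so far for this child (0.0 if never visited).
        q = c["value_sum"] / c["visits"] if c["visits"] else 0.0
        # U: exploration bonus, large for high-prior, rarely visited children.
        u = c_puct * c["prior"] * math.sqrt(total_visits) / (1 + c["visits"])
        return q + u

    return max(children, key=score)

# Hypothetical node with two candidate moves.
children = [
    {"move": "e2e4", "prior": 0.6, "visits": 10, "value_sum": 5.0},
    {"move": "d2d4", "prior": 0.4, "visits": 0, "value_sum": 0.0},
]
print(puct_select(children)["move"])
```

In this example the unvisited move wins the selection despite its lower prior, because its exploration bonus is not yet divided down by visits.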
Do both chess engines have equal machine power, cores, IOPS, CPU model, RAM, etc?
They're showing a game with an eval bar. How does it evaluate the position, don't you need a chess engine to evaluate and provide a score in the first place? Perhaps we're seeing two chess engines play against each other with a third one evaluating for the viewers? Is it the average evaluation between the two engines?
Yes, this has been something engines have started to optimize. Some engines that don't do this well can be beaten by "flagging" (i.e. making pointless moves rapidly) once they run low on time.
> Do both chess engines have equal machine power, cores, IOPS, CPU model, RAM, etc?
Last I remember, the goal was to give engines roughly equal computational capacity. But it would be tailored to the engine, e.g. AlphaZero getting something with more GPU(s) and Stockfish getting (at the time) more CPU(s) - at least for the final matches.
> They're showing a game with an eval bar
For me, it shows the current evaluation of both engines (numerically), as well as a graph with up to four engine evaluations (the two contestants + up to two "commentators").
Time management is based on some heuristics built into the engine. Some positions are more dynamic than others, and engines have rules for recognizing positions that have settled down.
As for the hardware, if two CPU engines are playing they have exactly the same resources. The problem arises when one engine is GPU based and the other is CPU based. In these situations balancing the compute power is hard, but even then they normally have the same time allocated.
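As a rough illustration of what such a heuristic might look like (a sketch under made-up assumptions, not how any particular engine does it): split the remaining clock over an assumed number of moves still to come, add most of the increment, and spend more when the position looks unstable. The function name, the 30-move horizon, the 0.8 factor and the `instability` input are all hypothetical.

```python
def allocate_time(remaining_s, increment_s, moves_to_go=None, instability=0.0):
    """Seconds to spend on the next move (hypothetical heuristic)."""
    # Assumption: if the time control gives no move count, plan for 30 more moves.
    horizon = moves_to_go if moves_to_go else 30
    base = remaining_s / horizon + 0.8 * increment_s
    # instability in [0, 1]: 0 = settled position, 1 = very dynamic one.
    clamped = min(max(instability, 0.0), 1.0)
    scaled = base * (1.0 + clamped)
    # Never burn more than half the remaining clock on a single move.
    return min(scaled, 0.5 * remaining_s)

print(allocate_time(600, 0))                   # quiet position
print(allocate_time(600, 0, instability=1.0))  # sharp position gets double
```

The cap on the last line is what guards against running the clock down and getting flagged.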
Games are played on the same computer, with only one engine analyzing at a time.
The graph shows each engine's evaluation of the position for each move in the game. The red line is from a 3rd engine "observing" the game.
In a constrained environment like microcontrollers, the smallness of the space of all possible programs makes it faster for finding a good one. There is one catch, though: It may fail in unexpected ways.
It's a sixty minute football game. So to me 100% is sixty minutes. Whereas when broadcast it's a four hour TV show.
This makes sense for human players, but the chess engines aren't human, so their matches can proceed in parallel. Whereupon a human audience isn't actually watching them in real time anyway. AIUI the engines are not learning during tournament play, so unlike a human (who may discover an opponent's weaknesses during play over the course of a tournament) they're only getting updates at specific points between play.
"That's not how humans do it" is a pretty weak excuse even if not for the fact that TCEC has a completely inhuman design. Humans don't start from positions chosen more or less arbitrarily to reduce the number of draws, whereas TCEC does.
Uhm, yes they are.
TCEC can't afford to run more games in parallel. And if they did, they'd use all that processing power to run one game instead. The point is to produce the highest quality chess possible.
Why would you delay showing the moves that happen when you can just... not do it? You can always go through the games at your own pace afterwards anyway.
And if they wanted to they could show multiple matches in parallel. After all that's what happens in the case of a tournament such as the recent Tata Steel tournament. I assume that the issue is that they want to maximize resources for the engines and for that you need to run at most a few engines (and therefore matches) in parallel.
If I read things correctly it's all running on a single server with "only" 96GiB of RAM per engine (https://wiki.chessdom.org/TCEC_Season_Further_information#TC...); running more matches in parallel would be detrimental to the quality of the chess played.
The framework being used was developed for Google DeepMind's AlphaStar, which is a learning AI approach although obviously very different from their approach to chess.
But today the framework is used by rules-based bots largely competing against each other. This means unlike AlphaStar, which set out specifically with a human-like approach to beat excellent human players, the amateur bots are entirely focused on winning versus other bots by any means necessary. The most successful tend to have sprawling multi-theatre conflict as their end game, maximum army size, and a half dozen or more different small skirmishes happening at once, hard for the human observer to be sure who is better until suddenly there's a decisive outcome. That wouldn't be compatible with AlphaStar's mission at all, obviously human players can't fight these battles with success.
Their most obvious defect is they don't resign. A competent human player resigns hopeless positions in SC2 knowing quickly that they have lost, but most bots will stay in the game until destroyed which would be very rude for a human. They can be indecisive, attacking then pulling back, then attacking again in seconds, and they are much more easily thrown off by unexpected situations than a human - but overall they're a match for a good human player unless that player has prepared specifically to exploit a known weakness of a particular bot. (e.g. there are bots that do not understand why an enemy Nydus Worm in your base can't be allowed to complete... since that basically never happens)
Just because your situation is even more limited doesn't mean that they don't also have limitations. That's just gatekeeping.
I appreciate that nobody has unlimited resources but, for example, the latest DeepMind paper posted on HN (on AlphaCode) had a couple dozen authors. By comparison I do all my work myself, with the help of my advisor, and so do his other PhD students. And all for a PhD student's stipend, when DeepMind researchers are paid at least at post-doc levels, presumably. There's no comparison in the resources that I and DeepMind can throw at a problem.
I mean, please, give me DeepMind's limitations. I'd be so happy!