Yet people will still tell you that worrying about AI taking over is like worrying about overpopulation on Mars, and that this is a problem at least 50 years out.
Edit: I do understand that the techniques used to implement Alphago can be used to implement other single-function solvers. That doesn't make it a general purpose strong AI.
People said for years that Go would never be beaten in our lifetime. They said this because Go has a massive search space. It can't be beaten by brute force search. It requires intelligence, the ability to learn and recognize patterns.
And it requires doing that at the level of a human. A brute force algorithm can beat humans by doing a stupid thing far faster than a human can. But a pattern recognition based system has to beat us by playing the same way we do. If humans can learn to recognize a specific board pattern, it also has to be able to learn that pattern. If humans can learn a certain strategy, it also has to be able to learn that strategy. All on its own, through pattern recognition.
And this leads to a far more general algorithm. The same basic algorithm that can play Go can also do machine vision, it can compose music, it can translate languages, or it could drive cars. Unlike the brute force method that only works on one specific task, the general method is, well, general. We are building artificial brains that are already learning to do complex tasks faster and better than humans. If that's not progress towards AGI, I don't know what is.
See this interview between Kurzweil and Minsky: https://www.youtube.com/watch?v=RZ3ahBm3dCk#action=share
We don't know that, actually. Maybe GAI isn't one shining simple algo, but cobbling together a bunch of algorithms like this one.
Despite my optimism, the writing is on the wall. AlphaGo and algorithms like it will only improve as you throw more CPU time at them. I actually want Lee Sedol to win, not because it would uphold some kind of human supremacy but because I want to see the AI guys put some more effort (and CPU time) into it. It would be a real shame if they'd win on their first attempt.
One way of looking at the significance of this is that it might tell us that relatively simple machine learning algorithms can capture key aspects of the versatile human cortical capacity to learn things like Go using sheer pattern recognition. (It's amazing that human visual cortex can do that.) If the human brain were more mystical in its power, then human-level ability to recognize Go patterns wouldn't have been penetrable at all to a comparatively simple neural algorithm like deep learning.
From another standpoint, this could show that we're reaching the point where, when an AI algorithm starts to reach interesting levels of performance, a much-encouraged Google dumps 100,000 tons of GPU time and 5 months of a few dozen researchers' time to improve the algorithm right past the human performance level. In N years from now when it's a more interesting AI system doing more interesting cognition, we could see a more interesting result materialize when a big organization, encouraged by some key milestone, invests 5 months of time and massively more computing power to further improve an AI system.
This "10 years to beat Go" prediction shows that our time estimates are wildly ignorant.
As hard as writing AI for a problem space like "how to win at Go" is, it's several orders of magnitude easier than creating a general AI with the self-awareness required to see us as a threat.
That's the difference between a neural net -- which until about an hour ago was a phrase that reliably set off my snake-oil detector -- and a simple tree search. DeepMind just beat a Go master... and nobody can say exactly how it did it.
That's a big deal.
We should be taking steps to outlaw black box algorithms right now but I'm sure we won't.
I for one did not say to not worry, and I bet $1500 vs. $2250 on Sedol winning the match before getting cold feet and arbing my outstanding bets down to $400 vs. $700.
Infinity is a real problem. When you try to learn from examples, you first need to see "enough" examples of whatever you're trying to learn. If there are infinitely many such examples, no matter how clever you are in tackling your search space there will always be infinitely many examples of infinitely many situations you've never come across, and that you won't be able to learn.
The typical example of this is language. You could give a learner all phrases of a given language ever produced and it would still be missing an infinite amount of necessary examples. Somehow (and it's freaky when you stop to think about it) humans get around this and we can produce and understand parts of infinity, without sweating it.
Machine learning is simply incapable of generalising like that and anyone who thinks AGI is just around the corner has just failed to consider what "general" really, really means.
Though to be fair, now that I had my little rant I have to admit that you don't need to go "general intelligence" before you can be really, really dangerous. Even if AI doesn't "take over" it can do a lot of damage, frex if we start using autonomous weapon systems or hand over critical infrastructure maintenance to limited and inflexible mechanical intelligence.
Congratulations to the team at Deepmind, and I'm wishing good luck for Sedol in the remaining matches - if he wins we would certainly get to see a second series rematch some months down the line, and that would be very exciting for go fans everywhere.
there should be like a North American Go Nationals or something like that televised on twitch
Anyone putting money down on Sedol? He said it will be either 5-0 or 4-1 in his favor.
What handicap? There was no handicap?
To my untrained eye, AlphaGo was already way ahead by move 29 in the match tonight, with black having a weak group on the upper side, while black wasted a lot of moves on the right side as white kept pushing (Q13, Q12), which white erased later because those pushes were 4th line for black and the area was too big to control. Black never had a chance to recover from this bad fight. After those reductions and the invasion on the right side, white came back to the 3-3 at C17, which feels like it solidified the win.
Some people are asking what was the losing move for Lee Sedol? I wanted to joke and say "the first one.." but maybe R8 was too conservative being away from the urgent upper side where white started all the damage.
Incredible, and in my opinion a little terrifying.
We'll still always have Calvinball! https://xkcd.com/1002/
What surprises me about it is that connect four was only solved in 1995; that seems relatively late for a 6x7 grid with only 7 possible moves per turn.
Deep Blue:
Massive search +
Hand-coded search heuristics +
Hand-coded board position evaluation heuristics [1]
AlphaGo:
Search via simulations (Monte Carlo Tree Search) +
Learned search heuristics (policy networks) +
Learned patterns (value networks) [2]
Human strongholds seem to be our ability to learn search heuristics and complex patterns. We can perform some simulations but not nearly as extensively as what machines are capable of.
The reason Kasparov could hold his own against Deep Blue's 200,000,000-positions-per-second search during their first match was probably his much superior search heuristics, which let him focus drastically on better paths, and his better evaluation of complex positions. The patterns in chess, however, may not be complex enough for a better evaluation function to give much benefit. More importantly, its branching factor after applying heuristics is low enough that massive search yields a substantial advantage.
In Go, patterns are much more complex than in chess, with many simultaneous battlegrounds that can potentially be connected. Go's branching factor is also several times higher than chess's, rendering massive search without good guidance powerless. These in turn raise the value of learned patterns. Google stated that its learned policy networks are so strong “that raw neural networks (immediately, without any tree search at all) can defeat state-of-the-art Go programs that build enormous search trees”. This is equivalent to Kasparov using learned patterns to hold his own against massive search in Deep Blue (in their first match), and it is a key reason Go professionals can still beat other Go programs.
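For a rough sense of scale, here is a small sketch of how fast a full search tree grows. The branching factors (about 35 for chess, about 250 for Go) are commonly quoted ballpark figures that I am assuming here; they are not from the articles cited in this comment.

```python
import math

# Assumed ballpark branching factors (commonly quoted approximations, not exact).
CHESS_BRANCHING = 35
GO_BRANCHING = 250

def log10_tree_size(branching, depth):
    """log10 of the number of leaf positions in a full tree of the given depth."""
    return depth * math.log10(branching)

for depth in (2, 4, 8):
    chess = log10_tree_size(CHESS_BRANCHING, depth)
    go = log10_tree_size(GO_BRANCHING, depth)
    print(f"depth {depth}: chess ~10^{chess:.1f} leaves, go ~10^{go:.1f} leaves")
```

The gap widens with every extra ply, which is why unguided search runs out of steam in Go much sooner than in chess.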
AlphaGo demonstrates that combining algorithms that mimic human abilities with powerful machines can surpass expert humans in very complex tasks.
The big questions we should strive to answer before it is too late are:
1) What trump cards do humans still hold against computer algorithms and massively parallel machines?
2) What should we do when a few more breakthroughs have enabled machines to surpass us in all relevant tasks?
Note: It is not entirely clear from the IBM article that the search heuristics are hand-coded, but it seems likely given the prevalent AI techniques at the time.
[1] https://www.research.ibm.com/deepblue/meet/html/d.3.2.html
[2] http://googleresearch.blogspot.com/2016/01/alphago-mastering...
More generally, scientists using AI for research will probably have to do research on the research, to understand what the AI discoveries mean. Maybe they mean something we can't grasp at all, in which case they go completely over our heads, like ants trying to learn about the finer points of financial markets. We will probably have to learn new concepts and even new languages designed by the AI to convey the meaning.
Here's a very good article: http://dustycloud.org/blog/sussman-on-ai/ (A conversation with Sussman on AI and asynchronous programming)
Tl;dr: I, for one, welcome our robot overlords (so long as they don't behave like our robot overlords).
That is why we need to invest much more research effort in Friendly AI and trustworthy intelligent systems. People should consider contributing to MIRI (https://intelligence.org/) where Yudkowsky, who helped pioneer this line of research, works as a senior fellow.
Another way to look at it is how fast they were able to make it a lot better in a few months.
You're making the mistake of assuming anything about how the human brain learns.
Look- take Monte Carlo methods. You can sample a very big number of events and hope to get some useful information from that. If you sample infinity, though, what do you get? Infinity.
I can't understand the call to regulate a technology when it is decades (much more than 10-20 years) away from possibly existing in a state that we can't even imagine. Add to that legislators with zero technical literacy. That's essentially advocating for shutting down all research into a technology that can improve millions of lives.
Basically any algorithm where you are training it with tons of samples and then it creates a logic tree that is not written by a human is a black box algorithm.
By the time AI technology gets to the point that we realize we should regulate it we will be too late.
"We've beaten some hard problems more quickly than expected therefore we'll likely beat other hard problems more quickly than expected" is logical induction. It's equivalent to "I just flipped a coin and got heads. I'll probably get heads again on the next flip." Don't do that. :)
Do you really believe there's less than a 50% chance strong AI won't be invented in your lifetime?
I don't know. I don't really see the value in speculating.
Which is correct, if you don't know the true probability of flipping heads. You might find, for example, that it's a trick coin with two heads and no tails.
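To make the coin point concrete, here is a small sketch using Laplace's rule of succession. It assumes a uniform prior over the coin's unknown bias; that prior is my assumption, and a different prior gives different numbers.

```python
from fractions import Fraction

def prob_next_heads(heads, flips):
    """P(next flip is heads) after `heads` heads in `flips` flips,
    assuming a uniform Beta(1, 1) prior over the coin's bias."""
    return Fraction(heads + 1, flips + 2)

print(prob_next_heads(0, 0))    # no data yet
print(prob_next_heads(1, 1))    # one flip, one head
print(prob_next_heads(10, 10))  # ten heads in a row
```

With no data the estimate is 1/2, after a single head it is 2/3, and after ten heads in a row it is 11/12. Under this prior, one head is weak evidence but it is still evidence.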
You absolutely can predict future progress from past progress. E.g. Moore's law held true for decades after the observation was made. If you see a technology advancing rapidly, then there is no reason to say it will stop in the near future!
>I don't really see the value in speculating.
Because everything depends on this prediction. The invention of AI will be the most significant event in the history of humanity. It will totally change the world. Or likely, destroy it. Being prepared for it is absolutely necessary.
By your logic, you can predict the failure of AGI predictions by past failures of (every) AGI prediction.
There is all kinds of value in speculating. We do it all the time in things like war games. 'If neighbor $x attacked us, what would happen?, would they win?, what can we do to prevent this?'.
Of course this may in your mind hold very little value now, but I promise if and when it occurs you will change your mind quickly on that topic.
Beyond that, it seems reasonable to call for discussion about how AI is used [as he also does in that article].
[EDITED to add:] Oh, I see, your point is that it looks like AlphaGo is doing better than an "AI moves fast" advocate expected, rather than (per Teodolfo) no better than some "AI moves slowly" advocates expected. Fair enough.
What actually happened was the reverse of that; AI moved faster than I publicly bet a large sum of money on it moving, and I was already worried before then.
I'm not aware of the reverse-reverse having happened.
If you're claiming something as a successful advance prediction to bolster belief in a general model, it's fair play to ask for a record of that advance prediction.
Monte Carlo Tree Search was necessary and itself a massive improvement over minimax but not sufficient for creating a Go program to challenge professional players. The true innovation here is the neural networks. Without those networks to guide it AlphaGo plays far worse than existing programs.
The fact that those networks are sufficient is pretty incredible. We already knew that by inventing them we had created a very pure form of pattern recognition, but it's surprising that the pattern recognition coupled with some tree search seems to be all you need to play Go as well as humans.
It's not impressive to you that we've now reproduced a piece of human intelligence, "intuition", which was previously considered out of reach?
Is it really all we need? Or is it more that they threw a lot of hardware at it? What if part of its efficiency comes from throwing a lot of GPUs at a huge network, rather than from the NN being efficient by itself?
We see that: "AlphaGos Elo when it beat Fan Hui was 3140 using 1202 CPUs and 176 GPUs. Lee Sedol has an equivalent Elo to 3515 on the same scale (Elos on different scales aren't directly comparable). For each doubling of computer resources AlphaGo gains about 60 points of Elo."
It's a lot of hardware.
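A back-of-the-envelope sketch of what the quoted numbers would imply, assuming the "60 Elo per doubling" trend simply keeps holding all the way up (a big assumption on my part; scaling like this usually flattens out):

```python
# All three figures are taken from the quote above.
ELO_ALPHAGO_VS_FAN_HUI = 3140
ELO_LEE_SEDOL = 3515
ELO_PER_DOUBLING = 60

def hardware_factor(current, target, per_doubling):
    """How many times more hardware is needed to close the Elo gap,
    assuming a constant Elo gain per doubling of compute."""
    doublings = (target - current) / per_doubling
    return 2 ** doublings

factor = hardware_factor(ELO_ALPHAGO_VS_FAN_HUI, ELO_LEE_SEDOL, ELO_PER_DOUBLING)
print(f"{factor:.0f}x the Fan Hui hardware")
```

That is a gap of 375 Elo, or 6.25 doublings, which comes to roughly 76 times the hardware if nothing but compute changed.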
Heh, nice one.
Remember that "solve computer vision" was considered a summer project.
And they're pretty much there. Have you seen some of the latest results in that field?
We certainly got nearly there, but it was nearly 50 years later, not 3 months. Similarly, something that might look somewhat simple to us right now, might also be a lot more difficult.
Good chess programs definitely use self-play to improve themselves, eg to tune parameters and heuristics.
It's still an incredible achievement - but it's important to be accurate.
While the higher portions do share some similarities with the Atari system, at a basic level this is a machine that was designed and trained to play Go. AlphaGO is 'essentially the same' as the Atari system in the same way that all Neural Networks are 'essentially the same.'
Is this an extremely impressive accomplishment? Yes. However, it doesn't seem to qualify as anything close to generalizable.
[1] http://googleresearch.blogspot.com/2016/01/alphago-mastering...
The Atari-playing AI watched the pixels indeed, but it was also given a set of actions to choose from and more importantly, a reward representing the change in the game score.
That means it wasn't able to learn the significance of the score on its own, or to generalise from the significance of the changing score in one game, to another.
It also played Atari games, that _have_ scores, so it would have been completely useless in situations where there is no score, or a clear win/loss situation.
AlphaGo is also similarly specialised to play Go. As is machine learning in general: someone has to tell the algorithm what it needs to learn, either through data engineering, or reward functions etc. A general AI would learn what is important on its own, like humans do, so machine learning has not yet shown that it can develop into AGI.
Even humans have utility functions. For example, we get rewards for having sex, or eating food, or just making social relationships with other humans. Or we have negative reinforcement from pain, and getting hurt, or getting rejected socially.
You can come up with more complicated utility functions. Like instead of beating the game, its goal could be to explore as much of the game as possible. To discover novel things in the game. Kind of like a sense of boredom or novelty that humans have. But in the end it's still just a utility function, it doesn't change how the algorithm itself works to achieve it. AGI is entirely agnostic to the utility function.
No, what I'm really saying is that you can't have an autonomous agent that needs to be told what to do all the time. In machine learning, we train algorithms by giving them examples of what we want them to learn, so basically we tell them what to learn. And if we want them to learn something new, we have to train them again, on new data.
Well, that's not conducive to autonomous or "general" intelligence. There may be any number of tasks that your "general" AI will need to perform competently at. What's it gonna do? Come back to you and cry every time it fails at something? So then you have a perpetual child AI that will never stand on its own two feet as an adult, because there's always something new for it to learn. Happy little AI, for sure, but not very useful and not very "general". Except for a general nuisance, maybe.
Edit: I'm saying that machine learning can't possibly lead to general AI, because it's crap at learning useful things on its own.
Why would you have a strong prior belief about the invention of AGI? Now you are claiming to have far more certainty than I am.
>By your logic, you can predict the failure of AGI predictions by past failures of (every) AGI prediction.
This logic is extremely flawed. First not every prediction was wrong. Many people predicted it would happen in 2045, a few in 2030.
Second there's no reason past predictions represent the accuracy of future predictions about the same thing. Predictions should get more accurate over time, and early predictions are expected to be wildly wrong.
And third there's anthropic bias. If they were right, then we wouldn't be here to speculate about it. We can only ever observe negative outcomes, therefore observing a negative outcome shouldn't update your priors at all.
The vast majority of predictions were wrong.
Yes, the logic is flawed, that's why I said it was your logic.
As far as I know there is nothing particularly novel about AlphaGo, in the sense that if we stuck an AI researcher from ten years ago in a time machine to today, the researcher would not be astonished by the brilliant new techniques and ideas behind AlphaGo; rather, the time-traveling researcher would probably categorize AlphaGo as the result of ten years' incremental refinement of already-known techniques, and of ten years' worth of hardware development coupled with a company able to devote the resources to building it.
So if what we had ten years ago wasn't generally considered "true AI", what about AlphaGo causes it to deserve that title, given that it really seems to be just "the same as we already had, refined a bit and running on better hardware"?
10 years ago no one believed it was possible to train deep nets[1].
It wasn't until the current "revolution" that people learned how important parameter initialization was. Sure, it's not a new algorithm, but it made the problem tractable.
So far as algorithmic innovations go, there's always ReLU (2011) and leaky ReLU (2014). The one-weird-trick paper was pretty important too.
[1] Training deep multi-layered neural networks is known to be hard. The standard learning strategy— consisting of randomly initializing the weights of the network and applying gradient descent using backpropagation—is known empirically to find poor solutions for networks with 3 or more hidden layers. As this is a negative result, it has not been much reported in the machine learning literature. For that reason, artificial neural networks have been limited to one or two hidden layers
http://deeplearning.cs.cmu.edu/pdfs/1111/jmlr10_larochelle.p...
If you had asked people 10 years before the moon landing whether it was possible, I too would have agreed it was impossible. But after that breakthrough, it opened up the realm of possibilities.
I see AlphaGo more of an incremental improvement than a breakthrough.
A decade ago I was trying and failing to build multi-layer networks with back-propagation-- it doesn't work so well. More modern, refined, training techniques seem to work much better... and today tools for them are ubiquitous and are known to work (especially with extra CPU thrown at them :) ).
By that standard there's nothing particularly novel about anything. Everything we have today is just a slight improvement of what we already had yesterday.
World experts in go and ML as recently as last year thought it would be many more years before this day happened. Who are you to trivialize this historic moment?
Those arguments absolutely are wrong. For one thing it's classic hindsight bias. When you make a wrong prediction, you should update your model, not come up with justifications why your model doesn't need to change.
But second, it's another bias, where nothing looks like AI, or AI progress. People assume that intelligence should be complicated, that simple algorithms can't have intelligent behavior. That human intelligence has some kind of mystical attribute that can't be replicated in a computer.
Whenever AI beats a milestone, there are a bunch of over-optimists that come out and make predictions about AGI. They have been wrong over and over again over the course of half a century. It's classic hindsight bias.
I know this is true, because there are already a lot of people that think the Turing test isn't valid. They believe it could be beaten by a stupid chatbot, or deception on the part of AI. Just search for past discussions on HN of the Turing test, it comes up a lot.
There is no universally accepted benchmark of AI. Let alone a benchmark for AI progress, which is what Go is.
No one claimed that Go would require a human level AI to beat. But I am claiming that beating it represents progress towards that goal. Whereas passing the Turing test won't happen until the very end. Wins at games like Go are little milestones along the journey.
Many of the chatbot Turing test competitions have heavily rigged rules restricting the kinds of questions you're allowed to ask in order to give the bots a chance.
(the answer is Rocky Road by the way)
It is the same question for the data they used. Facebook, Google and others seem to agree that, at the end, the quantity and quality of data are more important than the algorithm itself. So how much is it at play here? Knowing that will be able to show us why it is performing well and how much we can appreciate their work.
Basically, the ("lots of hardware") distributed implementation gets ~3100 points in the Elo rating against ~2900. ~2900 is still sufficient to win against Fan Hui. So I would say, that yes, this algorithm has most of the merit here.
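For reference, a sketch of what a 200 point gap means in win probability, using the standard Elo expected-score formula. Applying it to these particular Go ratings is my assumption; the comment above only gives the ratings.

```python
def expected_score(rating_a, rating_b):
    """Standard Elo expected score of player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))

# Approximate ratings from the comment: distributed vs single-machine version.
print(round(expected_score(3100, 2900), 2))
```

So under the usual Elo model the distributed version would be expected to score about 0.76 per game against the single-machine version: clearly stronger, but not a different league.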
When you're talking about some other kind of evaluation function, such as Monte Carlo rollouts, you usually prefix that to the tree search (in the case of Monte Carlo rollouts, "Monte Carlo tree search" or MCTS) to indicate that besides the basic fundamental task common to almost all AIs (finding the optimal branches in a decision tree) it functions completely differently from the expert systems.
So is the case with this program, which (in a first pass) approximates subtrees by a trained neural net, rather than Monte Carlo rollouts or an expert system. So using terminology that suggests classical expert system tree search is bound to cause confusion (as you noticed).
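To illustrate what "Monte Carlo rollouts as the evaluation function" means, here is a minimal sketch on a made-up toy game (a single pile of stones, take 1 to 3 per turn, whoever takes the last stone wins). The game and all function names are hypothetical and chosen for brevity; this shows only the rollout evaluation of one position, not a full MCTS and not AlphaGo's actual code.

```python
import random

def legal_moves(stones):
    """Moves available in the toy game: take 1, 2 or 3 stones."""
    return [take for take in (1, 2, 3) if take <= stones]

def rollout(stones, rng):
    """Play uniformly random moves to the end of the game.
    Returns True if the player to move at the start takes the last stone."""
    mover_is_start_player = True
    while True:
        stones -= rng.choice(legal_moves(stones))
        if stones == 0:
            return mover_is_start_player
        mover_is_start_player = not mover_is_start_player

def rollout_value(stones, n_rollouts=1000, seed=0):
    """Estimate the position's value as the fraction of random playouts won
    by the player to move. Requires stones >= 1."""
    rng = random.Random(seed)
    wins = sum(rollout(stones, rng) for _ in range(n_rollouts))
    return wins / n_rollouts

print(rollout_value(1))   # only one move, and it wins
print(rollout_value(10))  # a noisy estimate somewhere between 0 and 1
```

A tree search can call `rollout_value` at its leaves instead of a hand-written heuristic. Swapping that call for a trained network's prediction is, roughly, the substitution the comment above describes.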
Really?
Why not?
What if you train an AI to play an RTS where matter, energy, and time are the resources and the goal is to take over the world?
And the optimists are being proven right. AGI is almost here.
A light bulb is just a metal wire encased in a non-flammable gas and you run electricity through it. It was long known that things get hot when you run electricity through them, and that hot things burst into fire, and that you can prevent fire by removing oxygen, and that glass is transparent. It's not a big deal to combine these components. A lot of people still celebrate it as a great invention, and in my opinion it is! Think about how inconvenient gas lighting is and how much better electrical light is.
Same thing with AlphaGo. Sure, if you break it down to its subcomponents it's just clever application of previously known techniques, like any other invention. But it's the result that makes it cool, not how they arrived at it!
All algorithms are incremental improvements of existing techniques. This isn't a card you can use to diminish all progress as "just a minor improvement what's the fuss".
People have used neural nets as function approximators for reinforcement learning with MCTS for game playing well before AlphaGo (!!).
Your lightbulb example actually supports my point. The lightbulb was the product of more than a half-century of work by hundreds of engineers/scientists. I have no problem with pointing to 70 years of work as a breakthrough invention.
Likewise NNs are uncaring what application you put them into. Give them a different input and a different goal, and they will learn to do that instead. AlphaGo gave its NNs control over a Monte Carlo search tree, and that turned out to be enough to beat Go. They could plug the same AI into a car and it would learn to control that instead.
Note that even without the monte carlo search system, it was able to beat most amateurs, and predict the moves experts would make most of the time.
There is also unsupervised and semi-supervised learning, which can take advantage of unlabelled data. Even supervised learning can work really well on weakly labelled data. E.g. taking pictures from the internet and using the words that occur next to them as labels. As opposed to hiring a person to manually label all of them.
I don't know what situation you are imagining that would make the AI "come back and cry". You will need to give an example.
Of course they did. They trained it with examples of Go games and they also programmed it with a reward function that led it to select the winning games. Otherwise, it wouldn't have learned anything useful.
>> There is also unsupervised and semi-supervised learning, which can take advantage of unlabelled data.
Sure, but unsupervised learning is useless for learning specific behaviours. You use it for feature discovery and data exploration. As to semi-supervised learning, it's "semi" supervised: it learns its own features, then you train it with labels so that it learns a mapping from those features it discovered to the classes you want it to output.
>> I don't know what situation you are imagining that would make the AI "come back and cry"
That was an instance of humour [1].
Yes, but it doesn't need to be trained with examples of Go games. It helps a lot, but it isn't 100% necessary. It can learn to play entirely through self play. The atari games were entirely self play.
As for having a reward function for winning games, of course that is necessary. Without a reward function, any AI would cease to function. That's true even of humans. All agents need reward functions. See my original comment.
>That was an instance of humour
Yes I know what humour is lel. I asked you for a specific example where you think this would matter. Where your kind of AI would do better than a reinforcement learning AI.
I'm generally considered to be way over optimistic in my assessment of AI progress. But wow.. that's pretty optimistic!
These things might seem like "small iterative refinements", but they add up to 100x improvement. Even when you don't consider hardware. And you should consider hardware too, it's also a factor in the advancement of AI.
Also reading through old research, there are a lot of silly ideas along with the good ones. It's only in retrospect that we know this specific set of techniques works, and the rest are garbage. At the time it was far from certain what the future of NNs would look like. To say it was predictable is hindsight bias.
EDIT: clarified my language to address below reply.
http://arxiv.org/abs/1509.02971
The fact that the (reinforcement) learning problem is hard or not is not directly related to whether the observation and action spaces are discrete or continuous.
The best Go program before AlphaGo was CrazyStone, ranked at 5-dan ("high amateur" range).
That's reinforcement learning and it's even more "telling the computer what to do" than teaching it with examples.
Because you're actually telling it what to do to get a reward.
>> Without a reward function, any AI would cease to function.
I can't understand this comment, which you made before. Not all AI has a reward function. Specific algorithms do. "All" AI? Do you mean all game-playing AI? Even that's stretching it, I don't remember minimax being described in terms of rewards say, and I certainly haven't heard any of about a dozen classifiers I've studied and a bunch of other systems of all sorts (not just machine learning) being described in terms of rewards either.
Unless you mean "reward function" as the flip side of a cost function? I suppose you could argue that- but could you please clarify?
>> your kind of AI
Here, there's clearly some misunderstanding because even if I have a "my kind" of AI, I didn't say anything like that.
I'm sorry if I didn't make that clear. I'm not trying to push some specific kind of AI, though of course I have my preferences. I'm saying that machine learning can't lead to AGI, because of reasons I detailed above.
No one tells the computer what to do. They just let it do its thing, and give it a reward when it succeeds.
>Not all AI has a reward function. Specific algorithms do. "All" AI?
Fine, all general AI. Like game playing etc. Minimax isn't general, and it does require a precise "value function" to tell it how valuable each state is. Classification also isn't general, but it also requires a precise loss function.
Sure they do. Say you have a machine learning algorithm, that can learn a task from examples, and let's notate it like so:
y = f(x)
Where y is the trained system, f the learning function and x the training examples.
The "x", the training examples, is what tells the computer what to learn, therefore, what to do once it's trained. If you change the x, the learner can do a different y. Therefore, you're telling the computer what to do.
In fact, once you train a computer for a different y, it may or may not be really good at it, but it certainly can't do the old y anymore. Which is what I mean by "machine learning can't lead to AGI". Because machine learning algorithms are really bad at generalising from one domain to another, and the ability to do so is necessary for general intelligence.
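Here is a deliberately tiny sketch of that point, in the y = f(x) notation above: the same learner f, trained first on one set of examples and then on another. The one-weight model and both "tasks" are made up for illustration, and the two tasks conflict by construction, so this shows the forgetting effect in its starkest form rather than proving anything about machine learning in general.

```python
def train(weight, examples, lr=0.1, epochs=200):
    """The learner f: plain gradient descent on squared error
    for the one-parameter model y = weight * x."""
    for _ in range(epochs):
        for x, y in examples:
            error = weight * x - y
            weight -= lr * error * x
    return weight

def mean_squared_error(weight, examples):
    return sum((weight * x - y) ** 2 for x, y in examples) / len(examples)

inputs = (-1.0, 0.5, 1.0)
task_a = [(x, 2.0 * x) for x in inputs]    # hypothetical task A: y = 2x
task_b = [(x, -1.0 * x) for x in inputs]   # hypothetical task B: y = -x

w_after_a = train(0.0, task_a)
w_after_b = train(w_after_a, task_b)       # keep training, now on task B only

print("error on A after training on A:", mean_squared_error(w_after_a, task_a))
print("error on A after training on B:", mean_squared_error(w_after_b, task_a))
print("error on B after training on B:", mean_squared_error(w_after_b, task_b))
```

After the second round the model fits task B and has lost task A entirely: changing x changed y, and nothing of the old y survives.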
Edit: note that the above has nothing to do with supervised vs unsupervised etc. The point is that you train the algorithm on examples, and that necessarily removes any possibility of autonomy.
>> Fine, all general AI. Like game playing etc.
I'm still not clear what you're saying; game-playing AI is not an instance of general AI. Do you mean "general game-playing AI"? That too doesn't always necessarily have a reward function. If I remember correctly for instance, Deep Blue did not use reinforcement learning and Watson certainly does not (I got access to the Watson papers, so I could double-check if you doubt this).
Btw, every game-playing AI requires a precise evaluation function. The difference with machine-learned game-playing AI is that this evaluation function is sometimes learned by the learner, rather than hard-coded by the programmer.
Today lots of people-- ones with even less background and putting in less effort-- try and are successful.
This is not a small change, even if it is the product of small changes.
>The "x", the training examples, is what tells the computer what to learn, therefore, what to do once it's trained. If you change the x, the learner can do a different y. Therefore, you're telling the computer what to do.
But with RL, a computer can discover its own training examples from experience. They don't need to be given to it.
>I'm still not clear what you're saying; game-playing AI is not an instance of general AI.
But it is! The distinction between the real world and a game is arbitrary. If an algorithm can learn to play a random video game, you can just as easily plug it into a robot and let it play "real life". The world is more complicated, of course, but not qualitatively different.
- discrete spaces such as Atari games and Go,
- continuous spaces such as driving a car, controlling a robot, or bidding on an ad exchange.
Tonight, that happened. Google's DeepMind AlphaGo defeated the world Go champion Lee Sedol. An amazing testament to humanity's ability to continuously innovate at a continuously surprising pace. It's important to remember, this isn't really man vs machine, as we humans programmed the algorithms and built the computers they run on. It's really all just circuitous man vs man.
Excited for the next "impossible" things we'll see in our lifetimes.
Sadly as I write this my uncle and personal hero who spent 17 years of his life working towards a Ph.D. on abstraction hierarchies for use in Go artificial intelligence, has been moved into hospice care. I'm just glad that in the few days that are left he has a chance to see this happen, even if it is not the good old-fashioned approach he took.
[1] He recently started rewriting the continuation of this research in golang, available on Github: https://github.com/Ken1JF/ah
Finally, the sixth attempt is written in the right language! Now it will succeed for sure.
AlphaGo's architecture much more closely resembles how humans think and learn.
I initially learned Go to be able to have some chance of an AI. I then had some transformative experiences that coincided with my early kyu learning of basic Go lessons. One of the big lessons in Go is to learn how to let go of something. Taking solace in anything on the Go board is one of the blocks you work through when you develop as a Go player.
I had already known about two years ago that just the Monte Carlo approach was already scalable. If Moore's Law continues, it was a matter of time before the Monte Carlo approach would start challenging the professional ranks -- it had already gotten to the point where you just needed to throw more hardware at it.
AlphaGo's architecture adds a different layer to it. The Deep Learning isn't quite as flexible as the human mind, but it can do something that humans can't: learn non-stop, 24/7 on one subject. We're seeing a different tipping point here, possibly the same kind of tipping point as when we witnessed the web browser back in the early 90s, and the introduction of the smartphone in the mid '00s. This is way bigger (to use Go terminology) than what happened with chess.
> During the match against Fan Hui, AlphaGo evaluated thousands of times
> fewer positions than Deep Blue did in its chess match against
> Kasparov; compensating by selecting those positions more intelligently,
> using the policy network, and evaluating them more precisely,
> using the value network—an approach that is perhaps closer to how
> humans play. Furthermore, while Deep Blue relied on a handcrafted
> evaluation function, the neural networks of AlphaGo are trained
> directly from gameplay purely through general-purpose supervised and
> reinforcement learning methods.
The AlphaGo that beat the 2p European champion five months ago was not as strong as the AlphaGo that beat Lee Sedol (9p). I don't think this was just the AlphaGo team throwing more hardware at it. I think they had been constantly running the self-training during the intervening months so that AlphaGo was improving itself.
If that is so, then the big thing here isn't that AlphaGo is the first AI to win an official match against the player currently considered the world's strongest. It's that in less than half a year, AlphaGo was able to learn and grow from challenging a 2p to challenging the world's strongest player. Think about that.
I am tremendously unfamiliar with recent A.I developments.
Can anyone provide some written references to this effect? Last time I searched (extensively), I couldn't really find anyone saying this.
To add to that, in Gödel, Escher, Bach, Hofstadter in 1979 predicted that no chess engine would ever beat a human grandmaster player. It just goes to show how hard it is to predict what is, and also will remain, impossible for machines!
"We always overestimate the change that will occur in the next two years and underestimate the change that will occur in the next ten. Don't let yourself be lulled into inaction."
This stuff is happening fast, and we might have found ourselves, historically, in a place of unintelligible amounts of change. And possibly undreamt of amounts of self-progression.
But who made that machine?
I'd say a more precise evaluation would be that the ability to program a machine to assist in playing chess outdid the ability to play chess without such assistance.
Several top commentators were saying how AlphaGo has improved noticeably since October. AlphaGo's victory tonight marks the moment that go is no longer a human dominated contest.
It was a very exciting game, incredible level of play. I really enjoyed watching it live with the expert commentary. I recommend the AGA youtube channel for those who know how to play. They had a 9p commenting at a higher level than the deepmind channel (which seemed geared towards those who aren't as familiar).
To my mind, this is a really significant achievement not because a computer was able to beat a person at Go, but because the DeepMind team was able to show that deep learning could be used successfully on a complex task that requires more than an effective feature detector, and that it could be done without having all of the training data in advance. Learning how to search the board as part of the training is brilliant.
The next step is extending the technique to domains that are not easily searchable (fortunately for DeepMind, Google might know a thing or two about that), and to extend it to problems where the domain of optimal solutions is less continuous.
What? They certainly trained the algorithm on a huge database of professional go games. It's even in the abstract. [1]
[1]: http://www.nature.com/nature/journal/v529/n7587/full/nature1...
Exactly
They used the game database to learn the value network, then reinforcement learning of the policy network was performed on self-play games. I.e., the machine learned to play from existing data, then played against itself to learn the search heuristics (the policy network) without the need for expert data.
Some quick observations
1. AlphaGo underwent a substantial amount of improvement since October, apparently. The idea that it could go from mid-level professional to world class in a matter of months is kinda shocking. Once you find an approach that works, progress is fairly rapid.
2. I don't play Go, and so it was perhaps unsurprising that I didn't really appreciate the intricacies of the match, but even being familiar with deep reinforcement learning didn't help either. You can write a program that will crush humans at chess with tree-search + position evaluation in a weekend, and maybe build some intuition for how your agent "thinks" from that, plus maybe playing a few games. Can you get that same level of insight into how AlphaGo makes its decisions? Even evaluating the forward prop of the value network for a single move is likely to require a substantial amount of time if you did it by hand.
3. These sorts of results are amazing, but expect more of the same, more often, over the coming years. More people are getting into machine learning, better algorithms are being developed, and now that "deep learning research" constitutes a market segment for GPU manufacturers, the complexity of the networks we can implement and the datasets we can tackle will expand significantly.
4. It's still early in the series, but I can imagine it's an amazing feeling for David Silver of DeepMind. I read Hamid Maei's thesis from 2009 a while back, and some of the results presented mentioned Silver's implementation of the algorithms for use in Go[2]. Seven years between trying some things and seeing how well they work and beating one of the best human Go players. Surreal stuff.
---
1. https://news.ycombinator.com/reply?id=11251526&goto=item%3Fi...
2. https://webdocs.cs.ualberta.ca/~sutton/papers/maei-thesis-20... (pages 49-51 or so)
3. Since I'm linking papers, why not peruse the one in Nature that describes AlphaGo? http://www.nature.com/nature/journal/v529/n7587/full/nature1...
It's a cool win but despite the way the titles are being presented, this isn't over yet.
The position evaluation heuristic was developed using machine learning, but it was also combined with more 'traditional' algorithms (meaning the Monte Carlo algorithm). So it was built specifically to play Go (in the same way Deep Blue used tree searching specifically to play chess, though tree searching is applicable in other domains).
I am a Go enthusiast!
The game played last night was a real fight in three areas of the board and in Go local fights affect the global position. AlphaGo played really well and world champion (sort of) Lee Sedol resigned near the end of the game.
I used to work with Shane Legg, a cofounder of DeepMind. Congratulations to everyone involved.
Really amazing moment to see Lee Sedol resign by putting one of his opponent's stones on the board.
Just a question to throw out there - does anyone feel like statements like this one "But the game [go] is far more complex than chess, and playing it requires a high level of feeling and intuition about an opponent’s next moves."
… seem to show a lack of understanding of both go and chess?
I understand there may be some cross-sports trash talking, but chess, played at a high level by humans, relies on these things as well. The more structured nature of chess means that it is (or at least was) more amenable to analysis by brute force computer algorithm, but no human evaluates and scores hundreds of millions of positions while playing chess or go.
Eh, the mainstream media is going to say this regardless, and I suppose it's just unrealistic to expect them to draw a distinction between complex for humans and amenable to brute force computation but statements like this always seemed to show a remarkable lack of awareness of how people actually play these games (though I am not an especially skilled chess or go player).
On a time to learn these skills... going from zero (computer rolls off assembly line) to mastery, the computer wins.
Actually maybe the computer wins even on the caloric level, if you consider all the energy that was required to get the human to that point (and all the humans that didn't get to that point, but tried).
The next step is to reduce the training time/samples for the computer to get the same performance.
That's not obvious at all. I don't think you appreciate how rigorous and demanding the training of a Go world champion is, how utterly devoted to Go they need to be: http://lesswrong.com/lw/n8b/link_alphago_mastering_the_ancie...
There are other implications that make this AlphaGo progress super exciting though. Go captures strategic elements that go well beyond the microcosm of one nerdy board game.
That's the real reason Go has been around for >2,000 years, and why this AI progress is relevant, despite its limited "game domain".
I wrote about it here, from my perspective of an avid Go player & machine learning professional [1].
Last October AlphaGo beat a 5dan player, bringing it into the range of CrazyStone. Only ~6 months later it beats a 9dan player which means it is now ~400 Elo higher. This means the new version would be predicted to beat the old version ~99% of the time.
Such incredibly consistent progress on a problem considered somewhat intractable is notable and exciting. Imagine where this machine will be in 6 more months.
Maybe Go has way more moves possible and emergent strategies or something I'm not taking into account.
I watched two 9d pro commentaries, Redmond's and Kim Myungwan's. Redmond was obviously being charitable in saying the game was close near the end. Myungwan said the victory was apparent several moves before the resignation, and Myungwan also said AlphaGo was clearly stronger than himself.
Either way, even if this game should be considered close, it's still not clear if AlphaGo was holding back in order to hold a secure win. It's possible it can play at a higher level, but it wasn't needed. We can't really know AlphaGo's strength until (if) it is beaten. The following matches will be very interesting.
What would be shocking is to find out that a famous writer, musician or scientist is in fact, just an alias for an advanced AI system :) It needs a little trick, because people should be tricked into believing that there's a real person behind the name.
Oh wait, I just remembered that there's a (mediocre) movie made on the subject: S1m0ne ( http://www.imdb.com/title/tt0258153/ )
Are you saying it won't happen? Think of the guys saying the same of go :)
so, Milli VanAIlli?
I don't really know that much about AI, but hopefully some experts can tell me - how different are the networks that play go vs chess for example? Or recognise images vs play go?
What I mean is - if you train a network to play go and recognise images at the same time, will the current techniques of reinforcement learning/deep learning work or are the techniques not sufficient at the moment?
If that works, then it really does seem like a big step towards AGI.
So, they use a combination of techniques. And they're doing well at it.
According to Hui's recollection, the defeat came down to three things: state of mind, confidence, and human error. Psychology is a big part of the game; since a machine has no fear of being defeated and almost never makes the kind of mistakes humans do, machine intelligence beating humans at the highest level of competitive sports and games is inevitable. However, truly mastering the game of Go, which in ancient Chinese society was more of a philosophy or art form than a competitive sport, is still a long way off.
There were a ton of details Hui cannot speak of due to the non-disclosure agreement he signed with DeepMind, but those were the gist of the interview.
In the end, AlphaGo match is 'a win for humanity', as Eric Schmidt put it. [2]
[1] http://synchuman.baijia.baidu.com/article/344562 (In Chinese)
Google Translate: https://translate.google.com/translate?hl=en&sl=zh-CN&tl=en&...
[2] http://www.zdnet.com/article/alphago-match-a-win-for-humanit...
I was actually thinking about playing a game with another total noob, just for fun, since the rules can be explained in 1 minute (unlike chess).
It is indeed very interesting to play against another new player just to see what you come up with, then do some reading and solve some basic problems (it may even be a good idea to have a look at the easier problems before playing your first game), play more games, read more advanced books, join KGS... It is a very nice rabbit hole to fall into.
I suggest starting on a 9x9 or 13x13 board. The regular 19x19 has too much strategic depth and noobs feel lost on it.
That's actually the recommended way to get started. Learn the rules, and then play a bunch of games with another beginner.
For the folks who aren't as familiar with the game, how did you find the commentary (for any channel)? What would you be interested in hearing for events like these?
However it was infuriating that many times they switched randomly between video feeds, so I couldn't actually see what the commentators were talking about on their board. Once it even got stuck on "Match starts in 0 minutes" for a couple minutes!
I've read a few different reviews and watched Michael Redmond's live commentary as well, who obviously has a slower Japanese style of play than Myungwan, and his variations all exhibited a very thorough style and sensibility, but I think he missed the key moment, and Myungwan called it -- the bottom right just killed Lee Sedol, and it was totally unexpected.
And Sedol was thinking about it too, because right after he resigned, he wanted to clear out that bottom right corner and rework some variations. I presume that's one frustration of playing against a computer -- they'll have to instrument AlphaGo to do a little kibitzing and talking after a game. That would be just awesome.
If you are very, very inspired by AlphaGo's side of this, it's really incredible to imagine, just for a moment, that building that white wall down to the right was in preparation for the white bottom right corner variation. The outcome of that corner play was to just massively destroy black territory, on a very painful scale, and it made perfect use of the white wall in place from much earlier in the game.
If AlphaGo was in fact aiming at those variations while the wall was being built, I would think at a fundamental level, Go professionals are in the position that chess grandmasters were ten years ago -- acknowledging they will never see as deeply as a computerized opponent. It's both incredibly exciting, and a blow to an admirable and very unusual group of worldwide game masters.
I loved every minute!!
I'd love to see, one day, a live commentary with an extra window showing what the computer is thinking at the moment.
Where did you find this 9p AGA commentary? I don't see it in the list of AGA videos on youtube.
It either seems like the earlier match vs Euro 3p didn't show AlphaGo's full strength, or it has improved much in the interim. Other takes?
You can check for more information : https://en.wikipedia.org/wiki/Go_and_mathematics
Related : https://en.wikipedia.org/wiki/Shannon_number
From these two links, the game tree complexity of chess is estimated at 10^120 while for Go it is 10^700.
Not really in the same ballpark.
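For a rough feel of where exponents like these come from, game tree size is often approximated as b^d, with b an average branching factor and d a typical game length in plies. The sketch below uses commonly quoted textbook values (b=35, d=80 for chess; b=250, d=150 for Go); those numbers are my assumptions, not figures taken from the two linked pages, and longer assumed games push the Go exponent higher.

```python
import math


def log10_tree_size(branching: float, depth: int) -> float:
    """Return log10(b**d), i.e. the exponent x in 'about 10^x' game lines."""
    return depth * math.log10(branching)


# Assumed, commonly quoted parameters; change them to see how the gap moves.
chess_exponent = log10_tree_size(branching=35, depth=80)
go_exponent = log10_tree_size(branching=250, depth=150)

print(f"chess: about 10^{chess_exponent:.1f}")
print(f"go:    about 10^{go_exponent:.1f}")
print(f"gap:   about 10^{go_exponent - chess_exponent:.1f} times larger")
```

Whatever parameters you pick, the gap is hundreds of orders of magnitude, which is the point being made above.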
I am really excited about DeepMind though. Looking forward to tomorrow's game!
EDIT: good postmortem here https://gogameguru.com/alphago-defeats-lee-sedol-game-1/
EDIT: but this postmortem [2] of the game is far more nuanced and doesn't reach the same conclusion.
[1] https://www.youtube.com/watch?v=6ZugVil2v4w
[2] https://gogameguru.com/alphago-defeats-lee-sedol-game-1/
For instance, I read a while back (approximating and paraphrasing to follow...) that top chess players can think up to 10 moves ahead along a very few branches. So let's say that in chess, there are 30 million possible positions to evaluate, and in go, there are 300 trillion. They're both such an order of magnitude different for humans that it makes really no difference in terms of how we play the game, so intuition takes over. For computers, it's a different story.
Realistically speaking, there aren't that many moves you can make in chess. Most of them are just blunders that would get you insta-killed by a good player. Contrast that with Go, where there are so many good moves. This is why I think the Go AI is more impressive.
Part of me thinks that at some point in the future, we'll have Chess "solved". Not in a "that computer is too good for humans" sense, but in a mathematical sense where all avenues will have been explored and from any position you know mathematically whether you win 100% of the time. So, the computer will make a move, a second computer will make a move, and then they will agree on a draw. To achieve this, I think it will take some kind of rainbow table for chess (on a much bigger scale obviously), where you can represent one position by a hash and just brute-force all possible solutions from the "end-game" back to the initial board. So, it's not even about AI, more about brute force and hardware. I know it's not possible to do this at the moment but quantum computing would be.
The search space for Go is much larger, so while brute force searches are critical in tight fighting, and in endgame play, something more has to happen to play go well in the middle game.
Chess fell to a much earlier generation of A.I., while Go held out until A.I. as a field had advanced several generations/decades.
I would tend to agree that there is something interesting and new at work here, though, in that computers didn't get better than humans at go simply by applying the same brute force algorithm, just with more processing power. It does suggest that at least some of what we previously thought required "intuition" can be modeled through a random forest (I think that's what they're using, if not RF, then some other combination of ML).
You can imagine a means of interpreting intermediate layers of AlphaGo's weighting function similar to the second image in [1] (not the best example, I apologise) that would produce images or other abstract representations of the strategy that layer was encoding, similar to how a human might classify moves or patterns into categories.
The AI Google designed is architected similarly to how the brain works in dual process theory: you have a NN providing intuition like system 1 and a supervisor much like system 2 which double checks system 1.
Well, it depends. During the game last night there were quite a few forced moves that require an immediate and unique reply. The motivation for the reply is quite clear. But that's the simple stuff.
Unfortunately, God is not readily available for comparison, so we'll use the best human players instead.
How many links are there in that chain? The more there are, the more there is to learn about the game, and hence the deeper and more sophisticated the game is. (So you might think, anyway.)
If you rate players using the Elo system, beating someone 2/3 of the time corresponds to being about 150 points stronger. A complete beginner at chess might have an Elo rating of 500, compared with the world champion somewhere around 2900, giving 16 links in the chain.
In go, beating someone 2/3 of the time corresponds to being about one kyu/dan rank stronger. A complete beginner might be 30 kyu; the best players are stronger than 9 amateur dan, so that's at least 40 links in the chain. (Lower-numbered kyu ranks are stronger; after 1 kyu comes 1 dan, and then higher-numbered dan ranks are stronger.)
So by this measure -- which you may or may not find convincing -- go is a more sophisticated game than chess.
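The chess side of this chain argument can be checked with the standard logistic Elo expectation formula. This is only a sketch: the 500 and 2900 ratings and the 150-point step are the figures quoted above, and the helper names are made up for illustration.

```python
import math


def expected_score(rating_diff: float) -> float:
    """Standard Elo expected score for the player rated rating_diff higher."""
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / 400.0))


def rating_diff_for(score: float) -> float:
    """Inverse of expected_score: the rating gap giving this expected score."""
    return 400.0 * math.log10(score / (1.0 - score))


def chain_links(beginner: float, champion: float, step: float) -> int:
    """Number of 'beats the next player 2/3 of the time' steps between ratings."""
    return round((champion - beginner) / step)


print(expected_score(150))          # roughly 0.70, in the neighbourhood of 2/3
print(rating_diff_for(2 / 3))       # roughly 120 points for exactly 2/3
print(chain_links(500, 2900, 150))  # 16 links, as stated above
```

Under this formula an exact 2/3 expectation is closer to a 120-point gap, which would make the chess chain about 20 links; that is still well short of the 40 claimed for go.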
Here is the best argument I know against this definition. Define the game of "tenchess" as follows. To play a game of tenchess, you play ten games of chess and the winner is whoever wins more games (a draw if the same number). Then it's easy to see that tenchess has a longer chain, as defined above, than chess; if I win 2/3 of my chess games then I win 79% of my tenchess games, so I can win 2/3 of my tenchess games with a smaller advantage. (I am ignoring the existence of draws for this calculation, just for simplicity.) But surely tenchess isn't a deeper game than chess; it's just longer. Perhaps go's longer chain is just the result of its being a longer game.
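The 79% figure is a plain binomial calculation, and it can be reproduced exactly. This is a sketch under the same simplification as above (no drawn chess games, each game won independently with probability 2/3); the function name is made up for illustration.

```python
from fractions import Fraction
from math import comb
from typing import Tuple


def match_outcomes(p_win: Fraction, games: int = 10) -> Tuple[Fraction, Fraction, Fraction]:
    """Return (P(win match), P(drawn match), P(lose match)) for a multi-game match."""
    p_loss = 1 - p_win
    win = Fraction(0)
    draw = Fraction(0)
    for k in range(games + 1):
        # Probability of winning exactly k of the individual games.
        prob = comb(games, k) * p_win**k * p_loss ** (games - k)
        if 2 * k > games:
            win += prob
        elif 2 * k == games:
            draw += prob
    return win, draw, 1 - win - draw


win, draw, loss = match_outcomes(Fraction(2, 3))
print(float(win), float(draw), float(loss))
```

With p = 2/3 the stronger player wins the ten-game match about 78.7% of the time, ties it about 13.7% of the time, and loses it about 7.7% of the time.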
This isn't true. One kyu/dan rank stronger means being 1 stone stronger (so winning 50% of the time when playing White with reverse komi). In practice this may correspond to winning 2/3 of the time with normal komi for high dan players, but that doesn't hold for low kyu players. A 29k has maybe a 51% chance of winning against a 30k because both will make huge mistakes. So although the 29k can score on average 13-15 more points than the 30k in a given game, this advantage is swamped by the large standard deviation of scores in beginner games, turning the win/loss outcome into essentially a coin flip.
A has 150 more Elo rating than B in chess. Elo says A has a 2/3 EV on the game result, and B has 1/3.
In tenchess, A will get 20/3 points on average and B will get 10/3 points. A will have more points than B in 79% of tenchess games, but the Elo ratings will not change. Elo doesn't consider winning and losing as binary. (This is why draws behave sensibly.) Just as a tie between these players causes A to lose rating, so too would a marginal win from A.
[Source: http://www.nature.com/news/google-ai-algorithm-masters-ancie...]
Thanks for the info.
208168199381979984699478633344862770286522453884530548425639456820927419612738015378525648451698519643907259916015628128546089888314427129715319317557736620397247064840935
to be precise. Which is more than even the square of the number of atoms in the universe, showing how silly that comparison is... Btw, the quoted "the average 150-move game contains" makes no sense at all, since such a game contains only 151 positions.
A program can play pretty good chess on modern hardware just by alpha-beta searching with a fairly simple evaluation function for the leaves of the search tree.
The best programs are cleverer than that; they have sophisticated evaluation functions, they prune and extend their searches, etc. But at heart, what makes them so strong is that they can search deeply.
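For anyone who hasn't seen it, the core of alpha-beta search is short. The sketch below runs on a toy tree of nested lists where each leaf is a made-up static evaluation; a real chess program would generate moves and call an evaluation function instead, so treat the tree format as an assumption for illustration only.

```python
import math
from typing import List, Union

# A leaf is a static evaluation score; a list holds the child positions.
Tree = Union[float, List["Tree"]]


def alphabeta(node: Tree, alpha: float = -math.inf, beta: float = math.inf,
              maximizing: bool = True) -> float:
    """Minimax value of node, skipping branches that cannot affect the result."""
    if not isinstance(node, list):
        return node  # leaf: return its evaluation
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False))
            alpha = max(alpha, value)
            if alpha >= beta:
                break  # the minimizing side will never allow this line
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, alpha, beta, True))
        beta = min(beta, value)
        if alpha >= beta:
            break  # the maximizing side already has something better
    return value


toy_tree = [[3, 5], [2, 9], [0, 7]]
print(alphabeta(toy_tree))  # 3
```

In the toy tree, the leaves 9 and 7 are never examined: once a reply worth 2 (or 0) is found, that branch is already worse for the maximizer than the 3 it can guarantee elsewhere.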
That approach doesn't work so well for go. The board is 6x the size, games are 4x as long, the "branching factor" (number of moves available in a given position) is 10x as large. (All figures very crudely approximate.) If you try to make a fairly-dumb searcher in go, it will play very badly.
So how do humans manage to play well in go? By smarter searching, with a better idea of what moves are worth considering; by thinking strategically; by having a feel for the shape of a position ("moving here is likely to be very valuable").
Those are all things that feel like they are harder to make a computer do, and come closer to actual intelligence, than doing well at chess just by doing an enormous search.
The first of those is certainly correct. AlphaGo (like most modern go programs) organizes its searches in quite a different way from a typical chess program. It's not clear how far it deserves to be called smarter, though, since a lot of what it's doing is playing out lots of games fairly stupidly[1] and seeing how they go on average.
[1] Compared with how it actually plays. One of the achievements of AlphaGo, I think, is that it can reasonably quickly select moves for its playouts that are actually pretty good.
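The "play lots of games out and see how they go on average" idea can be sketched on a toy game. This is not AlphaGo's method (its playouts are guided by trained networks and organized in a tree search); it is plain uniform-random Monte Carlo evaluation on an assumed stand-in game where players alternately take 1 to 3 stones and whoever takes the last stone wins.

```python
import random


def random_playout(stones: int, rng: random.Random) -> bool:
    """Play uniformly random moves; True if the player to move first wins."""
    first_to_move = True
    while True:
        stones -= rng.randint(1, min(3, stones))
        if stones == 0:
            return first_to_move  # whoever just moved took the last stone
        first_to_move = not first_to_move


def estimate_win_rate(stones: int, playouts: int = 2000, seed: int = 0) -> float:
    """Average playout result for the player to move with `stones` left."""
    rng = random.Random(seed)
    wins = sum(random_playout(stones, rng) for _ in range(playouts))
    return wins / playouts


def pick_move(stones: int, playouts: int = 2000, seed: int = 0) -> int:
    """Choose how many stones to take by scoring each resulting position."""
    scores = {}
    for take in range(1, min(3, stones) + 1):
        remaining = stones - take
        if remaining == 0:
            scores[take] = 1.0  # taking the last stone wins outright
        else:
            # The opponent moves next, so our score is one minus theirs.
            scores[take] = 1.0 - estimate_win_rate(remaining, playouts, seed)
    return max(scores, key=scores.get)


print(estimate_win_rate(10))
print(pick_move(3))  # 3: take everything and win
```

Each individual playout is "stupid" in exactly the sense described above; whatever strength there is comes from averaging many of them.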
The second is more debatable. But, e.g., AlphaGo selects and evaluates moves using neural networks trained on a large amount of high-quality play, and the effect of this is that given a position it can quickly "see" how good it thinks the position is and what moves might be effective, without doing any searching, as a result of feeding the position through a big neural network that does some mysterious calculation we don't understand well. Which is, at that level of abstraction, pretty similar to what you might say about a human go player.
Whether any of this has any bearing on more general artificial intelligence is an entirely different question, which I will not attempt to get into.
I guess it depends on your idea of a "long way". Using only your simple evaluation a program would be rated somewhere around 1000-1200. It would lose every single game to an average tournament player.
There are no flying DeLoreans (Back to the Future). There are no hotels on the Moon (Mad Men). People overestimate long-term (20+ years) change.
It just shows once more that for any maxim there is a maxim with the opposite meaning.
Of course, there are many ways you can do the comparison:
Time to build: The AlphaGo team didn't have billions of years of evolutionary tinkering to work with in refining biological heuristic/learning systems.
Hardware limits: though still more efficient at search than previous designs, AlphaGo still has a lot more storage space and inter-component bandwidth than a human brain, plus better latency. Will the algorithms improve to the point that they can perform well on an extremely restricted architecture?
Starts at 42:00 https://www.youtube.com/watch?v=l-GsfyVCBu0
Btw. There's a concept in Go called "overplaying". That means selecting a move that isn't objectively the best you could come up with, but that is most confusing, considering the level of the opponent. It's generally thought of as a bad practice, and if you misestimated the level of your opponent, she can punish you by exploiting the fact you didn't play your best move.
I've seen this written by many people but is there any solid evidence/study that proves this?
Edit: seems like Pocket Fritz and Komodo are easily able to beat grandmasters.
The tree search wasn't even the novel part of the algorithm... the authors even cite others who had used the identical technique in previous Go algorithms.
They definitely need training data to learn the value function, but training the policy network is based on self-play. While MCTS is not new, I believe bootstrapping reinforcement learning with self-play to train a policy network that guides the MCTS is novel.
An amateur can already learn plenty from a slightly weaker version running on less hardware.
Some units are balanced by the fact that no human can manipulate them to their full potential. Once you remove that restriction, the AI can abuse its speed of execution, which acts as a force multiplier that will cover any strategic shortcomings.
If they want DeepMind to really "play" Starcraft in the traditional sense, i.e. make it win based on decision making and reasoning about the game, then they'll need to artificially rate limit its APM.
1. The computer can discard all its current best ideas and flip through new ones so fast, it would be a flickering blur to humans.
2. Even if we put a speed limit on it, the move being considered is itself the result of considering a lot of slight variations.
3. The ability to _articulate_ in a human language what makes the move nice is itself a "hard problem" closely related to natural language processing.
4. Even just having some color codes or symbols and grouping related ideas has some serious problems: now the visualization is pretty technical to begin with, the computer is still able to memorize and compare moves at an unbelievable rate, and it's still fundamentally not the same as the method Go masters use to find a solution.
Even with all that thinking output on the screen, the computer would still soundly beat me and another (intermediate) player.
Here are some screenshots to illustrate what I'm talking about:
"At the US Congress 2008, he [Myungwan Kim] also played a historic demonstration game against MoGo running on an 800 processor supercomputer. With a 9 stone handicap, MoGo won by 1.5 points. At the 2009 congress, he played another demonstration game against Many Faces of Go running on 32 processors. Giving 7 stones handicap, Kim won convincingly by resignation."
(Kim Myung Wan (born 1978) is a 9d Korean professional who has taken up residence in the Los Angeles area as of 2008)
More information here, with a nice graph:
http://senseis.xmp.net/?ComputerGo
http://i.imgur.com/RvQsf6v.png
You can see progress seemed to be slow around 2012.
Then people hit on using Monte Carlo which was the big step forward you show in your graphs. But then, that progress seemed to stall to the degree that various people were quoted in a Wired article a couple years ago about how they weren't sure what was going to happen.
Yet, here we are today.
TD-Gammon was at that point for a while in the early 90s, but the experts caught up, and this changed the generally accepted Backgammon strategies.
In particular, you don't get the total number of points you'd have got by playing the chess games individually. You get 0, 1/2, or 1. In particularly particular, A doesn't get any fewer points from a marginal win than from a blowout. (Just as, when playing chess, you don't get fewer points from taking 100 moves to grind out a tiny positional advantage than from a 20-move brilliancy.)
So A and B don't get 20/3 and 10/3 points on average from a game of tenchess; that's the number of "chess points" they get on average, but the average of the number of chess points isn't a thing that actually matters when they're playing tenchess.
(If A wins 2/3 of the time at chess and they never draw, then it turns out that A gets about 0.855 points per tenchess game.)
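For anyone who wants to check that 0.855 figure, here is a minimal sketch. It assumes tenchess is a match of ten independent chess games where whoever wins more games scores 1, a 5-5 split scores 1/2 each, and A wins each game with probability 2/3 with no draws. The function name is made up for illustration.

```python
from fractions import Fraction
from math import comb

def tenchess_expected_score(p_win: Fraction, games: int = 10) -> Fraction:
    """Expected match score for A (assumed scoring: 1 for a win, 1/2 for a tie, 0 for a loss)."""
    q = 1 - p_win
    expected = Fraction(0)
    for wins in range(games + 1):
        # Binomial probability that A wins exactly `wins` of the individual games.
        prob = comb(games, wins) * p_win**wins * q**(games - wins)
        if 2 * wins > games:        # A won more games than B: full point
            expected += prob
        elif 2 * wins == games:     # even split: half a point
            expected += prob / 2
    return expected

score = tenchess_expected_score(Fraction(2, 3))
print(float(score))  # about 0.855
```

The marginal wins count the same as the blowouts here, which is exactly why the result is not simply 2/3.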
But I don't think you know much about Go, if you can say Fan Hui is "just" a 2 dan professional. What do you reckon the strength difference is between 2p and 9p?
Nitpick: while AlphaGo today is certainly stronger than AlphaGo last October, it doesn't follow in any way from the fact that both programs beat their respective opponents. A > B, C > D, D > B, therefore C > A? By "400 ELO", no less?
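To put a number on what a 400-point gap would mean, here is a small sketch using the standard Elo logistic formula. Whether Go strength (let alone AlphaGo's) really follows that model is an assumption; the function name is just for illustration.

```python
def elo_expected_score(rating_gap: float) -> float:
    """Expected score of the higher-rated player under the standard Elo formula."""
    return 1.0 / (1.0 + 10.0 ** (-rating_gap / 400.0))

print(round(elo_expected_score(400), 3))  # 0.909
print(round(elo_expected_score(0), 3))    # 0.5
```

So a 400-point edge corresponds to winning roughly ten games out of eleven, which is a strong claim to hang on a chain of separate match results.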
You can use that table to calculate the win probability for a 9dan player versus a 2dan player.
For your info: professional ranks do not reflect strength. They are honorary and based (typically) on achievement and seniority.
That's not exactly true
You only need to play a few rounds of Atari Go, say 30 minutes to an hour, to get a grasp of the capturing rules, and then you can move to a 9x9 or 13x13. I'd go straight for the 13x13 because it's not that much bigger but has much more depth to it without being overwhelming. And many Go boards have 19x19 on one side and 13x13 on the other.
When played on a small enough board, the games take about as long as capture go games.
I definitely agree. Just a few games (i.e. just a few minutes) of Atari Go every now and then should be enough to teach that, and then move on to the real thing.
Your game variant sounds interesting, btw!
Having played a bit with some toy models, I've changed my mind a bit; my guess is that p=2/3 is a reasonable approximation for few-dan and few-kyu amateurs, but that outside, say, the 5k-5d range it's far enough off to make a substantial difference.
So, what does this do to those (anyway fairly bogus) "depth" figures? My crappy toy model suggests that for a 2/3 win probability you need a 3-rank difference around 24k, a 2-rank difference around 12k, a 1-rank difference around 2d, a 0.5-rank difference around 8d. And I estimate God at 15 amateur dan (if Cho Chikun is 9p and needs 4 stones from God then God is 21p; if, handwavily, 9d=3p and one p-step is 1/3 the size of one d-step, then God is 21p = (3+18)p = (9+6)d = 15d). So we need maybe 20 steps from God to 5d, then maybe 10 from there to 5k, then maybe 5 from there to 15k, then maybe 5 from there to 30k. That's 40 steps -- not so very different from what we get just by pretending one rank = one "2/3 win probability" step, as it happens.
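The rank arithmetic above is easy to lose track of, so here it is spelled out as a sketch. The conversion (9d = 3p, one p-step = 1/3 of a d-step, one handicap stone = one d-step) is the comment's own handwavy assumption, not an official scale, and the function name is invented.

```python
from fractions import Fraction

def pro_to_amateur_dan(pro_rank: int) -> Fraction:
    # Handwavy mapping from the comment: 3p == 9d, one p-step == 1/3 of a d-step.
    return 9 + Fraction(pro_rank - 3, 3)

# Cho Chikun at 9p needing 4 stones from God, at 3 p-steps per stone.
god_pro = 9 + 4 * 3
god_amateur = pro_to_amateur_dan(god_pro)

# Estimated number of "2/3 win probability" steps in each rank band.
steps = {"God to 5d": 20, "5d to 5k": 10, "5k to 15k": 5, "15k to 30k": 5}

print(god_pro, god_amateur, sum(steps.values()))  # 21 15 40
```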
edit: some actual estimates. Deep Blue had 11.38 GFLOPS[1]. According to the paper in Nature, distributed AlphaGo used 1202 CPUs and 176 GPUs. A single modern GPU can do between 100 and 2000 double precision GFLOPS[2]. So from GPUs alone AlphaGo had access to roughly 3-4 orders of magnitude more computing power than Deep Blue did (176 x 100 / 11.38 is about 1,500x; 176 x 2000 / 11.38 is about 31,000x).
[1] https://en.wikipedia.org/wiki/Deep_Blue_(chess_computer)
[2] https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_proces...
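The back-of-the-envelope comparison can be checked directly. This sketch only uses the figures quoted above (11.38 GFLOPS for Deep Blue, 176 GPUs, 100 to 2000 double precision GFLOPS per GPU) and ignores the CPUs entirely; the per-GPU range is the comment's estimate, not a measured number.

```python
import math

DEEP_BLUE_GFLOPS = 11.38         # figure quoted for Deep Blue
ALPHAGO_GPUS = 176               # distributed AlphaGo, per the Nature paper
GPU_GFLOPS_RANGE = (100, 2000)   # assumed double precision range per GPU

def orders_of_magnitude(gflops_per_gpu: float) -> float:
    """log10 of (total AlphaGo GPU throughput / Deep Blue throughput)."""
    total = ALPHAGO_GPUS * gflops_per_gpu
    return math.log10(total / DEEP_BLUE_GFLOPS)

low, high = (orders_of_magnitude(g) for g in GPU_GFLOPS_RANGE)
print(f"{low:.1f} to {high:.1f} orders of magnitude")  # 3.2 to 4.5 orders of magnitude
```

With these inputs the gap comes out at roughly 1,500x to 31,000x.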
AlphaGo went way beyond that. It actually learned more like how a Go player does. It was able to examine and play a lot of games. That's why it was able to beat a 2p pro, and within less than half a year, challenge a 9p world-class player at least on even terms.
The big thing isn't that AlphaGo is able to play Go at all at that level, but that it learned a specific subject much faster than a human.
While it's fun to hate on IBM, it's not really fair to say Deep Blue was throwing hardware at the problem but AlphaGo isn't. Based on the paper AlphaGo will perform much worse in terms of ELO ranking on a smaller cluster.
[0] http://www.economist.com/news/science-and-technology/2169454...
AlphaGo utilizes the "Monte Carlo tree search" as its base algorithm[1]. The algorithm has been used for ten years in Go AIs, and when it was introduced, it made a huge impact. The Go bots got stronger overnight, basically.
The novel thing AlphaGo did was a similar jump in algorithmic goodness. It introduced two neural networks for
1) predicting good moves at the present situation
2) evaluating the "value" of given board situation
Especially 2) has been hard to do in Go without playing the game until the end.
This has a huge impact on the efficiency of the basic tree search algorithm. 1) narrows down the search width by eliminating obviously bad choices and 2) makes the depth at which the evaluation can be done shallower.
So I think it's not just the processing power. It's a true algorithmic jump made possible by the recent advances in machine learning.
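To make the width/depth point concrete, here is a toy sketch. It is not AlphaGo and not even MCTS: it is a plain depth-limited negamax on a tiny stone-taking game (take 1-3 stones, whoever takes the last stone wins), with two hand-written stand-ins, toy_policy and toy_value, playing the roles of the policy and value networks. All names are invented for illustration.

```python
from typing import List

MOVES = (1, 2, 3)

def legal_moves(stones: int) -> List[int]:
    return [m for m in MOVES if m <= stones]

def toy_policy(stones: int) -> List[int]:
    """Stand-in for a policy network: legal moves ranked best-first."""
    # Prefer moves that leave the opponent a multiple of 4 (a lost position).
    return sorted(legal_moves(stones), key=lambda m: (stones - m) % 4 != 0)

def toy_value(stones: int) -> float:
    """Stand-in for a value network: score from the side to move."""
    return -1.0 if stones % 4 == 0 else 1.0

def search(stones: int, depth: int, width: int, counter: List[int]) -> float:
    """Negamax that looks at the top `width` policy moves, `depth` plies deep."""
    counter[0] += 1                      # count visited positions
    if stones == 0:
        return -1.0                      # opponent took the last stone
    if depth == 0:
        return toy_value(stones)         # value estimate replaces deeper search
    best = float("-inf")
    for move in toy_policy(stones)[:width]:
        best = max(best, -search(stones - move, depth - 1, width, counter))
    return best

full_nodes, pruned_nodes = [0], [0]
full_value = search(15, depth=15, width=3, counter=full_nodes)
pruned_value = search(15, depth=2, width=1, counter=pruned_nodes)
print(full_value, full_nodes[0])
print(pruned_value, pruned_nodes[0])
```

Both calls agree on the result, but the narrow, shallow search visits only a handful of positions while the exhaustive one visits thousands. The catch, of course, is that it only works if the two stand-in functions are good, which is the part that was hard to get for Go.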
This is what struck me as especially interesting, as a non-player watching the commentary. The commentators, a 9-dan pro and the editor of a Go publication, were having real problems figuring out what the score was, or who was ahead. When Lee resigned the game, it came as a total surprise to both of them.
Just keeping score in Go appears to be harder than a lot of other games.
The incentive structure of the game leads to moves that firmly define territory usually being weaker, so the better the players, the more they end up playing games where territory is even harder to evaluate.
It's obvious by just reading Hacker News.
Fitting analogy. There was a line in the film Blood & Donuts about the moon being ruined when they landed on it, which I couldn't really feel until today.
This happens in chess too, of course, but in Go their value is decided only based on how they are used. The sophistication is about the same, but the rules are simpler.
Yes, the fascination lies in the strategies that emerge from the simple base.
EDIT: But come to think of it this is a bad example, because you don't need any training data at all to learn to play a game well. Computer programs can play against themselves and rediscover strategies that work well. It's just an advantage.
Of course, if there are many samples, the computer can go through those faster, but if there are no samples already and the computer has to learn example by example as humans do as well, humans may still have an advantage.
Of course, this advantage will diminish as well as AI advances.
What I mean is that I am more impressed by anyone or anything that can do a task (go, golf, chess, learning a foreign language, doing the dishes even) well with just a single example, or e.g. an hour of training.
Being able to train in solitude is an advantage indeed. You need two humans to do this, but you need two AlphaGo instances as well.
I don't understand how the AGA live stream didn't appear there for me?!
Andrew Jackson's role is invaluable in clarifying MyungWan Kim's thoughts: the infamously opaque "play this one, and then this one", or his white/black colour mix ups...
I personally think they're a good combo. Andrew is getting gradually better at only jumping in when necessary.
He inevitably asks questions you want Myungwan to answer.
But nevertheless, fitting so much computing power in such a small device is a great achievement.
I don't think any mass comparison is really meaningful, mind, but it's not that simple.
edit: according to the livestream
During the play, the computational requirements are vastly less (but I don't know the figures). It's still probably more than is feasible to put in a smartphone in the near future. Assuming we get 3x improvement in perf per watt from going from ~20nm chips to ~7nm chips (near the theoretical minimum for silicon chips), I don't think this will work on a battery powered device. And CPUs are really bad at perf per watt on neural networks, so some kind of GPU or ASIC setup will be required to make it work.
> Evaluating policy and value networks requires several orders of magnitude more computation than traditional search heuristics. AlphaGo uses an asynchronous multi-threaded search that executes simulations on CPUs, and computes policy and value networks in parallel on GPUs. The final version of AlphaGo used 40 search threads, 48 CPUs, and 8 GPUs. We also implemented a distributed version of AlphaGo that exploited multiple machines, 40 search threads, 1202 CPUs and 176 GPUs.
In fact, according to the paper, only 50 GPUs were used for training the network.
This is equivalent to one person expending 500 years solely to learn Go.
Fotland and others tried to figure out how to modify their programs to integrate full-board searches. They met with some limited success, but by 2004, progress stalled again, and available options seemed exhausted. Increased processing power was moot. To run searches even one move deeper would require an impossibly fast machine. The most difficult game looked as if it couldn’t be won."
http://www.wired.com/2014/05/the-world-of-computer-go/
The article then goes on to discuss how Monte Carlo was the real breakthrough.
Nonetheless, the quoted estimate in the article (mentioned twice, including in the second sentence) is "I think maybe ten years", ie 2024, which while inaccurate is probably "in our lifetimes".
Not quite what you are after, but it's pretty clear that he didn't think it would be beating the world champion in 14 years.
[1] NY Times, 2002, http://www.nytimes.com/2002/08/01/technology/in-an-ancient-g...
http://www.weforum.org/agenda/2016/03/have-we-hit-a-major-ar...
Given the rules, and a big book containing every professional go game ever played, and no other instruction, it's not entirely clear to me that Lee Sedol would be able to reach his current skill level in 500 years.
Not to mention that we suddenly forgot that computers have their own units of measurement, such as clock speed (hertz) and memory size (bytes).
Is it? The problem here is it is really hard to compare the TCO. For example prime human computation requires years and years of learning and teaching, in which the human cannot be turned off (this kills the human). A computer can save its state and go in a low or even a zero power mode.
>such as clock speed (hertz) and memory size (bytes).
Which are completely meaningless, especially in distributed hybrid systems. Clock speed is like saying you can run at 10 miles per hour, but it doesn't define how much you can carry. GPUs run a far slower clock speed than CPUs, but they are massively parallel and are much faster than CPUs on distributed workloads. Having lots of memory is important, but not all memory is equal and hierarchy is even more important. Computer memory is (hopefully) bit perfect and a massive amount of power is spent keeping it that way. That is nice when it comes to remembering exactly how much money you have in the bank. Human memory is wonderful and terrible at the same time. There is no 'truth' in human memory, only repetition. A computer can take a picture and then make a hash of the image, both of which can be documented and verified. A human can recall a memory, but the act of recalling that memory changes it, and the parts we don't remember so well are influenced by our current state. It is this 'inaccuracy' that helps us use so little power for the amount of thinking we do.
Are the units I proposed perfect for the job? Of course not, just look how much you wrote. But I bet that if you do the same "thorough" analysis for measuring computing by weight, you'll be able to write not only a fat paragraph such as your last one but a whole book on how wrong/meaningless/stupid it is (not that anyone would read such a book, though).
And AI is just one strand. There are several strands, just as deeply transformative, that are happening simultaneously.
I remember someone speaking about the shift between classical hard sci fi and more current sci-fi authors like Neal Stephenson or Peter Hamilton. The classical authors like Heinlein or Asimov might do world building where they just change one thing. What would the world be like if that one thing changed? After a certain point though, things were changing so fast that later authors didn't do that. There were too many things that changed at the same time.
Except if a big solar flare hits us ;}
Various commentators mentioned how both players, human and synthetic, made a few mistakes. Even I caught a slow move made by the AI. So whether Lee Sedol was at the top of his performance, or not, is a bit of a debate. But the AI was clearly on the same level, whatever that means.
It was an intense fight throughout the game, with both players making bold moves and taking risks. Fantastic show.
Fan Hui said the machine played extremely consistently 6 months ago. He said playing the computer was "like pushing against a wall" - just very strong, very consistent performance.
Also, the people working on it flat out told the world that today's version of AlphaGo beats October's version literally all the time.
There are different strategies depending upon how much emphasis is placed upon early territorial gains as opposed to "influence" which is used for later territorial gains.
Similarly, playing "passive" moves that make territory without starting a "fight" versus aggressively contesting for every piece of territory available.
For this particular example, training a system involves (1) analysis of every single game of professional go that has been digitally recorded; and (2) playing probably millions of games "against itself", both of which require far more computing power than just playing a single game.
That is a gigantic over-simplification. All machines are application specific, even machine-learning based ones. They all require human supervision, whether through goal setting or fixing errors.
There are some areas where machines are better than humans, and playing Go is now one of them, but that doesn't mean machines will replace humans in all facets at any given point in time. We grow, our tools grow, and the cycle repeats.
[1] https://www.youtube.com/watch?v=gy5g33S0Gzo&ab_channel=RLLbe...
I wonder how it would deal with a teddy bear or stray piece of underwear in the pile of towels?
I see nothing that might be able to tell us why gravitational mass is the same as inertial mass, for example, or any moves in that direction. This "AI" is good at simple games.
My point is, humans "in the wild" likely didn't have any equivalent to chess, because they didn't have sufficient leisure time. Chess is a product of an environment that's just as "artificial" as the one which produced cell phones.
I think the entire analogy of the players requiring all of this is stretched a little thin, but I also think the original attack on the Go AI based on its mass is off base as well.
The response of the Iranian sages was the invention of Backgammon, to highlight the role of Providence in human affairs.
[p.s. not all Iranians are willing to cede Chess to the sister civilization of India: http://www.cais-soas.com/CAIS/Sport/chess.htm] ;)
Plus, not far away in the future we will be able to connect a smartphone to a 3D circuit printer and print a new one, to achieve 'self-replication'.
> Pocket Fritz 4 won the Copa Mercosur tournament in Buenos Aires, Argentina with 9 wins and 1 draw on August 4–14, 2009. Pocket Fritz 4 searches fewer than 20,000 positions per second. This is in contrast to supercomputers such as Deep Blue that searched 200 million positions per second. Pocket Fritz 4 achieves a higher performance level than Deep Blue.[3]
The first steps are always the most inefficient. Make it work, make it right, make it fast.
[1]: https://en.wikipedia.org/wiki/Deep_Blue_%28chess_computer%29...
[2]: http://cdn.slashgear.com/wp-content/uploads/2008/10/htc_touc...
[3]: https://en.wikipedia.org/wiki/Human%E2%80%93computer_chess_m...
I find this overly optimistic because of the huge amount of power required to run the Go application. Remember, we're getting closer and closer to the theoretical lower limit in the size of silicon chips, which is around 4nm (that's about a dozen silicon atoms). That's a 3-4x improvement over the current state of the art.
The computer to run AlphaGo requires thousands of watts of power. A smartphone can do about one watt. A 3-4x increase in perf per watt isn't going to cut it.
If there will be a smartphone capable of beating the best human Go players, my guess is that it won't be based on general purpose silicon chips running on lithium ion batteries.
On the other hand, a desktop computer with a ~1000 watt power supply (ie. a gaming pc) might be able to do this in a matter of years or a few decades.
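Rough numbers for that argument, as a sketch: 1000 W is taken as a lower bound for "thousands of watts" and 1 W as the phone budget, both assumptions read off the comment rather than measurements, and the helper name is invented.

```python
ALPHAGO_WATTS = 1000.0   # assumed lower bound for "thousands of watts"
PHONE_WATTS = 1.0        # assumed sustained smartphone power budget

def remaining_shortfall(perf_per_watt_gain: float) -> float:
    """How many times too power-hungry the workload still is after the gain."""
    return (ALPHAGO_WATTS / PHONE_WATTS) / perf_per_watt_gain

for gain in (3, 4):
    print(f"{gain}x perf/watt leaves a {remaining_shortfall(gain):.0f}x shortfall")
```

Even at the optimistic end, a gap of a few hundred times remains, so the rest would have to come from algorithms or a different kind of hardware.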
I already know that your answer will be: "but this time it is a fundamental physics limit". Whatever. I'm jaded by previous doomsday predictions. We'll go clockless, or 3D, or tri-state or quantum. It'll be something that is fringe, treated as idiotic by current standards and an obvious choice in hindsight.
That previous constraints have been beaten in no way supports the argument that we will beat the laws of physics this time.
Finally, the hardware we are using to run these programs is insane. Sure the silicon is approaching some hard physical limits, but your processor spends most of that power trying to make old programs run fast...
My prediction is that with enough resources it is possible to write a Go AI which runs on general purpose hardware that's manufactured on current process nodes and fits in your pocket.
If you look at http://googleresearch.blogspot.com/2016/01/alphago-mastering... you'll find that Google's estimate of the strength difference between the full distributed system and their trained system on a single PC is around 4 professional dan. Let's suppose that squeezing it from a PC to a phone takes about the same off. Now a pocket phone is about 8 professional dan weaker than the full distributed system.
If their full trained system is now 9 dan, that means that they can likely squeeze it into a phone and get a 1 dan professional. So the computing power on a phone already allows us to play at the professional level!
You can get to an unbeatable device on a phone in 10 years, if self-training over a decade can create about as much improvement as they have achieved in the last 6 months, AND phones in 10 years are about as capable as a PC is today. Those two trade off, so a bigger algorithmic improvement gets you there with a weaker device.
You consider this result "overly optimistic". I consider this estimate very conservative. If Google continues to train it, I wouldn't be surprised if there is a PC program in a year that can beat any Go player in the world.
It'll likely be hardware that can be generalized to run any kind of deep net. The iPhone 5S is already capable of running some deep nets.
As a friend mentioned, it isn't the running of the net, it's the training that takes a lot more computational power (leaving aside data normalization). A handheld device that is not only capable of running a deep net, but also training one -- yeah, that will be the day.
There are non-von Neumann architectures that are capable of this. Someone had figured out how to build general-purpose CPUs on silicon made for memory. You can shrink a full rack of computers down onto a single motherboard, and use less wattage while you are at it.
This really isn't about having a phone be able to beat a Go player. Go is a transformative game that, when learned, teaches the player how to think strategically. There is value for a human to learn Go, but this is no longer about being able to be the best player in the absolute sense. Go will undergo the same transformation that martial arts in China and Japan have gone through with the proliferation and use of guns in warfare.
Rather, what we're really talking about is a shot at having AIs do things that we never thought they could do -- handle ambiguity. What I think we will see is -- not the replacement of blue collar workers by robots -- but the replacement of white collar workers by deep nets. Coupled with the problems in the US educational system (optimizing towards passing tests rather than critical thinking, handling ambiguity, and making decisions in face of uncertainty), we're on the verge of some very interesting times.
I just don't see a 1000x+ decrease in the power required happening in a decade or two without some revolutionary technology I can't even imagine. Is this what you meant? I'm sure most people couldn't imagine modern silicon chips in the 1950s vacuum tube era. But now we're getting close to the theoretical, well-understood minimums in silicon chips, so another revolutionary step is required if another giant leap like that is to be achieved.
(2000 kilocalories / day -> ~100W; the brain uses about a quarter of your calories.)
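The unit conversion behind that parenthetical, as a quick sketch. The "about a quarter" brain share is the comment's figure; 4184 J per kilocalorie is the standard conversion.

```python
JOULES_PER_KCAL = 4184
SECONDS_PER_DAY = 24 * 60 * 60

def kcal_per_day_to_watts(kcal_per_day: float) -> float:
    """Average power in watts for a given daily energy intake."""
    return kcal_per_day * JOULES_PER_KCAL / SECONDS_PER_DAY

body_watts = kcal_per_day_to_watts(2000)
brain_watts = body_watts / 4   # "about a quarter", per the comment

print(round(body_watts), round(brain_watts))  # 97 24
```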
Not necessarily the same kind, and, if I had to make the call, I would say they aren't of the same kind.
I don't think that's quite true as a description of what we knew about computer Go previously, though it depends on what precisely you mean. Recent systems (meaning the past 10 years, post the resurgence of MCTS) appear to scale to essentially arbitrarily good play as you throw more computing power at them. Play strength scales roughly with the log of computing power, at least as far as anyone tested them (maybe it plateaus at some point, but if so, that hasn't been demonstrated).
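What "scales with the log of computing power" implies in practice, as a sketch: if every doubling of compute buys a fixed rating gain, then a linear gain in rating costs an exponential amount of compute. The 60 points per doubling used here is a purely hypothetical constant picked for illustration, not a measured figure for any Go program.

```python
ELO_PER_DOUBLING = 60.0   # hypothetical: rating gained per doubling of compute

def compute_factor_for_gain(elo_gain: float,
                            elo_per_doubling: float = ELO_PER_DOUBLING) -> float:
    """Compute multiplier needed for a rating gain under pure log scaling."""
    return 2.0 ** (elo_gain / elo_per_doubling)

for gain in (60, 300, 600):
    print(gain, compute_factor_for_gain(gain))
```

Under that made-up constant, 600 points would need about a thousand times the hardware, which is why both bigger clusters and a better scaling curve matter.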
So we've had systems that can in principle play to any arbitrary strength, if you can throw enough computing power at them. Though you might legitimately argue: by "in principle" do you mean some truly absurd amount, like more computing power than could conceivably fit in the universe? The answer to that is also no; scaling trends have been such that people expected computer Go to beat humans anywhere from, well, around now [1], to 5 to 10 years from now [2].
The two achievements of the team here, at least as I see them, are: 1) they managed to actually throw orders of magnitude more computing power at it than other recent systems have used, in part by making use of GPUs, which the other strong computer-Go systems don't use (the AlphaGo cluster as reported in the Nature paper uses 1202 CPUs and 176 GPUs), and 2) improved the scaling curve by algorithmic improvements over vanilla MCTS (the main subject of their Nature paper). Those are important achievements, but I think not philosophical ones, in the sense of figuring out how to solve something that we previously didn't know how to solve even given arbitrary computing power.
While I don't agree with everything in it, I also found this recent blog post / paper on the subject interesting: http://www.milesbrundage.com/blog-posts/alphago-and-ai-progr...
[1] A 2007 survey article suggested that mastering Go within 10 years was probably feasible; not certain, but something that the author wouldn't bet against. I think that was at least a somewhat widely held view as of 2007. http://spectrum.ieee.org/computing/software/cracking-go
[2] A 2012 interview thought that mastering Go would need a mixture of inevitable scaling improvements plus probably one significant new algorithmic idea, also a reasonably widely held view as of 2012. https://gogameguru.com/computer-go-demystified-interview-mar...
This is exactly the opposite of my sense based on following the computer go mailing list (which featured almost all the top program designers prior to Google/Facebook entering the race). They said that scaling was quite bad past a certain point. The programs had serious blindspots when dealing with capturing races and kos[0] that you couldn't overcome with more power.
Also, DNNs were novel for Go--Google wasn't the first one to use them, but no one was talking about them until sometime in 2014-2015.
[0] Not the kind of weaknesses that can be mechanically exploited by a weak player, but the kind of weaknesses that prevented them from reaching professional level.
That means that the problem is exponentially hard. EXPTIME, actually. You couldn't possibly scale it much.
To be fair, a lot of the progress in recent years has been due to taking a different approach to solving the problem, and not just due to pure computing power. Due to the way go works, you can't do what we do with chess and try all combinations, no matter how powerful of a computer you have. Using deep learning, we have recently helped computers develop what you might call intuition -- they're now much better at figuring out when they should stop going deeper into the tree (of all possible combinations).
> Play strength scales roughly with the log of computing power
The rumor I have heard is that the new Deep Mind learning algorithm really improves on this and scales linearly with computing power.
The achievement was a leap towards the human level of play (and quite possibly over it). There might be additional leaps, which will take AIs WAY beyond humans, but none of those will scale linearly in the end. (And yeah, I guess you didn't want to say that either)
If the normalcy bias was in effect, they wouldn't be spending that money.
It's certainly possible that we'll break more barriers with clever engineering and new scientific breakthroughs. But that doesn't mean the Normalcy Bias isn't in play here.
However, I'm talking hundreds of billions spent on R&D specifically to solve problems associated with chip manufacture. It took on the order of 25 years to solve each of the problems listed in the grandparent's post. Nobody would spend that kind of money or time on something that they think somebody else would solve.
Say you spent a hundred billion dollars to extinguish the sun- that wouldn't work. How much money you spend is irrelevant when you're up against what people call "hard physical limits".
I've read several articles saying that different cancers are not exactly the same disease, but more like different diseases with the same symptom (uncontrolled tumor growth) and different etiology, even sometimes different from person to person, not just from tissue to tissue. This was said to be a reason that a general cancer cure is so elusive. But is it really thought of as impossible, not just elusive?
Maybe our inability to extinguish the sun is also a limitation of knowledge more than a hard physical limit!
Even if I'm right about this, your description of the situation would still be accurate in that there would be no way to simply throw more money at the problems and guarantee a solution; there would need to be qualitative breakthroughs which aren't guaranteed to happen at any particular level of expenditure. If people had spent multiples of the entire world GDP on a space program in the 1500s, they would still not have been able to get people to the moon, though not because it's physically impossible to do so in an absolute sense.
Yep, that's my point, thanks. Sorry, I'm not at my most eloquent today :)
> physics and experience in working in semiconductors
> without some revolutionary technology I can't even
> imagine
I suspect (in the nicest possible way) that in a lineup of your imagination (on current assumptions) vs the combined ingenuity of the human race driven by the hidden hand, the latter wins.
> I find this overly optimistic
exDM69 never said it's not gonna happen, he just said that it's not going to happen in ten years, and I agree with him. Revolutions never occur that quickly. To achieve that we don't just need an improvement of the current state of the art, we need a massive change and we don't even know what it's going to look like yet! This kind of revolution may occur one day but not in ten years.
And it could even never happen, remember that we don't have flying cars yet ;)
The optimistic position is a bit like saying: "I've lived 113 years, I'm not going to die now!". It's entirely possible for a trend to reverse itself. If machine learning has taught us anything, it is that background knowledge (in this case, of processor technology) gives you much better results than just guessing based on what happened in the past.
Stacked 3D chips (HBM, etc), Heterogeneous computing (OpenCL, Vulkan), Optical computing, Memristors, Graphene-based microchips, Superconductors, Spintronics, Quantum computers, Genetic computers (self-reconfigurable)
The rest of the technologies you mention have great potential but will they be available in a smartphone in one decade? I don't think so.
It might as well slow down again and we have to remember that most humans in history saw little to no advances in technology over their lifetime.
I'm excited for the possibilities modern science opens up but I also think we might reach a point where fundamental progress stalls for a century or two.