Google's Intelligence Designer (technologyreview.com)
I think the article must have overlooked significant activity in training learning systems to play games well. The glaring omission for me was Neurogammon (1987), later TD-Gammon (1992), developed by Gerry Tesauro and colleagues (http://en.wikipedia.org/wiki/TD-Gammon).
Neurogammon was, at the time, a sensation at the same conference the article coyly refers to as "a leading research conference on machine learning." The paper has almost 1000 citations. A curious omission.
It's really a sad comment on the state of reporting at MIT Tech Review that you learn more about the tech from a YouTube video than from an article.
(My complaint is not with the DeepMind people, it's with the article, which should put the work in context.)
I mean, to what extent did they restrict what it means to be an "arbitrary game"? I highly doubt their software can play Pictionary, for instance, but I haven't found anything that really explains their limitations.
Because of this, I am leaning toward the cynical view and assuming it's just hype, and not actually that incredible.
It's no less impressive, though: to my knowledge, no one had done anything like that before, that is, beating video games with raw video data and reinforcement learning.
http://arxiv.org/abs/1312.5602
I'll just quote their introduction instead of trying to summarize the paper:
"Our goal is to create a single neural network agent that is able to successfully learn to play as many of the games as possible. The network was not provided with any game-specific information or hand-designed visual features, and was not privy to the internal state of the emulator; it learned from nothing but the video input, the reward and terminal signals, and the set of possible actions—just as a human player would. Furthermore the network architecture and all hyperparameters used for training were kept constant across the games. So far the network has outperformed all previous RL algorithms on six of the seven games we have attempted and surpassed an expert human player on three of them."
The paper does a good job going over related work (section 3), beginning with the example I gave.
Murphy created an agent that can play arbitrary games by inspecting the RAM and attempting to maximize the score.
See also this writeup on Ars Technica - http://arstechnica.com/gaming/2013/04/this-ai-solves-super-m...
What a time to be alive.
Or, industry types are looking for the next big thing, after "big data," and have rebranded neural networks as "deep learning"?
I don't mean to be too cynical, but I still don't understand if "deep learning" represents any meaningful advance besides the ML and EE communities finding the benefits of a certain amount of structure, which is already well established in other lines of research.
IIRC (forgive me, I read the paper a few weeks ago) the solution is at its core a reinforcement learning system, with the deep net only making up the component that predicts reward from a (state, action) pair. With that in hand, there remains the non-trivial RL problem of balancing "exploration vs exploitation" in learning good strategies to play the game(s). While NNs have been used in this capacity before, I believe that, as other comments have mentioned, using a deep net to learn to map a high-dimensional state-action space (e.g., the state of the game represented as pixels of the screen at a particular time) to expected reward in real time was indeed an advance, both theoretical and technical.
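To make the "exploration vs exploitation" point concrete, here is a minimal sketch of epsilon-greedy action selection plus a single Q-learning update. It is a tabular stand-in, not the paper's implementation: in the setup described above, a deep net would replace the table `q` as the estimator of expected reward. The function names and the `alpha`/`gamma` defaults are my own assumptions for illustration.

```python
import random


def epsilon_greedy(q_values, epsilon, rng):
    """Explore with probability epsilon, otherwise exploit the best estimate."""
    if rng.random() < epsilon:
        # Exploration: pick any action uniformly at random.
        return rng.randrange(len(q_values))
    # Exploitation: pick the action with the highest estimated value.
    return max(range(len(q_values)), key=lambda a: q_values[a])


def q_update(q, state, action, reward, next_state, done, alpha=0.1, gamma=0.99):
    """One Q-learning step: move Q(s, a) toward r + gamma * max_a' Q(s', a').

    q is a list of per-state lists of action values (a table here; a
    function approximator would stand in for it with pixel input).
    """
    target = reward if done else reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])


# Usage: two states, two actions, hypothetical numbers.
q = [[0.0, 0.0], [1.0, 2.0]]
rng = random.Random(0)
action = epsilon_greedy(q[0], epsilon=0.1, rng=rng)
q_update(q, state=0, action=action, reward=1.0, next_state=1, done=False)
```

A high epsilon early in training favors exploration; lowering it over time shifts the agent toward exploiting what it has learned.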
And, oh yeah, I just remembered that a University of Texas research group is doing work in this area too (there was a recent paper [2] from Peter Stone and others).
(Edited for clarity)
(Edited again to suggest another paper).
[1] - http://arxiv.org/pdf/1312.5602.pdf
[2] - http://www.cs.utexas.edu/~pstone/Papers/bib2html-links/TCIAI...
This enables hierarchical learning of increasingly complex concepts – building new concepts upon less complex concepts from previous layers. Deep architectures are thus able to learn high abstractions, as in [1], for instance.
If you have not yet done so, I would strongly urge you to read some papers on the subject from the last decade (e.g. Hinton, Bengio or LeCun), or even just skim through the Wikipedia entry [2].
[1] http://www.technologyreview.com/view/532886/how-google-trans...
I feel compelled to point out that the only connection between the "MIT" tech review and MIT is that the magazine licenses the name from the alumni association. It's how the alumni association funds itself, and every MIT grad gets a lifetime subscription to a version of the magazine with the alumni notes bound into the back. I doubt many of us read it. I don't know how many people other than MIT grads read it, but I would imagine vanishingly few.
A friend of mine calls it "the magazine of things that will never happen," which I think is dead on. It's a shame because the editor, Jason Pontin, is actually a good guy, so it's surprising that the magazine continued to suck after he took it over.
There are many reasons to criticize MIT (don't I know it!) but you can't judge the institute by this magazine.
Personally, I think that Jason has brought a lot of positive changes to a magazine that, for a long time, tended toward a technology policy wonkish orientation.
So I think it's fair that a lot of what's written about "will never happen." But I'm not sure that's really avoidable if you cover cutting-edge research.
Also, it didn't use an elaborate set of features and heuristics adapted for backgammon, just a simple representation of the state of the board (a list of 0/1 variables encoding how many pieces of each color are on each position).
This is pretty close to "from scratch", and I think the article would have done well to point out what is actually new here.
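For a rough idea of what a "list of 0/1 variables" board representation looks like, here is a simplified sketch. It is not Tesauro's exact scheme: it assumes a plain thermometer code with `max_units` binary units per point per color, and the function names are made up for illustration.

```python
def encode_point(count, max_units=4):
    """Thermometer code: the first min(count, max_units) units are 1."""
    return [1 if count > i else 0 for i in range(max_units)]


def encode_board(white, black, max_units=4):
    """Flatten per-point checker counts for both colors into one 0/1 list.

    white and black are equal-length lists giving the number of checkers
    of that color on each point.
    """
    if len(white) != len(black):
        raise ValueError("both colors must cover the same number of points")
    features = []
    for w, b in zip(white, black):
        features += encode_point(w, max_units) + encode_point(b, max_units)
    return features


# Usage: a toy two-point board (hypothetical position).
features = encode_board(white=[2, 0], black=[0, 1])
```

Note that this simplified code saturates: any count above `max_units` encodes the same as `max_units`, so a real encoding would need some extra way to represent larger stacks.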
Point is, the science was there since the 80s, and not much has changed.
Well, then we're in agreement about the meaning of the term. Deep Learning, then, would be Machine Learning using any of these deep architectures – be they Restricted Boltzmann Machines, or otherwise.
But yes, the available computing power has been a huge limitation for much AI research.