“Rock Paper Scissors” Trained in Browser AI/ML(heartbeat.fritz.ai) |
“Rock Paper Scissors” Trained in Browser AI/ML(heartbeat.fritz.ai) |
Identifying the hands is pretty slick, but I'd actually find a model of ideal play to be even slicker, because there is no "optimal strategy" for RPS, but there is an "optimal strategy" against each individual opponent. Making an AI that can learn that strategy would be pretty impressive.
IIRC, the idea was that people tend not to repeat the same move, and then of the other two, it's slightly more likely they'd pick the one on their mind already - the one they were hoping their opponent would've chosen in the prior round.
* http://www.rpscontest.com/ * https://daniel.lawrence.lu/programming/rps/
Playing rock paper scissors is about predicting what the opponent will play next given a history of moves.
Prediction is the dual problem of compression. For example if the history of moves is "RRR", you can try to compress the strings "RRRR", "RRRP", and "RRRS". Clearly the first one ("four Rs") is more compressible than the other two, so it is most likely that the next move in the sequence is also an R.
However, there is no universal compression [1][2] so there is indeed no universal optimal RPS player.
[1] https://en.wikipedia.org/wiki/Kolmogorov_complexity [2] http://mattmahoney.net/dc/dce.html
Wish I had kept it training with the data from people who used it but it's on a static github pages site.
http://modelai.gettysburg.edu/2013/cfr/cfr.pdf
I started converting the Java to Python a while back, but got distracted by other work:
https://github.com/RichardKelley/cfr/blob/master/rps.py
That will learn optimal play against a fixed opponent. If you change the code to make both players use regret minimization, it should still converge.
I think it considered all players the same opponent and used something like a markov model for predicting moves.
If you took it one game at a time it was really hard to beat. Apparently we all kinda feel now is the right time to switch to scissor. My only strategy for not loosing was to decide on the next 5 moves as randomly as possible and stick with it, as it would choose it’s moved based on wins and losses and I wouldn’t, making me less predictable.
there's a nash equilibrium? does that not count?
The main result for players Red and Blue for RPS is: Red plays each of Rock, Paper, Scissors with probability 1/3rd and independent of everything else in the known universe. Can do this by using, say, an ordinary die with six sides. Then just from the strong law of large numbers, in the long run Red and Blue will both break even, no matter what Blue did, does, or will do.
To win, Red needs some means of predicting Blue's play -- no good predictions, no winning. I.e., for Red to win, Blue has to be predictable in some sense.