The Pluribus poker AI is the biggest AI and poker news of 2019, and we can actually find out how Pluribus plays poker. To great fanfare, Facebook and Carnegie Mellon University announced that AI now dominates 6-max no-limit hold’em (NLHM) poker. With some marketing savvy, the article in the venerable Science magazine was published on 11 July 2019, at the peak of the WSOP. Domination was demonstrated in mbb/100 (milli big blinds won per 100 hands) using AIVAT, a sophisticated variance (luck) reduction algorithm that works for AI machines, not for humans. So luck removed for Pluribus, maintained for humans, sort of thing.
Funnily enough, the researchers did not publish the actual results this time, i.e. 70k in losses, but they did not hesitate to publish them in their previous venture, when Libratus beat pros at heads-up NLHM two years ago. So although a bot is playing, we witness very human behaviour. It’s like any random poker player: when she wins, she brags about the results; when she loses, she blames it on variance or bad beats! I guess the difference is these guys are actually able to prove it.
Anyway, the thing is, AIVAT is way beyond my understanding but widely recognised as valid in AI academic circles. So let’s admit it: Pluribus does dominate.
It just shows the extraordinary variance of poker. In other words, assuming reasonable 6-hour sessions, you can be the best poker player in the world and, after 50 live sessions or 16 online sessions, still easily be down 70k and wondering whether it is bad play or bad luck. Yeah, don’t quit your day job.
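For the curious, here is a quick back-of-the-envelope sketch of where those session counts come from. It assumes the sample is the roughly 10k hands that CMU published, and it simply backs out the hands-per-hour pace implied by 50 live or 16 online sessions of 6 hours each. The function name and constants are mine, purely for illustration.

```python
# Back-of-the-envelope check of the session counts.
# Assumption: the sample is about 10k hands (the size of the published sample).
SAMPLE_HANDS = 10_000
SESSION_HOURS = 6  # the "reasonable" session length used in the post


def implied_hands_per_hour(sessions: int,
                           sample_hands: int = SAMPLE_HANDS,
                           session_hours: float = SESSION_HOURS) -> float:
    """Pace (hands/hour) needed to play `sample_hands` in `sessions` sessions."""
    return sample_hands / (sessions * session_hours)


live_rate = implied_hands_per_hour(50)    # 50 live sessions
online_rate = implied_hands_per_hour(16)  # 16 online sessions

print(f"live:   {live_rate:.1f} hands/hour")
print(f"online: {online_rate:.1f} hands/hour")
```

So the live figure implies a pace in the low thirties of hands per hour, and the online figure a pace just above a hundred.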
The thing is, previous advances in AI (the University of Alberta’s Cepheus for heads-up limit and DeepStack for heads-up no-limit, and CMU’s Libratus) were a revolution for all decent poker players, introducing GTO and the Nash equilibrium to the masses and yielding innumerable threads on GTO vs exploitative strategies.
So the cool thing is CMU did post the 10k hands, and they’re all in my database, so I’ll be writing a series of posts analysing how Pluribus plays poker (and how the pros play). 10k hands is not that many, but believe me, there are things to learn: opening ranges, bet sizings, an easy SB opening strategy, the irrelevance of the c-bet (?), etc.
By the way, Pluribus doesn’t play GTO strictly speaking, and its algos don’t converge to a Nash equilibrium. Still, it’s trying not to be exploited rather than trying to exploit. I do have some doubts about the post-blueprint searches: if they’re based on hands against humans, maybe that’s slightly exploitative. But this is total speculation, and the researchers say it doesn’t take other players’ actions into account.
In the first post, I’ll focus on LJ (lojack) RFI (raise first in) ranges: first to act pre-flop, what do we open with?
And for now, just one stat: VPIP (voluntarily put money in pot, i.e. the percentage of hands where Pluribus decided to play/risk money) is 26.4%. No big surprise here: many of us are around 24%, and if we played postflop perfectly, sure, we’d open a bit more…
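If you want to see how a stat like VPIP is put together, here is a minimal sketch. It assumes each hand has already been boiled down to one flag saying whether the player chose to call or raise pre-flop. By the usual convention, posting a blind or checking the big blind option does not count. The `HandRecord` structure and the toy data are hypothetical, made up for illustration, and are not the schema of any real tracker or of my database.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class HandRecord:
    """Hypothetical per-hand summary for one player (not a real tracker schema)."""
    hand_id: str
    voluntary_preflop: bool  # True if the player called or raised pre-flop by choice


def vpip(hands: Sequence[HandRecord]) -> float:
    """Percentage of hands in which the player voluntarily put money in the pot."""
    if not hands:
        raise ValueError("need at least one hand")
    voluntary = sum(1 for h in hands if h.voluntary_preflop)
    return 100.0 * voluntary / len(hands)


# Toy data, invented purely for illustration (not Pluribus hands)
sample = [
    HandRecord("h1", True),
    HandRecord("h2", False),
    HandRecord("h3", False),
    HandRecord("h4", True),
    HandRecord("h5", False),
]
print(f"VPIP = {vpip(sample):.1f}%")
```

The real work is of course in parsing the hand histories to set that flag correctly; the percentage itself is just a count over the sample.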