Participants: 2016

The 2016 competition had 9 different agents in the heads-up no-limit Texas hold'em competition. As in previous years, agents were submitted by a mixture of universities and individual hobbyists from 5 different countries around the world.

Competitors in the 2016 Annual Computer Poker Competition were not required to supply detailed information about their submission(s) in order to compete, but some information about team members, affiliation, location, high level technique descriptions, and occasionally relevant papers were supplied. This page presents that information.

Heads-up No-Limit Texas Hold'em

Act1

  • Team Name: Act1
  • Team Members: Tim Reiff
  • Affiliation: unfoldpoker
  • Location: Las Vegas, USA
  • Non-dynamic Agent
  • Technique:

Act1 was trained by an experimental distributed implementation of the Pure CFR algorithm.  A heuristic was added to occasionally avoid some game tree paths, reducing the time spent per training iteration.  To compensate for imperfect recall, a distance metric that considers features from all postflop streets was used to construct the card abstraction on the river.  Several bet sizes were omitted because they offer little benefit against other equilibrium opponents while requiring a disproportionate amount of resources to train and store.

The strategy consists of 159 billion information sets (430 billion information set-action pairs), and training completed 5.15 trillion iterations.
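
A minimal, hedged sketch of regret matching, the rule that CFR-family algorithms such as Pure CFR use to turn accumulated regrets into a strategy, together with the pure-action sampling step that gives Pure CFR its name. Names and numbers are illustrative; this is not Act1's implementation.

    import random

    def regret_matching(regrets):
        """Return a strategy (one probability per action) proportional to positive regret."""
        positive = [max(r, 0.0) for r in regrets]
        total = sum(positive)
        if total > 0:
            return [p / total for p in positive]
        # With no positive regret, fall back to the uniform strategy.
        return [1.0 / len(regrets)] * len(regrets)

    def sample_pure_action(strategy):
        """Pure CFR samples a single pure action from the current strategy each iteration."""
        return random.choices(range(len(strategy)), weights=strategy, k=1)[0]

    # Example: regrets accumulated at one information set for (fold, call, raise).
    regrets = [-12.0, 30.0, 10.0]
    strategy = regret_matching(regrets)      # [0.0, 0.75, 0.25]
    action = sample_pure_action(strategy)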

Hugh

  • Team Name: Hugh
  • Team Members: Stan Sulsky
  • Affiliation: Independent
  • Location: New York, USA
  • Non-dynamic Agent
  • Technique: Just a rule-based engine.

KEmpfer_cfr

  • Team Name: KEmpfer
  • Team Members: Julian Prommer, Patryk Hopner, Suiteng Lu, Eneldo Loza Mencia
  • Affiliation: Knowledge Engineering Group, TU Darmstadt
  • Location: Darmstadt, Germany
  • Non-dynamic Agent
  • Technique: 

This bot implements a CFR strategy. For training the policy, we used the Open Pure CFR implementation and adapted it to no-limit heads-up. In addition, we implemented some more advanced techniques such as card and bucket clustering.

Nyx

  • Team Name: Nyx
  • Team Members: Martin Schmid, Matej Moravcik
  • Affiliation: Charles University
  • Location: Prague, Czech Republic
  • Non-dynamic Agent
  • Technique: 
  • Equilibrium approximating agent
  • Small computational resources
  • Very compact strategy representation (only 2 GB for the uncompressed strategy)
  • Imperfect recall action abstraction with up to 16 possible bets in an information state
  • Abstraction as well as the strategy are continuously learned during self-play
  • Heavily modified CFR utilizing dynamic programming to handle a non-stationary imperfect action abstraction with many actions

Automatic public card abstraction for the flop round - Schmid, M., Moravcik, M., Hladik, M., & Gaukroder, S. J. (2015, January). Automatic Public State Space Abstraction in Imperfect Information Games. In Workshops at the Twenty-Ninth AAAI Conference on Artificial Intelligence.

Proteus

  • Team Name: Queen's Automated Poker Team (QAPT)
  • Team Members: Chris Barnes, Spencer Evans, Austin Attah, Robert Sun
  • Affiliation: Queen's University
  • Location: Kingston, Canada
  • Mostly static, non-equilibrium Agent
  • Technique: 

We have attempted to create a generic model by mining the logs of previous matches,
especially those of well-performing bots. In future work we plan on implementing
in-game modelling of the opponents.

Rembrant5

  • Team Name: Rembrant5
  • Team Members: Gregor Vohl
  • Affiliation: FERI
  • Location: Maribor, Slovenia
  • Static Agent
  • Technique: 

Historical games are used to calculate the equity of the current hand and the current board cards. The bot makes random decisions with action probabilities taken from the historical games. Because purely random play is not very effective, there are also a couple of hardcoded rules the bot must consider before making an action.

Slumbot

  • Team Name: Slumbot
  • Team Members: Eric Jackson
  • Affiliation: Independent Researcher
  • Location: Menlo Park, USA
  • Static Agent
  • Technique: 

Slumbot is a large Counterfactual Regret Minimization (CFR) implementation. It uses the external sampling variant of MCCFR (Monte Carlo CFR) and employs a symmetric abstraction.  Some statistics about the size of the abstraction:

  • 4.5x10^11 information sets
  • 1.1x10^12 information-set-action pairs
  • 1.5x10^6 betting sequences

We used a distributed implementation of CFR running on eleven r3.4xlarge Amazon EC2 instances.

More details can be found in my paper to be presented at the 2016 Computer Poker Workshop at AAAI.
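
For concreteness, below is a hedged, self-contained sketch of external-sampling MCCFR on Kuhn poker, a toy three-card game: the traverser explores all of its own actions while opponent actions and the card deal are sampled. It illustrates the algorithm family described above and is not Slumbot's code or abstraction.

    import random
    from collections import defaultdict

    ACTIONS = ["p", "b"]  # p = check/fold, b = bet/call

    class Node:
        def __init__(self):
            self.regret = [0.0, 0.0]
            self.strategy_sum = [0.0, 0.0]

        def strategy(self):
            positive = [max(r, 0.0) for r in self.regret]
            total = sum(positive)
            return [p / total for p in positive] if total > 0 else [0.5, 0.5]

    nodes = defaultdict(Node)

    def terminal_utility(cards, history, player):
        """Utility for `player` if `history` is terminal, else None."""
        if history not in ("pp", "bp", "pbp", "bb", "pbb"):
            return None
        if history.endswith("bp"):                # someone folded to a bet
            folder = (len(history) - 1) % 2
            return 1 if folder != player else -1
        payoff = 2 if "b" in history else 1       # showdown; the pot is larger after a bet
        winner = 0 if cards[0] > cards[1] else 1
        return payoff if winner == player else -payoff

    def traverse(cards, history, traverser):
        util = terminal_utility(cards, history, traverser)
        if util is not None:
            return util
        player = len(history) % 2
        node = nodes[(cards[player], history)]
        strat = node.strategy()
        if player == traverser:
            # Traverser: walk every action and update regrets against the expected value.
            child = [traverse(cards, history + a, traverser) for a in ACTIONS]
            value = sum(s * c for s, c in zip(strat, child))
            for i in range(len(ACTIONS)):
                node.regret[i] += child[i] - value
            return value
        # Opponent: accumulate the average strategy, then sample a single action.
        for i, s in enumerate(strat):
            node.strategy_sum[i] += s
        a = random.choices(ACTIONS, weights=strat, k=1)[0]
        return traverse(cards, history + a, traverser)

    for _ in range(20000):
        cards = random.sample([1, 2, 3], 2)       # deal one card to each player
        for traverser in (0, 1):
            traverse(cards, "", traverser)

    first_node = nodes[(1, "")]                   # lowest card, first to act
    total = sum(first_node.strategy_sum)
    print([round(s / total, 3) for s in first_node.strategy_sum])  # average (check, bet) strategy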

BabyTartanian8

  • Team Name: Tartanian
  • Team Members: Noam Brown, Tuomas Sandholm
  • Affiliation: Carnegie Mellon University
  • Location: Pittsburgh, USA
  • Static Agent
  • Technique: 

BabyTartanian8 plays an approximate Nash equilibrium that was computed on the San Diego Comet supercomputer. For equilibrium finding, we used a new Monte Carlo CFR variant that leverages the recently-introduced regret-based pruning (RBP) method [Brown & Sandholm NIPS-15] to sample actions with negative regret less frequently, which dramatically speeds up convergence. Our agent uses an asymmetric action abstraction. This required conducting two separate equilibrium-finding runs.

Noam Brown and Tuomas Sandholm. Regret-Based Pruning in Extensive-Form Games. In Neural Information Processing Systems (NIPS), 2015.

Noam Brown, Sam Ganzfried, and Tuomas Sandholm. Hierarchical Abstraction, Distributed Equilibrium Computation, and Post-Processing, with Application to a Champion No-Limit Texas Hold'em Agent. In Proceedings of the International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2015.

Participants: 2014

The 2014 competition had 12 different agents in the heads-up limit Texas hold'em competition, 16 agents in the heads-up no-limit competition, and 6 agents in the 3-player limit competition. As in previous years, agents were submitted by a mixture of universities and individual hobbyists from at least 11 different countries around the world.

Competitors in the 2014 Annual Computer Poker Competition were not required to supply detailed information about their submission(s) in order to compete, but some information about team members, affiliation, location, high level technique descriptions, and occasionally relevant papers were supplied. This page presents that information.


Heads-up Limit Texas Hold'em

Cleverpiggy

  • Team Name: Cleverpiggy
  • Team Members: Allen Cunningham
  • Affiliation: Independent
  • Location: Marina del Rey, CA, US
  • Non-dynamic Agent
  • Technique: Cleverpiggy is the progeny of 20 billion iterations of chance-sampled CFR run on a quad-core Intel with 48 GB of RAM.  She uses a card abstraction with 169, 567528, 60000, and 180000 hand types for each respective street.  Flop hands are essentially unabstracted.  For the turn and river, board types are established by dividing all flops into 20 categories, each of which branches into 3 turns, which branch into 3 rivers, resulting in 60 turn and 180 river distinctions.  Hands for each board type are then divided into 1000 buckets based on earth mover and OCHS clustering.
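
    As a hedged illustration of the "earth mover" side of this kind of bucketing, the snippet below measures distances between hand-strength histograms with the 1-Wasserstein (earth mover's) distance; the histograms, bin layout and hand names are invented toy data, not Cleverpiggy's.

    import numpy as np
    from scipy.stats import wasserstein_distance

    def emd(hist_a, hist_b, bin_centers):
        """Earth mover's distance between two hand-strength histograms."""
        return wasserstein_distance(bin_centers, bin_centers, hist_a, hist_b)

    # Toy example: three hands summarised as histograms over 10 equity bins.
    bins = np.linspace(0.05, 0.95, 10)
    made_hand = np.array([0, 0, 0, 0, 0, 1, 2, 3, 3, 1], dtype=float)
    draw_hand = np.array([3, 2, 1, 0, 0, 0, 0, 1, 2, 1], dtype=float)
    weak_hand = np.array([4, 3, 2, 1, 0, 0, 0, 0, 0, 0], dtype=float)

    print(emd(made_hand, draw_hand, bins))  # draws sit far from made hands...
    print(emd(draw_hand, weak_hand, bins))  # ...and somewhat closer to weak hands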


    Regret Minimization in Games with Incomplete Information 2007
    Evaluating State-Space Abstractions in Extensive-Form Games 2013

Escabeche

  • Team Members: Marv Andersen
  • Affiliation: Independent
  • Location: London, UK
  • Non-dynamic Agent
  • Technique: This bot is a neural net trained to imitate the play of previous ACPC winners.

Feste

  • Team Name: Feste
  • Team Members: Francois Pays
  • Affiliation: Independent
  • Location: Paris, France
  • Dynamic Agent
  • Technique: The card abstraction uses respectively 169, 1500, 400 and 200 buckets for preflop, flop, turn and river. The buckets are computed using k-means clustering over selected hand parameters such as expected values, standard deviation and skewness at last round.

    The resulting abstraction is represented using sequence form with the imperfect recall extension and has 1.3 billion game states. It is solved using a custom interior point solver with indirect algebra [1]. The solver runs on a mid-range workstation and is GPU-accelerated with CUDA.

    Feste has two strategies at its disposal: a defensive one, close to the equilibrium but still slightly offensive, and a second strategy, definitely aggressive and consequently off the equilibrium. During the course of the game, Feste uses Thompson sampling to select the best adapted strategy for the opponent.
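
    A hedged sketch of Thompson-sampling strategy selection of the kind described above, assuming a simple Gaussian model of each strategy's per-hand winnings; the prior, noise scale and class names are illustrative choices, not Feste's implementation.

    import random

    class ThompsonSelector:
        def __init__(self, num_strategies, obs_std=20.0):
            self.n = [0] * num_strategies          # hands played by each strategy
            self.mean = [0.0] * num_strategies     # running mean reward (big blinds per hand)
            self.obs_std = obs_std                 # assumed per-hand reward standard deviation

        def choose(self):
            """Sample a plausible mean for each strategy and pick the best sample."""
            samples = [
                random.gauss(self.mean[i], self.obs_std / ((self.n[i] + 1) ** 0.5))
                for i in range(len(self.n))
            ]
            return max(range(len(samples)), key=samples.__getitem__)

        def update(self, strategy, reward):
            self.n[strategy] += 1
            self.mean[strategy] += (reward - self.mean[strategy]) / self.n[strategy]

    selector = ThompsonSelector(num_strategies=2)  # 0: defensive, 1: aggressive
    strategy = selector.choose()
    # ... play one hand with the chosen strategy, then feed back its result in big blinds:
    selector.update(strategy, reward=3.5)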


    [1] Francois Pays. 2014. An Interior Point Approach to Large Games of Incomplete Information. Proceedings of the AAAI-2014 Workshop on Computer Poker.

Hyperborean

  • Team Name: University of Alberta
  • Team Members: Michael Bowling, Duane Szafron, Rob Holte, Nolan Bard, Neil Burch, Richard Gibson, John Hawkin, Michael Johanson, Trevor Davis, Josh Davidson, Dustin Morrill
  • Affiliation: University of Alberta
  • Location: Edmonton, Alberta, Canada
  • Dynamic Agent
  • Technique: Hyperborean2014-2pl is an implicit modelling agent [2] consisting of four abstract strategies. All strategies were generated using the Counterfactual Regret Minimization (CFR) algorithm [8] with imperfect recall card abstractions.  Buckets were calculated according to public card textures and the k-means Earthmover and k-means OCHS buckets recently presented by Johanson et al [6].  By forgetting previous card information and rebucketing on every round [7], this yields an imperfect recall abstract game.

    The portfolio of strategies for the agent consists of:

    1) An asymmetric equilibrium strategy

    An asymmetric equilibrium strategy was generated to exploit mistakes that can be made by equilibrium based agents using smaller abstractions of the game [3]. The abstraction for the final strategy uses 169, 1,348,620,  1,521,978, and 840,000 buckets on each round, respectively.  During training with CFR, the opponent uses a smaller abstraction with 169 buckets on the pre-flop, and 9,000 buckets on each subsequent round.

    2) Three data biased robust counter strategies based on prior ACPC competitors

    Three strategies in the portfolio are designed to exploit specific players from prior ACPC events.  Each response was created using data biased robust counter strategies [5] to data from a particular competitor in prior ACPC events.  An asymmetric abstraction is used for the frequentist model used by the data biased response [3], placing observations of the player in a smaller abstraction than the regret minimizing portions of the strategy.

    A mixture of these strategies is dynamically generated using a slightly modified Exp4-like algorithm [1] where the reward vector for the experts/strategies is computed using importance sampling over the individual strategies [4].
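
    A hedged sketch of the Exp4-style mixing just described: one weight per portfolio strategy, updated multiplicatively from an estimated per-hand reward vector (in the real agent those estimates come from importance sampling). The class, learning rate and reward values are illustrative.

    import math
    import random

    class StrategyMixer:
        def __init__(self, num_strategies, eta=0.05):
            self.weights = [1.0] * num_strategies
            self.eta = eta                           # learning rate

        def choose(self):
            """Sample which portfolio strategy plays the next hand."""
            return random.choices(range(len(self.weights)), weights=self.weights, k=1)[0]

        def update(self, estimated_rewards):
            """estimated_rewards[i]: estimated payoff had strategy i played the hand."""
            for i, r in enumerate(estimated_rewards):
                self.weights[i] *= math.exp(self.eta * r)

    mixer = StrategyMixer(num_strategies=4)          # equilibrium plus three counter-strategies
    chosen = mixer.choose()
    # ... play one hand with the chosen strategy, estimate what every strategy would have
    # earned (e.g. via importance sampling), and update the weights:
    mixer.update([0.2, -0.1, 0.05, 0.0])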


    [1] P Auer, N Cesa-Bianchi, Y Freund, and R.E Schapire. Gambling in a rigged casino: The adversarial multi-armed bandit problem. Proceedings of the 36th Annual Symposium on Foundations of Computer Science, 1995.

    [2] Nolan Bard, Michael Johanson, Neil Burch, Michael Bowling. Online Implicit Agent Modelling. In Proceedings of the Twelfth International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2013.

    [3] Nolan Bard, Michael Johanson, Michael Bowling.  Asymmetric Abstractions for Adversarial Settings.  In Proceedings of the Thirteenth International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS), 2014.

    [4] Michael Bowling, Michael Johanson, Neil Burch, and Duane Szafron. Strategy Evaluation in Extensive Games with Importance Sampling. In Proceedings of the 25th Annual International Conference on Machine Learning (ICML), 2008.

    [5] Michael Johanson and Michael Bowling. Data Biased Robust Counter Strategies. In Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics (AISTATS), 2009.

    [6] Michael Johanson, Neil Burch, Richard Valenzano, and Michael Bowling. Evaluating state-space abstractions in extensive-form games. In Proceedings of the Twelfth International Conference on Autonomous Agents and Multiagent Systems (AAMAS), pages 271–278, 2013.

    [7] Kevin Waugh, Martin Zinkevich, Michael Johanson, Morgan Kan, David Schnizlein, and Michael Bowling. A Practical Use of Imperfect Recall. In Proceedings of the Eighth Symposium on Abstraction, Reformulation and Approximation (SARA), 2009.

    [8] Martin Zinkevich, Michael Johanson, Michael Bowling, and Carmelo Piccione. Regret minimization in games with incomplete information. In Advances in Neural Information Processing Systems 20 (NIPS), 2007.

Lucifer

  • Team Name: PokerCPT
  • Team Members: Luis Filipe Teofilo
  • Affiliation: University of Porto, Artificial Intelligence and Computer Science Laboratory
  • Location: Porto, Portugal
  • Dynamic Agent
  • Technique: The base agent's strategies are Nash equilibrium (NE) approximations. Several NE strategies were computed, and the agent switches between them to make opponent modelling difficult (especially on Kuhn3P). To compute the NE strategies, an implementation of CFR was used. This implementation greatly reduces the game tree by removing decisions at chance nodes where the agent knows that it has a very high or very low probability of winning. For multiplayer poker, the CFR implementation abstracts game sequences. The methodology for grouping card buckets was based on grouping buckets by their utility in smaller games. As for no-limit, the actions were also abstracted into 4 possible decisions.

PokerStar

  • Team Name: PokerStar
  • Team Members: Ingo Kaps
  • Location: Frankfurt, Hessen, Germany
  • Dynamic Agent
  • Technique: The PokerStar Bot is written in Pascal.

    Preflop, the PokerStar Bot plays 2.5% fold, 95% call, 2.5% raise,
    so that other bots cannot easily model it.

    After the preflop, the PokerStar Bot calculates an opponent-based squared weighted hand strength
    and uses an optimized static-bucket CFR table.
    If the opponent's raise ratio is too low, a rule-based strategy is used.
    If the opponent's raise ratio is very low, a Selby preflop strategy is additionally used.
    If the opponent's raise ratio is too high, the PokerStar Bot will always call when the opponent
    raises, and will raise if the opponent checks.

ProPokerTools

  • Team Name: ProPokerTools
  • Team Members: Dan Hutchings
  • Affiliation: ProPokerTools
  • Location: Lakewood, Colorado, US
  • Non-dynamic Agent
  • Technique: This HULHE agent was created using established methods: regret minimization,
    partial recall, etc.

    Last year, I gave myself a constraint in building my AI agents: all agents were
    created on a single machine that cost less than $1,000. This year, I have loosened that constraint to allow $1,000 worth of rented compute time in the 'cloud'.

    This year's entry has been improved in three different areas: size (9 times larger), build time (9 times longer), and abstraction quality. Tests using the 2013 benchmark server show this agent would likely have placed third or fourth in the instant run-off competition if it were entered last year. Additional improvements have been developed but were not ready in time for this year's competition.

Slugathorus

  • Team Name: Slugathorus
  • Team Members: Daniel Berger
  • Affiliation: University of South Wales
  • Dynamic Agent
  • Technique: The agent combines a precomputed approximate Nash equilibrium strategy generated by public chance sampled MCCFR with a new modelling technique designed to identify when the opponent is making mistakes and exploit them.

    "Modelling Player Weakness in Poker". (Berger, 2013)
    "Efficient Nash Equilibrium Approximation through Monte Carlo Counterfactual Regret Minimization". (Johanson, 2012)
    "Regret Minimization in Games With Incomplete Information". (Zinkevich, 2007)

SmooCT

  • Team Name: SmooCT
  • Team Members: Johannes Heinrich
  • Affiliation: University College London
  • Non-dynamic Agent
  • Technique: SmooCT was trained from self-play Monte-Carlo tree search, using Smooth UCT [2]. The resulting strategy aims to approximate a Nash equilibrium. The agent uses an imperfect recall abstraction [1] based on an equidistant discretisation of expected hand strength squared values. The abstraction uses 169, 2000, 1000 and 600 buckets for the four betting rounds respectively.
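
    A minimal sketch of bucketing by expected hand strength squared (E[HS^2]) with an equidistant discretisation, as described above. The hand-strength samples are toy inputs; computing them by rolling out future board cards is omitted.

    def ehs2_bucket(future_hand_strengths, num_buckets):
        """Map a hand to a bucket index in [0, num_buckets) by its E[HS^2]."""
        ehs2 = sum(hs * hs for hs in future_hand_strengths) / len(future_hand_strengths)
        # Equidistant bins over [0, 1]; clamp the value 1.0 into the last bucket.
        return min(int(ehs2 * num_buckets), num_buckets - 1)

    # A drawing hand (strength ends up very high or very low) gets a higher E[HS^2]
    # than a hand with the same mean strength but no variance.
    draw = [0.1, 0.1, 0.9, 0.9]      # E[HS] = 0.5, E[HS^2] = 0.41
    static = [0.5, 0.5, 0.5, 0.5]    # E[HS] = 0.5, E[HS^2] = 0.25
    print(ehs2_bucket(draw, 1000), ehs2_bucket(static, 1000))  # 410 250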

    [1] Kevin Waugh, Martin Zinkevich, Michael Johanson, Morgan Kan, David Schnizlein, and Michael Bowling. "A Practical Use of Imperfect Recall". Proceedings of the Eighth Symposium on Abstraction, Reformulation and Approximation (SARA), 2009.
    [2] Johannes Heinrich and David Silver. "Self-Play Monte-Carlo Tree Search in Computer Poker". To appear in 2014.

Terrible Terrance

  • Team Members: Jon Parker
  • Affiliation: Georgetown University and Johns Hopkins University
  • Non-dynamic Agent
  • Technique: The solution was built using sparse dynamic programming.  The "dynamic programming" portion of the solution is due to reusing precomputed "PayoffMatrix" objects and "StrategyBlock" objects.  Saving the PayoffMatrices allows the bot to accurately estimate upstream payoffs with relatively little computation.  Saving the StrategyBlock objects allows "new" Blocks to be initialized with solutions that already reflect some CFR-style iterations.  No board abstraction is used because the "best" StrategyBlock to use for a betting round is found by looking up the nearest StrategyBlock given the relevant BettingSequence and BoardSequence.

    My bot folds very few hands preflop (when compared against winners from prior HULHE competitions) and it also open calls a decent fraction of the time.  I am somewhat concerned that the open calling behavior is indicative of an error in the bot somewhere.  However, I have searched very hard for an error and haven't found one.  Moreover, the overall "game value" I computed is almost identical to the "game value" Eric Jackson (winner from 2012) found (I asked him in an email about this).

Heads-up No-limit Texas Hold'em

Feste

  • Team Name: Feste
  • Team Members: Francois Pays
  • Affiliation: Independent
  • Location: Paris, France
  • Non-dynamic Agent
  • Technique: The card abstraction uses respectively 169, 400, 50 and 25 buckets for preflop, flop, turn and river. The buckets are computed using k-means clustering over selected hand parameters such as expected values, standard deviation and skewness at last round. The betting abstraction is quite coarse: half pot (only at first bet), pot, quad-pot and all-in.

    The abstraction is represented using sequence form with the imperfect recall extension. The resulting abstraction has only 300 million game states. It is solved using a custom interior point solver with indirect algebra [1]. The solver runs on a mid-range workstation and is GPU-accelerated with CUDA. The solver has been tested up to 3 billion game states and can therefore handle abstractions ten times larger, but interestingly, either finer card or betting abstractions did not result in stronger no-limit players.

    Since Feste is not yet able to gather accurate enough information from its opponent in 3000-hand games, there is no dynamic adaptation. The instant runoff player follows a defensive strategy and the total-bankroll player, a slightly more aggressive one.


    [1] Francois Pays. 2014. An Interior Point Approach to Large Games of Incomplete Information. Proceedings of the AAAI-2014 Workshop on Computer Poker.

HibiscusBiscuit

  • Team Name: Cleverpiggy
  • Team Members: Allen Cunningham
  • Affiliation: Independent
  • Location: Marina del Rey, CA, US
  • Non-dynamic Agent
  • Technique: Hibiscus Biscuit consists of separately trained big blind and button strategies with knowledge of different bet sizes.  In each case the hero uses only a couple of sizes but defends against many.  Both sides use a card abstraction of 169, 20000, 18000 and 17577 buckets.  These consist of board card distinctions, earth mover clustering for the flop and turn, and clustering over (wins, ties) for the river.


    Regret Minimization in Games with Incomplete Information 2007
    Evaluating State-Space Abstractions in Extensive-Form Games 2013

Hyperborean (instant run-off)

  • Team Name: University of Alberta
  • Team Members: Michael Bowling, Duane Szafron, Rob Holte, Nolan Bard, Neil Burch, Richard Gibson, John Hawkin, Michael Johanson, Trevor Davis, Josh Davidson, Dustin Morrill
  • Affiliation: University of Alberta
  • Location: Edmonton, Alberta, Canada
  • Non-dynamic Agent
  • Technique: Hyperborean2014-NL-IRO is a Nash equilibrium approximation trained using PureCFR [1, Section 5.5],
    a recent CFR variant developed by Oskari Tammelin.  Its card abstraction is symmetric
    and uses imperfect recall, with 169 (perfect) preflop buckets, 18630 flop buckets,
    3700 turn buckets and 1175 river buckets, using the k-means Earthmover and k-means OCHS
    buckets recently presented by Johanson et al [2].  The betting abstraction is asymmetric,
    and has different bet sizes and limits for the opponent and the agent.  The opponent is anticipated
    to have a large number of bets, including min-bets or tenth-pot bets, with higher limits
    on how many bets of each type can be made in each round than the agent.  The agent has a similar set of bets,
    including 0.1-pot and 0.25-pot but not including min-bets, with bet sizes 1.5-pot and less being
    restricted to its first two actions.  This asymmetric betting abstraction gives the agent the ability to
    interpret many actions of the opponent in order to limit the impact of translation errors, while still having
    a few unusual bet sizes (0.1, 0.25, 0.65) that may cause translation errors in the opponents.

    Since this agent is asymmetric, computing its strategy required solving two abstract games.
    The game with the opponent in seat 1 had 9,765,306,248 information sets and 26,879,972,986 infoset-actions,
    and the game with the opponent in seat 2 had 10,986,105,934 information sets and 30,285,810,764 infoset-actions.
    While the average strategy is the component of CFR that converges to an equilibrium, for
    this set of strategies we only ran 80 billion and 84 billion iterations of PureCFR respectively, and
    for this size of game we anticipated improvement up to 300 billion or more iterations.  Instead of
    the average strategy, this agent uses the current strategy which does not converge to equilibrium,
    but has been demonstrated by Gibson to improve much more quickly in in-game performance [1, Section 4.4.3].


    [1] Richard Gibson. Regret Minimization in Games and the Development of Champion Multiplayer Computer Poker-Playing Agents. PhD Thesis. University of Alberta, 2013.
    [2] Michael Johanson, Neil Burch, Richard Valenzano, and Michael Bowling. Evaluating state-space abstractions in extensive-form games. In Proceedings of the Twelfth International Conference on Autonomous Agents and Multiagent Systems (AAMAS), pages 271–278, 2013.

Hyperborean (total bankroll)

  • Team Name: University of Alberta
  • Team Members: Michael Bowling, Duane Szafron, Rob Holte, Nolan Bard, Neil Burch, Richard Gibson, John Hawkin, Michael Johanson, Trevor Davis, Josh Davidson, Dustin Morrill
  • Affiliation: University of Alberta
  • Location: Edmonton, Alberta, Canada
  • Dynamic Agent
  • Technique: Hyperborean2014-2pn-TBR is an implicit modelling agent [2] consisting of three abstract strategies. All strategies were generated using the Counterfactual Regret Minimization (CFR) algorithm [10] with imperfect recall abstractions [9]. We also abstract the raise action to a number of bets relative to the pot size. All strategies make raises equal to 0.5, 0.75, 1, 1.5, 3, 6, 11, 20, or 40 times the pot size, or go all-in. The portfolio of strategies for the agent consists of:

    1) A Nash equilibrium approximation

    To create our abstract game for the strategy, we first partitioned the betting sequences into two parts: an "important" part, and an "unimportant" part. Importance was determined according to the frequency with which one of our preliminary 2-player no-limit programs was faced with a decision at that betting sequence in self-play, as well as according to the number of chips in the pot. Then, we employed two different granularities of abstraction, one for each part of this partition. The unimportant part used 169, 3700, 3700, and 3700 buckets per betting round respectively, while the important part used 169, 180,000, 1,530,000, and 1,680,000 buckets per betting round respectively. Buckets were calculated according to public card textures and the k-means Earthmover and k-means OCHS buckets recently presented by Johanson et al [8].  By forgetting previous card information and rebucketing on every round [9], this yields an imperfect recall abstract game. The strategy profile of this abstract game was computed from approximately 498 billion iterations of PureCFR [5, Section 5.5], a recent CFR variant developed by Oskari Tammelin.  This type of strategy is also known as a "dynamic expert strategy" [6].

    2) A data biased response to aggregate data of 2011 and 2012 ACPC competitors

    One exploitive response in the portfolio was created using data biased robust counter strategies [7] to aggregate data from all of the agents in the 2011 and 2012 heads-up no-limit ACPC events. It uses the same betting abstraction as the above Nash equilibrium approximation, but the card abstraction consists of 169, 9000, 9000, and 3700 k-means Earthmover and k-means OCHS buckets per betting round uniformly across the game tree.  An asymmetric abstraction is used for the frequentist model used by the data biased response [3].  The model's abstraction ignores card information and only models agents on their abstract betting.

    3) A data biased response to aggregate data of some 2013 ACPC competitors

    The second exploitive response uses the same abstract game as the previous DBR, but only uses data from agents that weren't beaten by the 2013 Hyperborean TBR entry for at least 750 mbb/g.

    A mixture of these agents is dynamically generated using a slightly modified Exp4-like algorithm [1] where the reward vector for the experts/strategies is computed using importance sampling over the individual strategies [4].


    [1] P Auer, N Cesa-Bianchi, Y Freund, and R.E Schapire. Gambling in a rigged casino: The adversarial multi-armed bandit problem. Proceedings of the 36th Annual Symposium on Foundations of Computer Science, 1995.

    [2] Nolan Bard, Michael Johanson, Neil Burch, Michael Bowling. Online Implicit Agent Modelling. In Proceedings of the Twelfth International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2013.

    [3] Nolan Bard, Michael Johanson, Michael Bowling.  Asymmetric Abstractions for Adversarial Settings.  In Proceedings of the Thirteenth International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS), 2014.

    [4] Michael Bowling, Michael Johanson, Neil Burch, and Duane Szafron. Strategy Evaluation in Extensive Games with Importance Sampling. In Proceedings of the 25th Annual International Conference on Machine Learning (ICML), 2008.

    [5] Richard Gibson. Regret Minimization in Games and the Development of Champion Multiplayer Computer Poker-Playing Agents. PhD Thesis. University of Alberta, 2013.

    [6] Richard Gibson and Duane Szafron.  On Strategy Stitching in Large Extensive Form Multiplayer Games.  In Proceedings of the Twenty-Fifth Conference on Neural Information Processing Systems (NIPS), 2011.

    [7] Michael Johanson and Michael Bowling. Data Biased Robust Counter Strategies. In Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics (AISTATS), 2009.

    [8] Michael Johanson, Neil Burch, Richard Valenzano, and Michael Bowling. Evaluating state-space abstractions in extensive-form games. In Proceedings of the Twelfth International Conference on Autonomous Agents and Multiagent Systems (AAMAS), pages 271–278, 2013.

    [9] Kevin Waugh, Martin Zinkevich, Michael Johanson, Morgan Kan, David Schnizlein, and Michael Bowling. A Practical Use of Imperfect Recall. In Proceedings of the Eighth Symposium on Abstraction, Reformulation and Approximation (SARA), 2009.

    [10] Martin Zinkevich, Michael Johanson, Michael Bowling, and Carmelo Piccione. Regret minimization in games with incomplete information. In Advances in Neural Information Processing Systems 20 (NIPS), 2007.

ArizonaStu (KEmpfer)

  • Team Name: KEmpfer
  • Team Members: Eneldo Loza Mencia, Julian Prommer
  • Affiliation: Knowledge Engineering Group - Technische Universität Darmstadt
  • Location: Darmstadt, Germany
  • Non-dynamic Agent
  • Technique: The agent implements a list of expert rules and follows these. Additional opponent statistics can be collected and be used in the rules, but we currently do not make use of this option. The backup strategy if no expert rule is found is to play according to the all-in equity and pot-odds.
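
    A minimal sketch of the all-in-equity/pot-odds fallback described above: call when the estimated equity exceeds the price the pot is offering. The numbers and the fold/call-only action set are illustrative.

    def pot_odds(call_amount, pot_size):
        """Fraction of the final pot the caller has to contribute."""
        return call_amount / (pot_size + call_amount)

    def fallback_action(allin_equity, call_amount, pot_size):
        """Call if equity beats the pot odds, otherwise fold (raises are ignored here)."""
        return "call" if allin_equity >= pot_odds(call_amount, pot_size) else "fold"

    # Facing a 500-chip bet into a 1000-chip pot we need 500/1500 = 33.3% equity to call.
    print(fallback_action(allin_equity=0.40, call_amount=500, pot_size=1000))  # call
    print(fallback_action(allin_equity=0.25, call_amount=500, pot_size=1000))  # fold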

Little Rock

  • Team Name: Little Rock
  • Team Members: Rod Byrnes
  • Affiliation: Independent
  • Location: Goonellabah, NSW, Australia
  • Non-dynamic Agent
  • Technique: External sampling MCCFR approach, virtually the same as last year but with a few enhancements to the card abstraction and action abstraction techniques.


    Monte Carlo Sampling for Regret Minimization in Extensive Games. Marc Lanctot, Kevin Waugh, Martin Zinkevich, and Michael Bowling. In Advances in Neural Information Processing Systems 22 (NIPS), pp. 1078-1086, 2009.

Lucifer

  • Team Name: PokerCPT
  • Team Members: Luis Filipe Teofilo
  • Affiliation: University of Porto, Artificial Intelligence and Computer Science Laboratory
  • Location: Porto, Portugal
  • Dynamic Agent
  • Technique: The base agent's strategies are Nash equilibrium (NE) approximations. Several NE strategies were computed, and the agent switches between them to make opponent modelling difficult (especially on Kuhn3P). To compute the NE strategies, an implementation of CFR was used. This implementation greatly reduces the game tree by removing decisions at chance nodes where the agent knows that it has a very high or very low probability of winning. For multiplayer poker, the CFR implementation abstracts game sequences. The methodology for grouping card buckets was based on grouping buckets by their utility in smaller games. As for no-limit, the actions were also abstracted into 4 possible decisions.

Nyx

  • Team Name: Nyx
  • Team Members: Martin Schmid
  • Affiliation: Charles University
  • Location: Prague, Czech Republic
  • Non-dynamic Agent
  • Technique: Improved version of the previous Nyx, with a better action abstraction and a new automatic public card abstraction.

PijaiBot

  • Team Name: PijaiBot
  • Team Members: Ryan Pijai
  • Affiliation: Independent
  • Location:  Orlando, FL, USA
  • Non-dynamic Agent
  • Technique: PijaiBot is an Imperfect Recall, Approximate Nash Equilibrium agent with novel approaches to card-clustering, bet-translating, and opponent-trapping.  All non-isomorphically-similar card situations are grouped together using K-Means Clustering with Bhattacharyya Distance of Expected Hand-Strengths as the distance measure rather than more traditional distance measures.  When opponent bet sizes do not match any of the sizes in PijaiBot's betting abstraction, PijaiBot interprets bet sizes using Soft Translation of Geometric Similarity based on pot-odds, rather than on pot-relative or stack-relative bet sizes as has been done in the past.

    PijaiBot has special-case logic for translating small bets that do not match any of its abstraction bet sizes into checks and calls, and is able to patch up its internal abstract betting history as needed to make those translations valid action sequences.  PijaiBot attempts to exploit other agents that do not handle this and other types of similarly tricky situations properly by occasionally overriding PijaiBot's own Nash Equilibrium strategy suggestions with potentially more damaging actions that test and confuse its opponents by inducing them into misrepresenting the true state of the game.
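
    A hedged illustration of soft bet translation: an observed bet that falls between two abstraction sizes is mapped probabilistically to both neighbours. The ratio-based (log-space) similarity below is a stand-in, not PijaiBot's exact pot-odds-based measure, and the bet sizes are invented.

    import math
    import random

    def soft_translate(observed, abstraction_sizes):
        """Return (size, probability) pairs mapping an observed bet into the abstraction."""
        sizes = sorted(abstraction_sizes)
        if observed <= sizes[0]:
            return [(sizes[0], 1.0)]
        if observed >= sizes[-1]:
            return [(sizes[-1], 1.0)]
        hi = next(i for i, s in enumerate(sizes) if s >= observed)
        a, b = sizes[hi - 1], sizes[hi]
        # Geometric (ratio-based) similarity: closer in log space means higher probability.
        w_a = math.log(b / observed)
        w_b = math.log(observed / a)
        total = w_a + w_b
        return [(a, w_a / total), (b, w_b / total)]

    # The abstraction knows half-pot, pot and 2x-pot bets; the opponent bets 0.7 pot.
    mapping = soft_translate(0.7, [0.5, 1.0, 2.0])
    choice = random.choices([s for s, _ in mapping], weights=[p for _, p in mapping], k=1)[0]
    print(mapping, choice)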

Prelude

  • Team Name: Prelude
  • Team Members: Tim Reiff
  • Affiliation: Unfold Poker
  • Location:  Las Vegas, NV, USA
  • Non-dynamic Agent
  • Technique: Prelude is an equilibrium strategy that implements several published techniques, including the training algorithm Pure CFR, the opponent bet translation method, and a card abstraction based on k-means clustering over hand strength distributions.  I had hoped to test an agent with some ambitious importance sampling, but that one is converging slowly...  Thus, I whipped up a more conservative entry, with a streamlined betting tree partly designed for faster training.  I snuck in a few minor experiments still, including modified EV histograms and OCHS categories for the card abstraction, selective use of purification thresholding, and some speculative bet sizing adjustments.
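
    A hedged sketch of purification thresholding as mentioned above: action probabilities below a threshold are zeroed and the rest renormalised (full purification would keep only the most probable action). The threshold value is arbitrary, not Prelude's.

    def threshold_strategy(probs, threshold=0.15):
        kept = [p if p >= threshold else 0.0 for p in probs]
        total = sum(kept)
        if total == 0.0:                        # everything fell below the threshold:
            kept = [0.0] * len(probs)           # fall back to the single most likely action
            kept[probs.index(max(probs))] = 1.0
            return kept
        return [p / total for p in kept]

    print(threshold_strategy([0.05, 0.60, 0.35]))  # [0.0, 0.631..., 0.368...]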


    1. Sam Ganzfried and Tuomas Sandholm. "Tartanian5: A Heads-Up No-Limit Texas Hold'em Poker-Playing Program". In Computer Poker Symposium at the National Conference on Artificial Intelligence (AAAI), 2012.
    2. Richard Gibson. "Regret Minimization in Games and the Development of Champion Multiplayer Computer Poker-Playing Agents". PhD thesis, University of Alberta, 2014.
    3. Greg Hamerly. "Making k-means even faster".  In proceedings of the 2010 SIAM international conference on data mining (SDM 2010), April 2010.
    4. Eric Jackson. "Slumbot NL: Solving Large Games with Counterfactual Regret Minimization Using Sampling and Distributed Processing". In Computer Poker Workshop on Artificial Intelligence (AAAI), 2013.
    5. Michael Johanson, Neil Burch, Richard Valenzano, and Michael Bowling. "Evaluating State-Space Abstractions in Extensive-Form Games". In Proceedings of the Twelfth International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2013.
    6. Martin Zinkevich, Michael Johanson, Michael Bowling, and Carmelo Piccione. "Regret Minimization in Games with Incomplete Information". In Proceedings of Advances in Neural Information Processing Systems (NIPS), 2007.

SartreNLExp

  • Team Name: Sartre
  • Team Members: Kevin Norris, Jonathan Rubin, Ian Watson
  • Affiliation: The University of Auckland
  • Location: Auckland, New Zealand
  • Dynamic Agent
  • Technique: SartreNLExp combines an approximate Nash equilibrium strategy with exploitation capabilities. It plays a base approximate Nash equilibrium strategy that was created by imitating the play [1] of the 2013 Slumbot agent [2]. SartreNLExp also incorporates a statistical exploitation module that models the opponent online and identifies exploitable statistical anomalies in the opponent's play. When a game state arises where the statistical exploitation module is able to exploit one of the opponent's statistical anomalies, it overrides the base strategy and provides an exploitive action. Together the base strategy and statistical exploitation module provide safe opponent exploitation, given that the opponent model is an accurate reflection of the opponent's action frequencies. The agent has been improved from its previous iteration, presented in [3]. The exploitation capabilities of the statistical exploitation module have been greatly improved and the opponent model has been entirely overhauled. Additionally, a novel decaying history method, statistic-specific decaying history, has been implemented to ensure the opponent model is able to accurately reflect the frequency statistics of both static and dynamic opponents.
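
    A hedged sketch of a decaying frequency statistic of the kind such an opponent model might maintain: recent observations count more than old ones, so the model can track both static and dynamic opponents. The decay constant and continuation-bet example are illustrative, not SartreNLExp's actual values.

    class DecayingFrequency:
        def __init__(self, decay=0.95):
            self.decay = decay
            self.numerator = 0.0     # decayed count of "event happened"
            self.denominator = 0.0   # decayed count of opportunities

        def observe(self, happened):
            self.numerator = self.numerator * self.decay + (1.0 if happened else 0.0)
            self.denominator = self.denominator * self.decay + 1.0

        def frequency(self):
            return self.numerator / self.denominator if self.denominator else 0.0

    # An opponent whose flop continuation-bet frequency drifts from 100% to 0%.
    cbet = DecayingFrequency(decay=0.95)
    for _ in range(100):
        cbet.observe(True)
    for _ in range(50):
        cbet.observe(False)
    print(round(cbet.frequency(), 3))  # dominated by the recent misses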

    1. Rubin, J., & Watson, I. (2011). Successful performance via decision generalisation in no limit Texas Hold’em. In Case-Based Reasoning Research and Development (pp. 467-481). Springer Berlin Heidelberg.
    2. Jackson, E. (2013, June). Slumbot NL: Solving Large Games with Counterfactual Regret Minimization Using Sampling and Distributed Processing. In Workshops at the Twenty-Seventh AAAI Conference on Artificial Intelligence.
    3. Norris, K., & Watson, I. (2013, August). A statistical exploitation module for Texas Hold'em: And it's benefits when used with an approximate nash equilibrium strategy. In Computational Intelligence in Games (CIG), 2013 IEEE Conference on (pp. 1-8). IEEE.

Slumbot

  • Team Name: Slumbot
  • Team Members: Eric Jackson
  • Affiliation: Independent
  • Non-dynamic Agent
  • Technique: I use Pure External CFR to compute an approximate equilibrium.  A decomposition technique described in my forthcoming workshop paper allows me to break the
    game tree into pieces that can be solved independently.  I employ an abstraction that uses more granularity (both more bet sizes and more buckets) at more commonly reached game states.


    "A Time and Space Efficient Algorithm for Approximately Solving Large Imperfect Information Games"; Eric Jackson; 2014; forthcoming in the Proceedings of the Workshop on Computer Poker and Imperfect Information at AAAI-14.

Tartanian7

  • Team Name: Tartanian
  • Team Members: Noam Brown, Sam Ganzfried, Tuomas Sandholm
  • Affiliation: Carnegie Mellon University
  • Location: Pittsburgh, PA, USA
  • Non-dynamic Agent
  • Technique:

    Tartanian7 plays an approximate Nash equilibrium strategy that was computed on Pittsburgh's shared-memory supercomputer, which has a cache coherent Non-Uniform Memory Access (ccNUMA) architecture. We developed a new abstraction algorithm and a new equilibrium-finding algorithm that enabled us to perform a massive equilibrium computation on this architecture.

    The abstraction algorithm first clusters public flop boards, assigning each cluster to a blade on the supercomputer. These public clusters are computed by clustering using a distance function based on how often our abstraction from last year grouped hands together on the flop with different sets of public cards. Within each cluster, the algorithm then buckets the flop, turn, and river hands that are possible given one of the public flops in the cluster, using an imperfect-recall abstraction algorithm. We did not perform any abstraction for the preflop round.

    Our equilibrium-finding algorithm is a modified version of external-sampling MCCFR. It samples one pair of preflop hands per iteration. For the postflop, each blade samples community cards from its public cluster and performs MCCFR in parallel. The samples are weighted to remove bias.

    Our agent also uses a novel reverse mapping technique that compensates for the failure of CFR to fully converge and for the possibility that the strategies overfit the abstraction.

3-player Limit Texas Hold'em

Hyperborean (instant run-off)

  • Team Name: University of Alberta
  • Team Members: Michael Bowling, Duane Szafron, Rob Holte, Nolan Bard, Neil Burch, Richard Gibson, John Hawkin, Michael Johanson, Trevor Davis, Josh Davidson, Dustin Morrill
  • Affiliation: University of Alberta
  • Location: Edmonton, Alberta, Canada
  • Non-dynamic Agent
  • Technique: (NOTE: This agent is the same as the 2013 ACPC's 3-player instant run-off Hyperborean entry.)

    Hyperborean2014-3pl-IRO is a Nash equilibrium approximation trained using
    PureCFR [1, Section 5.5], a recent CFR variant developed by Oskari Tammelin.
    Because 3-player hold'em is too large a game to apply CFR techniques directly,
    we employed an abstract game that merges card deals into "buckets" to create a
    game of manageable size.

    To create our abstract game, we first partitioned the betting sequences into two parts: an "important" part, and an "unimportant" part. Importance was determined according to the frequency with which our 3-player programs from the 2011 and 2012 ACPCs were faced with a decision at that betting sequence, as well as according to the number of chips in the pot. Then, we employed two different granularities of abstraction, one for each part of this partition. The unimportant part used 169, 180,000, 18,630, and 875 buckets per betting round respectively, while the important part used 169, 1,348,620, 1,530,000, and 2,800,000 buckets per betting round respectively. Buckets were calculated according to public card textures and k-means clustering over hand strength distributions [3] and yielded an imperfect recall abstract game by forgetting previous card information and rebucketing on every round [4]. The agent plays the "current strategy profile" computed from approximately 303.6 billion iterations of the PureCFR variant of CFR [1] applied to this abstract game. This type of strategy is also known as a "dynamic expert strategy" [2].


    [1] Richard Gibson. Regret Minimization in Games and the Development of Champion Multiplayer Computer Poker-Playing Agents. PhD Thesis. University of Alberta, 2013.

    [2] Richard Gibson and Duane Szafron.  On Strategy Stitching in Large Extensive Form Multiplayer Games.  In Proceedings of the Twenty-Fifth Conference on Neural Information Processing Systems (NIPS), 2011.

    [3] Michael Johanson, Neil Burch, Richard Valenzano, and Michael Bowling. Evaluating state-space abstractions in extensive-form games. In Proceedings of the Twelfth International Conference on Autonomous Agents and Multiagent Systems (AAMAS), pages 271–278, 2013.

    [4] Kevin Waugh, Martin Zinkevich, Michael Johanson, Morgan Kan, David Schnizlein, and Michael Bowling. A Practical Use of Imperfect Recall.  In Proceedings of the Eighth Symposium on Abstraction, Reformulation and Approximation (SARA), 2009.

Hyperborean (total bankroll)

  • Team Name: University of Alberta
  • Team Members: Michael Bowling, Duane Szafron, Rob Holte, Nolan Bard, Neil Burch, Richard Gibson, John Hawkin, Michael Johanson, Trevor Davis, Josh Davidson, Dustin Morrill
  • Affiliation: University of Alberta
  • Location: Edmonton, Alberta, Canada
  • Non-dynamic Agent
  • Technique: (NOTE: This agent is the same as the 2013 ACPC's 3-player total bankroll Hyperborean entry.)

    Hyperborean2014-3pl-TBR is a data biased response to aggregate data of ACPC competitors from the 2011 and 2012 3-player limit competitions [2]. The strategy was generated using the Counterfactual Regret Minimization (CFR) algorithm [6].  Asymmetric abstractions were used for the regret minimizing part of each player's strategy, and the frequentist model used by data biased response [1].  Each abstraction uses imperfect recall, forgetting previous card information and rebucketing on every round [5], with the k-means Earthmover and k-means OCHS buckets recently presented by Johanson et al [3]. The agent's strategy uses an abstraction with 169, 10000, 5450, and 500 buckets on each round of the game, respectively.  The model of prior ACPC competitors groups observations from all competitors into a model using 169, 900, 100, and 25 buckets on each round of the game, respectively.  The agent plays the "current strategy profile" generated after 20 billion iterations of external sampled CFR [4].



    [1] Nolan Bard, Michael Johanson, Michael Bowling.  Asymmetric Abstractions for Adversarial Settings.  In Proceedings of the Thirteenth International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS), 2014.

    [2] Michael Johanson and Michael Bowling. Data Biased Robust Counter Strategies. In Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics (AISTATS), 2009.

    [3] Michael Johanson, Neil Burch, Richard Valenzano, and Michael Bowling. Evaluating State-Space Abstractions in Extensive-Form Games. In Proceedings of the Twelfth International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2013.

    [4] Marc Lanctot, Kevin Waugh, Martin Zinkevich, and Michael Bowling. Monte Carlo Sampling for Regret Minimization in Extensive Games. In Proceedings of the Twenty-Third Conference on Neural Information Processing Systems (NIPS), 2009.

    [5] Kevin Waugh, Martin Zinkevich, Michael Johanson, Morgan Kan, David Schnizlein, and Michael Bowling. A Practical Use of Imperfect Recall. In Proceedings of the Eighth Symposium on Abstraction, Reformulation and Approximation (SARA), 2009.

    [6] Martin Zinkevich, Michael Johanson, Michael Bowling, and Carmelo Piccione. Regret minimization in games with incomplete information. In Advances in Neural Information Processing Systems 20 (NIPS), 2007.

Learn2KEmpf (KEmpfer)

  • Team Name: KEmpfer
  • Team Members: Eneldo Loza Mencia, Julian Prommer
  • Affiliation: Knowledge Engineering Group - Technische Universität Darmstadt
  • Location: Darmstadt, Germany
  • Non-dynamic Agent
  • Technique: This agent tries to mimic the behaviour of a given poker agent. Hence, it follows a similar strategy to Sartre from previous years, with two differences. Firstly, in contrast to Sartre, which uses case-based reasoning (basically k-nearest neighbors), we allow any learning algorithm to be used. In this particular submission, we used C4.5 to induce a model of a poker agent (more specifically, Weka's implementation J48). Secondly, a much more complete representation of a state is used, with up to 50 possible features. We even induce features which are convenient for modelling the opponent modelling used by the agent being imitated.
    For this year's submission, we learned the behaviour of Hyperborean from the logs of the 2013 three player limit competition. Hence, since Hyperborean uses a CFR strategy, we expect our bot to behave accordingly. However, it is not possible to perfectly replicate the behaviour of a bot (at least with the available data). Hence, we expect our agent to perform worse than a respective opponent using CFR in this year's competition.
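
    A hedged sketch of the imitation setup described above, with scikit-learn's decision tree standing in for Weka's J48/C4.5: logged game states become feature vectors, the observed action is the label, and the tree is fit to mimic the imitated agent. The features, values and labels below are invented.

    from sklearn.tree import DecisionTreeClassifier

    # Each row: [round, pot_size_bb, amount_to_call_bb, hand_equity, num_active_players]
    X = [
        [0, 3.0, 1.0, 0.62, 3],
        [0, 3.0, 1.0, 0.31, 3],
        [1, 8.0, 0.0, 0.55, 2],
        [1, 8.0, 4.0, 0.20, 2],
        [2, 16.0, 0.0, 0.80, 2],
    ]
    y = ["raise", "fold", "raise", "fold", "raise"]   # actions observed in the logs

    model = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X, y)
    print(model.predict([[2, 16.0, 8.0, 0.75, 2]]))   # imitated action for a new state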

    - Jonathan Rubin and Ian Watson. Case-Based Strategies in Computer Poker. AI Communications 25(1), pp. 19-48, 2012.
    - Jonathan Rubin and Ian Watson. Successful Performance via Decision Generalisation in No Limit Texas Hold'em. In Case-Based Reasoning Research and Development, Vol. 6880, pp. 467-481. Springer Berlin Heidelberg, 2011.
    - Theo Kischka. Trainieren eines Computer-Pokerspielers (Training a Computer Poker Player). Bachelor's Thesis, Technische Universität Darmstadt, Knowledge Engineering Group, 2014. http://www.ke.tu-darmstadt.de/lehre/arbeiten/bachelor/2014/Kischka_Theo.pdf

Lucifer

  • Team Name: PokerCPT
  • Team Members: Luis Filipe Teofilo
  • Affiliation: University of Porto, Artificial Intelligence and Computer Science Laboratory
  • Location: Porto, Portugal
  • Dynamic Agent
  • Technique: The base agent's strategies are Nash equilibrium (NE) approximations. Several NE strategies were computed, and the agent switches between them to make opponent modelling difficult (especially on Kuhn3P). To compute the NE strategies, an implementation of CFR was used. This implementation greatly reduces the game tree by removing decisions at chance nodes where the agent knows that it has a very high or very low probability of winning. For multiplayer poker, the CFR implementation abstracts game sequences. The methodology for grouping card buckets was based on grouping buckets by their utility in smaller games. As for no-limit, the actions were also abstracted into 4 possible decisions.

SmooCT

  • Team Name: SmooCT
  • Team Members: Johannes Heinrich
  • Affiliation: University College London
  • Location: London, UK
  • Non-dynamic Agent
  • Technique: SmooCT was trained from self-play Monte-Carlo tree search, using Smooth UCT [2]. The agent uses an imperfect recall abstraction [1] based on an equidistant discretisation of expected hand strength squared values. The abstraction uses 169 and 1000 buckets for the first two betting rounds. For the turn and river the abstraction granularity has been locally refined based on the number of visits to a node in self-play training. The numbers of turn and river buckets lie in [100,400] and [10,160] respectively.


    [1] Kevin Waugh, Martin Zinkevich, Michael Johanson, Morgan Kan, David Schnizlein, and Michael Bowling. "A Practical Use of Imperfect Recall". Proceedings of the Eighth Symposium on Abstraction, Reformulation and Approximation (SARA), 2009.
    [2] Johannes Heinrich and David Silver. "Self-Play Monte-Carlo Tree Search in Computer Poker". To appear in 2014.

3-player Kuhn Poker


Participants: 2012

The 2012 competition had 13 different agents in the heads-up limit Texas hold'em competition, 11 agents in the heads-up no-limit competition, and 5 agents in the 3-player limit competition. As in previous years, agents were submitted by a mixture of universities and individual hobbyists from 10 different countries around the world.

Competitors in the 2012 Annual Computer Poker Competition were not required to supply detailed information about their submission(s) in order to compete, but some information about team members, affiliation, location, high level technique descriptions, and occasionally relevant papers were supplied. This page presents that information.


Heads-up Limit Texas Hold'em

Entropy

  • Team Name: ERGOD
  • Team Leader: Ken Barry
  • Team Members: Ken Barry
  • Affiliation: ERGOD
  • Location: Athlone, Westmeath, Ireland
  • Technique:
  • Entropy is powered by "ExperienceEngine", an agent capable of acting intelligently in any indeterminate system. Development of ExperienceEngine is ongoing and its inner workings cannot be revealed at this time.

Feste

  • Team Name: Feste
  • Team Leader: François Pays
  • Team Members: François Pays
  • Affiliation: Independent
  • Location: Paris, France
  • Technique:
  • The 2-player limit game is modelled using the sequence form and solved as a min-max problem with a conventional interior-point method. The betting structure is kept intact with no loss of information, but card information states are aggregated into clusters depending on the betting round (flop, turn and river). The min-max problem is solved using a convex-concave variant of the log-barrier path-following interior-point method. The inner Newton system is a large sparse saddle-point system; using an ad hoc Krylov method along with preconditioning, the system is tractable on consumer hardware. As the solution is approached, the system becomes more and more ill-conditioned, so several techniques are used to stabilize the Krylov solver: dynamic precision control, variable elimination and regularization. The required accuracy is reached in about 250 iterations.

Huhuers

  • Team Name: Huhubot
  • Team Leader: Shawne Lo
  • Team Members: Shawne Lo, Wes Ren Tong
  • Affiliation: Independent
  • Location: Toronto, Canada
  • Technique:
    Case based reasoning through imitation of proven strong agents.

Hyperborean2p.iro

  • Team Name: University of Alberta
  • Team Leader: Michael Bowling
  • Team Members: Michael Bowling, Duane Szafron, Rob Holte, Chris Archibald, Michael Johanson, Nolan Bard, John Hawkin, Richard Gibson, Neil Burch, Parisa Mazrooei, Josh Davidson
  • Affiliation: University of Alberta
  • Location: Edmonton, Alberta, Canada
  • Technique:
    The 2-player instant run-off program is built using the Public Chance Sampling (PCS) [1] variant of Counterfactual Regret Minimization [2]. We solve a large abstract game, identical to Texas Hold'em in the preflop and flop. On the turn and river, we bucket the hands and public cards together, using approximately 1.5 million categories on the turn and 900 thousand categories on the river.
  • References and related papers:
    • Michael Johanson, Nolan Bard, Marc Lanctot, Richard Gibson, and Michael Bowling. "Efficient Nash Equilibrium Approximation through Monte Carlo Counterfactual Regret Minimization" In AAMAS 2012
    • Martin Zinkevich, Michael Johanson, Michael Bowling, and Carmelo Piccione. "Regret minimization in games with incomplete information" In NIPS 2008.

Hyperborean2p.tbr

  • Team Name: University of Alberta
  • Team Leader: Michael Bowling
  • Team Members: Michael Bowling, Duane Szafron, Rob Holte, Chris Archibald, Michael Johanson, Nolan Bard, John Hawkin, Richard Gibson, Neil Burch, Parisa Mazrooei, Josh Davidson
  • Affiliation: University of Alberta
  • Location: Edmonton, Alberta, Canada
  • Technique:
    Hyperborean-2012-2p-limit-tbr is an agent consisting of seven abstract strategies. All seven strategies were generated using the Counterfactual Regret Minimization (CFR) algorithm [1] with imperfect recall abstractions [3]. They are:

    • Two strategies in an imperfect recall abstraction using 57 million information sets that specifically counter opponents who always raise or always call.
    • An approximation of an equilibrium within a large imperfect recall abstraction that has 879,586,352 information sets, with an unabstracted, perfect recall preflop and flop.
    • Four strategies in the smaller (57 million information sets) abstraction that are responses to models of particular opponents seen in the 2010 or 2011 ACPC.

    During a match, the counterstrategies to always raise and always call will only be used if the opponent is detected to be always raise or always call. Otherwise, a mixture of the remaining five strategies is used. The mixture is generated using a slightly modified Hedge algorithm [4] where the reward vector for the experts/strategies is computed using importance sampling over the individual strategies [2].
  • References and related papers:
    • Martin Zinkevich, Michael Johanson, Michael Bowling, and Carmelo Piccione. "Regret minimization in games with incomplete information" In NIPS 2008.
    • Michael Bowling, Michael Johanson, Neil Burch, and Duane Szafron. "Strategy Evaluation in Extensive Games with Importance Sampling". In Proceedings of the 25th Annual International Conference on Machine Learning (ICML), 2008.
    • Kevin Waugh, Martin Zinkevich, Michael Johanson, Morgan Kan, David Schnizlein, and Michael Bowling. "A Practical Use of Imperfect Recall". Proceedings of the Eighth Symposium on Abstraction, Reformulation and Approximation (SARA), 2009.
    • P Auer, N Cesa-Bianchi, Y Freund, and R.E Schapire. "Gambling in a rigged casino: The adversarial multi-armed bandit problem". Proceedings of the 36th Annual Symposium on Foundations of Computer Science, 1995.

LittleAce

  • Team Name: LittleAce
  • Team Leader:
  • Team Members:
  • Affiliation:
  • Location:
  • Technique:

LittleRock

  • Team Name: LittleRock
  • Team Leader: Rod Byrnes
  • Team Members: Rod Byrnes
  • Affiliation: Independent
  • Location: Lismore, Australia
  • Technique:
    LittleRock uses an external sampling Monte Carlo CFR approach with imperfect recall. Additional RAM was available for training the agent entered into this year's competition, which allowed for a more fine-grained card abstraction, but the algorithm is otherwise largely unchanged. One last-minute addition this year is a no-limit agent.

    The no-limit agent has 4,491,849 information sets, the heads-up limit agent has 11,349,052 information sets and the limit 3-player agent has 47,574,530 information sets. In addition to card abstractions, the 3-player and no-limit agents also use a form of state abstraction to make the game size manageable.
  • References and related papers:
    • Monte Carlo Sampling for Regret Minimization in Extensive Games. Marc Lanctot, Kevin Waugh, Martin Zinkevich, and Michael Bowling. In Advances in Neural Information Processing Systems 22 (NIPS), pp. 1078-1086, 2009.

Neo Poker Bot

  • Team Name: Neo Poker Laboratory
  • Team Leader: Alexander Lee
  • Team Members: Alexander Lee
  • Affiliation: Independent
  • Location: Spain
  • Technique:
    Our range of computer players was developed to play against humans. The AI was trained on real-money hand history logs from top poker rooms. The AI logic employs different combinations of neural networks, regret minimization and gradient search equilibrium approximation, decision trees, and recursive search methods, as well as expert algorithms from top players in different games of poker. Our computer players have been tested against humans and demonstrated great results over 100 million hands. The AI was not optimized to play against computer players.

Patience

  • Team Name: Patience
  • Team Leader: Nick Grozny
  • Team Members: Nick Grozny
  • Affiliation: Independent
  • Location: Moscow, Russia.
  • Technique:
    Patience uses a static strategy built by the fictitious play algorithm.

 

Sartre

  • Team Name: Sartre
  • Team Leader: Jonathan Rubin
  • Team Members: Jonathan Rubin, Ian Watson
  • Affiliation: University of Auckland
  • Location: Auckland, New Zealand
  • Technique:
    Sartre uses a case-based approach to play Texas Hold'em. AAAI hand history data from multiple agents are encoded into distinct case-bases. When it is time for Sartre to make a betting decision, a case with the current game state information is created. Each individual case-base is then searched for similar scenarios, resulting in a collection of playing decisions. A final decision is made via ensemble voting (a minimal retrieval-and-voting sketch follows the references below).
  • References and related papers:
    • Jonathan Rubin and Ian Watson. Case-Based Strategies in Computer Poker, AI Communications, Volume 25, Number 1: 19-48, March 2012.
    • Jonathan Rubin and Ian Watson. (2011). On Combining Decisions from Multiple Expert Imitators for Performance. In IJCAI-11, Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence.
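
    A minimal sketch of the retrieval-and-voting idea; the feature encoding, case data, and distance measure below are illustrative placeholders, not Sartre's actual representation:

        # Nearest-neighbour retrieval from several case-bases, combined by majority vote.
        import numpy as np
        from collections import Counter

        def nearest_decision(case_features, case_decisions, query):
            dists = np.linalg.norm(case_features - query, axis=1)
            return case_decisions[int(np.argmin(dists))]

        rng = np.random.default_rng(2)
        case_bases = []
        for _ in range(3):                                    # one case-base per source agent
            feats = rng.random((50, 4))                       # e.g. hand strength, pot odds, ...
            decisions = rng.choice(["fold", "call", "raise"], size=50)
            case_bases.append((feats, decisions))

        query = np.array([0.8, 0.3, 0.5, 0.1])                # current game-state features
        votes = Counter(nearest_decision(f, d, query) for f, d in case_bases)
        print(votes.most_common(1)[0][0])                     # ensemble decision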

Slumbot

  • Team Name: Slumbot
  • Team Leader: Eric Jackson
  • Team Members: Eric Jackson
  • Affiliation: Independent
  • Location: Menlo Park, CA, USA
  • Technique:
    Slumbot employs the Public Chance Sampling variant of Counterfactual Regret Minimization. We use a large abstraction with 88 billion information sets. There is no abstraction on any street prior to the river. On the river there are about 4.7 million bins.

    As a consequence of the large abstraction size and our relatively modest compute environment, our system is disk-based: regrets and accumulated probabilities are written to disk on each iteration (a minimal disk-backed storage sketch follows the references below).
  • References and related papers:
    • [Johanson 2012] Efficient Nash Equilibrium Approximation through Monte Carlo Counterfactual Regret Minimization
    • [Johanson 2011] Accelerating Best Response Calculation in Large Extensive Games
    • [Zinkevich 2007] Regret Minimization in Games with Incomplete Information
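
    A minimal sketch of keeping regrets on disk rather than in RAM; the file name, table size, and dtype are assumptions for illustration and do not describe the real system's layout:

        # Disk-backed regret table using a memory-mapped array.
        import numpy as np

        NUM_ENTRIES = 1_000_000                 # information-set/action pairs (toy size)
        regrets = np.memmap("regrets.dat", dtype=np.float32,
                            mode="w+", shape=(NUM_ENTRIES,))

        def update_chunk(start, values):
            # Read-modify-write a slice; the OS pages it to/from disk as needed.
            regrets[start:start + len(values)] += values

        for it in range(3):                     # a few toy "iterations"
            update_chunk(0, np.ones(1024, dtype=np.float32))
            regrets.flush()                     # persist this iteration's updates
        print(regrets[:5])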

ZBot

  • Team Name: ZBot
  • Team Leader: Ilkka Rajala
  • Team Members: Ilkka Rajala
  • Affiliation: Independent
  • Location: Helsinki, Finland
  • Technique:
    Counterfactual regret minimization implementation that uses two phases. In the first phase the model is built dynamically by expanding it (observing more buckets) in situations which are visited more often, until the desired size has been reached.
    In the second phase that model is then solved by counterfactual regret minimization.

    The model has 1024 possible board-texture buckets for each street, and 169/1024/512/512 hand-type buckets for preflop/flop/turn/river. How many buckets are actually used in any given situation depends on how common that situation is.
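
    One plausible reading of the first phase, sketched minimally; the visit counts and the split rule are invented for illustration:

        # Grow the bucketed model where play happens most often, until a budget is reached.
        from collections import Counter

        visits = Counter({"preflop:raise": 900, "flop:check-call": 400,
                          "turn:raise-raise": 50})
        buckets_per_situation = {k: 1 for k in visits}
        BUDGET = 8                                            # total buckets allowed

        while sum(buckets_per_situation.values()) < BUDGET:
            # Split the situation with the most visits per existing bucket.
            k = max(visits, key=lambda s: visits[s] / buckets_per_situation[s])
            buckets_per_situation[k] += 1

        print(buckets_per_situation)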

Heads-up No-limit Texas Hold'em

Azure Sky

  • Team Name: Azure Sky Research, Inc
  • Team Leader: Eric Baum
  • Team Members: Eric Baum, Chick Markley, Dennis Horte
  • Affiliation: Azure Sky Research Inc.
  • Location: Berkeley CA US
  • Technique:
    SARSA trained neural nets, k-armed bandits, secret sauce.

dcubot

  • Team Name: dcubot
  • Team Leaders: Neill Sweeney
  • Team Members: Neill Sweeney, David Sinclair
  • Affiliation: School of Computing, Dublin City University
  • Location: Dublin 9, Ireland.
  • Technique:
    The bot uses a structure like a neural net to generate its own actions. A hidden Markov model is used to interpret actions, i.e. to read an opponent's hand. The whole system is then trained by self-play.
    For any decision, the range of betting between a min-bet and all-in is divided into at most twelve sub-ranges. The structure then selects a fold, call, min-bet, all-in, or one of these sub-ranges. If a sub-range is selected, the actual raise amount is drawn from a quadratic distribution between the end-points of the sub-range. The end-points of the sub-ranges are learnt using the same reinforcement learning algorithm as the rest of the structure.
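
    A minimal sketch of drawing a raise amount from a sub-range; the exact quadratic shape dcubot uses is not specified above, so this assumes a density that grows quadratically from the lower to the upper end-point, sampled by inverting the CDF:

        # Sample a raise amount from [lo, hi] with density proportional to (x - lo)^2.
        import random

        def sample_quadratic_raise(lo, hi, rng=random):
            u = rng.random()
            return lo + (hi - lo) * u ** (1.0 / 3.0)   # inverse-CDF sampling

        # Example: a sub-range from 2.5 big blinds up to 6 big blinds.
        print(round(sample_quadratic_raise(2.5, 6.0), 2))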

hugh

  • Team Name: hugh
  • Team Leader: Stan Sulsky
  • Team Members: Stan Sulsky, Ben Sulsky
  • Affiliation: Independent
  • Location: NY, US & Toronto, Ont, CA
  • Technique:
    Ben (poker player and son) attempts to teach Stan (programmer and father) to play poker. Stan attempts to realize Ben's ideas in code.

    More specifically, pure strategies are utilized throughout. Play is based on range-vs-range EV calculations. Preflop ranges are deduced by opponent modelling during play. Subsequent decisions are based on a minimax search of the remaining game tree, coupled with some tactical considerations.
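
    A toy range-vs-range EV calculation of the kind described above; the hand classes, equities, and ranges are illustrative, whereas a real agent would enumerate actual hold'em hands and run-outs:

        # Range-vs-range equity and the EV of a call, with toy numbers.
        import numpy as np

        # equity[i, j] = probability our hand class i beats opponent hand class j
        equity = np.array([[0.85, 0.60, 0.40],
                           [0.55, 0.50, 0.30],
                           [0.35, 0.25, 0.15]])
        our_range = np.array([0.2, 0.5, 0.3])       # distribution over our hand classes
        opp_range = np.array([0.4, 0.4, 0.2])       # deduced opponent range

        rvr_equity = our_range @ equity @ opp_range
        pot, to_call = 10.0, 4.0
        ev_call = rvr_equity * (pot + to_call) - (1 - rvr_equity) * to_call
        print(rvr_equity, ev_call)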

Hyperborean2pNL

  • Team Name: University of Alberta
  • Team Leader: Michael Bowling
  • Team Members: Michael Bowling, Duane Szafron, Rob Holte, Chris Archibald, Michael Johanson, Nolan Bard, Johnny Hawkin, Richard Gibson, Neil Burch, Parisa Mazrooei, Josh Davidson
  • Affiliation: University of Alberta
  • Location: Edmonton, Alberta, Canada
  • Technique:
    Our 2-player no-limit bot was built using a variant of Counterfactual Regret Minimization (CFR) ([3], [4]) applied to a specially designed betting abstraction of the game. Using an algorithm similar to CFR, a different bet size is chosen for each betting sequence in the game ([1], [2]). The card abstraction buckets hands and public cards together using imperfect recall, allowing for 18,630 possible buckets on each of the flop, turn and river.
  • References and related papers:
    • Hawkin, J.; Holte, R.; and Szafron, D. 2011. Automated action abstraction of imperfect information extensive-form games. In AAAI, 681–687.
    • Hawkin, J.; Holte, R.; and Szafron, D. 2012. Using Sliding Windows to Generate Action Abstractions in Extensive-Form Games. To appear, AAAI '12.
    • Michael Johanson, Nolan Bard, Marc Lanctot, Richard Gibson, and Michael Bowling. "Efficient Nash Equilibrium Approximation through Monte Carlo Counterfactual Regret Minimization" In AAMAS 2012
    • Martin Zinkevich, Michael Johanson, Michael Bowling, and Carmelo Piccione. "Regret minimization in games with incomplete information" In NIPS 2008.

LittleRock

  • Team Name: LittleRock
  • Team Leader: Rod Byrnes
  • Team Members: Rod Byrnes
  • Affiliation: Independent
  • Location: Lismore, Australia
  • Technique:
    LittleRock uses an external-sampling Monte Carlo CFR approach with imperfect recall. Additional RAM was available for training the agent entered into this year's competition, which allowed for a more fine-grained card abstraction, but the algorithm is otherwise largely unchanged. One last-minute addition this year is a no-limit agent.

    The no-limit agent has 4,491,849 information sets, the heads-up limit agent has 11,349,052 information sets and the limit 3-player agent has 47,574,530 information sets. In addition to card abstractions, the 3-player and no-limit agents also use a form of state abstraction to make the game size manageable.
  • References and related papers:
    • Monte Carlo Sampling for Regret Minimization in Extensive Games. Marc Lanctot, Kevin Waugh, Martin Zinkevich, and Michael Bowling. In Advances in Neural Information Processing Systems 22 (NIPS), pp. 1078–1086, 2009.

Lucky7_12

  • Team Name: Lucky7_12
  • Team Leader: Bojan Butolen
  • Team Members: Bojan Butolen, Gregor Vohl
  • Affiliation: University of Maribor
  • Location: Maribor, Slovenia
  • Technique:
  • We have developed a multi-agent system that uses 8 strategies during gameplay. By identifying the state of the game, our system chooses a set of strategies that have proved most profitable against a set of training agents. The final decision of the system is made by averaging the decisions of the individual agents.

    The 8 agents included in our system are mostly rule-based agents. The rules for each individual agent were constructed using different knowledge bases (various match logs, expert knowledge, human-observed play...) and different abstraction definitions for cards and actions. After a set of test matches in which each agent dueled against the other agents in the system, we determined that none of the included agents presents an inferior or superior strategy (meaning each agent lost to at least one of the other agents and won at least one match).
  • References and related papers:
    • A submission to the Poker Symposium was made with the title: Combining Various Strategies In A Poker Playing Multi Agent System

Neo Poker Bot

  • Team Name: Neo Poker Laboratory
  • Team Leader: Alexander Lee
  • Team Members: Alexander Lee
  • Affiliation: Independent
  • Location: Spain
  • Technique:
    Our range of computer players was developed to play against humans. The AI was trained on real-money hand history logs from top poker rooms. The AI logic employs different combinations of neural networks, regret minimization and gradient-search equilibrium approximation, decision trees, and recursive search methods, as well as expert algorithms from top players in different games of poker.
    Our computer players have been tested against humans and demonstrated strong results over 100 million hands. The AI was not optimized to play against computer players.

SartreNL

  • Team Name: Sartre
  • Team Leader: Jonathan Rubin
  • Team Members: Jonathan Rubin, Ian Watson
  • Affiliation: University of Auckland
  • Location: Auckland, New Zealand
  • Technique:
    SartreNL uses a case-based approach to play No-Limit Texas Hold'em. Hand history data from the previous year's top agents are encoded into cases. When it is time for SartreNL to make a betting decision, a case with the current game state information is created. The case-base is then searched for similar cases, and the solutions to past similar cases are re-used for the current situation.
  • References and related papers:
    • Jonathan Rubin and Ian Watson. (2011). Successful Performance via Decision Generalisation in No Limit Texas Hold'em. In Case-Based Reasoning. Research and Development, 19th International Conference on Case-Based Reasoning, ICCBR 2011.

Spewie Louie

  • Team Name: Spewie Louie
  • Team Leader: Jon Parker
  • Team Members: Jon Parker
  • Affiliation: Georgetown University
  • Location: Washington DC, USA
  • Technique:
    The bot assumes bets can occur in 0.25x, 0.4286x, 0.6666x, 1x, 1.5x, 4x, and 9x pot increments (a minimal bet-snapping sketch follows the references below). Nodes in the tree contain a hand range for each player, an "effectiveMatrix" that summarizes the tree below that point, and a "strategyMatrix" which is used by the "hero" of that node. Prior to the competition, a collection of 24 million matrices (1/2 strategy and 1/2 effective) was refined while simulating roughly 12.5 million paths through the tree. This set of 24 million matrices was then trimmed down to 770k (strategy-only) matrices for the competition. Any decision not supported by this set of matrices is handled by an "online" tree learner.
    During the learning process the set of effectiveMatrices and strategyMatrices is stored in a ConcurrentHashMap, which gives the learning process good multi-thread behavior.
    Preflop hands are bucketed into 22 groups. Flop and turn hands are bucketed into 8 groups. River hands are bucketed into 7 groups.
  • References and related papers:
    • Michael Johanson's Master's thesis, "Robust Strategies and Counter-Strategies: Building a Champion Level Computer Poker Player", was quite helpful, as were most of his other papers. Some of the older U. Alberta works by Darse Billings were also good reads. The book "The Mathematics of Poker" and its explanation of the AKQ game is very good.
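
    A minimal sketch of snapping an observed bet onto the fixed pot-fraction increments listed above; the bet and pot values are illustrative and nothing here reproduces the bot's matrices:

        # Map an observed bet to the nearest assumed pot-fraction increment.
        POT_FRACTIONS = [0.25, 0.4286, 0.6666, 1.0, 1.5, 4.0, 9.0]

        def nearest_fraction(bet, pot):
            observed = bet / pot
            return min(POT_FRACTIONS, key=lambda f: abs(f - observed))

        print(nearest_fraction(bet=70, pot=100))     # -> 0.6666 (closest increment)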

Tartanian5

  • Team Name: Tartanian5
  • Team Leader: Sam Ganzfried
  • Team Members: Sam Ganzfried, Tuomas Sandholm
  • Affiliation: Carnegie Mellon University
  • Location: Pittsburgh, PA, 15217, United States
  • Technique:
    Tartanian5 plays a game-theoretic approximate Nash equilibrium strategy. First, it applies a potential-aware, perfect-recall, automated abstraction algorithm to group similar game states together and construct a smaller game that is strategically similar to the full game. In order to maintain a tractable number of possible betting sequences, it employs a discretized betting model, where only a small number of bet sizes are allowed at each game state. Approximate equilibrium strategies for both players are then computed using an improved version of Nesterov's excessive gap technique specialized for poker. To obtain the final strategies, we apply a purification procedure which rounds action probabilities to 0 or 1 (a minimal purification sketch follows the references below).
  • References and related papers:
    • Sam Ganzfried, Tuomas Sandholm, and Kevin Waugh. 2012. Strategy purification and thresholding: Effective non-equilibrium approaches for playing large games. In AAMAS.
    • Andrew Gilpin, Tuomas Sandholm, and Troels Sorensen. 2007. Potential-aware automated abstraction of sequential games, and holistic equilibrium analysis of Texas Hold'em poker. In AAAI.
    • Andrew Gilpin, Tuomas Sandholm, and Troels Sorensen. 2008. A heads-up no-limit Texas Hold'em poker player: Discretized betting models and automatically generated equilibrium-finding programs. In AAMAS.
    • Samid Hoda, Andrew Gilpin, Javier Pena, and Tuomas Sandholm. 2010. Smoothing techniques for computing Nash equilibria of sequential games. Mathematics of Operations Research 35(2):494-512.
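
    A minimal sketch of the purification step, with a thresholding variant for comparison; the probabilities are illustrative:

        # Purification: play only the most probable abstract-equilibrium action.
        import numpy as np

        def purify(probs):
            out = np.zeros_like(probs)
            out[np.argmax(probs)] = 1.0
            return out

        def threshold(probs, eps=0.15):
            # Zero out low-probability actions and renormalise.
            out = np.where(probs < eps, 0.0, probs)
            return out / out.sum()

        sigma = np.array([0.07, 0.23, 0.70])          # fold / call / raise (illustrative)
        print(purify(sigma), threshold(sigma))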

UNI-MB_Poker

  • Team Name: UNI-MB_Poker
  • Team Leader: Ale ?ep
  • Team Members: Ale ?ep, Davor Gaberek
  • Affiliation: University of Maribor
  • Location: Maribor, Slovenia
  • Technique:
  • Our poker agent concentrates on getting chips from its opponent to maximize its profit. It uses small raises even when it has good cards to lure its opponent into the game, bluffs in 5% of hands, and folds when the odds are not in its favor. We used two criteria for our agent to decide what to do: first we examine the cards that we get, and secondly we calculate the odds of us winning. After combining the two results we decide what action to take.

     


3-player Limit Texas Hold'em

dcubot

  • Team Name: dcubot
  • Team Leader: Neill Sweeney
  • Team Members: Neill Sweeney, David Sinclair
  • Affiliation: School of Computing, Dublin City University
  • Location: Dublin 9, Ireland.
  • Technique:
    The bot uses 4 separate connectionist structures, one for each betting round. Ten input features describe the state of the betting after each legal decision, and there are over 300 basic features describing the visible cards. Reading opponent hands is dealt with by maximum-likelihood fitting of a hidden Markov model to the play with the cards hidden. A belief vector over the hidden variable is then used as an additional input.

    This year we have increased the size of the structure by doubling the hidden layer.

Hyperborean3p

  • Team Name: Hyperborean3p
  • Team Leader: Michael Bowling
  • Team Members: Michael Bowling, Duane Szafron, Rob Holte, Chris Archibald, Michael Johanson, Nolan Bard, Johnny Hawkin, Richard Gibson, Neil Burch, Parisa Mazrooei, Josh Davidson
  • Affiliation: University of Alberta
  • Location: Edmonton, Alberta, Canada
  • Technique:
    Our 3-player program is built using the External Sampling (ES) [2] variant of Counterfactual Regret Minimization [3]. ES is applied to an abstract game constructed from two different card abstractions of Texas Hold'em, producing a dynamic expert strategy [1]. The first card abstraction is very fine and allows our program to distinguish between many different possible hands on each round, whereas the second card abstraction is much coarser and merges many different hands into the same information set. The first abstraction is applied to the "important" parts of the betting tree, where importance is determined by the pot size and the frequency at which our program reached the betting sequence in last year's competition. The second, coarser abstraction is applied elsewhere.
  • References and related papers:
    • Richard Gibson and Duane Szafron. On strategy stitching in large extensive form multiplayer games. In NIPS 2011.
    • Marc Lanctot, Kevin Waugh, Martin Zinkevich, and Michael Bowling. Monte Carlo sampling for regret minimization in extensive games. In NIPS 2009.
    • Martin Zinkevich, Michael Johanson, Michael Bowling, and Carmelo Piccione. Regret minimization in games with incomplete information. In NIPS 2008.

LittleRock

  • Team Name: LittleRock
  • Team Leader: Rod Byrnes
  • Team Members: Rod Byrnes
  • Affiliation: Independent
  • Location: Lismore, Australia
  • Technique:
    LittleRock uses an external-sampling Monte Carlo CFR approach with imperfect recall. Additional RAM was available for training the agent entered into this year's competition, which allowed for a more fine-grained card abstraction, but the algorithm is otherwise largely unchanged. One last-minute addition this year is a no-limit agent.

    The no-limit agent has 4,491,849 information sets, the heads-up limit agent has 11,349,052 information sets and the limit 3-player agent has 47,574,530 information sets. In addition to card abstractions, the 3-player and no-limit agents also use a form of state abstraction to make the game size manageable.
  • References and related papers:
    • Monte Carlo Sampling for Regret Minimization in Extensive Games. Marc Lanctot, Kevin Waugh, Martin Zinkevich, and Michael Bowling. In Advances in Neural Information Processing Systems 22 (NIPS), pp. 1078–1086, 2009.

Neo Poker Bot

  • Team Name: Neo Poker Laboratory
  • Team Leader: Alexander Lee
  • Team Members: Alexander Lee
  • Affiliation: Independent
  • Location: Spain
  • Technique:
    Our range of computer players was developed to play against humans. The AI was trained on real-money hand history logs from top poker rooms. The AI logic employs different combinations of neural networks, regret minimization and gradient-search equilibrium approximation, decision trees, and recursive search methods, as well as expert algorithms from top players in different games of poker.
    Our computer players have been tested against humans and demonstrated strong results over 100 million hands. The AI was not optimized to play against computer players.

 

Sartre3P

  • Team Name: Sartre
  • Team Leader: Jonathan Rubin
  • Team Members: Jonathan Rubin, Ian Watson
  • Affiliation: University of Auckland
  • Location: Auckland, New Zealand
  • Technique:
    Sartre3P uses a case-based approach to play Texas Hold'em. AAAI hand history data from both three-player and two-player matches are encoded into separate case-bases. When a playing decision is required, a case with the current game state information is created. If no opponents have folded, Sartre3P will search the three-player case-base for similar game scenarios for a solution. On the other hand, if an opponent has folded, Sartre3P will search the two-player case-base and switch to a heads-up strategy if it is possible to map the three-player betting sequence to an appropriate two-player sequence.
  • References and related papers:
    • Jonathan Rubin and Ian Watson. Case-Based Strategies in Computer Poker, AI Communications, Volume 25, Number 1: 19-48, March 2012.

Participants: 2013

The 2013 competition had 14 different agents in the heads-up limit Texas hold'em competition, 14 agents in the heads-up no-limit competition, and 7 agents in the 3-player limit competition. As in previous years, agents were submitted by a mixture of universities and individual hobbyists from 14 different countries around the world.

Competitors in the 2013 Annual Computer Poker Competition were not required to supply detailed information about their submission(s) in order to compete, but some information about team members, affiliation, location, high level technique descriptions, and occasionally relevant papers were supplied. This page presents that information.


Heads-up Limit Texas Hold'em

Feste

  • Team Name: Feste
  • Team Leader: Francois Pays
  • Team Members: Francois Pays
  • Affiliation: Independent
  • Location: Paris, France
  • Technique:
    • Modelization:
      We use sequence form to compute an equilibrium of an abstract, downsized game model. There is no betting abstraction. The card abstraction uses the following bucketing parameters: preflop with 169 buckets (no abstraction), flop/400, turn/50 and river/25. Buckets are computed using k-means over hand and board parameters: current and past expected values, and deviations on future streets. Note that sequence form supposes perfect recall, which is not respected by our card abstraction. The probable consequence is some additional exploitability of the resulting strategies. Players have respectively 263,435 and 263,316 movesets and both have 101,302 infosets. The model has 16,263,415 leaves, which is relatively small for the solver.
    • Solver:
      The sequence form could be converted into a linear complementarity problem (LCP), but this approach does not appear to be scalable since solving such an LCP involves either sparse direct or dense methods. In order to take full advantage of the sparsity of the constraint and payoff matrices, we keep the original problem intact, large but sparse, and use only sparse indirect methods. The full problem is a min-max problem with a bilinear objective and separable constraints [1]. It can be solved efficiently using a classical interior-point solver coupled with an inner iterative linear system solver. We use a standard log-barrier infeasible primal-dual path-following method, applied to the min-max system. The underlying Newton system in augmented form belongs to the class of large sparse saddle-point problems, to which several modern techniques apply. We use a variant of the Projected Preconditioned Conjugate Gradient (PPCG) with an implicit-factorization preconditioner [2]. The condition number of the system matrix is kept under control using regularization and the identification and elimination of zero variables. Intermediate system matrices are never explicitly formed; the only significant memory use comes from the given constraint/payoff sparse matrices. This approach has a theoretical convergence rate of O(log 1/accuracy) and uses a minimal amount of memory (proportional to the number of game leaves). In practice, the min-max problem is solved down to competition-level accuracy in about 5 days on the hardware mentioned below. The solver has been successfully tested on up to 120 million leaves.
    • Adaptation:
      A simple variation of the initial problem leads to more aggressive strategies: in the objective function, we insert an extra fraction "epsilon" of random strategies (i.e. playing every possible action with a constant probability) that both sides may additionally encounter. Lower epsilon values mean a strategy closer to the optimal one (i.e. the Nash equilibrium); higher values mean more aggressive strategies that try to maximize the exploitation of random play while protecting themselves against the corresponding aggressive strategies. Feste adapts dynamically to its opponent using the Thompson Sampling algorithm over normal distributions (a minimal sketch follows the references below). For faster adaptation, the element of luck from hole cards is mitigated. In the Instant Run-off tournament, Feste has at its disposal three strategies: the optimal strategy and two additional epsilon-aggressive strategies, at 1% and 5%. Even if these additional strategies are purposely away from the Nash equilibrium, they are expected to be of some use against real agents, even equilibrium-based ones, without being too exploitable themselves. In the Total Bankroll tournament, Feste is allowed to use an additional 25% epsilon-aggressive strategy. This strategy is a double-edged sword: it may be the best shot against "chumps" but could also be easily exploited by real agents.
    • Hardware:
      The corresponding computations have been carried out on a mid/high-range workstation (12-core dual Xeon, 48 GB of memory and 2 GPU cards). The CPU horsepower is mostly used during probability table generation, bucketing, payoff table computation and match simulations. The GPU cards handle the sparse matrix-vector products (with CUDA) in the PPCG during the solve step, allowing the machine to solve two problems concurrently.
  • References and related papers:
    1. Minimax and Convex-Concave Games. Arpita Ghosh and Stephen Boyd. EE392o, Stanford University.
    2. Dollar, H. S., Gould, N. I., Schilders, W. H., & Wathen, A. J. (2006). Implicit-factorization preconditioning and iterative solvers for regularized saddle-point systems. SIAM Journal on Matrix Analysis and Applications, 28(1), 170-189.
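
    A minimal sketch of Thompson Sampling over normal reward models for choosing which of the available strategies to play next; the reward values are simulated placeholders and the variance is fixed at 1 for simplicity:

        # Thompson Sampling with normal reward models, one per strategy.
        import numpy as np

        rng = np.random.default_rng(3)
        names = ["optimal", "eps-1%", "eps-5%"]
        mean, count = np.zeros(3), np.ones(3)

        for hand in range(2000):
            # Sample a plausible mean reward for each strategy and play the best draw.
            draws = rng.normal(mean, np.sqrt(1.0 / count))
            k = int(np.argmax(draws))
            reward = rng.normal([0.0, 0.02, -0.01][k], 1.0)   # hidden true values (toy)
            count[k] += 1
            mean[k] += (reward - mean[k]) / count[k]          # online mean update

        print(dict(zip(names, np.round(mean, 3))), names[int(np.argmax(count))])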

HITSZ_CS_13

  • Team Name: HITSZ_CS_13
  • Team Leader: Xuan Wang
  • Team Members: Xuan Wang, Jiajia Zhang, Song Wu
  • Affiliation: School of Computer Science and Technology HIT
  • Location: Shenzhen, Guangdong province, China
  • Technique:
    Our program makes decisions according to current hand strength and a set of precomputed probabilities; at the same time it tries to model the opponent. After the opponent model is built, the program will take advantage of the model when making decisions.

Hyperborean2pl.iro

  • Team Name: University of Alberta
  • Team Leader: Michael Bowling
  • Team Members: Michael Bowling, Duane Szafron, Rob Holte, Chris Archibald, Michael Johanson, Nolan Bard, John Hawkin, Richard Gibson, Neil Burch, Josh Davidson, Trevor Davis
  • Affiliation: University of Alberta
  • Location: Edmonton, Alberta, Canada
  • Technique:
    Hyperborean 2pl IRO is an approximation of a Nash equilibrium for a very large abstract game. The strategy was learned using Chance Sampled CFR [1]. The abstract game uses imperfect recall [2] and each round is created in two steps. First, we divide the public cards into simple categories (number of cards of a suit on the board, number of cards in a straight on the board, pair on the board, etc.) with recall of the division on earlier rounds. Second, within each category, we use k-means clustering over the recently presented hand strength distribution / earthmover distance pre-river feature and the OCHS river feature [3] (a minimal k-means bucketing sketch follows the references below). The abstraction has a perfect preflop and flop abstraction, 1,521,978 turn buckets, and 840,000 river buckets.

    The abstraction used is identical to the 2011 and 2012 entries. The 2011 entry was run for 100 billion Chance Sampled CFR iterations, while the 2012 and 2013 entries were run for 130 billion Chance Sampled CFR iterations.
  • References and related papers:
    1. Martin Zinkevich, Michael Johanson, Michael Bowling, and Carmelo Piccione. "Regret minimization in games with incomplete information" In NIPS 2008.
    2. Kevin Waugh, Martin Zinkevich, Michael Johanson, Morgan Kan, David Schnizlein, and Michael Bowling. "A Practical Use of Imperfect Recall". Proceedings of the Eighth Symposium on Abstraction, Reformulation and Approximation (SARA), 2009.
    3. Michael Johanson, Neil Burch, Richard Valenzano, and Michael Bowling. "Evaluating State-Space Abstractions in Extensive-Form Games". In Proceedings of the Twelfth International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2013.
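
    A minimal sketch of bucketing hands by k-means over hand-strength histograms; the histograms are random placeholders and plain Euclidean distance is used, whereas the actual abstraction uses earthmover distance and OCHS features as described above:

        # k-means bucketing of toy hand-strength histograms.
        import numpy as np

        def kmeans(points, k, iters=50, seed=0):
            rng = np.random.default_rng(seed)
            centers = points[rng.choice(len(points), k, replace=False)]
            for _ in range(iters):
                labels = np.argmin(((points[:, None, :] - centers) ** 2).sum(-1), axis=1)
                for j in range(k):
                    if np.any(labels == j):
                        centers[j] = points[labels == j].mean(axis=0)
            return labels

        hands = np.random.default_rng(4).random((500, 10))   # 10-bin strength histograms
        hands /= hands.sum(axis=1, keepdims=True)
        buckets = kmeans(hands, k=8)
        print(np.bincount(buckets))                          # bucket sizes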

Hyperborean2pl.tbr

  • Team Name: Univeristy of Alberta
  • Team Leader: Michael Bowling
  • Team Members: Michael Bowling, Duane Szafron, Rob Holte, Chris Archibald, Michael Johanson, Nolan Bard, John Hawkin, Richard Gibson, Neil Burch, Josh Davidson, Trevor Davis
  • Affiliation: University of Alberta
  • Location: Edmonton, Alberta, Canada
  • Technique:
    Hyperborean is an implicit modelling agent [5] consisting of four data biased response strategies to specific agents seen in the 2010 and 2011 ACPC's heads-up limit events. All four strategies were generated using the Counterfactual Regret Minimization (CFR) algorithm [1] with imperfect recall abstractions. Buckets were calculated according to public card textures and k-means clustering over hand strength distributions [6] and yielded an imperfect recall abstract game by forgetting previous card information and rebucketing on every round [3]. Agents were run for 4 billion iterations of chance sampled CFR. The abstraction uses 169, 18630, 18630, and 18630 buckets on each round of the game, respectively, for a total of 118 million information sets.

    A mixture of these strategies is dynamically generated using a slightly modified Exp4-like algorithm [4] where the reward vector for the experts/strategies is computed using importance sampling over the individual strategies [2].
  • References and related papers:
    1. Martin Zinkevich, Michael Johanson, Michael Bowling, and Carmelo Piccione. "Regret minimization in games with incomplete information" In NIPS 2008.
    2. Michael Bowling, Michael Johanson, Neil Burch, and Duane Szafron. "Strategy Evaluation in Extensive Games with Importance Sampling". In Proceedings of the 25th Annual International Conference on Machine Learning (ICML), 2008.
    3. Kevin Waugh, Martin Zinkevich, Michael Johanson, Morgan Kan, David Schnizlein, and Michael Bowling. "A Practical Use of Imperfect Recall". Proceedings of the Eighth Symposium on Abstraction, Reformulation and Approximation (SARA), 2009.
    4. P Auer, N Cesa-Bianchi, Y Freund, and R.E Schapire. "Gambling in a rigged casino: The adversarial multi-armed bandit problem". Proceedings of the 36th Annual Symposium on Foundations of Computer Science, 1995.
    5. Nolan Bard, Michael Johanson, Neil Burch, Michael Bowling. "Online Implicit Agent Modelling". In Proceedings of the Twelfth International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2013.
    6. Michael Johanson, Neil Burch, Richard Valenzano, and Michael Bowling. "Evaluating State-Space Abstractions in Extensive-Form Games". In Proceedings of the Twelfth International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2013.

LIACC

  • Team Name:LIACC
  • Team Leader: Luis Filipe Teofilo
  • Team Members: Luis Filipe Teofilo
  • Affiliation: University of Porto, Artificial Intelligence and Computer Science Laboratory
  • Location: Porto, Portugal
  • Technique: Expected value maximization with game partition

Little Rock

  • Team Name: Little Rock
  • Team Leader: Rod Byrnes
  • Team Members: Rod Byrnes
  • Affiliation: Independent
  • Location: Goonellabah, NSW, Australia
  • Technique:
    Little Rock uses an external-sampling Monte Carlo CFR approach with imperfect recall. All agents in this year's competition use the same card abstraction, which has 8192 buckets on each of the flop, turn and river, created by clustering all possible hands using a variety of metrics from the current and previous rounds. The 2-player limit agent uses no action abstraction. The other two agents use what I call a "cross-sectional" approach, which abstracts aspects of the current game state rather than translating individual actions (which is what I call a "longitudinal" approach).
  • References and related papers:
    1. Monte Carlo Sampling for Regret Minimization in Extensive Games. Marc Lanctot, Kevin Waugh, Martin Zinkevich, and Michael Bowling. In Advances in Neural Information Processing Systems 22 (NIPS), pp. 1078-1086, 2009.

Marv

  • Team Name: Bacalhau
  • Team Leader: Marv Andersen
  • Team Members: Marv Andersen
  • Affiliation: Independent
  • Location: London, UK
  • Technique:
    This bot is a neural net trained to imitate the play of previous ACPC winners.

Neo Poker Bot

  • Team Name: Neo Poker Laboratory
  • Team Leader: Alexander Lee
  • Team Members: Alexander Lee
  • Affiliation: Independent
  • Location:
  • Technique:
    The bot was built using proprietary universal game-theory methods applied to poker. We complete the Fixed Limit Hold'em game tree search without approximation. The original AI utilizes its own database of about 3 TB, and to comply with the competition format our team provided a special simplified version of Neo, Neopokerbot_FL2V.

ProPokerTools

  • Team Name: ProPokerTools
  • Team Leader: Dan Hutchings
  • Team Members: Dan Hutchings
  • Affiliation: ProPokerTools
  • Location: Lakewood, Colorado.
  • Technique:
    This HULHE agent was created using established methods: regret minimization, partial recall, etc. My goal is to develop a suite of training tools built around strong approximations of unexploitable strategies. I have started with heads-up limit hold'em as a test case, as it is the game with the largest body of research; I fully intend to move on to other games and already have some promising results.
    I have given myself a constraint in building my AI agents; all agents are created on a single machine that costs less than $1,000. I do not have high hopes for victory; I am primarily interested in how far from the best-of-the-best I can get using established methods on commodity hardware. "Pretty close" is more than good enough for my purposes. I have not attempted to optimize my AI agents for competition play.

Slugathorus

  • Team Name: Slugathorus
  • Team Leader: Daniel Berger
  • Team Members: Daniel Berger
  • Affiliation: University of New South Wales
  • Location: Sydney, Australia
  • Technique:
    The agent plays an approximate Nash Equilibrium strategy generated by public chance sampled MCCFR over an abstraction with 2 billion information sets.
  • References and related papers:
    1. "Efficient Nash Equilibrium Approximation through Monte Carlo Counterfactual Regret Minimization". (Johanson, 2012)
    2. "Regret Minimization in Games With Incomplete Information". (Zinkevich, 2007)

UNamur / Joshua

  • Team Name: University of Namur
  • Team Leader: Nicolas Verbeeren
  • Team Members: Nicolas Verbeeren
  • Affiliation: University of Namur
  • Location: Namur, Namur, Belgium
  • Technique:
    Joshua is based on a maximum entropy probabilistic model.

ZBot

  • Team Name: ZBot
  • Team Leader: Ilkka Rajala
  • Team Members: Ilkka Rajala
  • Affiliation: Independent
  • Location: Helsinki, Finland
  • Technique:
    Counterfactual regret minimization implementation that uses two phases. In the first phase the model is built dynamically by expanding it (observing more buckets) in situations which are visited more often, until the desired size has been reached. In the second phase that model is then solved by counterfactual regret minimization.

    Basically the same as ZBot 2012, only much bigger.

Heads-up No-limit Texas Hold'em

Entropy

  • Team Name: Entropy
  • Team Leader:
  • Team Members:
  • Affiliation:
  • Location:
  • Technique:

HITSZ_CS_13

  • Team Name: HITSZ_CS_13
  • Team Leader: Xuan Wang
  • Team Members: Xuan Wang, Jiajia Zhang, Song Wu
  • Affiliation: School of Computer Science and Technology HIT
  • Location: Shenzhen, Guangdong province, China
  • Technique:
    Our program makes decisions according to current hand strength and a set of precomputed probabilities; at the same time it tries to model the opponent. After the opponent model is built, the program will take advantage of the model when making decisions.

hugh

  • Team Name: hugh
  • Team Leader: Stan Sulsky
  • Team Members: Stan Sulsky, Ben Sulsky
  • Affiliation: Independent, University of Toronto
  • Location: New York NY, Toronto
  • Technique:
    We attempt to deduce our opponent's strategy from its actions, and apply expert tactics to exploit that strategy. On later streets this is done by exploring the remaining game tree. On early streets it is based on heuristics.

    This version of hugh is experimental, not expected to do particularly well.

Hyperborean2pn.iro

  • Team Name: University of Alberta
  • Team Leader: Michael Bowling
  • Team Members: Richard Gibson, Joshua Davidson, Michael Johanson, Nolan Bard, Neil Burch, John Hawkin, Trevor Davis, Christopher Archibald, Michael Bowling, Duane Szafron, Rob Holte
  • Affiliation: University of Alberta
  • Location: Edmonton, Alberta, Canada
  • Technique:
    This agent is a meta-player that switches between 2 different strategies. A default strategy is played until we have seen the opponent make a minimum-sized bet on at least 1% of the hands played so far (a min bet as the first bet of the game is not counted). At this time, we switch to an alternative strategy that both makes min bets itself and better understands min bets.

    Both strategies were computed using Counterfactual Regret Minimization (CFR) [Zinkevich et al., NIPS 2007]. Because 2-player no-limit hold'em is too large a game to apply CFR to directly, we employed abstract games that merge card deals into "buckets" to create a game of manageable size [Gilpin & Sandholm, AAMAS 2007]. In addition, we abstract the raise action to a number of bets relative to the pot size. Our default strategy only makes raises equal to 0.5, 0.75, 1, 1.5, 3, 6, 11, 20, or 40 times the pot size, or goes all-in, while our alternative strategy makes min raises and raises equal to 0.5, 0.75, 1, 2, 3, or 11 times the pot, or goes all-in. When the opponent makes an action that our agent cannot, we map the action to one of our raise sizes using probabilistic translation [Schnizlein, Bowling, and Szafron, IJCAI 2009].

    To create our abstract game for the default strategy, we first partitioned the betting sequences into two parts: an "important" part, and an "unimportant" part. Importance was determined according to the frequency with which one of our preliminary 2-player no-limit programs was faced with a decision at that betting sequence in self-play, as well as according to the number of chips in the pot. Then, we employed two different granularities of abstraction, one for each part of this partition. The unimportant part used 169, 3700, 3700, and 3700 buckets per betting round respectively, while the important part used 169, 180,000, 1,530,000, and 1,680,000 buckets per betting round respectively. Buckets were calculated according to public card textures and k-means clustering over hand strength distributions [Johanson et al., AAMAS 2013] and yielded an imperfect recall abstract game by forgetting previous card information and rebucketing on every round [Waugh et al., SARA 2009]. The strategy profile of this abstract game was computed from approximately 498 billion iterations of the "Pure CFR" variant of CFR [Richard Gibson, PhD thesis, in preparation]. This type of strategy is also known as a "dynamic expert strategy" [Gibson & Szafron, NIPS 2011]. The alternative strategy used a simple abstraction with 169, 3700, 3700, and 1175 buckets per round respectively.
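
    A minimal sketch of probabilistically translating an observed bet onto the two nearest abstract bet sizes; the linear weighting below is only a stand-in, since the cited Schnizlein et al. mapping uses a different weighting, and the abstract sizes are the pot fractions listed above:

        # Randomised mapping of an observed bet (as a pot fraction) to an abstract size.
        import bisect, random

        ABSTRACT_SIZES = [0.5, 0.75, 1.0, 1.5, 3.0, 6.0, 11.0, 20.0, 40.0]

        def translate(observed_fraction, rng=random):
            sizes = ABSTRACT_SIZES
            if observed_fraction <= sizes[0]:
                return sizes[0]
            if observed_fraction >= sizes[-1]:
                return sizes[-1]
            i = bisect.bisect_left(sizes, observed_fraction)
            lo, hi = sizes[i - 1], sizes[i]
            p_lo = (hi - observed_fraction) / (hi - lo)   # closer to lo -> higher probability
            return lo if rng.random() < p_lo else hi

        print(translate(2.0))     # maps to 1.5 or 3.0 at random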

Hyperborean2pn.tbr

  • Team Name: University of Alberta
  • Team Leader: Michael Bowling
  • Team Members: Michael Bowling, Duane Szafron, Rob Holte, Chris Archibald, Michael Johanson, Nolan Bard, John Hawkin, Richard Gibson, Neil Burch, Josh Davidson, Trevor Davis
  • Affiliation: University of Alberta
  • Location: Edmonton, Alberta, Canada
  • Technique:
    Hyperborean is an implicit modelling agent [5] consisting of two abstract strategies. All strategies were generated using the Counterfactual Regret Minimization (CFR) algorithm [1] with imperfect recall abstractions [3]. We also abstract the raise action to a number of bets relative to the pot size. Both strategies make raises equal to 0.5, 0.75, 1, 1.5, 3, 6, 11, 20, or 40 times the pot size, or go all-in. The portfolio of strategies for the agent consists of:

    1) A Nash equilibrium approximation
    This strategy is the same as the default strategy in our heads-up no-limit IRO entry. To create our abstract game for the strategy, we first partitioned the betting sequences into two parts: an "important" part, and an "unimportant" part. Importance was determined according to the frequency with which one of our preliminary 2-player nolimit programs was faced with a decision at that betting sequence in self-play, as well as according to the number of chips in the pot. Then, we employed two different granularities of abstraction, one for each part of this partition. The unimportant part used 169, 3700, 3700, and 3700 buckets per betting round respectively, while the important part used 169, 180,000, 1,530,000, and 1,680,000 buckets per betting round respectively. Buckets were calculated according to public card textures and k-means clustering over hand strength distributions [6] and yielded an imperfect recall abstract game by forgetting previous card information and rebucketing on every round [3]. The strategy profile of this abstract game was computed from approximately 498 billion iterations of the "Pure CFR" variant of CFR [Richard Gibson, PhD thesis, in preparation]. This type of strategy is also known as a "dynamic expert strategy" [7].

    2) A data biased response to aggregate data of 2011 and 2012 ACPC competitors
    The exploitive response in the portfolio was created using data biased robust counter strategies [8] to aggregate data from all of the agents in the 2011 and 2012 heads-up no-limit ACPC events. It uses the same betting abstraction as the above Nash equilibrium approximation, but the card abstraction consists of 169, 9000, 9000, and 3700 buckets per betting round uniformly across the game tree.

    A mixture of these agents is dynamically generated using a slightly modified Exp4-like algorithm [4] where the reward vector for the experts/strategies is computed using importance sampling over the individual strategies [2].
  • References and related papers:
    1. Martin Zinkevich, Michael Johanson, Michael Bowling, and Carmelo Piccione. "Regret minimization in games with incomplete information" In NIPS 2008.
    2. Michael Bowling, Michael Johanson, Neil Burch, and Duane Szafron. "Strategy Evaluation in Extensive Games with Importance Sampling". In Proceedings of the 25th Annual International Conference on Machine Learning (ICML), 2008.
    3. Kevin Waugh, Martin Zinkevich, Michael Johanson, Morgan Kan, David Schnizlein, and Michael Bowling. "A Practical Use of Imperfect Recall". Proceedings of the Eighth Symposium on Abstraction, Reformulation and Approximation (SARA), 2009.
    4. P Auer, N Cesa-Bianchi, Y Freund, and R.E Schapire. "Gambling in a rigged casino: The adversarial multi-armed bandit problem". Proceedings of the 36th Annual Symposium on Foundations of Computer Science, 1995.
    5. Nolan Bard, Michael Johanson, Neil Burch, Michael Bowling. "Online Implicit Agent Modelling". In Proceedings of the Twelfth International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2013.
    6. Michael Johanson, Neil Burch, Richard Valenzano, and Michael Bowling. "Evaluating State-Space Abstractions in Extensive-Form Games". In Proceedings of the Twelfth International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2013.
    7. Richard Gibson and Duane Szafron. "On Strategy Stitching in Large Extensive Form Multiplayer Games". In Proceedings of the Twenty-Fifth Conference on Neural Information Processing Systems (NIPS), 2011.
    8. Michael Johanson and Michael Bowling. "Data Biased Robust Counter Strategies". In Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics (AISTATS), 2009.

KEmpfer

  • Team Name: KEmpfer
  • Team Leader: Eneldo Loza Mencia
  • Team Members: Eneldo Loza Mencia, Tomek Gasiorowski, Peter Glockner, Julian Prommer
  • Affiliation: Knowledge Engineering Group, Technische Universitat Darmstadt
  • Location: Darmstadt, Germany
  • Technique:
    The agent implements a list of expert rules and follows these. Additional opponent statistics are collected and these are used in the rules, but these rules are currently disabled. The backup strategy if no expert rule is found is to play according to the expected hand strength.

Koypetitor

  • Team Name:Koypetitor
  • Team Leader: Adrian Koy
  • Team Members: Adrian Koy, Andrej Kuttruf, assistants
  • Affiliation: Independent
  • Location: London, United Kingdom
  • Technique:

LIACC

  • Team Name:LIACC
  • Team Leader: Luis Filipe Teofilo
  • Team Members: Luis Filipe Teofilo
  • Affiliation: University of Porto, Artificial Intelligence and Computer Science Laboratory
  • Location: Porto, Portugal
  • Technique: Expected value maximization with game partition

Little Rock

  • Team Name: Little Rock
  • Team Leader: Rod Byrnes
  • Team Members: Rod Byrnes
  • Affiliation: Independent
  • Location: Goonellabah, NSW, Australia
  • Technique:
    Little Rock uses an external-sampling Monte Carlo CFR approach with imperfect recall. All agents in this year's competition use the same card abstraction, which has 8192 buckets on each of the flop, turn and river, created by clustering all possible hands using a variety of metrics from the current and previous rounds. The 2-player limit agent uses no action abstraction. The other two agents use what I call a "cross-sectional" approach, which abstracts aspects of the current game state rather than translating individual actions (which is what I call a "longitudinal" approach).
  • References and related papers:
    1. Monte Carlo Sampling for Regret Minimization in Extensive Games. Marc Lanctot, Kevin Waugh, Martin Zinkevich, and Michael Bowling. In Advances in Neural Information Processing Systems 22 (NIPS), pp. 1078-1086, 2009.

Neo Poker Bot

  • Team Name: Neo Poker Laboratory
  • Team Leader: Alexander Lee
  • Team Members: Alexander Lee
  • Affiliation: Independent
  • Location:
  • Technique:
    The AI logic employs different combinations of neural networks, regret minimization and gradient-search equilibrium approximation, decision trees, and recursive search methods, as well as expert algorithms from professional poker players. Neo analyzes accumulated statistical data, which allows the AI to adjust its style of play against opponents.

Nyx

  • Team Name: Nyx
  • Team Leader: Matej Moravcik
  • Team Members: Matej Moravcik, Martin Schmid
  • Affiliation: Charles University
  • Location: Prague, Czech Republic.
  • Technique:
    Implementation of counterfactual regret minimization.

Sartre

  • Team Name: Sartre
  • Team Leader: Kevin Norris
  • Team Members: Kevin Norris, Jonathan Rubin, Ian Watson
  • Affiliation: University of Auckland
  • Location: Auckland, New Zealand
  • Technique:
  • References and related papers:

Slumbot

  • Team Name: Slumbot
  • Team Leader: Eric Jackson
  • Team Members: Eric Jackson
  • Affiliation: Independent
  • Location: Menlo Park, CA, USA
  • Technique:
    Slumbot NL uses a variant of counterfactual regret minimization with public chance sampling.
  • References and related papers:
    1. "Slumbot NL: Solving Large Games with Counterfactual Regret Minimization Using Sampling and Distributed Processing" from the upcoming proceedings of the Computer Poker Workshop at AAAI-13.

Tartanian6

  • Team Name: Tartanian6
  • Team Leader: Tuomas Sandholm
  • Team Members: Noam Brown, Sam Ganzfried, Tuomas Sandholm
  • Affiliation: Carnegie Mellon University
  • Location: Pittsburgh, PA, USA
  • Technique:
    Tartanian6 plays an approximate Nash equilibrium strategy that was computed using MCCFR with external sampling on an imperfect recall abstraction. For the river betting round, it computes undominated equilibrium strategies in a finer-grained abstraction in real-time using CPLEX's LP solver.
  • References and related papers:
    1. Sam Ganzfried and Tuomas Sandholm. 2013. Improving Performance in Imperfect-Information Games with Large State and Action Spaces by Solving Endgames. Computer Poker and Imperfect Information Workshop at the National Conference on Artificial Intelligence (AAAI).
    2. Sam Ganzfried and Tuomas Sandholm. 2013. Action Translation in Extensive-Form Games with Large Action Spaces: Axioms, Paradoxes, and the Pseudo-Harmonic Mapping. To appear in Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI).
    3. Sam Ganzfried, Tuomas Sandholm, and Kevin Waugh. 2012. Strategy Purification and Thresholding: Effective Non-Equilibrium Approaches for Playing Large Games. In Proceedings of the International Conference on Autonomous Agents and Multiagent Systems (AAMAS).
    4. Michael Johanson, Neil Burch, Richard Valenzano, and Michael Bowling. 2013. Evaluating State-Space Abstractions in Extensive-Form Games. In Proceedings of the International Conference on Autonomous Agents and Multiagent Systems (AAMAS).
    5. Marc Lanctot, Kevin Waugh, Martin Zinkevich, and Michael Bowling. 2009. Monte Carlo Sampling for Regret Minimization in Extensive Games. In Proceedings of Advances in Neural Information Processing Systems (NIPS).

3-player Limit Texas Hold'em

HITSZ_CS_13

  • Team Name: HITSZ_CS_13
  • Team Leader: Xuan Wang
  • Team Members: Xuan Wang, Jiajia Zhang, Song Wu
  • Affiliation: School of Computer Science and Technology HIT
  • Location: Shenzhen, Guangdong province, China
  • Technique:
    Our program makes decisions according to current hand strength and a set of precomputed probabilities; at the same time it tries to model the opponent. After the opponent model is built, the program will take advantage of the model when making decisions.

Hyperborean3pl.iro

  • Team Name: University of Alberta
  • Team Leader: Michael Bowling
  • Team Members: Richard Gibson, Joshua Davidson, Michael Johanson, Nolan Bard, Neil Burch, John Hawkin, Trevor Davis, Christopher Archibald, Michael Bowling, Duane Szafron, Rob Holte
  • Affiliation: University of Alberta
  • Location: Edmonton, Alberta, Canada
  • Technique:
    Counterfactual Regret Minimization (CFR) [Zinkevich et al., NIPS 2007] was the main technique used to build this agent. Because 3-player hold'em is too large a game to apply CFR to directly, we employed an abstract game that merges card deals into "buckets" to create a game of manageable size [Gilpin & Sandholm, AAMAS 2007].

    To create our abstract game, we first partitioned the betting sequences into two parts: an "important" part, and an "unimportant" part. Importance was determined according to the frequency with which our 3-player programs from the 2011 and 2012 ACPCs were faced with a decision at that betting sequence, as well as according to the number of chips in the pot. Then, we employed two different granularities of abstraction, one for each part of this partition. The unimportant part used 169, 180,000, 18,630, and 875 buckets per betting round respectively, while the important part used 169, 1,348,620, 1,530,000, and 2,800,000 buckets per betting round respectively. Buckets were calculated according to public card textures and k-means clustering over hand strength distributions [Johanson et al., AAMAS 2013] and yielded an imperfect recall abstract game by forgetting previous card information and rebucketing on every round [Waugh et al., SARA 2009]. The agent plays the "current strategy profile" computed from approximately 303.6 billion iterations of the "Pure CFR" variant of CFR [Richard Gibson, PhD thesis, in preparation] applied to this abstract game. This type of strategy is also known as a "dynamic expert strategy" [Gibson & Szafron, NIPS 2011].

Hyperborean3pl.tbr

  • Team Name: University of Alberta
  • Team Leader: Michael Bowling
  • Team Members: Michael Bowling, Duane Szafron, Rob Holte, Chris Archibald, Michael Johanson, Nolan Bard, John Hawkin, Richard Gibson, Neil Burch, Josh Davidson, Trevor Davis
  • Affiliation: University of Alberta
  • Location: Edmonton, Alberta, Canada
  • Technique:
    Hyperborean is a data biased response to aggregate data of ACPC competitors from the 2010 and 2011 3-player limit competitions [4]. The strategy was generated using the Counterfactual Regret Minimization (CFR) algorithm [1] with imperfect recall abstractions. Buckets were calculated according to public card textures and k-means clustering over hand strength distributions [3] and yielded an imperfect recall abstract game by forgetting previous card information and rebucketing on every round [2]. The agent plays the "current strategy profile" generated after 20 billion iterations of external sampled CFR [5]. The abstraction uses 169, 10000, 5450, and 500 buckets on each round of the game, respectively.
  • References and related papers:
    1. Martin Zinkevich, Michael Johanson, Michael Bowling, and Carmelo Piccione. "Regret minimization in games with incomplete information" In NIPS 2008.
    2. Kevin Waugh, Martin Zinkevich, Michael Johanson, Morgan Kan, David Schnizlein, and Michael Bowling. "A Practical Use of Imperfect Recall". Proceedings of the Eighth Symposium on Abstraction, Reformulation and Approximation (SARA), 2009.
    3. Michael Johanson, Neil Burch, Richard Valenzano, and Michael Bowling. "Evaluating State-Space Abstractions in Extensive-Form Games". In Proceedings of the Twelfth International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2013.
    4. Michael Johanson and Michael Bowling. "Data Biased Robust Counter Strategies". In Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics (AISTATS), 2009.
    5. Marc Lanctot, Kevin Waugh, Martin Zinkevich, and Michael Bowling. "Monte Carlo Sampling for Regret Minimization in Extensive Games". In Proceedings of the Twenty-Third Conference on Neural Information Processing Systems (NIPS), 2009.

KEmpfer

  • Team Name: KEmpfer
  • Team Leader: Eneldo Loza Mencia
  • Team Members: Eneldo Loza Mencia, Tomek Gasiorowski, Peter Glockner, Julian Prommer
  • Affiliation: Knowledge Engineering Group, Technische Universitat Darmstadt
  • Location: Darmstadt, Germany
  • Technique:
    The agent implements a list of expert rules and follows these. Additional opponent statistics are collected and these are used in the rules, but these rules are currently disabled. The backup strategy if no expert rule is found is to play according to the expected hand strength.

LIACC

  • Team Name:LIACC
  • Team Leader: Luis Filipe Teofilo
  • Team Members: Luis Filipe Teofilo
  • Affiliation: University of Porto, Artificial Intelligence and Computer Science Laboratory
  • Location: Porto, Portugal
  • Technique: Expected value maximization with game partition

Little Rock

  • Team Name: Little Rock
  • Team Leader: Rod Byrnes
  • Team Members: Rod Byrnes
  • Affiliation: Independent
  • Location: Goonellabah, NSW, Australia
  • Technique:
    Little Rock uses an external-sampling Monte Carlo CFR approach with imperfect recall. All agents in this year's competition use the same card abstraction, which has 8192 buckets on each of the flop, turn and river, created by clustering all possible hands using a variety of metrics from the current and previous rounds. The 2-player limit agent uses no action abstraction. The other two agents use what I call a "cross-sectional" approach, which abstracts aspects of the current game state rather than translating individual actions (which is what I call a "longitudinal" approach).
  • References and related papers:
    1. Monte Carlo Sampling for Regret Minimization in Extensive Games. Marc Lanctot, Kevin Waugh, Martin Zinkevich, and Michael Bowling. In Advances in Neural Information Processing Systems 22 (NIPS), pp. 1078-1086, 2009.

Neo Poker Bot

  • Team Name: Neo Poker Laboratory
  • Team Leader: Alexander Lee
  • Team Members: Alexander Lee
  • Affiliation: Independent
  • Location:
  • Technique:
    The AI logic employs different combinations of Neural networks, Regret Minimization and Gradient Search Equilibrium Approximation, Decision Trees, Recursive Search methods as well as expert algorithms from top players in different games of poker. The AI was not optimized to play against computer players.