Participants: 2016

The 2016 competition had 9 different agents in the heads-up no-limit Texas hold'em competition. As in previous years, agents were submitted by a mixture of universities and individual hobbyists from 5 different countries around the world.

Competitors in the 2016 Annual Computer Poker Competition were not required to supply detailed information about their submission(s) in order to compete, but some information about team members, affiliation, location, high level technique descriptions, and occasionally relevant papers were supplied. This page presents that information.

Heads-up No-Limit Texas Hold'em

Act1

  • Team Name: Act1
  • Team Members: Tim Reiff
  • Affiliation: unfoldpoker
  • Location: Las Vegas, USA
  • Non-dynamic Agent
  • Technique:

Act1 was trained by an experimental distributed implementation of the Pure CFR algorithm.  A heuristic was added to occasionally avoid some game tree paths, reducing the time spent per training iteration.  To compensate for imperfect recall, a distance metric that considers features from all postflop streets was used to construct the card abstraction on the river.  Several bet sizes were omitted because they offer little benefit against other equilibrium opponents while requiring a disproportionate amount of resources to train and store.

The strategy consists of 159 billion information sets (430 billion information set-action pairs), and training completed 5.15 trillion iterations.
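
A minimal, hedged sketch of regret matching, the rule that CFR-family algorithms such as Pure CFR use to turn accumulated regrets into a strategy, together with the pure-action sampling step that gives Pure CFR its name. Names and numbers are illustrative; this is not Act1's implementation.

    import random

    def regret_matching(regrets):
        """Return a strategy (one probability per action) proportional to positive regret."""
        positive = [max(r, 0.0) for r in regrets]
        total = sum(positive)
        if total > 0:
            return [p / total for p in positive]
        # With no positive regret, fall back to the uniform strategy.
        return [1.0 / len(regrets)] * len(regrets)

    def sample_pure_action(strategy):
        """Pure CFR samples a single pure action from the current strategy each iteration."""
        return random.choices(range(len(strategy)), weights=strategy, k=1)[0]

    # Example: regrets accumulated at one information set for (fold, call, raise).
    regrets = [-12.0, 30.0, 10.0]
    strategy = regret_matching(regrets)      # [0.0, 0.75, 0.25]
    action = sample_pure_action(strategy)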

Hugh

  • Team Name: Hugh
  • Team Members: Stan Sulsky
  • Affiliation: Independent
  • Location: New York, USA
  • Non-dynamic Agent
  • Technique: Just a rule-based engine.

KEmpfer_cfr

  • Team Name: KEmpfer
  • Team Members: Julian Prommer, Patryk Hopner, Suiteng Lu, Eneldo Loza Mencia
  • Affiliation: Knowledge Engineering Group, TU Darmstadt
  • Location: Darmstadt, Germany
  • Non-dynamic Agent
  • Technique: 

This bot implements a CFR strategy. For training the policy, we used the Open Pure CFR implementation and adapted it to no-limit heads-up. In addition, we implemented some more advanced techniques such as card and bucket clustering.

Nyx

  • Team Name: Nyx
  • Team Members: Martin Schmid, Matej Moravcik
  • Affiliation: Charles University
  • Location: Prague, Czech Republic
  • Non-dynamic Agent
  • Technique: 
  • Equilibrium approximating agent
  • Small computational resources
  • Very compact strategy representation (only 2 GB for the uncompressed strategy)
  • Imperfect recall action abstraction with up to 16 possible bets in an information state
  • Abstraction as well as the strategy are continuously learned during self-play
  • Heavily modified CFR utilizing dynamic programming to handle a non-stationary imperfect action abstraction with many actions

Automatic public card abstraction for the flop round - Schmid, M., Moravcik, M., Hladik, M., & Gaukroder, S. J. (2015, January). Automatic Public State Space Abstraction in Imperfect Information Games. In Workshops at the Twenty-Ninth AAAI Conference on Artificial Intelligence.

Proteus

  • Team Name: Queen's Automated Poker Team (QAPT)
  • Team Members: Chris Barnes, Spencer Evans, Austin Attah, Robert Sun
  • Affiliation: Queen's University
  • Location: Kingston, Canada
  • Mostly static, non-equilibrium Agent
  • Technique: 

We have attempted to create a generic model by mining the logs of previous matches,
especially those of well-performing bots. In future work we plan on implementing
in-game modelling of the opponents.

Rembrant5

  • Team Name: Rembrant5
  • Team Members: Gregor Vohl
  • Affiliation: FERI
  • Location: Maribor, Slovenia
  • Static Agent
  • Technique: 

Historical games are used to calculate the equity of the current hand and the current board cards. The bot makes random decisions with action probabilities taken from the historical games. Because purely random play is not very effective, there are also a couple of hardcoded rules the bot must consider before making an action.

Slumbot

  • Team Name: Slumbot
  • Team Members: Eric Jackson
  • Affiliation: Independent Researcher
  • Location: Menlo Park, USA
  • Static Agent
  • Technique: 

Slumbot is a large Counterfactual Regret Minimization (CFR) implementation. It uses the external sampling variant of MCCFR (Monte Carlo CFR) and employs a symmetric abstraction.  Some statistics about the size of the abstraction:

  • 4.5x10^11 information sets
  • 1.1x10^12 information-set-action pairs
  • 1.5x10^6 betting sequences

We used a distributed implementation of CFR running on eleven r3.4xlarge Amazon EC2 instances.

More details can be found in my paper to be presented at the 2016 Computer Poker Workshop at AAAI.
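
For concreteness, below is a hedged, self-contained sketch of external-sampling MCCFR on Kuhn poker, a toy three-card game: the traverser explores all of its own actions while opponent actions and the card deal are sampled. It illustrates the algorithm family described above and is not Slumbot's code or abstraction.

    import random
    from collections import defaultdict

    ACTIONS = ["p", "b"]  # p = check/fold, b = bet/call

    class Node:
        def __init__(self):
            self.regret = [0.0, 0.0]
            self.strategy_sum = [0.0, 0.0]

        def strategy(self):
            positive = [max(r, 0.0) for r in self.regret]
            total = sum(positive)
            return [p / total for p in positive] if total > 0 else [0.5, 0.5]

    nodes = defaultdict(Node)

    def terminal_utility(cards, history, player):
        """Utility for `player` if `history` is terminal, else None."""
        if history not in ("pp", "bp", "pbp", "bb", "pbb"):
            return None
        if history.endswith("bp"):                # someone folded to a bet
            folder = (len(history) - 1) % 2
            return 1 if folder != player else -1
        payoff = 2 if "b" in history else 1       # showdown; the pot is larger after a bet
        winner = 0 if cards[0] > cards[1] else 1
        return payoff if winner == player else -payoff

    def traverse(cards, history, traverser):
        util = terminal_utility(cards, history, traverser)
        if util is not None:
            return util
        player = len(history) % 2
        node = nodes[(cards[player], history)]
        strat = node.strategy()
        if player == traverser:
            # Traverser: walk every action and update regrets against the expected value.
            child = [traverse(cards, history + a, traverser) for a in ACTIONS]
            value = sum(s * c for s, c in zip(strat, child))
            for i in range(len(ACTIONS)):
                node.regret[i] += child[i] - value
            return value
        # Opponent: accumulate the average strategy, then sample a single action.
        for i, s in enumerate(strat):
            node.strategy_sum[i] += s
        a = random.choices(ACTIONS, weights=strat, k=1)[0]
        return traverse(cards, history + a, traverser)

    for _ in range(20000):
        cards = random.sample([1, 2, 3], 2)       # deal one card to each player
        for traverser in (0, 1):
            traverse(cards, "", traverser)

    first_node = nodes[(1, "")]                   # lowest card, first to act
    total = sum(first_node.strategy_sum)
    print([round(s / total, 3) for s in first_node.strategy_sum])  # average (check, bet) strategy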

BabyTartanian8

  • Team Name: Tartanian
  • Team Members: Noam Brown, Tuomas Sandholm
  • Affiliation: Carnegie Mellon University
  • Location: Pittsburgh, USA
  • Static Agent
  • Technique: 

BabyTartanian8 plays an approximate Nash equilibrium that was computed on the San Diego Comet supercomputer. For equilibrium finding, we used a new Monte Carlo CFR variant that leverages the recently-introduced regret-based pruning (RBP) method [Brown & Sandholm NIPS-15] to sample actions with negative regret less frequently, which dramatically speeds up convergence. Our agent uses an asymmetric action abstraction. This required conducting two separate equilibrium-finding runs.

Noam Brown and Tuomas Sandholm. Regret-Based Pruning in Extensive-Form Games. In Neural Information Processing Systems (NIPS), 2015.

Noam Brown, Sam Ganzfried, and Tuomas Sandholm. Hierarchical Abstraction, Distributed Equilibrium Computation, and Post-Processing, with Application to a Champion No-Limit Texas Hold'em Agent. In Proceedings of the International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2015.

Participants: 2014

The 2014 competition had 12 different agents in the heads-up limit Texas hold'em competition, 16 agents in the heads-up no-limit competition, and 6 agents in the 3-player limit competition. As in previous years, agents were submitted by a mixture of universities and individual hobbyists from at least 11 different countries around the world.

Competitors in the 2014 Annual Computer Poker Competition were not required to supply detailed information about their submission(s) in order to compete, but some information about team members, affiliation, location, high level technique descriptions, and occasionally relevant papers were supplied. This page presents that information.


Heads-up Limit Texas Hold'em

Cleverpiggy

  • Team Name: Cleverpiggy
  • Team Members: Allen Cunningham
  • Affiliation: Independent
  • Location: Marina del Rey, CA, US
  • Non-dynamic Agent
  • Technique: Cleverpiggy is the progeny of 20 billion iterations of chance-sampled CFR run on a quad-core Intel with 48 GB of RAM.  She uses a card abstraction with 169, 567528, 60000, and 180000 hand types for each respective street.  Flop hands are essentially unabstracted.  For the turn and river, board types are established by dividing all flops into 20 categories, each of which branches into 3 turns, which branch into 3 rivers, resulting in 60 turn and 180 river distinctions.  Hands for each board type are then divided into 1000 buckets based on earth mover and OCHS clustering.
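
    As a hedged illustration of the "earth mover" side of this kind of bucketing, the snippet below measures distances between hand-strength histograms with the 1-Wasserstein (earth mover's) distance; the histograms, bin layout and hand names are invented toy data, not Cleverpiggy's.

    import numpy as np
    from scipy.stats import wasserstein_distance

    def emd(hist_a, hist_b, bin_centers):
        """Earth mover's distance between two hand-strength histograms."""
        return wasserstein_distance(bin_centers, bin_centers, hist_a, hist_b)

    # Toy example: three hands summarised as histograms over 10 equity bins.
    bins = np.linspace(0.05, 0.95, 10)
    made_hand = np.array([0, 0, 0, 0, 0, 1, 2, 3, 3, 1], dtype=float)
    draw_hand = np.array([3, 2, 1, 0, 0, 0, 0, 1, 2, 1], dtype=float)
    weak_hand = np.array([4, 3, 2, 1, 0, 0, 0, 0, 0, 0], dtype=float)

    print(emd(made_hand, draw_hand, bins))  # draws sit far from made hands...
    print(emd(draw_hand, weak_hand, bins))  # ...and somewhat closer to weak hands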


    Regret Minimization in Games with Incomplete Information 2007
    Evaluating State-Space Abstractions in Extensive-Form Games 2013

Escabeche

  • Team Members: Marv Andersen
  • Affiliation: Independent
  • Location: London, UK
  • Non-dynamic Agent
  • Technique: This bot is a neural net trained to imitate the play of previous ACPC winners.

Feste

  • Team Name: Feste
  • Team Members: Francois Pays
  • Affiliation: Independent
  • Location: Paris, France
  • Dynamic Agent
  • Technique: The card abstraction uses respectively 169, 1500, 400 and 200 buckets for preflop, flop, turn and river. The buckets are computed using k-means clustering over selected hand parameters such as expected values, standard deviation and skewness at last round.

    The resulting abstraction is represented using sequence form with the imperfect recall extension and has 1.3 billion game states. It is solved using a custom interior point solver with indirect algebra [1]. The solver runs on a mid-range workstation and is GPU-accelerated with CUDA.

    Feste has two strategies at its disposal: a defensive one, close to the equilibrium but still slightly offensive, and a second strategy, definitely aggressive and consequently off the equilibrium. During the course of the game, Feste uses Thompson sampling to select the best adapted strategy for the opponent.
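
    A hedged sketch of Thompson-sampling strategy selection of the kind described above, assuming a simple Gaussian model of each strategy's per-hand winnings; the prior, noise scale and class names are illustrative choices, not Feste's implementation.

    import random

    class ThompsonSelector:
        def __init__(self, num_strategies, obs_std=20.0):
            self.n = [0] * num_strategies          # hands played by each strategy
            self.mean = [0.0] * num_strategies     # running mean reward (big blinds per hand)
            self.obs_std = obs_std                 # assumed per-hand reward standard deviation

        def choose(self):
            """Sample a plausible mean for each strategy and pick the best sample."""
            samples = [
                random.gauss(self.mean[i], self.obs_std / ((self.n[i] + 1) ** 0.5))
                for i in range(len(self.n))
            ]
            return max(range(len(samples)), key=samples.__getitem__)

        def update(self, strategy, reward):
            self.n[strategy] += 1
            self.mean[strategy] += (reward - self.mean[strategy]) / self.n[strategy]

    selector = ThompsonSelector(num_strategies=2)  # 0: defensive, 1: aggressive
    strategy = selector.choose()
    # ... play one hand with the chosen strategy, then feed back its result in big blinds:
    selector.update(strategy, reward=3.5)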


    [1] Francois Pays. 2014. An Interior Point Approach to Large Games of Incomplete Information. Proceedings of the AAAI-2014 Workshop on Computer Poker.

Hyperborean

  • Team Name: University of Alberta
  • Team Members: Michael Bowling, Duane Szafron, Rob Holte, Nolan Bard, Neil Burch, Richard Gibson, John Hawkin, Michael Johanson, Trevor Davis, Josh Davidson, Dustin Morrill
  • Affiliation: University of Alberta
  • Location: Edmonton, Alberta, Canada
  • Dynamic Agent
  • Technique: Hyperborean2014-2pl is an implicit modelling agent [2] consisting of four abstract strategies. All strategies were generated using the Counterfactual Regret Minimization (CFR) algorithm [8] with imperfect recall card abstractions.  Buckets were calculated according to public card textures and the k-means Earthmover and k-means OCHS buckets recently presented by Johanson et al [6].  By forgetting previous card information and rebucketing on every round [7], this yields an imperfect recall abstract game.

    The portfolio of strategies for the agent consists of:

    1) An asymmetric equilibrium strategy

    An asymmetric equilibrium strategy was generated to exploit mistakes that can be made by equilibrium based agents using smaller abstractions of the game [3]. The abstraction for the final strategy uses 169, 1,348,620,  1,521,978, and 840,000 buckets on each round, respectively.  During training with CFR, the opponent uses a smaller abstraction with 169 buckets on the pre-flop, and 9,000 buckets on each subsequent round.

    2) Three data biased robust counter strategies based on prior ACPC competitors

    Three strategies in the portfolio are designed to exploit specific players from prior ACPC events.  Each response was created using data biased robust counter strategies [5] to data from a particular competitor in prior ACPC events.  An asymmetric abstraction is used for the frequentist model used by the data biased response [3], placing observations of the player in a smaller abstraction than the regret minimizing portions of the strategy.

    A mixture of these strategies is dynamically generated using a slightly modified Exp4-like algorithm [1] where the reward vector for the experts/strategies is computed using importance sampling over the individual strategies [4].
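
    A hedged sketch of the Exp4-style mixing just described: one weight per portfolio strategy, updated multiplicatively from an estimated per-hand reward vector (in the real agent those estimates come from importance sampling). The class, learning rate and reward values are illustrative.

    import math
    import random

    class StrategyMixer:
        def __init__(self, num_strategies, eta=0.05):
            self.weights = [1.0] * num_strategies
            self.eta = eta                           # learning rate

        def choose(self):
            """Sample which portfolio strategy plays the next hand."""
            return random.choices(range(len(self.weights)), weights=self.weights, k=1)[0]

        def update(self, estimated_rewards):
            """estimated_rewards[i]: estimated payoff had strategy i played the hand."""
            for i, r in enumerate(estimated_rewards):
                self.weights[i] *= math.exp(self.eta * r)

    mixer = StrategyMixer(num_strategies=4)          # equilibrium plus three counter-strategies
    chosen = mixer.choose()
    # ... play one hand with the chosen strategy, estimate what every strategy would have
    # earned (e.g. via importance sampling), and update the weights:
    mixer.update([0.2, -0.1, 0.05, 0.0])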


    [1] P Auer, N Cesa-Bianchi, Y Freund, and R.E Schapire. Gambling in a rigged casino: The adversarial multi-armed bandit problem. Proceedings of the 36th Annual Symposium on Foundations of Computer Science, 1995.

    [2] Nolan Bard, Michael Johanson, Neil Burch, Michael Bowling. Online Implicit Agent Modelling. In Proceedings of the Twelfth International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2013.

    [3] Nolan Bard, Michael Johanson, Michael Bowling.  Asymmetric Abstractions for Adversarial Settings.  In Proceedings of the Thirteenth International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS), 2014.

    [4] Michael Bowling, Michael Johanson, Neil Burch, and Duane Szafron. Strategy Evaluation in Extensive Games with Importance Sampling. In Proceedings of the 25th Annual International Conference on Machine Learning (ICML), 2008.

    [5] Michael Johanson and Michael Bowling. Data Biased Robust Counter Strategies. In Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics (AISTATS), 2009.

    [6] Michael Johanson, Neil Burch, Richard Valenzano, and Michael Bowling. Evaluating state-space abstractions in extensive-form games. In Proceedings of the Twelfth International Conference on Autonomous Agents and Multiagent Systems (AAMAS), pages 271–278, 2013.

    [7] Kevin Waugh, Martin Zinkevich, Michael Johanson, Morgan Kan, David Schnizlein, and Michael Bowling. A Practical Use of Imperfect Recall. In Proceedings of the Eighth Symposium on Abstraction, Reformulation and Approximation (SARA), 2009.

    [8] Martin Zinkevich, Michael Johanson, Michael Bowling, and Carmelo Piccione. Regret minimization in games with incomplete information. In Advances in Neural Information Processing Systems 20 (NIPS), 2007.

Lucifer

  • Team Name: PokerCPT
  • Team Members: Luis Filipe Teofilo
  • Affiliation: University of Porto, Artificial Intelligence and Computer Science Laboratory
  • Location: Porto, Portugal
  • Dynamic Agent
  • Technique: The base agent's strategies are Nash equilibrium (NE) approximations. Several NE strategies were computed, and the agent switches between them to make opponent modelling difficult (especially on Kuhn3P). To compute the NE strategies, an implementation of CFR was used. This implementation greatly reduces the game tree by removing decisions at chance nodes where the agent knows that it has a very high or very low probability of winning. For multiplayer poker, the CFR implementation abstracts game sequences. The methodology for grouping card buckets was based on grouping buckets by their utility in smaller games. As for no-limit, the actions were also abstracted into 4 possible decisions.

PokerStar

  • Team Name: PokerStar
  • Team Members: Ingo Kaps
  • Location: Frankfurt, Hessen, Germany
  • Dynamic Agent
  • Technique: The PokerStar Bot is written in Pascal.

    Preflop, the PokerStar Bot plays 2.5% fold, 95% call, 2.5% raise,
    so that other bots cannot easily model it.

    After the preflop, the PokerStar Bot calculates an opponent-based squared weighted hand strength
    and uses an optimized static-bucket CFR table.
    If the opponent's raise ratio is too low, a rule-based strategy is used.
    If the opponent's raise ratio is very low, a Selby preflop strategy is additionally used.
    If the opponent's raise ratio is too high, the PokerStar Bot will always call when the opponent
    raises, and will raise if the opponent checks.

ProPokerTools

  • Team Name: ProPokerTools
  • Team Members: Dan Hutchings
  • Affiliation: ProPokerTools
  • Location: Lakewood, Colorado, US
  • Non-dynamic Agent
  • Technique: This HULHE agent was created using established methods: regret minimization,
    partial recall, etc.

    Last year, I gave myself a constraint in building my AI agents: all agents were
    created on a single machine that cost less than $1,000. This year, I have loosened that constraint to allow $1,000 worth of rented compute time in the 'cloud'.

    This year's entry has been improved in three different areas: size (9 times larger), build time (9 times longer), and abstraction quality. Tests using the 2013 benchmark server show this agent would likely have placed third or fourth in the instant run-off competition if it were entered last year. Additional improvements have been developed but were not ready in time for this year's competition.

Slugathorus

  • Team Name: Slugathorus
  • Team Members: Daniel Berger
  • Affiliation: University of South Wales
  • Dynamic Agent
  • Technique: The agent combines a precomputed approximate Nash equilibrium strategy generated by public chance sampled MCCFR with a new modelling technique designed to identify when the opponent is making mistakes and exploit them.

    "Modelling Player Weakness in Poker". (Berger, 2013)
    "Efficient Nash Equilibrium Approximation through Monte Carlo Counterfactual Regret Minimization". (Johanson, 2012)
    "Regret Minimization in Games With Incomplete Information". (Zinkevich, 2007)

SmooCT

  • Team Name: SmooCT
  • Team Members: Johannes Heinrich
  • Affiliation: University College London
  • Non-dynamic Agent
  • Technique: SmooCT was trained from self-play Monte-Carlo tree search, using Smooth UCT [2]. The resulting strategy aims to approximate a Nash equilibrium. The agent uses an imperfect recall abstraction [1] based on an equidistant discretisation of expected hand strength squared values. The abstraction uses 169, 2000, 1000 and 600 buckets for the four betting rounds respectively.
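
    A minimal sketch of bucketing by expected hand strength squared (E[HS^2]) with an equidistant discretisation, as described above. The hand-strength samples are toy inputs; computing them by rolling out future board cards is omitted.

    def ehs2_bucket(future_hand_strengths, num_buckets):
        """Map a hand to a bucket index in [0, num_buckets) by its E[HS^2]."""
        ehs2 = sum(hs * hs for hs in future_hand_strengths) / len(future_hand_strengths)
        # Equidistant bins over [0, 1]; clamp the value 1.0 into the last bucket.
        return min(int(ehs2 * num_buckets), num_buckets - 1)

    # A drawing hand (strength ends up very high or very low) gets a higher E[HS^2]
    # than a hand with the same mean strength but no variance.
    draw = [0.1, 0.1, 0.9, 0.9]      # E[HS] = 0.5, E[HS^2] = 0.41
    static = [0.5, 0.5, 0.5, 0.5]    # E[HS] = 0.5, E[HS^2] = 0.25
    print(ehs2_bucket(draw, 1000), ehs2_bucket(static, 1000))  # 410 250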

    [1] Kevin Waugh, Martin Zinkevich, Michael Johanson, Morgan Kan, David Schnizlein, and Michael Bowling. "A Practical Use of Imperfect Recall". Proceedings of the Eighth Symposium on Abstraction, Reformulation and Approximation (SARA), 2009.
    [2] Johannes Heinrich and David Silver. "Self-Play Monte-Carlo Tree Search in Computer Poker". To appear in 2014.

Terrible Terrance

  • Team Members: Jon Parker
  • Affiliation: Georgetown University and Johns Hopkins University
  • Non-dynamic Agent
  • Technique: The solution was built using sparse dynamic programming.  The "dynamic programming" portion of the solution is due to reusing precomputed "PayoffMatrix" objects and "StrategyBlock" objects.  Saving the PayoffMatrices allows the bot to accurately estimate upstream payoffs with relatively little computation.  Saving the StrategyBlock objects allows "new" Blocks to be initialized with solutions that already reflect some CFR-style iterations.  No board abstraction is used because the "best" StrategyBlock to use for a betting round is found by looking up the nearest StrategyBlock given the relevant BettingSequence and BoardSequence.

    My bot folds very few hands preflop (when compared against winners from prior HULHE competitions) and it also open calls a decent fraction of the time.  I am somewhat concerned that the open calling behavior is indicative of an error in the bot somewhere.  However, I have searched very hard for an error and haven't found one.  Moreover, the overall "game value" I computed is almost identical to the "game value" Eric Jackson (winner from 2012) found (I asked him in an email about this).

Heads-up No-limit Texas Hold'em

Feste

  • Team Name: Feste
  • Team Members: Francois Pays
  • Affiliation: Independent
  • Location: Paris, France
  • Non-dynamic Agent
  • Technique: The card abstraction uses respectively 169, 400, 50 and 25 buckets for preflop, flop, turn and river. The buckets are computed using k-means clustering over selected hand parameters such as expected values, standard deviation and skewness at last round. The betting abstraction is quite coarse: half pot (only at first bet), pot, quad-pot and all-in.

    The abstraction is represented using sequence form with the imperfect recall extension. The resulting abstraction has only 300 million game states. It is solved using a custom interior point solver with indirect algebra [1]. The solver runs on a mid-range workstation and is GPU-accelerated with CUDA. The solver has been tested up to 3 billion game states and can therefore handle abstractions ten times larger, but interestingly, either finer card or betting abstractions did not result in stronger no-limit players.

    Since Feste is not yet able to gather accurate enough information from its opponent in 3000-hand games, there is no dynamic adaptation. The instant runoff player follows a defensive strategy and the total-bankroll player, a slightly more aggressive one.


    [1] Francois Pays. 2014. An Interior Point Approach to Large Games of Incomplete Information. Proceedings of the AAAI-2014 Workshop on Computer Poker.

HibiscusBiscuit

  • Team Name: Cleverpiggy
  • Team Members: Allen Cunningham
  • Affiliation: Independent
  • Location: Marina del Rey, CA, US
  • Non-dynamic Agent
  • Technique: Hibiscus Biscuit consists of separately trained big blind and button strategies with knowledge of different bet sizes.  In each case the hero uses only a couple of sizes but defends against many.  Both sides use a card abstraction of 169, 20000, 18000 and 17577 buckets.  These consist of board card distinctions, earth mover clustering for the flop and turn, and clustering over (wins, ties) for the river.


    Regret Minimization in Games with Incomplete Information 2007
    Evaluating State-Space Abstractions in Extensive-Form Games 2013

Hyperborean (instant run-off)

  • Team Name: University of Alberta
  • Team Members: Michael Bowling, Duane Szafron, Rob Holte, Nolan Bard, Neil Burch, Richard Gibson, John Hawkin, Michael Johanson, Trevor Davis, Josh Davidson, Dustin Morrill
  • Affiliation: University of Alberta
  • Location: Edmonton, Alberta, Canada
  • Non-dynamic Agent
  • Technique: Hyperborean2014-NL-IRO is a Nash equilibrium approximation trained using PureCFR [1, Section 5.5],
    a recent CFR variant developed by Oskari Tammelin.  Its card abstraction is symmetric
    and uses imperfect recall, with 169 (perfect) preflop buckets, 18630 flop buckets,
    3700 turn buckets and 1175 river buckets, using the k-means Earthmover and k-means OCHS
    buckets recently presented by Johanson et al [2].  The betting abstraction is asymmetric,
    and has different bet sizes and limits for the opponent and the agent.  The opponent is anticipated
    to have a large number of bets, including min-bets or tenth-pot bets, with higher limits
    on how many bets of each type can be made in each round than the agent.  The agent has a similar set of bets,
    including 0.1-pot and 0.25-pot but not including min-bets, with bet sizes 1.5-pot and less being
    restricted to its first two actions.  This asymmetric betting abstraction gives the agent the ability to
    interpret many actions of the opponent in order to limit the impact of translation errors, while still having
    a few unusual bet sizes (0.1, 0.25, 0.65) that may cause translation errors in the opponents.

    Since this agent is asymmetric, computing its strategy required solving two abstract games.
    The game with the opponent in seat 1 had 9,765,306,248 information sets and 26,879,972,986 infoset-actions,
    and the game with the opponent in seat 2 had 10,986,105,934 information sets and 30,285,810,764 infoset-actions.
    While the average strategy is the component of CFR that converges to an equilibrium, for
    this set of strategies we only ran 80 billion and 84 billion iterations of PureCFR respectively, and
    for this size of game we anticipated improvement up to 300 billion or more iterations.  Instead of
    the average strategy, this agent uses the current strategy which does not converge to equilibrium,
    but has been demonstrated by Gibson to improve much more quickly in in-game performance [1, Section 4.4.3].


    [1] Richard Gibson. Regret Minimization in Games and the Development of Champion Multiplayer Computer Poker-Playing Agents. PhD Thesis. University of Alberta, 2013.
    [2] Michael Johanson, Neil Burch, Richard Valenzano, and Michael Bowling. Evaluating state-space abstractions in extensive-form games. In Proceedings of the Twelfth International Conference on Autonomous Agents and Multiagent Systems (AAMAS), pages 271–278, 2013.

Hyperborean (total bankroll)

  • Team Name: University of Alberta
  • Team Members: Michael Bowling, Duane Szafron, Rob Holte, Nolan Bard, Neil Burch, Richard Gibson, John Hawkin, Michael Johanson, Trevor Davis, Josh Davidson, Dustin Morrill
  • Affiliation: University of Alberta
  • Location: Edmonton, Alberta, Canada
  • Dynamic Agent
  • Technique: Hyperborean2014-2pn-TBR is an implicit modelling agent [2] consisting of three abstract strategies. All strategies were generated using the Counterfactual Regret Minimization (CFR) algorithm [10] with imperfect recall abstractions [9]. We also abstract the raise action to a number of bets relative to the pot size. All strategies make raises equal to 0.5, 0.75, 1, 1.5, 3, 6, 11, 20, or 40 times the pot size, or go all-in. The portfolio of strategies for the agent consists of:

    1) A Nash equilibrium approximation

    To create our abstract game for the strategy, we first partitioned the betting sequences into two parts: an "important" part, and an "unimportant" part. Importance was determined according to the frequency with which one of our preliminary 2-player no-limit programs was faced with a decision at that betting sequence in self-play, as well as according to the number of chips in the pot. Then, we employed two different granularities of abstraction, one for each part of this partition. The unimportant part used 169, 3700, 3700, and 3700 buckets per betting round respectively, while the important part used 169, 180,000, 1,530,000, and 1,680,000 buckets per betting round respectively. Buckets were calculated according to public card textures and the k-means Earthmover and k-means OCHS buckets recently presented by Johanson et al [8].  By forgetting previous card information and rebucketing on every round [9], this yields an imperfect recall abstract game. The strategy profile of this abstract game was computed from approximately 498 billion iterations of PureCFR [5, Section 5.5], a recent CFR variant developed by Oskari Tammelin.  This type of strategy is also known as a "dynamic expert strategy" [6].

    2) A data biased response to aggregate data of 2011 and 2012 ACPC competitors

    One exploitive response in the portfolio was created using data biased robust counter strategies [7] to aggregate data from all of the agents in the 2011 and 2012 heads-up no-limit ACPC events. It uses the same betting abstraction as the above Nash equilibrium approximation, but the card abstraction consists of 169, 9000, 9000, and 3700 k-means Earthmover and k-means OCHS buckets per betting round uniformly across the game tree.  An asymmetric abstraction is used for the frequentist model used by the data biased response [3].  The model's abstraction ignores card information and only models agents on their abstract betting.

    3) A data biased response to aggregate data of some 2013 ACPC competitors

    The second exploitive response uses the same abstract game as the previous DBR, but only uses data from agents that weren't beaten by the 2013 Hyperborean TBR entry for at least 750 mbb/g.

    A mixture of these agents is dynamically generated using a slightly modified Exp4-like algorithm [1] where the reward vector for the experts/strategies is computed using importance sampling over the individual strategies [4].


    [1] P Auer, N Cesa-Bianchi, Y Freund, and R.E Schapire. Gambling in a rigged casino: The adversarial multi-armed bandit problem. Proceedings of the 36th Annual Symposium on Foundations of Computer Science, 1995.

    [2] Nolan Bard, Michael Johanson, Neil Burch, Michael Bowling. Online Implicit Agent Modelling. In Proceedings of the Twelfth International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2013.

    [3] Nolan Bard, Michael Johanson, Michael Bowling.  Asymmetric Abstractions for Adversarial Settings.  In Proceedings of the Thirteenth International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS), 2014.

    [4] Michael Bowling, Michael Johanson, Neil Burch, and Duane Szafron. Strategy Evaluation in Extensive Games with Importance Sampling. In Proceedings of the 25th Annual International Conference on Machine Learning (ICML), 2008.

    [5] Richard Gibson. Regret Minimization in Games and the Development of Champion Multiplayer Computer Poker-Playing Agents. PhD Thesis. University of Alberta, 2013.

    [6] Richard Gibson and Duane Szafron.  On Strategy Stitching in Large Extensive Form Multiplayer Games.  In Proceedings of the Twenty-Fifth Conference on Neural Information Processing Systems (NIPS), 2011.

    [7] Michael Johanson and Michael Bowling. Data Biased Robust Counter Strategies. In Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics (AISTATS), 2009.

    [8] Michael Johanson, Neil Burch, Richard Valenzano, and Michael Bowling. Evaluating state-space abstractions in extensive-form games. In Proceedings of the Twelfth International Conference on Autonomous Agents and Multiagent Systems (AAMAS), pages 271–278, 2013.

    [9] Kevin Waugh, Martin Zinkevich, Michael Johanson, Morgan Kan, David Schnizlein, and Michael Bowling. A Practical Use of Imperfect Recall. In Proceedings of the Eighth Symposium on Abstraction, Reformulation and Approximation (SARA), 2009.

    [10] Martin Zinkevich, Michael Johanson, Michael Bowling, and Carmelo Piccione. Regret minimization in games with incomplete information. In Advances in Neural Information Processing Systems 20 (NIPS), 2007.

ArizonaStu (KEmpfer)

  • Team Name: KEmpfer
  • Team Members: Eneldo Loza Mencia, Julian Prommer
  • Affiliation: Knowledge Engineering Group - Technische Universität Darmstadt
  • Location: Darmstadt, Germany
  • Non-dynamic Agent
  • Technique: The agent implements a list of expert rules and follows these. Additional opponent statistics can be collected and be used in the rules, but we currently do not make use of this option. The backup strategy if no expert rule is found is to play according to the all-in equity and pot-odds.
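
    A minimal sketch of the all-in-equity/pot-odds fallback described above: call when the estimated equity exceeds the price the pot is offering. The numbers and the fold/call-only action set are illustrative.

    def pot_odds(call_amount, pot_size):
        """Fraction of the final pot the caller has to contribute."""
        return call_amount / (pot_size + call_amount)

    def fallback_action(allin_equity, call_amount, pot_size):
        """Call if equity beats the pot odds, otherwise fold (raises are ignored here)."""
        return "call" if allin_equity >= pot_odds(call_amount, pot_size) else "fold"

    # Facing a 500-chip bet into a 1000-chip pot we need 500/1500 = 33.3% equity to call.
    print(fallback_action(allin_equity=0.40, call_amount=500, pot_size=1000))  # call
    print(fallback_action(allin_equity=0.25, call_amount=500, pot_size=1000))  # fold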

Little Rock

  • Team Name: Little Rock
  • Team Members: Rod Byrnes
  • Affiliation: Independent
  • Location: Goonellabah, NSW, Australia
  • Non-dynamic Agent
  • Technique: External sampling MCCFR approach, virtually the same as last year but with a few enhancements to the card abstraction and action abstraction techniques.


    Monte Carlo Sampling for Regret Minimization in Extensive Games. Marc Lanctot, Kevin Waugh, Martin Zinkevich, and Michael Bowling. In Advances in Neural Information Processing Systems 22 (NIPS), pp. 1078-1086, 2009.

Lucifer

  • Team Name: PokerCPT
  • Team Members: Luis Filipe Teofilo
  • Affiliation: University of Porto, Artificial Intelligence and Computer Science Laboratory
  • Location: Porto, Portugal
  • Dynamic Agent
  • Technique: The base agent's strategies are Nash equilibrium (NE) approximations. Several NE strategies were computed, and the agent switches between them to make opponent modelling difficult (especially on Kuhn3P). To compute the NE strategies, an implementation of CFR was used. This implementation greatly reduces the game tree by removing decisions at chance nodes where the agent knows that it has a very high or very low probability of winning. For multiplayer poker, the CFR implementation abstracts game sequences. The methodology for grouping card buckets was based on grouping buckets by their utility in smaller games. As for no-limit, the actions were also abstracted into 4 possible decisions.

Nyx

  • Team Name: Nyx
  • Team Members: Martin Schmid
  • Affiliation: Charles University
  • Location: Prague, Czech Republic
  • Non-dynamic Agent
  • Technique: Improved version of the previous Nyx, with a better action abstraction and a new automatic public card abstraction.

PijaiBot

  • Team Name: PijaiBot
  • Team Members: Ryan Pijai
  • Affiliation: Independent
  • Location:  Orlando, FL, USA
  • Non-dynamic Agent
  • Technique: PijaiBot is an Imperfect Recall, Approximate Nash Equilibrium agent with novel approaches to card-clustering, bet-translating, and opponent-trapping.  All non-isomorphically-similar card situations are grouped together using K-Means Clustering with Bhattacharyya Distance of Expected Hand-Strengths as the distance measure rather than more traditional distance measures.  When opponent bet sizes do not match any of the sizes in PijaiBot's betting abstraction, PijaiBot interprets bet sizes using Soft Translation of Geometric Similarity based on pot-odds, rather than on pot-relative or stack-relative bet sizes as has been done in the past.

    PijaiBot has special-case logic for translating small bets that do not match any of its abstraction bet sizes into checks and calls, and is able to patch up its internal abstract betting history as needed to make those translations valid action sequences.  PijaiBot attempts to exploit other agents that do not handle this and other types of similarly tricky situations properly by occasionally overriding PijaiBot's own Nash Equilibrium strategy suggestions with potentially more damaging actions that test and confuse its opponents by inducing them into misrepresenting the true state of the game.
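
    A hedged illustration of soft bet translation: an observed bet that falls between two abstraction sizes is mapped probabilistically to both neighbours. The ratio-based (log-space) similarity below is a stand-in, not PijaiBot's exact pot-odds-based measure, and the bet sizes are invented.

    import math
    import random

    def soft_translate(observed, abstraction_sizes):
        """Return (size, probability) pairs mapping an observed bet into the abstraction."""
        sizes = sorted(abstraction_sizes)
        if observed <= sizes[0]:
            return [(sizes[0], 1.0)]
        if observed >= sizes[-1]:
            return [(sizes[-1], 1.0)]
        hi = next(i for i, s in enumerate(sizes) if s >= observed)
        a, b = sizes[hi - 1], sizes[hi]
        # Geometric (ratio-based) similarity: closer in log space means higher probability.
        w_a = math.log(b / observed)
        w_b = math.log(observed / a)
        total = w_a + w_b
        return [(a, w_a / total), (b, w_b / total)]

    # The abstraction knows half-pot, pot and 2x-pot bets; the opponent bets 0.7 pot.
    mapping = soft_translate(0.7, [0.5, 1.0, 2.0])
    choice = random.choices([s for s, _ in mapping], weights=[p for _, p in mapping], k=1)[0]
    print(mapping, choice)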

Prelude

  • Team Name: Prelude
  • Team Members: Tim Reiff
  • Affiliation: Unfold Poker
  • Location:  Las Vegas, NV, USA
  • Non-dynamic Agent
  • Technique: Prelude is an equilibrium strategy that implements several published techniques, including the training algorithm Pure CFR, the opponent bet translation method, and a card abstraction based on k-means clustering over hand strength distributions.  I had hoped to test an agent with some ambitious importance sampling, but that one is converging slowly...  Thus, I whipped up a more conservative entry, with a streamlined betting tree partly designed for faster training.  I snuck in a few minor experiments still, including modified EV histograms and OCHS categories for the card abstraction, selective use of purification thresholding, and some speculative bet sizing adjustments.
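
    A hedged sketch of purification thresholding as mentioned above: action probabilities below a threshold are zeroed and the rest renormalised (full purification would keep only the most probable action). The threshold value is arbitrary, not Prelude's.

    def threshold_strategy(probs, threshold=0.15):
        kept = [p if p >= threshold else 0.0 for p in probs]
        total = sum(kept)
        if total == 0.0:                        # everything fell below the threshold:
            kept = [0.0] * len(probs)           # fall back to the single most likely action
            kept[probs.index(max(probs))] = 1.0
            return kept
        return [p / total for p in kept]

    print(threshold_strategy([0.05, 0.60, 0.35]))  # [0.0, 0.631..., 0.368...]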


    1. Sam Ganzfried and Tuomas Sandholm. "Tartanian5: A Heads-Up No-Limit Texas Hold'em Poker-Playing Program". In Computer Poker Symposium at the National Conference on Artificial Intelligence (AAAI), 2012.
    2. Richard Gibson. "Regret Minimization in Games and the Development of Champion Multiplayer Computer Poker-Playing Agents". PhD thesis, University of Alberta, 2014.
    3. Greg Hamerly. "Making k-means even faster".  In proceedings of the 2010 SIAM international conference on data mining (SDM 2010), April 2010.
    4. Eric Jackson. "Slumbot NL: Solving Large Games with Counterfactual Regret Minimization Using Sampling and Distributed Processing". In Computer Poker Workshop on Artificial Intelligence (AAAI), 2013.
    5. Michael Johanson, Neil Burch, Richard Valenzano, and Michael Bowling. "Evaluating State-Space Abstractions in Extensive-Form Games". In Proceedings of the Twelfth International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2013.
    6. Martin Zinkevich, Michael Johanson, Michael Bowling, and Carmelo Piccione. "Regret Minimization in Games with Incomplete Information". In Proceedings of Advances in Neural Information Processing Systems (NIPS), 2007.

SartreNLExp

  • Team Name: Sartre
  • Team Members: Kevin Norris, Jonathan Rubin, Ian Watson
  • Affiliation: The University of Auckland
  • Location: Auckland, New Zealand
  • Dynamic Agent
  • Technique: SartreNLExp combines an approximate Nash equilibrium strategy with exploitation capabilities. It plays a base approximate Nash equilibrium strategy that was created by imitating the play [1] of the 2013 Slumbot agent [2]. SartreNLExp also incorporates a statistical exploitation module that models the opponent online and identifies exploitable statistical anomalies in the opponent's play. When a game state arises where the statistical exploitation module is able to exploit one of the opponent's statistical anomalies, it overrides the base strategy and provides an exploitive action. Together the base strategy and statistical exploitation module provide safe opponent exploitation, given that the opponent model is an accurate reflection of the opponent's action frequencies. The agent has been improved from its previous iteration, presented in [3]. The exploitation capabilities of the statistical exploitation module have been greatly improved and the opponent model has been entirely overhauled. Additionally, a novel decaying history method, statistic-specific decaying history, has been implemented to ensure the opponent model is able to accurately reflect the frequency statistics of both static and dynamic opponents.
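
    A hedged sketch of a decaying frequency statistic of the kind such an opponent model might maintain: recent observations count more than old ones, so the model can track both static and dynamic opponents. The decay constant and continuation-bet example are illustrative, not SartreNLExp's actual values.

    class DecayingFrequency:
        def __init__(self, decay=0.95):
            self.decay = decay
            self.numerator = 0.0     # decayed count of "event happened"
            self.denominator = 0.0   # decayed count of opportunities

        def observe(self, happened):
            self.numerator = self.numerator * self.decay + (1.0 if happened else 0.0)
            self.denominator = self.denominator * self.decay + 1.0

        def frequency(self):
            return self.numerator / self.denominator if self.denominator else 0.0

    # An opponent whose flop continuation-bet frequency drifts from 100% to 0%.
    cbet = DecayingFrequency(decay=0.95)
    for _ in range(100):
        cbet.observe(True)
    for _ in range(50):
        cbet.observe(False)
    print(round(cbet.frequency(), 3))  # dominated by the recent misses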

    1. Rubin, J., & Watson, I. (2011). Successful performance via decision generalisation in no limit Texas Hold’em. In Case-Based Reasoning Research and Development (pp. 467-481). Springer Berlin Heidelberg.
    2. Jackson, E. (2013, June). Slumbot NL: Solving Large Games with Counterfactual Regret Minimization Using Sampling and Distributed Processing. In Workshops at the Twenty-Seventh AAAI Conference on Artificial Intelligence.
    3. Norris, K., & Watson, I. (2013, August). A statistical exploitation module for Texas Hold'em: And it's benefits when used with an approximate nash equilibrium strategy. In Computational Intelligence in Games (CIG), 2013 IEEE Conference on (pp. 1-8). IEEE.

Slumbot

  • Team Name: Slumbot
  • Team Members: Eric Jackson
  • Affiliation: Independent
  • Non-dynamic Agent
  • Technique: I use Pure External CFR to compute an approximate equilibrium.  A decomposition technique described in my forthcoming workshop paper allows me to break the
    game tree into pieces that can be solved independently.  I employ an abstraction that uses more granularity (both more bet sizes and more buckets) at more commonly reached game states.


    "A Time and Space Efficient Algorithm for Approximately Solving Large Imperfect Information Games"; Eric Jackson; 2014; forthcoming in the Proceedings of the Workshop on Computer Poker and Imperfect Information at AAAI-14.

Tartanian7

  • Team Name: Tartanian
  • Team Members: Noam Brown, Sam Ganzfried, Tuomas Sandholm
  • Affiliation: Carnegie Mellon University
  • Location: Pittsburgh, PA, USA
  • Non-dynamic Agent
  • Technique:

    Tartanian7 plays an approximate Nash equilibrium strategy that was computed on Pittsburgh's shared-memory supercomputer, which has a cache coherent Non-Uniform Memory Access (ccNUMA) architecture. We developed a new abstraction algorithm and a new equilibrium-finding algorithm that enabled us to perform a massive equilibrium computation on this architecture.

    The abstraction algorithm first clusters public flop boards, assigning each cluster to a blade on the supercomputer. These public clusters are computed by clustering using a distance function based on how often our abstraction from last year grouped hands together on the flop with different sets of public cards. Within each cluster, the algorithm then buckets the flop, turn, and river hands that are possible given one of the public flops in the cluster, using an imperfect-recall abstraction algorithm. We did not perform any abstraction for the preflop round.

    Our equilibrium-finding algorithm is a modified version of external-sampling MCCFR. It samples one pair of preflop hands per iteration. For the postflop, each blade samples community cards from its public cluster and performs MCCFR in parallel. The samples are weighted to remove bias.

    Our agent also uses a novel reverse mapping technique that compensates for the failure of CFR to fully converge and for the possibility that the strategies overfit the abstraction.

3-player Limit Texas Hold'em

Hyperborean (instant run-off)

  • Team Name: University of Alberta
  • Team Members: Michael Bowling, Duane Szafron, Rob Holte, Nolan Bard, Neil Burch, Richard Gibson, John Hawkin, Michael Johanson, Trevor Davis, Josh Davidson, Dustin Morrill
  • Affiliation: University of Alberta
  • Location: Edmonton, Alberta, Canada
  • Non-dynamic Agent
  • Technique: (NOTE: This agent is the same as the 2013 ACPC's 3-player instant run-off Hyperborean entry.)

    Hyperborean2014-3pl-IRO is a Nash equilibrium approximation trained using
    PureCFR [1, Section 5.5], a recent CFR variant developed by Oskari Tammelin.
    Because 3-player hold'em is too large a game to apply CFR techniques directly,
    we employed an abstract game that merges card deals into "buckets" to create a
    game of manageable size.

    To create our abstract game, we first partitioned the betting sequences into two parts: an "important" part, and an "unimportant" part. Importance was determined according to the frequency with which our 3-player programs from the 2011 and 2012 ACPCs were faced with a decision at that betting sequence, as well as according to the number of chips in the pot. Then, we employed two different granularities of abstraction, one for each part of this partition. The unimportant part used 169, 180,000, 18,630, and 875 buckets per betting round respectively, while the important part used 169, 1,348,620, 1,530,000, and 2,800,000 buckets per betting round respectively. Buckets were calculated according to public card textures and k-means clustering over hand strength distributions [3] and yielded an imperfect recall abstract game by forgetting previous card information and rebucketing on every round [4]. The agent plays the "current strategy profile" computed from approximately 303.6 billion iterations of the PureCFR variant of CFR [1] applied to this abstract game. This type of strategy is also known as a "dynamic expert strategy" [2].


    [1] Richard Gibson. Regret Minimization in Games and the Development of Champion Multiplayer Computer Poker-Playing Agents. PhD Thesis. University of Alberta, 2013.

    [2] Richard Gibson and Duane Szafron.  On Strategy Stitching in Large Extensive Form Multiplayer Games.  In Proceedings of the Twenty-Fifth Conference on Neural Information Processing Systems (NIPS), 2011.

    [3] Michael Johanson, Neil Burch, Richard Valenzano, and Michael Bowling. Evaluating state-space abstractions in extensive-form games. In Proceedings of the Twelfth International Conference on Autonomous Agents and Multiagent Systems (AAMAS), pages 271–278, 2013.

    [4] Kevin Waugh, Martin Zinkevich, Michael Johanson, Morgan Kan, David Schnizlein, and Michael Bowling. A Practical Use of Imperfect Recall.  In Proceedings of the Eighth Symposium on Abstraction, Reformulation and Approximation (SARA), 2009.

Hyperborean (total bankroll)

  • Team Name: University of Alberta
  • Team Members: Michael Bowling, Duane Szafron, Rob Holte, Nolan Bard, Neil Burch, Richard Gibson, John Hawkin, Michael Johanson, Trevor Davis, Josh Davidson, Dustin Morrill
  • Affiliation: University of Alberta
  • Location: Edmonton, Alberta, Canada
  • Non-dynamic Agent
  • Technique: (NOTE: This agent is the same as the 2013 ACPC's 3-player total bankroll Hyperborean entry.)

    Hyperborean2014-3pl-TBR is a data biased response to aggregate data of ACPC competitors from the 2011 and 2012 3-player limit competitions [2]. The strategy was generated using the Counterfactual Regret Minimization (CFR) algorithm [6].  Asymmetric abstractions were used for the regret minimizing part of each player's strategy, and the frequentist model used by data biased response [1].  Each abstraction uses imperfect recall, forgetting previous card information and rebucketing on every round [5], with the k-means Earthmover and k-means OCHS buckets recently presented by Johanson et al [3]. The agent's strategy uses an abstraction with 169, 10000, 5450, and 500 buckets on each round of the game, respectively.  The model of prior ACPC competitors groups observations from all competitors into a model using 169, 900, 100, and 25 buckets on each round of the game, respectively.  The agent plays the "current strategy profile" generated after 20 billion iterations of external sampled CFR [4].



    [1] Nolan Bard, Michael Johanson, Michael Bowling.  Asymmetric Abstractions for Adversarial Settings.  In Proceedings of the Thirteenth International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS), 2014.

    [2] Michael Johanson and Michael Bowling. Data Biased Robust Counter Strategies. In Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics (AISTATS), 2009.

    [3] Michael Johanson, Neil Burch, Richard Valenzano, and Michael Bowling. Evaluating State-Space Abstractions in Extensive-Form Games. In Proceedings of the Twelfth International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2013.

    [4] Marc Lanctot, Kevin Waugh, Martin Zinkevich, and Michael Bowling. Monte Carlo Sampling for Regret Minimization in Extensive Games. In Proceedings of the Twenty-Third Conference on Neural Information Processing Systems (NIPS), 2009.

    [5] Kevin Waugh, Martin Zinkevich, Michael Johanson, Morgan Kan, David Schnizlein, and Michael Bowling. A Practical Use of Imperfect Recall. In Proceedings of the Eighth Symposium on Abstraction, Reformulation and Approximation (SARA), 2009.

    [6] Martin Zinkevich, Michael Johanson, Michael Bowling, and Carmelo Piccione. Regret minimization in games with incomplete information. In Advances in Neural Information Processing Systems 20 (NIPS), 2007.

Learn2KEmpf (KEmpfer)

  • Team Name: KEmpfer
  • Team Members: Eneldo Loza Mencia, Julian Prommer
  • Affiliation: Knowledge Engineering Group - Technische Universität Darmstadt
  • Location: Darmstadt, Germany
  • Non-dynamic Agent
  • Technique: This agent tries to mimic the behaviour of a given poker agent. Hence, it follows a similar strategy to Sartre from previous years, with two differences. Firstly, in contrast to Sartre, which uses case-based reasoning (basically k-nearest neighbors), we allow any learning algorithm to be used. In this particular submission, we used C4.5 to induce a model of a poker agent (more specifically, Weka's implementation J48). Secondly, a much more complete representation of a state is used, with up to 50 possible features. We even induce features which are convenient for modelling the opponent modelling used by the agent being imitated.
    For this year's submission, we learned the behaviour of Hyperborean from the logs of the 2013 three player limit competition. Hence, since Hyperborean uses a CFR strategy, we expect our bot to behave accordingly. However, it is not possible to perfectly replicate the behaviour of a bot (at least with the available data). Hence, we expect our agent to perform worse than a respective opponent using CFR in this year's competition.
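
    A hedged sketch of the imitation setup described above, with scikit-learn's decision tree standing in for Weka's J48/C4.5: logged game states become feature vectors, the observed action is the label, and the tree is fit to mimic the imitated agent. The features, values and labels below are invented.

    from sklearn.tree import DecisionTreeClassifier

    # Each row: [round, pot_size_bb, amount_to_call_bb, hand_equity, num_active_players]
    X = [
        [0, 3.0, 1.0, 0.62, 3],
        [0, 3.0, 1.0, 0.31, 3],
        [1, 8.0, 0.0, 0.55, 2],
        [1, 8.0, 4.0, 0.20, 2],
        [2, 16.0, 0.0, 0.80, 2],
    ]
    y = ["raise", "fold", "raise", "fold", "raise"]   # actions observed in the logs

    model = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X, y)
    print(model.predict([[2, 16.0, 8.0, 0.75, 2]]))   # imitated action for a new state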

    - Jonathan Rubin and Ian Watson. Case-Based Strategies in Computer Poker. AI Communications 25(1), pp. 19-48, 2012.
    - Jonathan Rubin and Ian Watson. Successful Performance via Decision Generalisation in No Limit Texas Hold'em. In Case-Based Reasoning Research and Development, Vol. 6880, pp. 467-481. Springer Berlin Heidelberg, 2011.
    - Theo Kischka. Trainieren eines Computer-Pokerspielers (Training a Computer Poker Player). Bachelor's Thesis, Technische Universität Darmstadt, Knowledge Engineering Group, 2014. http://www.ke.tu-darmstadt.de/lehre/arbeiten/bachelor/2014/Kischka_Theo.pdf

Lucifer

  • Team Name: PokerCPT
  • Team Members: Luis Filipe Teofilo
  • Affiliation: University of Porto, Artificial Intelligence and Computer Science Laboratory
  • Location: Porto, Portugal
  • Dynamic Agent
  • Technique: The base agent's strategies are Nash equilibrium (NE) approximations. Several NE strategies were computed, and the agent switches between them to make opponent modelling difficult (especially on Kuhn3P). To compute the NE strategies, an implementation of CFR was used. This implementation greatly reduces the game tree by removing decisions at chance nodes where the agent knows that it has a very high or very low probability of winning. For multiplayer poker, the CFR implementation abstracts game sequences. The methodology for grouping card buckets was based on grouping buckets by their utility in smaller games. As for no-limit, the actions were also abstracted into 4 possible decisions.

SmooCT

  • Team Name: SmooCT
  • Team Members: Johannes Heinrich
  • Affiliation: University College London
  • Location: London, UK
  • Non-dynamic Agent
  • Technique: SmooCT was trained from self-play Monte-Carlo tree search, using Smooth UCT [2]. The agent uses an imperfect recall abstraction [1] based on an equidistant discretisation of expected hand strength squared values. The abstraction uses 169 and 1000 buckets for the first two betting rounds. For the turn and river the abstraction granularity has been locally refined based on the number of visits to a node in self-play training. The numbers of turn and river buckets lie in [100,400] and [10,160] respectively.


    [1] Kevin Waugh, Martin Zinkevich, Michael Johanson, Morgan Kan, David Schnizlein, and Michael Bowling. "A Practical Use of Imperfect Recall". Proceedings of the Eighth Symposium on Abstraction, Reformulation and Approximation (SARA), 2009.
    [2] Johannes Heinrich and David Silver. "Self-Play Monte-Carlo Tree Search in Computer Poker". To appear in 2014.

3-player Kuhn Poker


Participants: 2012

The 2012 competition had 13 different agents in the heads-up limit Texas hold'em competition, 11 agents in the heads-up no-limit competition, and 5 agents in the 3-player limit competition. As in previous years, agents were submitted by a mixture of universities and individual hobbyists from 10 different countries around the world.

Competitors in the 2012 Annual Computer Poker Competition were not required to supply detailed information about their submission(s) in order to compete, but some information about team members, affiliation, location, high level technique descriptions, and occasionally relevant papers were supplied. This page presents that information.


Heads-up Limit Texas Hold'em

Entropy

  • Team Name: ERGOD
  • Team Leader: Ken Barry
  • Team Members: Ken Barry
  • Affiliation: ERGOD
  • Location: Athlone, Westmeath, Ireland
  • Technique:
  • Entropy is powered by "ExperienceEngine", an agent capable of acting intelligently in any indeterminate system. Development of ExperienceEngine is ongoing and its inner workings cannot be revealed at this time.

Feste

  • Team Name: Feste
  • Team Leader: François Pays
  • Team Members: François Pays
  • Affiliation: Independent
  • Location: Paris, France
  • Technique:
  • The 2-player limit game is modelled using the sequence form and solved as a min-max problem with a conventional interior-point method. The betting structure is kept intact with no loss of information, but card information states are aggregated into clusters depending on the betting round (flop, turn and river). The min-max problem is solved using a convex-concave variant of the log-barrier path-following interior-point method. The inner Newton system is a large sparse saddle-point system; using an ad hoc Krylov method along with preconditioning, the system is tractable on consumer hardware. As the solution is approached, the system becomes more and more ill-conditioned, so several techniques are used to stabilize the Krylov solver: dynamic precision control, variable elimination and regularization. The required accuracy is reached in about 250 iterations.

Huhuers

  • Team Name: Huhubot
  • Team Leader: Shawne Lo
  • Team Members: Shawne Lo, Wes Ren Tong
  • Affiliation: Independent
  • Location: Toronto, Canada
  • Technique:
    Case based reasoning through imitation of proven strong agents.

Hyperborean2p.iro

  • Team Name: University of Alberta
  • Team Leader: Michael Bowling
  • Team Members: Michael Bowling, Duane Szafron, Rob Holte, Chris Archibald, Michael Johanson, Nolan Bard, John Hawkin, Richard Gibson, Neil Burch, Parisa Mazrooei, Josh Davidson
  • Affiliation: University of Alberta
  • Location: Edmonton, Alberta, Canada
  • Technique:
    The 2-player instant run-off program is built using the Public Chance Sampling (PCS) [1] variant of Counterfactual Regret Minimization [2]. We solve a large abstract game, identical to Texas Hold'em in the preflop and flop. On the turn and river, we bucket the hands and public cards together, using approximately 1.5 million categories on the turn and 900 thousand categories on the river.
  • References and related papers:
    • Michael Johanson, Nolan Bard, Marc Lanctot, Richard Gibson, and Michael Bowling. "Efficient Nash Equilibrium Approximation through Monte Carlo Counterfactual Regret Minimization" In AAMAS 2012
    • Martin Zinkevich, Michael Johanson, Michael Bowling, and Carmelo Piccione. "Regret minimization in games with incomplete information" In NIPS 2008.

Hyperborean2p.tbr

  • Team Name: University of Alberta
  • Team Leader: Michael Bowling
  • Team Members: Michael Bowling, Duane Szafron, Rob Holte, Chris Archibald, Michael Johanson, Nolan Bard, John Hawkin, Richard Gibson, Neil Burch, Parisa Mazrooei, Josh Davidson
  • Affiliation: University of Alberta
  • Location: Edmonton, Alberta, Canada
  • Technique:
    Hyperborean-2012-2p-limit-tbr is an agent consisting of seven abstract strategies. All seven strategies were generated using the Counterfactual Regret Minimization (CFR) algorithm [1] with imperfect recall abstractions [3]. They are:

    • Two strategies in an imperfect recall abstraction using 57 million information sets that specifically counter opponents who always raise or always call.
    • An approximation of an equilibrium within a large imperfect recall abstraction that has 879,586,352 information sets, with an unabstracted, perfect recall preflop and flop.
    • Four strategies in the smaller (57 million information sets) abstraction that are responses to models of particular opponents seen in the 2010 or 2011 ACPC.

    During a match, the counterstrategies to always raise and always call will only be used if the opponent is detected to be always raise or always call. Otherwise, a mixture of the remaining five strategies is used. The mixture is generated using a slightly modified Hedge algorithm [4] where the reward vector for the experts/strategies is computed using importance sampling over the individual strategies [2].
  • References and related papers:
    • Martin Zinkevich, Michael Johanson, Michael Bowling, and Carmelo Piccione. "Regret minimization in games with incomplete information" In NIPS 2008.
    • Michael Bowling, Michael Johanson, Neil Burch, and Duane Szafron. "Strategy Evaluation in Extensive Games with Importance Sampling". In Proceedings of the 25th Annual International Conference on Machine Learning (ICML), 2008.
    • Kevin Waugh, Martin Zinkevich, Michael Johanson, Morgan Kan, David Schnizlein, and Michael Bowling. "A Practical Use of Imperfect Recall". Proceedings of the Eighth Symposium on Abstraction, Reformulation and Approximation (SARA), 2009.
    • P Auer, N Cesa-Bianchi, Y Freund, and R.E Schapire. "Gambling in a rigged casino: The adversarial multi-armed bandit problem". Proceedings of the 36th Annual Symposium on Foundations of Computer Science, 1995.

LittleAce

  • Team Name: LittleAce
  • Team Leader:
  • Team Members:
  • Affiliation:
  • Location:
  • Technique:

LittleRock

  • Team Name: LittleRock
  • Team Leader: Rod Byrnes
  • Team Members: Rod Byrnes
  • Affiliation: Independent
  • Location: Lismore, Australia
  • Technique:
    LittleRock uses an external sampling Monte Carlo CFR approach with imperfect recall. Additional RAM was available for training the agent entered into this year's competition, which allowed for a more fine-grained card abstraction, but the algorithm is otherwise largely unchanged. One last-minute addition this year is a no-limit agent.

    The no-limit agent has 4,491,849 information sets, the heads-up limit agent has 11,349,052 information sets and the limit 3-player agent has 47,574,530 information sets. In addition to card abstractions, the 3-player and no-limit agents also use a form of state abstraction to make the game size manageable.
  • References and related papers:
    • Monte Carlo Sampling for Regret Minimization in Extensive Games. Marc Lanctot, Kevin Waugh, Martin Zinkevich, and Michael Bowling. In Advances in Neural Information Processing Systems 22 (NIPS), pp. 1078-1086, 2009.

Neo Poker Bot

  • Team Name: Neo Poker Laboratory
  • Team Leader: Alexander Lee
  • Team Members: Alexander Lee
  • Affiliation: Independent
  • Location: Spain
  • Technique:
    Our range of computer players was developed to play against humans. The AI was trained on real-money hand history logs from top poker rooms. The AI logic employs different combinations of neural networks, regret minimization and gradient search equilibrium approximation, decision trees, and recursive search methods, as well as expert algorithms from top players in different games of poker. Our computer players have been tested against humans and demonstrated great results over 100 million hands. The AI was not optimized to play against computer players.

Patience

  • Team Name: Patience
  • Team Leader: Nick Grozny
  • Team Members: Nick Grozny
  • Affiliation: Independent
  • Location: Moscow, Russia.
  • Technique:
    Patience uses a static strategy built by the fictitious play algorithm.

 

Sartre

  • Team Name: Sartre
  • Team Leader: Jonathan Rubin
  • Team Members: Jonathan Rubin, Ian Watson
  • Affiliation: University of Auckland
  • Location: Auckland, New Zealand
  • Technique:
    Sartre uses a case-based approach to play Texas Hold'em. AAAI hand history data from multiple agents are encoded into distinct case-bases. When it is time for Sartre to make a betting decision, a case with the current game state information is created. Each individual case-base is then searched for similar scenarios, resulting in a collection of playing decisions. A final decision is made via ensemble voting (a minimal retrieval-and-voting sketch follows the references below).
  • References and related papers:
    • Jonathan Rubin and Ian Watson. Case-Based Strategies in Computer Poker, AI Communications, Volume 25, Number 1: 19-48, March 2012.
    • Jonathan Rubin and Ian Watson. (2011). On Combining Decisions from Multiple Expert Imitators for Performance. In IJCAI-11, Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence.
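
    A minimal sketch of the retrieval-and-voting idea; the feature encoding, case data, and distance measure below are illustrative placeholders, not Sartre's actual representation:

        # Nearest-neighbour retrieval from several case-bases, combined by majority vote.
        import numpy as np
        from collections import Counter

        def nearest_decision(case_features, case_decisions, query):
            dists = np.linalg.norm(case_features - query, axis=1)
            return case_decisions[int(np.argmin(dists))]

        rng = np.random.default_rng(2)
        case_bases = []
        for _ in range(3):                                    # one case-base per source agent
            feats = rng.random((50, 4))                       # e.g. hand strength, pot odds, ...
            decisions = rng.choice(["fold", "call", "raise"], size=50)
            case_bases.append((feats, decisions))

        query = np.array([0.8, 0.3, 0.5, 0.1])                # current game-state features
        votes = Counter(nearest_decision(f, d, query) for f, d in case_bases)
        print(votes.most_common(1)[0][0])                     # ensemble decision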

Slumbot

  • Team Name: Slumbot
  • Team Leader: Eric Jackson
  • Team Members: Eric Jackson
  • Affiliation: Independent
  • Location: Menlo Park, CA, USA
  • Technique:
    Slumbot employs the Public Chance Sampling variant of Counterfactual Regret Minimization. We use a large abstraction with 88 billion information sets. There is no abstraction on any street prior to the river. On the river there are about 4.7 million bins.

    As a consequence of the large abstraction size and our relatively modest compute environment, our system is disk-based: regrets and accumulated probabilities are written to disk on each iteration (a minimal disk-backed storage sketch follows the references below).
  • References and related papers:
    • [Johanson 2012] Efficient Nash Equilibrium Approximation through Monte Carlo Counterfactual Regret Minimization
    • [Johanson 2011] Accelerating Best Response Calculation in Large Extensive Games
    • [Zinkevich 2007] Regret Minimization in Games with Incomplete Information
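
    A minimal sketch of keeping regrets on disk rather than in RAM; the file name, table size, and dtype are assumptions for illustration and do not describe the real system's layout:

        # Disk-backed regret table using a memory-mapped array.
        import numpy as np

        NUM_ENTRIES = 1_000_000                 # information-set/action pairs (toy size)
        regrets = np.memmap("regrets.dat", dtype=np.float32,
                            mode="w+", shape=(NUM_ENTRIES,))

        def update_chunk(start, values):
            # Read-modify-write a slice; the OS pages it to/from disk as needed.
            regrets[start:start + len(values)] += values

        for it in range(3):                     # a few toy "iterations"
            update_chunk(0, np.ones(1024, dtype=np.float32))
            regrets.flush()                     # persist this iteration's updates
        print(regrets[:5])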

ZBot

  • Team Name: ZBot
  • Team Leader: Ilkka Rajala
  • Team Members: Ilkka Rajala
  • Affiliation: Independent
  • Location: Helsinki, Finland
  • Technique:
    Counterfactual regret minimization implementation that uses two phases. In the first phase the model is built dynamically by expanding it (observing more buckets) in situations which are visited more often, until the desired size has been reached.
    In the second phase that model is then solved by counterfactual regret minimization.

    The model has 1024 possible board-texture buckets for each street, and 169/1024/512/512 hand-type buckets for preflop/flop/turn/river. How many buckets are actually used in any given situation depends on how common that situation is.
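
    One plausible reading of the first phase, sketched minimally; the visit counts and the split rule are invented for illustration:

        # Grow the bucketed model where play happens most often, until a budget is reached.
        from collections import Counter

        visits = Counter({"preflop:raise": 900, "flop:check-call": 400,
                          "turn:raise-raise": 50})
        buckets_per_situation = {k: 1 for k in visits}
        BUDGET = 8                                            # total buckets allowed

        while sum(buckets_per_situation.values()) < BUDGET:
            # Split the situation with the most visits per existing bucket.
            k = max(visits, key=lambda s: visits[s] / buckets_per_situation[s])
            buckets_per_situation[k] += 1

        print(buckets_per_situation)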

Heads-up No-limit Texas Hold'em

Azure Sky

  • Team Name: Azure Sky Research, Inc
  • Team Leader: Eric Baum
  • Team Members: Eric Baum, Chick Markley, Dennis Horte
  • Affiliation: Azure Sky Research Inc.
  • Location: Berkeley CA US
  • Technique:
    SARSA trained neural nets, k-armed bandits, secret sauce.

dcubot

  • Team Name: dcubot
  • Team Leaders: Neill Sweeney
  • Team Members: Neill Sweeney, David Sinclair
  • Affiliation: School of Computing, Dublin City University
  • Location: Dublin 9, Ireland.
  • Technique:
    The bot uses a structure like a neural net to generate its own actions. A hidden Markov model is used to interpret actions, i.e. to read an opponent's hand. The whole system is then trained by self-play.
    For any decision, the range of betting between a min-bet and all-in is divided into at most twelve sub-ranges. The structure then selects a fold, call, min-bet, all-in, or one of these sub-ranges. If a sub-range is selected, the actual raise amount is drawn from a quadratic distribution between the end-points of the sub-range. The end-points of the sub-ranges are learnt using the same reinforcement learning algorithm as the rest of the structure.
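
    A minimal sketch of drawing a raise amount from a sub-range; the exact quadratic shape dcubot uses is not specified above, so this assumes a density that grows quadratically from the lower to the upper end-point, sampled by inverting the CDF:

        # Sample a raise amount from [lo, hi] with density proportional to (x - lo)^2.
        import random

        def sample_quadratic_raise(lo, hi, rng=random):
            u = rng.random()
            return lo + (hi - lo) * u ** (1.0 / 3.0)   # inverse-CDF sampling

        # Example: a sub-range from 2.5 big blinds up to 6 big blinds.
        print(round(sample_quadratic_raise(2.5, 6.0), 2))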

hugh

  • Team Name: hugh
  • Team Leader: Stan Sulsky
  • Team Members: Stan Sulsky, Ben Sulsky
  • Affiliation: Independent
  • Location: NY, US & Toronto, Ont, CA
  • Technique:
    Ben (poker player and son) attempts to teach Stan (programmer and father) to play poker. Stan attempts to realize Ben's ideas in code.

    More specifically, pure strategies are utilized throughout. Play is based on range-vs-range EV calculations. Preflop ranges are deduced by opponent modelling during play. Subsequent decisions are based on a minimax search of the remaining game tree, coupled with some tactical considerations.
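
    A toy range-vs-range EV calculation of the kind described above; the hand classes, equities, and ranges are illustrative, whereas a real agent would enumerate actual hold'em hands and run-outs:

        # Range-vs-range equity and the EV of a call, with toy numbers.
        import numpy as np

        # equity[i, j] = probability our hand class i beats opponent hand class j
        equity = np.array([[0.85, 0.60, 0.40],
                           [0.55, 0.50, 0.30],
                           [0.35, 0.25, 0.15]])
        our_range = np.array([0.2, 0.5, 0.3])       # distribution over our hand classes
        opp_range = np.array([0.4, 0.4, 0.2])       # deduced opponent range

        rvr_equity = our_range @ equity @ opp_range
        pot, to_call = 10.0, 4.0
        ev_call = rvr_equity * (pot + to_call) - (1 - rvr_equity) * to_call
        print(rvr_equity, ev_call)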

Hyperborean2pNL

  • Team Name: University of Alberta
  • Team Leader: Michael Bowling
  • Team Members: Michael Bowling, Duane Szafron, Rob Holte, Chris Archibald, Michael Johanson, Nolan Bard, Johnny Hawkin, Richard Gibson, Neil Burch, Parisa Mazrooei, Josh Davidson
  • Affiliation: University of Alberta
  • Location: Edmonton, Alberta, Canada
  • Technique:
    Our 2-player no-limit bot was built using a variant of Counterfactual Regret Minimization (CFR) ([3], [4]) applied to a specially designed betting abstraction of the game. Using an algorithm similar to CFR, a different bet size is chosen for each betting sequence in the game ([1], [2]). The card abstraction buckets hands and public cards together using imperfect recall, allowing for 18,630 possible buckets on each of the flop, turn and river.
  • References and related papers:
    • Hawkin, J.; Holte, R.; and Szafron, D. 2011. Automated action abstraction of imperfect information extensive-form games. In AAAI, 681–687.
    • Hawkin, J.; Holte, R.; and Szafron, D. 2012. Using Sliding Windows to Generate Action Abstractions in Extensive-Form Games. To appear, AAAI '12.
    • Michael Johanson, Nolan Bard, Marc Lanctot, Richard Gibson, and Michael Bowling. "Efficient Nash Equilibrium Approximation through Monte Carlo Counterfactual Regret Minimization" In AAMAS 2012
    • Martin Zinkevich, Michael Johanson, Michael Bowling, and Carmelo Piccione. "Regret minimization in games with incomplete information" In NIPS 2008.

LittleRock

  • Team Name: LittleRock
  • Team Leader: Rod Byrnes
  • Team Members: Rod Byrnes
  • Affiliation: Independent
  • Location: Lismore, Australia
  • Technique:
    LittleRock uses an external-sampling Monte Carlo CFR approach with imperfect recall. Additional RAM was available for training the agent entered into this year's competition, which allowed for a more fine-grained card abstraction, but the algorithm is otherwise largely unchanged. One last-minute addition this year is a no-limit agent.

    The no-limit agent has 4,491,849 information sets, the heads-up limit agent has 11,349,052 information sets and the limit 3-player agent has 47,574,530 information sets. In addition to card abstractions, the 3-player and no-limit agents also use a form of state abstraction to make the game size manageable.
  • References and related papers:
    • Monte Carlo Sampling for Regret Minimization in Extensive Games. Marc Lanctot, Kevin Waugh, Martin Zinkevich, and Michael Bowling. In Advances in Neural Information Processing Systems 22 (NIPS), pp. 1078–1086, 2009.

Lucky7_12

  • Team Name: Lucky7_12
  • Team Leader: Bojan Butolen
  • Team Members: Bojan Butolen, Gregor Vohl
  • Affiliation: University of Maribor
  • Location: Maribor, Slovenia
  • Technique:
  • We have developed a multi-agent system that uses 8 strategies during gameplay. By identifying the state of the game, our system chooses a set of strategies that have proved most profitable against a set of training agents. The final decision of the system is made by averaging the decisions of the individual agents.

    The 8 agents included in our system are mostly rule-based agents. The rules for each individual agent were constructed using different knowledge bases (various match logs, expert knowledge, human-observed play...) and different abstraction definitions for cards and actions. After a set of test matches in which each agent dueled against the other agents in the system, we determined that none of the included agents presents an inferior or superior strategy (meaning each agent lost to at least one of the other agents and won at least one match).
  • References and related papers:
    • A submission to the Poker Symposium was made with the title: Combining Various Strategies In A Poker Playing Multi Agent System

Neo Poker Bot

  • Team Name: Neo Poker Laboratory
  • Team Leader: Alexander Lee
  • Team Members: Alexander Lee
  • Affiliation: Independent
  • Location: Spain
  • Technique:
    Our range of computer players was developed to play against humans. The AI was trained on real-money hand history logs from top poker rooms. The AI logic employs different combinations of neural networks, regret minimization and gradient-search equilibrium approximation, decision trees, and recursive search methods, as well as expert algorithms from top players in different games of poker.
    Our computer players have been tested against humans and demonstrated strong results over 100 million hands. The AI was not optimized to play against computer players.

SartreNL

  • Team Name: Sartre
  • Team Leader: Jonathan Rubin
  • Team Members: Jonathan Rubin, Ian Watson
  • Affiliation: University of Auckland
  • Location: Auckland, New Zealand
  • Technique:
    SartreNL uses a case-based approach to play No-Limit Texas Hold'em. Hand history data from the previous year's top agents are encoded into cases. When it is time for SartreNL to make a betting decision, a case with the current game state information is created. The case-base is then searched for similar cases, and the solutions to past similar cases are re-used for the current situation.
  • References and related papers:
    • Jonathan Rubin and Ian Watson. (2011). Successful Performance via Decision Generalisation in No Limit Texas Hold'em. In Case-Based Reasoning. Research and Development, 19th International Conference on Case-Based Reasoning, ICCBR 2011.

Spewie Louie

  • Team Name: Spewie Louie
  • Team Leader: Jon Parker
  • Team Members: Jon Parker
  • Affiliation: Georgetown University
  • Location: Washington DC, USA
  • Technique:
    The bot assumes bets can occur in 0.25x, 0.4286x, 0.6666x, 1x, 1.5x, 4x, and 9x pot increments (a minimal bet-snapping sketch follows the references below). Nodes in the tree contain a hand range for each player, an "effectiveMatrix" that summarizes the tree below that point, and a "strategyMatrix" which is used by the "hero" of that node. Prior to the competition, a collection of 24 million matrices (1/2 strategy and 1/2 effective) was refined while simulating roughly 12.5 million paths through the tree. This set of 24 million matrices was then trimmed down to 770k (strategy-only) matrices for the competition. Any decision not supported by this set of matrices is handled by an "online" tree learner.
    During the learning process the set of effectiveMatrices and strategyMatrices is stored in a ConcurrentHashMap, which gives the learning process good multi-thread behavior.
    Preflop hands are bucketed into 22 groups. Flop and turn hands are bucketed into 8 groups. River hands are bucketed into 7 groups.
  • References and related papers:
    • Michael Johanson's Master's thesis, "Robust Strategies and Counter-Strategies: Building a Champion Level Computer Poker Player", was quite helpful, as were most of his other papers. Some of the older U. Alberta works by Darse Billings were also good reads. The book "The Mathematics of Poker" and its explanation of the AKQ game is very good.
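
    A minimal sketch of snapping an observed bet onto the fixed pot-fraction increments listed above; the bet and pot values are illustrative and nothing here reproduces the bot's matrices:

        # Map an observed bet to the nearest assumed pot-fraction increment.
        POT_FRACTIONS = [0.25, 0.4286, 0.6666, 1.0, 1.5, 4.0, 9.0]

        def nearest_fraction(bet, pot):
            observed = bet / pot
            return min(POT_FRACTIONS, key=lambda f: abs(f - observed))

        print(nearest_fraction(bet=70, pot=100))     # -> 0.6666 (closest increment)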

Tartanian5

  • Team Name: Tartanian5
  • Team Leader: Sam Ganzfried
  • Team Members: Sam Ganzfried, Tuomas Sandholm
  • Affiliation: Carnegie Mellon University
  • Location: Pittsburgh, PA, 15217, United States
  • Technique:
    Tartanian5 plays a game-theoretic approximate Nash equilibrium strategy. First, it applies a potential-aware, perfect-recall, automated abstraction algorithm to group similar game states together and construct a smaller game that is strategically similar to the full game. In order to maintain a tractable number of possible betting sequences, it employs a discretized betting model, where only a small number of bet sizes are allowed at each game state. Approximate equilibrium strategies for both players are then computed using an improved version of Nesterov's excessive gap technique specialized for poker. To obtain the final strategies, we apply a purification procedure which rounds action probabilities to 0 or 1 (a minimal purification sketch follows the references below).
  • References and related papers:
    • Sam Ganzfried, Tuomas Sandholm, and Kevin Waugh. 2012. Strategy purification and thresholding: Effective non-equilibrium approaches for playing large games. In AAMAS.
    • Andrew Gilpin, Tuomas Sandholm, and Troels Sorensen. 2007. Potential-aware automated abstraction of sequential games, and holistic equilibrium analysis of Texas Hold'em poker. In AAAI.
    • Andrew Gilpin, Tuomas Sandholm, and Troels Sorensen. 2008. A heads-up no-limit Texas Hold'em poker player: Discretized betting models and automatically generated equilibrium-finding programs. In AAMAS.
    • Samid Hoda, Andrew Gilpin, Javier Pena, and Tuomas Sandholm. 2010. Smoothing techniques for computing Nash equilibria of sequential games. Mathematics of Operations Research 35(2):494-512.
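
    A minimal sketch of the purification step, with a thresholding variant for comparison; the probabilities are illustrative:

        # Purification: play only the most probable abstract-equilibrium action.
        import numpy as np

        def purify(probs):
            out = np.zeros_like(probs)
            out[np.argmax(probs)] = 1.0
            return out

        def threshold(probs, eps=0.15):
            # Zero out low-probability actions and renormalise.
            out = np.where(probs < eps, 0.0, probs)
            return out / out.sum()

        sigma = np.array([0.07, 0.23, 0.70])          # fold / call / raise (illustrative)
        print(purify(sigma), threshold(sigma))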

UNI-MB_Poker

  • Team Name: UNI-MB_Poker
  • Team Leader: Ale ?ep
  • Team Members: Ale ?ep, Davor Gaberek
  • Affiliation: University of Maribor
  • Location: Maribor, Slovenia
  • Technique:
  • Our poker agent concentrates on getting chips from its opponent to maximize its profit. It uses small raises even when it has good cards to lure its opponent into the game, bluffs in 5% of hands, and folds when the odds are not in its favor. We used two criteria for our agent to decide what to do: first we examine the cards that we get, and secondly we calculate the odds of us winning. After combining the two results we decide what action to take.

     


3-player Limit Texas Hold'em

dcubot

  • Team Name: dcubot
  • Team Leader: Neill Sweeney
  • Team Members: Neill Sweeney, David Sinclair
  • Affiliation: School of Computing, Dublin City University
  • Location: Dublin 9, Ireland.
  • Technique:
    The bot uses 4 separate connectionist structures, one for each betting round. Ten input features describe the state of the betting after each legal decision, and there are over 300 basic features describing the visible cards. Reading opponent hands is dealt with by maximum-likelihood fitting of a hidden Markov model to the play with the cards hidden. A belief vector over the hidden variable is then used as an additional input.

    This year we have increased the size of the structure by doubling the hidden layer.

Hyperborean3p

  • Team Name: Hyperborean3p
  • Team Leader: Michael Bowling
  • Team Members: Michael Bowling, Duane Szafron, Rob Holte, Chris Archibald, Michael Johanson, Nolan Bard, Johnny Hawkin, Richard Gibson, Neil Burch, Parisa Mazrooei, Josh Davidson
  • Affiliation: University of Alberta
  • Location: Edmonton, Alberta, Canada
  • Technique:
    Our 3-player program is built using the External Sampling (ES) [2] variant of Counterfactual Regret Minimization [3]. ES is applied to an abstract game constructed from two different card abstractions of Texas Hold'em, producing a dynamic expert strategy [1]. The first card abstraction is very fine and allows our program to distinguish between many different possible hands on each round, whereas the second card abstraction is much coarser and merges many different hands into the same information set. The first abstraction is applied to the "important" parts of the betting tree, where importance is determined by the pot size and the frequency at which our program reached the betting sequence in last year's competition. The second, coarser abstraction is applied elsewhere.
  • References and related papers:
    • Richard Gibson and Duane Szafron. On strategy stitching in large extensive form multiplayer games. In NIPS 2011.
    • Marc Lanctot, Kevin Waugh, Martin Zinkevich, and Michael Bowling. Monte Carlo sampling for regret minimization in extensive games. In NIPS 2009.
    • Martin Zinkevich, Michael Johanson, Michael Bowling, and Carmelo Piccione. Regret minimization in games with incomplete information. In NIPS 2008.

LittleRock

  • Team Name: LittleRock
  • Team Leader: Rod Byrnes
  • Team Members: Rod Byrnes
  • Affiliation: Independent
  • Location: Lismore, Australia
  • Technique:
    LittleRock uses an external-sampling Monte Carlo CFR approach with imperfect recall. Additional RAM was available for training the agent entered into this year's competition, which allowed for a more fine-grained card abstraction, but the algorithm is otherwise largely unchanged. One last-minute addition this year is a no-limit agent.

    The no-limit agent has 4,491,849 information sets, the heads-up limit agent has 11,349,052 information sets and the limit 3-player agent has 47,574,530 information sets. In addition to card abstractions, the 3-player and no-limit agents also use a form of state abstraction to make the game size manageable.
  • References and related papers:
    • Monte Carlo Sampling for Regret Minimization in Extensive Games. Marc Lanctot, Kevin Waugh, Martin Zinkevich, and Michael Bowling. In Advances in Neural Information Processing Systems 22 (NIPS), pp. 1078–1086, 2009.

Neo Poker Bot

  • Team Name: Neo Poker Laboratory
  • Team Leader: Alexander Lee
  • Team Members: Alexander Lee
  • Affiliation: Independent
  • Location: Spain
  • Technique:
    Our range of computer players was developed to play against humans. The AI was trained on real-money hand history logs from top poker rooms. The AI logic employs different combinations of neural networks, regret minimization and gradient-search equilibrium approximation, decision trees, and recursive search methods, as well as expert algorithms from top players in different games of poker.
    Our computer players have been tested against humans and demonstrated strong results over 100 million hands. The AI was not optimized to play against computer players.

 

Sartre3P

  • Team Name: Sartre
  • Team Leader: Jonathan Rubin
  • Team Members: Jonathan Rubin, Ian Watson
  • Affiliation: University of Auckland
  • Location: Auckland, New Zealand
  • Technique:
    Sartre3P uses a case-based approach to play Texas Hold'em. AAAI hand history data from both three-player and two-player matches are encoded into separate case-bases. When a playing decision is required, a case with the current game state information is created. If no opponents have folded, Sartre3P will search the three-player case-base for similar game scenarios for a solution. On the other hand, if an opponent has folded, Sartre3P will search the two-player case-base and switch to a heads-up strategy if it is possible to map the three-player betting sequence to an appropriate two-player sequence.
  • References and related papers:
    • Jonathan Rubin and Ian Watson. Case-Based Strategies in Computer Poker, AI Communications, Volume 25, Number 1: 19-48, March 2012.

Participants: 2013

The 2013 competition had 14 different agents in the heads-up limit Texas hold'em competition, 14 agents in the heads-up no-limit competition, and 7 agents in the 3-player limit competition. As in previous years, agents were submitted by a mixture of universities and individual hobbyists from 14 different countries around the world.

Competitors in the 2013 Annual Computer Poker Competition were not required to supply detailed information about their submission(s) in order to compete, but some information about team members, affiliation, location, high level technique descriptions, and occasionally relevant papers were supplied. This page presents that information.


Heads-up Limit Texas Hold'em

Feste

  • Team Name: Feste
  • Team Leader: Francois Pays
  • Team Members: Francois Pays
  • Affiliation: Independent
  • Location: Paris, France
  • Technique:
    • Modelization:
      We use sequence form to compute an equilibrium of an abstract, downsized game model. There is no betting abstraction. The card abstraction uses the following bucketing parameters: preflop with 169 buckets (no abstraction), flop/400, turn/50 and river/25. Buckets are computed using k-means over hand and board parameters: current and past expected values, and deviations on future streets. Note that sequence form supposes perfect recall, which is not respected by our card abstraction. The probable consequence is some additional exploitability of the resulting strategies. Players have respectively 263,435 and 263,316 movesets and both have 101,302 infosets. The model has 16,263,415 leaves, which is relatively small for the solver.
    • Solver:
      The sequence form could be converted into a linear complementarity problem (LCP), but this approach does not appear to be scalable since solving such an LCP involves either sparse direct or dense methods. In order to take full advantage of the sparsity of the constraint and payoff matrices, we keep the original problem intact, large but sparse, and use only sparse indirect methods. The full problem is a min-max problem with a bilinear objective and separable constraints [1]. It can be solved efficiently using a classical interior-point solver coupled with an inner iterative linear system solver. We use a standard log-barrier infeasible primal-dual path-following method, applied to the min-max system. The underlying Newton system in augmented form belongs to the class of large sparse saddle-point problems, to which several modern techniques apply. We use a variant of the Projected Preconditioned Conjugate Gradient (PPCG) with an implicit-factorization preconditioner [2]. The condition number of the system matrix is kept under control using regularization and the identification and elimination of zero variables. Intermediate system matrices are never explicitly formed; the only significant memory use comes from the given constraint/payoff sparse matrices. This approach has a theoretical convergence rate of O(log 1/accuracy) and uses a minimal amount of memory (proportional to the number of game leaves). In practice, the min-max problem is solved down to competition-level accuracy in about 5 days on the hardware mentioned below. The solver has been successfully tested on up to 120 million leaves.
    • Adaptation:
      A simple variation of the initial problem leads to more aggressive strategies: in the objective function, we insert an extra fraction "epsilon" of random strategies (i.e. playing every possible action with a constant probability) that both sides may additionally encounter. Lower epsilon values mean a strategy closer to the optimal one (i.e. the Nash equilibrium); higher values mean more aggressive strategies that try to maximize the exploitation of random play while protecting themselves against the corresponding aggressive strategies. Feste adapts dynamically to its opponent using the Thompson Sampling algorithm over normal distributions (a minimal sketch follows the references below). For faster adaptation, the element of luck from hole cards is mitigated. In the Instant Run-off tournament, Feste has at its disposal three strategies: the optimal strategy and two additional epsilon-aggressive strategies, at 1% and 5%. Even if these additional strategies are purposely away from the Nash equilibrium, they are expected to be of some use against real agents, even equilibrium-based ones, without being too exploitable themselves. In the Total Bankroll tournament, Feste is allowed to use an additional 25% epsilon-aggressive strategy. This strategy is a double-edged sword: it may be the best shot against "chumps" but could also be easily exploited by real agents.
    • Hardware:
      The corresponding computations have been carried out on a mid/high-range workstation (12-core dual Xeon, 48 GB of memory and 2 GPU cards). The CPU horsepower is mostly used during probability table generation, bucketing, payoff table computation and match simulations. The GPU cards handle the sparse matrix-vector products (with CUDA) in the PPCG during the solve step, allowing the machine to solve two problems concurrently.
  • References and related papers:
    1. Minimax and Convex-Concave Games. Arpita Ghosh and Stephen Boyd. EE392o, Stanford University.
    2. Dollar, H. S., Gould, N. I., Schilders, W. H., & Wathen, A. J. (2006). Implicit-factorization preconditioning and iterative solvers for regularized saddle-point systems. SIAM Journal on Matrix Analysis and Applications, 28(1), 170-189.
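
    A minimal sketch of Thompson Sampling over normal reward models for choosing which of the available strategies to play next; the reward values are simulated placeholders and the variance is fixed at 1 for simplicity:

        # Thompson Sampling with normal reward models, one per strategy.
        import numpy as np

        rng = np.random.default_rng(3)
        names = ["optimal", "eps-1%", "eps-5%"]
        mean, count = np.zeros(3), np.ones(3)

        for hand in range(2000):
            # Sample a plausible mean reward for each strategy and play the best draw.
            draws = rng.normal(mean, np.sqrt(1.0 / count))
            k = int(np.argmax(draws))
            reward = rng.normal([0.0, 0.02, -0.01][k], 1.0)   # hidden true values (toy)
            count[k] += 1
            mean[k] += (reward - mean[k]) / count[k]          # online mean update

        print(dict(zip(names, np.round(mean, 3))), names[int(np.argmax(count))])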

HITSZ_CS_13

  • Team Name: HITSZ_CS_13
  • Team Leader: Xuan Wang
  • Team Members: Xuan Wang, Jiajia Zhang, Song Wu
  • Affiliation: School of Computer Science and Technology HIT
  • Location: Shenzhen, Guangdong province, China
  • Technique:
    Our program makes decisions according to current hand strength and a set of precomputed probabilities; at the same time it tries to model the opponent. After the opponent model is built, the program will take advantage of the model when making decisions.

Hyperborean2pl.iro

  • Team Name: University of Alberta
  • Team Leader: Michael Bowling
  • Team Members: Michael Bowling, Duane Szafron, Rob Holte, Chris Archibald, Michael Johanson, Nolan Bard, John Hawkin, Richard Gibson, Neil Burch, Josh Davidson, Trevor Davis
  • Affiliation: University of Alberta
  • Location: Edmonton, Alberta, Canada
  • Technique:
    Hyperborean 2pl IRO is an approximation of a Nash equilibrium for a very large abstract game. The strategy was learned using Chance Sampled CFR [1]. The abstract game uses imperfect recall [2] and each round is created in two steps. First, we divide the public cards into simple categories (number of cards of a suit on the board, number of cards in a straight on the board, pair on the board, etc.) with recall of the division on earlier rounds. Second, within each category, we use k-means clustering over the recently presented hand strength distribution / earthmover distance pre-river feature and the OCHS river feature [3] (a minimal k-means bucketing sketch follows the references below). The abstraction has a perfect preflop and flop abstraction, 1,521,978 turn buckets, and 840,000 river buckets.

    The abstraction used is identical to the 2011 and 2012 entries. The 2011 entry was run for 100 billion Chance Sampled CFR iterations, while the 2012 and 2013 entries were run for 130 billion Chance Sampled CFR iterations.
  • References and related papers:
    1. Martin Zinkevich, Michael Johanson, Michael Bowling, and Carmelo Piccione. "Regret minimization in games with incomplete information" In NIPS 2008.
    2. Kevin Waugh, Martin Zinkevich, Michael Johanson, Morgan Kan, David Schnizlein, and Michael Bowling. "A Practical Use of Imperfect Recall". Proceedings of the Eighth Symposium on Abstraction, Reformulation and Approximation (SARA), 2009.
    3. Michael Johanson, Neil Burch, Richard Valenzano, and Michael Bowling. "Evaluating State-Space Abstractions in Extensive-Form Games". In Proceedings of the Twelfth International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2013.
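
    A minimal sketch of bucketing hands by k-means over hand-strength histograms; the histograms are random placeholders and plain Euclidean distance is used, whereas the actual abstraction uses earthmover distance and OCHS features as described above:

        # k-means bucketing of toy hand-strength histograms.
        import numpy as np

        def kmeans(points, k, iters=50, seed=0):
            rng = np.random.default_rng(seed)
            centers = points[rng.choice(len(points), k, replace=False)]
            for _ in range(iters):
                labels = np.argmin(((points[:, None, :] - centers) ** 2).sum(-1), axis=1)
                for j in range(k):
                    if np.any(labels == j):
                        centers[j] = points[labels == j].mean(axis=0)
            return labels

        hands = np.random.default_rng(4).random((500, 10))   # 10-bin strength histograms
        hands /= hands.sum(axis=1, keepdims=True)
        buckets = kmeans(hands, k=8)
        print(np.bincount(buckets))                          # bucket sizes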

Hyperborean2pl.tbr

  • Team Name: Univeristy of Alberta
  • Team Leader: Michael Bowling
  • Team Members: Michael Bowling, Duane Szafron, Rob Holte, Chris Archibald, Michael Johanson, Nolan Bard, John Hawkin, Richard Gibson, Neil Burch, Josh Davidson, Trevor Davis
  • Affiliation: University of Alberta
  • Location: Edmonton, Alberta, Canada
  • Technique:
    Hyperborean is an implicit modelling agent [5] consisting of four data biased response strategies to specific agents seen in the 2010 and 2011 ACPC's heads-up limit events. All four strategies were generated using the Counterfactual Regret Minimization (CFR) algorithm [1] with imperfect recall abstractions. Buckets were calculated according to public card textures and k-means clustering over hand strength distributions [6] and yielded an imperfect recall abstract game by forgetting previous card information and rebucketing on every round [3]. Agents were run for 4 billion iterations of chance sampled CFR. The abstraction uses 169, 18630, 18630, and 18630 buckets on each round of the game, respectively, for a total of 118 million information sets.

    A mixture of these strategies is dynamically generated using a slightly modified Exp4-like algorithm [4] where the reward vector for the experts/strategies is computed using importance sampling over the individual strategies [2].
  • References and related papers:
    1. Martin Zinkevich, Michael Johanson, Michael Bowling, and Carmelo Piccione. "Regret minimization in games with incomplete information" In NIPS 2008.
    2. Michael Bowling, Michael Johanson, Neil Burch, and Duane Szafron. "Strategy Evaluation in Extensive Games with Importance Sampling". In Proceedings of the 25th Annual International Conference on Machine Learning (ICML), 2008.
    3. Kevin Waugh, Martin Zinkevich, Michael Johanson, Morgan Kan, David Schnizlein, and Michael Bowling. "A Practical Use of Imperfect Recall". Proceedings of the Eighth Symposium on Abstraction, Reformulation and Approximation (SARA), 2009.
    4. P Auer, N Cesa-Bianchi, Y Freund, and R.E Schapire. "Gambling in a rigged casino: The adversarial multi-armed bandit problem". Proceedings of the 36th Annual Symposium on Foundations of Computer Science, 1995.
    5. Nolan Bard, Michael Johanson, Neil Burch, Michael Bowling. "Online Implicit Agent Modelling". In Proceedings of the Twelfth International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2013.
    6. Michael Johanson, Neil Burch, Richard Valenzano, and Michael Bowling. "Evaluating State-Space Abstractions in Extensive-Form Games". In Proceedings of the Twelfth International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2013.

LIACC

  • Team Name:LIACC
  • Team Leader: Luis Filipe Teofilo
  • Team Members: Luis Filipe Teofilo
  • Affiliation: University of Porto, Artificial Intelligence and Computer Science Laboratory
  • Location: Porto, Portugal
  • Technique: Expected value maximization with game partition

Little Rock

  • Team Name: Little Rock
  • Team Leader: Rod Byrnes
  • Team Members: Rod Byrnes
  • Affiliation: Independent
  • Location: Goonellabah, NSW, Australia
  • Technique:
    Little Rock uses an external-sampling Monte Carlo CFR approach with imperfect recall. All agents in this year's competition use the same card abstraction, which has 8192 buckets on each of the flop, turn and river, created by clustering all possible hands using a variety of metrics from the current and previous rounds. The 2-player limit agent uses no action abstraction. The other two agents use what I call a "cross-sectional" approach, which abstracts aspects of the current game state rather than translating individual actions (which is what I call a "longitudinal" approach).
  • References and related papers:
    1. Monte Carlo Sampling for Regret Minimization in Extensive Games. Marc Lanctot, Kevin Waugh, Martin Zinkevich, and Michael Bowling. In Advances in Neural Information Processing Systems 22 (NIPS), pp. 1078-1086, 2009.

Marv

  • Team Name: Bacalhau
  • Team Leader: Marv Andersen
  • Team Members: Marv Andersen
  • Affiliation: Independent
  • Location: London, UK
  • Technique:
    This bot is a neural net trained to imitate the play of previous ACPC winners.

Neo Poker Bot

  • Team Name: Neo Poker Laboratory
  • Team Leader: Alexander Lee
  • Team Members: Alexander Lee
  • Affiliation: Independent
  • Location:
  • Technique:
    The bot was built using proprietary universal game-theory methods applied to poker. We complete the Fixed Limit Hold'em game tree search without approximation. The original AI utilizes its own database of about 3 TB, and to comply with the competition format our team provided a special simplified version of Neo, Neopokerbot_FL2V.

ProPokerTools

  • Team Name: ProPokerTools
  • Team Leader: Dan Hutchings
  • Team Members: Dan Hutchings
  • Affiliation: ProPokerTools
  • Location: Lakewood, Colorado.
  • Technique:
    This HULHE agent was created using established methods: regret minimization, partial recall, etc. My goal is to develop a suite of training tools built around strong approximations of unexploitable strategies. I have started with heads-up limit hold'em as a test case, as it is the game with the largest body of research; I fully intend to move on to other games and already have some promising results.
    I have given myself a constraint in building my AI agents; all agents are created on a single machine that costs less than $1,000. I do not have high hopes for victory; I am primarily interested in how far from the best-of-the-best I can get using established methods on commodity hardware. "Pretty close" is more than good enough for my purposes. I have not attempted to optimize my AI agents for competition play.

Slugathorus

  • Team Name: Slugathorus
  • Team Leader: Daniel Berger
  • Team Members: Daniel Berger
  • Affiliation: University of New South Wales
  • Location: Sydney, Australia
  • Technique:
    The agent plays an approximate Nash Equilibrium strategy generated by public chance sampled MCCFR over an abstraction with 2 billion information sets.
  • References and related papers:
    1. "Efficient Nash Equilibrium Approximation through Monte Carlo Counterfactual Regret Minimization". (Johanson, 2012)
    2. "Regret Minimization in Games With Incomplete Information". (Zinkevich, 2007)

UNamur / Joshua

  • Team Name: University of Namur
  • Team Leader: Nicolas Verbeeren
  • Team Members: Nicolas Verbeeren
  • Affiliation: University of Namur
  • Location: Namur, Namur, Belgium
  • Technique:
    Joshua is based on a maximum entropy probabilistic model.

ZBot

  • Team Name: ZBot
  • Team Leader: Ilkka Rajala
  • Team Members: Ilkka Rajala
  • Affiliation: Independent
  • Location: Helsinki, Finland
  • Technique:
    Counterfactual regret minimization implementation that uses two phases. In the first phase the model is built dynamically by expanding it (observing more buckets) in situations which are visited more often, until the desired size has been reached. In the second phase that model is then solved by counterfactual regret minimization.

    Basically the same as ZBot 2012, only much bigger.

Heads-up No-limit Texas Hold'em

Entropy

  • Team Name: Entropy
  • Team Leader:
  • Team Members:
  • Affiliation:
  • Location:
  • Technique:

HITSZ_CS_13

  • Team Name: HITSZ_CS_13
  • Team Leader: Xuan Wang
  • Team Members: Xuan Wang, Jiajia Zhang, Song Wu
  • Affiliation: School of Computer Science and Technology HIT
  • Location: Shenzhen, Guangdong province, China
  • Technique:
    Our program makes decisions according to current hand strength and a set of precomputed probabilities; at the same time it tries to model the opponent. After the opponent model is built, the program will take advantage of the model when making decisions.

hugh

  • Team Name: hugh
  • Team Leader: Stan Sulsky
  • Team Members: Stan Sulsky, Ben Sulsky
  • Affiliation: Independent, University of Toronto
  • Location: New York NY, Toronto
  • Technique:
    We attempt to deduce our opponent's strategy from its actions, and apply expert tactics to exploit that strategy. On later streets this is done by exploring the remaining game tree. On early streets it is based on heuristics.

    This version of hugh is experimental, not expected to do particularly well.

Hyperborean2pn.iro

  • Team Name: University of Alberta
  • Team Leader: Michael Bowling
  • Team Members: Richard Gibson, Joshua Davidson, Michael Johanson, Nolan Bard, Neil Burch, John Hawkin, Trevor Davis, Christopher Archibald, Michael Bowling, Duane Szafron, Rob Holte
  • Affiliation: University of Alberta
  • Location: Edmonton, Alberta, Canada
  • Technique:
    This agent is a meta-player that switches between 2 different strategies. A default strategy is played until we have seen the opponent make a minimum-sized bet on at least 1% of the hands played so far (a min bet as the first bet of the game is not counted). At this time, we switch to an alternative strategy that both makes min bets itself and better understands min bets.

    Both strategies were computed using Counterfactual Regret Minimization (CFR) [Zinkevich et al., NIPS 2007]. Because 2-player no-limit hold'em is too large a game to apply CFR to directly, we employed abstract games that merge card deals into "buckets" to create a game of manageable size [Gilpin & Sandholm, AAMAS 2007]. In addition, we abstract the raise action to a number of bets relative to the pot size. Our default strategy only makes raises equal to 0.5, 0.75, 1, 1.5, 3, 6, 11, 20, or 40 times the pot size, or goes all-in, while our alternative strategy makes min raises and raises equal to 0.5, 0.75, 1, 2, 3, or 11 times the pot, or goes all-in. When the opponent makes an action that our agent cannot, we map the action to one of our raise sizes using probabilistic translation [Schnizlein, Bowling, and Szafron, IJCAI 2009].

    To create our abstract game for the default strategy, we first partitioned the betting sequences into two parts: an "important" part, and an "unimportant" part. Importance was determined according to the frequency with which one of our preliminary 2-player no-limit programs was faced with a decision at that betting sequence in self-play, as well as according to the number of chips in the pot. Then, we employed two different granularities of abstraction, one for each part of this partition. The unimportant part used 169, 3700, 3700, and 3700 buckets per betting round respectively, while the important part used 169, 180,000, 1,530,000, and 1,680,000 buckets per betting round respectively. Buckets were calculated according to public card textures and k-means clustering over hand strength distributions [Johanson et al., AAMAS 2013] and yielded an imperfect recall abstract game by forgetting previous card information and rebucketing on every round [Waugh et al., SARA 2009]. The strategy profile of this abstract game was computed from approximately 498 billion iterations of the "Pure CFR" variant of CFR [Richard Gibson, PhD thesis, in preparation]. This type of strategy is also known as a "dynamic expert strategy" [Gibson & Szafron, NIPS 2011]. The alternative strategy used a simple abstraction with 169, 3700, 3700, and 1175 buckets per round respectively.
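
    A minimal sketch of probabilistically translating an observed bet onto the two nearest abstract bet sizes; the linear weighting below is only a stand-in, since the cited Schnizlein et al. mapping uses a different weighting, and the abstract sizes are the pot fractions listed above:

        # Randomised mapping of an observed bet (as a pot fraction) to an abstract size.
        import bisect, random

        ABSTRACT_SIZES = [0.5, 0.75, 1.0, 1.5, 3.0, 6.0, 11.0, 20.0, 40.0]

        def translate(observed_fraction, rng=random):
            sizes = ABSTRACT_SIZES
            if observed_fraction <= sizes[0]:
                return sizes[0]
            if observed_fraction >= sizes[-1]:
                return sizes[-1]
            i = bisect.bisect_left(sizes, observed_fraction)
            lo, hi = sizes[i - 1], sizes[i]
            p_lo = (hi - observed_fraction) / (hi - lo)   # closer to lo -> higher probability
            return lo if rng.random() < p_lo else hi

        print(translate(2.0))     # maps to 1.5 or 3.0 at random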

Hyperborean2pn.tbr

  • Team Name: University of Alberta
  • Team Leader: Michael Bowling
  • Team Members: Michael Bowling, Duane Szafron, Rob Holte, Chris Archibald, Michael Johanson, Nolan Bard, John Hawkin, Richard Gibson, Neil Burch, Josh Davidson, Trevor Davis
  • Affiliation: University of Alberta
  • Location: Edmonton, Alberta, Canada
  • Technique:
    Hyperborean is an implicit modelling agent [5] consisting of two abstract strategies. All strategies were generated using the Counterfactual Regret Minimization (CFR) algorithm [1] with imperfect recall abstractions [3]. We also abstract the raise action to a number of bets relative to the pot size. Both strategies make raises equal to 0.5, 0.75, 1, 1.5, 3, 6, 11, 20, or 40 times the pot size, or go all-in. The portfolio of strategies for the agent consists of:

    1) A Nash equilibrium approximation
    This strategy is the same as the default strategy in our heads-up no-limit IRO entry. To create our abstract game for the strategy, we first partitioned the betting sequences into two parts: an "important" part, and an "unimportant" part. Importance was determined according to the frequency with which one of our preliminary 2-player nolimit programs was faced with a decision at that betting sequence in self-play, as well as according to the number of chips in the pot. Then, we employed two different granularities of abstraction, one for each part of this partition. The unimportant part used 169, 3700, 3700, and 3700 buckets per betting round respectively, while the important part used 169, 180,000, 1,530,000, and 1,680,000 buckets per betting round respectively. Buckets were calculated according to public card textures and k-means clustering over hand strength distributions [6] and yielded an imperfect recall abstract game by forgetting previous card information and rebucketing on every round [3]. The strategy profile of this abstract game was computed from approximately 498 billion iterations of the "Pure CFR" variant of CFR [Richard Gibson, PhD thesis, in preparation]. This type of strategy is also known as a "dynamic expert strategy" [7].

    2) A data biased response to aggregate data of 2011 and 2012 ACPC competitors
    The exploitive response in the portfolio was created using data biased robust counter strategies [8] to aggregate data from all of the agents in the 2011 and 2012 heads-up no-limit ACPC events. It uses the same betting abstraction as the above Nash equilibrium approximation, but the card abstraction consists of 169, 9000, 9000, and 3700 buckets per betting round uniformly across the game tree.

    A mixture of these agents is dynamically generated using a slightly modified Exp4-like algorithm [4] where the reward vector for the experts/strategies is computed using importance sampling over the individual strategies [2].
  • References and related papers:
    1. Martin Zinkevich, Michael Johanson, Michael Bowling, and Carmelo Piccione. "Regret minimization in games with incomplete information" In NIPS 2008.
    2. Michael Bowling, Michael Johanson, Neil Burch, and Duane Szafron. "Strategy Evaluation in Extensive Games with Importance Sampling". In Proceedings of the 25th Annual International Conference on Machine Learning (ICML), 2008.
    3. Kevin Waugh, Martin Zinkevich, Michael Johanson, Morgan Kan, David Schnizlein, and Michael Bowling. "A Practical Use of Imperfect Recall". Proceedings of the Eighth Symposium on Abstraction, Reformulation and Approximation (SARA), 2009.
    4. P Auer, N Cesa-Bianchi, Y Freund, and R.E Schapire. "Gambling in a rigged casino: The adversarial multi-armed bandit problem". Proceedings of the 36th Annual Symposium on Foundations of Computer Science, 1995.
    5. Nolan Bard, Michael Johanson, Neil Burch, Michael Bowling. "Online Implicit Agent Modelling". In Proceedings of the Twelfth International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2013.
    6. Michael Johanson, Neil Burch, Richard Valenzano, and Michael Bowling. "Evaluating State-Space Abstractions in Extensive-Form Games". In Proceedings of the Twelfth International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2013.
    7. Richard Gibson and Duane Szafron. "On Strategy Stitching in Large Extensive Form Multiplayer Games". In Proceedings of the Twenty-Fifth Conference on Neural Information Processing Systems (NIPS), 2011.
    8. Michael Johanson and Michael Bowling. "Data Biased Robust Counter Strategies". In Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics (AISTATS), 2009.

KEmpfer

  • Team Name: KEmpfer
  • Team Leader: Eneldo Loza Mencia
  • Team Members: Eneldo Loza Mencia, Tomek Gasiorowski, Peter Glockner, Julian Prommer
  • Affiliation: Knowledge Engineering Group, Technische Universitat Darmstadt
  • Location: Darmstadt, Germany
  • Technique:
    The agent implements a list of expert rules and follows these. Additional opponent statistics are collected and these are used in the rules, but these rules are currently disabled. The backup strategy if no expert rule is found is to play according to the expected hand strength.

Koypetitor

  • Team Name:Koypetitor
  • Team Leader: Adrian Koy
  • Team Members: Adrian Koy, Andrej Kuttruf, assistants
  • Affiliation: Independent
  • Location: London, United Kingdom
  • Technique:

LIACC

  • Team Name:LIACC
  • Team Leader: Luis Filipe Teofilo
  • Team Members: Luis Filipe Teofilo
  • Affiliation: University of Porto, Artificial Intelligence and Computer Science Laboratory
  • Location: Porto, Portugal
  • Technique: Expected value maximization with game partition

Little Rock

  • Team Name: Little Rock
  • Team Leader: Rod Byrnes
  • Team Members: Rod Byrnes
  • Affiliation: Independent
  • Location: Goonellabah, NSW, Australia
  • Technique:
    Little Rock uses an external-sampling Monte Carlo CFR approach with imperfect recall. All agents in this year's competition use the same card abstraction, which has 8192 buckets on each of the flop, turn and river, created by clustering all possible hands using a variety of metrics from the current and previous rounds. The 2-player limit agent uses no action abstraction. The other two agents use what I call a "cross-sectional" approach, which abstracts aspects of the current game state rather than translating individual actions (which is what I call a "longitudinal" approach).
  • References and related papers:
    1. Monte Carlo Sampling for Regret Minimization in Extensive Games. Marc Lanctot, Kevin Waugh, Martin Zinkevich, and Michael Bowling. In Advances in Neural Information Processing Systems 22 (NIPS), pp. 1078-1086, 2009.

Neo Poker Bot

  • Team Name: Neo Poker Laboratory
  • Team Leader: Alexander Lee
  • Team Members: Alexander Lee
  • Affiliation: Independent
  • Location:
  • Technique:
    The AI logic employs different combinations of neural networks, regret minimization and gradient-search equilibrium approximation, decision trees, and recursive search methods, as well as expert algorithms from professional poker players. Neo analyzes accumulated statistical data, which allows the AI to adjust its style of play against opponents.

Nyx

  • Team Name: Nyx
  • Team Leader: Matej Moravcik
  • Team Members: Matej Moravcik, Martin Schmid
  • Affiliation: Charles University
  • Location: Prague, Czech Republic.
  • Technique:
    Implementation of counterfactual regret minimization.

Sartre

  • Team Name: Sartre
  • Team Leader: Kevin Norris
  • Team Members: Kevin Norris, Jonathan Rubin, Ian Watson
  • Affiliation: University of Auckland
  • Location: Auckland, New Zealand
  • Technique:
  • References and related papers:

Slumbot

  • Team Name: Slumbot
  • Team Leader: Eric Jackson
  • Team Members: Eric Jackson
  • Affiliation: Independent
  • Location: Menlo Park, CA, USA
  • Technique:
    Slumbot NL uses a variant of counterfactual regret minimization with public chance sampling.
  • References and related papers:
    1. "Slumbot NL: Solving Large Games with Counterfactual Regret Minimization Using Sampling and Distributed Processing" from the upcoming proceedings of the Computer Poker Workshop at AAAI-13.

Tartanian6

  • Team Name: Tartanian6
  • Team Leader: Tuomas Sandholm
  • Team Members: Noam Brown, Sam Ganzfried, Tuomas Sandholm
  • Affiliation: Carnegie Mellon University
  • Location: Pittsburgh, PA, USA
  • Technique:
    Tartanian6 plays an approximate Nash equilibrium strategy that was computed using MCCFR with external sampling on an imperfect recall abstraction. For the river betting round, it computes undominated equilibrium strategies in a finer-grained abstraction in real-time using CPLEX's LP solver.
  • References and related papers:
    1. Sam Ganzfried and Tuomas Sandholm. 2013. Improving Performance in Imperfect-Information Games with Large State and Action Spaces by Solving Endgames. Computer Poker and Imperfect Information Workshop at the National Conference on Artificial Intelligence (AAAI).
    2. Sam Ganzfried and Tuomas Sandholm. 2013. Action Translation in Extensive-Form Games with Large Action Spaces: Axioms, Paradoxes, and the Pseudo-Harmonic Mapping. To appear in Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI).
    3. Sam Ganzfried, Tuomas Sandholm, and Kevin Waugh. 2012. Strategy Purification and Thresholding: Effective Non-Equilibrium Approaches for Playing Large Games. In Proceedings of the International Conference on Autonomous Agents and Multiagent Systems (AAMAS).
    4. Michael Johanson, Neil Burch, Richard Valenzano, and Michael Bowling. 2013. Evaluating State-Space Abstractions in Extensive-Form Games. In Proceedings of the International Conference on Autonomous Agents and Multiagent Systems (AAMAS).
    5. Marc Lanctot, Kevin Waugh, Martin Zinkevich, and Michael Bowling. 2009. Monte Carlo Sampling for Regret Minimization in Extensive Games. In Proceedings of Advances in Neural Information Processing Systems (NIPS).

3-player Limit Texas Hold'em

HITSZ_CS_13

  • Team Name: HITSZ_CS_13
  • Team Leader: Xuan Wang
  • Team Members: Xuan Wang, Jiajia Zhang, Song Wu
  • Affiliation: School of Computer Science and Technology HIT
  • Location: Shenzhen, Guangdong province, China
  • Technique:
    Our program makes decisions according to current hand strength and a set of precomputed probabilities; at the same time it tries to model the opponent. After the opponent model is built, the program will take advantage of the model when making decisions.

Hyperborean3pl.iro

  • Team Name: University of Alberta
  • Team Leader: Michael Bowling
  • Team Members: Richard Gibson, Joshua Davidson, Michael Johanson, Nolan Bard, Neil Burch, John Hawkin, Trevor Davis, Christopher Archibald, Michael Bowling, Duane Szafron, Rob Holte
  • Affiliation: University of Alberta
  • Location: Edmonton, Alberta, Canada
  • Technique:
    Counterfactual Regret Minimization (CFR) [Zinkevich et al., NIPS 2007] was the main technique used to build this agent. Because 3-player hold'em is too large a game to apply CFR to directly, we employed an abstract game that merges card deals into "buckets" to create a game of manageable size [Gilpin & Sandholm, AAMAS 2007].

    To create our abstract game, we first partitioned the betting sequences into two parts: an "important" part, and an "unimportant" part. Importance was determined according to the frequency with which our 3-player programs from the 2011 and 2012 ACPCs were faced with a decision at that betting sequence, as well as according to the number of chips in the pot. Then, we employed two different granularities of abstraction, one for each part of this partition. The unimportant part used 169, 180,000, 18,630, and 875 buckets per betting round respectively, while the important part used 169, 1,348,620, 1,530,000, and 2,800,000 buckets per betting round respectively. Buckets were calculated according to public card textures and k-means clustering over hand strength distributions [Johanson et al., AAMAS 2013] and yielded an imperfect recall abstract game by forgetting previous card information and rebucketing on every round [Waugh et al., SARA 2009]. The agent plays the "current strategy profile" computed from approximately 303.6 billion iterations of the "Pure CFR" variant of CFR [Richard Gibson, PhD thesis, in preparation] applied to this abstract game. This type of strategy is also known as a "dynamic expert strategy" [Gibson & Szafron, NIPS 2011].

Hyperborean3pl.tbr

  • Team Name: University of Alberta
  • Team Leader: Michael Bowling
  • Team Members: Michael Bowling, Duane Szafron, Rob Holte, Chris Archibald, Michael Johanson, Nolan Bard, John Hawkin, Richard Gibson, Neil Burch, Josh Davidson, Trevor Davis
  • Affiliation: University of Alberta
  • Location: Edmonton, Alberta, Canada
  • Technique:
    Hyperborean is a data biased response to aggregate data of ACPC competitors from the 2010 and 2011 3-player limit competitions [4]. The strategy was generated using the Counterfactual Regret Minimization (CFR) algorithm [1] with imperfect recall abstractions. Buckets were calculated according to public card textures and k-means clustering over hand strength distributions [3] and yielded an imperfect recall abstract game by forgetting previous card information and rebucketing on every round [2]. The agent plays the "current strategy profile" generated after 20 billion iterations of external sampled CFR [5]. The abstraction uses 169, 10000, 5450, and 500 buckets on each round of the game, respectively.
  • References and related papers:
    1. Martin Zinkevich, Michael Johanson, Michael Bowling, and Carmelo Piccione. "Regret minimization in games with incomplete information" In NIPS 2008.
    2. Kevin Waugh, Martin Zinkevich, Michael Johanson, Morgan Kan, David Schnizlein, and Michael Bowling. "A Practical Use of Imperfect Recall". Proceedings of the Eighth Symposium on Abstraction, Reformulation and Approximation (SARA), 2009.
    3. Michael Johanson, Neil Burch, Richard Valenzano, and Michael Bowling. "Evaluating State-Space Abstractions in Extensive-Form Games". In Proceedings of the Twelfth International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2013.
    4. Michael Johanson and Michael Bowling. "Data Biased Robust Counter Strategies". In Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics (AISTATS), 2009.
    5. Marc Lanctot, Kevin Waugh, Martin Zinkevich, and Michael Bowling. "Monte Carlo Sampling for Regret Minimization in Extensive Games". In Proceedings of the Twenty-Third Conference on Neural Information Processing Systems (NIPS), 2009.

KEmpfer

  • Team Name: KEmpfer
  • Team Leader: Eneldo Loza Mencia
  • Team Members: Eneldo Loza Mencia, Tomek Gasiorowski, Peter Glockner, Julian Prommer
  • Affiliation: Knowledge Engineering Group, Technische Universitat Darmstadt
  • Location: Darmstadt, Germany
  • Technique:
    The agent implements a list of expert rules and follows these. Additional opponent statistics are collected and these are used in the rules, but these rules are currently disabled. The backup strategy if no expert rule is found is to play according to the expected hand strength.

LIACC

  • Team Name:LIACC
  • Team Leader: Luis Filipe Teofilo
  • Team Members: Luis Filipe Teofilo
  • Affiliation: University of Porto, Artificial Intelligence and Computer Science Laboratory
  • Location: Porto, Portugal
  • Technique: Expected value maximization with game partition

Little Rock

  • Team Name: Little Rock
  • Team Leader: Rod Byrnes
  • Team Members: Rod Byrnes
  • Affiliation: Independent
  • Location: Goonellabah, NSW, Australia
  • Technique:
    Little Rock uses an external-sampling Monte Carlo CFR approach with imperfect recall. All agents in this year's competition use the same card abstraction, which has 8192 buckets on each of the flop, turn and river, created by clustering all possible hands using a variety of metrics from the current and previous rounds. The 2-player limit agent uses no action abstraction. The other two agents use what I call a "cross-sectional" approach, which abstracts aspects of the current game state rather than translating individual actions (which is what I call a "longitudinal" approach).
  • References and related papers:
    1. Monte Carlo Sampling for Regret Minimization in Extensive Games. Marc Lanctot, Kevin Waugh, Martin Zinkevich, and Michael Bowling. In Advances in Neural Information Processing Systems 22 (NIPS), pp. 1078-1086, 2009.

Neo Poker Bot

  • Team Name: Neo Poker Laboratory
  • Team Leader: Alexander Lee
  • Team Members: Alexander Lee
  • Affiliation: Independent
  • Location:
  • Technique:
    The AI logic employs different combinations of Neural networks, Regret Minimization and Gradient Search Equilibrium Approximation, Decision Trees, Recursive Search methods as well as expert algorithms from top players in different games of poker. The AI was not optimized to play against computer players.