The games implemented in the framework range from very simple test games (Tic Tac Toe) to complex strategy games (Pandemic), and offer different challenges for AI players. All implemented games can be found in the games package, and each is registered in the games.GameType class. This class allows properties to be specified for each game, so that games with a specific property can be listed automatically for experiments (e.g. users could run AI players on all games with the “cooperative” tag).
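The tag-based listing described above can be sketched as follows. This is a minimal illustration and not the real games.GameType API: the names DemoTag, DemoGameType and withTag, and the tags assigned to each game, are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

// Hypothetical tags; the real framework may define different ones.
enum DemoTag { COOPERATIVE, CARDS, DICE }

// Hypothetical stand-in for games.GameType: each game is registered with its tags.
enum DemoGameType {
    TIC_TAC_TOE(EnumSet.noneOf(DemoTag.class)),
    CANT_STOP(EnumSet.of(DemoTag.DICE)),
    PANDEMIC(EnumSet.of(DemoTag.COOPERATIVE, DemoTag.CARDS));

    private final Set<DemoTag> tags;

    DemoGameType(Set<DemoTag> tags) {
        this.tags = tags;
    }

    // Lists every registered game that carries the given tag, in declaration order.
    static List<DemoGameType> withTag(DemoTag tag) {
        List<DemoGameType> result = new ArrayList<>();
        for (DemoGameType game : values()) {
            if (game.tags.contains(tag)) {
                result.add(game);
            }
        }
        return result;
    }
}
```

With these illustrative registrations, DemoGameType.withTag(DemoTag.COOPERATIVE) returns a list containing only PANDEMIC, which an experiment runner could then iterate over.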
The table below lists the current games in roughly ascending order of complexity. Lines is the number of lines of code, and Actions is the number of distinct classes that extend AbstractAction.
|Game||Lines||Actions||Turn Order||Notable Features|
|Tic Tac Toe||533||1||Standard||2 players, perfect information|
|Connect Four||613||1||Standard||2 players, perfect information|
|Dots And Boxes||475||1||Standard||2+ players, perfect information|
|Blackjack||834||2||Bespoke||Each player is playing a game of solitaire against the dealer, who plays a fixed policy|
|Can’t Stop||622||3||Standard||Perfect information stochastic game. Introduces dice.|
|Diamant||798||3||Simultaneous||The main action is decided simultaneously by all players, which means that the copy() method must hide this information when asking a player for their turn.|
|Uno||1133||2||Standard||Turn Order can skip players and reverse direction. Introduces shuffling of hidden information on copy() - this applies to all later games on this list too.|
|Sushi Go!||1322||3||Simultaneous||Card-drafting during play ensures that players gain increasing information about the cards held by other players. Simultaneous play of cards is more complicated than in Diamant.|
|Love Letter||1332||11||Standard||Many of the actions provide partial information to players. The first game on this list that fully uses PartialObservableDecks|
|Exploding Kittens||1561||11||Standard||Has no direct ‘score’, but last player standing wins. The Forward Model implements an ActionStack to keep track of interrupt actions (‘Nopes’) that can be played to cancel a just-played action. We would not recommend you copy this pattern, but instead use the IExtendedSequence framework that was introduced later (see Dominion)|
|Virus!||1668||13||Standard||Uses some custom Components.|
|Poker||1827||6||Standard||Uses a full deck of standard playing cards. Implementation of bidding and raising for wagering on results.|
|Stratego||1110||3||Standard||2-player only game with the identity of pieces hidden until encountered on a grid map.|
|BattleLore||1623||4||Standard||2-player only wargame. Units act on a hex-map and have varying abilities.|
|Colt Express||3178||16||Bespoke||First game on this list with two distinct Phases with multiple actions in each (Planning phase, followed by Stealing phase). As a result, there is much more logic in the Turn Order implementation. It also uses some custom Components to represent the train compartments.|
|Pandemic||3358||11||Bespoke||Co-operative game that uses the Dynamic Workflow framework to define rules and the flow of play. Considerable logic now held in Turn Order.|
|Catan||3650||16||Bespoke||Game of building and trading resources on a hex map. The action space for trading is complex as the game formally allows for free-form negotiation.|
|Dominion||5780||30||Bespoke||Deck-building card game where a player can take several actions on their turn, which is also split into distinct phases. Some actions give other players the opportunity to play their own actions/interrupt before control passes back. Uses the IExtendedSequence framework extensively so that the logic for each action is cleanly encapsulated in one Class.|
In Tic Tac Toe, players alternate placing their symbol in an NxN grid until one player completes a row, column or diagonal and wins the game; if all cells in the grid are filled without a winner, the game is a draw.
This is the simplest game included in the framework, meant to be used as a quick reference for the minimum requirements to get a game up and running. Its implementation makes use of mostly default concepts and components, but it implements a scoring heuristic and a custom GUI for an easier interaction given the specific game mechanics.
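The Tic Tac Toe end conditions (a full row, column or diagonal wins; a full grid without a winner is a draw) can be sketched as below. This is only an illustration on a plain char grid: the class LineCheck and its methods are hypothetical and do not reflect how the framework's GridBoard component or the game's Forward Model are written.

```java
// Illustrative only: a plain char grid stands in for the framework's GridBoard component.
final class LineCheck {
    // True if 'symbol' fills a complete row, column or diagonal of the N x N grid.
    static boolean hasWon(char[][] grid, char symbol) {
        int n = grid.length;
        boolean diag = true, antiDiag = true;
        for (int i = 0; i < n; i++) {
            boolean row = true, col = true;
            for (int j = 0; j < n; j++) {
                row &= grid[i][j] == symbol;   // row i
                col &= grid[j][i] == symbol;   // column i
            }
            if (row || col) return true;
            diag &= grid[i][i] == symbol;
            antiDiag &= grid[i][n - 1 - i] == symbol;
        }
        return n > 0 && (diag || antiDiag);
    }

    // True if no cell still holds the 'empty' marker and neither symbol has a line.
    static boolean isDraw(char[][] grid, char first, char second, char empty) {
        for (char[] row : grid) {
            for (char cell : row) {
                if (cell == empty) return false;
            }
        }
        return !hasWon(grid, first) && !hasWon(grid, second);
    }
}
```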
In Connect Four, players take turns dropping a token into a 6x7 grid, where it falls under gravity. The first player to form a line of four wins.
Dots and Boxes uses the GridBoard component, like Tic Tac Toe, but it can support more than 2 players and uses the edges of the board instead of the squares. Players take turns connecting two orthogonally adjacent dots on a 5x5 grid. If their lines surround a box on all four sides, they win a point. The game ends when all the edges have been drawn.
Blackjack uses the infrastructure around cards (and the classic French deck). It is really a solitaire game, as all players play against the dealer, who plays a fixed strategy. The dealer is implemented as the first player; this is not ideal, and the dealer should formally be part of the Forward Model.
Can’t Stop introduces dice, and players may now make multiple moves (rolling dice and then deciding how to use them to advance their markers) until they either choose to stop or go bust (roll dice that can’t be used). Given the relative simplicity, this is all controlled from within the Forward Model for the game rather than Turn Order, although using Turn Order would have been a perfectly valid alternative design choice.
Diamant is a push-your-luck game with simultaneous moves and hidden information. Each player decides simultaneously whether to stay in the Cave or exit. All players that exit on the same turn share the available treasure; those that stay in the Cave potentially get greater future rewards, but get nothing on that round if the Cave collapses due to two identical Trap cards (one card is revealed after all players have decided).
TAG always asks each player sequentially for their decision. To simulate simultaneous moves, the decisions already made by other players are removed from the GameState when it is copied and passed to the next player, so that this information is correctly hidden.
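The idea of hiding other players' pending decisions when the state is copied can be sketched as follows. This is a toy model under stated assumptions: the class CaveState, its decisions array and the copyFor method are hypothetical, and the real GameState and copy() method in the framework may look quite different.

```java
// Hypothetical miniature state; not the framework's GameState.
final class CaveState {
    // Per player: null = not yet decided, TRUE = stay in the Cave, FALSE = exit.
    final Boolean[] decisions;

    CaveState(int nPlayers) {
        this.decisions = new Boolean[nPlayers];
    }

    private CaveState(Boolean[] decisions) {
        this.decisions = decisions;
    }

    // Copy made from the perspective of 'observer': only their own decision survives,
    // so decisions already taken by other players this turn stay hidden.
    CaveState copyFor(int observer) {
        Boolean[] visible = new Boolean[decisions.length];
        visible[observer] = decisions[observer];
        return new CaveState(visible);
    }
}
```

The original state keeps every decision, so the game can still resolve the turn once all players have been asked; only the copies handed to players are redacted.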
In Sushi Go!, each player starts with a hand of seven cards. They each play one card simultaneously and pass the remaining cards to the player on their left. At the end of a round players score points for set collection, with different cards having different points for sets. This is repeated over three rounds to determine an overall winner.
The interesting feature in TAG is that at the start of a round a player has no information on the contents of the other players’ hands, but as each card is played they gain more and more information as hands are passed round the table. After N-1 passes (with N players), all players have perfect information on the cards in all hands (assuming infallible memory), but simultaneous play of cards still leads to a level of imperfect information.
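The claim that a player has seen every hand once the hands have been passed N-1 times can be checked with a small sketch. The class DraftKnowledge and the seat-numbering convention (seat indices increase to the left, so a player receives the hand of the seat one lower) are assumptions made for illustration and are not taken from the framework.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative only: hands are identified by the seat at which they started the round.
final class DraftKnowledge {
    // Number of distinct starting hands 'player' has held after 'passes' passes.
    static int handsSeen(int nPlayers, int player, int passes) {
        Set<Integer> seen = new HashSet<>();
        for (int k = 0; k <= passes; k++) {
            // After k passes to the left, 'player' holds the hand that started
            // k seats to their right (wrapping round the table).
            seen.add(Math.floorMod(player - k, nPlayers));
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println(handsSeen(4, 0, 2)); // prints 3
        System.out.println(handsSeen(4, 0, 3)); // prints 4
    }
}
```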
In Love Letter, each card represents a character, which has a value and a unique effect. At the start of the game, each player holds one card. A second card is drawn at the beginning of each turn. Given the two cards in their hand, the player chooses one to play. Card effects can target other players and may exclude them from the rest of the round. After the last card of the deck has been drawn, the player with the highest valued card wins the current round. The first player to win 5 rounds wins the game.
The game features partial observability (opponent cards), asymmetric and changing player roles as well as a point system over multiple rounds.
Uno consists of numbered and action cards. Numbered cards come in one of four colours and can only be played if either the colour or the number matches the top card of the discard pile. Action cards shake up the game by making players draw additional cards, choosing the next colour to be played, or reversing the turn order. A player wins after gaining a number of points over several rounds (computed as the sum of all other players’ card values).
Like Love Letter, Uno is played over multiple rounds, with the winner of each round scoring points. It has the potential to be the longest game in the framework, since players must draw new cards whenever they cannot play a card.
In Virus!, each player has a body, which consists of four organs, one of each colour, plus a wild one. An organ can be: infected, when an opponent plays a virus card on it; vaccinated, when the player plays a medicine card on it; immunised, if two medicine cards are placed on it; or destroyed, if opponents place two consecutive virus cards on it. The winner is the first player who builds a human body, while preventing opponents from infecting it, destroying it, or stealing its organs. The game features partial observability, with the draw pile and opponents’ cards being hidden.
Players try to avoid drawing an exploding kitten card while collecting other useful cards. Each card gives a player access to unique actions which shake up the game-state, e.g. selecting the player taking a turn next and shuffling the deck.
In contrast to previous games, Exploding Kittens features an action stack in which players have the chance to react to cards played by other players using a Nope card. A Nope card cancels the most recent effect, but can itself be cancelled by another Nope card. The turn order and action stack are realised by extending base-classes of the framework.
Since Exploding Kittens was implemented, the core framework has been extended with the IExtendedSequence interface, which should remove the need for future games to implement this Stack arrangement themselves (see Dominion).
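The Nope rule described above (a Nope cancels the most recent effect, but can itself be cancelled by another Nope) can be sketched as below. This is not the ActionStack used by the Exploding Kittens Forward Model; the class NopeStack, its methods and the use of plain strings for cards are assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative only: the bottom of the stack is the original action,
// and every card played on top of it is assumed to be a Nope.
final class NopeStack {
    private final Deque<String> stack = new ArrayDeque<>();

    void play(String card) {
        stack.push(card);
    }

    // Resolves the stack from the top down. Each Nope flips whether the card
    // beneath it is cancelled, so the original action takes effect only if an
    // even number of Nopes was played on it.
    boolean resolve() {
        boolean cancelled = false;
        while (stack.size() > 1) {
            stack.pop();            // remove one Nope
            cancelled = !cancelled; // it cancels whatever was in force below it
        }
        stack.clear();
        return !cancelled;
    }
}
```

For example, playing an action followed by a single Nope makes resolve() return false, while a second Nope on top restores the action and resolve() returns true.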
This implements the classic rules of Texas Hold’em poker. A bid can be raised by any of 10%, 20%, 30% or 40% in the base implementation.
Stratego is a 2-player only game on a grid map. Each player has a number of different units and seeks to capture the opponent’s Flag. At the start of the game the location of all the opponent’s pieces is known, but not their identity. The identity is only revealed on an attempt to capture a piece. The full rules allow an initial placement phase for all the pieces; this implementation instead uses three fixed setups, one of which is picked at random.
In Colt Express, the board is represented by a train of multiple compartments with two levels each. Each player controls a bandit with a unique special ability. Their goal is to collect money while traversing the compartments and avoiding the sheriff.
The game consists of several rounds, each with a planning and an execution phase. During the planning phase, players take turns in playing action cards from a random set and thereby create a stack of actions. In the subsequent execution phase, the stack is processed by asking each player how to use the current action. Those actions can move the player’s character along the compartments, interact with neighbouring players, or collect money. The game ends after all rounds are played, the player with the most money being the winner.
This processing scheme forces players to adapt their pre-planned strategy according to all the moves that have been played already, in an interesting case of partial observability/non-determinism: the opponents’ type of action may be known (sometimes completely hidden in a round), but not the way in which it will be executed. Additionally, the unique abilities create asymmetric game-play where the overall strategy should be adapted to the bandit’s ability.
BattleLore is strictly a two-player game. Each player controls a number of military units on a hex-map. On their turn they play a command card from their hand which allows them to move a restricted subset of their units, based on the type of unit and/or their location on the map. Combat is resolved with dice, and each unit has specific advantages and vulnerabilities vis-à-vis others. This is the first game to introduce dice in TAG, with stochasticity other than the order of shuffled decks of cards.
The full board game has many different factions and unit types. Currently TAG only implements the learning scenario from the second edition base game with 4 Viper Legions and 5 Blood Harvesters versus 4 Yeoman Archers and 5 Citadel Guards.
Pandemic is a complex cooperative board game for 2 to 4 players. The board represents a map of the world, where the major cities are connected by a graph structure. The game starts with four diseases breaking out in the world and the objective of the players is to cure all diseases, which keep spreading after each player’s turn. At the beginning of the game, each player is assigned a unique role with special abilities and cards that can be used for travelling between cities, building research stations or curing diseases. If the players do not treat the diseases, they quickly spread to more cities, leading to outbreaks. All players lose if they run out of cards in the draw deck, if too many outbreaks occur or if a disease spreads too much. Special event cards are available to the players, which can be played at any time (even outside the player’s turn). In each turn, the player can play up to 4 consecutive actions, with a changing action space (e.g. moving to a new city may result in new actions that the player can do there).
To handle event cards and actions that require consent or reactions from other players, a reaction system has been implemented, so that a player can be asked to return an action when needed. Pandemic also features partial observability, as the draw deck and the infection deck are not visible. This game employs the tree-structured rule system for handling its main game loop.
Catan is the classic Euro-game of building and trading on a hexagonal island. Its trading rules formally allow completely free-form negotiation. To implement this in TAG, any trade is restricted to a number of offer/counter-offer phases (controlled by a game parameter). The initial offers are always very aggressive, and select the player and resource pair to be traded; this then fixes the player and resources for the following actions until the trade is agreed or declined.
Dominion is a deck-building game in which players start with basic cards, and then buy cards with better values and abilities to add to their deck. It has the common structure of Euro-games of first needing to build an ‘engine’, and then using that engine to get victory points. A short-sighted strategy of buying victory points immediately will do very badly, against human players at least, and good agents need to learn the virtues of delayed gratification.
The game makes extensive use of IExtendedSequence to deal with two common issues in games:
1) Move Groups
This is where we have to make what is formally a single ‘action’, but one which it makes sense to break up into a set of distinct decisions to avoid combinatorial explosions. An example in Dominion is where we have to discard 3 cards out of 6. This gives a total of 20 possible actions (6C3), but it might be more tractable to consider this as three distinct decisions: the first card to discard (out of 6), then the second (out of 5), and then the third (out of 4). This formally gives 120 options over the three decisions, but can be very helpful where there is clearly one best card to discard, as MCTS will rapidly focus on this as the first action. An example in Catan might be first deciding what to build (Development Card, Settlement, or Road, if you have cards for any of them), and then deciding where to build it.
2) Extended Actions
In Dominion we can have chains of decisions that cascade from each other, with other players interrupting in the sequence. For example, if I play a ‘Militia’, each other player decides first whether to defend themselves with a Reaction card (which may in turn enable a whole set of further decisions), or else discards down to a hand of three cards (their decision as to which to discard). This circulates round the table before I continue with my turn.
This sort of thing is perfectly trackable directly within ForwardModel and TurnOrder, but it makes better design sense to encapsulate all this logic and tracking in one place (the AbstractAction) to prevent ForwardModel (or TurnOrder) becoming very bloated with lots of if..then..else style logic.
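The Move Groups arithmetic (discarding 3 cards out of 6) can be checked with a short sketch. The class MoveGroupCount and its method names are hypothetical and exist only for this illustration; they are not part of the framework.

```java
// Illustrative arithmetic only; not framework code.
final class MoveGroupCount {
    // Number of unordered ways to choose k items out of n (nCk).
    // Each intermediate value is itself a binomial coefficient, so the division is exact.
    static long combinations(int n, int k) {
        long result = 1;
        for (int i = 1; i <= k; i++) {
            result = result * (n - k + i) / i;
        }
        return result;
    }

    // Number of ordered sequences of k picks from n items: n * (n-1) * ... * (n-k+1).
    static long orderedSequences(int n, int k) {
        long result = 1;
        for (int i = 0; i < k; i++) {
            result *= (n - i);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(combinations(6, 3));     // prints 20
        System.out.println(orderedSequences(6, 3)); // prints 120
    }
}
```

Although the sequential form has more leaves in total, no single decision offers more than 6 options, compared with 20 options at one node in the single-action form.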
See the documentation for more details.