Content

Game Overview
1. Game Properties
Two-Player Zero-Sum Game
1. Definition
2. Intuition
3. MiniMax Strategy
4. Real-Time/Online search

Game Overview

Generalization of Search Problems
- Game tree: actions of >= 1 players/agents
- Agents acting to maximize their profits
- Others' profits might not have positive effect on yours
Key Features of Game
- Each player has different goal
- Different paths/states assigned different costs
- Each player tries to alter the world to best benefit itself

Game Properties

Two-player
Discrete: game states or decision can be mapped on discrete values
Finite: a finite # of states/decisions can be made
Zero-sum (Fully Competitive): A gains = B loses
- Strategic/normal form game: one shot e.g. rock-paper-scissors
- Extensive form game: turn-taking, multiple moves
Deterministic: no chance involved
Perfect Information: all aspects of the state are fully observable

Two-Player Zero-Sum Game

Definition

2 players: A(Max), B(Min)
States S
Initial state I
Terminal positions T
Successor functions
- Input: a state
- Return: a set of possible next states
Utility/Payoff function V: T -> R
- Mapping showing how good each terminal state is for A/how bad for B

Intuition

Game ends when terminal t ∊ T reached
Game state: state-player pair
A gets V(t), B gets -V(t)

MiniMax Strategy

Build full game tree
- Root: start state
- Edges: possible moves
- Leaves: terminals (utilities U(t) labeled)
Back values up the tree
- U(n) = min{ U(c): c is a child of n if n is Min node }
- U(n) = max{ U(c): c is a child of n if n is Max node }
O(b^d) space

Depth-First Implementation

To avoid expanding the tree exponentially in size
Need finite depth

DFMiniMax(n, Player): //return Utility of state n given that Player is MIN or MAX

    If n is TERMINAL:
        Return V(n) //Return terminal states utility (V is specified as part of game)

    //Apply Player s moves to get successor states. 
    ChildList = n.Successors(Player)

    If Player == MIN:
        return minimum of DFMiniMax(c, MAX) over c ∈ ChildList
    Else: //Player is MAX
        return maximum of DFMiniMax(c, MIN) over c ∈ ChildList

Pruning

α-cuts(max node) & β-cuts(min node)

AlphaBeta(n,Player,alpha,beta): //return Utility of state 

    If n is TERMINAL:
        return V(n) //Return terminal states utility 

    ChildList = n.Successors(Player)

    If Player == MAX:
        v = -infinity
        for c in ChildList:
            v = max(v, AlphaBeta(c,MIN,alpha,beta)) 
            alpha = max(v, alpha)
            If beta <= alpha:
                break 
        return v

    Else: //Player == MIN 
        v = infinity
        for c in ChildList:
            v = min(v, AlphaBeta(c,MAX,alpha,beta)) 
            beta = min(v, beta)
            If beta <= alpha:
                break 
        return v

// Initial call: AlphaBeta(START_NODE, PLAYER, -infinity, infinity)

O(b^d) space if no pruning, O(b^(d/2)) if optimal pruning
- Branching factor of 1st layer: B; 2nd: 1; 3rd: B; ...
In practice, must make heuristic estimates of the terminal nodes -> evaluation function

Real-Time/Online search

Run A* until out of memory
Use evaluation function to decide which path looks best
Make the move
Restart search at the node reached (can be cached)

Game Tree Search

Content

Game Overview

Game Properties

Two-Player Zero-Sum Game

Definition

Intuition

MiniMax Strategy

Depth-First Implementation

Pruning

Real-Time/Online search

results matching ""

No results matching ""