21CSC206T_UNIT3.pptx.pdf ARITIFICIAL INTELLIGENCE

• Adversarial search: Search based on Game theory- Agents-
Competitive environment
• According to game theory, a game is played between two players.
To complete the game, one has to win the game and the other
looses automatically.’
• Such Conflicting goal- adversarial search
• Game playing technique- Those games- Human Intelligence and
Logic factor- Excluding other factors like Luck factor
• Tic-Tac-Toe, Checkers, Chess – Only mind works, no luck works
Adversarial search Methods-Game
playing-Important concepts
19/12/23 2

• Techniques required to get the best
optimal solution (Choose Algorithms
for best optimal solution within limited
time)
• Pruning: A technique which allows
ignoring the unwanted portions of
a search tree which make no
difference in its final result.
• Heuristic Evaluation Function: It
allows to approximate the cost
value at each level of the search
tree, before reaching the goal
node.
Adversarial search Methods-Game
19/12/23 3

• To play a game, we use a game tree to know all the
possible choices and to pick the best one out. There are
following elements of a game-playing:
• S0
: It is the initial state from where a game begins.
• PLAYER (s): It defines which player is having the current
turn to make a move in the state.
• ACTIONS (s): It defines the set of legal moves to be
used in a state.
• RESULT (s, a): It is a transition model which defines the
result of a move.
• TERMINAL-TEST (s): It defines that the game has ended
and returns true.
• UTILITY (s,p): It defines the final value with which the
game has ended. This function is also known
as Objective function or Payoff function. The price
which the winner will get i.e.
• (-1): If the PLAYER loses.
• (+1): If the PLAYER wins.
• (0): If there is a draw between the PLAYERS.
• For example, in chess, tic-tac-toe, we have two or three
possible outcomes. Either to win, to lose, or to draw the
match with values +1,-1 or 0.
• Game Tree for Tic-Tac-Toe
• Node: Game states, Edges: Moves taken by players
Game playing and knowledge structure-
Elements of Game Playing search
19/12/23 4

• INITIAL STATE (S0
): The top node in
the game-tree represents the initial
state in the tree and shows all the
possible choice to pick out one.
• PLAYER (s): There are two
players, MAX and MIN. MAX begins
the game by picking one best move
and place X in the empty square
box.
• ACTIONS (s): Both the players can
make moves in the empty boxes
chance by chance.
• RESULT (s, a): The moves made
by MIN and MAX will decide the
outcome of the game.
• TERMINAL-TEST(s): When all the
empty boxes will be filled, it will be
the terminating state of the game.
• UTILITY: At the end, we will get to
know who wins: MAX or MIN, and
accordingly, the price will be given to
them.
Game playing and knowledge structure-
Elements of Game Playing search
19/12/23 5

• Types of algorithms in Adversarial search
• In a normal search, we follow a sequence of actions to reach the goal or to
finish the game optimally. But in an adversarial search, the result depends
on the players which will decide the result of the game. It is also obvious
that the solution for the goal state will be an optimal solution because the
player will try to win the game with the shortest path and under limited
time.
• Minmax Algorithm
• Alpha-beta Pruning
Game as a search problem
19/12/23 6

• Game vs. search problem
• "Unpredictable" opponent 🡪 specifying a
move for every possible opponent reply
• Time limits 🡪 unlikely to find goal, must
approximate
Game Playing vs. Search
19/12/23 7

• Formal definition of a game:
• Initial state
• Successor function: returns list of (move, state) pairs
• Terminal test: determines when game over
Terminal states: states where game ends
• Utility function (objective function or payoff
function): gives numeric value for terminal states
We will consider games with 2 players (Max and Min);
Max moves first.
Game Playing
19/12/23 8

Tree from
Max’s
perspective
Game Tree Example:
Tic-Tac-Toe
19/12/23 9

• Minimax algorithm
• Perfect play for deterministic, 2-player game
• Max tries to maximize its score
• Min tries to minimize Max’s score (Min)
• Goal: move to position of highest minimax value
🡪 Identify best achievable payoff against best play
Minimax Algorithm
19/12/23 10

Payoff for Max
Minimax Algorithm
19/12/23 11

• Goal of game tree search: to determine one move for
Max player that maximizes the guaranteed payoff for a
given game tree for MAX
Regardless of the moves the MIN will take
• The value of each node (Max and MIN) is determined by
(back up from) the values of its children
• MAX plays the worst case scenario:
Always assume MIN to take moves to maximize his
pay-off (i.e., to minimize the pay-off of MAX)
• For a MAX node, the backed up value is the maximum
of the values associated with its children
• For a MIN node, the backed up value is the minimum of
the values associated with its children
Minimax Rule
19/12/23 12

1. Create start node as a MAX node with current board configuration
2. Expand nodes down to some depth (i.e., ply) of lookahead in the
game.
3. Apply the evaluation function at each of the leaf nodes
4. Obtain the “back up" values for each of the non-leaf nodes from
its children by Minimax rule until a value is computed for the root
node.
5. Pick the operator associated with the child node whose backed
up value determined the value at the root as the move for MAX
Minimax procedure
19/12/23 13

2 7 1 8
MAX
MIN
2 7 1 8
2 1
2 7 1 8
2 1
2
2 7 1 8
2 1
2
This is the move
selected by minimax
Static evaluator
value
Minimax Search
19/12/23 14

3 9 0 7 2 6
Payoff for Max
Minimax Algorithm (cont’d)
19/12/23 15

3 9 0 7 2 6
3 0 2
Payoff for Max
19/12/23 16

3 9 0 7 2 6
3 0 2
3
Payoff for Max
19/12/23 17

• Properties of minimax algorithm:
• Complete? Yes (if tree is finite)
Optimal? Yes (against an optimal opponent)
Time complexity? O(bm
)
• Space complexity? O(bm) (depth-first exploration, if it generates all
successors at once)
m – maximum depth of tree; b branching factor
m – maximum depth of the tree; b – legal moves;
19/12/23 18

• Limitations
• Not always feasible to traverse entire tree
• Time limitations
• Key Improvement
• Use evaluation function instead of utility
• Evaluation function provides estimate of utility at given position
Minimax Algorithm
19/12/23 19

Unit 2 List of Topics
• Searching techniques – Uninformed search –
General search Algorithm
• Uninformed search Methods – Breadth First
Search
• Uninformed search Methods – Depth First
Search
• Uninformed search Methods – Depth limited
Search
• Uniformed search Methods- Iterative
Deepening search
• Bi-directional search
• Informed search- Generate and test, Best First
search
• Informed search-A* Algorithm
• AO* search
• Local search Algorithms-Hill Climbing, Simulated
Annealing
• Local Beam Search
• Genetic Algorithms
• Adversarial search Methods-Game
• Game playing and knowledge structure.
• Game as a search problem-Minimax Approach
• Minimax Algorithm
• Alpha beta pruning
• Game theory problems
19/12/23 20

21
Alpha Beta Pruning
•Alpha-beta pruning is a modified version of the minimax
algorithm. It is an optimization technique for the minimax
algorithm.
•As we have seen in the minimax search algorithm that the
number of game states it has to examine are exponential in
depth of the tree.
•Since we cannot eliminate the exponent, but we can cut it to
half.
•Hence there is a technique by which without checking each
node of the game tree we can compute the correct minimax
decision, and this technique is called pruning.
•This involves two threshold parameter Alpha and beta for future
expansion, so it is called alpha-beta pruning. It is also called
as Alpha-Beta Algorithm.
•Alpha-beta pruning can be applied at any depth of a tree, and
19/12/23

• The two-parameter can be defined as:
• Alpha: The best (highest-value) choice we have found so far at any
point along the path of Maximizer. The initial value of alpha is -∞.
• Beta: The best (lowest-value) choice we have found so far at any point
along the path of Minimizer. The initial value of beta is +∞.
• The Alpha-beta pruning to a standard minimax algorithm
returns the same move as the standard algorithm does, but it
removes all the nodes which are not really affecting the final
decision but making algorithm slow. Hence by pruning these
nodes, it makes the algorithm fast.
Alpha Beta Pruning
19/12/23 22

Condition for Alpha-beta pruning:
• The main condition which required for alpha-beta pruning is:
α>=β
Key points about alpha-beta pruning:
• The Max player will only update the value of alpha.
• The Min player will only update the value of beta.
• While backtracking the tree, the node values will be passed to upper
nodes instead of values of alpha and beta.
• We will only pass the alpha, beta values to the child nodes.
Alpha Beta Pruning
19/12/23 23

function minimax(node, depth, alpha, beta, maximizingPlayer) is
if depth ==0 or node is a terminal node then
return static evaluation of node
if MaximizingPlayer then // for Maximizer Player
maxEva= -infinity
for each child of node do
s= minimax(child, depth-1, alpha, beta, False)
maxEva= max(maxEva, s)
alpha= max(alpha, maxEva)
if beta<=alpha
break
return maxEva
else // for Minimizer player
minEva= +infinity
Alpha Beta Pruning
19/12/23 24
for each child of node do
s= minimax(child, depth-1, alpha, beta, tru
e)
minEva= min(minEva, s)
beta= min(beta, minEva)
if beta<=alpha
break
return minEva

Working of Alpha-Beta Pruning:
• Let's take an example of two-player search tree to understand the working of
Alpha-beta pruning
• Step 1: At the first step the, Max player will start first move from node A where
α= -∞ and β= +∞, these value of alpha and beta passed down to node B where
again α= -∞ and β= +∞, and Node B passes the same value to its child D.
Alpha Beta Pruning
19/12/23 25

Step 2: At Node D, the value of α will be calculated as its turn for Max. The value of α is
compared with firstly 2 and then 3, and the max (2, 3) = 3 will be the value of α at
node D and node value will also 3.
Step 3: Now algorithm backtrack to node B, where the value of β will change as this is a
turn of Min, Now β= +∞, will compare with the available subsequent nodes value, i.e.
min (∞, 3) = 3, hence at node B now α= -∞, and β= 3.
In the next step, algorithm traverse the next successor of Node B which is node E, and
the values of α= -∞, and β= 3 will also be passed.
Alpha Beta Pruning
19/12/23 26

Step 4: At node E, Max will take its turn, and the value of alpha will change.
The current value of alpha will be compared with 5, so max (-∞, 5) = 5,
hence at node E α= 5 and β= 3, where α>=β, so the right successor of E
will be pruned, and algorithm will not traverse it, and the value at node E
will be 5.
Alpha Beta Pruning
19/12/23 27

Step 5: At next step, algorithm again backtrack the tree, from node B to node A. At node A, the
value of alpha will be changed the maximum available value is 3 as max (-∞, 3)= 3, and β=
+∞, these two values now passes to right successor of A which is Node C.
At node C, α=3 and β= +∞, and the same values will be passed on to node F.
Step 6: At node F, again the value of α will be compared with left child which is 0, and max(3,0)=
3, and then compared with right child which is 1, and max(3,1)= 3 still α remains 3, but the
node value of F will become 1.
Alpha Beta Pruning
19/12/23 28

• Step 7: Node F returns the node value 1 to node C, at C α= 3 and β= +∞,
here the value of beta will be changed, it will compare with 1 so min (∞,
1) = 1. Now at C, α=3 and β= 1, and again it satisfies the condition α>=β,
so the next child of C which is G will be pruned, and the algorithm will not
compute the entire sub-tree G.
Alpha Beta Pruning
19/12/23 29

Step 8: C now returns the value of 1 to A here the best value for A is max (3,
1) = 3. Following is the final game tree which is the showing the nodes
which are computed and nodes which has never computed. Hence the
optimal value for the maximizer is 3 for this example.
Alpha Beta Pruning
19/12/23 30

Move Ordering in Alpha-Beta pruning:
• The effectiveness of alpha-beta pruning is highly dependent on the order in which each
node is examined. Move order is an important aspect of alpha-beta pruning.
• It can be of two types:
• Worst ordering: In some cases, alpha-beta pruning algorithm does not prune any of the
leaves of the tree, and works exactly as minimax algorithm. In this case, it also consumes
more time because of alpha-beta factors, such a move of pruning is called worst
ordering. In this case, the best move occurs on the right side of the tree. The time
complexity for such an order is O(bm
).
• Ideal ordering: The ideal ordering for alpha-beta pruning occurs when lots of pruning
happens in the tree, and best moves occur at the left side of the tree. We apply DFS
hence it first search left of the tree and go deep twice as minimax algorithm in the same
amount of time. Complexity in ideal ordering is O(bm/2
).
Alpha Beta Pruning
19/12/23 31

Unit 2 List of Topics
• Searching techniques – Uninformed search –
General search Algorithm
• Uninformed search Methods – Breadth First
Search
• Uninformed search Methods – Depth First
Search
• Uninformed search Methods – Depth limited
Search
• Uniformed search Methods- Iterative
Deepening search
• Bi-directional search
• Informed search- Generate and test, Best First
search
• Informed search-A* Algorithm
• AO* search
• Local search Algorithms-Hill Climbing, Simulated
Annealing
• Local Beam Search
• Genetic Algorithms
• Adversarial search Methods-Game
• Game playing and knowledge structure.
• Game as a search problem-Minimax Approach
• Minimax Algorithm
• Alpha beta pruning
• Game theory problems
19/12/23 32

• It deals with Bargaining.
• The whole process can be expressed
Mathematically
• Based on Behavior Theory, has a more
casual approach towards study of Human
Behavior.
• It also considers how people Interact in
Groups.
What is Game Theory?
19/12/23 33

•Theory of rational behavior for interactive decision problems.
• In a game, several agents strive to maximize their (expected)
utility index by choosing particular courses of action, and each
agent's final utility payoffs depend on the profile of courses of
action chosen by all agents.
•The interactive situation, specified by the set of participants,
the possible courses of action of each agent, and the set of all
possible utility payoffs, is called a game;
• the agents 'playing' a game are called the players.
Game Theory Definition
19/12/23 34

Definition: Zero-Sum Game – A game in
which the payoffs for the players always adds
up to zero is called a zero-sum game.
Definition: Maximin strategy – If we
determine the least possible payoff for each
strategy, and choose the strategy for which this
minimum payoff is largest, we have the
maximin strategy.
Definitions
19/12/23 35

Definition: Constant-sum and nonconstant-sum game –
If the payoffs to all players add up to the same constant,
regardless which strategies they choose, then we have a
constant-sum game. The constant may be zero or any
other number, so zero-sum games are a class of
constant-sum games. If the payoff does not add up to a
constant, but varies depending on which strategies are
chosen, then we have a non-constant sum game.
A Further Definition
19/12/23 36

(1) Each decision maker has available to him
two or more well-specified choices or
sequences of choices.
(2) Every possible combination of plays
available to the players leads to a
well-defined end-state (win, loss, or draw)
that terminates the game.
(3) A specified payoff for each player is
associated with each end-state.
Game theory: assumptions
19/12/23 37

(4) Each decision maker has perfect
knowledge of the game and of his
opposition.
(5) All decision makers are rational; that
is, each player, given two alternatives, will
select the one that yields him the greater
payoff.
Game theory: assumptions (Cont)
19/12/23 38

⚫ A game is a contest involving two or more
decision makers, each of whom wants to win
⚫ Game theory is the study of how optimal strategies are
formulated in conflict
⚫ A player's payoff is the amount that the player
wins or loses in a particular situation in a game.
⚫ A players has a dominant strategy if that player's
best strategy does not depend on what other
players do.
⚫ A two-person game involves two parties (X and
Y)
⚫ A zero-sum game means that the sum of losses for one
player must equal the sum of gains for the other. Thus,
the overall sum is zero
Rules, Strategies, Payoffs, and
Equilibrium
19/12/23 39

• Two competitors are planning radio and newspaper
advertisements to increase their business. This is the
payoff matrix for store X. A negative number means
store Y has a positive payoff
Payoff Matrix - Store X
19/12/23
40

⚫ Look to the “cake cutting problem” to explain
⚫ Cutter – maximize the minimum the Chooser will
leave him
⚫ Chooser – minimize the maximum the Cutter will
get
Chooser 🡪
Cutter
Choose bigger
piece
Choose smaller
piece
Cut cake as evenly
as possible
Half the cake minus
a crumb
Half the cake plus a
crumb
Make one piece
bigger than the other
Small piece Big piece
Minimax Criterion
19/12/23 42

⚫ If the upper and lower values are the same, the number is called the
value of the game and an equilibrium or saddle point condition exists
⚫ The value of a game is the average or expected game outcome if the game is
played an infinite number of times
⚫ A saddle point indicates that each player has a pure strategy i.e., the strategy
is followed no matter what the opponent does
Minimax Criterion
19/12/23 43

• Von Neumann likened the solution point to the
point in the middle of a saddle shaped mountain
pass
• It is, at the same time, the maximum elevation reached
by a traveler going through the pass to get to the other
side and the minimum elevation encountered by a
mountain goat traveling the crest of the range
Saddle Point
19/12/23
44

Player Y’s
Strategies
Minimum Row
Number
Y1 Y2
Player X’s strategies X1 10 6 6
X2 -12 2 -12
Maximum Column
Number
10 6
Pure Strategy - Minimax Criterion
19/12/23 45

⚫ When there is no saddle point, players will play each strategy for a
certain percentage of the time
⚫ The most common way to solve a mixed strategy is to use the
expected gain or loss approach
⚫ A player plays each strategy a particular percentage of the time so that
the expected value of the game does not depend upon what the
opponent does
Y1
P
Y2
1-P
Expected Gain
X1
Q
4 2 4P+2(1-P)
X2
1-Q
1 10 1P+10(1-p)
4Q+1(1-Q) 2Q+10(1-q)
Mixed Strategy Game
19/12/23 46

4P+2(1-P) = 1P+10(1-P)
or: P = 8/11 and 1-p = 3/11
Expected payoff:
1P+10(1-P)
=1(8/11)+10(3/11)
EPX
= 3.46
4Q+1(1-Q)=2Q+10(1-q)
or: Q=9/11 and 1-Q = 2/11
Expected payoff:
EPY
=3.46
Mixed Strategy Game
: Solving for P & Q
19/12/23 47

• Using the solution procedure for a mixed strategy game, solve the
following game
Mixed Strategy Game : Example
19/12/23
48

Example
• This game can be solved by setting up the mixed strategy
table and developing the appropriate equations:
Mixed Strategy Game
19/12/23
49

Mixed Strategy Game: Example
19/12/23
50

Two-person zero-sum and constant-sum games are played according to the following
basic assumption:
Each player chooses a strategy that enables him/her to do the best he/she can, given
that his/her opponent knows the strategy he/she is following.
A two-person zero-sum game has a saddle point if and only if
Max (row minimum) = min (column maximum)
all all
rows columns
(1)
Two-Person Zero-Sum and
Constant-Sum Games
19/12/23 51

If a two-person zero-sum or constant-sum game has a saddle point, the row player should
choose any strategy (row) attaining the maximum on the right side of (1). The column player
should choose any strategy (column) attaining the minimum on the right side of (1).
In general, we may use the following method to find the optimal strategies and value of
two-person zero-sum or constant-sum game:
Step 1 Check for a saddle point. If the game has none, go on to step 2.
Constant-Sum Games (Cont)
19/12/23 52

Step 2 Eliminate any of the row player’s dominated strategies. Looking at the reduced
matrix (dominated rows crossed out), eliminate any of the column player’s dominated
strategies and then those of the row player. Continue until no more dominated strategies
can be found. Then proceed to step 3.
Step 3 If the game matrix is now 2 x 2, solve the game graphically. Otherwise, solve by
using a linear programming method.
Constant-Sum Games (Cont)
19/12/23 53

• Game theory assumes that the decision maker
and the opponent are rational, and that they
subscribe to the maximin criterion as the decision
rule for selecting their strategy
• This is often reasonable if when the other player is
an opponent out to maximize his/her own gains,
e.g. competitor for the same customers.
• Consider:
Player 1 with three strategies S1, S2, and S3 and
Player 2 with four strategies OP1, OP2, OP3, and
OP4.
Zero Sum Games
19/12/23 54

• The value 4 achieved by both players is
called the value of the game
• The intersection of S2 and OP2 is called
a saddle point. A game with a saddle
point is also called a game with an
equilibrium solution.
• At the saddle point, neither player can
improve their payoff by switching
strategies
Zero Sum Games (Cont)
19/12/23 55

Zero Sum Games- To do problem!
Let’s take the following example: Two TV channels (1 and 2) are competing for an
audience of 100 viewers. The rule of the game is to simultaneously announce the type of
show the channels will broadcast. Given the payoff matrix below, what type of show should
channel 1 air?
19/12/23 56

dominance method Steps (Rule)
Step-1: 1. If all the elements of Column-i are greater than or equal to the corresponding
elements of any other Column-j, then the Column-i is dominated by the Column-j and it
is removed from the matrix.
eg. If Column-2 ≥ Column-4, then remove Column-2
Step-2: 1. If all the elements of a Row-i are less than or equal to the corresponding elements of
any other Row-j, then the Row-i is dominated by the Row-j and it is removed from the
matrix.
eg. If Row-3 ≤ Row-4, then remove Row-3
Step-3:
Again repeat Step-1 & Step-2, if any Row or Column is dominated, otherwise stop the
procedure.
Two-person zero-sum game –
Dominance property
19/12/23 57

Player A
Player B
B1 B2 B3 B4
A1 3 5 4 2
A2 5 6 2 4
A3 2 1 4 0
A4 3 3 5 2
Solutio
n
Player B
B3 B4
Player
A
A2 2 4
A4 5 2
Two-person zero-sum game –
Dominance property- To do problem!
19/12/23 58

•The prisoner’s dilemma is a universal concept. Theorists now realize that prisoner’s dilemmas
occur in biology, psychology, sociology, economics, and law.
•The prisoner’s dilemma is apt to turn up anywhere a conflict of interests exists -- and the
conflict need not be among sentient beings.
• Study of the prisoner’s dilemma has great power for explaining why animal and human
societies are organized as they are. It is one of the great ideas of the twentieth century, simple
enough for anyone to grasp and of fundamental importance (...).
• The prisoner’s dilemma has become one of the premier philosophical and scientific issues of
our time. It is tied to our very survival (W. Poundstone,1992, p. 9).
The Prisoner’s Dilemma
19/12/23 59

• Two members of a criminal gang are arrested
and imprisoned.
– They are placed under solitary confinement and have no chance of communicating with
each other
• The district attorney would like to charge them
with a recent major crime but has insufficient
evidence
– He has sufficient evidence to convict each of them of a lesser charge
– If he obtains a confession from one or both the criminals, he can convict either or both on
the major charge.
Prisoner’s Dilemma
19/12/23 60

• The district attorney offers each the chance to
turn state’s evidence.
– If only one prisoner turns state’s evidence and testifies against his partner he will go free
while the other will receive a 3 year sentence.
– Each prisoner knows the other has the same offer
– The catch is that if both turn state’s evidence, they each receive a 2 year sentence
– If both refuse, each will be imprisoned for 1 year on the lesser charge
Prisoner’s Dilemma
19/12/23 61

• The number of players
• Their strategies and their turn
• Their payoffs (profits, utilities etc) at the outcomes of the game
A game is described by
19/12/23 62

Payoff matrix
Normal- or strategic form
Left Right
Top 3, 0 0, -4
Bottom 2, 4 -1, 3
Player B
Player
A
Game Theory Definition
19/12/23 63

How to solve a situation like this?
• The most simple case is where there is a optimal choice of strategy no matter
what the other players do; dominant strategies.
• Explanation: For Player A it is always better to choose Top, for Player B it is
always better to choose left.
• A dominant strategy is a strategy that is best no matter what the other player
does.
Game Playing
19/12/23 64

• If Player A’s choice is optimal given Player B’s choice, and B’s
choice is optimal given A’s choice, a pair of strategies is a
Nash equilibrium.
• When the other players’ choice is revealed neither player like
to change her behavior.
• If a set of strategies are best responses to each other, the
strategy set is a Nash equilibrium.
Nash equilibrium
19/12/23 65

Left Right
Top 1, 1 2, 3*
Bottom 2, 3* 1, 2
Player B
Player
A
Payoff matrix
19/12/23 66

• Here you can find a Nash equilibrium; Top is the best
response to Right and Right is the best response to
Top. Hence, (Top, Right) is a Nash equilibrium.
• But there are two problems with this solution concept.
Solution
19/12/23 67

• A game can have several Nash equilibriums. In this case
also (Bottom, Left).
• There may not be a Nash equilibrium (in pure
strategies).
Problems
19/12/23 68

Left Right
Top 1, -1 -1, 1
Bottom
-1, 1
1, -1
Player B
Player
A
Payoff matrix
19/12/23 69

• Here it is not possible to find strategies that are best
responses to each other.
• If players are allowed to randomize their strategies we
can find s solution; a Nash equilibrium in mixed
strategies.
• An equilibrium in which each player chooses the
optimal frequency with which to play her strategies
given the frequency choices of the other agents.
Nash equilibrium in mixed
strategies
19/12/23 70

Two persons have committed a crime, they are held in
separate rooms. If they both confess they will serve
two years in jail. If only one confess she will be free
and the other will get the double time in jail. If both
deny they will be hold for one year.
The prisoner’s dilemma
19/12/23 71

Confess Deny
Confess -2, -2 0, -4
Deny -4, 0 -1, -1*
Prisoner B
Prisoner
A
Prisoner’s dilemma
Confess is a dominant strategy for both. If both Deny
they would be better off. This is the dilemma.
Solution
19/12/23 72

HENRY
JANE
L R
U 8,7 4,6
D 6,5 7,8
McD (1)
KFC
(1)
L R
U 9,9* 1,10
D 10,1 2,2
COKE
PEPSI
L R
U 6,8* 4,7
D 7,6 3,7
B
A
L R
U 7,6* 5,5
D 4,5 6,4
Nash Equilibrium – To do Problems!
19/12/23 73

GAME PLAYING & MECHANISM DESIGN
Mechanism Design is the design of games or
reverse engineering of games; could be called
Game Engineering
Involves inducing a game among the players
such that in some equilibrium of the game,
a desired social choice function is implemented
19/12/23 74

Example 1: Mechanism Design
Fair Division of a Cake
Mother
Social Planner
Mechanism Designer
Kid 1
Rational and
Intelligent
Kid 2
Rational and
Intelligent
19/12/23 75

Example 2: Mechanism Design
Truth Elicitation through an Indirect Mechanism
Tenali Rama
(Birbal)
Mechanism Designer
Mother 1
Rational and
Intelligent Player
Mother 2
Rational and
Intelligent Player
Baby
19/12/23 76

MECHANISM DESIGN: EXAMPLE 3 : VICKREY AUCTION
One Seller, Multiple Buyers, Single Indivisible Item
Example: B1: 40, B2: 45, B3: 60, B4: 80
Winner: whoever bids the highest; in this case B4
Payment: Second Highest Bid: in this case, 60.
Vickrey showed that this mechanism is Dominant Strategy
Incentive Compatible (DSIC) ;Truth Revelation is good for
a player irrespective of what other players report
19/12/23 77

Four Basic Types of Auctions
1
n
Seller
Buyers
Winner = 4
Price = 60
1
2
3
4
Dutch Auction
Vickrey Auction
Winner = 4
Price = 60
1
2
3
4
50
First Price Auction
55
60
40 40
45
60
80
1
n
Auctioneer or seller
English Auction
Buyers
Buyers Buyers
0, 10, 20, 30,
40, 45, 50, 55,
58, 60, stop.
100, 90, 85, 75,
70, 65, 60, stop.
78

• It uses just condition-action rules
● The rules are like the form “if … then …”
● efficient but have narrow range of applicability
● Because knowledge sometimes cannot be stated explicitly
● Work only
● if the environment is fully observable
Simple reflex agents
19/12/23 79

19/12/23 80

19/12/23 81

percepts
(size, motion)
RULES:
(1) If small moving object,
then activate SNAP
(2) If large moving object,
then activate AVOID and inhibit SNAP
ELSE (not moving) then NOOP
Action: SNAP or AVOID or NOOP
needed for
completeness
A Simple Reflex Agent in Nature
19/12/23 82

• For the world that is partially observable
● the agent has to keep track of an internal state
● That depends on the percept history
● Reflecting some of the unobserved aspects
● E.g., driving a car and changing lane
• Requiring two types of knowledge
● How the world evolves independently of the agent
● How the agent’s actions affect the world
Model-based Reflex Agents
19/12/23 83

Saw an object ahead,
and turned right, and
it’s now clear ahead
Go straight
Saw an object Ahead,
turned right, and object
ahead again
Halt
See no objects ahead Go straight
See an object ahead Turn randomly
IF THEN
Example Table Agent
With Internal State
19/12/23 84

Actions: left, right, straight, open-door
Rules:
1. If open(left) & open(right) and open(straight) then
choose randomly between right and left
2. If wall(left) and open(right) and open(straight) then straight
3. If wall(right) and open(left) and open(straight) then straight
4. If wall(right) and open(left) and wall(straight) then left
5. If wall(left) and open(right) and wall(straight) then right
6. If wall(left) and door(right) and wall(straight) then open-door
7. If wall(right) and wall(left) and open(straight) then straight.
8. (Default) Move randomly
start
Example Reflex Agent With Internal State
Wall-Following
19/12/23 85

The agent is with memory
19/12/23 86

19/12/23 87

• Current state of the environment is always not
enough
• The goal is another issue to achieve
● Judgment of rationality / correctness
• Actions chosen  goals, based on
● the current state
● the current percept
Goal-based agents
19/12/23 88

• Conclusion
● Goal-based agents are less efficient
● but more flexible
● Agent  Different goals  different tasks
● Search and planning
● two other sub-fields in AI
● to find out the action sequences to achieve its goal
Goal-based agents
19/12/23 89

• Goals alone are not enough
● to generate high-quality behavior
● E.g. meals in Canteen, good or not ?
• Many action sequences  the goals
● some are better and some worse
● If goal means success,
● then utility means the degree of success (how
successful it is)
Utility-based agents
19/12/23 91

Utility-based agents(4)
19/12/23 92

• it is said state A has higher utility
● If state A is more preferred than others
• Utility is therefore a function
● that maps a state onto a real number
● the degree of success
Utility-based agents
19/12/23 93

• Utility has several advantages:
● When there are conflicting goals,
● Only some of the goals but not all can be achieved
● utility describes the appropriate trade-off
● When there are several goals
● None of them are achieved certainly
● utility provides a way for the decision-making
Utility-based agents (3)
19/12/23 94

• After an agent is programmed, can it work
immediately?
● No, it still need teaching
• In AI,
● Once an agent is done
● We teach it by giving it a set of examples
● Test it by using another set of examples
• We then say the agent learns
● A learning agent
Learning Agents
19/12/23 95

• Four conceptual components
● Learning element
● Making improvement
● Performance element
● Selecting external actions
● Critic
● Tells the Learning element how well the agent is doing with
respect to fixed performance standard.
(Feedback from user or examples, good or not?)
● Problem generator
● Suggest actions that will lead to new and informative
experiences.
Learning Agents
19/12/23 96

Constraint Satisfaction Problem
► Search Space is constrained by a set of conditions and
dependencies
► Class of problems where the search space is constrained called as
CSP
► For example, Time-table scheduling problem for lecturers.
► Constraint,
► Two Lecturers can not be assigned to same class at the same time
► To solve CSP, the problem is to be decomposed and analyse the
structure
► Constraints are typically mathematical or logical relationships
19/12/23 98

Solving Constraint Satisfaction Problem
19/12/23 99

CSP
► Any problem in the world can be mathematically
represented as CSP
► Hypothetical problem does not have constraints
► Constraint restricts movement, arrangement, possibilities
and solutions
► Example: if only concurrency notes of 2,5,10 rupees are
available and we need to give certain amount of money,
say Rs. 111 to a salesman and the total number of notes
should be between 40 and 50.
► Expressed as, Let n be number of notes,
• 40 < n < 50
• and
• c1
X1
+ c2
X2
+ c3
X3
= 111
19/12/23 100

CSP
► Types of Constraints
► Unary Constraints - Single variable
► Binary Constraints - Two Variables
► Higher Order Constraints - More than two variables
► CSP can be represented as Search Problem
► Initial state is empty assignment, while successor function is a
non-conflicting value assigned to an unassigned variables
► Goal test checks whether the current assignment is complete and
path cost is the cost for the path to reach the goal state
► CSP Solutions leads to the final and complete assignment with
no exception
19/12/23 101

Cryptarithmetic puzzles
► Cryptarithmetic puzzles are also represented as CSP
► Example: MIKE + JACK =
JOHN
► Replace every letter in puzzle with single number
(number should not be repeated for two diﬀerent
alphabets)
► The domain is { 0,1, ... , 9 }
► Often treated as the ten-variable constraint problem
► where the constraints are:
► All the variables should have a diﬀerent value
► The sum must work out
19/12/23 102

► M * 1000 + I * 100 + K * 10 + E + J * 1000 + A * 100 + C * 10 + K
= J * 1000 + O * 100 + H * 10 + N
► Constraint Domain is represented by Five-tuple and
represented by,
• D = {var, f , O, dv, rg}
► Var stands for set variables, f is set functions, O stands for the
set of legitimate operators to be used, dv is domain variable and
rg is range of function in the constraint
► Constraint without conjunction is referred as Primitive
constraint (for Eg., x < 9 )
► Constraint with conjunction is called as non-primitive constraint or
a generic constraint (For Eg., x < 9 and x > 2)
19/12/23 103

TO
GO
---
OUT
------
21
81 G = 8/9
---
102
------ 2 + G = U + 10
Var Value
T 2
O 1
G 8
U 0
Crypt arithmetic puzzles
– Solved Example
19/12/23 104

• SEND + MORE = MONEY
c4 c3 c2 c1
S E N D
M O R E
------------------------
M O N E Y
9 5 6 7
1 0 8 5
----------
1 0 8 52
19/12/23 105

CSP- Room Coloring Problem
-CSP as a Search Problem
► Let K for Kitchen, D for Dining Room, H is for Hall, B2
and B3
are bedrooms 2 and 3, MB1
is
master bedroom, SR is the store Room, GR is Guest Room and Lib is Library
► Constraints
► All bedrooms should not be colored red, only one can
► No two adjacent rooms can have the same color
► The colors available are red, blue, green and violet
► Kitchen should not be colored green
► Recommended to color the kitchen as blue
► Dining room should not have violet color
19/12/23 106

Room Coloring Problem – Representation as a
Search Tree
► Soft Constraints are that they are cost-oriented or preferred
choice
► All paths in the Search tree can not be accepted because of
the violation in constraints
19/12/23 107

Backtracking Search for CSP
► Assignment of value to any additional variable within constraint can
generate a legal state (Leads to successor state in search tree)
► Nodes in a branch backtracks when there is no options are available
19/12/23 108

Example: Map-Coloring
• Variables WA, NT, Q, NSW, V, SA, T
• Domains Di
= {red,green,blue}
• Constraints: adjacent regions must have different colors
e.g., WA ≠ NT, or (WA,NT) in {(red,green),(red,blue),(green,red),
(green,blue),(blue,red),(blue,green)}
19/12/23 109

110
Backtracking example
19/12/23

111
19/12/23

112
19/12/23

Algorithm for Backtracking
Pick initial state
R = set of all possible states
Select state with var assignment
Add to search space
check for con
If Satisfied
Continue
Else
Go to last Decision Point (DP)
Prune the search sub-space from DP
Continue with next decision option
If state = Goal State
Return Solution
Else
Continue
19/12/23 113

CSP-Backtracking, Role of heuristic
► Backtracking allows to go to the previous decision-making node to eliminate the
invalid search space with respect to constraints
► Heuristics plays a very important role here
► If we are in position to determine which variables should be assigned next, then
backtracking can be improved
► Heuristics help in deciding the initial state as well as subsequent selected
states
► Selection of a variable with minimum number of possible values can help in
simplifying the search
► This is called as Minimum Remaining Values Heuristic (MRV) or Most Constraint
Variable Heuristic
► Restricts the most search which ends up in same variable (which would make the
backtracking ineffective)
19/12/23 114

Heuristic
► MRV cannot have hold on initial selection process
► Node with maximum constraint is selected over other
unassigned variables - Degree Heuristics
► By degree heuristics, branching factor can not be reduced
► Selection of variables are considered not the values for it, so
the order in which the values of particular variable can be
arranged is tackled by least constraining value heuristic
19/12/23 115

Heuristic – Most Constraining Variable
19/12/23 116

Heuristic- Minimum Remaining Values
19/12/23 117

Forward Checking
► To understand the forward checking, we shall see 4 Queens
problem
► If an arrangement on the board of a queen x , hampers the
position of queensx +1
, then this forward check ensures that the
queen x should not be placed at the selected position and a new
position is to be looked upon
19/12/23 118

Forward Checking
► Q1
and Q2
are placed in row 1 nad 2 in the left sub-tree, so, search
is halted, since No positions are left for Q3
and Q4
► Forward Checking keeps track of the next moves that are
available for the unassigned variables
► The search will be terminated when there is no legal move
available for the unassigned variables
19/12/23 119

CSP- Room Coloring Problem
-CSP as a Search Problem
► Let K for Kitchen, D for Dining Room, H is for Hall, B2
and B3
are bedrooms 2 and 3, MB1
is
master bedroom, SR is the store Room, GR is Guest Room and Lib is Library
► Constraints
► All bedrooms should not be colored red, only one can
► No two adjacent rooms can have the same color
► The colors available are red, blue, green and violet
► Kitchen should not be colored green
► Recommended to color the kitchen as blue
► Dining room should not have violet color
19/12/23 120

Forward Checking – Room Coloring Problem
► For Room Coloring problem, Considering all the constraints the mapping can be done in
following ways;
► At first, B2
is selected with Red (R). Accordingly, R is deleted from the adjacent nodes
► Kitchen is assigned with Blue (B). So, B is deleted form the adjacent Nodes
► Furthermore, as MB1
is selected green, no color is left for D.
19/12/23 121

Constraint Propagation
► There is no early detection of any termination /
failure that would possible occur even though the
information regarding the decision is propagated
► Constraint should be propagated rather than the
information
19/12/23 122

Constraint Propagation
► Step 2 shows the consistency propagated from D
to B2
► Since D is can have only G value and B2
being
adjacent to it, the arc is drawn
► It is mapped as D → B2
or Mathematically,
• A → B is consistent ↔ ∀ legal value a ∈ A, ∃
• non-conflicting value b ∈ B
► Failure detection can take place at early stage
19/12/23 123

Algorithm for Arc Assignment
► Algorithm for arc assignment is:
► Let C be the variable which is being assigned at a given instance
► X will have some value from D{} where D is domain
► For each and every assigned variable, that is adjacent to X, Say Xj
•1Perform forward check (remove values from domain D that
conflict the decision of the current assignment)
• 2For every other variable X jj
that are adjacent or connected to
• Xj
;
•iRemove the values from D from X jj
that can’t be taken as further
unassigned variables
• iiRepeat step 2, till no more values can be removed or discarded
► Inconsistency is considered and constraints are propagated in Step
(2)
19/12/23 124

125
• Idea:
• Keep track of remaining legal values for unassigned variables
• Terminate search when any variable has no legal values
Forward checking
19/12/23

WA RGB
NT GB
SA B
Q R
NSW G
Y R
T RGB
19/12/23 126

127
• Idea:
Forward checking
19/12/23

128
• Idea:
Forward checking
19/12/23

129
• Idea:
Forward checking
19/12/23

130
• Forward checking propagates information from assigned to unassigned
variables, but doesn't provide early detection for all failures:
• NT and SA cannot both be blue!
• Constraint propagation repeatedly enforces constraints locally
Constraint propagation
19/12/23

131
• Simplest form of propagation makes each arc consistent
• X 🡪Y is consistent iff
for every value x of X there is some allowed y
constraint propagation propagates arc consistency on the graph.
Arc consistency
19/12/23

132
Arc consistency
19/12/23

133
• If X loses a value, neighbors of X need to be rechecked
Arc consistency
19/12/23

134
• If X loses a value, neighbors of X need to be rechecked
• Arc consistency detects failure earlier than forward checking
• Can be run as a preprocessor or after each assignment
■ Time complexity: O(n2
d3
)
Arc consistency
19/12/23

Intelligent Backtracking
► Conflict set is maintained using forward checking and
maintained
► Considering the 4 Queens problem, Conflict needs to
be detected by the user of conflict set so that a
backtrack can occur
► Backtracking with respect to the conflict set is
called as conflict-directed back jumping
► Back jumping approach can’t actually restrict the earlier
committed mistakes in some other branches
19/12/23 136

Intelligent Backtracking
► Chronological backtracking: The BACKGRACKING-SEARCH in
which, when a branch of the search fails, back up to the preceding
variable and try a different value for it. (The most recent decision
point is revisited).
• e.g: Suppose we have generated the partial assignment {Q=red,
NSW=green, V=blue, T=red}.
• When we try the next variable SA, we see every value violates a
constraint.
• We back up to T and try a new color, it cannot resolve the
problem.
► Intelligent backtracking: Backtrack to a variable that was responsible for making
one of the possible values of the next variable (e.g. SA) impossible.
Conflict set for a variable: A set of assignments that are in conflict with some value
for that variable.
(e.g. The set {Q=red, NSW=green, V=blue} is the conflict set for SA.)
Backjumping method: Backtracks to the most recent assignment in the conflict set.
(e.g. backjumping would jump over T and try a new value for V.)
19/12/23 137

21CSC206T_UNIT3.pptx.pdf ARITIFICIAL INTELLIGENCE

More Related Content

Similar to 21CSC206T_UNIT3.pptx.pdf ARITIFICIAL INTELLIGENCE

Recently uploaded

21CSC206T_UNIT3.pptx.pdf ARITIFICIAL INTELLIGENCE