Markov decision process

Prof. Neeraj Bhargava
Kapil Chauhan
Department of Computer Science
School of Engineering & Systems Sciences
MDS University, Ajmer

Introduction
• Reinforcement Learning is a type of Machine
Learning.
• It allows machines and software agents to
automatically determine the ideal behavior within a
specific context, in order to maximize their
performance.

Cont..
• In a reinforcement learning problem, an agent must
decide the best action to take based on its current state.
• When this decision step is repeated, the problem is
known as a Markov Decision Process.

A Markov Decision Process (MDP) model contains:
• A set of possible world states S.
• A model (transition model) T(s, a, s').
• A set of possible actions A.
• A real-valued reward function R(s, a).
• A policy, which is the solution of the Markov Decision Process.
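
As a rough illustration, the components listed above can be bundled into one small data structure. This is only a sketch: the class name `MDP`, its field names, and the two-state battery example are assumptions made for illustration and do not come from the slides. The policy is left out because it is the solution to the problem rather than part of its description.

```python
from dataclasses import dataclass


@dataclass
class MDP:
    """Hypothetical container for the components of an MDP."""
    states: set        # S: set of possible world states
    actions: set       # A: set of possible actions
    transition: dict   # model T: maps (s, a) to the next state s'
    reward: dict       # R(s, a): maps (s, a) to a real-valued reward

    def is_consistent(self) -> bool:
        """Check that T and R only refer to known states and actions."""
        for (s, a), s_next in self.transition.items():
            if s not in self.states or a not in self.actions:
                return False
            if s_next not in self.states:
                return False
        return all(s in self.states and a in self.actions
                   for (s, a) in self.reward)


# Made-up example: a robot whose battery is either "high" or "low".
robot = MDP(
    states={"high", "low"},
    actions={"work", "recharge"},
    transition={
        ("high", "work"): "low",
        ("high", "recharge"): "high",
        ("low", "work"): "low",
        ("low", "recharge"): "high",
    },
    reward={
        ("high", "work"): 2.0,
        ("high", "recharge"): 0.0,
        ("low", "work"): -1.0,
        ("low", "recharge"): 0.0,
    },
)
```

Calling `robot.is_consistent()` is one simple way to catch a typo in a state or action name before the model is used.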

Model:
• The State set S is a set of tokens that represent every
state that the agent can be in.
• A Model (sometimes called a Transition Model) gives
an action's effect in a state. In particular, T(s, a, s')
defines a transition in which being in state s and taking
action a takes us to state s' (s and s' may be the same).
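
The transition model can be sketched as a plain function. The corridor world below (cells 0 to 3 with walls at both ends), the action names, and the function name `T` are assumptions chosen for illustration, and the model is taken to be deterministic, so each (state, action) pair leads to exactly one next state.

```python
# Hypothetical deterministic transition model for a 1-D corridor, cells 0..3.
N_CELLS = 4


def T(s: int, a: str) -> int:
    """Return the next state s' reached by taking action a in state s."""
    if a == "left":
        return max(s - 1, 0)            # wall on the left: s' may equal s
    if a == "right":
        return min(s + 1, N_CELLS - 1)  # wall on the right: s' may equal s
    raise ValueError(f"unknown action: {a}")


print(T(1, "right"))  # 2
print(T(0, "left"))   # 0, the agent bumps into the wall and stays put
```

The second call shows the case where s and s' are the same state.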

Cont..
• The Action set A is the set of all possible actions. A(s)
defines the set of actions that can be taken in state s.
• A Reward is a real-valued reward function. R(s)
indicates the reward for simply being in state s, and
R(s, a) the reward for being in state s and taking action a.
• A Policy is a solution to the Markov Decision Process.
A policy is a mapping from states S to actions A. It
indicates the action a to be taken while in state s.
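
To make the roles of A(s), R(s), and the policy concrete, here is a minimal sketch that follows a fixed policy and adds up the rewards it collects. The corridor world, the reward values, and every name in the code are assumptions for illustration. The policy shown is simply one possible mapping from states to actions, not one computed by any particular solution method.

```python
# All names and numbers here are made up for illustration.
GOAL = 3                                   # corridor cells 0..3; cell 3 is the goal
R = {0: -1.0, 1: -1.0, 2: -1.0, 3: 10.0}   # R(s): reward for being in state s


def available_actions(s):
    """A(s): actions that can be taken in state s (none at the goal)."""
    return [] if s == GOAL else ["left", "right"]


def step(s, a):
    """Deterministic transition: the next state after action a in state s."""
    return max(s - 1, 0) if a == "left" else min(s + 1, GOAL)


# A policy maps each non-goal state to the action to take there.
policy = {0: "right", 1: "right", 2: "right"}


def rollout(start, max_steps=10):
    """Follow the policy from `start`, summing R(s) over visited states."""
    s, total = start, R[start]
    for _ in range(max_steps):
        if not available_actions(s):       # no actions left: episode ends
            break
        a = policy[s]
        if a not in available_actions(s):
            raise ValueError(f"policy chose an unavailable action: {a}")
        s = step(s, a)
        total += R[s]
    return s, total


print(rollout(0))  # (3, 7.0)
```

Starting in cell 0, the agent collects -1 in each of cells 0, 1, and 2, then 10 at the goal, for a total of 7.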

Assignment
• Explain the Markov Decision Process with an example.
