Markov Decision Process representation


I'm attempting to model a simple process using a Markov Decision Process.

Let $A$ be a set of $2$ actions: $A = \{b, s\}$. $T(s,a,s')$ represents the probability of ending up in state $s'$ after taking action $a$ in state $s$.

Notation for the MDP diagram is as follows:

[image: legend for the MDP diagram notation]

Here is my MDP diagram, which models 7 states. For each state and action, the probabilities on the outgoing transitions sum to 1.

[image: MDP diagram]

$T(1,b,2) = 0.7$

$T(1,b,3) = 0.3$

$T(1,s,4) = 0.9$

$T(1,s,5) = 0.05$

$T(1,s,6) = 0.05$

I've tried to keep this as simple as possible to check my understanding. Are my representation and probabilities correct?


What is action 'h'? It is not being modeled.

I hadn't included 'h'; it could be modeled, but I left it out. I've now removed 'h' from the question for clarity, thanks.



Looks 'correct' to me, in the sense that it satisfies the requirements for being an MDP. Whether it models the underlying real-world problem correctly cannot be validated with the information given here.
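The requirement in question, that the probabilities for each (state, action) pair sum to 1, can also be checked mechanically. Below is a minimal sketch in Python, assuming the transitions are stored in a dictionary keyed by `(state, action, next_state)`. The names `T`, `check_transitions`, and the tolerance `tol` are illustrative choices, not part of the question, and only the five transitions listed in the question are included.

```python
import math
from collections import defaultdict

# Transition probabilities listed in the question,
# keyed by (state, action, next_state).
T = {
    (1, "b", 2): 0.7,
    (1, "b", 3): 0.3,
    (1, "s", 4): 0.9,
    (1, "s", 5): 0.05,
    (1, "s", 6): 0.05,
}


def check_transitions(T, tol=1e-9):
    """Return {(state, action): total} for every pair whose
    outgoing probabilities do not sum to 1 (within tol)."""
    totals = defaultdict(float)
    for (s, a, s_next), p in T.items():
        # Each individual entry must itself be a valid probability.
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"T({s},{a},{s_next}) = {p} is not in [0, 1]")
        totals[(s, a)] += p
    return {
        sa: total
        for sa, total in totals.items()
        if not math.isclose(total, 1.0, abs_tol=tol)
    }


# An empty result means every listed (state, action) pair sums to 1.
print(check_transitions(T))
```

This only confirms that the listed numbers form valid probability distributions; it says nothing about whether those numbers describe the real-world process well.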

