
I'm attempting to model a simple process using a Markov Decision Process.

Let $A$ be a set of $2$ actions: $A = \{b, s\}$. $T(s,a,s')$ represents the probability of ending up in state $s'$ after taking action $a$ in state $s$.

Notation for the MDP diagram is as follows:

Here is my MDP diagram which models 7 states:

The outgoing transition probabilities for each state-action pair sum to 1.

$T(1,b,2) = .7 $

$T(1,b,3) = .3 $

$T(1,s,4) = .9 $

$T(1,s,5) = .05 $

$T(1,s,6) = .05 $
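One way to check the sum-to-1 property numerically is a minimal Python sketch like the one below. It stores only the five transitions listed above in a dictionary keyed by $(s, a, s')$; the names `T`, `outgoing_totals` and `is_valid` are hypothetical helpers made up for this illustration, not part of any library.

```python
import math
from collections import defaultdict

# Transition probabilities T(s, a, s') copied from the list above.
# Only the transitions out of state 1 are given, so only those are checked.
T = {
    (1, "b", 2): 0.7,
    (1, "b", 3): 0.3,
    (1, "s", 4): 0.9,
    (1, "s", 5): 0.05,
    (1, "s", 6): 0.05,
}


def outgoing_totals(transitions):
    """Sum T(s, a, s') over s' for every (s, a) pair."""
    totals = defaultdict(float)
    for (s, a, _s_next), p in transitions.items():
        totals[(s, a)] += p
    return dict(totals)


def is_valid(transitions, tol=1e-9):
    """True if every probability lies in [0, 1] and each (s, a) pair sums to 1."""
    in_range = all(0.0 <= p <= 1.0 for p in transitions.values())
    sums_ok = all(
        math.isclose(total, 1.0, abs_tol=tol)
        for total in outgoing_totals(transitions).values()
    )
    return in_range and sums_ok


totals = outgoing_totals(T)
for (s, a), total in sorted(totals.items()):
    print(f"state {s}, action {a}: total probability {total:.2f}")
print("valid:", is_valid(T))
```

The check is done per state-action pair rather than per state, because each action defines its own distribution over next states. The totals are compared with a small tolerance, since floating-point sums such as 0.9 + 0.05 + 0.05 may not be exactly 1.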

I've tried to keep this as simple as possible to check my understanding. Are my representation and probabilities correct?

What is action 'h'? It is not being modeled. – Brian Spiering – 2020-02-09T23:49:50.897

@BrianSpiering I hadn't included 'h'; it could be modeled, but I left it out. I've removed 'h' for clarity, thanks. – blue-sky – 2020-02-10T19:52:17.977