What does the notation ${s'\sim T(s,a,\cdot)}$ mean?

2

I have been seeing expectations written with subscripts, such as $E_{s_0 \sim D}[V^\pi (s_0)] = \sum_{t=0}^\infty[\gamma^t\phi(s_t)]$, taken from https://ai.stanford.edu/~ang/papers/icml04-apprentice.pdf, and $Q^\pi(s,a,R) = R(s) + \gamma E_{s'\sim T(s,a,\cdot)}[V^\pi(s',R)]$ in the Bayesian IRL paper (https://www.aaai.org/Papers/IJCAI/2007/IJCAI07-416.pdf).

I understand that $s_0 \sim D$ means that the starting state $s_0$ is drawn from a distribution of starting states $D$. But how should we read the subscript ${s'\sim T(s,a,\cdot)}$ in the latter? How is $s'$ drawn from a distribution of transition probabilities?

calveeen

Posted 2020-03-29T15:33:37.707

Reputation: 909

Answers

2

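The dot ($\cdot$) at the end of $T(s,a,\cdot)$ is a placeholder that ranges over all possible next states we can reach from state $s$ by taking action $a$. Each next state $s'$ has probability $T(s,a,s')$, and these probabilities sum to 1. Hence $T(s,a,\cdot)$ is a probability distribution over next states, and $s'\sim T(s,a,\cdot)$ means that $s'$ is drawn from it. For a finite state space, the expectation is then $E_{s'\sim T(s,a,\cdot)}[V^\pi(s',R)] = \sum_{s'} T(s,a,s')\,V^\pi(s',R)$.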
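
To make the notation concrete, here is a minimal Python sketch. The 3-state, 2-action transition table `T`, the value list `V`, and the helper names are all made up for illustration and do not come from either paper. Fixing $s$ and $a$ picks out one row of the table, which is the distribution $T(s,a,\cdot)$; sampling from that row gives $s'$, and the expectation is the probability-weighted sum over that row.

```python
import random

# Hypothetical toy MDP (assumed numbers, not from either paper).
# T[s][a][s_next] = probability of reaching s_next after taking action a in state s.
T = [
    [[0.7, 0.2, 0.1], [0.0, 0.5, 0.5]],  # transitions from state 0
    [[0.1, 0.8, 0.1], [0.3, 0.3, 0.4]],  # transitions from state 1
    [[0.0, 0.0, 1.0], [0.5, 0.0, 0.5]],  # transitions from state 2
]
V = [1.0, 2.0, 4.0]  # assumed values V(s') for each next state


def next_state_distribution(s, a):
    """T(s, a, .): fix s and a, leave the next state free."""
    return T[s][a]


def sample_next_state(s, a, rng):
    """Draw s' ~ T(s, a, .)."""
    probs = next_state_distribution(s, a)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def expected_value(s, a):
    """E_{s' ~ T(s, a, .)}[V(s')] as an exact weighted sum."""
    probs = next_state_distribution(s, a)
    return sum(p * V[s_next] for s_next, p in enumerate(probs))


# Each T(s, a, .) is a valid distribution: its entries sum to 1.
row_sums = [sum(T[s][a]) for s in range(3) for a in range(2)]

# Exact expectation versus a Monte Carlo estimate built from samples of s'.
rng = random.Random(0)
exact = expected_value(0, 0)
n_samples = 20000
estimate = sum(V[sample_next_state(0, 0, rng)] for _ in range(n_samples)) / n_samples
```

With enough samples, `estimate` should land close to `exact`, which is all the subscript is saying: average $V(s')$ over next states drawn according to $T(s,a,\cdot)$.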

OmG

Posted 2020-03-29T15:33:37.707

Reputation: 1 020