Why are the value functions sometimes written with capital letters and other times with lower-case letters?



Why are the state-value and action-value functions sometimes written in lowercase letters and other times in capitals? For instance, why does the Q-learning algorithm (page 131 of Barto and Sutton's book, among other places) use the capitals $Q(S, A)$, while the Bellman equation uses $q(s,a)$?


Posted 2020-06-10T02:46:22.760

Reputation: 193



In the Sutton and Barto book, lowercase $q(s,a)$ denotes the true expected value of taking action $a$ in state $s$, whereas capital $Q(s,a)$ denotes an estimate of $q(s,a)$. However, the literature is likely to be inconsistent, as each author has their own notational preferences. I would encourage you to check, whenever you read such a value, whether it is meant to denote an estimate or the true value.
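To make the distinction concrete, here is a minimal sketch on a two-armed bandit. The setup is my own assumption, not an example from the book: `true_q` plays the role of the fixed, true $q(a)$, and `Q` is a sample-average estimate built from noisy rewards. The function name, the Gaussian reward noise, and the uniform action selection are all hypothetical choices made for illustration.

```python
import random

def estimate_action_values(true_q, steps=20000, seed=0):
    """Return sample-average estimates Q of the true action values true_q."""
    rng = random.Random(seed)
    Q = [0.0] * len(true_q)   # estimates (capital Q): change as data arrives
    N = [0] * len(true_q)     # number of times each action was tried
    for _ in range(steps):
        a = rng.randrange(len(true_q))       # pick an action uniformly at random
        reward = rng.gauss(true_q[a], 1.0)   # noisy reward with mean q(a) (assumed noise model)
        N[a] += 1
        Q[a] += (reward - Q[a]) / N[a]       # incremental sample average
    return Q

true_q = [1.0, 2.0]                          # true values (lowercase q): fixed numbers
Q = estimate_action_values(true_q, seed=0)
Q_other_run = estimate_action_values(true_q, seed=1)
print(Q, Q_other_run)
```

The true values never change, while the estimates depend on which rewards happened to be sampled, so two runs give slightly different `Q` that both approach `true_q`.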

David Ireland

Posted 2020-06-10T02:46:22.760

Reputation: 1 942


Ordinary variables vs Random Variables

The difference is whether you're talking about an ordinary variable or a random variable.

For instance, the q-function (lowercase) is an expectation value (i.e. not a random variable), conditioned on a specific state-action pair: $$ q(s,a)\ =\ \mathbb{E}_t\left\{ R_t+\gamma R_{t+1} + \gamma^2R_{t+2}+\dots\,\Big|\, S_t=s, A_t=a \right\} $$ In some cases, authors may abuse notation slightly by feeding a random variable into the q-function, e.g. $q(S_t,a)$, $q(s,A_t)$ or even $q(S_t,A_t)$, thereby undoing some or all of the conditioning in the definition of the q-function as an expectation value.

Feeding a random variable into a function like the q-function results in an output that is a random variable in its own right. It is for this reason that some authors choose to give the function itself an uppercase letter as well.
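As a rough illustration of this point, the sketch below treats the q-function as a plain lookup table and compares evaluating it at a fixed state with evaluating it at a randomly drawn state. The table values, the state and action names, and the uniform state distribution are all made up for the example.

```python
import random

# Hypothetical q-table: an ordinary (deterministic) function of (state, action).
q = {
    ("s0", "left"): 1.0, ("s0", "right"): 0.0,
    ("s1", "left"): -1.0, ("s1", "right"): 2.0,
}

def sample_state(rng):
    """Draw the random variable S_t (assumed uniform over the two states)."""
    return rng.choice(["s0", "s1"])

rng = random.Random(0)

# q(s, a) with a fixed state s: always the same number.
fixed = q[("s1", "right")]

# q(S_t, a) with a random state S_t: the output varies from draw to draw,
# so it is itself a random variable.
samples = [q[(sample_state(rng), "right")] for _ in range(1000)]
distinct = sorted(set(samples))
mean = sum(samples) / len(samples)

print(fixed, distinct, mean)
```

Here `fixed` is a single number, whereas `samples` takes several distinct values and only has a well-defined distribution and mean.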

My advice would be to think to yourself, is this a random variable? For the rest, I would interpret upper/lowercase as no more than a hint to the reader.


Posted 2020-06-10T02:46:22.760

Reputation: 151

To add to your answer, capital Q is typically used when it is an estimate of the true q-value function. This follows from what you are saying: if it is an estimate, then it is a stochastic approximation, i.e. a random variable. – David Ireland – 2020-06-10T09:50:22.637

It's also worth mentioning that the notation in the first edition of Barto and Sutton's book wasn't very consistent, but it should have improved in the second edition. – nbro – 2020-06-10T10:15:05.327

@Kris, could you maybe clarify what passing a random variable into a function q would mean? – d56 – 2020-06-12T02:55:39.763