What does the notation $\nabla_\theta \mathcal{L}$ mean?


Here's the general algorithm of maximum entropy inverse reinforcement learning.

(Image of the algorithm not available.)

The algorithm uses gradient descent. What I do not understand is that there appears to be only a single gradient value, $\nabla_\theta \mathcal{L}$, yet it is used to update a whole vector of parameters. This does not make sense to me, because every element of the vector would be updated with the same value $\nabla_\theta \mathcal{L}$. Can you explain the logic behind updating a vector with a single gradient?

İbrahim Abbasov

Posted 2018-07-06T21:26:06.797

Reputation: 51



This is the standard gradient update, the same one used with backpropagation. The term $\nabla_\theta \mathcal{L}$ is not a single number but a vector of partial derivatives: its $i$-th element is the partial derivative of the log-likelihood $\mathcal{L}$ with respect to the $i$-th element of the parameter vector $\theta$, so it has the same dimensionality as $\theta$. Each element of the parameter vector is then updated with its own entry of that gradient vector, and these entries are generally not the same.
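To make this concrete, here is a minimal sketch in plain Python. The `log_likelihood` function is a made-up toy objective standing in for $\mathcal{L}(\theta)$, not the actual maximum entropy IRL likelihood, and `numerical_gradient`, `target`, and `learning_rate` are likewise hypothetical names chosen for illustration. It only shows that the gradient has one entry per parameter and that each parameter gets its own update.

```python
def log_likelihood(theta):
    # Hypothetical toy objective standing in for L(theta); it is NOT the
    # maximum entropy IRL likelihood. It is maximized at theta == target.
    target = [1.0, -2.0, 0.5]
    return -sum((t - c) ** 2 for t, c in zip(theta, target))


def numerical_gradient(f, theta, eps=1e-6):
    # Central finite differences: one partial derivative per parameter,
    # so the result has the same length as theta.
    grad = []
    for i in range(len(theta)):
        plus = list(theta)
        minus = list(theta)
        plus[i] += eps
        minus[i] -= eps
        grad.append((f(plus) - f(minus)) / (2 * eps))
    return grad


theta = [0.0, 0.0, 0.0]
grad = numerical_gradient(log_likelihood, theta)
print(grad)  # approximately [2.0, -4.0, 1.0]: a different value per parameter

# Gradient ascent on the log-likelihood (equivalently, gradient descent on
# the negative log-likelihood): each element uses its own partial derivative.
learning_rate = 0.1
theta_new = [t + learning_rate * g for t, g in zip(theta, grad)]
print(theta_new)  # approximately [0.2, -0.4, 0.1]
```

In practice the gradient would come from an analytic formula or automatic differentiation rather than finite differences, but the shape argument is the same: $\nabla_\theta \mathcal{L}$ and $\theta$ always have matching dimensions.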


Posted 2018-07-06T21:26:06.797

Reputation: 851