
I've been following this website to understand how the parameter-shift rule works for calculating gradients for backpropagation in variational quantum machine learning circuits.

Most of it makes sense, until I got to the example section where they explain how one could calculate the gradient in the case of a Pauli gate:

$$ U_i(\theta_i) = \exp\left(-i\frac{\theta_i}{2}\hat P_i\right) $$

The gradient of this unitary is: $$ \nabla_{\theta_i}U_i(\theta_i) = -\frac{i}{2}\hat P_i U_i(\theta_i) = -\frac{i}{2}U_i(\theta_i)\hat P_i $$
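(To convince myself of this step, I wrote a quick NumPy sanity check — my own code, not from the linked post — comparing a finite-difference derivative of $U(\theta) = \exp(-i\frac{\theta}{2}\hat P)$ against $-\frac{i}{2}\hat P\, U(\theta)$, taking $\hat P = X$ as an example:)

```python
import numpy as np

P = np.array([[0, 1], [1, 0]], dtype=complex)  # Pauli-X as the generator P
I = np.eye(2, dtype=complex)

def U(theta):
    # exp(-i θ/2 P) = cos(θ/2) I - i sin(θ/2) P, valid since P² = I
    return np.cos(theta / 2) * I - 1j * np.sin(theta / 2) * P

theta, eps = 0.7, 1e-6
dU_numeric = (U(theta + eps) - U(theta - eps)) / (2 * eps)  # central difference
dU_analytic = -0.5j * P @ U(theta)                          # claimed gradient

print(np.allclose(dU_numeric, dU_analytic, atol=1e-8))  # True
```

This agrees, so the gradient of the unitary itself is not where I'm stuck.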

which makes sense, the problem starts with the following sentence:

Substituting this into the quantum circuit function $f(x;\theta_i)$, we get: $$ \nabla_{\theta_i}f(x;\theta) = \frac{i}{2} \langle \psi_{i-1}|U_i^{\dagger}(\theta_i)(\hat P_i \hat B_{i+1} - \hat B_{i+1}\hat P_i)U_i(\theta_i)|\psi_{i-1}\rangle$$ where $f(x;\theta)$ was given by the following: $$ \langle 0 | U_0^{\dagger}(x)U_i^{\dagger}(\theta_i)\hat BU_i(\theta_i)U_0(x)|0\rangle$$

Now I have no idea how that came to be; what am I missing?
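For what it's worth, I did verify the quoted expression numerically for a toy single-qubit circuit of my own choosing (an $R_X(x)$ data-encoding gate for $U_0$, an $R_Z(\theta)$ parametrized gate so $\hat P = Z$, and observable $\hat B = X$), so the formula itself appears correct; it's the derivation I can't follow:

```python
import numpy as np

# Pauli matrices
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

def rot(theta, P):
    # exp(-i θ/2 P) for a Pauli matrix P (uses P² = I)
    return np.cos(theta / 2) * I2 - 1j * np.sin(theta / 2) * P

x, theta = 0.4, 0.9
B, P = X, Z                            # observable and gate generator (my choices)
psi0 = np.array([1, 0], dtype=complex)

def f(th):
    # f(x; θ) = ⟨0| U0†(x) U†(θ) B U(θ) U0(x) |0⟩
    psi = rot(th, P) @ rot(x, X) @ psi0
    return np.real(psi.conj() @ B @ psi)

eps = 1e-6
grad_numeric = (f(theta + eps) - f(theta - eps)) / (2 * eps)

# Quoted formula: (i/2) ⟨ψ_{i-1}| U† (P B - B P) U |ψ_{i-1}⟩
psi_prev = rot(x, X) @ psi0            # |ψ_{i-1}⟩ = U0(x)|0⟩
U = rot(theta, P)
comm = P @ B - B @ P
grad_formula = np.real(0.5j * psi_prev.conj() @ U.conj().T @ comm @ U @ psi_prev)

print(np.allclose(grad_numeric, grad_formula, atol=1e-6))  # True
```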

There is some detail missing in the post, but the gist should be to simply proceed with the chain rule. You have $\nabla_i\langle U_0^\dagger U_i^\dagger B U_i U_0\rangle = \frac{i}{2}\langle U_0^\dagger (P_i U_i^\dagger B U_i - U_i^\dagger B U_i P_i)U_0\rangle$, and I guess the rest should follow from the definitions of $B, B_i, U_0, \psi_k$. Notice also that $[U_i, P_i]=0$, which helps in rearranging terms. – glS – 2020-08-14T06:35:01.827
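(A quick numerical check of the commutation fact used in the comment above — my own sketch, not part of the thread: a Pauli rotation commutes with its generator, $[U_i, P_i] = 0$, which is what lets one pull $P_i$ through $U_i^\dagger$ when rearranging the two terms:)

```python
import numpy as np
from scipy.linalg import expm

P = np.array([[1, 0], [0, -1]], dtype=complex)  # Pauli-Z as P_i
theta = 1.3
U = expm(-0.5j * theta * P)                     # U_i(θ) = exp(-i θ/2 P_i)

print(np.allclose(U @ P, P @ U))  # True: [U_i, P_i] = 0
```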