## Initial assumption of the unitary that allows us to estimate the label function

You can find the paper here; it describes the architecture of a QNN that can be used to learn binary label functions and correctly classify unseen data.

The authors state that for each binary label function $$l(z)$$, where $$l(z) = -1$$ or $$l(z) = 1$$, there exists a unitary $$U_l$$ such that, for all input strings $$z = z_0z_1...z_{n-1}$$ (where each $$z_i \in \{-1, 1\}$$), $$\langle z,0 | U_l^{\dagger} Y_{n+1} U_l |z,0 \rangle = l(z)$$.

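If you assume that $$U_l = \text{exp}(i\frac{\pi}{4}l(z)X_{n+1})$$, then it is easy to show that $$\langle z,0 | U_l^{\dagger} Y_{n+1} U_l |z,0 \rangle = l(z)$$: conjugating gives $$U_l^{\dagger} Y_{n+1} U_l = \cos(\frac{\pi}{2}l(z))\, Y_{n+1} + \sin(\frac{\pi}{2}l(z))\, Z_{n+1} = l(z)\, Z_{n+1}$$, and the readout qubit in $$|0\rangle$$ has $$\langle 0 | Z_{n+1} | 0 \rangle = 1$$.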
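The identity above can be checked numerically on the readout qubit alone. The following is a minimal sketch in plain Python; the helper names (`exp_i_theta_x`, `readout_expectation`) are made up for illustration, and it assumes $$|0\rangle$$ is the $$+1$$ eigenstate of $$Z$$.

```python
import math

# Pauli Y as a 2x2 nested list (X only enters through the closed form below).
Y = [[0, -1j], [1j, 0]]

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def exp_i_theta_x(theta):
    """exp(i*theta*X) = cos(theta)*I + i*sin(theta)*X, because X^2 = I."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, 1j * s], [1j * s, c]]

def readout_expectation(label):
    """<0| U^dagger Y U |0> for U = exp(i*pi/4*label*X) on the readout qubit."""
    u = exp_i_theta_x(math.pi / 4 * label)
    m = matmul(dagger(u), matmul(Y, u))
    # <0| M |0> is the top-left matrix element.
    return complex(m[0][0]).real

for label in (1, -1):
    print(label, round(readout_expectation(label), 12))
```

For both labels the printed expectation should match the label, which is what the claimed identity requires.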

Now let's consider the subset parity problem. Here, $$l(z) = 1-2B(z)$$, where $$B(z) = \oplus^{n-1}_{j=0} \; a_j \cdot \frac{1}{2}(1-z_j)$$. Plugging this into $$U_l$$ gives, up to a global sign, $$\text{exp}(i\frac{\pi}{4}X_{n+1}) \prod^{n-1}_{j=0} \text{exp}(-i \frac{\pi}{2}a_j \cdot \frac{1}{2}(1-z_j) X_{n+1})$$, where every factor acts on the readout qubit through $$X_{n+1}$$.
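As a sanity check on this factorization, here is a small sketch that compares the subset parity label with the readout expectation for every input string. It assumes each factor of the product carries $$X_{n+1}$$ (which is what substituting $$l(z) = 1-2B(z)$$ into $$\text{exp}(i\frac{\pi}{4}l(z)X_{n+1})$$ implies), so that on a basis state all factors combine into one rotation $$\text{exp}(i\phi X_{n+1})$$ with expectation $$\sin(2\phi)$$. The function names and the example subset `a` are illustrative only.

```python
import itertools
import math

def subset_parity_label(z, a):
    """l(z) = 1 - 2*B(z), with B(z) the parity of the bits selected by a."""
    bits = [(1 - z_j) // 2 for z_j in z]  # z_j = +1 -> 0, z_j = -1 -> 1
    parity = sum(a_j * b_j for a_j, b_j in zip(a, bits)) % 2
    return 1 - 2 * parity

def product_form_expectation(z, a):
    """<Y> on the readout qubit for exp(i*pi/4*X) * prod_j exp(-i*pi/2*a_j*b_j*X).

    All factors commute (they are functions of X only), so the angles add up
    to a single phi, and <0| e^{-i phi X} Y e^{i phi X} |0> = sin(2*phi).
    """
    phi = math.pi / 4 - sum(math.pi / 2 * a_j * (1 - z_j) / 2
                            for a_j, z_j in zip(a, z))
    return math.sin(2 * phi)

def max_deviation(a):
    """Largest |<Y> - l(z)| over all 2^n input strings z."""
    return max(abs(product_form_expectation(z, a) - subset_parity_label(z, a))
               for z in itertools.product((1, -1), repeat=len(a)))

print(max_deviation((1, 0, 1, 1)))  # hypothetical subset, n = 4
```

The deviation should be at the level of floating-point rounding, reflecting that the extra multiples of $$\pi$$ from taking the sum mod 2 only change a global sign.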

Now, for the subset parity problem, what you want to learn is $$\frac{\pi}{2}a_j$$, which you do not know beforehand.

So, during learning, you assume that $$U_l(\vec\theta) = \text{exp}(i\frac{\pi}{4}X_{n+1}) \prod^{n-1}_{j=0} \text{exp}(-i \theta_j \cdot \frac{1}{2}(1-z_j) X_{n+1})$$. The goal is to update $$\vec\theta$$ such that the estimated label we compute gets close to the actual label.
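To make the learning step concrete, here is a hedged sketch of one gradient update for this parametrized form. Two assumptions are mine and not stated in the post: each factor carries $$X_{n+1}$$, so the predicted label on a basis state reduces to $$\cos(2\sum_j \theta_j \cdot \frac{1}{2}(1-z_j))$$, and the per-sample loss is $$1 - l(z)\cdot\text{prediction}$$. All function names and the learning rate are illustrative.

```python
import math

def weighted_sum(z, thetas):
    """sum_j theta_j * (1 - z_j)/2, i.e. the thetas of the bits with z_j = -1."""
    return sum(t * (1 - z_j) / 2 for t, z_j in zip(thetas, z))

def predict(z, thetas):
    """Estimated label <Y> = sin(2*(pi/4 - s)) = cos(2*s) for the ansatz above."""
    return math.cos(2 * weighted_sum(z, thetas))

def sample_loss(z, label, thetas):
    """Assumed loss: 0 when the prediction equals the label, 2 when opposite."""
    return 1 - label * predict(z, thetas)

def loss_gradient(z, label, thetas):
    """Analytic gradient of sample_loss with respect to each theta_j."""
    s = weighted_sum(z, thetas)
    return [2 * label * (1 - z_j) / 2 * math.sin(2 * s) for z_j in z]

def sgd_step(z, label, thetas, lr=0.1):
    """One plain gradient-descent update on a single labeled sample."""
    grad = loss_gradient(z, label, thetas)
    return [t - lr * g for t, g in zip(thetas, grad)]

# Example: one update on a single sample with a hypothetical starting point.
thetas = [0.3, 0.7, -0.2]
z, label = (-1, -1, 1), 1
print(sample_loss(z, label, thetas), sample_loss(z, label, sgd_step(z, label, thetas)))
```

With $$\theta_j = \frac{\pi}{2}a_j$$ the prediction equals the subset parity label, so the loss vanishes there; for an unknown label function there is no guarantee that such a zero-loss point exists within this family.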

This method seems to be working fine for this problem (I get an accuracy of 96%).

Right now, I am trying to use a QNN for another binary classification problem. Unlike in the subset parity problem, I do not actually know $$l(z)$$ (which I thought was perfect, because the QNN allows me to design a circuit that correctly classifies my strings). Therefore, I assumed that $$U_l(\vec\theta) = \text{exp}(i\frac{\pi}{4}X_{n+1}) \prod^{n-1}_{j=0} \text{exp}(-i \theta_j \cdot \frac{1}{2}(1-z_j) X_{n+1})$$, just like in the subset parity problem.

It seems to be working fine. I get an accuracy of 76%, which isn't bad. However, I am not sure if I can assume this and I am starting to wonder if my initial assumption about $$U_l$$ for this new problem is legit or not (it could be a coincidence or an error in my code).

Unfortunately just saw this today. Might have been fun to read the paper and figure out the answer for you. But too late now since it's almost midnight and when I wake up the bounty will be over. Good luck! – user1271772 – 2020-06-14T03:17:56.500

As far as I understand from the paper, eq. (13) gives $$U_l$$ as a product of two-qubit unitaries, independently of $$l(z)$$. The authors then present two cases, subset parity and subset majority, and derive the specific $$U_l$$ for each. So I guess your classification problem will need its own specialization of eq. (13). If you get an acceptable accuracy with the subset parity $$U_l$$, it may be a coincidence, or it may not be: it depends on how your classification problem (which we do not know) relates to subset parity.