## Quantum-Assisted Neural Network Training (Is my design reasonable?)


I'm a college student with a slight interest in quantum mechanics. I think I have a decent understanding of the Copenhagen and Many Worlds interpretations of quantum mechanics, and was considering how this could be used to improve machine learning efficiency. I want to check my understanding of quantum mechanics/computing using a design I came up with for a neural network training algorithm.

The following is a graphical representation of my algorithm. To read the diagram, follow the colored circles. The arrows show the direction in which data flows, not sequential steps in the program. The sequential steps are represented by the colored circles. Note that all state in this system would be finite.

1. The user pre-configures their training data into the system. This consists of network input and expected network output pairs.
2. The user pre-configures the cost threshold, a guess for the lowest accumulated cost value.
3. The algorithm starts the iteration over training data pairs. The network input is fed into the neural network, along with the weights, which are represented in qubits. This produces a network output, which is also represented in qubits. (Each superposition of the network output should be entangled with a particular superposition of the weights.) A cost function then computes a cost (represented in qubits) based on the expected network output and the network output. An accumulator accumulates these costs over each iteration.
4. Once the iteration is finished, we compare the accumulated cost with the cost threshold. If it is less than the cost threshold, we display the weights that are entangled with the accumulated cost to the outside world. There may be multiple branches that pass the threshold, but it doesn't matter which one we output. From the outside world's perspective, the machine should have collapsed to a single set of weights that produce an accumulated cost that is less than the cost threshold. If nothing is displayed, it means no branch passed the cost threshold, so the user should start from step 2 with a higher cost threshold.
5. Repeat steps 2, 3, 4 until satisfied with the displayed weights and cost threshold.

My idea is that by setting up the system in this way, you could reduce the problem of finding weights to a linear guess-and-check process: start with a low cost threshold and gradually increment it one unit at a time until the machine starts displaying an output. At that point you should have found the optimal configuration of weights according to your cost function. Would this work in practice, and are there any glaring flaws with my design or understanding of quantum mechanics?
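For reference, the threshold loop in steps 2 to 5 can be sketched classically as a brute-force search. This is only a minimal sketch of the control flow, not a quantum implementation: the toy network, the weight set `WEIGHT_VALUES`, the training pairs, and the squared-error cost are all assumptions made for illustration. Each candidate weight tuple plays the role of one "branch".

```python
from itertools import product

# Hypothetical toy setup: a linear "network" with two weights taken
# from a small finite set, mirroring the finite state assumed above.
WEIGHT_VALUES = (-1, 0, 1)
TRAINING_DATA = [((1, 0), 1), ((0, 1), -1), ((1, 1), 0)]  # (input, expected output)

def network(weights, x):
    # Toy model: weighted sum of the inputs.
    return sum(w * xi for w, xi in zip(weights, x))

def accumulated_cost(weights):
    # Step 3: accumulate the squared error over every training pair.
    return sum((network(weights, x) - y) ** 2 for x, y in TRAINING_DATA)

def weights_below(threshold):
    # Step 4: every candidate whose accumulated cost is below the threshold.
    return [w for w in product(WEIGHT_VALUES, repeat=2)
            if accumulated_cost(w) < threshold]

def search(max_threshold=100):
    # Steps 2 and 5: raise the threshold one unit at a time until
    # at least one weight configuration passes.
    for threshold in range(1, max_threshold + 1):
        passing = weights_below(threshold)
        if passing:
            return threshold, passing[0]
    return None

result = search()
print(result)
```

The classical version has to evaluate every weight configuration explicitly, so its cost grows exponentially with the number of weights. The question is whether the superposition over weights lets the machine skip that enumeration.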

Welcome to the community! While the question is certainly detailed, I think there may not be enough math to effectively convey key differences in the architecture. I suggest comparing this approach to existing ML architectures and detailing more specifically how/why a QC is being used. – C. Kang – 2020-07-10T15:13:02.440

@C.Kang Yeah, I realize this isn't very formalized in terms of how quantum problems are usually posed. I was more wondering if my conceptual understanding of how entanglement and quantum branches / decoherence behave was accurate, but I can see how this could be insufficient to really answer that. I am not very familiar with the math, which is why I wrote this on a conceptual level. I might have to dig into the math and something called "quantum annealing". As far as measuring the spins of qubits for computation, I assume one would use the same plane of measurement everywhere in the machine. – Jerry Fielder – 2020-07-10T16:21:29.020

I will try to add additional information when I find the time over the next day or two. – Jerry Fielder – 2020-07-10T16:26:44.320

I'd recommend starting with an understanding of qubits / quantum algorithms. QC is really all math. – C. Kang – 2020-07-10T16:52:48.450

I don't really know much about ML or neural networks, but there is something to be said for abstracting a QC as a black box that, given a classical input, returns either a classical or a quantum output, if that is a useful thing to do. That said, if the maths is necessary, it's necessary. The bit that I'm not seeing here is this: how do you compare the accumulated cost (which appears to be some sort of quantum state) with the cost threshold (which appears to be a classical input)? – Mithrandir24601 – 2020-07-11T23:43:22.967

I don't quite understand how this scheme differs from a flowchart showing how regular (classical) NNs are trained (or how pretty much any optimisation algorithm works for that matter). What's quantum about this? – glS – 2020-07-13T15:44:37.863

@Mithrandir24601 In having the internal workings of this be a black box, I'm sort of relying on the idea that to the outside world, everything inside the box is in a superposition. This might not be possible to achieve in practice, but the idea is that once any branch sends information to the world outside the box, the wave function, from the outside world's perspective, will collapse to that branch. As the weights are qubits, there should be a branch for each possible permutation of weights, and so we have only the desirable branch collapse the wave function. – Jerry Fielder – 2020-07-13T16:10:26.257

@Mithrandir24601 To compare the accumulated cost with the threshold, the qubits would certainly have to be measured. My assumption is that this measuring does not collapse the wave function from the outside world's perspective, but that it would merely split the wave function into 2^n branches. The wave function is only collapsed from the outside world's perspective when one branch of the wave function inside the box sends information outside, entangling itself with the outside world and establishing itself as the "real" branch from our perspective. More or less Schrödinger's cat. – Jerry Fielder – 2020-07-13T16:13:09.940

@glS The quantum part comes from the fact that some of the information is stored in qubits (I wrote it as "qbits" in the diagram, should have been "qubits"), enabling the system to have superpositions. My idea was to encompass the system in a box to cut it off from the outside world so that the inside of the box could have superpositions, and so we could control the collapse of the wave function (relative to the outside world) by pre-instating a threshold inside the box, which only desirable branches of the wave function would surpass and send information outside. – Jerry Fielder – 2020-07-13T16:20:29.053

@JerryFielder Ah, I'm afraid the devil's in the details here - "My assumption is that this measuring does not collapse the wave function from the outside world's perspective, but that it would merely split the wave function into 2^n branches." is almost there but not quite - performing this 'internal measurement' would decohere the state into a classical probability distribution so that, on the outside, no-one would know what the result is, but the resulting state would nonetheless be classical – Mithrandir24601 – 2020-07-13T17:00:07.390
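The decoherence point in the comment above can be illustrated with a single qubit. This is a minimal sketch under stated assumptions: the qubit starts in the |+> state, the "internal measurement" is modelled as an unread measurement in the computational basis (which zeroes the off-diagonal density-matrix entries), and the helper names are hypothetical.

```python
import math

S = 1 / math.sqrt(2)
HADAMARD = [[S, S], [S, -S]]

def matmul(a, b):
    # Plain 2x2 (or general) matrix product on nested lists.
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def measure_unread(rho):
    # Measuring in the computational basis without reading the result
    # removes the off-diagonal (coherence) terms of the density matrix.
    return [[rho[i][j] if i == j else 0.0 for j in range(2)] for i in range(2)]

def prob_zero_after_hadamard(rho):
    # Apply H, then return the probability of outcome 0.
    # H is real and symmetric, so it equals its own conjugate transpose.
    rotated = matmul(matmul(HADAMARD, rho), HADAMARD)
    return rotated[0][0]

rho_plus = [[0.5, 0.5], [0.5, 0.5]]      # coherent superposition |+><+|
rho_dephased = measure_unread(rho_plus)  # classical 50/50 mixture

p_coherent = prob_zero_after_hadamard(rho_plus)
p_dephased = prob_zero_after_hadamard(rho_dephased)
print(p_coherent, p_dephased)
```

The coherent state interferes and gives outcome 0 with probability close to 1, while the dephased state gives 0.5. After the unread measurement the qubit behaves like a classical coin flip, even though nobody outside has seen the result.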

@Mithrandir24601 I think in that case, though, it would come down to the measurement problem. Like in the quantum erasure experiment, you can use the same apparatus to conduct the measurement, but if it's set up in a way that destroys the measurement information, it does not appear to cause decoherence. If the mechanism for decoherence were interactions between particles, then I think it could be plausible that a branch of the wave function inside the black box that sends no information outside would not decohere the state. But the black box would have to be a perfectly sealed system. – Jerry Fielder – 2020-07-13T18:36:49.173

@JerryFielder If the input of the NN is a quantum state, then you cannot simply apply the map corresponding to the NN to the vector representing the quantum state. NNs are inherently nonlinear functions, by design. Quantum mechanics only allows for linear mappings between input and output state vectors. Any nonlinearity requires probabilistic schemes, and if you go this direction you need to be very careful about the actual probabilities at all stages. This is not an easy problem to tackle. – glS – 2020-07-14T06:41:18.197
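The linearity constraint in the last comment can be checked numerically. This is a minimal sketch, assuming a Hadamard gate as the example quantum map and ReLU as the example NN activation; the helper names are hypothetical. A linear map satisfies f(a·x + b·y) = a·f(x) + b·f(y) for every choice of a, b, x, y.

```python
import math

S = 1 / math.sqrt(2)
HADAMARD = [[S, S], [S, -S]]  # example of a unitary (hence linear) map

def apply(matrix, vec):
    # Matrix-vector product on nested lists.
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]

def relu(vec):
    # Typical NN activation, applied element-wise.
    return [max(0.0, v) for v in vec]

def combine(a, x, b, y):
    # Linear combination a*x + b*y.
    return [a * xi + b * yi for xi, yi in zip(x, y)]

def close(u, v, tol=1e-12):
    return all(abs(ui - vi) <= tol for ui, vi in zip(u, v))

x, y = [1.0, 0.0], [0.0, 1.0]
a, b = S, -S  # amplitudes of a normalized superposition

# The gate acts on a superposition exactly as it acts on each term.
gate_is_linear = close(
    apply(HADAMARD, combine(a, x, b, y)),
    combine(a, apply(HADAMARD, x), b, apply(HADAMARD, y)),
)

# ReLU does not: it clips the negative amplitude of the combined vector.
relu_is_linear = close(
    relu(combine(a, x, b, y)),
    combine(a, relu(x), b, relu(y)),
)
print(gate_is_linear, relu_is_linear)
```

The gate passes the check and ReLU fails it, so an activation like ReLU cannot be applied directly to the amplitudes of a quantum state by a unitary circuit.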