Reinforcement learning with a quantum agent


Is it an open question whether reinforcement learning is possible when the quantum agent is not present in the environment, that is, when it does not contribute noise to the environment? In a classical environment, the agent seems to observe the universe while simultaneously contributing to it. Most papers in this field seem to analyze quantum speed-ups and quantum-inspired algorithms, but I am more interested in classical improvements to the architecture in general.

My aim is to understand whether there is a way to arrange an environment in which the agent is in a quantum state. The purpose is to separate the agent (actor-critic) from the system, for example through superposition. In other words, can we do learning with a passive agent?
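As a purely classical illustration of the distinction between an agent that contributes to the environment and one that only observes it, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the toy `ChainEnv` class, its `step` and `drift` methods, and the `run_active` and `run_passive` helpers are hypothetical names, and nothing here models a quantum agent or superposition.

```python
import random


class ChainEnv:
    """Hypothetical toy environment: the state is a position in [0, n - 1]."""

    def __init__(self, n=5, seed=0):
        self.n = n
        self.rng = random.Random(seed)
        self.state = 0

    def _clip(self, value):
        # Keep the position inside the chain.
        return max(0, min(self.n - 1, value))

    def drift(self):
        """The environment's own dynamics: a random step, independent of any agent."""
        self.state = self._clip(self.state + self.rng.choice([-1, 1]))
        return self.state

    def step(self, action):
        """Active interaction: the action (-1 or +1) directly changes the state."""
        self.state = self._clip(self.state + action)
        reward = 1.0 if self.state == self.n - 1 else 0.0
        return self.state, reward


def run_active(steps=10):
    """An agent that always moves right; the trajectory is caused by its actions."""
    env = ChainEnv()
    trajectory = []
    for _ in range(steps):
        state, _reward = env.step(+1)
        trajectory.append(state)
    return trajectory


def run_passive(steps=10, seed=0):
    """An observer that only records states; it never calls env.step."""
    env = ChainEnv(seed=seed)
    return [env.drift() for _ in range(steps)]
```

In `run_active` the observed trajectory is a consequence of the agent's own actions, whereas in `run_passive` the observer records a trajectory it had no part in producing. The question is whether some arrangement allows learning in the second situation.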

Jonas Kgomo

Posted 2019-05-27T20:42:24.957

Reputation: 185

Question was closed 2019-06-05T07:17:56.030


First of all, I would argue that pretty much everything is still an open question on this topic. That said, it is a bit unclear what you are asking. The notion of a "quantum agent" has been explored; see e.g. the references in Dunjko 1811.08676. But what exactly do you mean by "the agent is not present in the environment, that is, doesn't contribute noise to it"? If the agent doesn't interact with the environment, how can it be learning from it? Similarly, what does "arrange an environment" mean?

– glS – 2019-05-27T20:55:43.427

For example, in the double-slit experiment, using wave-particle duality, we can arrange such an environment for learning, such that if the intensity is I we see the light as a photon, and otherwise we observe the light as a wave. I think the idea of learning in absence is studied by Alp and Dunjko as "quantum computational learning theory". I am imagining that the agent can be encoded into the environment or seen as an object in superposition: if results are desired, remove its presence; if they are not, retain it.

– Jonas Kgomo – 2019-05-27T21:20:42.417

Again, when you say "such an environment" I don't know what you are referring to. Similarly, what does "learning in absence" mean? – glS – 2019-05-28T01:34:33.477

Mahadev published an article that argued for the idea of blind computing (learning in absence), using an interactive protocol, with a classical verifier and a quantum prover, that allows inference and verification that a computation has been done. "Such an environment" means the learning environment that is used in an RL mode; we define it by how it governs the behaviour of agents. The learning-with-errors property is used here.

– Jonas Kgomo – 2019-06-13T13:30:20.130

No answers