Why is it hard to prove the convergence of the deep Q-learning algorithm?


Why is it hard to prove the convergence of the DQN algorithm? We know that the tabular Q-learning algorithm converges to the optimal Q-values, and convergence has also been proved for Q-learning with a linear function approximator.

The main differences between DQN and Q-learning with a linear approximator are the use of a deep neural network, the experience replay memory, and the target network. Which of these components causes the issue, and why?
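For concreteness, the update rule the question is about can be sketched as below. This is a toy, hypothetical setup (all names and dimensions are made up) that uses a *linear* Q-function so the semi-gradient step is explicit; in DQN proper, `w` would be the weights of a deep network and the gradient step would go through backpropagation, but the three ingredients in question — the function approximator, the replay memory, and the frozen target copy — appear in the same roles.

```python
import random
from collections import deque

import numpy as np

# Hypothetical toy setup: a linear Q-function Q(s, a) = w[a] @ s,
# so the DQN-style update can be written out in a few lines.
rng = np.random.default_rng(0)

n_features, n_actions = 4, 2
w = rng.normal(size=(n_actions, n_features))   # online parameters
w_target = w.copy()                            # target-network copy (frozen)
replay = deque(maxlen=1000)                    # experience replay memory
gamma, alpha = 0.99, 0.01

def q(weights, s):
    """Q-values for all actions in state s under the given parameters."""
    return weights @ s

# Collect random transitions (s, a, r, s') into the replay memory.
for _ in range(200):
    s = rng.normal(size=n_features)
    a = int(rng.integers(n_actions))
    r = float(rng.normal())
    s2 = rng.normal(size=n_features)
    replay.append((s, a, r, s2))

# One DQN-style update from a sampled minibatch: the bootstrap target
# is computed from the *frozen* copy w_target, not from w itself.
batch = random.Random(0).sample(list(replay), 32)
for s, a, r, s2 in batch:
    target = r + gamma * np.max(q(w_target, s2))   # bootstrapped target
    td_error = target - q(w, s)[a]
    w[a] += alpha * td_error * s                   # semi-gradient step

# Periodically the target network is synced with the online parameters,
# which moves the regression targets for all subsequent updates.
w_target = w.copy()
```

The sync in the last line is exactly the point where the otherwise-fixed regression problem changes: between syncs the targets are stationary, but each sync shifts them, which is one of the things a convergence proof would have to handle.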

Afshin Oroojlooy

Posted 2020-05-10T16:01:31.993

Reputation: 143

Thanks for the link. I carefully read the post; it does not actually answer my question. – Afshin Oroojlooy – 2020-05-10T23:25:53.340

No answers