Can DQN perform better than Double DQN?


I'm training both kinds of agents against an environment, but DQN performs significantly better than Double DQN. From what I've seen here, Double DQN usually performs better than DQN. Am I doing something wrong, or is this possible?


That can happen when the value of the state is bad. The link below has an example and an explanation of this.

See this:


What do you mean by "the value of the state is bad"? As I'm using an OpenAI-Gym environment, the value of the state is just the observation that I'm getting from it. – Angelo – 2019-04-09T07:53:45.827

@Angelo You can read the blog above to understand the answer. You compute a value for every action in a given state, but those actions may not affect the environment in any relevant way. – i_th – 2019-04-09T08:11:27.903

@Angelo In most states, the choice of action has no effect. – i_th – 2019-04-09T08:37:43.857


There is no thorough proof, theoretical or experimental, that Double DQN is better than vanilla DQN. There are many different tasks, and the paper and later experiments explore only some of them. What a practitioner can take from this is that DDQN is better on some tasks. That's the essence of DeepMind's "Rainbow" approach: drop a lot of different methods into one bucket and take the best results.
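
For reference, the two methods differ only in how the bootstrap target is built, so it is worth checking that part of your implementation first. Below is a minimal NumPy sketch, not your code or any library's API: the function names, array shapes, and toy Q-values are all made up for illustration. Vanilla DQN takes the max over the target network's Q-values, while Double DQN picks the action with the online network and evaluates it with the target network.

```python
import numpy as np

def dqn_target(reward, done, q_next_target, gamma=0.99):
    """Vanilla DQN: the target network both selects and evaluates the next action."""
    max_q = q_next_target.max(axis=1)
    return reward + gamma * (1.0 - done) * max_q

def double_dqn_target(reward, done, q_next_online, q_next_target, gamma=0.99):
    """Double DQN: the online network selects the action, the target network evaluates it."""
    best_action = q_next_online.argmax(axis=1)
    q_eval = q_next_target[np.arange(len(best_action)), best_action]
    return reward + gamma * (1.0 - done) * q_eval

# Toy batch of two transitions with two actions each (values are illustrative only).
reward = np.array([1.0, 0.0])
done = np.array([0.0, 0.0])
q_next_online = np.array([[1.0, 2.0], [0.5, 0.1]])  # hypothetical online-network outputs for s'
q_next_target = np.array([[3.0, 1.5], [0.2, 0.4]])  # hypothetical target-network outputs for s'

y_dqn = dqn_target(reward, done, q_next_target, gamma=0.9)
y_ddqn = double_dqn_target(reward, done, q_next_online, q_next_target, gamma=0.9)
print(y_dqn, y_ddqn)
```

With the same target network, the Double DQN target can never exceed the DQN target, because it evaluates one particular action instead of taking the maximum. That reduces overestimation, but nothing in it guarantees a better final policy on every environment.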


