Tag: dqn

13 Why does DQN require two different networks? 2018-07-02T07:47:23.303

8 Huge action space size in Reinforcement Learning 2019-03-06T13:23:32.007

7 Why does reinforcement learning using a non-linear function approximator diverge when using strongly correlated data as input? 2020-01-29T08:47:11.317

5 What is the difference between DQN and AlphaGo Zero? 2019-02-27T06:17:02.450

5 Can DQN perform better than Double DQN? 2019-04-08T09:08:16.597

5 Why don't people use projected Bellman error with deep neural networks? 2019-04-12T05:02:52.887

5 Deciding on a reward for each action in a given state (Q-learning) 2019-05-12T00:15:10.860

5 Can exogenous variables be state features in reinforcement learning? 2019-08-25T07:12:44.720

5 What are some online courses for deep reinforcement learning? 2020-03-25T14:46:24.230

4 Ensure convergence of DDQN if true Q-values are very close 2018-10-24T18:29:00.283

4 Which kind of prioritized experience replay should I use? 2019-05-05T10:05:47.557

4 What could be causing the drastic performance drop of the DQN model on the Pong environment? 2019-05-31T20:09:58.620

4 Why does the DQN not converge when the start or goal states can change dynamically? 2019-07-01T12:11:40.230

4 How to represent players in a multi-agent environment so each model can distinguish its own player 2019-07-29T17:50:10.420

4 How can a DQN backpropagate its loss? 2020-01-14T17:51:38.797

4 What is the target Q-value in DQNs? 2020-04-19T03:25:51.150

4 Upper limit to the maximum cumulative reward in a deep reinforcement learning problem 2020-07-18T13:27:17.247

4 Why do DQNs tend to forget? 2020-07-27T11:51:00.447

4 My deep Q-learning network does not learn OpenAI Gym's CartPole problem 2020-08-11T15:42:37.790

3 DQN input representation for a card game 2018-07-06T11:57:23.427

3 Reason for issues with correlation in the dataset in DQN 2018-11-02T08:06:45.560

3 What is meant by a high-dimensional state in DQN? 2019-01-03T16:42:59.147

3 Why does each training run for my DDQN agent take 2 days and still end up with a -13 average score, when the OpenAI baseline DQN needs only an hour to converge to +18? 2019-01-30T11:39:25.787

3 My DQN is stuck and I can't see where the problem is 2019-02-22T20:55:03.887

3 What are the differences between the DQN variants? 2019-03-23T12:38:31.063

3 Why is Q2 a more or less independent estimate in Twin Delayed DDPG (TD3)? 2019-03-24T05:26:49.420

3 What can be considered a deep recurrent neural network? 2019-04-08T10:04:21.293

3 DQN Agent not learning anymore - what can I do to fix this? 2019-04-22T09:00:45.757

3 Experience Replay Not Always Giving Better Results 2019-04-29T15:30:20.570

3 How do I represent a multi-dimensional state using a neural network? 2019-05-16T06:16:22.003

3 Why do the authors track $\gamma_t$ in the Prioritized Experience Replay paper? 2019-05-31T02:47:46.293

3 Deep Q-Network (DQN) to learn the game 2048 2019-06-12T21:17:30.437

3 Should importance sample weighting be compensated for by dynamically increasing learning rate? 2019-08-26T10:36:57.950

3 Is a state that includes only the past n-step price records partially observable? 2019-08-29T11:49:35.990

3 How is the gradient of the loss function in DQN derived? 2019-09-07T14:18:13.677

3 How important is the choice of the initial state? 2019-09-11T12:16:29.353

3 What is the difference between random and sequential sampling from the replay memory? 2019-09-19T13:10:13.547

3 DQN: how to choose the reward function? 2019-12-09T16:47:19.020

3 Can experience replay be used for training after completing every single epoch? 2020-03-06T06:58:21.853

3 How does the optimization process in hindsight experience replay exactly work? 2020-03-12T10:19:55.543

3 Representation of state space, action space and reward system for Reinforcement Learning problem 2020-03-22T20:22:34.407

3 Are Q values estimated from a DQN different from a duelling DQN with the same number of layers and filters? 2020-04-13T03:46:47.127

3 If the agent chooses an action that the environment can't execute, how should I handle this situation? 2020-05-19T03:39:50.190

3 How would researchers determine the best deep learning model if every run of the code yields different results? 2020-05-24T02:03:49.680

3 What's the right way of building a deep Q-network? 2020-06-05T17:30:33.073

3 How to take actions in each episode and within each step of the episode in deep Q-learning? 2020-06-05T20:32:22.853

3 How to know if my DQN is optimized? 2020-06-20T05:38:04.100

3 What happens when you select actions using softmax instead of epsilon greedy in DQN? 2020-06-23T16:47:51.683

3 How to handle the final state in experience replay? 2020-06-24T02:59:18.853

3 How should I choose the target's update frequency in DQN? 2020-08-17T17:16:59.297

3 Is there a logical method of deducing an optimal batch size when training a Deep Q-learning agent with experience replay? 2020-08-25T22:16:33.010

2 Why use semi-gradient instead of full gradient in RL problems, when using function approximation? 2018-04-24T23:11:25.637

2 Convergence in multi-agent environment 2018-07-08T12:46:01.443

2 Can a DQN announce that it has things in its hand in a card game? 2018-07-08T19:09:46.223

2 Is a Deep Q-Network (DQN) applicable only with images as inputs? 2018-07-13T16:26:06.620

2 Can the opponent's turn affect the reward for a DQN agent action? 2018-07-15T11:44:17.030

2 In DQN, is it better to update the target network every N steps or to update it slowly every step? 2019-02-28T03:56:42.890

2 Why do DQNs use linear activations on cartpole? 2019-04-08T04:40:34.507

2 Why use experience replay memory in DQN instead of an RNN memory? 2019-04-22T12:00:35.690

2 How to stop DQN Q function from increasing during learning? 2019-04-24T14:15:02.803

2 New transition priorities in Prioritized Experience Replay? 2019-06-01T02:26:03.443

2 Gym Dict space as Keras DQN agent input 2019-06-08T16:27:22.713

2 Deep Reinforcement Learning: Rewards suddenly dip down 2019-06-18T19:45:18.273

2 Will the target network, which is less trained than the normal network, output inferior estimates? 2019-07-20T03:17:34.007

2 Reinforcement learning: How to deal with illegal actions? 2019-08-14T11:20:23.583

2 What could be the cause of the drop of the total reward when using DQN to solve the cart-pole environment? 2019-11-21T20:12:56.263

2 NoisyNet DQN with default parameters not exploring 2020-01-14T11:14:03.630

2 How is the expected value in the loss function of DQN approximated? 2020-02-27T21:41:46.513

2 Intuitive explanation of why experience replay is used in a Deep Q-Network 2020-03-01T19:12:16.580

2 How to correctly implement self-play with DQN? 2020-03-17T12:49:49.053

2 Why is my DQN model not getting better? 2020-03-26T15:22:05.283

2 How much time does it take to train DQN on Atari environment? 2020-04-01T09:23:18.910

2 How was the DQN trained to play many games? 2020-04-04T13:54:05.900

2 What should the target be when the neural network outputs multiple Q values in deep Q-learning? 2020-05-04T02:41:33.243

2 How to evaluate a Deep Q-Network 2020-05-15T11:48:43.013

2 Does the concept of validation loss apply to training deep Q networks? 2020-05-18T12:54:58.817

2 Are the final states not being updated in this $n$-step Q-Learning algorithm? 2020-06-02T14:10:10.190

2 How to convert sequences of images into state in DQN? 2020-06-06T14:32:49.413

2 If the minimum Q value is decreasing and the maximum Q value increasing, is this a sign that dueling double DQN is diverging? 2020-06-07T16:24:40.417

2 How to handle changing goals in a DQN? 2020-06-23T10:59:33.633

2 Should illegal moves be excluded from loss calculation in DQN algorithm? 2020-06-27T19:02:10.683

2 Should the agent play the game until the end or until the winner is found? 2020-06-29T16:33:07.773

2 Why does shifting all the rewards have a different impact on the performance of the agent? 2020-07-01T01:57:16.453

2 Why do some DQN implementations not require random exploration but instead emulate all actions? 2020-07-05T09:25:12.933

2 Prioritised Remembering in Experience Replay (Q-Learning) 2020-07-17T07:09:59.120

2 How does the target network in double DQNs find the maximum Q* value for each action? 2020-07-21T14:20:27.200

2 Reinforcement learning with action consisting of two discrete values 2020-07-26T15:43:39.957

2 When using experience replay in reinforcement learning, which state is used for training? 2020-08-12T12:53:08.957

2 How to compute the target for double Q-learning update step? 2020-08-13T14:24:47.487

2 How should I compute the target for updating in a DQN at the terminal state if I have pseudo-episodes? 2020-08-19T06:53:02.550

2 Combine DQN with the Average Reward setting 2020-08-22T23:36:52.027

1 Deep Q-Network concepts and implementation 2018-06-20T15:16:52.060

1 Using a DQN with a variable number of valid moves per turn for a board game 2018-10-03T23:11:36.607

1 DQN Breakout: adding an extra negative reward to help training? 2018-11-21T13:33:32.310

1 Exploration rate decay and training in Q learning 2018-11-28T05:03:20.983

1 DQN exploration strategy for large grid-world environment 2018-12-05T19:58:07.433

1 OpenAI Gym: excess of actions 2019-03-13T21:17:24.320

1 Comparison and understanding of different versions of DDQN 2019-03-14T12:52:07.787

1 Do we need to reset the DQN network after every episode? 2019-03-26T13:14:18.797