I'm currently working on a college project in which I'm designing a Deep Q-Network that takes images/frames as an input.
I've been searching online to see how other people have designed their convolutional stage and I've seen many different implementations.
Some projects, such as DeepMind's Atari 2600 project, use 3 convolutional layers and no pooling (from what I can see).
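To check my understanding of that architecture, I worked out the output sizes of the Nature-paper conv stack (84x84 input; kernel/stride pairs 8/4, 4/2, 3/1 as I read them from the paper) with a small sketch, where `conv_out` is just my own helper for the "valid" convolution size formula:

```python
def conv_out(size, kernel, stride):
    # Spatial output size of a "valid" (no padding) convolution.
    return (size - kernel) // stride + 1

# DeepMind's stack as I understand it: striding does the downsampling,
# so no pooling layers are needed.
size = 84
for kernel, stride in [(8, 4), (4, 2), (3, 1)]:
    size = conv_out(size, kernel, stride)
    print(size)  # 20, then 9, then 7
```

So the feature map still shrinks from 84x84 down to 7x7 across the three layers; the strides seem to play the downsampling role that pooling plays elsewhere.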
However, other projects use fewer convolutional layers and add a pooling layer onto the end.
I understand what both layers do; I was just wondering: is there a benefit to DeepMind's approach of skipping pooling, or should I be using a pooling layer with fewer convolutional layers?
Or have I completely missed something? Is DeepMind actually using pooling after each convolutional layer?