I have a small sub-question to this question.

I understand that when back-propagating through a max pooling layer, the gradient is routed back so that the neuron in the previous layer which was selected as the max gets all of the gradient. What I'm not 100% sure of is how the gradient from the next layer gets routed back to the pooling layer.
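To make the recap above concrete, here is a minimal sketch (with made-up numbers, assuming a 1-D pool of width 2 and stride 2) of the routing I described: only the position that held the max receives the upstream gradient, every other position gets zero.

```python
import numpy as np

x = np.array([1.0, 5.0, 3.0, 2.0])   # pooling-layer input
grad_out = np.array([0.7, 0.4])      # upstream gradient, one value per pool window

grad_in = np.zeros_like(x)
for w in range(len(grad_out)):
    window = x[2 * w : 2 * w + 2]
    # route the whole upstream gradient to the argmax of the window
    grad_in[2 * w + np.argmax(window)] = grad_out[w]

print(grad_in)  # [0.  0.7 0.4 0. ]
```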

So, the first question: suppose I have a pooling layer connected to a fully connected layer, like in the image below.

When computing the gradient for the cyan "neuron" of the pooling layer, do I sum all the gradients from the FC layer neurons? If this is correct, does every "neuron" of the pooling layer then have the same gradient?

For example, if the first neuron of the FC layer has a gradient of 2, the second a gradient of 3, and the third a gradient of 6, what are the gradients of the blue and purple "neurons" in the pooling layer, and why?
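To show what I mean, here is my current understanding of the FC case as code. The weights here are hypothetical (the FC gradients 2, 3 and 6 are the ones from my example); by the chain rule, each pooling neuron would receive the sum over all FC neurons of (FC gradient × connecting weight), i.e. a weighted sum rather than a plain sum, so the pooling neurons would generally get different gradients. Is this right?

```python
import numpy as np

# W[i, j] is the (hypothetical) weight from pooling neuron j to FC neuron i:
# 2 pooling "neurons" feeding 3 FC neurons.
W = np.array([[0.1, 0.5],
              [0.2, 0.3],
              [0.4, 0.1]])
grad_fc = np.array([2.0, 3.0, 6.0])  # FC-layer gradients from the example

# Chain rule: gradient at the pooling outputs is the weighted sum W^T @ grad_fc.
grad_pool = W.T @ grad_fc
print(grad_pool)  # [3.2 2.5] -- each pooling neuron gets a different value
```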

And the second question: what about when the pooling layer is connected to another convolution layer? How do I compute the gradient then? See the example below.

For the top-right "neuron" of the pooling layer (the outlined green one), I just take the gradient of the purple neuron in the next conv layer and route it back, right?

How about the filled green one? Do I need to multiply together the gradients of the first column of neurons in the next layer because of the chain rule, or do I need to add them?
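Here is the conv case as I currently picture it, in code rather than equations (a 1-D illustration with hypothetical sizes: 4 pooling-layer values feeding a convolution with a filter of width 2, giving 3 conv outputs). My guess is that each input position *adds up* (does not multiply) the gradients of every conv output that used it, each weighted by the filter weight that connected them. Please correct me if this is wrong.

```python
import numpy as np

filt = np.array([0.5, -1.0])           # hypothetical filter of width 2
grad_conv = np.array([1.0, 2.0, 3.0])  # upstream gradients at the 3 conv outputs

grad_pool = np.zeros(4)
for o in range(len(grad_conv)):        # conv output o used inputs o and o+1
    for k in range(len(filt)):
        # accumulate (sum!) each weighted contribution into the input position
        grad_pool[o + k] += grad_conv[o] * filt[k]

print(grad_pool)  # [ 0.5  0.  -0.5 -3. ]
```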

Please do not just post a bunch of equations and tell me that the answer is in there. I've been trying to wrap my head around the equations and I still don't understand them perfectly, which is why I'm asking this question in a simple way.

With regard to your first question: backpropagation is about seeing which weights and inputs influence your loss, and in what way. In the case of max pooling, only the max of the neurons influences the output (except when there is a tie), so you only propagate the error to the neuron that had the maximum activation value. – Jan van der Vegt – 2016-08-21T10:53:42.987

Yes, I understand this, and I also said so in the recap at the beginning of my post. But I don't understand how to "combine" the gradients of the next layer's neurons to propagate back. Hope you know what I mean. – Majster – 2016-08-21T13:03:44.353