Why do the inputs and outputs of a convolutional layer usually have the same depth?

2

Here's the famous VGG-16 model.

[VGG-16 architecture diagram]

Do the inputs and outputs of a convolutional layer, before pooling, usually have the same depth? What's the reason for that?

Is there a theory or paper trying to explain this kind of setting?
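For reference, here is a minimal sketch (assuming the usual 64-64 / 128-128 / 256-256-256 / 512-512-512 / 512-512-512 layout of VGG-16) that prints where the depth stays the same and where it changes:

```python
# VGG-16 convolutional configuration; 'M' marks a max-pooling layer.
cfg = [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 'M',
       512, 512, 512, 'M', 512, 512, 512, 'M']

in_depth = 3  # RGB input
for v in cfg:
    if v == 'M':
        print('maxpool (depth unchanged)')
    else:
        note = 'same depth' if v == in_depth else 'depth changes'
        print(f'conv: {in_depth:3d} -> {v:3d}  ({note})')
        in_depth = v
```

Within each stage the conv layers keep the same depth; the depth only changes (doubles) at the first conv after each pooling layer.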

Captain Tomato

Posted 2019-06-10T09:08:59.117

Reputation: 21

I edited this post in order to save it. It wasn't clear what you meant by "input/output channels". To answer what I think is your question: no, the depth of the inputs and outputs of a convolutional layer is not typically the same. – nbro – 2020-07-04T20:32:31.787

Answers

0

Keeping the same channel size allows the model to maintain rank, but I would say the main reason is convenience: it's easier bookkeeping.

Also, in many models the output features need some form of alignment with the input (for example, any model using residual units: $\hat{x} = F(x) + x$).
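To illustrate the alignment point, here is a minimal PyTorch sketch (the block structure and channel count are illustrative, not taken from any specific paper): the element-wise addition $F(x) + x$ only type-checks if the convolutions inside $F$ keep the input depth unchanged.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Basic residual unit: x_hat = F(x) + x.
    The skip connection requires F(x) and x to have the same depth
    (and spatial size), so both convolutions keep in_channels == out_channels."""
    def __init__(self, channels: int):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + x)  # element-wise add needs matching shapes

x = torch.randn(1, 64, 56, 56)       # (batch, depth, height, width)
y = ResidualBlock(64)(x)
print(y.shape)                        # torch.Size([1, 64, 56, 56])
```

If the depth did change inside the block, the skip connection would need an extra projection (e.g. a 1x1 convolution) just to make the addition possible, which is the bookkeeping overhead mentioned above.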

mshlis

Posted 2019-06-10T09:08:59.117

Reputation: 1 845

What do you mean by "rank" in this context? The question is also unclear, because the depth of the input and output volumes is usually different. – nbro – 2019-06-10T17:11:21.153

Also a good point about alignment, because GPU and CPU vector ops need this. – user8426627 – 2019-06-10T17:54:47.610