Can I shuffle image channel data as a form of data augmentation?


If I want to augment my dataset, is shuffling or permuting the channels (RGB) of an image a sensible augmentation for training a CNN? IIRC, a convolution kernel slides over patches of the image but keeps the channels in a fixed order.

For example, the kernel has $k \times k$ weights for each channel. Each output value is obtained by multiplying the weights with the corresponding pixel values and summing the products over all channels, which gives one new pixel in the next feature map.
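To make the computation concrete, here is a minimal sketch of one output pixel, assuming the standard convolution layer in which the products from all channels are accumulated into a single value plus a bias. The function name `conv_pixel` and the toy patch and kernel values are made up for illustration, and plain nested lists are used only to keep the sketch dependency-free.

```python
def conv_pixel(patch, kernel, bias=0.0):
    """One output value of a multi-channel convolution.

    patch and kernel are nested lists indexed [channel][row][col]
    and must have the same shape (C x k x k).
    """
    total = bias
    for patch_c, kernel_c in zip(patch, kernel):
        for patch_row, kernel_row in zip(patch_c, kernel_c):
            for pixel, weight in zip(patch_row, kernel_row):
                total += pixel * weight
    return total


# Hypothetical 2x2 patch with three channels in R, G, B order.
patch = [
    [[1, 1], [1, 1]],  # R
    [[2, 2], [2, 2]],  # G
    [[3, 3], [3, 3]],  # B
]

# Hypothetical kernel that only has weights on the first input channel.
kernel = [
    [[1, 1], [1, 1]],
    [[0, 0], [0, 0]],
    [[0, 0], [0, 0]],
]

rgb_out = conv_pixel(patch, kernel)      # 4.0: the kernel sees the R values
bgr_patch = patch[::-1]                  # same patch, channels reversed to B, G, R
bgr_out = conv_pixel(bgr_patch, kernel)  # 12.0: the same weights now see the B values
```

The same weights give a different response once the channels are reordered, because each slice of the kernel is tied to a channel position rather than to a colour.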

In this case, if we shuffle the channels of the image (GBR, BGR, RBG, GRB, etc.), a CNN trained only on the RGB ordering would do poorly on such images. Does that mean shuffling the channels is not a sensible form of data augmentation? Or will this have a regularizing effect on the CNN model?
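For reference, the augmentation I have in mind could be sketched roughly as follows. This is only an illustration, assuming the image is stored as a list of channels (C x H x W); `shuffle_channels` is a hypothetical helper, not a function from any particular library.

```python
import random


def shuffle_channels(image, rng=random):
    """Return the channels of `image` in a random order, plus the order used.

    `image` is a list of channels, each a 2D list of pixel values.
    The channel data itself is not copied, only rearranged.
    """
    order = list(range(len(image)))
    rng.shuffle(order)
    return [image[c] for c in order], order


# Hypothetical 1x2 image with channels in R, G, B order.
image = [
    [[10, 11]],  # R
    [[20, 21]],  # G
    [[30, 31]],  # B
]

rng = random.Random(0)  # seeded generator so the augmentation is reproducible
shuffled, order = shuffle_channels(image, rng)
```

Returning `order` alongside the shuffled image makes it possible to log or undo the permutation.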

Syafiq Kamarul Azman

Posted 2019-12-04T10:26:07.680

Reputation: 123



As a rule of thumb for image data augmentation, look at the augmented images:

  • Can you correctly classify or measure your target label from the augmented images?

  • Could something similar to the augmented images appear in the environment where you want to run inferences on previously unseen inputs?

For your suggested augmentation of shuffling the channels, it may pass the first test. However, the second test suggests that you are probably taking a step too far: images with permuted colour channels are unlikely to appear where you run inference.

"Will this have a regularizing effect on the CNN model?"

Yes, but it might not be that useful to have strong cross-channel regularisation.

If there is important information for your task in the separate colour channels, then shuffling the channels makes it harder for the neural network to use it. It is not impossible: the CNN can still learn filters that respond most strongly to features that, in your problem, tend to appear in the red channel and not the blue, for instance.

If there is no important information for your task in the colour channels, then you may find it simpler to convert your images to single-channel greyscale and use that throughout. Although that is not exactly the same thing, for many image types it will achieve a similar effect (and a possible boost to accuracy) for a fraction of the effort.
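A minimal sketch of such a greyscale conversion, assuming the image is a list of three channels in R, G, B order and using the ITU-R BT.601 luma weights as one common choice of weighting (a plain average of the channels would also work). `to_greyscale` is a hypothetical helper written for illustration.

```python
def to_greyscale(image, weights=(0.299, 0.587, 0.114)):
    """Collapse an R, G, B image (3 x H x W nested lists) to one H x W channel.

    The default weights are the ITU-R BT.601 luma coefficients; this is an
    assumption made for the sketch, not a requirement of the approach.
    """
    red, green, blue = image
    w_r, w_g, w_b = weights
    return [
        [w_r * r + w_g * g + w_b * b for r, g, b in zip(row_r, row_g, row_b)]
        for row_r, row_g, row_b in zip(red, green, blue)
    ]


# Hypothetical 1x2 image: a white pixel followed by a pure red pixel.
image = [
    [[255, 255]],  # R
    [[255, 0]],    # G
    [[255, 0]],    # B
]

grey = to_greyscale(image)
```

Unlike channel shuffling, this removes the colour information once, before training, so the network never has to learn to ignore it.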

Neil Slater

Posted 2019-12-04T10:26:07.680

Reputation: 14 632

Understood, I guess the no free lunch theorem applies. I used RGB images to get the point across. On a more general level, if you had layers of an image (say, cross-sections of an MRI) and these layers were input in no particular order, it would be useful to know whether the CNN is "layer-order-invariant". – Syafiq Kamarul Azman – 2019-12-09T16:23:10.857