FFT Convolution with Pixel Stride and Padding


I want to build a CNN from scratch using only NumPy in Python, and I am going to test it on CIFAR-10. I have used plain NumPy with the im2col trick, but the computation is slow: ~15 sec per minibatch (64 images per minibatch), or roughly 4-5 hours for one epoch (1 epoch = 784 minibatches). I have also tried plain TensorFlow operations (without the native conv2d), but couldn't write the im2col and col2im tricks.

So now I want to use the FFT. To speed things up, I decided to use FFT convolution between a 4D image batch with shape [batch_size, width, height, channels] and 4D filters with shape [filter_width, filter_height, in_channel, out_channel], and then add a 1D bias with shape [out_channel, ]. As I understand it, before applying FFT_Conv2D to the 4D data I should pad the image and then work out the output shapes. Then I need to compute FFT_Conv2D(image, filter, pad, stride) => IFFT(FFT(image)*FFT(filter)). But this formula is incomplete for my case, and I am now facing some questions about the FFT:

  1. Is there any full formula of FFT-Convolution between 4D data?
  2. How to use Convolution stride(also called pixel stride) with FFT?
  3. How to back-propagate with all this?
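
For reference, a minimal NumPy sketch of what such a forward pass could look like. It assumes the two spatial axes of the image are axes 1 and 2, symmetric zero padding, and the CNN convention of cross-correlation (hence the complex conjugate on the filter spectrum, which the plain IFFT(FFT(image)*FFT(filter)) formula does not have). The names `fft_conv2d` and `direct_conv2d` are hypothetical, and stride is handled simply by computing the stride-1 result and subsampling it.

```python
import numpy as np

def fft_conv2d(x, w, b, pad=0, stride=1):
    """Sketch. x: [batch, h, w, in_ch], w: [kh, kw, in_ch, out_ch], b: [out_ch]."""
    kh, kw = w.shape[0], w.shape[1]
    # Zero-pad the two spatial axes only.
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad), (0, 0)), mode="constant")
    H, W = xp.shape[1], xp.shape[2]
    # Real FFT over the spatial axes; the filter is zero-padded to H x W.
    Xf = np.fft.rfft2(xp, axes=(1, 2))            # [B, H, W//2+1, Cin]
    Wf = np.fft.rfft2(w, s=(H, W), axes=(0, 1))   # [H, W//2+1, Cin, Cout]
    # Conjugating the filter spectrum gives cross-correlation;
    # the einsum also sums over the input channels.
    Yf = np.einsum("bhwc,hwco->bhwo", Xf, np.conj(Wf))
    y = np.fft.irfft2(Yf, s=(H, W), axes=(1, 2))  # circular result, [B, H, W, Cout]
    # Keep the positions without wrap-around, then subsample by the stride.
    y = y[:, : H - kh + 1 : stride, : W - kw + 1 : stride, :]
    return y + b

def direct_conv2d(x, w, b, pad=0, stride=1):
    """Naive sliding-window reference, used only to check the FFT version."""
    kh, kw, _, cout = w.shape
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad), (0, 0)), mode="constant")
    H, W = xp.shape[1], xp.shape[2]
    oh, ow = (H - kh) // stride + 1, (W - kw) // stride + 1
    y = np.zeros((x.shape[0], oh, ow, cout))
    for i in range(oh):
        for j in range(ow):
            patch = xp[:, i * stride : i * stride + kh, j * stride : j * stride + kw, :]
            y[:, i, j, :] = np.tensordot(patch, w, axes=([1, 2, 3], [0, 1, 2]))
    return y + b

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 8, 8, 3))
w = rng.standard_normal((3, 3, 3, 4))
b = rng.standard_normal(4)
out = fft_conv2d(x, w, b, pad=1, stride=2)
print(out.shape)  # (2, 4, 4, 4)
print(np.allclose(out, direct_conv2d(x, w, b, pad=1, stride=2)))
```

Subsampling after the inverse FFT means a stride of s still pays for the full stride-1 result, so this approach saves nothing from striding. For small filters on 32x32 CIFAR-10 images it may also not beat a well-vectorized im2col.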


Posted 2020-09-02T19:17:07.653

Hi. Can you please put your main question in the title? If you have multiple questions, this is a good sign that you should split this post into multiple ones, one for each question. – nbro – 2020-09-02T23:09:15.400

No answers