I'm captivated by autoencoders and really like the idea of convolution. It seems, though, that both Theano and TensorFlow only support conv2d for going from a batch of 2D RGB images (n 3D arrays) to a batch of higher-depth images. That makes sense from the traditional tensor-product math, c_ijlm = sum_k (a_ijk * b_klm), but it means it's hard to 'de-convolve' an image.

In both cases, if I have an image (in #batch, depth, height, width form), I can apply a convolution to get (#batch, num_filters, height/k, width/k), as in the sketch below. I'd really like to do the opposite: go from (#batch, some_items, height/k, width/k) back to (#batch, depth, height, width).
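For concreteness, here is a minimal sketch of that forward shape change, using the current TensorFlow API (names have moved since the 0.x releases, and I use the NHWC layout here rather than the NCHW layout above; all sizes are illustrative):

```python
import numpy as np
import tensorflow as tf

k = 2                                                        # spatial downscale factor
x = tf.constant(np.random.rand(8, 32, 32, 3), tf.float32)    # (#batch, height, width, depth)
w = tf.random.normal([3, 3, 3, 16])                          # (kh, kw, depth, num_filters)
y = tf.nn.conv2d(x, w, strides=[1, k, k, 1], padding='SAME')
print(y.shape)                                               # (8, 16, 16, 16) = (#batch, h/k, w/k, num_filters)
```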

TensorFlow had a hidden deconv2d function for a while (in 0.6, I think, undocumented), but I'd like to know if there's a math trick I can use to make a convolution's output larger than its input in the last two dimensions. I'd settle for a series of differentiable operations, like conv -> resize, but I want to avoid the dense matrix multiplication -> resize I've been doing so far.
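For context, later TensorFlow releases expose that hidden op publicly as tf.nn.conv2d_transpose. A minimal sketch (current API names, NHWC layout, illustrative sizes):

```python
import numpy as np
import tensorflow as tf

k = 2
y = tf.constant(np.random.rand(8, 16, 16, 16), tf.float32)   # (#batch, h/k, w/k, some_items)
w = tf.random.normal([3, 3, 3, 16])                          # (kh, kw, out_depth, in_depth)
x = tf.nn.conv2d_transpose(y, w, output_shape=[8, 32, 32, 3],
                           strides=[1, k, k, 1], padding='SAME')
print(x.shape)                                               # (8, 32, 32, 3) = (#batch, height, width, depth)
```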

EDIT: As of today (2016/02/17) TensorFlow 0.7 has the tf.depth_to_space method, which helps greatly in this endeavor (https://www.tensorflow.org/api_docs/python/tf/depth_to_space). I would still love a Theano-based solution, too, to complete my understanding of the material.
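To spell out the recipe that depth_to_space enables (sometimes called subpixel convolution): convolve up to depth * k*k channels at the small resolution, then shuffle those channels into space. A hedged sketch with the current API names (tf.nn.depth_to_space in recent releases) and illustrative sizes:

```python
import numpy as np
import tensorflow as tf

k = 2
y = tf.constant(np.random.rand(8, 16, 16, 16), tf.float32)   # (#batch, h/k, w/k, some_items)
w = tf.random.normal([3, 3, 16, 3 * k * k])                  # convolve up to depth * k*k channels
z = tf.nn.conv2d(y, w, strides=[1, 1, 1, 1], padding='SAME') # (8, 16, 16, 12)
x = tf.nn.depth_to_space(z, block_size=k)                    # (8, 32, 32, 3): channels -> space
print(x.shape)
```

Both steps are differentiable, so the whole pipeline trains end to end.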

Alas, I've seen this link. Thank you, though. The reason it didn't fit for me was that it relied on tiling the output for a 2x upscaling. I could perhaps do as you suggested and produce a feature vector much larger than the convolution output, then resize, but it'll take me a while to figure out how that reshape changes the flow of data. – Joseph Catrambone – 2016-03-21T23:20:41.257
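On the "how does that reshape change the flow of data" point: the depth_to_space shuffle can be written in Theano as reshape -> dimshuffle -> reshape, which makes the data movement explicit. A hedged sketch (the helper name and the explicit shape arguments are mine; b, c, h, w are Python ints):

```python
import theano.tensor as T

def depth_to_space(x, k, b, c, h, w):
    # x: (b, c * k * k, h, w) in Theano's NCHW layout.
    y = x.reshape((b, c, k, k, h, w))        # split channels into (c, k, k)
    y = y.dimshuffle(0, 1, 4, 2, 5, 3)       # -> (b, c, h, k, w, k)
    return y.reshape((b, c, h * k, w * k))   # interleave the k-blocks spatially

# e.g. upscale a (8, 12, 16, 16) tensor to (8, 3, 32, 32):
x = T.tensor4('x')
up = depth_to_space(x, 2, 8, 3, 16, 16)
```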