What is an intuitive explanation for the Importance Weighted Autoencoder?



I have been reading the paper by Burda et al. on Importance Weighted Autoencoders (IWAE), but I can't quite grasp what they mean by sampling the terms h1, ..., hk. Do they mean that you have separate models, sample from each of them, and then average over the results?


Please post a bit more from the paper explaining what those terms are. – Francesco Pegoraro – 2018-10-03T12:58:35.217

