Other Deep Learning Networks for Visual Place Recognition?



I am doing a project on visual place recognition in changing environments. Most of the work I have seen uses AlexNet, with a feature vector constructed from layer 3. Does anyone know of similar work using other CNNs, e.g. VGGnet (which I am trying to use), and which layers the features are taken from?

I have been trying out the different layers of VGGnet-16. To find the database image closest to the query image, I compare their feature vectors using the cosine distance. So far I have not obtained good results.
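For reference, the retrieval step described above can be sketched as follows. This is a minimal illustration, not the code actually used in the project: the function names `l2_normalize` and `nearest_by_cosine` are hypothetical, and random vectors stand in for the CNN features, which are assumed to be already extracted and flattened to one vector per image.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale vectors along the last axis to unit length (eps guards against zero vectors)."""
    norms = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norms, eps)

def nearest_by_cosine(query, database):
    """Return the index of the database row most similar to the query, plus all similarities.

    query:    feature vector of the query image (any shape, flattened here)
    database: one feature vector per database image, shape (n_images, ...)
    """
    q = l2_normalize(np.asarray(query, dtype=np.float64).ravel())
    db = np.asarray(database, dtype=np.float64)
    db = l2_normalize(db.reshape(len(db), -1))
    sims = db @ q                      # cosine similarity of each database image to the query
    return int(np.argmax(sims)), sims  # highest similarity = smallest cosine distance

# Stand-in data (assumption): 5 random "database" descriptors and a noisy copy of entry 3.
rng = np.random.default_rng(0)
database = rng.normal(size=(5, 256))
query = database[3] + 0.05 * rng.normal(size=256)

best, sims = nearest_by_cosine(query, database)
print(best, 1.0 - sims[best])  # matched index and its cosine distance
```

Normalizing both sides first means the dot product is exactly the cosine similarity, so the whole database can be scored with one matrix-vector product.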


Daniel Wong

Posted 2018-03-01T12:50:55.907

Reputation: 51

You should consider using a ResNet architecture and making use of BatchNorm layers. This is a state-of-the-art architecture that is much easier to train and will give much better results. The depth is up to you and should be as much as you need to solve your task accurately (e.g. start with a ResNet50 or ResNet20). AlexNet and VGG are not really up to date anymore and are much harder to optimize than ResNet architectures. – Marcel_marcel1991 – 2018-09-30T12:58:01.537



Neural networks construct increasingly complex representations of the data at each successive layer, so you are free to choose any neural network architecture for this purpose. Since the lower layers (those near the input layer) mostly compute low-level representations of the image (Gabor-like filters, etc.), most architectures will not differ much at this level. So you can use VGGnet if you want, taking features from layer 3 with proper fine-tuning.


Posted 2018-03-01T12:50:55.907

Reputation: 920

Thanks for replying. AlexNet is one of the simplest CNNs. In VGGnet there are three layer 3s, and because VGGnet is deeper, each layer depends on more preceding layers (and on more succeeding layers in the case of backpropagation). I don't get good results from any of the layer 3s. – Daniel Wong – 2018-06-01T22:28:28.460
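To make the "three layer 3s" concrete, here is a small sketch that enumerates the convolutional layers of VGG-16 from its standard configuration (channel counts, with "M" marking a max-pooling step). The `convB_I` naming follows the convention commonly used for VGG, and reading "layer 3" as block 3 is an assumption; `name_conv_layers` is a hypothetical helper.

```python
# VGG-16 convolutional configuration: output channels per conv layer, "M" = max pooling.
VGG16_CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
             512, 512, 512, "M", 512, 512, 512, "M"]

def name_conv_layers(cfg):
    """Return (name, out_channels) for every conv layer, named conv<block>_<index>."""
    names = []
    block, idx = 1, 1
    for item in cfg:
        if item == "M":
            # A pooling step closes the current block and starts the next one.
            block += 1
            idx = 1
        else:
            names.append((f"conv{block}_{idx}", item))
            idx += 1
    return names

layers = name_conv_layers(VGG16_CFG)
block3 = [name for name, _ in layers if name.startswith("conv3_")]
print(len(layers))  # 13 convolutional layers in total
print(block3)       # ['conv3_1', 'conv3_2', 'conv3_3']
```

Each of the three block-3 layers yields a 256-channel feature map, so any of them can serve as a candidate descriptor, and they can be compared one at a time.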

AlexNet is simple only in terms of its number of layers, but it is quite heavy on parameters and consumes more memory than VGGnet. The poor results may mostly be due to bad fine-tuning. Can you post the full details of the model, or perhaps the code? – riemann77 – 2018-06-02T05:18:42.337