What are the differences between backbones, frontends, models and architectures in applied deep learning?



I'm trying to dive into deep learning for tasks on images, and to figure out how to reuse some well-known structures* that have been published, mainly on GitHub.
( *Here, "structure" can stand for one or more of the concepts discussed hereafter.)


But while reading articles and blog posts, and watching videos or paper presentations on deep learning — especially about ConvNets and research applied to images (classification, object detection, semantic segmentation or scene understanding) — I'm struggling with these concepts: backbones, frontends, models, networks, and architectures.

For me, they are almost interchangeable, except maybe for the model, which, to my current knowledge, is the set of weights resulting from the learning process (and which undoubtedly has to be associated with the network used during that training phase).
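To make my current understanding concrete, here is a minimal, framework-free sketch of how I currently picture these terms (all names below are purely illustrative, not taken from any real library): the architecture is the structure alone, training adds weights to produce a model, and the backbone is the reusable feature-extracting part that a task-specific head/frontend is attached to.

```python
def make_architecture():
    """The *architecture* is the structure: which layers, in what order.
    It is split into a 'backbone' (generic feature extractor) and a
    'head'/'frontend' (task-specific part). No weights exist yet."""
    return {"backbone": ["conv1", "conv2", "pool"],
            "head": ["fc", "softmax"]}

def train(architecture):
    """Training produces weights; architecture + weights = *model*.
    (Weights are just placeholder floats in this toy sketch.)"""
    weights = {layer: 0.0
               for part in architecture.values()
               for layer in part}
    return {"architecture": architecture, "weights": weights}

# A trained model: the network structure plus its learned weights.
model = train(make_architecture())

# Reusing a pretrained backbone: keep the backbone layers and their
# weights, but swap in a new head for a different task (e.g. detection).
backbone_layers = model["architecture"]["backbone"]
new_model = {
    "architecture": {"backbone": backbone_layers,
                     "head": ["detector_head"]},
    "weights": {**{k: v for k, v in model["weights"].items()
                   if k in backbone_layers},
                "detector_head": 0.0},  # new head starts untrained
}
```

Is this mental model roughly right, i.e. that published "backbones" are reused this way while the head/frontend changes per task?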


It would be very kind if someone could define these concepts and explain their differences thoroughly and rigorously (maybe with links to papers, if the terms have been commonly accepted in the scientific literature).


Posted 2020-07-10T11:44:46.517


Hi and welcome! Can you please cite some sources that use those terms? It's very likely that people use the terms in a non-rigorous or inconsistent manner, so the context could be useful to provide a more precise answer. Also, I would focus on the comparison of two terms. – nbro – 2020-07-10T12:03:32.717

No answers