What does end-to-end training mean?


In simple words, what does end-to-end training mean, in the context of deep learning?

Färid Alijani

Posted 2019-11-17T16:10:57.923

Reputation: 285


You have a couple of good answers to the following Quora question What does it mean for a neural network to be trained end-to-end?.

– Brale – 2019-11-17T16:55:39.803



Another explanation of deep learning as an end-to-end framework is in deep learning, pre-processing or feature extraction steps are not necessary. So it only uses a single processing step, which is to train the deep learning model. In other traditional machine learning methods, some separated feature extraction steps usually required.

enter image description here

For example in image classification, deep learning frameworks like CNN can receive a raw image and then trained to classify it directly. If we didn't use deep learning, we need to extract some features using more steps, like edge detection, corner detection, color histogram, etc.

you can also watch Andrew Ng's explanation here


Posted 2019-11-17T16:10:57.923

Reputation: 2 112

Technically you often need a stage of feature preparation/pre-processing, to normalise inputs into ranges suitable for NNs. However, that is quite simple. – Neil Slater – 2019-11-18T08:19:18.860


This is relevant when you have two or more neural networks serving as components to a larger architecture. Training this architecture in an end-to-end manner means simultaneously training all components (i.e. training it as a single network).

The best example I can think of are image captioning architectures. These usually comprise of two networks: a CNN whose role is to extract features from the input images and a RNN that accepts the CNN's features and generates the output captions.

You have two options for training:

  1. First, train the CNN first for some arbitrary task (e.g. image classification) in hopes that it learns how to extract features. Then use the CNN to extract features from the input images and use those as inputs to train the RNN. This procedure trains the two components in two completely separate phases.

  2. Treat the whole architecture as a single network and backpropagete the gradients to the CNN so that it also can be trained. This procedure trains the two components simultaneously. This is what we call end-to-end training.


Posted 2019-11-17T16:10:57.923

Reputation: 2 624


End to end means deep learning is the only thing that is used.

Many people have doubts on its viability though, I certainly do. I wouldn't trust an end-to-end DL based self driving car.


Posted 2019-11-17T16:10:57.923

Reputation: 554

I think that this answer is more consistent with the usage of end-to-end training.

– nbro – 2020-03-13T18:06:13.627

I have done stuff in the robotics department at my university and End to End DL explicitly refers to DL being used to generate "intelligent" responses in an end to end fashion. – FourierFlux – 2020-03-13T21:52:37.053