What is the use of the $\epsilon$ term in this back-propagation equation?


I am currently looking at different documents to understand back-propagation, mainly at this document. On page 3, the $\epsilon$ symbol appears:

[Image: gradient equation]

While I understand the main part of the equation, I don't understand the $\epsilon$ factor. Searching for the meaning of $\epsilon$ in math, it can denote (for example) an error value to be minimized, but why should I multiply by the error (it is denoted by $E$ anyway)?

Shouldn't $\epsilon$ be the learning rate in this equation? That would make sense to me: we want to calculate by how much to adjust the weight, and since we already calculate the gradient, the only thing missing is a multiplication by the learning rate. The thing is, isn't the learning rate usually denoted by $\alpha$?
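To make the reading of $\epsilon$ as a learning rate concrete, here is a minimal sketch of a single-weight gradient-descent update. The function name `update_weight` and the example loss $E(w) = w^2$ are my own illustrative choices, not taken from the document in question:

```python
# Minimal sketch of gradient descent on one weight:
#   w <- w - epsilon * dE/dw
# where epsilon is the learning rate (often written alpha instead).

def update_weight(w, grad_E, epsilon=0.1):
    """Return the weight after one gradient-descent step."""
    return w - epsilon * grad_E

# Illustrative loss E(w) = w**2, so dE/dw = 2*w.
w = 1.0
for _ in range(3):
    w = update_weight(w, 2 * w, epsilon=0.1)

# Each step multiplies w by (1 - 0.1 * 2) = 0.8, so after 3 steps:
# w = 1.0 * 0.8**3 = 0.512
print(w)
```

Whatever Greek letter is used, its role is the same: it scales the gradient to control the step size of the update.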


Posted 2018-12-03T18:50:01.203




The departure from the nomenclature you expected is Lisa Meeden's choice, for unknown reasons. Those with whom she published in the past used $\epsilon$ to represent error, the result of a loss function. She may have avoided $\alpha$ because the letter $a$ was used elsewhere in the formula; if that was the reason, it wasn't a great one.

Changes in nomenclature between articles are common, for instance between Carnegie Mellon literature and Caltech literature. Robotics control systems people use different nomenclature from those who write game competition systems. Online mappings between Greek letters and their meanings cannot be relied upon. This is not particular to AI; I've seen it in science and engineering literature, even between textbooks in courses that are part of the same university curriculum.

