
Here is my understanding of these two terms:

**Hyper-parameter:** A variable that is set by a human before the training process starts. Examples are the number of hidden layers in a Neural Network, the number of neurons in each layer, etc. Some models don't have any hyper-parameters, like the simple linear model.

**Parameter:** A variable that the training process updates. For instance, the weights of a Neural Network are parameters: they are updated as we train the network, with no human intervention in the process. Another example would be the slope and the y-intercept in a simple linear model.
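To make the distinction concrete, here is a minimal sketch (the data and the model `y = a*x + b` are hypothetical, not from the question): the slope `a` and intercept `b` are parameters, estimated from the data with no human intervention, and plain least squares needs no hyper-parameter set beforehand.

```python
# Hypothetical example: fitting a simple linear model y = a*x + b
# by ordinary least squares. The slope a and intercept b are
# PARAMETERS: the fitting procedure estimates them from the data.
# Nothing here is set by a human before training, which is what
# "the linear model has no hyper-parameters" means.

xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]   # generated by y = 2*x + 1

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Closed-form least-squares estimates of the parameters:
a = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
    sum((x - mean_x) ** 2 for x in xs)
b = mean_y - a * mean_x

print(a, b)  # → 2.0 1.0
```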

Having said that, what would the *learning rate parameter* ($\eta$) be?

$$ \Theta_{i+1} = \Theta_{i} - \eta \nabla J(\Theta_{i}) $$

My understanding is that $\eta$ is set to a large value before training starts, and then, as training progresses and the function gets closer and closer to a local minimum, the learning rate is decreased. In that case, doesn't the learning rate satisfy both the definition of a parameter and that of a hyper-parameter?
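A minimal sketch of the setup described above (the objective $J(\theta) = \theta^2$ and the $1/(1+i)$ decay schedule are assumptions for illustration): both the initial rate and the schedule are fixed before training starts, and only $\theta$ is updated by the loop.

```python
# Sketch: gradient descent on J(theta) = theta**2 with a decaying
# learning rate. eta0 and the decay schedule are chosen by a human
# BEFORE training (hyper-parameters); theta is the only quantity
# the training loop itself updates (the parameter).

def grad_J(theta):
    # gradient of J(theta) = theta**2
    return 2 * theta

def train(theta0, eta0, steps):
    theta = theta0
    for i in range(steps):
        eta = eta0 / (1 + i)              # rate decreases as training progresses
        theta = theta - eta * grad_J(theta)  # descent step: move against the gradient
    return theta

theta_final = train(theta0=5.0, eta0=0.4, steps=200)
print(theta_final)  # moves toward the minimum at 0
```

Note that the decrease of $\eta$ follows a pre-set rule; the training process never *estimates* $\eta$ from the data.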

**Comments:**

Your definitions are wrong. If you are setting it, it is a hyperparameter. If you are estimating it, then it is a parameter. This can in fact be done, but not as you indicated. In your model, your learning rate has a schedule, but it is still a hyperparameter. – Emre – 2017-07-31T04:39:28.543

"then, as the training progresses and the function gets closer and closer to a local minimum, the learning rate is decreased" – that only depends on your learning strategy (as in, the optimizer). There is usually no "learning" of the learning rate except in a few cases in the literature (meta-learning). – E_net4 wants more flags – 2017-07-31T13:49:30.747