Tag: loss-functions

19 Understanding GAN loss function 2017-06-13T10:50:56.343

11 Loss jumps abruptly when I decay the learning rate with Adam optimizer in PyTorch 2018-09-20T13:14:32.060

6 What loss function to use when labels are probabilities? 2019-04-14T22:13:13.363

6 What's the advantage of log_softmax over softmax? 2019-04-30T15:36:39.950

5 Can the mean squared error be negative? 2018-11-17T15:47:15.667

5 What is the formula used to calculate the loss in the FaceNet model? 2019-11-06T14:22:45.860

4 How to define a loss function for a classifier where the confusion between some classes is more important than the confusion between others? 2018-10-10T07:02:05.440

4 Which function $(\hat{y} - y)^2$ or $(y - \hat{y})^2$ should I use to compute the gradient? 2019-05-31T12:54:50.300

4 Why does the binary cross-entropy work better than categorical cross-entropy in a multi-class single label problem? 2019-11-09T21:30:07.420

4 Why is Jensen-Shannon divergence preferred over Kullback-Leibler divergence in measuring the performance of a generative network? 2019-11-11T16:01:15.070

4 Is there a reason to choose regular momentum over Nesterov momentum for neural networks? 2020-02-04T21:58:02.993

4 How to calculate the advantage in policy gradient functions? 2020-03-17T08:49:36.980

4 How are weight changes handled during back-propagation when there are unknown labels? 2020-03-27T21:56:43.117

4 Loss function for choosing a subset of objects 2020-04-21T19:17:54.560

3 Why is the hyperbolic tangent with MSE better than the sigmoid with cross-entropy? 2018-02-28T14:52:02.613

3 Should the input to the negative log likelihood loss function be probabilities? 2018-09-01T04:03:34.277

3 How do I calculate the gradient of the hinge loss function? 2018-10-06T11:49:18.957

3 Dice loss gives binary output whereas binary crossentropy produces probability output map 2019-01-08T11:57:24.937

3 How is equation 8 derived in the paper "Self-critical sequence training for image captioning"? 2019-02-21T11:22:08.810

3 Could error surface shape be useful to detect which local minimum is better for generalization? 2019-03-01T20:46:51.720

3 AlphaZero policy head loss not decreasing 2019-04-24T09:08:25.843

3 If loss reduction means model improvement, why doesn't accuracy increase? 2019-06-16T13:16:04.567

3 When to use RMSE as opposed to MSE and vice versa? 2019-08-31T16:44:30.220

3 Advantages of Kullback-Leibler over L1/L2? 2019-09-11T06:49:49.830

3 When should I create a custom loss function? 2019-10-22T08:57:44.060

3 What's the difference between RMSE and Euclidean distance, and when to use a custom loss? 2019-11-15T07:27:21.777

3 What's the function that SGD takes to calculate the gradient? 2020-01-14T22:02:19.673

3 In which cases is the categorical cross-entropy better than the mean squared error? 2020-01-19T00:56:46.310

3 How to add weights to one specific input feature to ensure fair training in the network? 2020-04-14T19:24:40.270

2 Extend the loss function from the single action to the n-action case per time step 2018-04-26T13:42:48.293

2 Chess policy network 2018-11-05T17:42:12.927

2 Comparing and studying Loss Functions 2019-01-16T13:36:55.310

2 How do I get multiple losses per sample in Keras evaluate? 2019-03-01T11:46:15.743

2 Why is MSE used over other quadratic loss functions? 2019-03-03T08:39:20.207

2 Heavy loss and inaccurate answer in PyTorch 2019-03-27T15:06:57.007

2 How to stop DQN Q function from increasing during learning? 2019-04-24T14:15:02.803

2 Should I use the hyperbolic distance loss in the case of the Poincaré disk model? 2019-05-03T14:11:27.203

2 How do we get the true value in the prediction objective in reinforcement learning? 2019-05-28T16:40:29.937

2 Which loss functions for transforming a density function to another density function? 2019-07-07T20:45:41.580

2 Which loss function should I use for binary classification? 2019-07-08T09:39:01.317

2 Understanding log probabilities of actions in the PPO objective 2019-07-25T14:12:08.937

2 What are the major differences between cost, loss, error, fitness, utility, objective, criterion functions? 2019-07-29T11:04:08.310

2 Are the training loss and validation loss plotted per sample or per batch? 2019-07-31T10:01:03.963

2 What loss function is appropriate for finding "points of interest" in an array of x,y inputs? 2019-08-02T23:22:37.243

2 Is it possible to use Reward Function of type R(s, a, s') if more than one action is applied? 2019-08-07T13:25:17.847

2 CNN classification model loss stuck at same value 2019-09-26T20:12:22.630

2 When and how to use a mix of loss functions for back-propagation? 2019-10-15T11:34:25.603

2 Maximize loss on non-target variable 2019-10-19T12:43:28.273

2 How to implement loss function of H-GAN model 2019-10-27T12:52:22.367

2 Loss function for increasing the quality of the image when labels are not perfectly aligned 2019-11-19T22:15:05.103

2 How do you interpret this learning curve? 2019-11-27T00:22:15.423

2 How to understand my CNN's training results? 2019-12-29T16:45:25.080

2 Tversky Loss paper implementation: Recall/Precision do not improve as stated 2020-02-12T10:07:30.837

2 How should I penalize the model proportionally to the error? 2020-02-16T19:58:04.347

2 Why is the loss associated with my neural network increasing? 2020-03-22T15:56:31.100

2 Why does GAN loss converge to log(2) and not -log(2)? 2020-03-27T16:07:28.180

2 Is there anything wrong in my focal loss derivation? 2020-03-30T03:51:16.157

2 Is the mean squared error a good loss function for continuous variables $0 < x < 1$? 2020-04-12T14:47:32.183

2 Why do the TensorFlow docs discourage using softmax as activation for the last layer? 2020-04-13T07:38:47.727

2 Single label classification into hierarchical categories using a neural network 2020-04-14T16:02:57.057

2 Is there a way of deriving a loss function given the neural network and training data? 2020-05-28T15:55:34.237

2 Should illegal moves be excluded from loss calculation in DQN algorithm? 2020-06-27T19:02:10.683

2 Why is L2 loss more commonly used in neural networks than other loss functions? 2020-07-27T17:57:19.977

2 Why is the mean used to compute the expectation in the GAN loss? 2020-08-21T05:01:32.977

2 Generation of 'new log probabilities' in continuous action space PPO 2020-08-26T20:02:03.287

1 How to understand marginal loglikelihood objective function as loss function (explanation of an article)? 2018-10-17T20:16:58.747

1 Training by one batch of examples: what does it mean? 2018-11-22T00:03:37.663

1 What is the derivative function used in backpropagation? 2018-12-18T07:59:36.253

1 How do I calculate $\max_{a'} Q(s', a', w^-)$ when it is represented as a neural network? 2019-01-05T11:08:11.157

1 How to obtain a formula for loss, when given an iterative update rule in gradient descent? 2019-02-12T12:32:36.097

1 Why am I getting spikes in the values of the loss function during training? 2019-02-20T16:17:21.213

1 Which local minima to choose according to the shape of the error surface? 2019-03-01T21:47:19.977

1 Why isn't the reverse KL divergence commonly used in supervised learning? 2019-04-05T09:08:56.710

1 Understanding how the loss was calculated for the SQuAD task in BERT paper 2019-04-20T01:13:46.130

1 Train and Test Accuracy of GRU network not increasing after 2nd epoch 2019-05-16T18:50:27.717

1 A2C Critic Loss Interpretation 2019-05-25T14:53:36.397

1 Add a layer derivative in the loss function 2019-06-05T06:05:35.527

1 Unit integral condition on the output layer 2019-06-08T00:45:20.980

1 Why such a big difference in number between training error and validation error? 2019-06-15T13:50:26.107

1 Limits for a bottleneck 2019-06-19T19:34:09.387

1 What is the best loss function for convolution neural network and autoencoder? 2019-07-11T11:59:21.147

1 Could the Jensen-Shannon divergence and Kullback-Leibler divergence be used as loss functions of non-generation problems? 2019-07-24T13:07:19.477

1 LSTM text classifier shows unexpected cyclical pattern in loss 2019-07-26T12:39:24.413

1 What are the loss functions used in teacher-student learning models? 2019-08-07T04:17:09.163

1 How to interpret a large variance of the loss function? 2019-08-08T17:49:45.647

1 Why are image classification tasks dominated by minimizing cost functions instead of maximizing ones? 2019-10-14T01:28:18.200

1 How is the percentage or the probability calculated using the loss function in the FaceNet model? 2019-11-11T13:53:17.760

1 How would the "best function" be constructed if there were no computational limitations? 2019-11-21T16:42:06.473

1 A generalized quadratic loss and Newton iteration for Support Vector Regression, why doesn't it generalize well? 2019-11-26T08:05:25.553

1 Using U-NET for image semantic segmentation 2019-12-11T15:39:23.340

1 Should you use the log of the independent variable to train if you're using RMSLE? 2019-12-17T22:43:07.927

1 Outliers detection problem in neural networks 2019-12-23T12:58:50.090

1 Deduce properties of the loss functions from the training loss curves 2020-01-05T00:42:30.370

1 Why does PyTorch use a different formula for the cross-entropy? 2020-01-15T05:18:46.020

1 What is the difference between batches in deep Q learning and supervised learning? 2020-01-15T09:48:10.630

1 RealNVP gives wrong probabilities 2020-02-02T02:11:39.033

1 Keras MLP returns always loss 0.0 2020-02-13T08:53:29.510

1 How to reduce fluctuation of a neural network? 2020-02-18T14:01:18.790

1 Face recognition model loss not decreasing 2020-03-05T21:52:52.377