212 How to set class weights for imbalanced classes in Keras? 2016-08-17T09:35:45.110

154 What is the "dying ReLU" problem in neural networks? 2015-05-07T04:11:56.600

143 When to use GRU over LSTM? 2016-10-17T11:47:45.340

140 How to draw Deep learning network architecture diagrams? 2016-11-03T03:10:24.893

126 How do you visualize neural network architectures? 2016-07-18T17:08:17.237

101 Choosing a learning rate 2014-06-16T18:08:38.623

80 Time series prediction using ARIMA vs LSTM 2016-07-11T16:45:21.020

75 When to use (He or Glorot) normal initialization over uniform init? And what are its effects with Batch Normalization? 2016-07-28T17:12:29.933

61 Adding Features To Time Series Model LSTM 2017-02-21T22:17:40.000

58 What is the difference between "equivariant to translation" and "invariant to translation" 2017-01-04T08:41:15.700

56 Why mini batch size is better than one single "batch" with all training data? 2017-02-07T12:40:25.200

55 How to fight underfitting in a deep neural net 2014-07-13T09:04:39.703

55 Cross-entropy loss explanation 2017-07-10T10:26:39.450

53 Does batch_size in Keras have any effects in results' quality? 2016-07-01T11:54:14.957

51 Number of parameters in an LSTM model 2016-03-09T11:14:20.163

51 What is the difference between Gradient Descent and Stochastic Gradient Descent? 2018-08-04T06:36:04.657

45 Why should the data be shuffled for machine learning tasks 2017-11-09T07:42:15.517

44 In softmax classifier, why use exp function to do normalization? 2017-09-20T05:53:18.477

42 Deep Learning vs gradient boosting: When to use what? 2014-11-20T06:49:00.357

41 How to get accuracy, F1, precision and recall, for a Keras model? 2019-02-06T13:29:24.533

40 What is Ground Truth 2017-03-24T12:09:14.510

40 Data science related funny quotes 2018-12-14T14:37:31.253

39 Choosing between CPU and GPU for training a neural network 2017-05-25T23:48:26.343

39 Multi GPU in Keras 2017-10-18T20:30:52.027

37 Merging two different models in Keras 2017-12-29T08:12:48.523

35 Intuitive explanation of Noise Contrastive Estimation (NCE) loss? 2016-08-05T03:36:04.553

34 Paper: What's the difference between Layer Normalization, Recurrent Batch Normalization (2016), and Batch Normalized RNN (2015)? 2016-07-23T09:46:42.783

34 How does Keras calculate accuracy? 2016-10-07T08:10:51.287

34 Does gradient descent always converge to an optimum? 2017-11-09T16:41:20.940

34 How to set the number of neurons and layers in neural networks 2018-01-13T15:26:31.233

33 Are there free cloud services to train machine learning models? 2017-11-03T12:41:54.203

33 Why is ReLU used as an activation function? 2018-01-10T13:07:47.997

30 Are there any rules for choosing the size of a mini-batch? 2017-04-17T16:18:22.793

28 Why do convolutional neural networks work? 2016-12-23T12:43:47.203

27 Can machine learning learn a function like finding maximum from a list? 2019-07-31T11:06:16.047

26 Why are NLP and Machine Learning communities interested in deep learning? 2014-10-11T10:24:01.393

26 PyTorch vs. Tensorflow Fold 2017-02-08T10:26:16.887

26 Time Series prediction using LSTMs: Importance of making time series stationary 2017-11-16T07:57:54.843

26 Should we apply normalization to test data as well? 2018-02-08T16:53:24.653

26 What is the difference between fit() and fit_generator() in Keras? 2018-07-13T20:21:41.933

26 Keras vs. tf.keras 2019-03-21T20:20:04.660

25 What is weight and bias in deep learning? 2017-05-20T21:40:08.787

24 Deep learning basics 2014-12-08T22:37:32.777

24 Keras difference between val_loss and loss during training 2017-11-30T19:33:23.220

24 What does Logits in machine learning mean? 2018-04-30T14:55:54.370

22 Choosing between TensorFlow or Theano as backend for Keras 2015-12-07T16:42:04.107

22 Keyword/phrase extraction from Text using Deep Learning libraries 2016-02-03T10:56:51.447

22 Early stopping on validation loss or on accuracy? 2018-08-20T12:22:25.053

22 How to use LeakyReLU as an activation function in a sequential DNN in Keras? When does it perform better than ReLU? 2018-10-02T04:06:47.510

21 Convolutional neural network overfitting. Dropout not helping 2017-08-22T23:52:26.863

21 local minima vs saddle points in deep learning 2017-09-05T19:14:30.057

20 Why convolutions always use odd-numbers as filter_size 2017-09-20T17:53:58.877

20 Uploading images folder from my system into Google Colab 2018-03-23T18:52:28.867

20 How to add non-image features along side images as the input of CNNs 2018-05-08T12:13:16.647

19 Hyperparameter search for LSTM-RNN using Keras (Python) 2016-01-17T18:26:54.320

19 How to calculate the mini-batch memory impact when training deep learning models? 2016-07-07T13:51:42.563

19 How to get predictions with predict_generator on streaming test data in Keras? 2016-09-07T15:14:56.833

19 How to add a new category to a deep learning model? 2016-12-10T01:43:09.343

19 What are kernel initializers and what is their significance? 2018-08-24T04:30:57.397

17 Bagging vs Dropout in Deep Neural Networks 2015-11-16T14:41:08.553

17 Why ReLU is better than the other activation functions 2017-10-03T14:17:09.163

17 Prediction interval around LSTM time series forecast 2017-11-06T12:16:39.393

17 Is there a way to change the metric used by the Early Stopping callback in Keras? 2018-01-19T15:53:48.463

16 Advantages of stacking LSTMs? 2017-08-29T16:48:40.890

16 Should I use GPU or CPU for inference? 2017-09-26T22:13:18.027

16 Parameterization regression of rotation angle 2017-11-21T15:33:00.287

16 Updating the weights of the filters in a CNN 2017-12-17T21:51:57.997

16 Image resizing and padding for CNN 2018-04-25T13:46:47.773

15 What is a 1D Convolutional Layer in Deep Learning? 2017-02-28T08:12:08.210

15 Can we generate huge dataset with Generative Adversarial Networks 2017-04-04T11:26:17.290

15 Why should the initialization of weights and bias be chosen around 0? 2017-08-09T07:30:39.773

15 PyTorch vs. Tensorflow eager 2017-11-07T17:12:14.060

15 What is LSTM, BiLSTM and when to use them? 2017-12-14T01:53:22.020

15 Multi task learning in Keras 2018-02-05T19:56:47.897

15 Can the number of epochs influence overfitting? 2018-02-07T13:34:05.250

15 Why does adding a dropout layer improve deep/machine learning performance, given that dropout suppresses some neurons from the model? 2018-08-16T12:18:54.423

15 What is the difference between upsampling and bi-linear upsampling in a CNN? 2018-09-11T20:49:50.477

15 Can BERT do the next-word-predict task? 2019-02-28T08:37:42.190

14 Visualizing deep neural network training 2014-12-10T10:15:00.940

14 How word2vec can be used to identify unseen words and relate them to already trained data 2015-12-26T03:47:48.800

14 How are deep-learning NNs different now (2016) from the ones I studied just 4 years ago (2012)? 2016-10-04T13:13:15.930

14 Is there a person class in ImageNet? Are there any classes related to humans? 2018-02-11T08:21:22.517

14 How to maximize recall? 2018-03-09T15:36:05.657

14 Understanding Timestamps and Batchsize of Keras LSTM considering Hiddenstates and TBPTT 2018-08-31T12:26:19.527

14 What is the relationship between the accuracy and the loss in deep learning? 2018-12-14T09:08:14.053

14 In the context of Deep Learning, what is training warmup steps 2019-07-19T10:10:22.070

14 Gumbel-Softmax trick vs Softmax with temperature 2019-08-29T10:30:50.857

13 Validation loss and accuracy remain constant 2016-08-23T06:19:59.023

13 What is the difference between Dilated Convolution and Deconvolution? 2017-08-18T14:09:42.870

13 What is Monte Carlo dropout? 2019-01-16T02:09:44.783

13 Is Gradient Descent central to every optimizer? 2019-03-12T10:04:15.807

12 Deep Learning with Spectrograms for sound recognition 2016-01-29T15:39:26.277

12 Reshaping of data for deep learning using Keras 2016-05-12T13:41:11.543

12 Question about bias in Convolutional Networks 2016-05-20T17:29:10.243

12 Machine Learning vs Deep Learning 2017-01-20T10:45:27.660

12 Reason for square images in deep learning 2017-01-29T10:02:56.527

12 Could Deep Learning be used to crack encryption? 2017-01-31T05:32:36.167

12 deep learning for non-image non-NLP tasks? 2017-03-08T11:01:26.670

12 Relation between convolution in math and CNN 2017-06-27T14:23:57.873