362 The Two Cultures: statistics vs. machine learning? 2010-07-19T19:14:44.080

315 What is the difference between test set and validation set? 2011-11-28T11:05:15.103

301 How to understand the drawbacks of K-means 2015-01-16T04:38:13.310

188 Bagging, boosting and stacking in machine learning 2011-11-24T16:51:07.883

187 What is the difference between data mining, statistics, machine learning and AI? 2010-11-30T11:26:15.473

184 Why is Euclidean distance not a good metric in high dimensions? 2014-05-18T17:50:50.803

167 How to know that your machine learning problem is hopeless? 2016-07-05T08:22:25.707

159 What does the hidden layer in a neural network compute? 2013-07-02T15:59:07.463

133 Why the sudden fascination with tensors? 2016-02-23T09:38:31.690

117 ROC vs precision-and-recall curves 2011-02-14T17:10:17.143

108 Obtaining knowledge from a random forest 2012-01-16T11:09:29.237

107 Detecting a given face in a database of facial images 2011-02-14T22:41:09.187

107 Generative vs. discriminative 2011-06-27T20:40:22.503

101 Choice of K in K-fold cross-validation 2012-05-04T03:52:09.103

100 What skills are required to perform large scale statistical analyses? 2011-03-02T19:05:46.350

99 Training with the full dataset after cross-validation? 2011-06-05T16:50:50.747

85 Conditional inference trees vs traditional decision trees 2011-06-20T21:45:43.460

85 How does a Support Vector Machine (SVM) work? 2012-02-16T13:25:16.237

84 What is the influence of C in SVMs with linear kernel? 2012-06-23T19:54:55.740

81 What are the advantages of ReLU over sigmoid function in deep neural networks? 2014-12-02T02:13:49.903

80 How to select kernel for SVM? 2011-11-07T11:12:21.673

74 A list of cost functions used in neural networks, alongside applications 2015-05-31T19:37:16.517

74 Explain "Curse of dimensionality" to a child 2015-08-28T09:11:08.653

73 Why is Newton's method not widely used in machine learning? 2016-12-29T01:00:02.270

72 Help me understand Support Vector Machines 2010-10-24T15:11:52.427

71 Having a job in data-mining without a PhD 2012-05-01T23:39:27.387

71 How do you calculate precision and recall for multiclass classification using confusion matrix? 2013-03-04T15:56:01.283

70 Best way to present a random forest in a publication? 2010-09-03T13:50:51.707

69 What are the main differences between K-means and K-nearest neighbours? 2013-04-18T17:15:43.803

66 Skills hard to find in machine learners? 2014-06-24T07:11:36.400

66 When should linear regression be called "machine learning"? 2017-03-20T22:10:20.387

65 What is an embedding layer in a neural network? 2015-11-20T16:43:12.653

64 Gradient Boosting Tree vs Random Forest 2015-09-20T20:44:06.297

63 Is it possible to train a neural network without backpropagation? 2016-09-20T01:48:21.347

61 Feature selection for "final" model when performing cross-validation in machine learning 2010-09-02T10:25:42.330

60 Proper way of using recurrent neural network for time series analysis 2011-03-08T07:16:01.813

58 How to split the dataset for cross validation, learning curve, and final evaluation? 2014-04-30T10:44:06.227

57 Euclidean distance is usually not good for sparse data? 2012-06-01T13:55:13.253

56 How to compute precision/recall for multiclass-multilabel classification? 2012-01-23T12:54:06.603

56 What is the difference between a neural network and a deep belief network? 2013-03-04T04:18:42.890

56 What does a "closed-form solution" mean? 2013-09-23T23:31:26.477

56 Variable selection for predictive modeling really needed in 2016? 2016-05-28T20:13:33.140

55 Solving for regression parameters in closed-form vs gradient descent 2012-02-20T01:47:19.123

53 How to tune hyperparameters of xgboost trees? 2015-09-04T02:23:37.617

52 Machine Learning using Python 2011-03-27T04:00:59.400

52 Machine learning cookbook / reference card / cheatsheet? 2011-06-27T03:33:31.423

52 Using deep learning for time series prediction 2013-08-29T11:37:17.993

51 What algorithm should I use to detect anomalies on time-series? 2015-05-16T21:10:31.350

48 Why isn't Logistic Regression called Logistic Classification? 2014-12-07T18:44:41.497

48 What makes the Gaussian kernel so magical for PCA, and also in general? 2015-01-02T08:18:21.320

46 How can I help ensure testing data does not leak into training data? 2011-12-19T22:49:14.553

46 Why is accuracy not the best measure for assessing classification models? 2017-11-09T07:32:57.200

45 tanh activation function vs sigmoid activation function 2014-06-08T06:11:24.523

44 Where to start with statistics for an experienced developer 2015-10-13T01:57:02.817

43 What's the difference between feed-forward and recurrent neural networks? 2010-08-30T15:33:28.180

43 How and why do normalization and feature scaling work? 2012-11-01T20:20:48.747

42 Book for reading before Elements of Statistical Learning? 2011-11-26T03:12:43.130

41 Understanding Naive Bayes 2012-01-27T17:29:14.873

41 Are all models useless? Is any exact model possible -- or useful? 2015-04-02T00:59:47.333

41 How to intuitively explain what a kernel is? 2015-05-18T19:43:42.813

40 Is machine learning less useful for understanding causality, thus less interesting for social science? 2011-11-09T03:54:36.453

40 Random Forest, is it a boosting algorithm? 2013-11-19T16:34:47.087

40 Perform feature normalization before or within model validation? 2013-11-22T13:16:15.647

40 How large should the batch size be for stochastic gradient descent? 2015-03-07T21:18:36.213

40 Why are neural networks becoming deeper, but not wider? 2016-07-09T06:35:12.870

40 Neural network references (textbooks, online courses) for beginners 2016-08-02T16:35:34.477

39 Neural networks vs support vector machines: are the second definitely superior? 2012-06-08T02:59:39.850

37 Variance and bias in cross-validation: why does leave-one-out CV have higher variance? 2013-06-14T20:14:49.827

37 Linear kernel and non-linear kernel for support vector machine? 2013-10-17T02:21:02.553

37 What are alternatives of Gradient Descent? 2014-05-09T07:21:38.047

37 Why do Convolutional Neural Networks not use a Support Vector Machine to classify? 2015-08-20T14:43:48.633

37 What is the difference between off-policy and on-policy learning? 2015-12-02T14:21:48.643

36 Application of machine learning methods in StackExchange websites 2011-04-22T22:27:24.467

36 Why not approach classification through regression? 2012-02-05T05:43:32.493

36 Clustering with K-Means and EM: how are they related? 2013-11-18T11:47:06.623

35 Objective function, cost function, loss function: are they the same thing? 2015-10-25T22:03:48.117

34 Cloud computing platforms for machine learning 2011-11-03T21:33:09.320

34 Recall and precision in classification 2013-06-26T09:22:21.867

34 Recurrent vs Recursive Neural Networks: Which is better for NLP? 2015-05-22T17:50:20.360

34 Binary classification with strongly unbalanced classes 2016-09-19T18:39:25.333

33 Creating a "certainty score" from the votes in random forests? 2011-06-27T22:33:57.440

33 Things to consider about masters programs in statistics 2012-04-02T17:12:55.010

33 Does the optimal number of trees in a random forest depend on the number of predictors? 2012-09-12T14:07:49.333

33 Is a strong background in maths a total requisite for ML? 2012-10-20T10:44:44.513

33 Pandas / Statsmodel / Scikit-learn 2013-01-17T01:02:28.963

33 Hold-out validation vs. cross-validation 2014-06-25T13:41:15.927

33 Class imbalance in Supervised Machine Learning 2015-01-05T12:14:33.273

33 Why use gradient descent for linear regression, when a closed-form math solution is available? 2017-05-10T16:52:30.517

32 Data mining: How should I go about finding the functional form? 2011-05-05T16:26:00.037

32 Implementation of CRF in python 2012-09-28T20:19:56.680

32 Guideline to select the hyperparameters in Deep Learning 2014-04-28T12:48:35.280

32 Why downsample? 2014-11-02T19:25:07.250

32 Pre-training in deep convolutional neural network? 2015-07-28T18:30:02.047

32 Can a random forest be used for feature selection in multiple linear regression? 2015-07-30T21:52:22.147

32 Why use regularisation in polynomial regression instead of lowering the degree? 2016-07-31T14:36:14.337

32 Is there any supervised-learning problem that (deep) neural networks obviously couldn't outperform any other methods? 2017-02-20T00:46:38.153

31 Can you overfit by training machine learning algorithms using CV/Bootstrap? 2012-05-29T03:04:46.313

31 Understanding "almost all local minimum have very similar function value to the global optimum" 2016-03-23T17:02:05.307

31 When is unbalanced data really a problem in Machine Learning? 2017-06-02T12:08:34.323