132 What does AUC stand for and what is it? 2015-01-09T10:35:35.967
101 Choice of K in K-fold cross-validation 2012-05-04T03:52:09.103
96 Cohen's kappa in plain English 2014-01-13T19:14:38.847
85 How does a Support Vector Machine (SVM) work? 2012-02-16T13:25:16.237
72 Help me understand Support Vector Machines 2010-10-24T15:11:52.427
71 How do you calculate precision and recall for multiclass classification using confusion matrix? 2013-03-04T15:56:01.283
70 Best way to present a random forest in a publication? 2010-09-03T13:50:51.707
68 How to produce a pretty plot of the results of k-means cluster analysis? 2012-06-25T17:47:20.637
61 Feature selection for "final" model when performing cross-validation in machine learning 2010-09-02T10:25:42.330
56 How to compute precision/recall for multiclass-multilabel classification? 2012-01-23T12:54:06.603
53 Alternatives to logistic regression in R 2010-08-31T10:02:07.947
51 How to calculate Area Under the Curve (AUC), or the c-statistic, by hand 2015-04-09T17:53:46.377
48 How to plot ROC curves in multiclass classification? 2010-08-27T01:56:42.663
48 Why isn't Logistic Regression called Logistic Classification? 2014-12-07T18:44:41.497
46 How can I help ensure testing data does not leak into training data? 2011-12-19T22:49:14.553
46 Why is accuracy not the best measure for assessing classification models? 2017-11-09T07:32:57.200
40 Random forest assumptions 2013-05-15T14:13:52.850
40 Why is logistic regression a linear classifier? 2014-04-12T19:34:29.373
40 Why are neural networks becoming deeper, but not wider? 2016-07-09T06:35:12.870
37 Linear kernel and non-linear kernel for support vector machine? 2013-10-17T02:21:02.553
36 Why not approach classification through regression? 2012-02-05T05:43:32.493
35 Why do naive Bayesian classifiers perform so well? 2012-02-08T20:39:06.780
35 Features for time series classification 2013-02-25T12:34:01.680
34 Binary classification with strongly unbalanced classes 2016-09-19T18:39:25.333
32 How to interpret OOB and confusion matrix for random forest? 2012-06-18T17:43:15.950
32 Implementation of CRF in python 2012-09-28T20:19:56.680
32 Why downsample? 2014-11-02T19:25:07.250
31 Statistical classification of text 2010-07-19T21:17:30.543
31 Training a decision tree against unbalanced data 2012-05-08T16:13:27.683
31 When is unbalanced data really a problem in Machine Learning? 2017-06-02T12:08:34.323
30 Free data set for very high dimensional classification 2010-07-29T12:02:28.347
30 Improve classification with many categorical variables 2014-04-25T17:14:28.573
29 Softmax vs Sigmoid function in Logistic classifier? 2016-09-06T15:46:09.037
27 SVM, Overfitting, curse of dimensionality 2012-08-28T20:12:54.357
26 Variable selection procedure for binary classification 2010-07-22T11:10:29.417
26 How to statistically compare the performance of machine learning classifiers? 2012-12-13T16:20:14.137
25 Area under curve of ROC vs. overall accuracy 2013-09-01T10:21:11.063
24 Detecting patterns of cheating on a multi-question exam 2011-03-04T23:19:54.100
24 What is the difference between a multiclass and a multilabel problem? 2011-06-13T05:35:36.353
24 How to determine the quality of a multiclass classifier 2012-11-23T12:46:16.983
24 How to interpret F-measure values? 2013-02-04T11:38:17.883
23 How to measure/rank "variable importance" when using CART? (specifically using {rpart} from R) 2011-01-23T22:06:03.373
23 Which search range for determining SVM optimal C and gamma parameters? 2012-11-19T16:33:43.513
23 What can cause PCA to worsen results of a classifier? 2013-03-19T23:52:05.573
23 Logistic regression vs. LDA as two-class classifiers 2014-04-25T23:20:54.617
22 Visualizing the calibration of predicted probability of a model 2012-03-29T14:52:38.517
22 Restricted Boltzmann machines vs multilayer neural networks 2012-10-17T17:09:14.977
22 Does it make sense to combine PCA and LDA? 2014-07-07T23:25:30.227
22 How is Naive Bayes a Linear Classifier? 2015-03-17T22:52:27.903
21 Top five classifiers to try first 2011-02-25T09:45:02.317
21 Is cross validation a proper substitute for validation set? 2011-11-23T23:33:35.550
21 Why does the least square solution give poor results in this case? 2012-11-18T06:00:17.873
21 Cross-validation or bootstrapping to evaluate classification performance? 2013-09-26T19:54:34.270
21 How can top principal components retain the predictive power on a dependent variable (or even lead to better predictions)? 2015-03-15T20:09:34.417
21 Apply word embeddings to entire document, to get a feature vector 2016-07-01T17:16:48.650
21 What is the root cause of the class imbalance problem? 2016-11-25T19:02:49.697
20 Why do researchers use 10-fold cross validation instead of testing on a validation set? 2013-02-10T16:36:53.663
20 How large a training set is needed? 2013-03-06T17:06:01.510
20 PCA and the train/test split 2013-04-10T14:06:16.037
20 Why is AUC higher for a classifier that is less accurate than for one that is more accurate? 2014-03-20T03:24:12.147
20 Relative importance of a set of predictors in a random forests classification in R 2014-04-03T00:17:35.487
20 What is the difference between a loss function and decision function? 2014-06-27T09:00:41.720
20 Bag-of-Words for Text Classification: Why not just use word frequencies instead of TFIDF? 2015-05-19T18:30:00.167
19 What's the correct way to test the significance of classification results? 2012-02-08T16:20:56.557
19 Supervised clustering or classification? 2012-09-19T14:40:21.963
19 Three versions of discriminant analysis: differences and how to use them 2013-09-30T16:18:36.080
19 When is it appropriate to use an improper scoring rule? 2016-04-21T06:14:30.320
19 Is there any algorithm combining classification and regression? 2016-11-14T18:42:08.790
19 Classification probability threshold 2017-11-06T07:10:51.293
18 Alternatives to classification trees, with better predictive (e.g., CV) performance? 2010-10-10T09:27:49.817
18 Social network datasets 2010-11-11T17:50:04.680
18 Difference between naive Bayes & multinomial naive Bayes 2012-07-27T14:17:18.010
18 How to control the cost of misclassification in Random Forests? 2013-01-04T11:02:00.003
18 Convolutional neural network for time series? 2014-12-10T18:52:44.653
18 How to interpret Mean Decrease in Accuracy and Mean Decrease GINI in Random Forest models 2016-02-22T00:19:15.903
17 When are Shao's results on leave-one-out cross-validation applicable? 2010-09-03T16:15:14.543
17 Summary of "Large p, Small n" results 2011-07-25T23:17:00.903
17 Large scale text classification 2011-08-26T16:08:13.640
17 Semi-supervised learning, active learning and deep learning for classification 2011-10-06T21:04:45.743
17 From the Perceptron rule to Gradient Descent: How are Perceptrons with a sigmoid activation function different from Logistic Regression? 2015-02-18T17:34:05.330
17 Quiz: Tell the classifier by its decision boundary 2017-08-05T16:59:07.987
16 Is building a multiclass classifier better than several binary ones? 2012-06-18T15:12:49.837
16 How to handle the difference between the distribution of the test set and the training set? 2012-11-16T04:43:16.293
16 I want to build a crime index and political instability index based on news stories 2012-11-24T03:59:01.870
16 Test for linear separability 2013-01-17T04:44:10.380
16 Test accuracy higher than training. How to interpret? 2013-05-21T14:40:13.903
16 Random forest is overfitting? 2013-08-04T23:53:27.050
16 What is meant by 'weak learner'? 2014-01-13T03:43:47.203
16 Interpreting the y axis of partial dependence plots 2014-10-24T21:51:20.273
16 When should I not use an ensemble classifier? 2015-06-24T00:08:05.220
16 State of the art in general learning from data in '69 2016-02-01T15:31:09.600
15 What is the best out-of-the-box 2-class classifier for your application? 2010-07-20T09:43:23.910
15 What is a good resource that includes a comparison of the pros and cons of different classifiers? 2011-10-16T11:04:56.067
15 Why does ridge regression classifier work quite well for text classification? 2011-10-29T18:14:54.547
15 Predicting with both continuous and categorical features 2012-04-19T14:56:45.380
15 Combining classifiers by flipping a coin 2012-05-06T02:49:56.133
15 Low classification accuracy, what to do next? 2012-09-28T20:41:24.573
15 Training approaches for a highly imbalanced data set 2012-11-06T21:28:43.453
15 Training a basic Markov Random Field for classifying pixels in an image 2014-02-06T15:41:01.240