13 Micro Average vs Macro average Performance in a Multiclass classification setting 2016-12-29T17:39:07.967

5 Difference between using RMSE and nDCG to evaluate Recommender Systems 2014-06-14T18:53:32.243

5 Neural Networks - Loss and Accuracy correlation 2016-08-25T13:20:18.243

5 How many features to sample using Random Forests 2017-10-10T10:50:22.720

4 Why are there currently no content-based evaluation metrics for information retrieval? 2015-11-16T15:14:37.750

4 Assessing significance / confidence of a crossvalidated performance measure 2016-01-28T13:13:52.680

4 How to define a custom performance metric in Keras? 2016-08-30T08:52:40.127

4 Evaluating Logistic Regression Model in Tensorflow 2017-06-20T13:03:28.923

4 How can RL agents be monitored? 2017-12-06T08:58:46.353

3 When do I have to use aucPR instead of auROC? (and vice versa) 2015-11-24T11:50:46.290

3 How can conclusions be drawn from recommendation systems evaluation? 2016-09-22T15:33:14.370

3 What is the efficiency difference between different cost functions in case of neural networks? 2017-08-25T11:45:29.727

3 Splitting hold-out sample and training sample only once? 2017-12-19T15:06:57.890

2 Correlation as an evaluation metric for regression 2016-01-23T08:04:13.130

2 XGBoost increases the error when changing the evaluation function 2016-08-19T14:53:27.307

2 How to evaluate a top-N recommendation system with the MovieLens dataset? 2016-10-02T13:13:15.027

2 In XGBoost, how to change the eval function while keeping the same objective? 2017-05-17T13:36:53.513

2 Find threshold in rate to determine reason for lost customer 2018-02-07T09:19:44.687

1 Modelling on one Population and Evaluating on another Population 2014-08-02T00:07:09.267

1 How to evaluate the clustering result when cluster numbers are not equal to data set class 2015-11-26T03:45:19.333

1 Estimating precision & recall 2016-02-02T07:36:26.677

1 How to improve an existing (trained) classifier? 2016-05-02T19:28:18.620

1 Is there any PageRank-like method on weighted graph? 2016-08-04T02:26:25.090

1 Using tensorflow to test a variable amount of correct labels 2016-09-24T13:16:15.190

1 Can an algorithm tested only on artificial data be accepted in a high rank conference? 2016-10-03T20:53:14.283

1 How to represent ROC curve when using Cross-Validation 2016-10-06T10:03:15.077

1 How to get the inertia at the beginning when using sklearn.cluster.KMeans and MiniBatchKMeans 2016-10-18T02:48:46.287

1 roc_auc score GridSearch 2016-12-01T19:41:30.380

1 How do you evaluate ML model already deployed in production? 2016-12-06T00:00:03.877

1 Train/Test Split after performing SMOTE 2016-12-09T00:19:45.343

1 How can I fix this "convex" problem? Is it just a matter of overfitting? 2016-12-09T07:10:44.187

1 How to test People similarity measure? 2016-12-11T23:59:42.707

1 Comparing Non-deterministic Binary Classifiers 2016-12-12T18:44:09.567

1 How to compare performance of Cosine Similarity and Manhattan Distance? 2017-05-24T07:09:27.113

1 How to compare LDA and TF-IDF? 2017-06-14T07:05:19.683

1 python xgboost DMatrix - get feature values or convert to np.array 2017-07-11T10:55:04.823

1 Can tuning individual precision and recall classification thresholds improve deep learning models? 2017-08-28T03:04:28.523

1 Why exactly is using a test set for model evaluation a bad idea? 2017-09-25T21:43:15.577

1 What do NIST information weights refer to? 2017-09-28T03:00:21.633

1 Irregular Precision-Recall Curve 2017-11-21T18:44:09.630

1 How to evaluate multi label image retrieval model 2017-12-01T10:37:27.800

1 How to evaluate sequence to sequence models? 2017-12-11T09:33:50.547

1 Do I need to use Bayes to combine a sample's class probability with the performance of the overall model? 2017-12-22T18:46:45.913

1 Evaluate new features 2018-01-08T16:28:12.343

1 Performance Metric for topic extraction when there is no ground truth 2018-02-10T06:00:41.010

0 How to evaluate clusters based on a label? 2016-08-17T05:11:01.990

0 How to evaluate clusters based on an attribute of the dataset? 2016-08-20T11:08:40.737

0 What do these classification evaluation results mean to you? Are they suspicious or not? 2016-09-06T13:00:52.300

0 Is Gini coefficient a good metric for measuring predictive model performance on highly imbalanced data 2017-06-15T20:15:12.750

0 Classifier runtime evaluation 2017-07-04T20:41:28.407

0 Metrics show badly performing model for multiclass 2017-11-13T12:00:01.883

0 Recommender system: how to compare different scores when calculated individually? 2017-11-14T18:09:26.207

0 How to get an intuitive value for regression module evaluation? 2017-11-17T12:11:05.573

0 Clustering documents - how to evaluate results? 2017-12-12T15:41:28.937

0 What Are Stats Metrics To Analyze How Well One Column Predicts Another Column 2017-12-21T23:50:10.587

0 How to measure F1 score and NMI for clustering task? 2017-12-31T17:10:35.307

0 Benchmark when evaluating performance of a similar documents retrieval? 2018-01-03T04:06:22.537

0 Is it possible to adjust or fix regression predictions when both error and correlation coefficient are high? 2018-01-12T00:56:56.717

0 Estimating bias when humans are bad 2018-01-18T05:23:37.740

0 LightGBM - Cross validation is performing slightly better with lower iterations than the "best-iteration" used in the model 2018-01-25T17:02:59.823

0 What is the advantage of using Dunn index over other metrics for evaluating clustering algorithm? 2018-02-19T03:51:57.977

0 Python - Calculate Cost profitability and benefit of the model 2018-03-02T17:12:28.610

-1 Plotting Precision Recall Curve 2016-11-23T13:25:52.420

-1 How much should I pay attention to the f1 score in this case? 2017-04-20T22:05:03.280

-1 In sklearn's classification report, is f1 the best accuracy measure? 2017-10-31T09:04:52.247

-1 What are the drawbacks of V-measure clustering evaluation method? 2017-11-20T07:35:07.007