315 What is the difference between test set and validation set? 2011-11-28T11:05:15.103

38 How to draw valid conclusions from "big data"? 2012-02-09T08:30:49.303

33 Hold-out validation vs. cross-validation 2014-06-25T13:41:15.927

22 As a reviewer, can I justify requesting data and code be made available even if the journal does not? 2011-08-17T16:52:31.727

19 How can we judge the accuracy of Nate Silver's predictions? 2016-10-08T13:28:18.917

15 How bad is hyperparameter tuning outside cross-validation? 2015-02-12T21:22:33.327

15 How to select a clustering method? How to validate a cluster solution (to warrant the method choice)? 2016-02-13T23:19:42.710

15 Can I use a tiny validation set? 2017-09-26T04:20:00.993

14 How do you use the test data set after cross-validation? 2015-05-18T21:02:26.110

13 When building a regression model using separate modeling/validation sets, is it appropriate to "recirculate" the validation data? 2013-06-11T14:30:28.143

12 What is the procedure for "bootstrap validation" (a.k.a. "resampling cross-validation")? 2012-04-16T14:16:23.627

12 Why isn't the holdout method (splitting data into training and testing) used in classical statistics? 2015-01-29T05:31:24.873

11 Name of mean absolute error analogue to Brier score? 2012-01-04T16:11:31.740

10 Prediction evaluation metric for panel/longitudinal data 2013-06-04T20:49:12.507

10 Should the final (production-ready) model be trained on the complete data or just on the training set? 2015-11-29T11:40:11.230

9 Calculating ratio of sample data used for model fitting/training and validation 2010-07-26T18:24:35.737

9 What is a consistency check? 2010-09-17T04:36:00.473

9 What is the intuition behind the variation of information (VI) metric for cluster validation? 2013-11-19T17:37:56.733

9 How to make sure that a machine learning algorithm's implementation is correct? 2015-05-13T11:19:23.957

9 Do we need a test set when using k-fold cross-validation? 2016-07-27T17:30:34.207

8 Best practices for measuring and avoiding overfitting? 2011-09-15T11:29:34.663

8 Is the error rate a convex function of the regularization parameter lambda? 2017-08-17T21:46:59.493

8 Name of "reshuffle trick" (randomly permute the dataset to estimate the bias of an estimator) 2017-10-18T07:03:45.873

7 Verifying neural network model performance 2012-01-08T13:08:39.570

7 Internal validation via bootstrap: What ROC curve to present? 2014-06-14T20:32:38.537

7 Within-group sum of squares of cluster 2015-03-06T17:39:56.240

7 What is the difference between sensitivity analysis and model validation? 2016-02-05T19:30:16.523

7 Can Frank Harrell's method be used to obtain optimism-corrected regression coefficients? 2016-05-26T09:14:52.640

6 How to make representative sample set from a large overall dataset? 2011-02-20T09:54:18.693

6 Minimal number of samples/conversions for statistical validity 2011-03-08T11:16:35.207

6 What do you do with your testing data? 2012-01-11T13:04:39.570

6 Validating a logistic regression for a specific $x$ 2012-10-08T01:38:05.700

6 Validate cluster analysis in R 2013-07-18T03:30:02.143

6 Optimism bias - estimates of prediction error 2014-03-05T15:36:41.283

6 Classification score for Random Forest 2014-11-27T19:34:33.313

6 How to validate if a sample is independent and identically distributed 2014-12-26T17:31:18.897

6 What is a good way to test a simple recurrent neural network? 2015-05-29T03:40:22.027

6 How to account for case weights when generating folds for K-fold cross-validation? 2016-03-08T20:01:21.003

6 Is an overfitted model with a higher AUC on the test sample better than a non-overfitted one? 2016-06-27T09:20:28.597

6 Should I get 100% classification accuracy on training data? 2016-07-08T19:35:19.843

5 Logistic regression performs better on validation data 2012-01-03T08:47:25.900

5 Model validation after fitting a negative binomial GLM in R 2012-03-28T17:00:33.620

5 Determining the number of weak classifiers to use in adaboost without overfitting? 2012-10-04T13:48:43.450

5 Logistic regression cost function issue in MATLAB 2012-12-12T09:45:08.280

5 Using AdaBoost for feature selection? 2013-03-08T06:02:38.527

5 Should my test set be balanced or imbalanced? 2013-05-31T22:20:47.513

5 How exactly to partition the training set for k-fold cross-validation on a multi-class dataset? 2014-02-26T17:39:54.127

5 How to do external validation of logistic regression models and perform model benchmarking 2014-03-17T14:37:27.457

5 Are world cup predictions testable? 2014-06-16T09:26:21.710

5 Resample random forest OOB to choose number of trees? 2015-04-15T15:50:53.523

5 What can be inferred from this residual plot? 2016-03-05T15:52:27.947

5 Classification accuracy increasing while overfitting 2016-04-22T10:51:14.440

4 Variance explained of a mixed effects model in a new data set 2011-03-19T15:56:42.623

4 Compare modeled (fitted) paired data to actual data in forecasting problem (Excel sheet included) 2011-10-19T15:55:47.487

4 What is the meaning of orthogonal in validation testing? 2012-06-16T21:24:23.343

4 Validity of pseudo-panel data constructed from repeated cross sectional data as a panel data 2012-07-21T18:35:31.967

4 Diagnostic plots for lmer 2012-08-02T10:38:49.063

4 Best method to validate a multiply imputed Cox model with R? 2013-01-10T03:22:47.257

4 In logistic regression, does the lack of significance of the parameter estimates in a test sample indicate overfitting? 2013-01-17T11:21:27.717

4 Computing c-index for an external validation of a Cox PH model with R 2013-01-22T22:29:37.047

4 How do I validate my multiple linear regression model? 2013-06-09T22:24:50.703

4 How to do external validation of regression models 2013-08-03T19:45:47.637

4 Validation of a questionnaire in a new population 2014-01-11T19:40:27.290

4 Predicting customer churn - train & test sets 2014-09-25T19:13:09.313

4 Hold out sample vs. cross validation for time series, and how to perform in R 2014-11-19T20:41:00.087

4 Performing k-means clustering on a set of lines 2015-05-29T13:15:05.043

4 ML / train-test-validate: What is allowed when? 2015-11-02T21:27:29.810

4 Cross-validation techniques for time series data 2016-02-13T16:50:50.453

4 How to validate a Monte Carlo simulation 2016-06-20T12:41:17.747

4 Cross-validation scheme used in the Introduction to Statistical Learning, Chapter 6, Lab 3 2016-07-13T19:35:11.087

4 How to validate a Cox proportional hazards model? 2016-09-09T13:37:08.720

4 Scikit correct way to calibrate classifiers with CalibratedClassifierCV 2017-02-22T12:02:45.610

4 Why are the predictions of my models getting worse? 2017-03-04T18:25:19.623

4 Variational Autoencoder - understanding the latent loss 2017-09-21T12:42:43.823

3 What resources/methods exist for testing/validation or evaluation of Statistical Methods 2010-08-10T13:47:10.963

3 How to choose training and test sets 2011-04-04T11:02:16.770

3 Weight variables for predictive model 2011-05-25T23:57:37.597

3 What is a good academic citation for cross-validation? 2012-01-12T01:40:42.647

3 What are acceptable validation or cross validation error rates? 2012-06-14T23:38:31.140

3 What is the best way to compute classifier performance metrics given a confusion matrix? 2012-11-15T00:02:34.443

3 Mixing User Data For Cross-Validation 2013-02-21T23:14:15.510

3 Validation of a scale for a different population (CFA) 2013-03-10T19:42:22.427

3 Determining values of correction factor based on x bins in observed vs. actual data 2013-06-18T02:39:53.797

3 Statistical measures for data validation 2013-07-19T10:13:44.633

3 Can holdout validation be systematically biased? 2013-07-31T23:04:08.820

3 Validation of mixed-effect models 2013-11-09T20:20:46.143

3 External model validation using new data for prediction: How large of a drop in $R^2$ is significant? 2013-11-25T06:43:02.747

3 How to validate a Multinomial Logit and Probit Model fit? 2014-01-30T18:12:10.233

3 Validating statistical tests for value at risk and expected shortfall 2014-02-07T19:29:15.223

3 rms validate on models with a predict function such as coxph and glmnet 2014-03-18T16:17:06.203

3 Cross-validation for Comparing Clustering Techniques 2014-04-28T01:16:01.607

3 The size of the sample for split validation 2014-07-16T09:57:36.537

3 How to deal with floor effect 2014-07-27T15:44:17.707

3 Out-of-sample vs. test set 2014-07-31T09:49:04.230

3 What does a negative Somers' D say about model discriminative power? 2014-10-13T20:55:41.057

3 What is out-of-time validation in a logistic regression model? 2015-02-16T08:44:38.443

3 Does the position at which maximum distance occurs in a KS test make a difference? 2015-02-20T11:43:18.467

3 Using simulated data to check when patterns in GLMM residual plots are acceptable 2015-05-24T00:37:03.670

3 Where can I find tests that validate the output of popular statistical software? (e.g. R, SPSS, SAS) 2015-12-24T05:42:11.313