53 Why Is Overfitting Bad in Machine Learning? 2014-05-14T18:09:01.940

46 Should a model be re-trained if new observations are available? 2016-07-13T11:03:54.740

26 Time Series prediction using LSTMs: Importance of making time series stationary 2017-11-16T07:57:54.843

23 Predicting a word using Word2vec model 2016-01-14T07:13:45.810

22 How to predict probabilities in xgboost? 2015-09-08T03:14:09.230

18 Merging sparse and dense data in machine learning to improve the performance 2016-04-06T05:14:11.457

15 Why are ensembles so unreasonably effective 2016-05-25T13:08:06.693

15 Train Accuracy vs Test Accuracy vs Confusion matrix 2018-02-28T21:07:32.770

15 What does "baseline" mean in the context of machine learning? 2018-04-26T23:17:16.687

14 How to train model to predict events 30 minutes prior, from multi-dimensional time series 2017-04-20T13:24:46.320

13 Is feature selection necessary? 2017-01-04T08:46:42.270

13 In industry, what type of new data science algorithms does one develop? 2020-01-17T19:02:07.497

12 Hashing Trick - what actually happens 2014-10-10T03:48:54.660

11 Relationship between KS, AUROC, and Gini 2014-11-23T01:05:06.473

11 What regression to use to calculate the result of election in a multiparty system? 2014-11-29T16:05:08.810

11 How to avoid overfitting in random forest? 2015-07-07T18:05:23.903

11 Can regression trees predict continuously? 2015-12-16T11:39:09.137

11 How to perform Logistic Regression with a large number of features? 2017-07-28T09:32:13.880

10 Server log analysis using machine learning 2015-11-27T18:11:03.323

10 Machine Learning Steps 2016-02-04T08:43:12.847

10 Machine Learning Best Practices for Big Dataset 2016-09-07T22:40:00.723

10 How to impute Missing values not the usual way? 2020-01-11T07:52:56.467

9 R - Interpreting neural networks plot 2015-07-08T12:05:49.663

9 How to use Cohen's Kappa as the evaluation metric in GridSearchCV in Scikit Learn? 2015-09-11T03:00:48.897

9 Export weights (formula) from Random Forest Regressor in Scikit-Learn 2016-01-08T11:57:50.097

9 Which, if any, machine learning algorithms are accepted as being a good tradeoff between explainability and prediction? 2016-05-22T23:56:24.217

9 What are some of the best practices for sharing data and models with colleagues? 2017-03-17T18:45:16.867

8 Ideas for prospect scoring model 2016-06-15T11:30:33.413

8 Model for Differing Number of Rows per Observation 2019-04-17T16:47:56.343

8 Is it valid to shuffle time-series data for a prediction task? 2019-06-21T17:48:55.183

8 Which classification algorithms are negatively affected by class imbalances? 2019-07-03T19:45:48.660

7 How to interpret a decision tree correctly? 2016-02-11T01:47:47.487

7 How does one deploy a model, after building it in Python or Matlab? 2017-03-10T23:25:09.697

7 How would you describe the trade-off between model interpretability and model prediction power in layman's terms? 2018-01-11T08:56:20.023

7 TypeError: Expected binary or unicode string, got [ 2018-02-19T15:25:18.180

6 Looking for a strong PhD Topic in Predictive Analytics in the context of Big Data 2014-09-25T20:18:46.880

6 Can Machine Learning be applied in software development 2014-11-26T08:47:49.650

6 Decision trees, categorization and oversampling 2014-12-03T14:23:38.830

6 Data driven approach to define a churn user 2015-08-04T13:44:22.333

6 Naive about which Naive Bayes in article 2015-09-05T13:49:06.513

6 Improve a regression model and feature selection 2015-12-24T17:21:26.850

6 How can I predict the acceptance of an article by publisher? 2016-01-04T20:22:46.043

6 Estimating the battery capacity using current power consumption and battery percentage 2016-01-27T14:02:57.430

6 Why aren't languages like C, C++ used for data analytics instead of R, Python? 2016-04-07T18:41:02.600

6 Balanced Train set to predict Imbalanced Prediction set 2016-09-01T07:36:40.657

6 How to use survival analysis for predictive maintenance for time series data? 2016-09-28T06:48:41.113

6 Why are RNN/LSTM preferred in time series analysis and not other NN? 2017-09-14T14:15:57.707

6 How can we model the class which maximizes the event probability? 2018-03-03T23:53:07.283

6 How to predict customer's next purchase 2018-04-16T09:41:06.617

6 Is it possible to cluster data according to a target? 2018-04-26T09:21:30.230

6 What is the best algorithm/solution for predicting the following? 2019-04-30T13:54:11.643

6 Regression: How to deal with positive skewness in continuous target variable 2019-12-26T13:22:33.397

6 Machine learning model bundled with a library vs. an API 2020-01-07T15:02:33.653

6 Identifying and Accounting for trend/seasonality in Predictor Variables 2020-02-15T06:28:41.520

5 Predictive modeling based on RFM scoring indicators 2014-09-15T13:14:40.797

5 Predictive models with class value belonging to a set of observations 2015-09-25T23:04:36.723

5 Predictive analysis of rare events 2015-10-29T12:42:35.213

5 Best regression model to use for sales prediction 2016-01-12T13:54:34.180

5 Any case studies using Bayesian Networks for system design trades? 2016-01-18T19:48:29.680

5 How to measure confidence in prediction? 2016-01-23T08:11:03.673

5 Xgboost predict probabilities 2016-10-14T11:50:40.087

5 General strategy for imbalanced, semi-supervised, sparse problem 2016-12-28T16:07:24.727

5 Theoretical background for model architecture choosing 2017-05-19T20:07:05.323

5 How to model a Bimodal distribution of target variable 2017-07-13T11:55:59.080

5 Sales Prediction for Fashion Retail Data 2017-08-06T15:54:43.137

5 Does it make sense to combine PCA with an artificial neural network? 2018-01-16T10:03:13.640

5 Predicting with multiple time series 2018-05-02T14:30:12.520

5 TypeError: float() argument must be a string or a number, not 'function' 2018-05-21T10:13:25.087

5 Why does balancing the test dataset improve precision-recall curve? 2018-10-29T15:35:24.967

5 Difference between sklearn make_pipeline and imblearn make_pipeline 2019-08-21T06:45:04.380

5 How can we convert time series data to supervised learning problem? 2019-12-02T19:16:35.017

5 How to adjust confounders in Logistic regression? 2019-12-27T10:22:26.207

5 Training on derived features that won't be present in a test set 2020-01-01T16:56:36.580

5 Calculate confidence score of a neural network prediction 2020-01-21T10:39:58.417

5 Stacking and Ensembling methods in Data Science 2020-06-29T11:00:23.687

4 Rank players of any given sport 2015-02-03T04:23:37.463

4 Machine learning on athlete performances to predict the time in a future race 2015-02-05T07:41:10.443

4 Predicting Soccer: guessing which matches a model will predict correctly 2015-03-02T21:15:34.480

4 Denormalise data in Neural Networks 2015-06-30T22:01:36.153

4 Propensity Modeling for Retail Marketing: Model Adjustments Based on Consumer Life Changes. 2015-09-23T15:02:20.537

4 Equipment failure prediction 2015-09-28T21:17:21.150

4 Features & Models to compute the probability of certain customer accepting an offer/product from a bank? 2015-11-17T07:38:41.220

4 Predicting app usage on mobile phone 2015-12-05T15:04:43.810

4 Predicting most likely application to be opened 2016-03-07T12:58:22.040

4 Use forecast weather data or actual weather data for prediction? 2016-04-15T21:43:51.117

4 Prediction model for marketing to prospective customers (using pandas) 2016-04-22T13:32:17.873

4 How to start prediction from dataset? 2016-06-09T00:02:39.277

4 Parking Prediction based on Mobile application 2016-07-01T18:52:56.730

4 Scikit Learn Missing Data - Categorical values 2016-07-15T10:43:58.690

4 What does this linear regression summary tell us? 2016-08-01T12:21:09.110

4 How can conclusions be drawn from recommendation systems evaluation? 2016-09-22T15:33:14.370

4 fix first two levels of decision tree? 2016-11-01T12:03:03.020

4 Use TSFRESH-library to forecast values 2016-11-12T14:35:25.700

4 What is a better approach for cross-validation with time-related predictors 2016-11-30T01:50:16.640

4 Xgboost quantile regression via custom objective 2016-12-22T17:06:56.187

4 How to evaluate performance of a time series model? 2017-03-02T11:42:44.543

4 Graph-Document-Recommendations 2017-07-21T11:19:48.160

4 Support Vector Regression trained with data sets 2017-07-30T13:03:37.460

4 Artificially Increasing Training data 2017-08-07T10:13:40.480