Tag: random-forest

78 ValueError: Input contains NaN, infinity or a value too large for dtype('float32') 2016-05-26T04:13:04.033

76 strings as features in decision tree/random forest 2015-02-25T01:07:14.717

38 Understanding predict_proba from MultiOutputClassifier 2017-09-01T10:57:57.723

33 Why do we need XGBoost and Random Forest? 2017-10-14T12:33:00.527

32 Do Random Forests overfit? 2014-08-23T16:54:06.380

32 When to use Random Forest over SVM and vice versa? 2015-08-20T04:16:43.303

27 Does modeling with Random Forests require cross-validation? 2015-07-20T13:42:21.283

22 RandomForestClassifier OOB scoring method 2016-08-02T15:47:47.503

20 How to increase accuracy of classifiers? 2014-07-16T09:49:15.933

18 Choose binary classification algorithm 2014-06-15T14:01:38.233

15 Is stratified sampling necessary (random forest, Python)? 2017-01-12T00:58:27.320

15 How many features to sample using Random Forests 2017-10-10T10:50:22.720

14 On-line random forests by adding more single Decision Trees 2014-10-20T08:48:42.167

14 Feature importance with scikit-learn Random Forest shows very high Standard Deviation 2016-08-05T07:39:52.900

13 Feature selection using feature importances in random forests with scikit-learn 2015-08-04T17:44:35.277

13 Is feature selection necessary? 2017-01-04T08:46:42.270

12 When to choose linear regression or Decision Tree or Random Forest regression? 2015-12-02T01:06:28.243

12 Feature importance with high-cardinality categorical features for regression (numerical dependent variable) 2017-04-05T18:23:12.657

12 Is a 100% model accuracy on out-of-sample data overfitting? 2018-02-08T09:13:24.217

11 How to avoid overfitting in random forest? 2015-07-07T18:05:23.903

11 Unbalanced classes -- How to minimize false negatives? 2015-11-12T16:09:57.543

11 Find optimal P(X|Y) given I have a model that has good performance when trained on P(Y|X) 2017-12-26T13:23:28.223

11 How can I fit categorical data types for random forest classification? 2018-01-04T13:03:28.490

10 Parameters in GridSearchCV in scikit-learn 2016-08-13T17:58:19.430

10 Would you recommend feature normalization when using boosting trees? 2017-01-10T09:49:00.067

9 R random forest on Amazon ec2 Error: cannot allocate vector of size 5.4 Gb 2014-12-19T16:02:48.693

9 Prohibitive size of random forest when saved to disk 2015-10-09T01:54:16.800

9 Export weights (formula) from Random Forest Regressor in Scikit-Learn 2016-01-08T11:57:50.097

9 feature importance via random forest and linear regression are different 2016-06-10T08:35:44.360

9 Why `max_features=n_features` does not make the Random Forest independent of number of trees? 2017-02-07T10:12:01.437

8 Difference between tf-idf and tf with Random Forests 2014-09-16T08:14:06.307

8 Minimum number of trees for Random Forest classifier 2016-08-09T09:28:26.697

8 I got 100% accuracy on my test set, is there something wrong? 2018-07-19T08:16:21.663

8 Decision Trees - how does split for categorical features happen? 2019-08-08T17:25:02.850

8 Which ML approach to choose for the game AI when rewards are delayed? 2020-05-17T11:43:24.780

8 Is over fitting okay if test accuracy is high enough? 2020-05-23T04:54:25.113

7 Why isn't dimension sampling used with gradient boosting machines (GBM)? 2014-11-25T09:40:20.040

7 How to preprocess different kinds of data (continuous, discrete, categorical) before Decision Tree learning 2015-08-07T10:43:50.747

7 How does class_weights work in RandomForestClassifier 2016-05-03T13:23:35.380

7 Reproducing randomForest Proximity Matrix from R package in Python 2017-03-16T09:31:07.317

7 Custom metrics for unbalanced classes problem in RandomForest or SVM 2017-08-04T15:35:15.427

7 Understanding Classifier performance on text data 2020-04-17T11:32:00.730

7 Dropping features after final evaluation on test data 2020-12-29T17:37:32.627

6 Assumptions/Limitations of Random Forest Models 2015-06-05T05:18:26.460

6 Voting combined results from different classifiers gave bad accuracy 2015-10-06T23:08:51.327

6 Random Forests with Big Data - number of trees v. number of observations 2015-11-02T15:42:45.377

6 Overfitting for minority class after SMOTE w/ random forests 2016-05-09T14:18:45.320

6 Is there a R implementation of isolation forest for anomaly detection? 2017-03-31T06:36:54.547

6 Any way to know all details of trees grown using RandomForestClassifier in scikit-learn? 2017-06-19T19:40:30.820

6 In a random forest, are all decision trees given same priority? 2018-05-30T05:23:58.680

6 'RandomForestClassifier' object has no attribute 'oob_score_' in python 2018-08-28T10:53:58.743

6 Why would a fake feature with random numbers get selected in feature importance? 2018-11-14T11:49:16.150

6 using sklearn class weight to increase number of positive guesses in extremely unbalanced data set? 2018-11-19T02:39:50.403

6 Will unnecessary features harm the tree based model? 2019-02-06T17:37:49.263

6 Regression vs Random Forest - Combination of features 2019-03-31T14:28:26.237

6 Random Forest VS LightGBM 2019-11-18T07:44:57.427

6 Why gradient boosting uses sampling without replacement? 2020-02-07T06:59:16.777

6 Boosting with highly correlated features 2020-03-29T16:44:19.973

5 Predictive analysis of rare events 2015-10-29T12:42:35.213

5 Random Forest Regression. How to represent really long list of categories for processing 2015-12-14T16:58:41.163

5 2 stage ensemble -- CV MSE valid in 1st stage but not in 2nd 2016-02-18T14:38:46.413

5 Sales Prediction for Fashion Retail Data 2017-08-06T15:54:43.137

5 What algorithms will get stuck in a local minimum? 2018-06-06T23:11:33.107

5 What's the difference between feature importance from Random Forest and Pearson correlation coefficient 2019-03-21T06:21:14.577

5 convert predict_proba results using class_weight in training 2019-07-02T16:43:04.367

5 Bagging vs Boosting, Bias vs Variance, Depth of trees 2019-10-15T13:19:59.797

5 Time series forecasting dilemma. Could feature engineering overcome time dependency? 2019-10-29T16:11:28.407

5 RandomForest and tree feature importance in scikit-learn 2020-01-21T07:50:29.743

5 Search for hyperparameters with different features using Random Forest 2020-01-26T01:05:50.337

5 Confused AUC ROC score 2020-07-20T14:55:01.023

5 Understand the equations of quantile regression forest (Meinshausen)? 2020-09-09T13:39:43.187

5 How to use "tree boosting" with a data-driven loss function 2020-10-02T18:25:51.057

5 Can one perform Feature Selection on a subset of training data? 2020-11-04T04:01:18.317

4 Why does the listed order of features specified in the data set matter to the random forest classifier 2015-05-13T02:24:54.520

4 Illustrating the dimensionality reduction done by a classification or regression model 2015-08-27T22:06:09.470

4 Do I need to include a squared and linear variable in a random forest to achieve a parabolic effect? 2015-12-14T21:58:59.800

4 Research in random forest algorithms able to switch data sets 2015-12-30T19:32:09.643

4 What's a good machine learning algorithm for low frequency trading? 2016-01-27T19:56:50.257

4 What's the best way to use binned data in a tree-based model? 2016-02-09T19:10:00.017

4 Can xgboost (or any other algorithm) give bad results with some bad features? 2016-02-27T15:34:34.027

4 Does random forest re-use features at each node when generating a decision tree? 2016-03-14T23:38:12.397

4 Why is the number of samples smaller than the number of values in my decision tree? 2016-06-21T12:23:37.307

4 Handling categorical variables in linear regression and random forest 2017-05-27T23:48:28.543

4 Why do we pick random features in random forest 2017-07-10T13:20:41.123

4 Which is better: Out of Bag (OOB) or Cross-Validation (CV) error estimates? 2017-08-04T10:50:38.383

4 Artificially Increasing Training data 2017-08-07T10:13:40.480

4 Has the Random Forest algorithm ever been used in Reinforcement Learning applications? 2017-08-14T22:02:23.303

4 Variable Importance Random Forest on R 2017-09-29T11:59:11.760

4 ROC curve for different hyperparameters of `RandomForestClassifier`? 2017-10-09T13:42:54.810

4 What are limitations of decision tree approaches to data analysis? 2017-12-14T12:26:06.487

4 Can feature importance change a lot between models? 2018-03-08T18:31:31.410

4 Why is cross-validation score so low? 2018-04-23T16:53:58.853

4 Should I remove outliers if accuracy and Cross-Validation Score drop after removing them? 2018-12-20T15:09:00.010

4 How to deal with count data in random forest 2019-02-12T22:59:07.520

4 What does it mean to take the "average" of two decision trees by 'voting' 2019-03-30T17:19:33.477

4 Random-Forest-based Similarity Matrix for clustering: how does it behave? 2019-04-17T12:59:06.607

4 How does the meta Random Forest Classifier determine the final classification? 2019-04-30T10:39:41.400

4 Hyperopt vs Default Values 2019-05-27T23:52:48.087

4 Force selecting samples in majority class with random forest 2019-08-14T15:33:52.153