Tag: random-forest

32 strings as features in decision tree/random forest 2015-02-25T01:07:14.717

15 How to increase accuracy of classifiers? 2014-07-16T09:49:15.933

15 Do Random Forests overfit? 2014-08-23T16:54:06.380

14 Choose binary classification algorithm 2014-06-15T14:01:38.233

11 When to use Random Forest over SVM and vice versa? 2015-08-20T04:16:43.303

10 Why do we need XGBoost and Random Forest? 2017-10-14T12:33:00.527

10 Find optimal P(X|Y) given I have a model that has good performance when trained on P(Y|X) 2017-12-26T13:23:28.223

10 Is a 100% model accuracy on out-of-sample data overfitting? 2018-02-08T09:13:24.217

9 Does modeling with Random Forests require cross-validation? 2015-07-20T13:42:21.283

8 On-line random forests by adding more single Decisions Trees 2014-10-20T08:48:42.167

8 feature importance via random forest and linear regression are different 2016-06-10T08:35:44.360

8 Feature importance with scikit-learn Random Forest shows very high Standard Deviation 2016-08-05T07:39:52.900

7 R random forest on Amazon ec2 Error: cannot allocate vector of size 5.4 Gb 2014-12-19T16:02:48.693

7 How to avoid overfitting in random forest? 2015-07-07T18:05:23.903

7 Minimum number of trees for Random Forest classifier 2016-08-09T09:28:26.697

6 Why isn't dimension sampling used with gradient boosting machines (GBM)? 2014-11-25T09:40:20.040

6 Assumptions/Limitations of Random Forest Models 2015-06-05T05:18:26.460

6 Overfitting for minority class after SMOTE w/ random forests 2016-05-09T14:18:45.320

6 Is stratified sampling necessary (random forest, Python)? 2017-01-12T00:58:27.320

6 Sales Prediction for Fashion Retail Data 2017-08-06T15:54:43.137

5 Difference between tf-idf and tf with Random Forests 2014-09-16T08:14:06.307

5 Feature selection using feature importances in random forests with scikit-learn 2015-08-04T17:44:35.277

5 Random Forests with Big Data - number of trees v. number of observations 2015-11-02T15:42:45.377

5 Unbalanced classes -- How to minimize false negatives? 2015-11-12T16:09:57.543

5 Random Forest Regression. How to represent really long list of categories for processing 2015-12-14T16:58:41.163

5 2 stage ensemble -- CV MSE valid in 1st stage but not in 2nd 2016-02-18T14:38:46.413

5 ValueError: Input contains NaN, infinity or a value too large for dtype('float32') 2016-05-26T04:13:04.033

5 Parameters in GridSearchCV in scikit-learn 2016-08-13T17:58:19.430

5 Feature importance with high-cardinality categorical features for regression (numerical dependent variable) 2017-04-05T18:23:12.657

5 Handling categorical variables in linear regression and random forest 2017-05-27T23:48:28.543

5 How many features to sample using Random Forests 2017-10-10T10:50:22.720

4 Why does the listed order of features specified in the data set matter to the random forest classifier 2015-05-13T02:24:54.520

4 How to preprocess different kinds of data (continuous, discrete, categorical) before Decision Tree learning 2015-08-07T10:43:50.747

4 Prohibitive size of random forest when saved to disk 2015-10-09T01:54:16.800

4 Research in random forest algorithms able to switch data sets 2015-12-30T19:32:09.643

4 What's a good machine learning algorithm for low frequency trading? 2016-01-27T19:56:50.257

4 Can xgboost (or any other algorithm) give bad results with some bad features? 2016-02-27T15:34:34.027

4 Why is the number of samples smaller than the number of values in my decision tree? 2016-06-21T12:23:37.307

4 Is feature selection necessary? 2017-01-04T08:46:42.270

4 Why `max_features=n_features` does not make the Random Forest independent of number of trees? 2017-02-07T10:12:01.437

4 Reproducing randomForest Proximity Matrix from R package in Python 2017-03-16T09:31:07.317

4 Artificially Increasing Training data 2017-08-07T10:13:40.480

4 What are limitations of decision tree approaches to data analysis? 2017-12-14T12:26:06.487

3 Various algorithms performance in a problem and what can be deduced about data and problem? 2015-05-15T13:51:50.087

3 Illustrating the dimensionality reduction done by a classification or regression model 2015-08-27T22:06:09.470

3 Voting combined results from different classifiers gave bad accuracy 2015-10-06T23:08:51.327

3 Predictive analysis of rare events 2015-10-29T12:42:35.213

3 Extremely dominant feature? 2015-12-14T10:33:28.333

3 Export weights (formula) from Random Forest Regressor in Scikit-Learn 2016-01-08T11:57:50.097

3 Random forest implementation with probability of choosing column or guarantee of choosing set of columns 2016-04-27T13:34:34.087

3 Random Forest where objective is not to replicate past classifications 2016-08-09T15:49:12.570

3 Would you recommend feature normalization when using boosting trees? 2017-01-10T09:49:00.067

3 Using random forest to learn Imbalanced Data (rare disease) 2017-03-22T22:22:49.650

3 Custom metrics for unbalanced classes problem in RandomForest or SVM 2017-08-04T15:35:15.427

3 Variable Importance Random Forest on R 2017-09-29T11:59:11.760

3 ROC curve for different hyperparameters of `RandomForestClassifier`? 2017-10-09T13:42:54.810

3 Bootstrapping or Randomly Dividing Dataset to reduce variance? 2018-01-11T12:21:11.653

3 Can feature importance change a lot between models? 2018-03-08T18:31:31.410

2 Creating obligatory combinations of variables for drawing by random forest 2014-09-09T06:33:00.730

2 Text Classification with mixed features in Random Forests 2014-09-22T15:48:32.697

2 Differences in scoring from PMML model on different platforms 2014-10-17T13:58:39.353

2 Post processing with Random Forest 2015-04-10T12:42:13.517

2 Random Forest, Type - Regression, Calculation of Importance Example 2015-06-02T08:39:41.997

2 How do I deal with non-IID data in gradient boosted random forest (for stock market)? 2015-06-05T15:03:41.860

2 How to combine two different random forest models into one in R? 2015-06-26T12:19:16.267

2 Any mini-batch implementation of Random Forest? 2015-09-03T09:57:16.537

2 Feature importance for random forest classification of a sample 2015-09-25T23:13:55.427

2 Case when Out of Bag Error and Test error differs a lot in Random Forest 2015-10-04T17:34:41.407

2 finding maximum depth of random forest given the number of features 2015-10-06T21:01:20.730

2 Scikit Learn's RandomForestRegressor is not giving results on large data set 2015-12-09T19:34:03.510

2 Do I need to include a squared and linear variable in a random forest to achieve a parabolic effect? 2015-12-14T21:58:59.800

2 How to decide the number of trees parameter for Random Forest algorithm in PySpark MLlib? 2016-01-21T22:51:03.573

2 Accept any suggestion to create training data from correlation matrix to find odd one out to identify difference in variation 2016-03-05T08:36:11.283

2 Random forest model in R - predictors and training data types mismatch 2016-03-25T18:10:45.370

2 what predictive analysis will work with this data set? 2016-03-31T13:56:33.917

2 In random forest, what happens if I add features that are correlated? 2016-04-29T18:02:04.490

2 EasyEnsemble explanation 2016-05-09T13:22:02.683

2 Understanding ROCs in imbalanced data-sets 2016-06-22T13:35:13.040

2 Searching interactions with RandomForest and/or GBM 2016-06-28T10:07:06.240

2 RandomForestClassifier OOB scoring method 2016-08-02T15:47:47.503

2 Hashing trick with random forest in scala 2016-09-22T08:34:17.947

2 sklearn random forest and fitting with continuous features 2016-10-19T02:59:11.717

2 Random Forest Modelling? 2016-11-23T22:33:24.623

2 randomForest::varImp VS conditional variable importance 2017-02-27T11:20:26.800

2 Is there a R implementation of isolation forest for anomaly detection? 2017-03-31T06:36:54.547

2 How to further Interpret Variable Importance? 2017-06-20T22:32:53.760

2 Why do we pick random features in random forest 2017-07-10T13:20:41.123

2 Classification with millions of records, thousands of categories - keep memory use efficient? 2017-07-10T19:00:35.390

2 IsolationForest Decision Function vs. Anomaly Prediction Question 2017-07-11T18:44:23.277

2 Code for Multivariate Random Forest in Python/R? 2017-07-22T23:35:36.983

2 Aggregation of Discount 2017-08-10T10:32:46.393

2 Has the Random Forest algorithm ever been used in Reinforcement Learning applications? 2017-08-14T22:02:23.303

2 New values for categorical variable in Prediction dataset 2017-08-19T12:21:47.793

2 Wrong train/test split strategy 2017-08-30T10:01:36.787

2 Can we implement random forest using fitctree in matlab? 2017-10-27T06:12:27.310

2 Features selection/combination for random forest 2017-12-21T16:03:56.423

2 XGBoost Classification Probabilities higher than RF or SVM? 2017-12-21T17:02:47.573

2 How can I fit categorical data types for random forest classification? 2018-01-04T13:03:28.490

2 Meaning of "TRUE" column in R RandomForest output for Importance()? 2018-02-09T09:46:08.143