Tag: feature-engineering

2 Feature Engineering 2016-08-31T19:30:25.727

2 Methods to reduce dimensionality within a feature? 2016-11-27T20:01:21.317

2 Regarding "modification" of feature columns in supervised learning 2017-06-10T14:43:24.360

2 Identifying important interactions between features using machine learning 2017-09-01T13:23:49.527

2 Feature extraction using autoencoder and assigning sub-features to the classes 2017-09-11T07:42:50.650

2 Should columns with close to zero variance be removed before or after one hot encoding? 2017-11-13T14:52:08.360

2 Dealing with a dataset where a subset of points live in a higher dimensional space 2017-12-04T06:30:36.457

2 Feature engineering for hierarchical data 2017-12-22T13:27:50.460

2 Ordinal Integer variable vs Continuous Integer variable 2018-01-04T17:38:58.427

2 Predicting with categorical data 2018-02-22T10:16:38.630

2 Different number of features after using OneHotEncoder 2018-03-13T12:54:33.827

2 How to represent a set of sets as a vector 2018-03-29T13:29:17.510

2 Exploratory analysis and feature engineering for time till failure prediction using sensor data of engines 2018-05-01T18:49:01.800

2 How to combine heterogeneous image features extracted with different algorithms for similar image retrieval? 2018-05-18T03:39:58.810

2 How to model a Machine learning problem considering links between features 2018-07-03T09:12:48.227

2 How to do feature engineering for email cleaning / text extraction? 2018-07-27T15:15:24.100

2 Dummy variable for Categorical values 2018-08-28T16:01:39.673

2 Imputing missing values other than using mean or median in Python 2018-09-02T14:55:34.830

2 Equivalent of numeric encoding when rows can contain multiple values 2018-10-03T10:08:30.637

2 I want to create an additional feature (column) based on some manipulation of values from existing features 2018-10-24T10:19:19.133

2 Is a neural network able to learn to map a completely different feature vector to the same class 2018-11-13T16:45:57.047

2 How to include class features to linear SVM 2018-11-15T14:31:32.523

2 How can I improve a machine learning model? 2018-11-30T06:47:40.373

2 Should I create metafeatures for my XGBoost training set? 2018-12-14T11:23:34.713

2 How to discretize a numerical value with predefined ranges in Weka? 2019-01-05T18:56:34.277

2 Measure the "aggregate preference" of points on a 2D plane 2019-02-12T15:43:35.823

2 How to Work with Imbalanced Data 2019-03-04T23:28:53.910

2 How to deal with a potentially multiple categorical variable 2019-03-25T09:22:18.890

2 Creating a metric based on some features 2019-04-01T15:25:12.273

2 What are the audio features that best describe a piece of music? 2019-04-24T16:41:04.180

2 How to choose an optimal threshold for binary discretization 2019-04-29T03:32:47.560

2 Can I add features that are parts of another feature? 2019-05-28T17:59:34.560

2 How valuable is a categorical feature that has a predominant category over all other ones? 2019-06-15T17:43:54.853

2 Training vs test data set for supervised learning in real life scenario 2019-07-26T09:41:37.283

2 Can you do automated feature engineering in R? 2019-08-20T06:41:15.627

2 Dealing with NaN for predictive models 2019-09-19T05:38:13.897

2 Aggregate categorical feature by the target 2019-11-24T09:57:55.600

2 Is it a good idea to take the derivative or integral of some features and add them as new features in machine learning? 2019-12-11T14:14:46.260

2 Thoughts on Feature Engineering of a duration_in_program Variable 2019-12-11T23:10:15.520

2 How to interpret Shapley value plot for a model? 2019-12-23T07:32:02.600

2 How to do feature selection after using pre-trained word embeddings? 2019-12-29T10:48:16.557

2 Does it make sense to expand word embeddings so that each array index is a feature input or should the embedding itself be a model input? 2020-01-02T22:53:02.827

2 Differences between feature weighting and feature selection 2020-01-06T16:21:17.907

2 Difference between Gibbs sampling and variational Bayes inference 2020-01-15T05:51:47.860

2 How much can the AUC improve comparing the raw dataset and the feature engineered dataset? 2020-01-17T21:30:28.047

2 Word2Vec and Tf-idf how to combine them 2020-01-30T13:28:41.633

2 How do I encode time in high dimensional space? 2020-02-03T23:52:00.937

2 Correlation based Feature Selection vs Feature Engineering 2020-02-16T08:37:50.987

2 Extract features from Decision tree leaf nodes 2020-03-13T08:48:56.677

2 Adding high p-value and low R square features in linear regression model to improve result 2020-04-07T09:16:55.580

2 Filling missing values for Embedded List in Python 3 2020-04-28T13:48:24.093

2 TensorFlow Time Series Tutorial Question 2020-05-12T08:42:23.020

2 Input with variable length Classification problem 2020-06-11T12:18:06.640

2 When is it appropriate to split a dataset on a categorical value and generate $n$ models instead? 2020-06-18T07:41:43.563

2 How to handle a feature vector that could be variable length? 2020-07-13T10:00:07.003

2 Encoding ML classification features that are relative to the dependent categories 2020-10-21T16:57:57.770

2 Can we use two independently measured features in a same ML model? 2020-10-29T10:46:04.367

2 How is modelling affected by similar feature distributions across classes? 2020-12-03T10:17:30.157

2 Should you use the same algorithm in your feature selection as your model 2021-01-05T14:54:48.713

2 Is it possible to change the input columns of a trained ML model while making predictions from it without affecting the accuracy? 2021-01-12T19:38:51.593

2 Cannot understand feature extraction 2021-01-18T21:19:17.963

1 How to replace the missing values in Age column from Titanic/kaggle project 2015-12-26T19:31:41.540

1 Skewed binomial data for small p 2016-03-01T15:59:41.713

1 What are the strategies for feature engineering in a hierarchical/relational structured data? 2016-04-01T16:47:30.473

1 XGBoost performance with predicted values as input 2016-04-29T22:28:59.540

1 How can I add weights in a bag of words model in text analysis? 2016-05-03T06:50:14.997

1 How to best represent rate or proportion as a feature? 2016-05-23T16:51:53.530

1 Ground-truth and feature extraction for predictive modelling 2016-06-09T09:17:49.230

1 Impact of old reviews on new reviews 2016-06-24T11:36:23.127

1 XGBoost (classification problem) feature importance per input not for the model 2016-12-26T05:12:30.053

1 Binning data in one of the columns of a dataframe (using R) 2017-02-19T05:52:07.960

1 Feature reduction convenience 2017-02-28T08:49:23.827

1 Higher Order Interaction Variables. How to use them in model? 2017-03-05T16:35:44.127

1 How To Merge Features in the Dataset Forest Cover Type Classification Problem? 2017-03-11T16:53:39.577

1 Transformation of Dependent and Independent Variables 2017-03-12T21:29:31.383

1 Same TF-IDF Vectorizer for 2 data inputs 2017-04-25T10:19:25.220

1 Detecting outlier with combining two vectors 2017-04-25T22:43:43.510

1 Feature engineering while using neural networks 2017-07-12T06:56:09.017

1 Feature importance varying a lot using same data with same features 2017-07-20T12:49:22.490

1 How to do Feature Extraction using Apache Spark 2017-08-18T06:34:25.383

1 Does feature selection remove highly correlated variables? 2017-10-07T18:04:50.497

1 How can I deal with circular features like hours? 2017-10-20T13:44:26.497

1 Classification: How to manage data sets where one data row depends on another data row 2017-10-30T05:10:47.697

1 Data balance -before or after feature selection/engineering 2017-10-30T08:35:01.917

1 Why is there a difference in performance across the feature descriptors for the same imaging modality? 2017-11-15T12:58:37.357

1 Fix missing data by adding another feature instead of using the mean? 2017-12-20T22:24:56.750

1 Need help in improving accuracy of text classification using Naive Bayes in NLTK for movie reviews 2017-12-30T08:36:54.653

1 How to implement handmade features in a Keras Sequential model? 2018-01-21T03:53:33.837

1 Differences between get_dummies and LabelBinarizer? 2018-01-28T02:19:24.000

1 Ways to Make Feature Values Relative for Each Participant 2018-02-12T09:13:44.400

1 Exploratory Data Analysis and selecting good predictor variables? 2018-02-27T17:01:52.420

1 How can I prepare my data from multiple time series sources for time series regression? 2018-04-12T17:13:26.670

1 How to extract relative importance of features from a tensorflow DNNRegressor model? 2018-04-20T12:48:17.093

1 Feature building - phone device type 2018-05-17T20:07:40.927

1 Storing engineered features in a database 2018-05-30T18:47:43.783

1 Feature engineering in test and train sets (on combined data or separately on train and test) 2018-06-05T13:25:30.773

1 Classifying variable types on a list of variables 2018-07-17T04:49:45.527

1 How to standardize feature vector with data in different scales? 2018-07-18T10:25:58.980

1 How would you deal with inf. or NA for rate or ratio as a feature variable 2018-07-19T06:20:57.923