## List of feature engineering techniques

14

8

Is there any resource with a list of feature engineering techniques? A mapping from type of data and model to suitable feature engineering techniques would be a gold mine.

1

– Hobbes – 2016-08-04T18:27:13.593

10

There is no definitive source on how to do feature engineering. It often depends on the problem you are trying to solve. Some say it is more of an art than a science.

That said, I would go through some of the high-scoring Kaggle kernels and winning solutions, where available. Just head over to Kaggle and browse through the competitions. There is a lot of very useful material there.

The Journal of Machine Learning Research also has a lot of papers about feature engineering. Just search on their site: http://www.jmlr.org/.

The following links are useful and too long to paraphrase:

• Some best practices of feature engineering are discussed on Quora, see this link

5

Missing Data Imputation:

1. Complete case analysis

2. Mean / Median / Mode imputation

3. Random Sample Imputation

4. Replacement by Arbitrary Value

5. Missing Value Indicator

6. Multivariate imputation
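As a rough illustration of two items from the imputation list (median imputation plus a missing value indicator), here is a minimal pandas sketch. The column name `age` and the toy values are made up for the example.

```python
import numpy as np
import pandas as pd

# Toy data; the column name "age" is hypothetical.
df = pd.DataFrame({"age": [22.0, np.nan, 35.0, 41.0, np.nan]})

# Missing value indicator: 1 where the original value was absent.
df["age_missing"] = df["age"].isna().astype(int)

# Median imputation: fill the gaps with the median of the observed values.
median_age = df["age"].median()
df["age_imputed"] = df["age"].fillna(median_age)

print(df)
```

In a real pipeline the median would normally be computed on the training split only and then reused on validation and test data.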

Categorical Encoding:

1. One hot encoding

2. Count and Frequency encoding

3. Target encoding / Mean encoding

4. Ordinal encoding

5. Weight of Evidence

6. Rare label encoding

7. BaseN, feature hashing and others
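Three of the encodings above (one hot, frequency, and target/mean encoding) can be sketched in a few lines of pandas. The columns `city` and `target` are invented for the example, and the target encoding here is the naive version without smoothing.

```python
import pandas as pd

# Toy data; "city" and "target" are hypothetical column names.
df = pd.DataFrame({
    "city": ["A", "B", "A", "C", "A", "B"],
    "target": [1, 0, 1, 0, 0, 1],
})

# One hot encoding: one binary column per category.
one_hot = pd.get_dummies(df["city"], prefix="city")

# Frequency encoding: replace each category by its relative frequency.
freq = df["city"].value_counts(normalize=True)
df["city_freq"] = df["city"].map(freq)

# Target (mean) encoding: replace each category by the mean target value.
target_mean = df.groupby("city")["target"].mean()
df["city_target"] = df["city"].map(target_mean)

print(one_hot)
print(df)
```

Target encoding leaks the label if it is computed on the same rows the model trains on, so it is usually fitted on separate folds or on the training split only.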

Variable Transformation:

1. Logarithm

2. Reciprocal

3. Square root

4. Exponential

5. Yeo-Johnson

6. Box-Cox
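To make the simpler transformations concrete, here is a small NumPy sketch on a made-up right-skewed array. The `+ 1` offsets are an assumption of this example so that the zero value does not break the logarithm or the reciprocal.

```python
import numpy as np

# Made-up, right-skewed values (note the zero).
x = np.array([0.0, 1.0, 10.0, 100.0, 1000.0])

# Logarithm: log1p computes log(1 + x), which is defined at x = 0.
log_x = np.log1p(x)

# Square root: compresses large values less aggressively than the log.
sqrt_x = np.sqrt(x)

# Reciprocal: shifted by 1 here to avoid division by zero.
recip_x = 1.0 / (x + 1.0)

print(log_x)
print(sqrt_x)
print(recip_x)
```

Box-Cox needs strictly positive inputs, while Yeo-Johnson also accepts zeros and negative values, which is often the deciding factor between the two.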

Discretisation:

1. Equal frequency discretisation

2. Equal width (equal length) discretisation

3. Discretisation with trees

4. Discretisation with ChiMerge
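The first two discretisation methods map directly onto `pd.cut` (equal width) and `pd.qcut` (equal frequency). A minimal sketch on made-up values, with the bin counts chosen arbitrarily for the example:

```python
import pandas as pd

# Made-up values with one large outlier.
values = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])

# Equal width: 3 bins of the same length over the range [1, 100].
# The outlier stretches the range, so most values land in the first bin.
equal_width = pd.cut(values, bins=3, labels=False)

# Equal frequency: 5 bins holding (roughly) the same number of values.
equal_freq = pd.qcut(values, q=5, labels=False)

print(equal_width.tolist())
print(equal_freq.tolist())
```

The comparison shows why equal frequency binning is often preferred for skewed variables: equal width bins can end up almost empty.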

Outlier Removal:

1. Removing outliers

2. Treating outliers as NaN

3. Capping, Winsorisation
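Capping can be sketched with `Series.clip`. This example uses the common 1.5 × IQR fences as the caps, which is one convention among several (fixed percentiles are another), and the data are made up.

```python
import pandas as pd

# Made-up values with one obvious outlier.
s = pd.Series([10, 12, 11, 13, 12, 11, 95])

# Interquartile range and the usual 1.5 * IQR fences (an assumed convention).
q1, q3 = s.quantile(0.25), s.quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Capping: values outside the fences are pulled back to the fence.
capped = s.clip(lower=lower, upper=upper)

print(lower, upper)
print(capped.tolist())
```

Unlike removing outliers, capping keeps every row, which matters when the other columns of those rows are still informative.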

Feature Scaling:

1. Standardisation

2. MinMax Scaling

3. Mean Scaling

4. Max Absolute Scaling

5. Unit norm-Scaling
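Most of the scaling methods above are one-liners. A minimal NumPy sketch on a made-up feature vector:

```python
import numpy as np

# Made-up feature values.
x = np.array([2.0, 4.0, 6.0, 8.0])

# Standardisation: zero mean, unit (population) standard deviation.
standardised = (x - x.mean()) / x.std()

# MinMax scaling: rescale to the interval [0, 1].
min_max = (x - x.min()) / (x.max() - x.min())

# Mean scaling (mean normalisation): centre on the mean, divide by the range.
mean_scaled = (x - x.mean()) / (x.max() - x.min())

# Max absolute scaling: divide by the largest absolute value.
max_abs = x / np.abs(x).max()

print(standardised, min_max, mean_scaled, max_abs)
```

As with imputation, the statistics (mean, min, max and so on) would normally be estimated on the training data and reused unchanged on new data.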

Date and Time Engineering:

1. Extracting days, months, years, quarters, time elapsed
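With pandas, the date parts and elapsed time come from the `.dt` accessor. The column name `signup` and the reference date are assumptions of this sketch.

```python
import pandas as pd

# Toy data; "signup" is a hypothetical column name.
df = pd.DataFrame({
    "signup": pd.to_datetime(["2016-08-04", "2017-01-15", "2018-12-31"]),
})

# Calendar parts extracted from the timestamp.
df["year"] = df["signup"].dt.year
df["month"] = df["signup"].dt.month
df["day"] = df["signup"].dt.day
df["quarter"] = df["signup"].dt.quarter

# Time elapsed, in days, up to an arbitrary reference date.
reference = pd.Timestamp("2019-01-01")
df["days_elapsed"] = (reference - df["signup"]).dt.days

print(df)
```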

Feature Creation:

1. Sum, subtraction, mean, min, max, product, quotient of group of features
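Combining features row by row might look like the sketch below. The columns `income`, `debt` and `savings` are invented, and which combinations are useful depends entirely on the problem.

```python
import pandas as pd

# Toy data; all column names are hypothetical.
df = pd.DataFrame({
    "income": [3000.0, 4500.0, 5200.0],
    "debt": [1500.0, 900.0, 2600.0],
    "savings": [200.0, 1200.0, 0.0],
})

# Quotient and subtraction of two features.
df["debt_to_income"] = df["debt"] / df["income"]
df["income_minus_debt"] = df["income"] - df["debt"]

# Row-wise sum and max over a group of features.
group = ["debt", "savings"]
df["group_sum"] = df[group].sum(axis=1)
df["group_max"] = df[group].max(axis=1)

print(df)
```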

Aggregating Transaction Data:

1. Same as above but in same feature over time window
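For transaction data, the same statistics are computed per entity over a time window. A minimal sketch, assuming a toy table with hypothetical columns `customer`, `date` and `amount`, and a 30-day window ending at an arbitrary cutoff:

```python
import pandas as pd

# Toy transactions; all column names are hypothetical.
tx = pd.DataFrame({
    "customer": ["a", "a", "a", "b", "b"],
    "date": pd.to_datetime([
        "2020-01-02", "2020-01-20", "2020-03-01", "2020-01-05", "2020-01-06",
    ]),
    "amount": [10.0, 20.0, 30.0, 5.0, 15.0],
})

# Keep only transactions in the 30 days up to and including the cutoff.
cutoff = pd.Timestamp("2020-01-31")
in_window = (tx["date"] > cutoff - pd.Timedelta(days=30)) & (tx["date"] <= cutoff)
window = tx[in_window]

# One row of aggregate features per customer.
features = window.groupby("customer")["amount"].agg(["sum", "mean", "max", "count"])

print(features)
```

Only transactions dated on or before the cutoff enter the features, which keeps information from the future out of the model inputs.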

Extracting features from text:

1. Bag of words

2. TF-IDF

3. n-grams

4. word2vec

5. topic extraction
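The first three text techniques are simple enough to sketch with the standard library alone. The documents are made up, tokenisation is a plain whitespace split, and the TF-IDF below uses the unsmoothed textbook formula tf × log(N / df); libraries often use smoothed variants, so their numbers will differ.

```python
import math
from collections import Counter

# Made-up corpus, tokenised by whitespace.
docs = ["the cat sat", "the dog sat", "the cat ran fast"]
tokenised = [d.split() for d in docs]

# Bag of words: raw term counts per document.
bow = [Counter(tokens) for tokens in tokenised]

# n-grams: contiguous runs of n tokens (bigrams here).
def ngrams(tokens, n):
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

bigrams = [ngrams(tokens, 2) for tokens in tokenised]

# TF-IDF: term frequency times log(N / document frequency).
n_docs = len(docs)
doc_freq = Counter(term for tokens in tokenised for term in set(tokens))

def tfidf(tokens):
    counts = Counter(tokens)
    return {
        term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
        for term, count in counts.items()
    }

weights = [tfidf(tokens) for tokens in tokenised]

print(bow[0])
print(bigrams[0])
print(weights[1])
```

A word that occurs in every document, such as "the" here, gets a weight of zero, which is the whole point of the IDF term.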

And finally extracting features from images.

A good article describing most of the above techniques: Feature Engineering a comprehensive overview