Effect of Skewness and data range in machine learning

3

I have a feature for machine learning, as follows, that is skewed to the left and only takes values in a certain range (here 0-2000). Will the skewness and the range of the values affect learning? If yes, what should I do?

Your problem is addressable by Poisson regression. You may also want to take a look at Censored Poisson regression because your data is right-censored at 2000. – SmallChess – 2017-04-24T06:29:57.260

Answers

2

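Typically, folks would transform the variable. When it is strictly greater than zero, a log transform is usually sufficient. If zero is included, as in your case, popular alternatives are log(1 + x) or a power transform. Note that the Box-Cox transformation itself requires strictly positive values, so you would either shift the data first (e.g. add 1) or use the Yeo-Johnson variant, which accepts zeros.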
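As a rough illustration of the log(1 + x) idea, here is a minimal sketch using only the Python standard library. The data is synthetic (a lognormal sample capped at 2000, which is an assumption and not the asker's feature), and `skewness` is a hypothetical helper written for this example.

```python
import math
import random


def skewness(values):
    """Population skewness: third central moment over variance**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


random.seed(0)

# Synthetic right-skewed feature, capped at 2000 (assumed data, for illustration only).
raw = [min(random.lognormvariate(5, 1), 2000.0) for _ in range(5000)]

# log1p(x) = log(1 + x) is defined at x = 0, unlike a plain log.
transformed = [math.log1p(v) for v in raw]

skew_before = skewness(raw)
skew_after = skewness(transformed)

print(f"skewness before: {skew_before:.2f}")
print(f"skewness after:  {skew_after:.2f}")
```

On data like this the transformed feature should be much closer to symmetric. Whether that actually helps depends on the model: tree-based learners are largely insensitive to monotonic transforms, while linear models and distance-based methods tend to benefit more.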

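A log transformation is a common way to reduce right (positive) skew. If the feature is genuinely left-skewed, reflect it first (e.g. max + 1 - x) and then take the log. Please read "How to deal with Skewed Dataset in Machine Learning?" for a better understanding.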
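If the feature really is left-skewed, as the question states, a plain log tends to make the skew worse, because it compresses the right side of the distribution. One option is to reflect the values before taking the log. The sketch below uses synthetic data in the 0-2000 range (an assumption for illustration), and `skewness` is a hypothetical helper, not a library function.

```python
import math
import random


def skewness(values):
    """Population skewness: third central moment over variance**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


random.seed(1)

# Synthetic left-skewed feature in [0, 2000]: most values sit near the top
# of the range, with a long tail towards 0 (assumed data).
raw = [max(0.0, 2000.0 - random.lognormvariate(5, 1)) for _ in range(5000)]

# Reflect so the long tail points right; the "+ 1" keeps every value >= 1,
# so the log is always defined.
upper = max(raw)
reflected_log = [math.log(upper + 1.0 - v) for v in raw]

skew_before = skewness(raw)
skew_after = skewness(reflected_log)

print(f"skewness before: {skew_before:.2f}")
print(f"skewness after:  {skew_after:.2f}")
```

Keep in mind that reflection reverses the ordering of the feature, so the sign of any fitted coefficient flips as well.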

