Effect of Skewness and data range in machine learning


I have a feature for machine learning that is skewed to the left and only takes values in a certain range (here 0-2000). Will the skewness and the range of values affect learning? If yes, what should I do?



Posted 2017-02-23T04:01:44.687


A common way to reduce skewness is a log transformation. See "How to deal with Skewed Dataset in Machine Learning?" for more detail.

– Ruthwik – 2018-03-27T16:53:32.057

Your problem is addressable by Poisson regression. You may also want to take a look at Censored Poisson regression because your data is right-censored at 2000. – SmallChess – 2017-04-24T06:29:57.260



Typically, folks would transform the variable. When it is strictly greater than zero, a log transform is usually sufficient. If zero is included, as in your case, popular alternatives are log(1 + x) or a Box-Cox transformation applied after adding a positive offset, since Box-Cox itself requires strictly positive values.
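
To make that concrete, here is a minimal sketch in plain Python. The data is a made-up stand-in for your feature (right-skewed integers between 0 and 2000, zeros included), and `skewness`, `box_cox`, and the `shift` parameter are names I picked for illustration, not anything from a library. The transform is the standard Box-Cox formula, (x^lam - 1) / lam, with lam = 0 defined as the log.

```python
import math
import random


def skewness(values):
    """Biased sample skewness: third central moment divided by std cubed."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def box_cox(values, lam, shift=1.0):
    """Box-Cox transform of (value + shift).

    The shift (an assumption of this sketch) keeps inputs strictly
    positive so that zeros in the data are allowed.
    """
    shifted = [v + shift for v in values]
    if any(s <= 0 for s in shifted):
        raise ValueError("Box-Cox needs strictly positive inputs; increase shift")
    if lam == 0:
        # lam = 0 is the log transform; with shift=1 this is log(1 + x)
        return [math.log(s) for s in shifted]
    return [(s ** lam - 1.0) / lam for s in shifted]


# Hypothetical stand-in for the feature: skewed integers capped at 2000.
random.seed(0)
feature = [float(min(2000, int(random.expovariate(1 / 300.0)))) for _ in range(5000)]

log_feature = box_cox(feature, lam=0)     # log(1 + x)
sqrt_feature = box_cox(feature, lam=0.5)  # square-root-like transform

print("raw skewness:     ", round(skewness(feature), 3))
print("lam=0 skewness:   ", round(skewness(log_feature), 3))
print("lam=0.5 skewness: ", round(skewness(sqrt_feature), 3))
```

Compare the printed skewness values and pick the lam that brings it closest to zero on your own data. A log can overshoot and leave the result skewed the other way, so it is worth trying a few values. In practice you would probably let a library estimate lam for you, for example `scipy.stats.boxcox` or scikit-learn's `PowerTransformer`.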

