Machine learning algorithm that uses the Pearson or Spearman correlation?



I've come across linear and multiple regression, SVMs, and random forests. Does anyone know of a machine learning algorithm that uses the Pearson or Spearman correlation? Best, Dave

Dave Nguyen

Posted 2019-01-16T22:58:29.023

Reputation: 41



I do not think this exists. I suppose an algorithm could use Pearson coefficients as starting coefficients, but honestly it seems like a waste of computational resources. Here are some reasons that occur to me as to why it is a bad idea:

  1. Pearson and Spearman correlations become less meaningful as the number of dimensions increases. I commonly work with millions of dimensions. What would Spearman correlations for individual features tell me there?
  2. In sparse matrices, these coefficients mean next to nothing, as there will be only very slight correlations between any single feature and the target. Usually, it is a multidimensional relationship that we are trying to find (lots of caveats placed here :P).
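  3. Pearson and Spearman correlations rest on assumptions that often do not hold in ML applications: Pearson measures only linear association (and its usual significance tests assume homoscedasticity and normality), while Spearman, being rank-based, still captures only monotonic relationships.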
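To make the third point concrete, here is a rough sketch (plain Python, no libraries) comparing the two coefficients on data that is perfectly monotonic but far from linear. The helper names `pearson`, `ranks`, and `spearman` and the toy data `y = exp(x)` are my own choices for illustration, not anything from the question.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def spearman(x, y):
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Toy data (assumed for illustration): monotonic but strongly nonlinear.
x = [i / 2 for i in range(21)]   # 0.0, 0.5, ..., 10.0
y = [math.exp(v) for v in x]

r_pearson = pearson(x, y)
r_spearman = spearman(x, y)
print(f"Pearson:  {r_pearson:.3f}")
print(f"Spearman: {r_spearman:.3f}")
```

Spearman comes out at 1 because the ordering of `x` and `y` agrees exactly, while Pearson lands well below 1 because the relationship is not a straight line. Neither coefficient would pick up a non-monotonic dependence such as `y = x**2` on a range symmetric around zero.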

For the above and many other reasons, IMHO it doesn't serve any purpose to use these anywhere in ML algorithms.

Joe B

Posted 2019-01-16T22:58:29.023

Reputation: 312


I think it depends on the context - whether or not you are interested in algorithms for modelling + prediction or algorithms for feature selection.

I'm not aware of any modelling + prediction algorithms which use correlation explicitly.

I can't see why feature selection approaches couldn't use correlation, although, as noted in the earlier answer, correlation relies on assumptions that may not hold in the wild.
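As a minimal sketch of what correlation-based feature selection could look like, the snippet below scores each feature by the absolute Pearson correlation with the target and keeps the top `k`. Everything in it is assumed for illustration: the function name `top_k_by_correlation`, the synthetic columns, and the choice of `k` are hypothetical, and this is a plain univariate filter, not any particular library's method.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient (undefined for constant columns)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def top_k_by_correlation(columns, target, k):
    """Score each feature by |r| with the target; return the k best names and all scores."""
    scores = {name: abs(pearson(col, target)) for name, col in columns.items()}
    selected = sorted(scores, key=scores.get, reverse=True)[:k]
    return selected, scores

# Synthetic data (assumed): one strong feature, one noisy copy, one pure noise.
random.seed(0)
n = 200
signal = [random.gauss(0, 1) for _ in range(n)]
columns = {
    "informative": signal,
    "weak": [s + random.gauss(0, 2) for s in signal],
    "noise": [random.gauss(0, 1) for _ in range(n)],
}
target = [2 * s + random.gauss(0, 0.5) for s in signal]

selected, scores = top_k_by_correlation(columns, target, k=2)
for name in selected:
    print(f"{name}: |r| = {scores[name]:.3f}")
```

Because each feature is scored on its own, a filter like this will miss features that only matter in combination with others, which is the multidimensional caveat raised in the earlier answer.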


Posted 2019-01-16T22:58:29.023

Reputation: 1 387