Is autocorrelation of residuals a problem in machine learning?

4

2

Let's assume I have a random forest model and the residuals of the model are autocorrelated. Is this a problem?

As an example, let's assume I have two different random forest models, A and B, with a similar predictive performance. The residuals of model A are less autocorrelated than the residuals of model B. Should I prefer model A?

Funkwecker

Posted 2020-08-28T08:28:19.690

Reputation: 405

Could you say a bit more about the residuals and the auto-correlation? – Soumya Kundu – 2020-08-31T09:31:18.920

It is unclear what auto-correlation means here. Correlation is a relation between two random variables, but there is only ONE random variable. Maybe your data is a time series? Then auto-correlation makes sense. – Jacques Wainer – 2020-09-01T00:22:44.413

Answers

7

Yes, autocorrelation in residuals is a problem, essentially because it clearly shows that the process you are modelling contained more learnable information than your model captured.
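To make "autocorrelated residuals" concrete: the residual series is correlated with a lagged copy of itself, which presupposes that the observations have a natural order (such as time). Below is a minimal sketch in plain Python. The residuals are made up rather than taken from a real random forest, and the helper name `autocorrelation` is hypothetical.

```python
import math

def autocorrelation(residuals, lag=1):
    """Sample autocorrelation of an ordered residual series at the given lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    centered = [r - mean for r in residuals]
    # Covariance between the series and its lagged copy, scaled by the variance.
    num = sum(centered[t] * centered[t + lag] for t in range(n - lag))
    den = sum(c * c for c in centered)
    return num / den

# Hypothetical residuals: the model missed a slow periodic pattern (period 50).
missed_cycle = [math.sin(2 * math.pi * t / 50) for t in range(500)]
# Hypothetical residuals that flip sign at every step.
alternating = [(-1) ** t for t in range(100)]

r_cycle = autocorrelation(missed_cycle, lag=1)
r_alt = autocorrelation(alternating, lag=1)
print(f"missed cycle: {r_cycle:.3f}, alternating: {r_alt:.3f}")
```

A value near +1 or -1 means the next residual is largely predictable from the current one, which is exactly the leftover structure described above. A value near 0 is what you would hope to see.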

In the unlikely event that you have two equally performant models but one shows significant autocorrelation (you can test for this with the Durbin-Watson test, as suggested in Noah Weber’s answer), neither model is working as well as we might hope: the autocorrelated model has failed to predict some predictable patterns, and the other model must be failing in some other way, since its predictive power isn’t any better.

If you have two models that have different residuals but both are beating a naïve baseline, you’ve probably got models that will ensemble well.
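The ensembling point can be illustrated with a small simulation. This is only a sketch under an idealised assumption: the two models' residuals are simulated as independent, equally sized noise, which real models rarely achieve. All names and numbers are hypothetical.

```python
import random

rng = random.Random(42)
n = 5000

# Hypothetical residuals of two models whose errors are independent of each other.
res_a = [rng.gauss(0.0, 1.0) for _ in range(n)]
res_b = [rng.gauss(0.0, 1.0) for _ in range(n)]

# Averaging the two models' predictions averages their residuals as well.
res_avg = [(a + b) / 2 for a, b in zip(res_a, res_b)]

def mse(residuals):
    """Mean squared error given a list of residuals."""
    return sum(r * r for r in residuals) / len(residuals)

mse_a, mse_b, mse_avg = mse(res_a), mse(res_b), mse(res_avg)
print(f"A: {mse_a:.3f}  B: {mse_b:.3f}  average of A and B: {mse_avg:.3f}")
```

The less the two residual series are correlated with each other, the more of the error cancels when the predictions are averaged. If the residuals were identical, averaging would gain nothing.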

Nicholas James Bailey

Posted 2020-08-28T08:28:19.690

Reputation: 1 442

6

Choose model A if the autocorrelation is significant.

Residuals (the "mistakes in predictions") should be completely random, i.e. behave like white noise. If they are significantly autocorrelated, they are not truly random, the assumption of independent errors is violated, and variance estimates based on that assumption are no longer robust. Prefer model A.

How do you measure whether the autocorrelation is significant? Use the Durbin–Watson test.
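As a rough sketch, the Durbin–Watson statistic is the sum of squared successive differences of the residuals divided by their sum of squares. It is written in plain Python here so that it runs without dependencies. The residuals are simulated, and `simulate_ar1` is a hypothetical helper, not part of any library. In practice, `statsmodels.stats.stattools.durbin_watson` computes the same statistic.

```python
import random

def durbin_watson(residuals):
    """Durbin-Watson statistic for an ordered sequence of residuals."""
    num = sum((b - a) ** 2 for a, b in zip(residuals, residuals[1:]))
    den = sum(e ** 2 for e in residuals)
    return num / den

def simulate_ar1(n, phi, rng):
    """Hypothetical residuals following e[t] = phi * e[t-1] + Gaussian noise."""
    series, prev = [], 0.0
    for _ in range(n):
        prev = phi * prev + rng.gauss(0.0, 1.0)
        series.append(prev)
    return series

rng = random.Random(0)
white = simulate_ar1(2000, 0.0, rng)       # phi = 0: independent residuals
correlated = simulate_ar1(2000, 0.8, rng)  # phi = 0.8: strong positive autocorrelation

dw_white = durbin_watson(white)
dw_correlated = durbin_watson(correlated)
print(f"white noise: {dw_white:.2f}, autocorrelated: {dw_correlated:.2f}")
```

The statistic lies between 0 and 4. Values near 2 indicate no lag-1 autocorrelation, values well below 2 indicate positive autocorrelation, and values well above 2 indicate negative autocorrelation. Note that it only looks at lag 1 and only makes sense when the residuals are in a meaningful order. Deciding whether a value is significant requires comparing it against the test's critical values, which this sketch does not do.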

Noah Weber

Posted 2020-08-28T08:28:19.690

Reputation: 4 932

3

If you fit a model and find a meaningful signal in the residuals, you should engineer more or better features to capture that signal.

A specific example is "Neglecting spatial autocorrelation causes underestimation of the error of sugarcane yield models" by Ferraciolli et al., which found:

We showed that assuming independence when modeling yield leads to underestimating model errors and overfit …

They then changed their feature selection process to reduce those errors.

Brian Spiering

Posted 2020-08-28T08:28:19.690

Reputation: 10 864