Group-by correlation vs. all-observations correlation


Suppose I have a dataset of observations about footballers.

The data cover each footballer's performance and other information (e.g. goals, assists, injuries, age, weight) on a monthly basis for the last 2 years.

My goal is to see how a footballer's performance and status in a given month relate to his performance and status in the following month.

As a first step, I just want to run some correlations to detect some of these relationships.

In this case, does it make more sense to run a separate correlation on each footballer's data for the last 2 years and then average the results across players, or to run a single correlation directly across all footballers' data for all months?
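To make the two options concrete, here is a minimal sketch in pandas. The column names (`player`, `month`, `goals`) and the toy numbers are made up for illustration and do not come from a real dataset; a single metric stands in for the full set of performance variables.

```python
import pandas as pd

# Hypothetical toy data: 3 players, 6 months each, one metric ("goals").
df = pd.DataFrame({
    "player": ["A"] * 6 + ["B"] * 6 + ["C"] * 6,
    "month": list(range(1, 7)) * 3,
    "goals": [1, 2, 1, 3, 2, 4,
              8, 7, 9, 8, 10, 9,
              4, 5, 3, 5, 4, 6],
})

# Pair each month with the same player's next month.
df = df.sort_values(["player", "month"])
df["goals_next"] = df.groupby("player")["goals"].shift(-1)
pairs = df.dropna(subset=["goals_next"])  # each player's last month has no "next"

# Option 1: one correlation per player, then average across players.
per_player = pd.Series({
    name: group["goals"].corr(group["goals_next"])
    for name, group in pairs.groupby("player")
})
mean_within = per_player.mean()

# Option 2: a single correlation over all (month, next month) pairs.
pooled = pairs["goals"].corr(pairs["goals_next"])

print(per_player)
print("mean of per-player correlations:", mean_within)
print("pooled correlation:", pooled)
```

In this made-up example the two options disagree sharply: the pooled correlation is strongly positive because the players differ in their overall level, while the per-player correlations are all negative. So the first option measures month-to-month dependence within a player, and the second mixes that with differences between players.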


Posted 2019-06-04T14:42:24.273

Reputation: 995



The name of the procedure you are looking for is


This procedure looks a lot like the first one you describe.

Answer for this question on ResearchGate

Juan Esteban de la Calle

Posted 2019-06-04T14:42:24.273

Reputation: 2 102

Thanks, but my question was not about HOW to implement the first method I describe but about WHICH of the two methods described in my post above to use for the test I describe there. – Outcast – 2019-06-04T17:10:44.823

The first. I was giving you the name of the procedure so you could look it up more easily – Juan Esteban de la Calle – 2019-06-04T17:19:09.843

Haha ok, but in more scientific discussions (like the ones on Datascience.stackexchange) we also tend to explain WHY something should be implemented, etc. – Outcast – 2019-06-04T17:33:49.620