Let's suppose that I have a dataset with datapoints about footballers.

The data cover footballers' performance and personal information (e.g. goals, assists, injuries, age, weight, etc.) on a monthly basis for the last 2 years.

My goal is to see how the performance and status of a footballer in a particular month are related to his performance and status in the next month.

As a first step, I just want to run some correlations to detect some of these relationships.

In this case, does it make sense to run a separate correlation on each footballer's data from the last 2 years and then average the correlation results across players, or to run a single correlation directly across all footballers' data for all months?
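To make the two options concrete, here is a minimal sketch in Python with pandas. The column names (`player`, `month`, `goals`), the helper `lagged_correlations`, and the toy numbers are all made up for illustration; they are not taken from any real dataset. The sketch pairs each month's value with the same player's value in the following month, then computes (1) one correlation per player, averaged across players, and (2) one pooled correlation over all player-months.

```python
import pandas as pd

def lagged_correlations(df, player_col, month_col, x_col, y_col):
    """Correlate x at month t with y at month t+1, in two ways.

    Returns (mean of per-player correlations, pooled correlation).
    Hypothetical helper written for this illustration only.
    """
    df = df.sort_values([player_col, month_col]).copy()
    # Shift within each player so a row is never paired with another player's data.
    df["y_next"] = df.groupby(player_col)[y_col].shift(-1)
    pairs = df.dropna(subset=[x_col, "y_next"])

    # Option 1: one correlation per player, then a plain average across players.
    per_player = pd.Series(
        {p: g[x_col].corr(g["y_next"]) for p, g in pairs.groupby(player_col)}
    )
    # Option 2: a single correlation over all player-month pairs.
    pooled = pairs[x_col].corr(pairs["y_next"])
    return per_player.mean(), pooled

# Toy data (invented): two players whose goals alternate month to month,
# but who play at very different overall levels.
toy = pd.DataFrame({
    "player": ["A"] * 4 + ["B"] * 4,
    "month":  [1, 2, 3, 4] * 2,
    "goals":  [1, 3, 1, 3, 11, 13, 11, 13],
})

mean_per_player, pooled = lagged_correlations(toy, "player", "month", "goals", "goals")
print(mean_per_player, pooled)
```

In this toy case the two numbers disagree in sign: within each player a good month is followed by a worse one (negative correlation), while the pooled figure is strongly positive because it mostly reflects the difference in level between the two players. The plain average of per-player correlations is itself a simplification, and with only about 24 months per player each individual correlation will be noisy.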


The name of the procedure you are looking for is


This procedure looks a lot like the first one you describe.

Answer to this question on ResearchGate

Juan Esteban de la Calle

Thanks, but my question was not about HOW to implement the first method I describe, but about WHICH of the two methods described in my post above to use to test what I describe. – Outcast – 2019-06-04T17:10:44.823

The first. I was giving you the name of the procedure so you could look it up more easily. – Juan Esteban de la Calle – 2019-06-04T17:19:09.843

Haha, OK, but in more scientific discussions (like the ones on Datascience.stackexchange) we also tend to explain WHY something should be implemented, etc. – Outcast – 2019-06-04T17:33:49.620