
## The problem

I want to figure out how routers correlate with each other. For example, if a specific error occurs in router A and, at almost the same time, the same error occurs in router B, the two routers are probably connected (they are on the same line).

## The Data

Suppose I have a dataframe that looks like this:

```
|Router|Error|Duration|Timestamp          |
|DB-XX |GSM  |26.5374 |2019-05-01 00:20:14|
|DT-XY |AUC  |15.5400 |2019-05-01 01:15:01|
|DR-YY |AUC  |02.0333 |2019-05-01 01:17:13|
|DP-YX |LOC  |45.2609 |2019-05-01 00:01:10|
```

## The question

What is the best way to approach this? A one-vs-rest regression for each router? The problem is that this would mean hundreds of models, and I also want to keep computational costs down.

A simple method would be to represent each router's state as a time series (1 = error, 0 = no error) and compute the correlation matrix. If errors take up only a small fraction of the time, the correlation is approximately equal to (duration A and B are in error together) / sqrt((duration A is in error) × (duration B is in error)). – Valentas – 2019-07-05T10:36:33.067
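
A minimal sketch of the time-series idea in the comment above, using pandas. Several things here are assumptions and not from the question: `Duration` is taken to be in seconds, time is cut into one-minute bins, the error type is ignored, and the `events` rows and the helper name `error_state_matrix` are made up for illustration.

```python
import numpy as np
import pandas as pd


def error_state_matrix(df, freq="1min", duration_unit="s"):
    """Return a 0/1 DataFrame: rows are time bins, columns are routers.

    Assumes `Duration` is expressed in `duration_unit` (seconds by default),
    which the original question does not state.
    """
    start = pd.to_datetime(df["Timestamp"])
    end = start + pd.to_timedelta(df["Duration"], unit=duration_unit)

    # Regular time grid covering every error interval.
    grid = pd.date_range(start.min().floor(freq), end.max().ceil(freq), freq=freq)
    routers = sorted(df["Router"].unique())
    state = pd.DataFrame(0, index=grid, columns=routers, dtype=int)

    # Mark every bin touched by an error interval with 1.
    for router, s, e in zip(df["Router"], start, end):
        state.loc[s.floor(freq):e, router] = 1
    return state


# Hypothetical events: A and B fail together twice, C fails on its own.
events = pd.DataFrame({
    "Router": ["A", "B", "C", "A", "B"],
    "Error": ["AUC"] * 5,
    "Duration": [120.0, 120.0, 60.0, 60.0, 60.0],
    "Timestamp": [
        "2019-05-01 00:00:00",
        "2019-05-01 00:00:30",
        "2019-05-01 00:10:00",
        "2019-05-01 00:20:00",
        "2019-05-01 00:20:10",
    ],
})

state = error_state_matrix(events)

# Pearson correlation between the binary series of every router pair.
corr = state.corr()

# The approximation from the comment:
# (bins where both are in error) / sqrt(bins A in error * bins B in error).
counts = state.T.dot(state)
scale = np.sqrt(np.diag(counts.to_numpy()))
approx = counts / np.outer(scale, scale)

print(corr.round(3))
print(approx.round(3))
```

This computes all router pairs at once from a single matrix, so it avoids fitting one model per router. The bin width controls how close in time two errors must be to count as simultaneous, and for hundreds of routers over long periods a sparse representation of `state` may be worth considering.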

Is your question language-specific? You used the python tag, in which case you should put your data into a pandas.DataFrame and use its corr() function. pandas.get_dummies may also be useful for turning categorical features into numeric ones.

– Manu H – 2019-07-05T11:47:14.950
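
One possible reading of the pandas suggestion above, as a rough sketch: one-hot encode the `Router` column with `pd.get_dummies`, collapse the rows into fixed time windows, and call `corr()` on the result. The five-minute window is an arbitrary assumption, and this version only looks at when each error started, not how long it lasted.

```python
import pandas as pd

# The four rows from the question.
df = pd.DataFrame({
    "Router": ["DB-XX", "DT-XY", "DR-YY", "DP-YX"],
    "Error": ["GSM", "AUC", "AUC", "LOC"],
    "Duration": [26.5374, 15.5400, 2.0333, 45.2609],
    "Timestamp": pd.to_datetime([
        "2019-05-01 00:20:14",
        "2019-05-01 01:15:01",
        "2019-05-01 01:17:13",
        "2019-05-01 00:01:10",
    ]),
})

# One indicator column per router, indexed by the time the error started.
dummies = pd.get_dummies(df["Router"], dtype=int)
dummies.index = df["Timestamp"]
dummies = dummies.sort_index()

# 1 if the router reported at least one error in the window, else 0.
# The 5-minute window width is an assumption; tune it to your data.
per_window = dummies.resample("5min").sum().clip(upper=1)

# Pairwise correlation between routers across windows.
corr = per_window.corr()
print(corr.round(3))
```

With so few rows the numbers mean little, but DT-XY and DR-YY fall into the same window and so come out as perfectly correlated, which is the kind of signal the question is after. To restrict the analysis to a single error type, filter `df` on `Error` before encoding.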