I have a multivariate time series with many independent variables (X's) and a binary outcome variable Y that I want to predict (think of a two-class logistic regression whose output is either 0 or 1). A sample is shown below:

```
Timestamp X1 X2 X3 X4 Y
1:00 1 0.5 23.5 0 0
1:01 1 0.8 18.7 0 0
1:02 0 0.9 4.5 1 0
….
1:30 1 1.9 5.5 1 1
1:31 0 1.7 4.3 0 1
…
…
```

I want to classify Y as 0 (stable) or 1 (unstable). Note that once Y becomes 1 it stays 1 for some interval of time, and the same holds when it is 0.

Y depends on a sequence of inputs rather than on a single row. This is a time series, not a standard classification problem where each row can be fed to an algorithm independently. For instance, Y may become 1 when X2 starts increasing and X3 starts decreasing, and so on (there are many independent variables X1…XN).

One approach I am considering is to extract, say, m hours of data before Y becomes 1 and compute descriptive statistics on the X's to derive new features (for example the mean of X1, the standard deviation of X2, the last change point of X4, and so on), which converts that window into a single-row feature vector. The label of this feature vector is 1, because the window was taken just before Y became 1. This turns the time series into a standard classification problem. I can follow the same process for the other class, Y=0.
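A minimal sketch of this windowing idea in plain Python, to make it concrete. The function names (`last_change_point`, `window_features`, `build_dataset`) are hypothetical, the window length `m` is counted in rows rather than hours, and the toy rows are adapted from the sample above and treated as consecutive purely for illustration. Windows that end while Y is already 1 are skipped, which is my assumption about how to keep only the "before the switch" windows.

```python
from statistics import mean, pstdev

def last_change_point(values):
    """Index within the window of the most recent value change, or -1 if constant."""
    for i in range(len(values) - 1, 0, -1):
        if values[i] != values[i - 1]:
            return i
    return -1

def window_features(window, columns):
    """Collapse a list of rows (dicts) into one flat feature dict."""
    feats = {}
    for col in columns:
        vals = [row[col] for row in window]
        feats[f"{col}_mean"] = mean(vals)
        feats[f"{col}_std"] = pstdev(vals)  # population standard deviation
        feats[f"{col}_last_change"] = last_change_point(vals)
    return feats

def build_dataset(rows, columns, m):
    """Return (features, label) pairs: the m rows before row t, labelled with Y at t."""
    samples = []
    for t in range(m, len(rows)):
        if rows[t - 1]["Y"] == 1:
            continue  # assumption: skip windows that end while already unstable
        samples.append((window_features(rows[t - m:t], columns), rows[t]["Y"]))
    return samples

# Toy rows (illustrative only, not real data)
rows = [
    {"X2": 0.5, "X4": 0, "Y": 0},
    {"X2": 0.8, "X4": 0, "Y": 0},
    {"X2": 0.9, "X4": 1, "Y": 0},
    {"X2": 1.9, "X4": 1, "Y": 1},
    {"X2": 1.7, "X4": 0, "Y": 1},
]
samples = build_dataset(rows, ["X2", "X4"], m=2)
for feats, label in samples:
    print(label, feats)
```

Each `(features, label)` pair could then be passed to any ordinary classifier. One thing to watch with this scheme is that overlapping windows produce highly correlated rows, so a random train/test split would probably leak information; splitting by time seems safer.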

The other approach I thought about is a sequence model such as a Hidden Markov Model, where the hidden states would be stable (Y=0) and unstable (Y=1), and I would then estimate the emission and transition probabilities. However, this HMM would have to be multivariate, since Y depends on many X's. This seems a bit complex?
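To gauge how complex the HMM route really is, here is a rough sketch under strong assumptions. Because Y is observed in the training data, the transition probabilities can be estimated by counting label changes, and I assume each X is Gaussian and independent of the others given the state (a diagonal covariance). That assumption is mine and is a poor fit for binary columns such as X1 and X4. The function names and the toy numbers are hypothetical.

```python
import math

def fit_supervised_hmm(X, y, n_states=2):
    """Estimate start, transition and diagonal-Gaussian emission parameters from labelled rows."""
    d = len(X[0])
    trans = [[1.0] * n_states for _ in range(n_states)]  # add-one smoothing
    for a, b in zip(y, y[1:]):
        trans[a][b] += 1
    trans = [[c / sum(row) for c in row] for row in trans]
    start = [(y.count(s) + 1) / (len(y) + n_states) for s in range(n_states)]
    means, variances = [], []
    for s in range(n_states):
        rows = [x for x, lab in zip(X, y) if lab == s]
        mu = [sum(r[j] for r in rows) / len(rows) for j in range(d)]
        # small constant keeps the variance strictly positive
        var = [sum((r[j] - mu[j]) ** 2 for r in rows) / len(rows) + 1e-3 for j in range(d)]
        means.append(mu)
        variances.append(var)
    return start, trans, means, variances

def log_emission(x, mu, var):
    """Log density of x under independent Gaussians (one per feature)."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mu, var))

def viterbi(X, start, trans, means, variances):
    """Most likely state sequence for the observation rows in X."""
    n_states = len(start)
    score = [math.log(start[s]) + log_emission(X[0], means[s], variances[s])
             for s in range(n_states)]
    back = []
    for x in X[1:]:
        prev, score, ptr = score, [], []
        for s in range(n_states):
            best = max(range(n_states), key=lambda p: prev[p] + math.log(trans[p][s]))
            ptr.append(best)
            score.append(prev[best] + math.log(trans[best][s])
                         + log_emission(x, means[s], variances[s]))
        back.append(ptr)
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):  # follow back-pointers from the last row to the first
        state = ptr[state]
        path.append(state)
    return path[::-1]

# Toy data with two features (think X2, X3); values are illustrative only
X_train = [[0.5, 23.5], [0.8, 18.7], [0.9, 20.0], [0.6, 22.0],
           [1.9, 5.5], [1.7, 4.3], [1.8, 5.0], [2.0, 4.0]]
y_train = [0, 0, 0, 0, 1, 1, 1, 1]
params = fit_supervised_hmm(X_train, y_train)
decoded = viterbi([[0.7, 21.0], [0.6, 20.0], [1.8, 5.0], [1.9, 4.5]], *params)
print(decoded)
```

Note that Viterbi decoding labels the state at each observed row; it does not by itself forecast a future switch, so it answers a slightly different question than the windowed-feature approach.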

Any ideas on modeling the above problem will be appreciated.

Are you trying to predict binary outcomes for every second? If not, this post might be misleading. – user61762 – 2018-10-31T10:16:22.907