I have been racking my brain over this for a while and thought maybe someone here would know of a package or algorithm to handle the following:
I have a nominal multivariate time series that looks like the following:
    Time  Var1  Var2  Var3  Var4  Var5  ...  VarN
    0     A     A     B     C     A     ...  H
    1     A     A     B     D     D     ...  H
    2     B     A     C     D     D     ...  H
    ..
And so on from times 0 to 1,000,000. What I would like to do is search the time series for rules of the type:
Given that Var3 is in state B in the previous step and Var5 is in state D in the previous step, then Var1 will be in state B. I want the rules to include the time interval explicitly. A simpler case of interest would be to reduce the time series to
    Time  Var1  Var2  Var3  Var4  Var5  ...  VarN
    0     0     0     0     0     0     ...  0
    1     0     0     0     1     1     ...  0
    2     1     0     1     0     0     ...  0
Where the variable is 1 if its state is different from the previous step and 0 otherwise. Then I just want to have rules that say something like:
If Var4 and Var5 changed in the previous step, then Var1 will change in the current step. This would be easy for a lag of one, as I could just make the data into something like:
Var1 Var2 Var3 Var4 Var5 ... VarN Var1_t-1 Var2_t-1 Var3_t-1 ...
and then do sequence mining, but if I want rules that aren't limited to a single lag and could use any lag from 1 to 500, then my data set becomes difficult to work with.
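To make the two transformations above concrete, here is a rough sketch assuming the data sits in a pandas DataFrame with one row per time step. The helper names `change_indicators` and `add_lags` are made up for illustration, and the toy frame only holds Var1 to Var5 from the three example rows.

```python
import pandas as pd

# Toy data: the first three time steps of Var1..Var5 from the example above.
states = pd.DataFrame(
    {
        "Var1": ["A", "A", "B"],
        "Var2": ["A", "A", "A"],
        "Var3": ["B", "B", "C"],
        "Var4": ["C", "D", "D"],
        "Var5": ["A", "D", "D"],
    }
)


def change_indicators(df):
    """Hypothetical helper: 1 where a value differs from the previous step, else 0."""
    changed = df.ne(df.shift(1)).astype(int)
    changed.iloc[0] = 0  # the first step has no predecessor to compare with
    return changed


def add_lags(df, lags):
    """Hypothetical helper: append <column>_t-<lag> copies of every column."""
    parts = [df] + [df.shift(lag).add_suffix(f"_t-{lag}") for lag in lags]
    # Drop the leading rows that do not have a complete history.
    return pd.concat(parts, axis=1).iloc[max(lags):]


changed = change_indicators(states)
wide = add_lags(changed, lags=[1])
print(wide)
```

With N variables, calling `add_lags(changed, lags=range(1, 501))` would produce 501 * N columns, which is exactly the blow-up I am trying to avoid.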
Any help would be greatly appreciated.
Edit to respond to comment: Each column can be in one of 7 different states. As for a target, there is no specific one; any rules between the columns would be of interest. However, predicting columns 30-40 and 62-75 would be particularly interesting.