I am a little bit confused.
When using mini-batches, it is generally a good idea to shuffle the training examples. This will not work if the training examples depend on each other. For example, take voltage measurements recorded every 5 minutes, where the algorithm should classify each 5-minute increment as a 1 or a 0, and information about the previous or next measurement is required to classify the current measurement correctly (e.g. the prediction for the previous training example, the difference between the previous and current voltage measurement, the difference between the current and next voltage measurement, etc.).
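To make the dependency concrete, here is a minimal sketch of the kind of neighbor-based features I mean. The voltage values and the name `neighbor_features` are made up for illustration. It only covers the two difference features, not the prediction for the previous example.

```python
# Hypothetical voltage readings, one per 5-minute interval (made-up values).
voltages = [230.1, 229.8, 231.4, 228.9, 230.5]


def neighbor_features(series):
    """Build one feature row per interior measurement.

    Each row holds the current value plus the differences to the
    previous and next measurement, so a row depends on its neighbors.
    The first and last measurements are skipped because they lack a
    previous or next neighbor.
    """
    rows = []
    for t in range(1, len(series) - 1):
        current = series[t]
        diff_prev = current - series[t - 1]  # current minus previous
        diff_next = series[t + 1] - current  # next minus current
        rows.append((current, diff_prev, diff_next))
    return rows


features = neighbor_features(voltages)
for row in features:
    print(row)
```

Each row can only be computed while the measurements are still in their original time order.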
So, is it not recommended to use mini-batches at all for this kind of data? Or am I missing something?
Also, would this be possible with a typical logistic regression or neural network, or would it require something like an RNN (e.g. an LSTM, long short-term memory)?