Detecting off state in the magnitude of accelerometer data?

4

1

I have a univariate time series signal. It's the magnitude of an accelerometer attached to an engine.

I need an algorithm to detect the off state (marked by the black lines in the image below). The rest of the signal consists of idle and active states. The idle state sits slightly higher than off, and the active state tends to have large spikes and a generally higher mean than both idle and off.

This would be simple for a single machine, but mean values differ between machines depending on the size of the engine and the proximity of the sensor to the vibration source, so a fixed threshold will not work.

I have considered the K-Means algorithm. It worked pretty well when all three states were present. When one or two are absent, the results degrade significantly, since it still tries to find classes that don't exist in the data.

I have tried Hidden Markov Models. They looked promising, but they learn the distribution of each state, which again changes from one machine to another.

I thought of using standardisation, but I'm hesitant since the mean value of the off state will shift accordingly.

What unsupervised or semi-supervised approaches do you recommend for detecting off/idle/on states?

[Image: accelerometer magnitude over time, with the off-state periods marked by black lines]

M-T-A

Posted 2020-02-27T15:53:41.193

Reputation: 161

Answers

2

This seems pretty straightforward thanks to your "well-behaved" data.

Simplest

The naive approach is to look at the histogram. As you can see, the OFF state generally has a low magnitude. Knowing that magnitude level from the labeled data you have above, you can threshold the histogram.
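A minimal sketch of that thresholding idea, assuming a short stretch of samples has already been labeled OFF for the machine in question. The synthetic signal, the function name `off_threshold` and the choice of the 99th percentile are all illustrative assumptions, not values taken from your data:

```python
import numpy as np

def off_threshold(magnitude, off_mask, percentile=99.0):
    """Derive a per-machine threshold from samples already labeled OFF."""
    return float(np.percentile(np.asarray(magnitude)[off_mask], percentile))

# Synthetic stand-in for one machine: OFF, then idle, then active.
rng = np.random.default_rng(0)
off = np.abs(rng.normal(0.02, 0.005, 500))
idle = rng.normal(0.30, 0.03, 500)
active = np.abs(rng.normal(1.00, 0.30, 500))
signal = np.concatenate([off, idle, active])

# Assume only the first 200 samples were labeled OFF by hand.
labelled_off = np.zeros(signal.size, dtype=bool)
labelled_off[:200] = True

threshold = off_threshold(signal, labelled_off)
is_off = signal <= threshold  # boolean mask, True where the sample looks OFF
```

Because the threshold is derived from each machine's own OFF samples, it adapts to the machine, but it still looks at one sample at a time.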

Why is it naive? Because you get false positives (see the plot between 4 and 6 PM: there is a short state that looks like OFF but is not) and false negatives (immediately after 9 PM there is a peak during the OFF state whose magnitude goes far beyond the OFF level).

What improves this naive approach?

  • Taking the history into account (learning the sequential behavior)
  • Extracting more sophisticated features (e.g. in frequency domain using FFT or time-frequency domain using STFT, wavelet, etc.)
  • Applying an anomaly detection stage, which helps you decide correctly about the peak at 9 PM (I assume that is pretty much an anomaly)
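As one very simple way to take history into account (the first point only), here is a sketch that smooths a per-sample OFF mask with a centered majority vote, so that isolated short runs are overruled by their neighbours. The helper name `majority_filter` and the window length are assumptions for illustration; a sequence model would be the more thorough option:

```python
import numpy as np

def majority_filter(mask, window):
    """Centered majority vote over an odd-length window of a boolean mask.

    Samples beyond the edges are treated as False (zero padding).
    """
    if window % 2 == 0:
        raise ValueError("window must be odd")
    counts = np.convolve(np.asarray(mask, dtype=int),
                         np.ones(window, dtype=int), mode="same")
    return counts > window // 2

# Hypothetical raw mask: a real OFF period with a brief spike inside it,
# plus a brief OFF-like dip elsewhere.
raw = np.zeros(300, dtype=bool)
raw[100:200] = True    # real OFF period
raw[150:153] = False   # short spike during OFF (false negative)
raw[30:33] = True      # short dip outside OFF (false positive)

smoothed = majority_filter(raw, window=11)
```

The window sets the shortest state duration you are willing to believe, so it should be chosen from how long a real OFF period lasts at your sampling rate.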

A Bit More Complicated

Extract time-frequency features (STFT, wavelet, etc.). If you have labeled data, feed them to a classifier; if you don't, use those features for some exploratory data analysis (EDA) to see whether clustering is possible. Judging by your data, I'm pretty confident you will get good results.
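A rough sketch of the feature extraction step, using a hand-rolled short-time Fourier transform in NumPy. The sampling rate, frame length, hop size and the two features chosen here (log energy and spectral centroid) are assumptions for illustration; the synthetic signal only stands in for real accelerometer data:

```python
import numpy as np

def stft_features(x, fs, frame_len=256, hop=128):
    """Per-frame log energy and spectral centroid from a Hann-windowed STFT."""
    x = np.asarray(x, dtype=float)
    n_frames = 1 + (x.size - frame_len) // hop
    # Index matrix: one row of sample indices per frame.
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = x[idx] * np.hanning(frame_len)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / fs)
    energy = power.sum(axis=1)
    centroid = (power * freqs).sum(axis=1) / (energy + 1e-12)
    return np.column_stack([np.log10(energy + 1e-12), centroid])

# Synthetic example: a quiet segment followed by a vibrating segment.
fs = 1000.0  # assumed sampling rate in Hz
rng = np.random.default_rng(1)
t = np.arange(4096) / fs
quiet = 0.01 * rng.normal(size=4096)
loud = np.sin(2 * np.pi * 50.0 * t) + 0.1 * rng.normal(size=4096)
feats = stft_features(np.concatenate([quiet, loud]), fs)
```

Each row of `feats` describes one frame, so it can be passed to a classifier when labels exist, or plotted and clustered when they do not.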

Much More Complicated

I suppose Recurrence Analysis would be a wonderful option (and pretty fun for you!), especially if the dynamics of the signal are non-stationary (as many signals are in practice). The methods explained in the Wikipedia page or in this page are not complicated.
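To give a flavour of what this involves, here is a minimal sketch of a recurrence matrix built from a time-delay embedding, together with the recurrence rate (the fraction of recurrent point pairs). The embedding dimension, delay and distance threshold `eps` are illustrative assumptions that would need tuning on real data:

```python
import numpy as np

def recurrence_matrix(x, dim=3, delay=1, eps=0.1):
    """Boolean recurrence matrix of a time-delay embedded signal."""
    x = np.asarray(x, dtype=float)
    n = x.size - (dim - 1) * delay
    # Each row is one embedded state vector (x[t], x[t+delay], ...).
    emb = np.column_stack([x[i * delay : i * delay + n] for i in range(dim)])
    dist = np.linalg.norm(emb[:, None, :] - emb[None, :, :], axis=2)
    return dist <= eps

rng = np.random.default_rng(2)
quiet = 0.01 * rng.normal(size=200)   # stand-in for an OFF-like window
active = 1.0 * rng.normal(size=200)   # stand-in for an active window

rr_quiet = recurrence_matrix(quiet).mean()    # recurrence rate
rr_active = recurrence_matrix(active).mean()
```

Computed over sliding windows, a measure such as the recurrence rate becomes one more feature per window. Note that the full distance matrix grows quadratically with window length, so windows should stay short.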

Good Luck!

Kasra Manshaei

Posted 2020-02-27T15:53:41.193

Reputation: 5 323