Label construction for predictive maintenance


I am working on a binary classification problem for predictive maintenance. The question I want to answer is: "What is the probability that the asset will fail in the next X units of time?" There is a guide from Microsoft that explains the approach quite well, including the labeling process. However, it only illustrates the labeling for a single failure event: [figure: labeling around a single failure event]

Label 1 means "about to fail", label 0 means "normal operation".
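
For reference, the single-failure labeling can be sketched in pandas as below. This is only an illustration under assumptions: the function name `label_single_failure`, the daily sampling, and the choice to label the record at the failure time itself as 1 are not taken from the guide.

```python
import pandas as pd

def label_single_failure(timestamps, failure_time, horizon):
    """Return 1 for records at most `horizon` before `failure_time`, else 0."""
    ts = pd.Series(pd.to_datetime(timestamps))
    time_to_failure = pd.Timestamp(failure_time) - ts
    # Records after the failure have a negative time to failure and get 0.
    in_window = (time_to_failure >= pd.Timedelta(0)) & (time_to_failure <= horizon)
    return in_window.astype(int)

# Hypothetical data: ten daily records, one failure on the last day, X = 3 days.
records = pd.date_range("2019-01-01", periods=10, freq="D")
labels = label_single_failure(records, "2019-01-10", pd.Timedelta(days=3))
print(labels.tolist())
```

Here `horizon` plays the role of X in the question above.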

My question is: how should I proceed when a failure spans multiple records, i.e. when a single failure covers multiple black dots in the figure?

I thought of discarding data points so that only a single black point remains per failure. But I have doubts about cases where I compute aggregated features over, for example, 2 months and there are multiple failures within those 2 months. In that case I would remove some data that might be relevant for the second failure.

UPD: To clarify the situation with multiple failures, the scenario is as follows: the asset operates normally, then fails at some point. It continues to report data while it is in the failure state. Then the asset is repaired and operates normally until the next failure. Each failure has a timestamp indicating when it happened and a duration (i.e. the time until the asset is back in normal operation). For simplicity, all failures can be treated as the same type, i.e. assigned to the same class.
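
One possible way to label this scenario, sketched with pandas and not a definitive answer: keep all records, label a record 1 if the next failure starts within X, and mark records reported during a failure state as missing so they can be excluded from training while remaining available for aggregated features. The column names (`timestamp`, `start`, `duration`), the function `label_records`, and the assumption that failure intervals do not overlap are all hypothetical.

```python
import pandas as pd

def label_records(records, failures, horizon):
    """Label records: 1 = a failure starts within `horizon`, 0 = normal,
    <NA> = the asset is in a failure state at that record."""
    records = records.sort_values("timestamp").reset_index(drop=True)
    failures = failures.sort_values("start").reset_index(drop=True)
    failures = failures.assign(end=failures["start"] + failures["duration"])

    # Next failure starting at or after each record.
    nxt = pd.merge_asof(records, failures[["start"]], left_on="timestamp",
                        right_on="start", direction="forward")
    # Most recent failure starting at or before each record.
    prev = pd.merge_asof(records, failures[["start", "end"]], left_on="timestamp",
                         right_on="start", direction="backward")

    time_to_next = nxt["start"] - records["timestamp"]
    # Comparisons against NaT (no upcoming failure) evaluate to False, i.e. label 0.
    label = (time_to_next <= horizon).astype("Int64")
    in_failure = (records["timestamp"] >= prev["start"]) & (records["timestamp"] < prev["end"])
    label[in_failure] = pd.NA
    return records.assign(label=label)

# Hypothetical data: 20 daily records, two failures lasting 2 days and 1 day.
records = pd.DataFrame({"timestamp": pd.to_datetime(
    [f"2019-01-{day:02d}" for day in range(1, 21)])})
failures = pd.DataFrame({
    "start": pd.to_datetime(["2019-01-08", "2019-01-16"]),
    "duration": pd.to_timedelta([2, 1], unit="D"),
})
labeled = label_records(records, failures, pd.Timedelta(days=3))
print(labeled)
```

With this layout the failure-state records are masked rather than deleted, so a 2-month rolling aggregate computed before the second failure can still see what happened around the first one. Whether that history should feed the features is a modeling choice that this sketch does not settle.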


Posted 2019-01-02T17:13:28.207


By multiple failures, do you mean that the machine or asset fails, is repaired and operates normally, and then fails again? Or does the asset fail, continue to work, and subsequently fail again? Are they the same type of failure (i.e., could you model them as different failure types, or as failure #1, #2, and so on)? – Alex L – 2019-01-02T21:10:47.777

No answers