I am working on a binary classification problem related to predictive maintenance. The question I want to answer is "What is the probability that the asset will fail in the next X units of time?" A guide from Microsoft describes this approach quite well, including the labeling process. However, it only shows the labeling for a single failure event:
Label 1 means "about to fail", label 0 means "normal operation".
My question is: how should I proceed when a failure spans multiple records, i.e. when the failure point covers multiple black dots?
I thought of discarding data points and leaving just a single black point per failure. But I have doubts about cases where I calculate, for example, features aggregated over 2 months and there are multiple failures within those 2 months. In that case I would remove some data that might be relevant for the second failure.
UPD: To clarify the situation with multiple failures, the scenario is as follows: the asset operates normally, then fails at some point. It continues to report data while it is in the failure state. Then the asset is repaired and operates normally until the next failure. Each failure has a timestamp indicating when it happened and a duration (i.e. the time until the asset is back in normal operation). For simplicity, I can say that all failures are of the same type, i.e. assigned to the same class.
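To make the scenario concrete, here is a minimal sketch of how I picture the data and the labels. It assumes pandas, and the column names (`timestamp`, `start`, `duration`), the helper `label_records`, and the example dates are all hypothetical. Records within the horizon X before each failure get label 1. Records reported while the asset is in the failure state are only flagged via `in_failure`, because how to treat them is exactly what I am unsure about.

```python
import pandas as pd

def label_records(records, failures, horizon):
    """Label records for 'will the asset fail within `horizon`?'.

    records:  DataFrame with a 'timestamp' column (hypothetical name).
    failures: DataFrame with 'start' (Timestamp) and 'duration' (Timedelta).
    horizon:  Timedelta, the X units of time from the question.
    """
    out = records.sort_values("timestamp").reset_index(drop=True).copy()
    out["label"] = 0            # 0 = normal operation
    out["in_failure"] = False   # reported while the asset was down
    ts = out["timestamp"]
    for start, duration in zip(failures["start"], failures["duration"]):
        end = start + duration
        # Records in the window [start - horizon, start) are "about to fail".
        about_to_fail = (ts >= start - horizon) & (ts < start)
        # Records in [start, end) were reported during the failure itself.
        during_failure = (ts >= start) & (ts < end)
        out.loc[about_to_fail, "label"] = 1
        out.loc[during_failure, "in_failure"] = True
    return out

# Made-up example: daily records, two failures of the same type.
records = pd.DataFrame(
    {"timestamp": pd.date_range("2020-01-01", periods=20, freq="D")}
)
failures = pd.DataFrame(
    {
        "start": pd.to_datetime(["2020-01-05", "2020-01-12"]),
        "duration": pd.to_timedelta([2, 3], unit="D"),
    }
)
labeled = label_records(records, failures, horizon=pd.Timedelta(days=2))
print(labeled[["timestamp", "label", "in_failure"]])
```

With this layout, the open question becomes what to do with the rows where `in_failure` is true: drop them, keep a single one per failure, or give them their own label.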