I want to detect outliers in power consumption in real time, at an hourly rate, i.e., at the end of each hour I should be able to say whether the power consumption in that hour was anomalous or not.
Approach: So far, I have completed the following steps:
- Say I want to determine whether the power usage between 9 AM and 10 AM was anomalous. For this, I first collect the usage during the same time interval on each of the past n days, then I compute the mean/median of those previous usages.
- Now I have the current day's usage and the mean/median usage of the previous n days. Which statistical measure should I use to decide whether the current day's usage was anomalous or not?
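To make the two steps above concrete, here is a minimal sketch of how the baseline could be computed. The data layout (a dict keyed by the start of each hour), the function name `hourly_baseline`, and all readings are made up for illustration and are not my real data.

```python
from datetime import datetime, timedelta
from statistics import mean, median


def hourly_baseline(usage, day, hour, n_days=10):
    """Return (mean, median) of usage in the same hour over the previous n_days.

    usage: dict mapping datetime (start of the hour) -> consumption in that hour.
           This layout is an assumption made for the sketch.
    """
    target = datetime(day.year, day.month, day.day, hour)
    # Collect the same hourly slot from each of the previous n_days, skipping gaps.
    history = [
        usage[target - timedelta(days=k)]
        for k in range(1, n_days + 1)
        if (target - timedelta(days=k)) in usage
    ]
    if not history:
        raise ValueError("no past readings for this hour")
    return mean(history), median(history)


# Made-up readings for the 09:00-10:00 slot: 10 past days plus the test day.
test_day = datetime(2024, 1, 11)
past = [1.0, 1.2, 0.9, 1.1, 1.0, 1.3, 0.8, 1.0, 1.1, 1.6]
usage = {datetime(2024, 1, 11 - k, 9): v for k, v in enumerate(past, start=1)}
usage[datetime(2024, 1, 11, 9)] = 2.4  # current day's usage (made up)

avg, med = hourly_baseline(usage, test_day, 9)
# Raw difference between the current usage and the historical median.
deviation = usage[datetime(2024, 1, 11, 9)] - med
```

This only produces the raw difference between the current usage and the historical mean/median; how to turn that difference into an anomalous/not-anomalous decision is the open part.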
Using the above approach for the 24 hours of a specific (test) day, with the past 10 days of consumption as history, I obtained the following results:
From visual inspection, I can say that the usage between 07:10 - 08:00 and between 22:10 - 23:00 is anomalous, as there is a big difference between the actual usage and the mean/median of the previous usages. What I don't know is which statistical measure I should use to flag such anomalous instances automatically within the approach described above.