I have a problem for which I have not been able to find any answers so far.
I am working on an anomaly detection problem for machines, using an auto-encoder. I am building one model file per machine because the machines' temporal behaviour varies quite a lot.
I have 5 features:
- A numerical integer feature ranging between 0 and x (x varies per machine)
- The other 4 features are categorical (after trying label encoding, my architecture performs better with one-hot encoding)
I have tried scaling the numerical feature (with both MinMaxScaler and StandardScaler), which did not yield good results at all.
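For context, my per-machine preprocessing looks roughly like this; the column names below are placeholders, not my real feature names:

```python
# Rough sketch of the per-machine preprocessing; column names are placeholders.
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

numeric_cols = ["counter"]                        # the 0..x integer feature
categorical_cols = ["cat_a", "cat_b", "cat_c", "cat_d"]

preprocessor = ColumnTransformer(
    transformers=[
        ("num", MinMaxScaler(), numeric_cols),    # also tried StandardScaler here
        ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
    ]
)

# Fitted separately for every machine, since x differs per machine:
# X_train = preprocessor.fit_transform(machine_df)
```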
As an alternative to scaling the inputs, I decided to scale the outputs using MinMaxScaler from scikit-learn. This is so that I can have one generic threshold I can apply across the different models to identify anomalies.
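Simplified, this is what I am doing per model; here I treat the "output" as the per-sample reconstruction error, and `model` and `X` are placeholders:

```python
# Simplified version of the output scaling; model and X are placeholders.
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# errors = np.mean((X - model.predict(X)) ** 2, axis=1)   # per-sample reconstruction error
errors = np.array([0.02, 0.05, 0.04, 0.90, 0.03])         # toy values

scaled = MinMaxScaler().fit_transform(errors.reshape(-1, 1)).ravel()

THRESHOLD = 0.5            # the one generic threshold shared by all models
anomalies = scaled > THRESHOLD
```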
Although this has yielded the best results so far, in practice the scaled outputs become too polarised towards either 0 or 1, and consequently I am missing outliers that I shouldn't be. I suspect this is because MinMaxScaler is driven entirely by the observed minimum and maximum, so a few extreme values squash everything else towards 0.
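A toy example of the polarisation I am seeing, where a single extreme error squashes the rest of the distribution:

```python
# Toy illustration: one extreme error squashes the rest towards 0.
import numpy as np
from sklearn.preprocessing import MinMaxScaler

errors = np.array([[0.01], [0.02], [0.03], [0.05], [10.0]])
print(MinMaxScaler().fit_transform(errors).ravel())
# -> [0.     0.001  0.002  0.004  1.   ]  (approximately)
```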
What scaling technique(s) can I use on the output of my auto-encoder such that I can apply one generic threshold across all models to identify anomalies?