I am working on an LSTM autoencoder in Keras. The aim is to obtain a latent space representation for the time sequences, which I intend to use for clustering.
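A minimal sketch of the kind of architecture I mean (layer sizes and names are illustrative, not my exact model; sequences are assumed zero-padded to a common length):

```python
import numpy as np
from tensorflow.keras import layers, models

TIMESTEPS, FEATURES, LATENT = 200, 4, 8  # illustrative sizes

inputs = layers.Input(shape=(TIMESTEPS, FEATURES))
masked = layers.Masking(mask_value=0.0)(inputs)       # skip zero-padded steps
latent = layers.LSTM(LATENT, name="encoder")(masked)  # latent representation
repeated = layers.RepeatVector(TIMESTEPS)(latent)
decoded = layers.LSTM(FEATURES, return_sequences=True)(repeated)

autoencoder = models.Model(inputs, decoded)
encoder = models.Model(inputs, latent)  # this is what I use for clustering
autoencoder.compile(optimizer="adam", loss="mse")
```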
Each feature of my input sequences has very low variance across the dataset. The input before normalization looks something like this:
This is one of the sequences; it has 4 features (the columns) and variable length (in this case 11 rows).
The other sequences range from 11 to 200 in length; the number of features, of course, stays constant. After normalization over the entire feature space (normalizing each feature individually), these subtle differences between input sequences become even smaller, and I think the autoencoder treats them as noise and does not learn them (i.e. it behaves like a denoising autoencoder).
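For reference, the per-feature normalization I'm doing looks roughly like this (the random arrays and the use of scikit-learn's MinMaxScaler are just stand-ins for illustration):

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Stand-ins for my variable-length sequences: each one is (length, 4).
rng = np.random.default_rng(0)
sequences = [rng.random((n, 4)) for n in (11, 50, 200)]

# Stack all timesteps so each feature (column) is scaled over the whole
# dataset, then apply the same per-feature scaling back to every sequence.
stacked = np.vstack(sequences)  # shape: (total_timesteps, 4)
scaler = MinMaxScaler().fit(stacked)
scaled = [scaler.transform(s) for s in sequences]
```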
Any thoughts on how I can scale the data better? Should I make any changes to how I am treating the problem statement?
There is no problem with the code itself, as I was able to generate very good latent representations on a toy dataset whose features were more evenly spaced out.
I have tried standardization (z-score: subtracting the mean and dividing by the standard deviation), but the problem still persists.
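Concretely, the standardization I tried looks like this (again, the random arrays are just stand-ins for my data; mean and std are computed per feature over all timesteps of all sequences):

```python
import numpy as np

# Stand-ins for my variable-length sequences: each one is (length, 4).
rng = np.random.default_rng(1)
sequences = [rng.random((n, 4)) for n in (11, 50, 200)]

# Per-feature z-score over the whole dataset.
stacked = np.vstack(sequences)
mu, sigma = stacked.mean(axis=0), stacked.std(axis=0)
standardized = [(s - mu) / sigma for s in sequences]
```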