Feature extraction using autoencoder and assigning sub-features to the classes


I have a dataset with $N$ records and $d$ numerical attributes belonging to $C$ different classes. I use a stacked autoencoder for feature extraction for a classification task: it takes an input vector $x \in \mathbb{R}^d$ and maps it to a hidden representation $y \in \mathbb{R}^{d'}$ ($d' < d$).

My question is: if we take the outputs of the middlemost layer (i.e. $y \in \mathbb{R}^{d'}$) as our feature set, then for each class of data ($c_i \in C$), how can we determine which subset of these features has the greatest impact on identifying the class $c_i$?

As an example: for MNIST we have $N=60000$, $d=784$, and $C=10$. An autoencoder with this architecture:

from keras.layers import Input, Dense
from keras.models import Model

inp = Input(shape=(d,))
x = Dense(d, activation='relu')(inp)
x = Dense(d//2, activation='relu')(x)
x = Dense(d//8, activation='relu')(x)
y = Dense(d//128, activation='relu')(x)   # bottleneck layer
x = Dense(d//8, activation='relu')(y)
x = Dense(d//2, activation='relu')(x)
out = Dense(d, activation='sigmoid')(x)
model = Model(inp, out)

produces a $y \in \mathbb{R}^{6}$ (since $784 // 128 = 6$). For example, here are the outputs of layer y for the digits 5 and 9:

  class  y_1    y_2     y_3     y_4     y_5    y_6
  9      1.09   9.59    1.58    8.47    1.14   7.25
  9      2.13   1.34    4.00    8.59    1.53   1.36
  5      7.19   7.52    4.58    5.04    1.09   5.35
  5      9.80   1.55    1.46    5.06    6.49   3.51

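For concreteness, comparing the per-class mean activation of each bottleneck feature is one simple way to see which features are consistently active for a class. A NumPy sketch on hypothetical, already-extracted activations (the arrays `y_feats` and `labels` stand in for the real outputs of layer y and the digit labels):

```python
import numpy as np

# Hypothetical bottleneck activations for N=100 samples with d'=6 features,
# as if extracted from layer y of the autoencoder above.
rng = np.random.default_rng(0)
y_feats = rng.random((100, 6)) * 10
labels = np.repeat(np.arange(10), 10)  # 10 samples per class, classes 0..9

# Mean activation of each bottleneck feature, computed per class:
# a large mean for feature j within class c suggests feature j is
# consistently active for that class.
classes = np.unique(labels)
mean_per_class = np.stack([y_feats[labels == c].mean(axis=0) for c in classes])
print(mean_per_class.shape)  # (10, 6): one row of feature means per class
```

This only summarizes activations; it does not by itself establish which features are discriminative, which is what the question is really after.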
If we connect layer y to a softmax dense layer:

out = Dense(num_classes, activation='softmax')(y)
encoded = Model(inp, out)

the new model encoded gives us good accuracy, about 98%. So, if we consider the new representation y as an efficiently extracted feature vector, is it reasonable to assign a subset of this feature vector to each digit? Or, what is the right way to relate the vector y to the C different classes?
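One idea I have considered (an assumption on my part, not something established above) is to inspect the weights of the softmax layer: for class c, the entries of the weight column W[:, c] with the largest magnitude are the bottleneck features that contribute most to that class's logit. A NumPy sketch with a hypothetical weight matrix (in Keras this would come from something like `dense_layer.get_weights()[0]`):

```python
import numpy as np

# Hypothetical softmax-layer weight matrix of shape (d', C) = (6, 10),
# standing in for the kernel of the final Dense(num_classes) layer.
rng = np.random.default_rng(1)
W = rng.normal(size=(6, 10))

# For each class, rank the bottleneck features by absolute weight:
# the top entries are the features with the greatest impact on
# that class's logit (and hence on its softmax probability).
top_k = 3
for c in range(W.shape[1]):
    ranked = np.argsort(-np.abs(W[:, c]))[:top_k]
    print(f"class {c}: most influential features {ranked.tolist()}")
```

Would this kind of weight inspection be a sound way to assign feature subsets to classes, or is a different attribution method more appropriate here?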


Posted 2017-09-11T07:42:50.650


No answers