
I have a dataset with `N` records and `D` numerical attributes belonging to `C` different classes. I use a stacked autoencoder for feature extraction for a *classification* task: it takes an input vector $x \in \mathbb{R}^d$ and maps it to a hidden representation $y \in \mathbb{R}^{d'}$ ($d' < d$).

My question is: if we take the outputs of the middlemost layer (i.e. $y \in \mathbb{R}^{d'}$) as our feature set, then for each class $c_i \in C$, how can we determine which subset of these features has the greatest impact on identifying the class $c_i$?
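One direction I can imagine (this is just a sketch, not something I have validated) is permutation importance: shuffle one component of the extracted features at a time and see how much classification accuracy drops. Below is a self-contained toy version in numpy; the data is synthetic (feature $j$ is artificially made informative for class $j$, purely for the demo) and a nearest-centroid classifier stands in for the real softmax head:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in data: 200 samples, 6 extracted features, 3 classes.
# For the demo, feature j is injected with signal for class j.
n, d_y, n_classes = 200, 6, 3
labels = rng.integers(0, n_classes, size=n)
Y = rng.normal(size=(n, d_y))
Y[np.arange(n), labels] += 3.0  # class signal lives in one feature per class

def nearest_centroid_acc(Y, labels):
    """Accuracy of a nearest-centroid classifier (simple stand-in model)."""
    centroids = np.stack([Y[labels == c].mean(axis=0) for c in range(n_classes)])
    pred = np.argmin(((Y[:, None, :] - centroids) ** 2).sum(-1), axis=1)
    return (pred == labels).mean()

base = nearest_centroid_acc(Y, labels)

# Permutation importance: shuffle one feature at a time, record accuracy drop.
drops = []
for j in range(d_y):
    Y_perm = Y.copy()
    Y_perm[:, j] = rng.permutation(Y_perm[:, j])
    drops.append(base - nearest_centroid_acc(Y_perm, labels))
print(np.round(drops, 3))
```

Restricting the accuracy computation to samples of class $c_i$ would give a per-class version of the same idea, but I am not sure this is the principled way to do it.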

As an example: for MNIST we have `N=60000`, `D=784`, and `C=10`. An autoencoder with this architecture:

```
d = 784
inp = Input(shape=(d,))
x = Dense(d, activation='relu')(inp)
x = Dense(d // 2, activation='relu')(x)
x = Dense(d // 8, activation='relu')(x)
y = Dense(d // 128, activation='relu')(x)  # bottleneck: 784 // 128 = 6
x = Dense(d // 8, activation='relu')(y)
x = Dense(d // 2, activation='relu')(x)
x = Dense(d, activation='sigmoid')(x)
model = Model(inp, x)
```

produces a $y \in \mathbb{R}^{6}$. For example, here we see the outputs of layer `y` for digits `5` and `9`:

```
class  y_1   y_2   y_3   y_4   y_5   y_6
9      1.09  9.59  1.58  8.47  1.14  7.25
9      2.13  1.34  4.00  8.59  1.53  1.36
5      7.19  7.52  4.58  5.04  1.09  5.35
5      9.80  1.55  1.46  5.06  6.49  3.51
```
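For a quick sanity check on rows like these, one can compare the per-class mean activation of each `y` component. Here is a minimal numpy sketch using only the four sample rows above; a real analysis would of course average over all encoded samples of each class:

```python
import numpy as np

# Sample activations of layer y from the table above (rows: samples, cols: y_1..y_6)
y_act = np.array([
    [1.09, 9.59, 1.58, 8.47, 1.14, 7.25],  # digit 9
    [2.13, 1.34, 4.00, 8.59, 1.53, 1.36],  # digit 9
    [7.19, 7.52, 4.58, 5.04, 1.09, 5.35],  # digit 5
    [9.80, 1.55, 1.46, 5.06, 6.49, 3.51],  # digit 5
])
labels = np.array([9, 9, 5, 5])

# Mean activation of each y component per class
for c in (5, 9):
    mean_act = y_act[labels == c].mean(axis=0)
    print(c, np.round(mean_act, 2))
```

Components whose mean is consistently high for one class and low for the others would be natural candidates for a class-specific feature subset, though mean activation alone ignores variance and correlations between components.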

If we connect layer `y` to a `softmax` dense layer:

```
out = Dense(num_classes, activation='softmax')(y)
encoded = Model(inp, out)
```

the new model `encoded` gives us good accuracy, about `98%`. So, if we consider the new representation, `y`, as an efficiently extracted feature vector, is it reasonable to assign a subset of this feature vector to each digit? Or, what is the right way to relate the vector `y` to the `C` different classes?
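One simple starting point I have considered (again, just a sketch) is to inspect the kernel of the trained `softmax` layer: in Keras it has shape `(6, num_classes)`, and column $c$ weights each component of $y$ in the logit for class $c$. The snippet below uses a random placeholder matrix instead of real trained weights; with the model above, the actual kernel would come from `encoded.layers[-1].get_weights()[0]`:

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder for the trained softmax kernel of shape (6, num_classes).
# Real weights: W = encoded.layers[-1].get_weights()[0]
W = rng.normal(size=(6, 10))

# For each class c, rank the y components by |W[:, c]|: the
# largest-magnitude weights are the features the softmax leans on
# most when scoring that class.
for c in range(10):
    ranking = np.argsort(-np.abs(W[:, c]))
    print(f"class {c}: y components by influence -> {list(ranking + 1)}")
```

Since `y` comes out of a ReLU (all components are nonnegative), weight magnitude is only a rough proxy for influence, so I suspect something like a permutation test on real data would be a sounder way to answer this.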