Suppose I have a set of time-domain signals with absolutely **no labels**. I want to cluster them into 2 or 3 classes. Autoencoders are unsupervised networks that learn to compress the inputs. So given an input $x^{(i)}$, weights $W_1$ and $W_2$, biases $b_1$ and $b_2$, and output $\hat{x}^{(i)}$, we have the following relationships:

$$z^{(i)} = W_1 x^{(i)} + b_1$$ $$\hat{x}^{(i)} = W_2 z^{(i)} + b_2$$

So $z^{(i)}$ would be a compressed form of $x^{(i)}$, and $\hat{x}^{(i)}$ the reconstruction of the latter. So far so good.
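To make the shapes concrete, here is a minimal numpy sketch of those two equations with untrained random weights; the sizes ($N = 128$, $M = 16$) are illustrative, not from any real dataset:

```python
import numpy as np

rng = np.random.default_rng(0)

N, M = 128, 16              # signal length and (smaller) code size; values are illustrative
x = rng.standard_normal(N)  # stand-in for one time-domain signal x^(i)

# Untrained weights, just to show the shapes involved
W1, b1 = rng.standard_normal((M, N)), np.zeros(M)
W2, b2 = rng.standard_normal((N, M)), np.zeros(N)

z = W1 @ x + b1             # compressed code z^(i), shape (M,)
x_hat = W2 @ z + b2         # reconstruction x_hat^(i), shape (N,)

print(z.shape, x_hat.shape)  # (16,) (128,)
```

Training would then adjust $W_1, b_1, W_2, b_2$ to minimize the reconstruction error $\|\hat{x}^{(i)} - x^{(i)}\|^2$.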

What I don't understand is how this could be used for clustering (if there is any way to do it at all). For example, in the first figure of this paper, there is a block diagram I'm not sure I understand. It uses the $z^{(i)}$ as the inputs to the feed-forward network, but there is no mention of how that network is trained. I don't know if there is something I'm missing or if the paper is incomplete. Also, this tutorial at the end shows the weights learned by the autoencoder, and they look like the kernels a CNN would learn to classify images. So... I guess the autoencoder's weights can be used somehow in a feed-forward network for classification, but I'm not sure how.

My doubts are:

- If $x^{(i)}$ is a time-domain signal of length $N$ (i.e. $x^{(i)}\in\mathbb{R}^{1\times N}$), can $z^{(i)}$ only be a vector as well? In other words, would it make sense for $z^{(i)}$ to be a *matrix* with one of its dimensions greater than $1$? I believe it would not, but I just want to check.
- Which of these quantities would be the input to a classifier? For example, if I want to use a classic MLP that has as many output units as classes I want to classify the signals into, what should I feed to this fully-connected network ($z^{(i)}$, $\hat{x}^{(i)}$, anything else)?
- How can I use the learned weights and biases in this MLP? Remember that we assumed that **absolutely no labels** are available, so it is impossible to train the network with supervision. I think the learned $W_i$ and $b_i$ should be useful somehow in the fully-connected network, but I don't see how to use them.

Observation: note that I used an MLP as an example because it is the most basic architecture, but the question applies to any other neural network that could be used to classify time-domain signals.
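One common reading of "use the autoencoder's weights in a classifier" is pretraining: the MLP's first layer is initialized with the encoder's $W_1, b_1$, and only the new output layer would still need labels to be trained. A hypothetical sketch of that wiring (the sizes, the ReLU nonlinearity, and the 3-class output head are all my own assumptions, not from the question's paper):

```python
import numpy as np

rng = np.random.default_rng(0)

N, M, K = 128, 16, 3        # signal length, code size, number of classes (illustrative)

# Pretend these were learned by the autoencoder
W1, b1 = rng.standard_normal((M, N)), np.zeros(M)

# New, randomly initialized output layer; training it would require labels
W_out, b_out = rng.standard_normal((K, M)), np.zeros(K)

def mlp(x):
    # First layer reuses the encoder's weights (ReLU added here by assumption;
    # the autoencoder in the question is purely linear)
    h = np.maximum(0, W1 @ x + b1)
    return W_out @ h + b_out  # class scores, shape (K,)

scores = mlp(rng.standard_normal(N))
print(scores.shape)  # (3,)
```

Without labels this head cannot be trained, which is exactly the issue the comments below address.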

And in the last case, how would the weights of the other layers in the MLP be learned if the data is totally unlabeled? Or would that approach (i.e. autoencoder-MLP combination) only make sense if labels are available? – Tendero – 2017-12-15T20:21:42.217

Yes, an MLP (aka feed-forward neural network) is only really used if the data is labeled. Otherwise you have no info to use to update the weights. An autoencoder is sort of a 'trick' way to use neural networks because you are trying to predict the original input and don't need labels. – tom – 2017-12-15T20:27:07.390

So the only way to use a NN to do clustering would be the method you mentioned, right? Namely, use an autoencoder and then run a standard clustering algorithm such as K-means. – Tendero – 2017-12-15T20:28:52.203

That's the only way I know. If someone else has an idea I'd be happy to hear it. You might try other algorithms besides K-means though, since there are some pretty strict assumptions associated with that particular algorithm (but still it's a good thing to try first b/c it's fast and easy). – tom – 2017-12-15T21:30:19.680
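The pipeline suggested in these comments (train an autoencoder, then cluster the codes $z^{(i)}$ with K-means) can be sketched end to end in numpy. Everything here is illustrative: the two-frequency toy signals, the latent size $M = 8$, the learning rate, and the hand-rolled K-means are all my own choices, not from the thread:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: two groups of noisy sinusoids standing in for real unlabeled signals
N = 64
t = np.linspace(0, 1, N)
X = np.vstack([np.sin(2*np.pi*3*t) + 0.1*rng.standard_normal(N) for _ in range(50)] +
              [np.sin(2*np.pi*9*t) + 0.1*rng.standard_normal(N) for _ in range(50)])

# Linear autoencoder: z = W1 x + b1, x_hat = W2 z + b2 (as in the question)
M = 8                       # latent dimension, a free choice
W1 = 0.1*rng.standard_normal((M, N)); b1 = np.zeros(M)
W2 = 0.1*rng.standard_normal((N, M)); b2 = np.zeros(N)

lr = 1e-2
for _ in range(500):        # plain gradient descent on the mean squared reconstruction error
    Z = X @ W1.T + b1                   # codes,           (samples, M)
    X_hat = Z @ W2.T + b2               # reconstructions, (samples, N)
    err = X_hat - X
    gW2 = err.T @ Z / len(X); gb2 = err.mean(0)
    gZ = err @ W2
    gW1 = gZ.T @ X / len(X); gb1 = gZ.mean(0)
    W1 -= lr*gW1; b1 -= lr*gb1; W2 -= lr*gW2; b2 -= lr*gb2

Z = X @ W1.T + b1           # the codes are what gets clustered, not X or X_hat

# Minimal K-means (k=2) on the codes
k = 2
centers = Z[rng.choice(len(Z), k, replace=False)]
for _ in range(20):
    labels = np.argmin(((Z[:, None, :] - centers[None])**2).sum(-1), axis=1)
    for j in range(k):
        if (labels == j).any():
            centers[j] = Z[labels == j].mean(0)

print(Z.shape)  # (100, 8)
```

In practice one would use a library clustering implementation (e.g. scikit-learn's `KMeans`) rather than this hand-rolled loop, and, as the comment notes, it is worth trying algorithms with weaker assumptions than K-means on the same codes.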