How can we find the input image which maximizes the class-probability for an ANN?

3

Let's assume we have an ANN which takes a vector $x\in \mathbb{R}^D$, representing an image, and classifies it over two classes. The output is a vector of probabilities $N(x)=(p(x\in C_1), p(x\in C_2))^T$, and we pick $C_1$ iff $p(x\in C_1) \geq 0.5$. Let the two classes be $C_1= \texttt{cat}$ and $C_2= \texttt{dog}$. Now imagine we want to extract this ANN's idea of an ideal cat by finding $x^* = \arg\max_x N(x)_1$. How would we proceed? I was thinking about solving $\nabla_x N(x)_1=0$, but I don't know whether this makes sense or whether it is solvable.

In short, how do I compute the input which maximizes a class-probability?

olinarr

Posted 2019-09-19T17:55:58.390

Reputation: 685

Answers

3

In deep networks there is actually a wide variety of solutions to this problem, but if you only need to find one, an easy way is to use a normal optimization scheme:
$$\hat x = \arg\min_x \ L(y,x)$$
where $L(y,x)$ is your loss function (with $y$ the target class). Since ANNs are generally differentiable, you can optimize this iteratively with some form of gradient descent:
$$x^{i+1} \leftarrow x^{i} - \lambda \nabla_{x^i}L(y,x^i)$$
where $\lambda$ is your learning rate.
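As a concrete illustration, here is a minimal sketch of that update rule. A toy one-unit logistic "network" stands in for a real ANN (the weights `w`, bias `b`, learning rate and step count are all made up for the example), and it performs gradient ascent directly on the class probability, which is equivalent to descending a suitable loss:

```python
import numpy as np

# Toy stand-in for N(x): a single frozen logistic unit p(C_1 | x) = sigmoid(w.x + b).
rng = np.random.default_rng(0)
w = rng.normal(size=4)  # hypothetical frozen "network" weights
b = 0.1

def p_cat(x):
    return 1.0 / (1.0 + np.exp(-(w @ x + b)))

def grad_p_cat(x):
    # Chain rule through the sigmoid: d sigmoid(z)/dz = p(1-p), dz/dx = w
    p = p_cat(x)
    return p * (1.0 - p) * w

x = rng.normal(size=4)           # random starting "image"
lam = 0.5                        # learning rate
for _ in range(200):
    x = x + lam * grad_p_cat(x)  # gradient ASCENT on the class probability
```

After the loop, `p_cat(x)` is pushed close to 1; for a real network, the analytic gradient would come from automatic differentiation instead.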

mshlis

Posted 2019-09-19T17:55:58.390

Reputation: 1 845

3

Probably the simplest way to search for an image with the highest probability of being a cat is to use a technique similar to Deep Dream:

  • Load the network for training, but freeze all the network weights

  • Create a random input image, and connect it to the network as a "variable" i.e. data that can be changed through training

  • Set a loss function based on maximising the pre-sigmoid value in the last layer (this is easier to handle than working with 0.999 etc probability)

  • Train using backpropagation, but instead of using the gradients to change the weights, backpropagate all the way to the input layer and use the gradients to change the input image.

  • Typically you will also want to normalise the input image between iterations.
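The steps above can be sketched in a few lines. Here the frozen "network" is reduced to its final-layer weights `w` (a hypothetical placeholder), so the pre-sigmoid value is just `w @ x` and its gradient with respect to the input is `w`; the clip at the end plays the role of the normalisation step:

```python
import numpy as np

rng = np.random.default_rng(1)
w = rng.normal(size=8)      # hypothetical frozen weights of the final layer
x = rng.uniform(size=8)     # random starting "image", pixels in [0, 1]
lam = 0.05                  # learning rate, chosen arbitrarily

def logit(x):
    # Pre-sigmoid score for the "cat" output (easier to maximise than 0.999... probabilities)
    return w @ x

for _ in range(100):
    x = x + lam * w          # gradient of w.x with respect to x is just w
    x = np.clip(x, 0.0, 1.0) # normalise pixels back into a valid range
```

With a real network, `w` would be replaced by the backpropagated gradient of the chosen logit with respect to the input image.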

There is a good chance that the ideal input you find which triggers "maximum catness" will be a very noisy, jumbled mess of cat-related features. You may be able to encourage something more visually appealing, or at least less noisy, by adding a little movement - e.g. minor blurring, or a slight zoom (then crop) between each iteration. At that point, it becomes more of an artistic endeavour than a mathematical one.
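For instance, the blurring step between iterations can be as simple as convolving the image with a small averaging kernel. This sketch uses a 1-D box blur for brevity (the helper name and kernel size are made up; a real image pipeline would typically use a 2-D Gaussian blur):

```python
import numpy as np

def box_blur(img, k=3):
    # Average each pixel with its k-neighbourhood; damps high-frequency noise
    kernel = np.ones(k) / k
    return np.convolve(img, kernel, mode="same")

x = np.array([0.0, 1.0, 0.0, 1.0, 0.0])  # a noisy 1-D "image"
x_smooth = box_blur(x)                    # apply between optimisation steps
```

Applied lightly between optimisation steps, this biases the result towards larger, smoother structures rather than pixel-level noise.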

Here is something I produced using some TensorFlow Deep Dream code plus zooming and blurring to encourage larger scale features to dominate:

*(image: Deep Dream output)*

Technically the above maximises a single internal feature map of a CNN, not a class probability, but it is the same thing conceptually.

Neil Slater

Posted 2019-09-19T17:55:58.390

Reputation: 14 632