Meaning of equation for CNN probabilities


[Image: the equations from the paper; not reproduced in the text]

The first equation above describes a committee of CNNs (rather than a single CNN) used for image classification. I can't work out exactly what the author is doing in it.

My current reading is that they compute the index of the maximum-likelihood class for each committee member, then add up the probabilities across the committee at those indices, and finally take the index of the maximum.

But this seems overly convoluted and I'm not really sure. Could someone clarify this?


Posted 2019-02-27T08:07:24.143




I agree the equation might not be clear, but you can decompose it as follows:

  • First, the term $\operatorname{argmax}_k p^i (y=k|\mathbf{x})$ tells you which label has the highest probability under model $i$ given the input $\mathbf{x}$.
  • Then, this is repeated over all models in the committee, giving each model's most likely label.
  • Finally, the outer $\operatorname{argmax}_j$ picks the label $j$ that was chosen most often across the committee.
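
Putting the three steps together, one possible way to write the vote is shown below. This is only a sketch of the decomposition above: the number of models $N$, the indicator notation $\mathbb{1}[\cdot]$, and the name $\hat{y}$ are my own assumptions, and the notation in the original equation may differ.

```latex
\hat{y}(\mathbf{x})
  = \operatorname{argmax}_j \sum_{i=1}^{N}
    \mathbb{1}\!\left[\, j = \operatorname{argmax}_k \, p^i(y=k \mid \mathbf{x}) \,\right]
```

The sum counts how many of the $N$ models picked label $j$ as their top choice, so what is being added up is votes, not probabilities.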

Also, it helps to think about it in pseudo-code:

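def get_label(CNNs, x):
    votes = None  # votes[j] counts how many models picked label j
    for pCNNi in CNNs:
        predictions = list(pCNNi(x))  # class probabilities from model i
        if votes is None:
            votes = [0] * len(predictions)  # one counter per class
        label_i = predictions.index(max(predictions))  # this is the argmax_k
        votes[label_i] += 1
    return votes.index(max(votes))  # this is the argmax_j (ties go to the lowest index)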
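
To see how this differs from the "add up the probabilities" reading in the question, here is a small sketch that runs both rules on the same made-up numbers. The function names and the probability values are hypothetical and chosen only so that the two rules disagree; nothing here comes from the paper.

```python
from collections import Counter


def argmax(values):
    """Index of the largest value (first one wins on ties)."""
    return max(range(len(values)), key=values.__getitem__)


def majority_vote(prob_rows):
    """Each model votes for its own top label; return the most-voted label."""
    votes = Counter(argmax(row) for row in prob_rows)
    # sorted() makes tie-breaking deterministic: the lowest label wins.
    return max(sorted(votes), key=votes.__getitem__)


def sum_probabilities(prob_rows):
    """Alternative rule: add probabilities per class, then take the argmax."""
    totals = [sum(column) for column in zip(*prob_rows)]
    return argmax(totals)


# Hypothetical outputs of three models over three classes.
example = [
    [0.9, 0.1, 0.0],  # model 1 is very sure about class 0
    [0.4, 0.6, 0.0],  # model 2 slightly prefers class 1
    [0.4, 0.6, 0.0],  # model 3 slightly prefers class 1
]

print("majority vote:", majority_vote(example))
print("summed probabilities:", sum_probabilities(example))
```

With these numbers, two of the three models vote for class 1, so the vote picks class 1, while the summed probabilities favour class 0 because one model is very confident. The voting rule ignores how confident each model is.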


Posted 2019-02-27T08:07:24.143


In particular, probabilities aren't summed; the label is just the highest-voted option (where each CNN gets one vote, for its highest-probability-score candidate). – Ben Reiniger – 2019-02-27T21:00:16.843

well said, and thanks for the editing @BenReiniger! – glhuilli – 2019-02-27T23:24:25.127