Is it a good idea to assign different objects to the same class?


Suppose one trains a CNN to determine whether something is a cat/dog or neither (2 classes). Would it be a good idea to assign all cats and dogs to one class and everything else to another? Or would it be better to have one class for cats, one for dogs, and one for everything else (3 classes)? My colleague argues for 3 classes because dogs and cats have different features, but I wonder if he's right.

John M.

Posted 2018-03-26T09:29:22.130

Reputation: 207

My bet would be that 3 classes will be better, since you are making the distinctions finer. – DuttaA – 2018-03-26T10:11:16.007

Let's say you have a very large NN with a huge number of nodes and you perform the following classification: is the pic picture number 1, picture 2, picture 3, and so on. A very large NN will actually be able to output the index of each picture. But since you don't have a very large NN, accuracy will peak at a certain number of classes. It will probably follow a Gaussian-like curve, but you have to check that experimentally. – DuttaA – 2018-03-28T17:09:47.497



The best approach may be to have cat, dog, and neither classes (3 classes total) and have the network output a probability for each class for any given input, rather than a single hard label. From there, you can always add the cat and dog probabilities to get the probability of the combined "cat or dog" class, with the remainder being the probability of "neither". Also, make sure you use the right cost function on the output so that you are in fact getting probabilities; I believe this would be cross-entropy (typically paired with a softmax output layer).
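As a rough sketch of that idea (not tied to any particular framework), the snippet below turns three raw network outputs into probabilities with a softmax and then collapses them to the two-class answer. The class order `[cat, dog, neither]`, the example logits, and the helper names `softmax`, `cross_entropy` and `collapse_to_binary` are all assumptions made for illustration.

```python
import numpy as np

def softmax(logits):
    """Convert raw network outputs (logits) into probabilities that sum to 1."""
    shifted = logits - np.max(logits)  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / np.sum(exp)

def cross_entropy(probs, true_class):
    """Cross-entropy loss for one example whose correct class index is true_class."""
    return -np.log(probs[true_class])

def collapse_to_binary(probs):
    """Merge [cat, dog, neither] probabilities into [cat_or_dog, neither]."""
    p_cat, p_dog, p_neither = probs
    return np.array([p_cat + p_dog, p_neither])

# Hypothetical logits for one image; a real CNN would produce these.
logits = np.array([2.0, 1.0, 0.1])
probs = softmax(logits)             # probabilities for cat, dog, neither
binary = collapse_to_binary(probs)  # probabilities for "cat or dog", "neither"
loss_if_cat = cross_entropy(probs, true_class=0)
```

Because the three probabilities sum to 1, the two merged probabilities do as well, so nothing is lost by training on 3 classes and reporting 2.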


Posted 2018-03-26T09:29:22.130

Reputation: 347


As far as generalization error is concerned, you are better off learning the data distribution of the (A and B) classes using an unsupervised criterion.

If you capture the underlying factors that explain most of the variation within the A and B classes, you can then fine-tune the model using a supervised criterion. This way, if you use two classes, one for (A or B) and the other for neither (A nor B), you do not force the model to learn features that don't belong to (A or B), because the model just checks whether a new data point is likely to have been drawn from a data distribution that resembles (A or B).

Side note: you will never have enough data to explore the internal structure of the "otherwise" class (neither A nor B).
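To make the "is this point likely drawn from the (A or B) distribution" check concrete, here is a deliberately simplified sketch. It is not the unsupervised-pretraining-plus-fine-tuning procedure described above; it swaps in a single Gaussian fitted to feature vectors and a Mahalanobis-distance threshold. The synthetic features, the 99% quantile threshold, and the function names are assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for feature vectors of known (A or B) examples.
# In practice these would come from a learned feature extractor.
ab_features = rng.normal(loc=0.0, scale=1.0, size=(500, 2))

# Fit a single Gaussian to the (A or B) features.
mean = ab_features.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(ab_features, rowvar=False))

def mahalanobis(x):
    """Distance of x from the fitted (A or B) distribution."""
    diff = x - mean
    return float(np.sqrt(diff @ cov_inv @ diff))

# Accept anything at least as close as 99% of the training points (assumed cutoff).
train_distances = np.array([mahalanobis(x) for x in ab_features])
threshold = np.quantile(train_distances, 0.99)

def looks_like_a_or_b(x):
    """True if x is plausibly drawn from the (A or B) distribution."""
    return mahalanobis(x) <= threshold

near_point = mean.copy()             # sits at the centre of the distribution
far_point = np.array([10.0, 10.0])   # far from anything seen in training
```

Note that only (A or B) examples are needed to fit this check, which matches the side note: the "neither" class never has to be modelled explicitly.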

Fadi Bakoura

Posted 2018-03-26T09:29:22.130

Reputation: 348


If you want to determine whether something is "cat/dog or neither", you need 2 classes: one for dog or cat and one for anything else. But if you assign all cats and dogs to the same class, you won't be able to tell whether it's a dog or a cat; you will just be (kind of) sure that what you fed to the CNN corresponds to one of them.

If you want to predict whether a new input is a cat, a dog, or neither, then you'll need three classes. The first two allow your CNN to determine whether the input is a dog or a cat, and the third class is needed to filter out everything else so that the result can be interpreted.

Finally, if you specify only 2 classes (1: dog, 2: cat), then your CNN will try to assign any new input to one of those 2 classes, even a horse or whatever.
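The last point follows from how a softmax output behaves: its probabilities always sum to 1, so some class must win. A small sketch, assuming a softmax output layer and using made-up logits for a horse image (the values and the label order `["dog", "cat"]` are hypothetical):

```python
import numpy as np

def softmax(logits):
    """Convert raw network outputs (logits) into probabilities that sum to 1."""
    shifted = logits - np.max(logits)  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / np.sum(exp)

labels = ["dog", "cat"]

# Hypothetical logits for a horse image: both are low, i.e. the network
# finds little evidence for either class.
horse_logits = np.array([-3.0, -2.5])

probs = softmax(horse_logits)
predicted = labels[int(np.argmax(probs))]  # still one of "dog" or "cat"
```

Even though neither logit is large, the network has no way to answer "none of the above" without a third output.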


Posted 2018-03-26T09:29:22.130

Reputation: 93

To put it in more general terms, I'm basically trying to determine whether something is "A or B" or "neither A nor B" (2 classes). There's no need to determine whether an input was A or B. I just thought that by assigning A-objects to the same class as B-objects (the "A or B" class), the CNN might "diffuse" the features of A and B. – John M. – 2018-03-26T18:05:08.823