How to label overlapping objects for deep learning model training


I am training yolov3 to detect a custom object (chickens). In a lot of my training images I have overlapping chickens (only part of a chicken is visible, etc.). Is there a common practice for how to label the data (bounding box) in these cases? Should you only label the portion of the object that you can see?


Posted 2019-03-21T15:08:45.757




There is no single common practice for labeling bounding boxes; it is always problem dependent. For example, if you want to count the chickens, then you should label the whole chicken (including the occluded part) as one instance of a chicken. If you simply want to detect whether there is a chicken in the picture, you should label only the unoccluded part.

You have to think about your problem. What is the goal of the algorithm? Could a human do the task without imagining where the rest of the object is? You should also consider the pixel imbalance in your data. In general, the first method is a harder task than the second, because even humans have trouble drawing bounding boxes for occluded objects. Hence, your labels will have a lot of variance due to this factor. If you label only what you see, the bounding box labels will be more reliable. As far as I know, the PASCAL Visual Object Classes dataset, which was used in the YOLO publication, labeled only what is visible, not what is occluded.
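To make the difference concrete: YOLO expects one label line per object in the form `class x_center y_center width height`, with coordinates normalized by the image size. Here is a minimal sketch (the pixel coordinates and the helper name `to_yolo_line` are illustrative, not from any particular tool) showing how the two labeling strategies produce different label lines for the same partially occluded chicken:

```python
def to_yolo_line(cls, box, img_w, img_h):
    """Convert a pixel bounding box (x_min, y_min, x_max, y_max) into a
    YOLO label line: class x_center y_center width height, normalized."""
    x_min, y_min, x_max, y_max = box
    x_c = (x_min + x_max) / 2 / img_w
    y_c = (y_min + y_max) / 2 / img_h
    w = (x_max - x_min) / img_w
    h = (y_max - y_min) / img_h
    return f"{cls} {x_c:.6f} {y_c:.6f} {w:.6f} {h:.6f}"

# Hypothetical example: a chicken whose rear half is hidden behind another.
IMG_W, IMG_H = 640, 480

# Strategy 1 (amodal): box covers the estimated full extent of the chicken.
full_box = (100, 200, 300, 350)
# Strategy 2 (visible only): box covers just the unoccluded front half.
visible_box = (100, 200, 200, 350)

print(to_yolo_line(0, full_box, IMG_W, IMG_H))
print(to_yolo_line(0, visible_box, IMG_W, IMG_H))
```

Whichever strategy you pick, apply it consistently across the whole dataset; mixing the two adds exactly the label variance described above.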

BTW, I hope your task aims to improve the quality of life of the chickens. It would be a shame if machine learning were used to harm them.

