Object detector always predicts high-confidence bounding boxes after dataset augmentation


I am using the TensorFlow Object Detection API to train a MobileNetV2-SSD object detector with a single class. I started from the sample pipeline tuned for the COCO dataset. During the first training run I could observe (on TensorBoard) that a few bounding boxes were predicted for the validation images, with some degree of accuracy. During testing, the trained network performed quite well, but it produced some false positives.

To improve the network, I enriched the dataset by randomly pasting some objects (the false positives that the first network often detected) into the training images, without changing the annotation files. I made sure that these pasted objects did not overlap with the annotated ones. Furthermore, I edited the following parameters in the pipeline configuration:
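For reference, the pasting augmentation I used works roughly like the following sketch (a minimal NumPy version; the function names and the overlap test are mine, not part of the Object Detection API):

```python
import numpy as np

def boxes_overlap(box_a, box_b):
    """True if two (x0, y0, x1, y1) boxes intersect."""
    return not (box_a[2] <= box_b[0] or box_b[2] <= box_a[0] or
                box_a[3] <= box_b[1] or box_b[3] <= box_a[1])

def paste_distractor(image, patch, annotated_boxes, rng, max_tries=50):
    """Paste `patch` into `image` at a random spot that avoids every
    annotated box; the annotation files are left unchanged."""
    h, w = image.shape[:2]
    ph, pw = patch.shape[:2]
    for _ in range(max_tries):
        x0 = int(rng.integers(0, w - pw + 1))
        y0 = int(rng.integers(0, h - ph + 1))
        candidate = (x0, y0, x0 + pw, y0 + ph)
        if not any(boxes_overlap(candidate, b) for b in annotated_boxes):
            out = image.copy()
            out[y0:y0 + ph, x0:x0 + pw] = patch
            return out
    return image  # give up if no free spot was found

# usage: paste a white 40x40 distractor into a black 300x300 image,
# avoiding the (hypothetical) annotated box at (100, 100, 160, 160)
rng = np.random.default_rng(0)
img = np.zeros((300, 300, 3), dtype=np.uint8)
patch = np.full((40, 40, 3), 255, dtype=np.uint8)
augmented = paste_distractor(img, patch, [(100, 100, 160, 160)], rng)
```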

# set the number of classes to 1
ssd {
  num_classes: 1
  image_resizer {
    fixed_shape_resizer {
      height: 300
      width: 300
    }
  }
  ...

  # change the aspect ratios to make them more suitable to the ones in my dataset
  anchor_generator {
    ssd_anchor_generator {
      num_layers: 6
      min_scale: 0.1
      max_scale: 0.95
      aspect_ratios: 1.0
      aspect_ratios: 1.6
      aspect_ratios: 0.7
      aspect_ratios: 1.3
      aspect_ratios: 0.75
    }
  }
  ...

  # set the loss type to LOCALIZATION because I am not interested in classification
  loss {
    hard_example_miner {
      num_hard_examples: 3000
      iou_threshold: 0.99
      loss_type: LOCALIZATION
      max_negatives_per_positive: 3
      min_negatives_per_image: 3
    }
  }
  ...

  # I don't expect to have more than 6 objects in my images
  post_processing {
    batch_non_max_suppression {
      score_threshold: 1e-8
      iou_threshold: 0.6
      max_detections_per_class: 6
      max_total_detections: 6
    }
    score_converter: SIGMOID
  }
}

I was expecting the training to give better results. However, from the very beginning of the training I could see that 6 bounding boxes were always predicted with ~99% confidence on the validation images. At first the boxes appeared in the same positions in every image; after some time they started to centre on the correct objects, but 6 boxes were always predicted, with many false positives.

At inference time, I always get some high-score bounding boxes predicted even when there are no objects in the image.
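As far as I understand the post-processing settings, the fixed count of 6 boxes is at least consistent with the NMS configuration: with score_threshold: 1e-8 essentially nothing is dropped by score, so the suppression stage fills its max_total_detections quota whenever enough non-overlapping boxes exist, whatever their scores. A toy re-implementation of greedy NMS (my own sketch, not the actual Object Detection API code) shows this:

```python
import numpy as np

def iou(a, b):
    """Intersection-over-union of two (x0, y0, x1, y1) boxes."""
    x0, y0 = max(a[0], b[0]), max(a[1], b[1])
    x1, y1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x1 - x0) * max(0, y1 - y0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

def greedy_nms(boxes, scores, score_threshold=1e-8,
               iou_threshold=0.6, max_total=6):
    """Toy greedy NMS with the same thresholds as the config above:
    boxes are only discarded by score or by overlap, and the top
    survivors are kept up to max_total."""
    order = [i for i in np.argsort(scores)[::-1]
             if scores[i] > score_threshold]
    keep = []
    while order and len(keep) < max_total:
        i = order.pop(0)
        keep.append(i)
        order = [j for j in order if iou(boxes[i], boxes[j]) < iou_threshold]
    return keep

# Ten well-separated boxes with very low scores: the 1e-8 score
# threshold filters none of them, so exactly max_total boxes come out.
boxes = [(40 * k, 0, 40 * k + 30, 30) for k in range(10)]
scores = np.full(10, 1e-3)
kept = greedy_nms(boxes, scores)
print(len(kept))  # → 6
```

This only explains the box count, though, not why the scores themselves are so high.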

What could be the cause of this behaviour?

Please note that I started the training from scratch and my dataset contains about 10000 images.


Posted 2019-09-11T07:47:32.513

