What is a heatmap in the CornerNet paper?


I have been working on understanding how CornerNet works, but I couldn't figure out a few parts of the architecture.

First, the authors mention that the network predicts three distinct outputs: heatmaps, embeddings, and offsets.

Also, in the paper, it is stated that the network was trained on the COCO dataset, which has bounding box and class annotations.

As far as I understand, since CornerNet is based on detecting the top-left and bottom-right corners, the ground-truth labels for the heatmaps should encode the top-left and bottom-right pixel locations of the bounding boxes, with one heatmap channel per class (but I might be wrong). What exactly is the heatmap used for?
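To make my current understanding concrete, here is a rough numpy sketch of how I imagine the heatmap ground truth might be built: a per-class map with an unnormalized 2-D Gaussian bump splatted at each ground-truth corner, following the reduced-penalty-radius idea from the paper. The function names (`gaussian2d`, `draw_corner_heatmap`), the radius, and the map size are my own placeholders, not from the paper's code.

```python
import numpy as np

def gaussian2d(shape, sigma):
    """Unnormalized 2-D Gaussian bump with peak value 1 at the center."""
    m, n = (shape[0] - 1) / 2, (shape[1] - 1) / 2
    y, x = np.ogrid[-m:m + 1, -n:n + 1]
    return np.exp(-(x * x + y * y) / (2 * sigma * sigma))

def draw_corner_heatmap(heatmap, cx, cy, radius):
    """Splat a Gaussian bump centered on a ground-truth corner (cx, cy)."""
    diameter = 2 * radius + 1
    g = gaussian2d((diameter, diameter), sigma=diameter / 6)
    h, w = heatmap.shape
    # clip the bump at the image borders
    left, right = min(cx, radius), min(w - cx, radius + 1)
    top, bottom = min(cy, radius), min(h - cy, radius + 1)
    region = heatmap[cy - top:cy + bottom, cx - left:cx + right]
    gslice = g[radius - top:radius + bottom, radius - left:radius + right]
    np.maximum(region, gslice, out=region)  # keep the max where bumps overlap
    return heatmap

# one heatmap per class; here a single 16x16 map with one corner at (x=5, y=7)
hm = np.zeros((16, 16), dtype=np.float32)
draw_corner_heatmap(hm, cx=5, cy=7, radius=2)
```

If this is right, the heatmap is the dense "where is a corner of class c" score that the focal loss is applied to, with the Gaussian reducing the penalty for near-miss predictions close to a true corner.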

Moreover, for the embedding part, the authors use a pull-and-push loss at the ground-truth pixel locations to decide which corner pairs belong to which object, but I don't understand how this loss is backpropagated. How do I back-propagate the embedding loss?
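My guess is that nothing special is needed for backprop: the pull and push terms are built from ordinary differentiable ops (means, squares, hinge/margin terms) applied to the embedding values read out at the N ground-truth corner locations, so an autodiff framework would propagate gradients through those N pixels of the embedding maps automatically, leaving all other pixels with zero gradient. Here is a numpy sketch of the loss as I read it; the function name `pull_push_loss` is mine, and `delta=1.0` is the margin I believe the paper uses:

```python
import numpy as np

def pull_push_loss(e_tl, e_br, delta=1.0):
    """e_tl, e_br: 1-D arrays of predicted embeddings sampled at the N
    ground-truth top-left / bottom-right corner locations (one per object)."""
    e_mean = (e_tl + e_br) / 2.0
    # pull: drag the two corners of the same object toward their mean embedding
    pull = np.mean((e_tl - e_mean) ** 2 + (e_br - e_mean) ** 2)
    # push: separate the mean embeddings of different objects by a margin delta
    n = len(e_mean)
    push = 0.0
    for k in range(n):
        for j in range(n):
            if j != k:
                push += max(0.0, delta - abs(e_mean[k] - e_mean[j]))
    push /= max(n * (n - 1), 1)
    return pull, push
```

For example, with two objects whose corner embeddings already match within each pair (`e_tl = e_br`), the pull term is zero, and the push term is positive only when the two objects' mean embeddings are closer than the margin.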


Posted 2020-07-10T22:03:36.810

