YOLO Architecture - kmeans clustering

In YOLO, why is k-means clustering used to determine the bounding-box priors?

Why, if we use standard k-means with Euclidean distance, do larger boxes generate more error than smaller boxes?

Why does using IOU (the Jaccard index) avoid or eliminate this error?

How is d(box, centroid) = 1 - IOU(box, centroid) derived?

kevin

Posted 2019-07-01T12:37:52.300

Reputation: 181

Question was closed 2020-09-04T19:12:19.037

Hi Kevin! Can you please ask just one question per post? You're asking 4 questions here. You should create 4 posts, one for each of these questions. – nbro – 2019-08-11T18:14:20.683

Answers

Let's say each box is described by its (width, height), and you have a big box (0.999, 0.999) and a small box (0.001, 0.001).

Euclidean distance = sqrt((0.999 - 0.001)^2 + (0.999 - 0.001)^2) = 1.41.

Now, we look at a small box (0.001, 0.001) and an even smaller box (0.00000001, 0.000000001).

Euclidean distance = sqrt((0.001 - 0.00000001)^2 + (0.001 - 0.000000001)^2) = 0.00141

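Look at the number in the 2nd example: the Euclidean distance is tiny, which suggests the two boxes are almost the same. But this is not true, because the second box is several orders of magnitude smaller than the first. Euclidean distance depends on the absolute size of the boxes, so a mismatch between large boxes costs far more than an equally bad mismatch between small boxes.

IOU solves the problem because it measures the boxes relative to each other. The result is a ratio between 0 and 1 that does not depend on absolute size, so it gives both pairs an IOU near 0 (a distance 1 - IOU near 1), which is correct.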
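To make the comparison concrete, here is a minimal sketch that computes both distances for the two pairs above. The helper names `euclidean` and `iou` are illustrative, not from any YOLO code base, and the IOU assumes the two boxes share the same center so that only width and height matter.

```python
import math

def euclidean(box_a, box_b):
    """Euclidean distance between two (width, height) pairs."""
    return math.hypot(box_a[0] - box_b[0], box_a[1] - box_b[1])

def iou(box_a, box_b):
    """IOU of two (width, height) boxes, assuming they share the same center."""
    inter = min(box_a[0], box_b[0]) * min(box_a[1], box_b[1])
    union = box_a[0] * box_a[1] + box_b[0] * box_b[1] - inter
    return inter / union

# The two pairs from the answer above.
pairs = {
    "big vs small": ((0.999, 0.999), (0.001, 0.001)),
    "small vs tiny": ((0.001, 0.001), (0.00000001, 0.000000001)),
}

for name, (a, b) in pairs.items():
    print(f"{name}: euclidean={euclidean(a, b):.6f}  1-IOU={1 - iou(a, b):.6f}")
```

The Euclidean distances differ by roughly a factor of 1000 between the two pairs, while 1 - IOU is close to 1 for both, which matches the intuition that both pairs are poor matches.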

As for the Jaccard index, why is J(b1, b2) = 1 when the denominator equals the numerator?

I suppose we still have the terms (w1*h1 + w2*h2) in the denominator, so how can the denominator equal the numerator?

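J(b1, b2) = 1 iff intersection / (w1*h1 + w2*h2 - intersection) = 1, where intersection = min(w1, w2) * min(h1, h2).

Assume the two boxes are identical, so w1 = w2 and h1 = h2 (equal areas alone are not enough). Then the intersection is just w1*h1, and J = w1*h1 / (2 * w1*h1 - w1*h1) = w1*h1 / (w1*h1) = 1. The sum w1*h1 + w2*h2 counts the overlap twice, and subtracting the intersection removes the extra copy, which is how the denominator can equal the numerator.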
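As a quick check of this formula, here is a small sketch that evaluates it for a pair of identical boxes and for a pair with equal area but different shape. The function name `jaccard` and the example boxes are made up for illustration, and the boxes are again assumed to share the same center.

```python
def jaccard(w1, h1, w2, h2):
    """Jaccard index of two boxes given as width and height, same center assumed."""
    intersection = min(w1, w2) * min(h1, h2)
    union = w1 * h1 + w2 * h2 - intersection
    return intersection / union

# Identical boxes: the intersection equals each area, so the ratio is 1.
same = jaccard(0.4, 0.4, 0.4, 0.4)

# Equal area (0.16 each) but different shape: the intersection is only
# 0.2 * 0.2, so the ratio falls well below 1.
equal_area = jaccard(0.2, 0.8, 0.8, 0.2)

print(same, equal_area)
```

The second case shows that equal areas are not sufficient for J = 1: the widths and heights have to match as well.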
