Let's say I have a top-down picture of an arrow, and I want to predict the angle this arrow makes. This would be between $0$ and $360$ degrees, or between $0$ and $2\pi$ radians. The problem is that this target is circular: $0$ and $360$ degrees are exactly the same, which is an invariance I would like to incorporate into my target, and which I assume should help generalization significantly. I don't see a clean way of doing this; are there any papers that try to tackle this problem (or similar ones)? I do have some ideas, along with their potential downsides:

Use a sigmoid or tanh activation, scale it to the $(0, 2\pi)$ range, and incorporate the circular property into the loss function. I think this will fail fairly hard, because if the prediction is near the border (the worst case), only a tiny bit of noise will push the weights one way or the other. Also, values close to the borders $0$ and $2\pi$ will be harder to reach, because the absolute pre-activation value would need to approach infinity.
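A minimal sketch of the "circular property in the loss" part of this idea (the function name and form are my own, not from any paper): wrap the raw difference between prediction and target into $[-\pi, \pi]$ before squaring, so a prediction just below $2\pi$ counts as close to a target of $0$.

```python
import numpy as np

def circular_loss(pred, target):
    # arctan2(sin(d), cos(d)) wraps the difference d into [-pi, pi],
    # so 0 and 2*pi - epsilon are treated as nearly identical angles.
    diff = np.arctan2(np.sin(pred - target), np.cos(pred - target))
    return diff ** 2

# A prediction just below 2*pi is nearly correct for a target of 0:
print(circular_loss(2 * np.pi - 0.01, 0.0))  # tiny, on the order of 1e-4
```

Note this only fixes the loss; the saturation problem near the borders of the activation remains.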

Regress to two values, an $x$ and a $y$ value, and calculate the loss based on the angle these two values make. I think this one has more potential, but the norm of this vector is unbounded, which could lead to numeric instability: the norm could blow up or collapse to $0$ during training. This could potentially be solved by adding a regularizer that prevents the norm from straying too far from $1$.
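A sketch of this second idea, with a hypothetical regularization weight `reg` implementing the suggested norm penalty (all names are my own). Note that `atan2` is already scale-invariant, so only the penalty term looks at the norm:

```python
import numpy as np

def xy_angle_loss(x, y, target_angle, reg=0.1):
    # Angle the predicted (x, y) vector makes, in (-pi, pi].
    pred_angle = np.arctan2(y, x)
    # Wrap the angular error into [-pi, pi] before squaring.
    diff = np.arctan2(np.sin(pred_angle - target_angle),
                      np.cos(pred_angle - target_angle))
    # Hypothetical regularizer keeping the norm of (x, y) near 1.
    norm_penalty = (np.hypot(x, y) - 1.0) ** 2
    return diff ** 2 + reg * norm_penalty

# A unit vector pointing along the target angle gives zero loss:
print(xy_angle_loss(0.0, 1.0, np.pi / 2))
```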

Other options would be doing something with sine and cosine functions, but I feel like the fact that multiple pre-activations map to the same output will also make optimization and generalization very difficult.
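For concreteness, the sine/cosine option usually means encoding the target angle as the pair $(\sin\theta, \cos\theta)$, which is continuous across the $0$ / $2\pi$ boundary, and decoding predictions with `atan2`. A minimal sketch (function names are my own):

```python
import numpy as np

def encode_angle(theta):
    # (sin, cos) target pair; identical for theta and theta + 2*pi.
    return np.array([np.sin(theta), np.cos(theta)])

def decode_angle(s, c):
    # Map (possibly unnormalized) predictions back to [0, 2*pi).
    return np.arctan2(s, c) % (2 * np.pi)

# The wrap-around invariance holds: 0 and 2*pi encode identically.
print(np.allclose(encode_angle(0.0), encode_angle(2 * np.pi)))  # True
```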

Honestly, I think trying to predict the total rotation will be easier and get you better results. You can map from e.g. $3\pi$ to $\pi$ after the fact if you want to. Trying to predict the angle on the unit circle after multiplications is essentially trying to predict the remainder after dividing by $2\pi$, and I can't see how that would be easier than predicting the overall magnitude and then subtracting off multiples of $2\pi$. – tom – 2017-11-21T16:05:32.467
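The after-the-fact mapping this comment suggests is just a remainder modulo $2\pi$:

```python
import numpy as np

# A total rotation of 3*pi points in the same direction as pi;
# taking the remainder modulo 2*pi maps it back onto the unit circle.
total_rotation = 3 * np.pi
on_circle = total_rotation % (2 * np.pi)
print(on_circle)  # ~ pi
```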

The options are: (a) side-step the periodicity by estimating the sine and cosine of the angle using a sigmoid activation function; (b) incorporate the symmetry into the loss function through a kernel like so. Read about rotation groups and Taco Cohen's thesis on learning transformation groups. Unfortunately I am not knowledgeable about group theory, so I cannot help much more.

– Emre – 2017-11-21T18:58:16.350

@tom The thing about that approach is that there are infinitely many pre-activations that map to the same angle while having nothing in common, whereas a positive $x_1$ always refers to an angle between $-\frac{1}{2}\pi$ and $\frac{1}{2}\pi$. And Emre, I will work my way through some group theory; it has always interested me, so the combination of ML and group theory appeals to me. – Jan van der Vegt – 2017-11-23T08:56:28.880
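A toy sketch of option (a) from the comments above, not taken from any paper: squash two raw outputs into $[-1, 1]$ with tanh (a shifted, scaled sigmoid) and regress them onto the target's sine and cosine. `pre_s` and `pre_c` are my own names for the two hypothetical raw network outputs.

```python
import numpy as np

def sincos_loss(pre_s, pre_c, target_angle):
    # tanh keeps predictions in [-1, 1], matching the range of sin/cos.
    s, c = np.tanh(pre_s), np.tanh(pre_c)
    return (s - np.sin(target_angle)) ** 2 + (c - np.cos(target_angle)) ** 2

# Large pre-activations saturating toward (1, 0) fit a target of pi/2:
print(sincos_loss(10.0, 0.0, np.pi / 2))  # very close to 0
```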