I recently finished Course on RL by David Silver (on YT) and thought about trying it out on simple application in Unity Game Engine, where I've built simple labyrint with ball and want to teach the ball to get from point A to point B in there while avoiding obstacles and fire (the place where you'll get burnt so big negative reward)
The problem I encountered while designing the whole thing (programming-wise) is: What is the correct (or at least good) way of representing the position in 2D space? It is continuous so I thought about representing it as feature vector consisting of [up, down, left, right, posX, posY] where direction is whether I am pressing button of moving in that direction in binary (or actions if you want) and pos are floats (0-1) representing normalized position from one corner on the plane where the whole map is. That would be accompanied by vector W that would represent the weights adjusted using Gradient Descent.
Question is: will this work?? I am asking for 2 reasons. One is that I am not so sure about that posX and posY since it can be 0 and if I multiply it by the weights vector then how could be resulting reward anything but 0? Second reason is that I am not sure if the actions should be part of the features. I mean, it makes sense to me but I could easily be very wrong since I am a beginner.
Thanks a lot guys in advance. If you have any more questions or think the problem is not described deeply enough just ask in the comments and I'll edit the question. :)
PS: I could just code it the way I think is right, but I also want to get gasp of designing applications on paper before coding them (project management).