Does anyone know of an algorithm that could be used to determine the next action to take to reach a desired state when trained on time-series data?
For example, a robot starts in some state, then takes an action that moves it to another state. This repeats for many iterations (imagine the robot randomly exploring a room). If the robot is in a specific starting state and I want it to end up in a different state, is there an algorithm that could recommend the best next action (or sequence of next actions) to reach that desired final state?
One approach I've tried is a neural network that takes the current state and the next state as input and outputs the action that moves the robot from the current state to the next state. For any single state, the network knows how to reach a desired state that is one action away. The issue is: what if the desired state is many actions away?
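
To make the setup concrete, here is a minimal sketch of the one-step approach and where it stops working. Everything in it is an assumption for illustration: the room is a toy 1-D corridor with positions 0 to 4, the names `step`, `explore` and `fit_inverse_model` are hypothetical, and a plain lookup table stands in for the neural network.

```python
import random

# Hypothetical 1-D room: states are integer positions 0..4,
# and each action moves the robot one cell left or right.
ACTIONS = {"left": -1, "right": 1}


def step(state, action):
    """Toy dynamics (an assumption): move one cell, clamped to the room."""
    return min(4, max(0, state + ACTIONS[action]))


def explore(start, n_steps, seed=0):
    """Random exploration, logged as (state, action, next_state) triples."""
    rng = random.Random(seed)
    log, state = [], start
    for _ in range(n_steps):
        action = rng.choice(sorted(ACTIONS))
        next_state = step(state, action)
        log.append((state, action, next_state))
        state = next_state
    return log


def fit_inverse_model(log):
    """Lookup-table stand-in for the network: (state, next_state) -> action."""
    return {(s, s_next): a for s, a, s_next in log}


log = explore(start=2, n_steps=200)
inverse = fit_inverse_model(log)

# One action away: answered whenever that transition appears in the log.
print(inverse.get((2, 3)))
# Several actions away: None, because no single logged step connects them.
print(inverse.get((0, 4)))
```

The second lookup is the case I'm stuck on: the model only covers pairs of states that appeared as consecutive steps in the log, so it has nothing to say about a goal that needs a chain of actions.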