The best-practice method for reducing a large state space is a domain-specific language (DSL). The Monopoly domain has to be transformed into simple subject-verb-object statements, for example “player1 has railwaystation” or “player1 ison field2”. The possible actions are formulated in the same DSL, e.g. “player1 buy smallhouse”. All states and actions then have to be arranged in the q-table matrix: the rows hold, say, 100 states formulated in the DSL, and the columns hold 100 actions, also formulated in the DSL. The reinforcement learning algorithm can now determine the state-action values, that is, learn the entries of the q-table. In theory it is possible to learn the DSL itself, which the literature calls “grammar induction”, but this is too complicated and hard to implement. The easier way is to program the DSL manually and let reinforcement learning determine only the q-table automatically.
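To make the idea of a q-table indexed by DSL statements concrete, here is a minimal sketch in Python. It assumes standard tabular Q-learning as the learning rule, which the text does not name explicitly. The state and action lists, the learning rate ALPHA, the discount GAMMA, and the helper names are all hypothetical and chosen only for illustration.

```python
import random

# Hypothetical DSL vocabulary (illustrative only, not a full Monopoly model).
STATES = [
    "player1 ison field1",
    "player1 ison field2",
    "player1 has railwaystation",
]
ACTIONS = [
    "player1 rolldice",
    "player1 buy smallhouse",
    "player1 pass",
]

ALPHA = 0.1  # learning rate (assumed value)
GAMMA = 0.9  # discount factor (assumed value)


def make_q_table(states, actions):
    """One row per DSL state, one column per DSL action, all values 0."""
    return {s: {a: 0.0 for a in actions} for s in states}


def q_update(q, state, action, reward, next_state):
    """Standard tabular Q-learning update for one observed transition."""
    best_next = max(q[next_state].values())
    q[state][action] += ALPHA * (reward + GAMMA * best_next - q[state][action])


def choose_action(q, state, epsilon=0.1, rng=random):
    """Epsilon-greedy: mostly take the best known action, sometimes explore."""
    if rng.random() < epsilon:
        return rng.choice(list(q[state]))
    return max(q[state], key=q[state].get)


q = make_q_table(STATES, ACTIONS)
q_update(
    q,
    state="player1 ison field2",
    action="player1 buy smallhouse",
    reward=1.0,
    next_state="player1 has railwaystation",
)
```

Because rows and columns are keyed by the DSL strings themselves, every learned value stays readable: q["player1 ison field2"]["player1 buy smallhouse"] says how good it is to buy a house while standing on field2. In a real game the reward and the next state would come from the Monopoly simulation, not from hand-written calls.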
A q-table consists of state rows, each of which describes the current situation of a game. In toy problems the states are numbered from 0 upwards, and the hope is that the total number of states is low. Two questions are important: first, what does a certain state number mean? And second, how can the number of states be reduced? Reducing states is equivalent to abstraction. A normal game of Monopoly, as seen on screen, consists of a pixel image of 800x600 pixels with 24-bit color depth. If the raw camera signal is taken as input, the state space is equal to all possible images: there is no abstraction and the state space is huge. A better way of describing a game is to invent a code that represents the current situation. For example, an array [0,0,0] can be created in which the first number says which player has to move, the second number represents the amount of money player1 has, and the third number gives how many street cards (properties) player1 owns. From a formal point of view, an array of only three values reduces the state space very well; the problem is whether this array adequately captures the current situation. Deciding on actions from only three values is rather limiting.
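The difference between the raw image and the three-value array can be put into numbers. The following sketch is only an illustration under stated assumptions: the number of players, the money bands (MONEY_BUCKETS, MONEY_BUCKET_SIZE) and the property limit (MAX_PROPERTIES) are hypothetical choices needed to make the array finite, and encode_state is an invented helper name.

```python
import math

# Raw input: every bit pattern of an 800x600 image with 24-bit color is a state.
RAW_BITS = 800 * 600 * 24                              # 11,520,000 bits
RAW_STATE_DIGITS = int(RAW_BITS * math.log10(2)) + 1   # decimal digits of 2**RAW_BITS

# Abstract code [activeplayer, money, properties]; all limits below are assumptions.
NUM_PLAYERS = 2           # players are numbered 0 and 1 here
MONEY_BUCKETS = 5         # money is coarsened into 5 bands
MONEY_BUCKET_SIZE = 500   # width of one band
MAX_PROPERTIES = 28       # upper limit on owned properties

NUM_STATES = NUM_PLAYERS * MONEY_BUCKETS * (MAX_PROPERTIES + 1)  # 290 rows


def money_bucket(money):
    """Map an exact amount of money to a coarse band 0..MONEY_BUCKETS-1."""
    return min(max(money, 0) // MONEY_BUCKET_SIZE, MONEY_BUCKETS - 1)


def encode_state(active_player, money, properties):
    """Return the three-value array and its row number in the q-table."""
    bucket = money_bucket(money)
    state = [active_player, bucket, properties]
    index = (active_player * MONEY_BUCKETS + bucket) * (MAX_PROPERTIES + 1) + properties
    return state, index


state, row = encode_state(active_player=0, money=1500, properties=3)
print(state, row)  # [0, 3, 3] 90
print(NUM_STATES, "abstract states versus a raw count with", RAW_STATE_DIGITS, "digits")
```

The row number answers the first question from the text: state 90 means "player 0 to move, money in band 3, three properties". The price of the small table is also visible: two game situations that differ only in the exact amount of money or in the position on the board fall into the same row.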
The next better approach over a numerical array is the domain-specific language (DSL) described above. This language can be chosen freely. In its simplest form the language covers the same state space as [0,0,0], which means it can express only three values (activeplayer, money, properties), but it is also allowed to extend its complexity, which will increase the state space.
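How a DSL in its simplest form collapses to the same three values can be sketched as follows. The verbs "moves" and "money" and the property names are hypothetical additions to the vocabulary; only "has" and "ison" appear in the examples above. The function names parse_statement and summarize are likewise invented for this illustration.

```python
def parse_statement(line):
    """Split a DSL statement into (subject, verb, objects)."""
    parts = line.split()
    if len(parts) < 2:
        raise ValueError("statement needs at least a subject and a verb: %r" % line)
    subject, verb, *objects = parts
    return subject, verb, tuple(objects)


def summarize(statements, player="player1"):
    """Reduce DSL statements to [activeplayer, money, properties] for one player."""
    active_player = 0
    money = 0
    properties = 0
    for line in statements:
        subject, verb, objects = parse_statement(line)
        if verb == "moves":
            # "player1 moves" -> player number 1 is the active player
            active_player = int(subject[len("player"):])
        elif verb == "money" and subject == player:
            money = int(objects[0])
        elif verb == "has" and subject == player:
            properties += 1
        # Other verbs such as "ison" carry extra detail that the array cannot hold.
    return [active_player, money, properties]


STATEMENTS = [
    "player1 moves",
    "player1 money 1500",
    "player1 has railwaystation",
    "player1 has parkplace",
    "player2 has boardwalk",
    "player1 ison field2",
]

print(summarize(STATEMENTS))  # [1, 1500, 2]
```

The statement "player1 ison field2" is parsed but dropped by the summary. Keeping it as part of the state is an example of extending the complexity of the language, and it multiplies the number of states by the number of fields.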