Ok, I now know how a machine can learn to play Atari games (Breakout): Playing Atari with Reinforcement Learning

With the same technique it is even possible to play FPS games (Doom): Playing FPS Games with Reinforcement Learning

Further studies even investigated multiagent scenarios (Pong): Multiagent Cooperation and Competition with Deep Reinforcement Learning

And even another awesome article for the interested user in context of deep reinforcement learning (easy and a must read for beginners): Demystifying Deep Reinforcement Learning

I was thrilled by these results and immediately wanted to try them in some simple "board/card game scenarios", i.e. writing an AI for some simple games in order to learn more about "deep learning". Of course, thinking that I could easily apply the techniques above in my scenarios was naive. All examples above are based on convolutional nets (image recognition) and some other assumptions, which might not be applicable in my scenarios.

Can you give me hints or further articles which deal with my questions below? As a beginner, I do not have an overview yet. Preferably, your suggestions should also be connected to the following areas: deep learning, reinforcement learning (and multiagent systems).

(1)

If you have a card game and the AI shall play a card from its hand, you could think of the cards (amongst other things) as the current game state. You can easily define some sort of neural net and feed it the card data. In a trivial case the cards are just numbered. I do not know which net type would be suitable, but I guess deep reinforcement learning strategies could then be applied easily.

However, I can only imagine this if there is a constant number of hand cards. In the examples above, the number of pixels is constant, for example. What if a player can have a different number of cards? What should you do if a player can have an infinite number of cards? Of course, this is just a theoretical question, as no game has an infinite number of cards.
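One common trick for the finite case (just a sketch of what I mean, with made-up names and a made-up deck size): instead of feeding the hand itself, feed a multi-hot vector over the whole deck. The input length is then always the deck size, no matter how many cards the player currently holds.

```python
# Sketch: encode a variable-size hand as a fixed-size vector.
# Assumption: a finite deck whose cards are numbered 0..DECK_SIZE-1
# (the function name and deck size are illustrative, not from any paper).

DECK_SIZE = 52

def encode_hand(hand):
    """Multi-hot encoding: one slot per possible card.

    The output length is always DECK_SIZE, regardless of how many
    cards are in the hand, so a fixed-size network input works
    even with a varying hand size.
    """
    vec = [0.0] * DECK_SIZE
    for card in hand:
        vec[card] = 1.0  # or a count, if duplicate cards are possible
    return vec

print(len(encode_hand([3, 17, 48])))  # always 52
```

This obviously does not help with a truly infinite card set, but for any finite deck it sidesteps the variable-input-size problem entirely.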

(2)

In the initial examples, the action space is constant. What can you do if it is not? This more or less follows from my previous problem. If you have 3 cards, you can play card 1, 2 or 3. If you have 5 cards, you can play card 1, 2, 3, 4 or 5, etc. It is also common in card games that playing a certain card is not allowed. Could this be tackled with a negative reward?
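An alternative to negative rewards that I have seen mentioned is action masking: keep a fixed maximum action space (one slot per possible card) and simply forbid illegal actions at selection time. A minimal sketch of the idea, with illustrative names:

```python
# Sketch: masking invalid actions instead of punishing them with
# negative reward. Assumption: a fixed maximum action space (one
# slot per card) plus a boolean mask of currently legal actions.
import math

def best_legal_action(q_values, legal_mask):
    """Pick the argmax over legal actions only.

    Illegal actions get -inf, so they can never be chosen; the
    network does not have to learn "don't do that" purely from
    negative rewards.
    """
    masked = [q if legal else -math.inf
              for q, legal in zip(q_values, legal_mask)]
    return max(range(len(masked)), key=lambda i: masked[i])

# A hand of 3 cards in a game that allows up to 5:
q = [0.2, 0.9, 0.4, 0.7, 0.1]            # network outputs
mask = [True, True, True, False, False]  # only the first 3 slots are real cards
print(best_legal_action(q, mask))        # -> 1
```

Whether masking or a negative reward works better is exactly the kind of thing I would like pointers on.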

So, which "tricks" can be used, e.g. always assuming a constant number of cards with "filling values"? That is only applicable in the non-infinite case (and is unrealistic anyway; even humans could not play well that way). Are there articles that already examine such things?

For many cases you don't have to build everything from scratch; instead you can use OpenAI Gym – Eka – 2016-10-27T02:31:18.933

I see this as a framework for many algorithms. Which should I take? How do I solve my problem with the action space? What is behind the observation space, and how does it handle an infinite state space with deep learning, i.e. with which algorithm? I would use frameworks, of course; however, my question is also about the theoretical background. One should understand a little about the technology, for example in order to specify a meaningful action space. – Stefe Klauou – 2016-10-27T06:54:48.973

Try this tutorial: https://medium.com/@awjuliani/super-simple-reinforcement-learning-tutorial-part-2-ded33892c724 – Eka – 2016-10-27T08:29:16.170