I'm currently having trouble winning against a random bot at the Schieber Jass game. It is an imperfect-information card game (popular in Switzerland: https://www.schieber.ch/).

The environment I'm using is on GitHub: https://github.com/murthy10/pyschieber

To give a brief overview of Schieber Jass, I will describe the main characteristics of the game. Schieber Jass is played by four players forming two teams. At the beginning, each player is randomly dealt nine of the 36 cards. The game then consists of nine tricks, and in every trick each player has to play one card. According to the rules of the game, the "highest card" wins the trick and its team gets the points. The goal is therefore to score more points than the opposing team.

There are several more rules, but I think you can imagine roughly how the game works.

Now I'm trying to apply a DQN approach to the game.

My attempts so far:

- I let two independent reinforcement learning players play against two random players.
- I designed the input state as a one-hot encoded vector of 36 "bits" per player, repeated nine times — once for each card a player can play during a game.
- The output is a vector of 36 "bits", one for every possible card.
- If the greedy output of the network suggests an invalid action, I take the allowed action with the highest predicted value instead.
- The reward is +1 for winning, -1 for losing, -0.1 for an invalid action, and 0 for any action that doesn't lead to a terminal state.
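To make the encoding and the invalid-action handling concrete, here is a minimal NumPy sketch. The exact layout (`encode_state` indexing tricks × cards) and the function names are my own illustrative assumptions, not pyschieber's API; `masked_greedy_action` shows the common alternative of masking illegal actions with `-inf` before the argmax, rather than penalising them afterwards.

```python
import numpy as np

NUM_CARDS = 36   # Jass deck size
NUM_TRICKS = 9   # tricks per game

def encode_state(played):
    """One-hot encode the cards played so far.
    `played` is a list of up to 9 lists, one per trick, each holding the
    card indices (0-35) played in that trick. Returns a flat 9*36 vector.
    (Hypothetical layout for illustration.)"""
    state = np.zeros((NUM_TRICKS, NUM_CARDS), dtype=np.float32)
    for trick, cards in enumerate(played):
        for card in cards:
            state[trick, card] = 1.0
    return state.ravel()

def masked_greedy_action(q_values, legal_cards):
    """Pick the argmax Q-value among the legal cards only, so the
    network can never select an invalid action in the first place."""
    mask = np.full(NUM_CARDS, -np.inf, dtype=np.float32)
    mask[list(legal_cards)] = 0.0          # legal actions keep their Q-value
    return int(np.argmax(q_values + mask)) # illegal actions become -inf
```

With masking applied at action-selection (and training) time, the -0.1 penalty for invalid actions becomes unnecessary, since the agent never sees them.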

My questions:

- Would it be helpful to use an LSTM and reduce the input state?
- How should I handle invalid moves?
- Do you have some good ideas for improvements (e.g. Neural Fictitious Self-Play or something similar)?
- Or is this whole approach absolute nonsense?

Welcome to AI! Sounds like an interesting project. – DukeZhou – 2018-03-28T18:43:25.057