I'm working on the architecture for a game AI where, due to the nature of the game, the classical approach seems likely to be sufficient to beat most humans: the endgame is tractable, and traditional game-solving is worth pursuing due to certain intrinsic properties of the game. (Strategy is a direct application of minimax, although there is a great deal of nuance in that determination.)
The simplified rules for the game can be found here, and there is a free app (no ads, no purchases, no data collected) where you can try the mechanics of the basic, non-trivial game. (The tutorial takes about 10 minutes.)
At some point, we'd like to integrate some form of local NN, but the size of the database for learning/training would depend on how much storage the user is willing to devote to their "mbrane". (These will initially be mobile devices such as phones and tablets. The new iPad Pro has up to 512GB, but most devices will have much less.)
Is this even feasible? Today? At some point in the future?
- "Tuning" AI behavior to the human player
- Weighting complex stability states based on current board state to determine optimal placement
TUNING TO HUMAN BEHAVIOR
We want each discrete, local AI to evolve uniquely, in conjunction with human play, and adapt itself to the preferences of their respective humans.
Hardness: Most simply, we want AI hardness to be determined by a win/loss ratio against the human player. Hard = the human never wins. Easy = the human always wins. This forms a spectrum, so you might have settings for a 2/3 or 1/3 W/L ratio. These don't have to be precise, but the AI should tailor its play strength for a subsequent game based on the outcome of the previous game.
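The outcome-driven hardness adjustment described above could be sketched as follows. This is a minimal, hypothetical illustration, not the game's actual engine: the names (`target_human_win_rate`, the search-depth bounds) and the choice of search depth as the strength knob are assumptions.

```python
# Hypothetical sketch: nudge AI strength after each game toward a target
# human win rate. Search depth stands in for "play strength" here; any
# other strength parameter (evaluation noise, candidate-move pruning)
# could be adjusted the same way.

class DifficultyTuner:
    """Tracks the running human win rate and adjusts depth game by game."""

    def __init__(self, target_human_win_rate=0.5, min_depth=1, max_depth=8):
        self.target = target_human_win_rate
        self.min_depth = min_depth
        self.max_depth = max_depth
        self.depth = (min_depth + max_depth) // 2  # current search depth
        self.human_wins = 0
        self.games = 0

    def record_result(self, human_won):
        """Update the running ratio and set strength for the next game."""
        self.games += 1
        if human_won:
            self.human_wins += 1
        rate = self.human_wins / self.games
        # Human winning too often -> search deeper (play harder);
        # human losing too often -> ease off.
        if rate > self.target and self.depth < self.max_depth:
            self.depth += 1
        elif rate < self.target and self.depth > self.min_depth:
            self.depth -= 1
        return self.depth
```

Because the adjustment is one step per game, the tuning is coarse but imprecise by design, matching the "don't have to be precise" requirement.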
Individuality: We want the AIs to learn not only by self-play, but by play with their humans. It's ok if the process is glacial because the basic evaluation functions and perfect endgame will provide inherent AI strength. The main thing is that the discrete AIs develop uniquely. AIs will be able to play other AIs.
There are three simple stability states that arise out of the mechanics. These states can be strong or weak. They can be a hard T/F or, since the game is quantitative, values between 0 and 1, based on the respective regional deltas (Neutrality). "Flipping" a region refers to changing the dominant player in the region.
Stability (weak: can the region be flipped with a single placement? strong: can the polarity of a region be flipped by any number of placements?)
Epistability (weak: will the region flip under prior resolution? strong: is the region such that it cannot be epistabilized by further placements?)
Metastability (weak: can the region's position in the resolution order be meaningfully altered with a single placement? strong: can the position in the resolution order be meaningfully altered by any number of placements?)
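The continuous (0-to-1) version of these states mentioned above might look like the sketch below. This is a hedged illustration only: the `Region` fields and the normalization by a maximum delta are my assumptions, since the post doesn't specify how Neutrality is computed.

```python
# Hypothetical sketch: mapping a region's delta to a [0, 1] stability
# value instead of a hard T/F. The delta/max_delta fields are assumed,
# not taken from the game's actual scoring rule.

from dataclasses import dataclass

@dataclass
class Region:
    delta: float      # signed margin between the players in this region
    max_delta: float  # largest margin the region's geometry allows

def stability_value(region: Region) -> float:
    """Continuous stability in [0, 1]: 0 = perfectly neutral (easily
    flipped), 1 = margin is at the maximum the region allows."""
    if region.max_delta == 0:
        return 0.0
    return min(abs(region.delta) / region.max_delta, 1.0)
```

Epistability and metastability could get analogous continuous measures by normalizing over the resolution order rather than the regional delta.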
COMPLEX STABILITY STATES
The simple stability states combine into 8 "complex stability states", based on True/False for metastability/epistability/stability:
TTT Superstability ("super-stably stable")
FTT Semistability ("semi-stably stable")
TFT Mendaxastability ("super-unstably stable")
FFT Demistability ("semi-unstably stable")
TTF Contrastability ("super-stably unstable")
FTF Nonstability ("semi-stably unstable")
TFF Antistability ("super-unstably unstable")
FFF Khaotictivity ("semi-unstably unstable")
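The eight-state table above is small enough to encode directly as a lookup keyed by the (metastability, epistability, stability) triple; a sketch, transcribed straight from the list:

```python
# The eight complex stability states, keyed by T/F for
# (metastability, epistability, stability), as listed in the table.

COMPLEX_STATES = {
    (True,  True,  True):  "Superstability",
    (False, True,  True):  "Semistability",
    (True,  False, True):  "Mendaxastability",
    (False, False, True):  "Demistability",
    (True,  True,  False): "Contrastability",
    (False, True,  False): "Nonstability",
    (True,  False, False): "Antistability",
    (False, False, False): "Khaotictivity",
}

def complex_state(meta: bool, epi: bool, stab: bool) -> str:
    """Classify a region from its three simple-state booleans."""
    return COMPLEX_STATES[(meta, epi, stab)]
```

With the continuous 0-to-1 formulation, each boolean would become a threshold (or the lookup would become an 8-way soft weighting), but the discrete table is the natural starting point.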
There are two hierarchies: the ordering/weighting of metastability, epistability, and stability within the complex states (here the order is reversed for ease of interpreting the linguistic description of these states), and the ordering/weighting of the complex states themselves, currently with Superstability as the least important and Khaotictivity as the most important. (I.e., a strongly superstable region cannot be flipped and does not need to be reinforced; a khaotic region is very much "in play".)
It seems to me that deep learning might be very usefully applied in determining these hierarchies based on any given board state.
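One way to frame that learning target: the model outputs a weight per complex state, and placement candidates are scored by the weights of the regions they touch. The sketch below uses evenly spaced placeholder weights, ordered from Superstability (least urgent) to Khaotictivity (most urgent) per the hierarchy above; everything here is a hypothetical stand-in for what a trained network would produce as a function of board state.

```python
# Hypothetical sketch of the second hierarchy (ordering/weighting of the
# complex states). The placeholder weights are evenly spaced; a learned
# model would replace default_weights() with a board-state-conditioned
# output.

STATE_ORDER = [
    "Superstability", "Semistability", "Mendaxastability", "Demistability",
    "Contrastability", "Nonstability", "Antistability", "Khaotictivity",
]

def default_weights():
    """Evenly spaced weights in (0, 1], least to most important."""
    n = len(STATE_ORDER)
    return {state: (i + 1) / n for i, state in enumerate(STATE_ORDER)}

def score_placement(region_states, weights):
    """Score a candidate placement by summing the weights of the complex
    states of the regions it affects, so placements near khaotic
    ("in play") regions score highest."""
    return sum(weights[s] for s in region_states)
```

Training such a model against the perfect endgame (which supplies ground-truth outcomes) would sidestep the need for large labeled datasets, though storage for the self-play history remains the constraint noted earlier.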