How does Friend-or-Foe Q-learning intuitively work?


I have read about Q-learning and am now reading about multi-agent environments. I tried to read the paper "Friend-or-Foe Q-learning", but could not get more than a very vague idea of it.

What does Friend-or-Foe Q-learning mean? How does it work? Could someone please explain the concept in a simple yet descriptive way that helps build the correct intuition?

Harris Pat

Posted 2019-07-13T10:50:54.687

