Abstract
We present a new multiagent learning algorithm, RVσ(t), that guarantees both no-regret performance (in all games) and policy convergence (in some games of arbitrary size). Unlike its predecessor ReDVaLeR, it (1) does not need to distinguish whether its opponents are playing the same algorithm (self-play) or are otherwise non-stationary, and (2) is allowed to know its own portion of any equilibrium, which, we argue, leads to convergence in some games in addition to no-regret. Although the regret of RVσ(t) is analyzed in continuous time, we show that it grows more slowly than the regret of other no-regret techniques such as GIGA and GIGA-WoLF. We also show that RVσ(t) can converge to coordinated behavior in coordination games, whereas GIGA and GIGA-WoLF may converge to poorly coordinated (mixed) behaviors.
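For context, the "no-regret" property mentioned above refers to the standard notion of external regret. The following is a generic sketch of that definition, not taken from the paper itself; the notation is illustrative:

$$
R_T \;=\; \max_{a \in A} \sum_{t=1}^{T} r_t(a) \;-\; \sum_{t=1}^{T} r_t(a_t),
$$

where $r_t(a)$ is the reward action $a$ would have earned at step $t$ against the opponents' play, and $a_t$ is the action actually chosen. An algorithm is no-regret if $R_T / T \to 0$ as $T \to \infty$; in this paper the corresponding bound for RVσ(t) is analyzed in continuous time.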
Original language | English
---|---
Title of host publication | Proceedings of the Fifth International Joint Conference on Autonomous Agents and Multiagent Systems
Pages | 798-800
Number of pages | 3
Volume | 2006
State | Published - 1 Dec 2006
Event | Fifth International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS), Hakodate, Japan, 8 May 2006 → 12 May 2006
Keywords
- Game theory
- Multiagent learning