RVσ(t): A unifying approach to performance and convergence in online multiagent learning

Bikramjit Banerjee, Jing Peng

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

5 Scopus citations

Abstract

We present a new multiagent learning algorithm (RVσ(t)) that can guarantee both no-regret performance (all games) and policy convergence (some games of arbitrary size). Unlike its predecessor ReDVaLeR, it (1) does not need to distinguish whether it is in self-play or facing otherwise non-stationary opponents, and (2) is allowed to know its portion of any equilibrium, which, we argue, leads to convergence in some games in addition to no-regret. Although the regret of RVσ(t) is analyzed in continuous time, we show that it grows slower than in other no-regret techniques like GIGA and GIGA-WoLF. We show that RVσ(t) can converge to coordinated behavior in coordination games, while GIGA and GIGA-WoLF may converge to poorly coordinated (mixed) behaviors.
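To make the contrast with GIGA-style learners concrete, the following is a minimal, hypothetical sketch (not the paper's RVσ(t) algorithm) of the standard GIGA update: projected gradient ascent with step size 1/√t, run in self-play on a 2x2 coordination game. The payoff matrix, starting strategies, and horizon below are illustrative assumptions, not values from the paper.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of v onto the probability simplex (standard sort-based method)."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    idx = np.arange(1, len(v) + 1)
    rho = np.nonzero(u + (1.0 - css) / idx > 0)[0][-1]
    theta = (1.0 - css[rho]) / (rho + 1)
    return np.maximum(v + theta, 0.0)

# Illustrative 2x2 coordination game: both players get 1 for matching actions, 0 otherwise.
A = np.eye(2)

x = np.array([0.5, 0.5])  # row player's mixed strategy (assumed symmetric start)
y = np.array([0.5, 0.5])  # column player's mixed strategy
for t in range(1, 10_001):
    eta = 1.0 / np.sqrt(t)              # GIGA's decaying step size
    gx = A @ y                          # gradient of row player's expected payoff w.r.t. x
    gy = A.T @ x                        # gradient of column player's expected payoff w.r.t. y
    x = project_simplex(x + eta * gx)   # projected gradient ascent step
    y = project_simplex(y + eta * gy)

print(x, y)  # both remain at [0.5 0.5]: expected payoff 0.5 rather than the coordinated 1.0
```

In this sketch the gradient at the uniform profile is identical for both actions, so the simplex projection cancels each update and play never leaves the poorly coordinated mixed fixed point; the abstract's claim is that RVσ(t) instead converges to coordinated behavior in such games.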

Original language: English
Title of host publication: Proceedings of the Fifth International Joint Conference on Autonomous Agents and Multiagent Systems
Pages: 798-800
Number of pages: 3
DOIs
State: Published - 2006
Event: Fifth International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS - Hakodate, Japan
Duration: 8 May 2006 - 12 May 2006

Publication series

Name: Proceedings of the International Conference on Autonomous Agents
Volume: 2006

Other

Other: Fifth International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS
Country/Territory: Japan
City: Hakodate
Period: 8/05/06 - 12/05/06

Keywords

  • Game theory
  • Multiagent learning
