Performance bounded reinforcement learning in strategic interactions

Bikramjit Banerjee, Jing Peng

Research output: Contribution to conference › Paper

23 Citations (Scopus)

Abstract

Despite increasing deployment of agent technologies in several business and industry domains, user confidence in fully automated agent-driven applications is noticeably lacking. The main reasons for this lack of trust in complete automation are scalability concerns and the absence of reasonable guarantees on the performance of self-adapting software. In this paper we address the latter issue in the context of learning agents in a Multiagent System (MAS). Performance guarantees for most existing on-line Multiagent Learning (MAL) algorithms are realizable only in the limit, which seriously limits their practical utility. Our goal is to provide meaningful guarantees about the performance of a learner in a MAS while it is learning. In particular, we present a novel MAL algorithm that (i) converges to a best response against stationary opponents, (ii) converges to a Nash equilibrium in self-play, and (iii) achieves a constant-bounded expected regret at any time (no-average-regret asymptotically) in arbitrarily sized general-sum games with non-negative payoffs, and against any number of opponents.
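
For context on claim (iii), the regret notion referenced above can be stated in its standard form. This is a generic textbook definition, not the paper's own formulation; the symbols used here (action set A, per-step payoff r_t, learner's action a_t, cumulative regret R_T) are introduced only for illustration.

% A standard external-regret definition (illustrative sketch, assuming the payoff r_t(a) of
% every action a is well defined at each step; not necessarily the exact quantity bounded
% in the paper).
\[
  R_T \;=\; \max_{a \in A} \sum_{t=1}^{T} r_t(a) \;-\; \sum_{t=1}^{T} r_t(a_t),
  \qquad
  \bar{R}_T \;=\; \frac{R_T}{T}.
\]
% "No-average-regret asymptotically" means \bar{R}_T \to 0 as T \to \infty. A constant
% bound on the expected regret E[R_T] that holds at every T is a stronger property, since
% it forces the expected average regret to vanish at rate O(1/T).

Under this reading, a constant bound on expected regret at any time implies the asymptotic no-average-regret property, consistent with the parenthetical remark in the abstract.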

Original language: English
Pages: 2-7
Number of pages: 6
State: Published - 9 Dec 2004
Event: Proceedings - Nineteenth National Conference on Artificial Intelligence (AAAI-2004): Sixteenth Innovative Applications of Artificial Intelligence Conference (IAAI-2004) - San Jose, CA, United States
Duration: 25 Jul 2004 - 29 Jul 2004

Other

Other: Proceedings - Nineteenth National Conference on Artificial Intelligence (AAAI-2004): Sixteenth Innovative Applications of Artificial Intelligence Conference (IAAI-2004)
Country: United States
City: San Jose, CA
Period: 25/07/04 - 29/07/04

Fingerprint

Reinforcement learning
Multi agent systems
Learning algorithms
Scalability
Industry
Automation

Cite this

Banerjee, B., & Peng, J. (2004). Performance bounded reinforcement learning in strategic interactions. 2-7. Paper presented at Proceedings - Nineteenth National Conference on Artificial Intelligence (AAAI-2004): Sixteenth Innovative Applications of Artificial Intelligence Conference (IAAI-2004), San Jose, CA, United States.
Banerjee, Bikramjit; Peng, Jing. / Performance bounded reinforcement learning in strategic interactions. Paper presented at Proceedings - Nineteenth National Conference on Artificial Intelligence (AAAI-2004): Sixteenth Innovative Applications of Artificial Intelligence Conference (IAAI-2004), San Jose, CA, United States. 6 p.
@conference{d21956bb01b0443dbf75e79568e2b713,
title = "Performance bounded reinforcement learning in strategic interactions",
abstract = "Despite increasing deployment of agent technologies in several business and industry domains, user confidence in fully automated agent driven applications is noticeably lacking. The main reasons for such lack of trust in complete automation are scalability and non-existence of reasonable guarantees in the performance of self-adapting software. In this paper we address the latter issue in the context of learning agents in a Multiagent System (MAS). Performance guarantees for most existing on-line Multiagent Learning (MAL) algorithms are realizable only in the limit, thereby seriously limiting its practical utility. Our goal is to provide certain meaningful guarantees about the performance of a learner in a MAS, while it is learning. In particular, we present a novel MAL algorithm that (i) converges to a best response against stationary opponents, (ii) converges to a Nash equilibrium in self-play and (iii) achieves a constant bounded expected regret at any time (no-average-regret asymptotically) in arbitrary sized general-sum games with non-negative payoffs, and against any number of opponents.",
author = "Bikramjit Banerjee and Jing Peng",
year = "2004",
month = "12",
day = "9",
language = "English",
pages = "2--7",
note = "null ; Conference date: 25-07-2004 Through 29-07-2004",

}

Banerjee, B & Peng, J 2004, 'Performance bounded reinforcement learning in strategic interactions' Paper presented at Proceedings - Nineteenth National Conference on Artificial Intelligence (AAAI-2004): Sixteenth Innovative Applications of Artificial Intelligence Conference (IAAI-2004), San Jose, CA, United States, 25/07/04 - 29/07/04, pp. 2-7.

Performance bounded reinforcement learning in strategic interactions. / Banerjee, Bikramjit; Peng, Jing.

2004. 2-7. Paper presented at Proceedings - Nineteenth National Conference on Artificial Intelligence (AAAI-2004): Sixteenth Innovative Applications of Artificial Intelligence Conference (IAAI-2004), San Jose, CA, United States.

Research output: Contribution to conference › Paper

TY - CONF

T1 - Performance bounded reinforcement learning in strategic interactions

AU - Banerjee, Bikramjit

AU - Peng, Jing

PY - 2004/12/9

Y1 - 2004/12/9

N2 - Despite increasing deployment of agent technologies in several business and industry domains, user confidence in fully automated agent-driven applications is noticeably lacking. The main reasons for this lack of trust in complete automation are scalability concerns and the absence of reasonable guarantees on the performance of self-adapting software. In this paper we address the latter issue in the context of learning agents in a Multiagent System (MAS). Performance guarantees for most existing on-line Multiagent Learning (MAL) algorithms are realizable only in the limit, which seriously limits their practical utility. Our goal is to provide meaningful guarantees about the performance of a learner in a MAS while it is learning. In particular, we present a novel MAL algorithm that (i) converges to a best response against stationary opponents, (ii) converges to a Nash equilibrium in self-play, and (iii) achieves a constant-bounded expected regret at any time (no-average-regret asymptotically) in arbitrarily sized general-sum games with non-negative payoffs, and against any number of opponents.

AB - Despite increasing deployment of agent technologies in several business and industry domains, user confidence in fully automated agent-driven applications is noticeably lacking. The main reasons for this lack of trust in complete automation are scalability concerns and the absence of reasonable guarantees on the performance of self-adapting software. In this paper we address the latter issue in the context of learning agents in a Multiagent System (MAS). Performance guarantees for most existing on-line Multiagent Learning (MAL) algorithms are realizable only in the limit, which seriously limits their practical utility. Our goal is to provide meaningful guarantees about the performance of a learner in a MAS while it is learning. In particular, we present a novel MAL algorithm that (i) converges to a best response against stationary opponents, (ii) converges to a Nash equilibrium in self-play, and (iii) achieves a constant-bounded expected regret at any time (no-average-regret asymptotically) in arbitrarily sized general-sum games with non-negative payoffs, and against any number of opponents.

UR - http://www.scopus.com/inward/record.url?scp=9444299000&partnerID=8YFLogxK

M3 - Paper

AN - SCOPUS:9444299000

SP - 2

EP - 7

ER -

Banerjee B, Peng J. Performance bounded reinforcement learning in strategic interactions. 2004. Paper presented at Proceedings - Nineteenth National Conference on Artificial Intelligence (AAAI-2004): Sixteenth Innovative Applications of Artificial Intelligence Conference (IAAI-2004), San Jose, CA, United States.