Decentralized Learning in Stochastic and Mean-Field Games

Authors

Yongacoglu, Bora

Type

thesis

Language

eng

Keyword

game theory, multi-agent systems, reinforcement learning, stochastic games, mean-field games

Abstract

Multi-agent reinforcement learning (MARL) is the study of strategic interaction between multiple learning agents that coexist in a shared environment. Compared to single-agent theory, MARL theory is plagued by several inherent challenges, including decentralized information and non-stationary feedback, the latter arising when agents adapt their behaviour in the course of learning. This thesis is concerned with decentralized MARL in stochastic and mean-field games, with three primary objectives: (1) to identify structural properties of games that are useful and relevant to MARL; (2) to exploit observed structure to inform algorithm design; and (3) to describe the emergent behaviour in a system when all agents use subjectively justified algorithms.

We begin our study in the context of stochastic games with independent learners: agents who do not observe the actions of other agents. We study sequences of joint policies obtained by updating each agent's policy according to some policy update rule. Restricting our attention to update rules of the so-called epsilon-satisficing type, a natural class arising frequently in MARL, we define a structural property for games, which we call the epsilon-satisficing paths property. We prove that symmetric games and two-player games have this property, and we observe that it has desirable consequences for algorithm design.

To study MARL in a setting with greater decentralization, we turn to partially observed N-player mean-field games. In this setting, learning agents have only a partial view of the system and may not appreciate the importance of strategic interaction. Agents in mean-field games may therefore naively resort to single-agent learning algorithms. We investigate the use of such naive single-agent algorithms in mean-field games and give sufficient conditions for their convergence.
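The epsilon-satisficing update rules mentioned above can be illustrated with a minimal sketch. This is not the thesis's algorithm; the function name, the pure-policy representation, and the Q-value estimates below are illustrative assumptions:

```python
def epsilon_satisficing_update(policy_action, q_values, epsilon):
    """One epsilon-satisficing policy update (illustrative sketch).

    The agent keeps its current action whenever its estimated value is
    within epsilon of the best estimated value; otherwise it switches
    to a greedy (best-estimate) action.
    """
    best = max(q_values)
    if q_values[policy_action] >= best - epsilon:
        return policy_action          # satisficed: keep the current policy
    return q_values.index(best)       # not satisficed: switch to greedy

# A satisficed agent keeps its current choice even when slightly suboptimal:
print(epsilon_satisficing_update(0, [0.95, 1.0], epsilon=0.1))  # -> 0
# An agent far from its best estimate revises its policy:
print(epsilon_satisficing_update(0, [0.2, 1.0], epsilon=0.1))   # -> 1
```

Under such a rule the agent exhibits inertia: it revises its policy only when its current choice falls more than epsilon short of its best estimate, which is what makes sequences of satisficing updates (satisficing paths) a useful structural notion.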
We subsequently develop a theory of subjective equilibrium in which each player uses naively learned estimates to evaluate its own performance. Finally, we develop a subjective analogue of our epsilon-satisficing theory and prove several structural properties of mean-field games within this framework. These structural results are then used to design decentralized MARL algorithms for N-player mean-field games, whose convergence behaviour is rigorously analyzed.
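The idea of naive single-agent learning in a mean-field game can likewise be sketched. The congestion-style reward, round-robin exploration, and damped population update below are all hypothetical choices made for illustration, not the model studied in the thesis:

```python
def q_learn_vs_mean_field(mu, reward, episodes=500, lr=0.2):
    """Naive single-agent Q-learning in a bandit-style mean-field stage game.

    The agent treats the population fraction mu as part of a fixed
    environment (it is unaware of strategic interaction) and learns
    Q-values for its two actions.
    """
    q = [0.0, 0.0]
    for t in range(episodes):
        a = t % 2                      # round-robin exploration (illustrative)
        q[a] += lr * (reward(a, mu) - q[a])
    return q

# Hypothetical congestion-style reward: action 1 is less attractive the
# larger the fraction mu of the population already choosing it.
reward = lambda a, mu: (1.0 - mu) if a == 1 else mu

# Damped iteration toward a mean-field-consistent population fraction:
mu = 0.1
for _ in range(200):
    q = q_learn_vs_mean_field(mu, reward)
    br = 1.0 if q[1] > q[0] else 0.0   # greedy best response to learned values
    mu = mu + 0.1 * (br - mu)          # damped population update

print(round(mu, 2))  # settles near the consistent fraction mu* = 0.5
```

Here each learner ignores strategic interaction and treats mu as a stationary environment; iterating the population's response to the naively learned values drives mu toward (a small oscillation around) the consistent value 0.5.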

License

Queen's University's Thesis/Dissertation Non-Exclusive License for Deposit to QSpace and Library and Archives Canada
ProQuest PhD and Master's Theses International Dissemination Agreement
This publication is made available by the authority of the copyright owner solely for the purpose of private study and research and may not be copied or reproduced except as permitted by the copyright laws without written authority from the copyright owner.
Attribution-ShareAlike 3.0 United States
