Decentralized Learning in Stochastic and Mean-Field Games
Authors
Yongacoglu, Bora
Type
thesis
Language
eng
Keyword
game theory, multi-agent systems, reinforcement learning, stochastic games, mean-field games
Abstract
Multi-agent reinforcement learning (MARL) is the study of strategic interaction between multiple learning agents that coexist in a shared environment. Compared to its single-agent counterpart, MARL theory faces several inherent challenges, including decentralized information and non-stationary feedback, the latter arising as agents adapt their behaviour in the course of learning.
This thesis is concerned with decentralized MARL in stochastic and mean-field games, with three primary objectives: (1) to identify structural properties of games that are useful and relevant to MARL; (2) to exploit observed structure to inform algorithm design; (3) to describe the emergent behaviour in a system when all agents use subjectively justified algorithms.
We begin our study in the context of stochastic games with independent learners, agents who do not observe the actions of other agents. We study sequences of joint policies obtained by updating each agent's policy according to some policy update rule. Restricting our attention to update rules of the so-called epsilon-satisficing type, a natural class of update rules arising frequently in MARL, we define a structural property for games, which we call the epsilon-satisficing paths property. We prove that symmetric games and two-player games have this property, and we observe that this property has desirable consequences for algorithm design.
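The epsilon-satisficing idea above can be illustrated with a minimal sketch. All names here are hypothetical, and the best-response value is supplied directly rather than computed from the game: an agent keeps its current policy whenever its value is within epsilon of a best response, and otherwise switches to some other policy.

```python
import random

def epsilon_satisficing_update(policy, value, best_value, policies, epsilon, rng):
    """One step of a hypothetical epsilon-satisficing update rule.

    Keep the current policy if it is epsilon-close to a best response;
    otherwise switch to a randomly chosen policy from the finite set.
    """
    if value >= best_value - epsilon:
        return policy              # satisfied: no incentive to switch
    return rng.choice(policies)    # unsatisfied: revise the policy

rng = random.Random(0)
policies = ["a", "b", "c"]
# A satisfied agent keeps its policy; an unsatisfied agent may switch.
assert epsilon_satisficing_update("a", 0.95, 1.0, policies, 0.1, rng) == "a"
assert epsilon_satisficing_update("a", 0.50, 1.0, policies, 0.1, rng) in policies
```

Under a rule of this type, a joint policy at epsilon-equilibrium is absorbing, which is what makes the epsilon-satisficing paths property relevant to convergence analysis.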
To study MARL in a setting with greater decentralization, we turn to partially observed N-player mean-field games. In this setting, learning agents have only a partial view of the system and may not appreciate the importance of strategic interaction. Agents in mean-field games may therefore naively resort to using single-agent learning algorithms. We investigate the use of naive single-agent algorithms in mean-field games and give sufficient conditions for their convergence. We subsequently develop a theory of subjective equilibrium, in which each player uses its naively learned estimates to evaluate its own performance.
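A naive single-agent approach of the kind described above can be sketched as standard tabular Q-learning, with the agent treating its local observation as if it were the full state. The environment interface is hypothetical; only the update rule follows the standard form Q(s,a) <- (1-alpha) Q(s,a) + alpha (r + gamma max_b Q(s',b)).

```python
import collections

def naive_q_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """Single-agent Q-learning step applied to a local observation `s`,
    ignoring the presence of other agents in the system."""
    target = r + gamma * max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] = (1 - alpha) * Q[(s, a)] + alpha * target
    return Q

Q = collections.defaultdict(float)  # unseen (state, action) pairs start at 0
naive_q_update(Q, "s0", "a0", 1.0, "s1", ["a0", "a1"])
# With all initial values 0: (1 - 0.1) * 0 + 0.1 * (1.0 + 0.9 * 0) = 0.1
```

Because the other agents are also learning, the reward and transition statistics this update sees are non-stationary, which is why convergence requires the sufficient conditions studied in the thesis rather than holding automatically.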
Finally, we develop a subjective analog to our epsilon-satisficing theory and prove several structural properties of mean-field games within this framework. These structural results are then used to design decentralized MARL algorithms for N-player mean-field games, and the convergence behaviour of these algorithms is rigorously analyzed.
License
Queen's University's Thesis/Dissertation Non-Exclusive License for Deposit to QSpace and Library and Archives Canada
ProQuest PhD and Master's Theses International Dissemination Agreement
Intellectual Property Guidelines at Queen's University
Copying and Preserving Your Thesis
This publication is made available by the authority of the copyright owner solely for the purpose of private study and research and may not be copied or reproduced except as permitted by the copyright laws without written authority from the copyright owner.
Attribution-ShareAlike 3.0 United States