Directed Exploration In Deep Reinforcement Learning Through Optimism
Authors
Hashemi, Seyed Masih
Date
2024-04-02
Type
thesis
Language
eng
Keyword
Reinforcement Learning, Machine Learning, Exploration-Exploitation, Deep Reinforcement Learning, Q-Learning, Optimism
Abstract
Reinforcement learning is the science of decision-making, inspired by behavioral psychology and based on learning from positive and negative rewards. One of the main dilemmas in any reinforcement learning task is the exploration-exploitation tradeoff: when should the agent exploit known rewards, and when should it take actions it believes to be suboptimal in order to learn more about its environment? This problem is magnified in environments with deceptive or sparse rewards, where the agent seldom receives feedback on the task it is trying to learn. In deep reinforcement learning, intrinsic motivation and directed exploration are the dominant strategies. Optimistic initialization of value estimates has been shown to be useful when tasks are simple enough for tabular methods to be applied. In this thesis, I first devise a custom environment with sparse and deceptive rewards and a one-to-one mapping between game images and states. I then outline a methodology for optimistically initializing a Deep Q-Network using some prior knowledge of the environment, and show that this simple strategy can be effective even when compared with more complicated intrinsic-reward methods. Finally, I propose a novel algorithm that generates optimistic-like exploration without prior knowledge of the environment by detecting and exploring pessimistic (under-valued) regions.
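The tabular case mentioned in the abstract can be illustrated with a small sketch. It is not the thesis's environment or algorithm: the chain task, the function name `run_greedy_q_learning`, and all parameter values are assumptions chosen for illustration. A purely greedy Q-learning agent is run on a sparse-reward chain, once with zero-initialized values and once with every value initialized to 1.0, which is an upper bound on the return in this toy task.

```python
def run_greedy_q_learning(q_init, n_states=8, episodes=200, max_steps=50,
                          alpha=0.5, gamma=0.95):
    """Purely greedy tabular Q-learning on a hypothetical chain task.

    States are 0..n_states-1 and the agent starts at 0. Action 0 moves left
    (staying put at state 0), action 1 moves right. The only reward is 1.0
    for reaching the last state, which ends the episode.
    Returns the Q-table and the number of episodes that reached the goal.
    """
    q = [[q_init, q_init] for _ in range(n_states)]
    goal = n_states - 1
    goal_hits = 0
    for _ in range(episodes):
        s = 0
        for _ in range(max_steps):
            # Greedy action selection with no random exploration;
            # ties resolve to action 0 ("left").
            a = 0 if q[s][0] >= q[s][1] else 1
            s_next = max(s - 1, 0) if a == 0 else s + 1
            done = s_next == goal
            r = 1.0 if done else 0.0
            # Standard one-step Q-learning target; no bootstrap at the goal.
            target = r if done else r + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
            s = s_next
            if done:
                goal_hits += 1
                break
    return q, goal_hits


# Zero initialization: every update target is 0, so the greedy agent
# keeps choosing "left" and never sees the reward.
_, hits_zero = run_greedy_q_learning(q_init=0.0)

# Optimistic initialization: each visit lowers the value of the action
# just tried, so untried actions look better and get explored.
_, hits_optimistic = run_greedy_q_learning(q_init=1.0)

print("goal reached (zero init):", hits_zero)
print("goal reached (optimistic init):", hits_optimistic)
```

In this sketch the optimistic agent explores only because untried actions keep their high initial value while tried ones decay toward their discounted targets. With a function approximator such as a Deep Q-Network there is no table to fill in directly, which is the setting the thesis addresses.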
License
Queen's University's Thesis/Dissertation Non-Exclusive License for Deposit to QSpace and Library and Archives Canada
ProQuest PhD and Master's Theses International Dissemination Agreement
Intellectual Property Guidelines at Queen's University
Copying and Preserving Your Thesis
This publication is made available by the authority of the copyright owner solely for the purpose of private study and research and may not be copied or reproduced except as permitted by the copyright laws without written authority from the copyright owner.