Directed Exploration In Deep Reinforcement Learning Through Optimism

Authors

Hashemi, Seyed Masih

Date

2024-04-02

Type

thesis

Language

eng

Keyword

Reinforcement Learning, Machine Learning, Exploration-Exploitation, Deep Reinforcement Learning, Q-Learning, Optimism

Abstract

Reinforcement learning is the science of decision-making inspired by behavioral psychology and based on learning from positive or negative rewards. One of the main dilemmas in any reinforcement learning task is the exploration-exploitation tradeoff: the question of when the agent should exploit known rewards and when it should take actions it believes are suboptimal in order to learn more about its environment. The problem is magnified in environments with deceptive or sparse rewards, where the agent seldom receives feedback on the task it is trying to learn. In deep reinforcement learning, intrinsic motivation and directed exploration are the dominant strategies. Optimistic initialization of value estimates has proven useful when tasks are simple enough for tabular methods to be applied. In this thesis, I first devise a custom sparse-reward, deceptive-reward environment with a one-to-one mapping between game images and states. I also outline a methodology for optimistically initializing a Deep Q-Network using some prior knowledge of the environment, and show that this simple strategy can be effective even when compared with more complicated intrinsic-reward methods. Furthermore, I propose a novel algorithm that generates optimistic-like exploration without prior knowledge of the environment by detecting and exploring pessimistic (under-valued) regions.
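
As a rough illustration of the tabular case the abstract refers to, the sketch below runs purely greedy Q-learning on a small sparse-reward chain, once with zero-initialized values and once with optimistic ones. It is not the thesis's environment, its Deep Q-Network initialization, or its proposed algorithm. The chain layout, the function name `run_greedy_q_learning`, and all hyperparameters are assumptions made for this example.

```python
def run_greedy_q_learning(q_init, n_states=5, episodes=20, max_steps=100,
                          alpha=0.5, gamma=0.9):
    """Greedy tabular Q-learning on a hypothetical chain with one sparse reward.

    States are 0..n_states-1; the agent starts at 0 and receives reward 1
    only on reaching the last state. Returns the number of episodes that
    reached the goal.
    """
    LEFT, RIGHT = 0, 1
    goal = n_states - 1
    # Every state-action value starts at q_init (the quantity being compared).
    q = [[q_init, q_init] for _ in range(n_states)]
    successes = 0
    for _episode in range(episodes):
        s = 0
        for _step in range(max_steps):
            # Purely greedy choice, no epsilon; ties go to LEFT.
            a = LEFT if q[s][LEFT] >= q[s][RIGHT] else RIGHT
            s_next = max(s - 1, 0) if a == LEFT else s + 1
            done = s_next == goal
            r = 1.0 if done else 0.0
            # Standard Q-learning target; no bootstrapping from the terminal state.
            target = r if done else r + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
            s = s_next
            if done:
                successes += 1
                break
    return successes


# q_init = 1.0 is an upper bound on the return here, since the only reward is 1.
zero_init_successes = run_greedy_q_learning(q_init=0.0)
optimistic_successes = run_greedy_q_learning(q_init=1.0)
print(zero_init_successes, optimistic_successes)
```

In this toy setting the zero-initialized agent never leaves the start state, because every value stays at zero and the tie-break always selects the same action. With optimistic values, each tried action drops below its untried alternatives, which pushes the greedy agent toward unvisited states until it finds the reward.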

License

Queen's University's Thesis/Dissertation Non-Exclusive License for Deposit to QSpace and Library and Archives Canada
ProQuest PhD and Master's Theses International Dissemination Agreement
Intellectual Property Guidelines at Queen's University
Copying and Preserving Your Thesis
This publication is made available by the authority of the copyright owner solely for the purpose of private study and research and may not be copied or reproduced except as permitted by the copyright laws without written authority from the copyright owner.