A MARL Approach for Finding Optimal Positions for VANET Aerial Base-stations on a Sparse Highway
Authors
Jiang, Bote
Date
Type
thesis
Language
eng
Keyword
Reinforcement Learning , Automated Systems , UAV , Connected Vehicles , Artificial Intelligence , VANET
Alternative Title
Abstract
A Vehicular Ad-Hoc Network (VANET) helps connected vehicles send and receive environmental and traffic information, making it a crucial component of fully autonomous roads. For VANETs to serve their purpose, there has to be sufficient coverage, even in areas where there is less demand. Moreover, much of the safety information is time-sensitive; excessive outage time in a vehicular network can increase the risk of fatal accidents. Unmanned Aerial Vehicles (UAVs) can be used as mobile base-stations to fill gaps in coverage. My work focuses on the placement of mobile base-stations for rural highways with sparse traffic, as this represents the worst-case scenario for vehicular communication. The goal is to maximize the number of road segments that satisfy a particular communication outage time constraint. I use Multi-Agent Reinforcement Learning (MARL) to learn the optimal placement strategy. The main benefit of MARL is that it allows the agents to learn complex strategies through experience. I propose a variation of traditional Deep Independent Q-Learning. The modifications include an observation function augmented with information directly shared between neighbouring agents, as well as a shared policy scheme. I also implement a lightweight custom sparse highway simulator that is used for training and testing my algorithm. The experiments show that the proposed MARL algorithm is able to learn the placement policies that produce the maximum rewards for different scenarios while adapting to the dynamic road densities along the highway segment. The experiments also show that the model is scalable, allowing the number of agents to increase without any modifications to the code. The model also displays robustness, as it is still able to resume function even after multiple single- and dual-point failures. Finally, I show that the model can be generalized, as the algorithm can be directly used, with similar performance, on an industry-standard simulator.
Future experiments can be performed to improve the realism and complexity of the highway models as well as to test the method on real-world data.
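The two modifications named in the abstract (a neighbour-augmented observation and a shared policy) can be illustrated with a small sketch. This is not the thesis implementation: it swaps the deep Q-network for a tabular Q-function, and the 10-cell highway, the coverage radius, the three-action set, the index-based notion of "neighbour", and the coverage-count reward are all hypothetical choices made only to keep the example short.

```python
import random
from collections import defaultdict

ACTIONS = (-1, 0, 1)   # hypothetical action set: move back, hover, move forward
N_CELLS = 10           # hypothetical highway length in discrete cells
RADIUS = 1             # hypothetical coverage radius in cells


def observe(positions, i):
    """Own position plus the positions reported by index-adjacent agents."""
    left = positions[i - 1] if i > 0 else -1
    right = positions[i + 1] if i < len(positions) - 1 else N_CELLS
    return (left, positions[i], right)


def coverage(positions):
    """Number of highway cells within RADIUS of at least one UAV."""
    return sum(
        any(abs(c - p) <= RADIUS for p in positions) for c in range(N_CELLS)
    )


def step(positions, actions):
    """Apply one action per agent, clamping to the highway bounds."""
    return [min(N_CELLS - 1, max(0, p + a)) for p, a in zip(positions, actions)]


def train(n_agents=3, episodes=500, horizon=20,
          alpha=0.1, gamma=0.9, eps=0.1, seed=0):
    """Independent Q-learning where every agent reads and updates one table."""
    rng = random.Random(seed)
    q = defaultdict(float)  # shared policy: a single Q-table for all agents
    for _ in range(episodes):
        positions = [rng.randrange(N_CELLS) for _ in range(n_agents)]
        for _ in range(horizon):
            obs = [observe(positions, i) for i in range(n_agents)]
            # Epsilon-greedy action selection, done separately per agent.
            acts = [
                rng.choice(ACTIONS) if rng.random() < eps
                else max(ACTIONS, key=lambda a, o=o: q[(o, a)])
                for o in obs
            ]
            positions = step(positions, acts)
            reward = coverage(positions)  # toy reward (assumption)
            next_obs = [observe(positions, i) for i in range(n_agents)]
            # Each agent applies its own Q-learning update to the shared table.
            for o, a, o2 in zip(obs, acts, next_obs):
                target = reward + gamma * max(q[(o2, b)] for b in ACTIONS)
                q[(o, a)] += alpha * (target - q[(o, a)])
    return q
```

Because the observation has the same shape for every agent and all agents share one table, `train(n_agents=5)` runs without any code change, which is one simple way a design of this kind can scale with the number of agents.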
Description
Citation
Publisher
License
Queen's University's Thesis/Dissertation Non-Exclusive License for Deposit to QSpace and Library and Archives Canada
ProQuest PhD and Master's Theses International Dissemination Agreement
Intellectual Property Guidelines at Queen's University
Copying and Preserving Your Thesis
This publication is made available by the authority of the copyright owner solely for the purpose of private study and research and may not be copied or reproduced except as permitted by the copyright laws without written authority from the copyright owner.
CC0 1.0 Universal
