Dynamic Reinforcement Learning-based Resource Allocation For Grant-Free Access

Elsayem, Mariam
Grant-free, Reinforcement Learning, 5G, Dynamic Allocation, Ensemble Learning
Cellular networks have evolved to deliver high-speed broadband services that support the requirements of Internet of Things (IoT) applications, which demand high speed, low latency, and massive capacity. A primary market goal is to provide support for ultra-reliable low latency communication (URLLC). Among the use cases enabled by 5G, URLLC is of major importance for mission-critical IoT applications such as smart transportation, industrial automation, telemedicine, and the tactile Internet. However, URLLC requires radio latency below 1 ms, as defined by the 3rd Generation Partnership Project (3GPP). One of the promising technologies for meeting this requirement is grant-free (GF) access for uplink resources. The GF scheme enables the user equipment (UE) to transmit data over pre-allocated resources, which reduces communication latency. When implementing the GF scheme, two objectives must be accomplished jointly. The first is to optimally select UEs for GF access. The second is to allocate the proper number of resources for the given environment, ensuring low latency while minimizing resource wastage. The main challenge in implementing GF access in wireless networks is that the environment can change frequently, because UEs typically exhibit a wide range of traffic patterns over time. Furthermore, because the UEs are sometimes mobile, their channel qualities and interference levels vary. This thesis proposes an intelligent Reinforcement Learning (RL) based allocation technique for GF access that is trained via Deep Q-Learning. RL can learn the intricacies of the network and the behavior of the connected users to optimize resource allocation without the need for large amounts of labeled data. The results show that the proposed RL scheme was able to improve the overall transmission latency of UEs for URLLC applications under relatively stable network conditions, achieving a latency of less than 20 transmission time intervals (TTIs) 95% of the time.
Additionally, since the proposed RL-based allocation technique incorporates ensemble learning, the proposed solution was able to cover a wide range of network scenarios. Moreover, the developed RL agent was capable of adapting to dynamic scenarios by updating the configurations, based on the UEs' feedback, to accurately select the stable UEs and an adequate number of resources that reduce wastage and ensure system stability.
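To illustrate the latency-versus-wastage trade-off described in the abstract, the following is a minimal sketch only. It uses plain tabular Q-learning rather than the Deep Q-Learning used in the thesis, and every name and number in it (the three traffic levels, the four allocation choices, the reward weights, and the rule that traffic level k needs k+1 resource blocks) is a hypothetical assumption made for illustration, not something taken from the thesis.

```python
import random

N_TRAFFIC_LEVELS = 3  # hypothetical discretised traffic states: low, medium, high
N_ACTIONS = 4         # hypothetical choices: pre-allocate 0, 1, 2 or 3 resource blocks


def reward(demand, allocated):
    """Toy reward: unmet demand (a proxy for latency) costs more than idle blocks (wastage)."""
    unmet = max(demand - allocated, 0)
    wasted = max(allocated - demand, 0)
    return -(2.0 * unmet + 1.0 * wasted)


def train(steps=5000, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning over (traffic level, allocation) pairs."""
    rng = random.Random(seed)
    q = [[0.0] * N_ACTIONS for _ in range(N_TRAFFIC_LEVELS)]
    state = rng.randrange(N_TRAFFIC_LEVELS)
    for _ in range(steps):
        # Epsilon-greedy action selection.
        if rng.random() < epsilon:
            action = rng.randrange(N_ACTIONS)
        else:
            action = max(range(N_ACTIONS), key=lambda a: q[state][a])
        demand = state + 1  # toy assumption: traffic level k needs k+1 blocks
        r = reward(demand, action)
        # Toy assumption: the next traffic level is drawn uniformly at random.
        next_state = rng.randrange(N_TRAFFIC_LEVELS)
        # Standard Q-learning temporal-difference update.
        td_target = r + gamma * max(q[next_state])
        q[state][action] += alpha * (td_target - q[state][action])
        state = next_state
    return q


def greedy_policy(q):
    """Return the highest-valued allocation for each traffic level."""
    return [max(range(N_ACTIONS), key=lambda a: q[s][a]) for s in range(len(q))]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

In this toy setting the learned greedy policy should settle on allocating exactly the number of blocks each traffic level demands, since any other choice incurs either the latency or the wastage penalty. A realistic agent would replace the table with a neural network and the synthetic demand with measured UE feedback.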