A Trustworthy Deep Reinforcement Learning Framework for Slicing in Next-Generation Open Radio Access Networks
Authors
Abdalla, Ahmad Mostafa Nagib Mohamad
Date
2025-01-03
Type
thesis
Language
eng
Keyword
Radio resource management (RRM), Deep reinforcement learning (DRL), Trustworthy DRL, Open Radio Access Network (O-RAN), RAN slicing, 6G, Inter-slice resource allocation, Transfer learning, Constrained reinforcement learning, Next-generation wireless networks
Abstract
Open radio access networks (O-RANs) represent a transformative architecture in mobile communications, enabling multiple services to coexist on the same infrastructure through network slicing. This allows mobile network operators (MNOs) to partition the network into distinct virtual slices, each tailored to the specific needs of one of the supported services. These services have diverse, sometimes conflicting, requirements, ranging from data-intensive services such as ultra-high-definition (UHD) video streaming to latency-sensitive services such as extended reality (XR) applications. Intelligent resource management algorithms are essential to ensuring these services simultaneously meet their performance requirements. While deep reinforcement learning (DRL) has shown promise in managing inter-slice resource allocation (RA), its practical application faces several challenges, such as generalization and safety, which hinder the widespread adoption of DRL algorithms in real environments.
This thesis makes several key contributions to address these challenges. First, we introduce a trustworthy reinforcement learning (RL) framework for O-RAN that systematically deals with such practical challenges in online deployment settings. Next, we propose a hybrid transfer learning (TL)-aided DRL approach, combining policy reuse and distillation methods, to enhance the generalization of DRL-based slicing policies to new network scenarios. We also develop a safe DRL-based slicing approach to reduce violations of the slices' latency requirements. This includes designing a reward function that reflects such requirements and learning a cost model that estimates the latency associated with an action. Finally, we design predictive mechanisms incorporating pre-trained policy selection and demand forecasting models to improve the performance of RL-based slicing agents under extreme network conditions. Together, these contributions advance the practical deployment of DRL-based resource management agents in O-RAN.
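To make the safe-DRL contribution concrete, the sketch below illustrates one common way a latency-aware reward can be shaped for a slicing agent. All names, SLA budgets, and weights here are illustrative assumptions for exposition, not the thesis's actual formulation:

```python
import numpy as np

# Hypothetical per-slice latency budgets (ms); values are assumptions,
# not taken from the thesis.
LATENCY_SLA_MS = {"uhd_video": 80.0, "xr": 10.0}

def slice_reward(throughput_mbps, latency_ms, slice_name, cost_weight=1.0):
    """Reward = throughput utility minus a penalty whenever the slice's
    latency SLA is violated (a common safe-RL reward-shaping pattern)."""
    utility = np.log1p(throughput_mbps)  # diminishing returns on throughput
    violation = max(0.0, latency_ms - LATENCY_SLA_MS[slice_name])
    return utility - cost_weight * violation  # penalize SLA violations

# In a constrained-RL setting, a learned cost model would supply an
# estimate of latency_ms before the action is taken, e.g. something
# like cost_model.predict(state, action), rather than the observed value.
r_ok = slice_reward(50.0, 8.0, "xr")    # within the 10 ms XR budget
r_bad = slice_reward(50.0, 15.0, "xr")  # 5 ms over budget, so penalized
assert r_bad < r_ok
```

A logarithmic utility is used here only to model diminishing returns on allocated bandwidth; the key design point is that the penalty term activates solely on SLA violations, steering the agent away from unsafe allocations without suppressing throughput-seeking behavior.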
Extensive simulations using real network traces demonstrate that our proposed trustworthy RL approaches significantly improve service level agreement (SLA) satisfaction and reduce latency while maintaining reasonable resource consumption across O-RAN slices. These results highlight the applicability of our methods in addressing the diverse service requirements of dynamic O-RAN deployment environments, particularly for immersive applications. While we focus on optimizing inter-slice RA within O-RAN, our framework offers a pathway toward more comprehensive, predictive resource management strategies, ensuring robust performance in uncertain network environments regardless of the architecture.