Deep-Reinforcement-Learning-Based AoI-Aware Resource Allocation for RIS-Aided IoV Networks

June 17, 2024 · Entered Twilight · 🏛 IEEE Transactions on Vehicular Technology

Repo contents: .idea, Data.txt, Environment3.py, RL_train3.py, RL_train4.py, RL_train5.py, RL_train6.py, Reward.png, Test, Test_no_ris.py, __pycache__, log, main_test.py, model, replay_memory.py, train3.py, train4.py, train5.py, train6.py, train7.py, train8.py, train9.py

Authors: Kangwei Qi, Qiong Wu, Pingyi Fan, Nan Cheng, Wen Chen, Jiangzhou Wang, Khaled B. Letaief
arXiv ID: 2406.11245
Category: cs.LG (Machine Learning)
Cross-listed: cs.DC, cs.NI, eess.SP
Citations: 45
Venue: IEEE Transactions on Vehicular Technology
Repository: https://github.com/qiongwu86/RIS-RB-AoI-V2X-DRL.git (⭐ 32)
Last Checked: 1 month ago

Abstract

Reconfigurable intelligent surface (RIS) is a pivotal technology in communication, offering an alternative path that significantly enhances link quality in wireless communication environments. In this paper, we propose a RIS-assisted internet of vehicles (IoV) network that uses vehicle-to-everything (V2X) communication. To improve the timeliness of vehicle-to-infrastructure (V2I) links and the stability of vehicle-to-vehicle (V2V) links, we introduce an age of information (AoI) model and a payload transmission probability model. With the objective of minimizing the AoI of V2I links and prioritizing the transmission of V2V link payloads, we formulate the optimization problem as a Markov decision process (MDP) in which the base station (BS) serves as the agent that allocates resources and controls the RIS phase shifts for the vehicles using the soft actor-critic (SAC) algorithm, which gradually converges and maintains high stability. An AoI-aware joint vehicular resource allocation and RIS phase-shift control scheme based on the SAC algorithm is proposed, and simulation results show that its convergence speed, cumulative reward, AoI performance, and payload transmission probability outperform those of proximal policy optimization (PPO), deep deterministic policy gradient (DDPG), twin delayed deep deterministic policy gradient (TD3), and stochastic algorithms.
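
To make the AoI objective concrete, the following is a minimal sketch of a generic slotted AoI update, not the paper's exact model (the abstract does not give it). It assumes that a link's age resets to one slot when its update is delivered and otherwise grows by one slot. The function name `update_aoi`, the `step` parameter, and the delivery patterns are all hypothetical and chosen for illustration only.

```python
def update_aoi(aoi, delivered, step=1.0):
    """Return the next AoI value for each link.

    aoi:       current age per link, in the same units as `step`
    delivered: True where the link's update got through in this slot
    step:      slot duration (hypothetical value, assumed for illustration)
    """
    # Assumed rule: a delivered update resets the age to one slot;
    # otherwise the age keeps growing by one slot.
    return [step if ok else age + step for age, ok in zip(aoi, delivered)]


# Three hypothetical V2I links over three slots with made-up delivery patterns.
ages = [1.0, 1.0, 1.0]
history = [ages]
for delivered in ([True, False, False],
                  [False, False, True],
                  [True, True, False]):
    ages = update_aoi(ages, delivered)
    history.append(ages)

for slot, snapshot in enumerate(history):
    print(f"slot {slot}: AoI = {snapshot}")
```

Under this assumed rule, a link that is repeatedly denied resources accumulates age linearly, which is the quantity an AoI-minimizing agent would try to keep small.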