Journal of Communication University of China (Science and Technology) (中国传媒大学学报自然科学版)

A family of directed Q-Learning based power allocation methods for uplink/downlink multi-service concurrency in beyond 5G wireless networks

Submitted:

2022-04-20

DOI:

Keywords:

wireless networks; power allocation; machine learning; beyond 5G; multi-service

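Funding:

Beijing Natural Science Foundation-Haidian Original Innovation Joint Fund, Frontier Project (L202012); project funded by the Beijing University of Posts and Telecommunications-China Mobile Research Institute Joint Innovation Center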

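Name	Affiliation
Huan Jingwen (还婧文)	School of Information and Communication Engineering, Beijing University of Posts and Telecommunications
Yang Shaoshi (杨少石)	School of Information and Communication Engineering, Beijing University of Posts and Telecommunications; Key Laboratory of Universal Wireless Communications, Ministry of Education
Yuan Tianhao (袁田浩)	School of Information and Communication Engineering, Beijing University of Posts and Telecommunications
Meng Kuo (孟阔)	School of Information and Communication Engineering, Beijing University of Posts and Telecommunications
Bi Jiahui (毕嘉辉)	School of Information and Communication Engineering, Beijing University of Posts and Telecommunications
Tang Yurong (唐玉蓉)	China Mobile Research Institute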


Abstract:

In the beyond 5G era, wireless networks based on dynamic time-division duplexing (D-TDD) must support the coexistence and concurrency of multiple service types that differ in transmission direction, rate, latency, and reliability requirements, which gives rise to a complex inter-cell cross-link interference problem. This paper proposes a family of directed Q-Learning based power allocation methods for uplink/downlink multi-service concurrency, in which the mean opinion score (MOS) serves as the metric of users' quality of experience (QoE) across services and guides the allocation of transmit power at the base stations and the users. By improving how the Q-table is updated after new users join the system, three optimized Q-Learning algorithms are proposed. Simulation results show that, when the number of users changes dynamically, the improved algorithms maintain reasonable MOS values and congestion rates while reducing the number of iterations and improving convergence performance.
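The abstract does not spell out the update rule, so the following is only a minimal sketch of the standard tabular Q-Learning update that such methods build on, not the authors' directed variants. The state count, the discrete power levels, the reward value (standing in for an MOS-like score), and the parameters `alpha` and `gamma` are all illustrative assumptions.

```python
def q_update(q_table, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Apply one standard tabular Q-Learning update in place; return the new Q-value."""
    # Temporal-difference target: immediate reward plus discounted best next value.
    td_target = reward + gamma * max(q_table[next_state])
    q_table[state][action] += alpha * (td_target - q_table[state][action])
    return q_table[state][action]


# Hypothetical setup: actions are discrete transmit-power levels (values assumed).
POWER_LEVELS_DBM = [10.0, 17.0, 23.0]
NUM_STATES = 2  # assumed number of abstract network states

# Q-table initialised to zero: one row per state, one column per power level.
q_table = [[0.0] * len(POWER_LEVELS_DBM) for _ in range(NUM_STATES)]

# One update with an assumed MOS-like reward of 4.0 for choosing power level 1.
new_value = q_update(q_table, state=0, action=1, reward=4.0, next_state=1)

# Greedy action for state 0 after the update.
best_action = max(range(len(POWER_LEVELS_DBM)), key=lambda a: q_table[0][a])
```

How the Q-table of a newly joined user is initialised or updated is exactly where the three proposed algorithms differ; the abstract gives no details, so that step is deliberately left out of this sketch.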

References: