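Qi TAN,
Lanqin HE,
Lun TANG
1. School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
2. Key Laboratory of Mobile Communications Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
Funds: National Natural Science Foundation of China (62071078), Science and Technology Research Program of Chongqing Municipal Education Commission (KJZD-M20180601), Major Theme Special Projects of Chongqing (cstc2019jscx-zdztzxX0006)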
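Details
Author biographies: Qianbin CHEN: male, born in 1967, professor and doctoral supervisor. His main research interests include personal communications, multimedia information processing and transmission, and heterogeneous cellular networks
Qi TAN: female, born in 1995, master's student. Her research interests include 5G network C-RAN, resource allocation, and dynamic optimization theory
Lanqin HE: male, born in 1995, master's student. His research interests include 5G network slicing and machine learning algorithms
Lun TANG: male, born in 1973, professor, Ph.D. His main research interests include next-generation wireless communication networks, heterogeneous cellular networks, and software-defined wireless networks
Corresponding author: Qianbin CHEN, cqb@cqupt.edu.cn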
CLC number: TN915
Publication history
Received: 2020-04-10
Revised: 2021-03-02
Published online: 2021-03-30
Issue date: 2021-09-16
Research on Resource Allocation and Offloading Decision Based on Multi-agent Architecture in Cloud-fog Hybrid Network
Qianbin CHEN, Qi TAN,
Lanqin HE,
Lun TANG
1. School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
2. Key Laboratory of Mobile Communications Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
Funds: The National Natural Science Foundation of China (62071078), The Science and Technology Research Program of Chongqing Municipal Education Commission (KJZD-M20180601), The Major Theme Special Projects of Chongqing (cstc2019jscx-zdztzxX0006)
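Abstract
Abstract: To address the optimization of resource allocation and task offloading decisions in a D2D-assisted cloud-fog hybrid architecture, this paper proposes a resource allocation and offloading decision algorithm based on multi-agent deep reinforcement learning. First, the algorithm considers incentive constraints, energy constraints, and network resource constraints, jointly optimizes wireless resource allocation, computing resource allocation, and offloading decisions, establishes a stochastic optimization model that maximizes the total user Quality of Experience (QoE) of the system, and further transforms it into a Markov decision process (MDP) problem. Second, the algorithm factorizes the original MDP problem and builds a Markov game model. Then, a centralized training and distributed execution mechanism is proposed based on the Actor-Critic (AC) algorithm. During centralized training, the agents cooperate to obtain global information and optimize the resource allocation and task offloading decision policies; after training ends, each agent independently performs resource allocation and task offloading according to the current system state and its policy. Finally, simulation results show that the algorithm effectively improves user QoE and reduces delay and energy consumption.
Keywords: Cloud-fog hybrid network/
D2D/
Multi-agent/
Resource allocation/
Computation offloading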
Abstract: To optimize resource allocation and task offloading decisions in a D2D-assisted cloud-fog hybrid architecture, a joint resource allocation and offloading decision algorithm based on multi-agent deep reinforcement learning is proposed. Firstly, considering incentive constraints, energy constraints, and network resource constraints, the algorithm jointly optimizes wireless resource allocation, computing resource allocation, and offloading decisions. It establishes a stochastic optimization model that maximizes the total user Quality of Experience (QoE) of the system, and transforms this model into a Markov decision process (MDP) problem. Secondly, the algorithm factorizes the original MDP problem and formulates a Markov game. Then, a centralized training and distributed execution mechanism based on the Actor-Critic (AC) algorithm is proposed. In the centralized training process, the agents obtain global information through cooperation to optimize the resource allocation and task offloading decision strategies. After the training process, each agent independently performs resource allocation and task offloading based on the current system state and its strategy. Finally, the simulation results demonstrate that the algorithm effectively improves user QoE and reduces delay and energy consumption.
Key words: Cloud-fog hybrid network/
D2D/
Multi-agent/
Resource allocation/
Computation offloading
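The centralized training and distributed execution idea summarized in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm or system model: the two-agent setting, the observation and action sets, the reward function `toy_qoe`, the learning rates, and all names below are hypothetical, chosen only to show the mechanism. Each actor sees only its own local observation, while a single critic evaluates the joint observation during training; at execution time the critic is no longer needed.

```python
import math
import random

N_AGENTS = 2    # hypothetical number of user devices (agents)
N_OBS = 2       # hypothetical local observation: 0 = light task, 1 = heavy task
N_ACTIONS = 3   # hypothetical actions: 0 = local, 1 = D2D/fog node, 2 = cloud


def softmax(prefs):
    """Numerically stable softmax over a list of action preferences."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def toy_qoe(obs, actions):
    """Hypothetical shared reward standing in for total QoE (an assumption)."""
    reward = 0.0
    for o, a in zip(obs, actions):
        if o == 0:
            reward += 1.0 if a == 0 else 0.2   # light task: local is best
        else:
            reward += 1.0 if a != 0 else 0.1   # heavy task: offloading is best
    if list(actions).count(1) > 1:
        reward -= 0.8                          # both agents congest the fog node
    return reward


class Actor:
    """Per-agent softmax policy that conditions only on the local observation."""

    def __init__(self):
        self.prefs = [[0.0] * N_ACTIONS for _ in range(N_OBS)]

    def policy(self, o):
        return softmax(self.prefs[o])

    def sample(self, o, rng):
        return rng.choices(range(N_ACTIONS), weights=self.policy(o))[0]

    def greedy(self, o):
        probs = self.policy(o)
        return probs.index(max(probs))

    def update(self, o, a, advantage, lr):
        # Policy-gradient step for a softmax policy:
        # d log pi(a|o) / d pref[k] = 1[k == a] - pi(k|o)
        probs = self.policy(o)
        for k in range(N_ACTIONS):
            grad = (1.0 if k == a else 0.0) - probs[k]
            self.prefs[o][k] += lr * advantage * grad


def train(episodes=5000, seed=0, actor_lr=0.1, critic_lr=0.1):
    """Centralized training: one critic over the joint observation."""
    rng = random.Random(seed)
    actors = [Actor() for _ in range(N_AGENTS)]
    critic = {}  # joint observation -> estimated value
    for _ in range(episodes):
        obs = tuple(rng.randrange(N_OBS) for _ in range(N_AGENTS))
        actions = [ac.sample(o, rng) for ac, o in zip(actors, obs)]
        reward = toy_qoe(obs, actions)
        value = critic.get(obs, 0.0)
        advantage = reward - value  # one-step episodes, so TD error = r - V(s)
        critic[obs] = value + critic_lr * advantage
        for ac, o, a in zip(actors, obs, actions):
            ac.update(o, a, advantage, actor_lr)
    return actors, critic


if __name__ == "__main__":
    trained_actors, _ = train()
    # Distributed execution: each agent acts on its own observation only.
    for i, ac in enumerate(trained_actors):
        print(i, [ac.greedy(o) for o in range(N_OBS)])
```

The critic is a lookup table keyed by the joint observation, which is only workable for a toy problem of this size; a deep reinforcement learning setting such as the one the abstract describes would use function approximation instead.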
PDF full-text download link:
https://jeit.ac.cn/article/exportPdf?id=e2e27b81-b2f6-4256-b8af-a7c2492a4c72