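Xiaoyu HE,
Xiao WANG,
Qi TAN,
Yanjuan HU,
Qianbin CHEN
1. School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
2. Key Laboratory of Mobile Communication, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
Funding: Science and Technology Research Program of Chongqing Municipal Education Commission (KJZD-M20180601), Major Theme Special Projects of Chongqing (cstc2019jscx-zdztzxX0006)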
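Details
About the authors: Lun TANG: male, born in 1973, professor and doctoral supervisor. His main research interests include next-generation wireless communication networks and heterogeneous cellular networks.
Xiaoyu HE: female, born in 1995, master's student. Her research interests are network slicing resource allocation and reinforcement learning.
Xiao WANG: male, born in 1995, master's student. His research interests are network slicing resource optimization and machine learning.
Qi TAN: female, born in 1995, master's student. Her research interests are 5G network slicing, resource allocation, and stochastic optimization theory.
Yanjuan HU: female, born in 1992, master's student. Her research interests are resource allocation and task offloading in mobile edge computing.
Qianbin CHEN: male, born in 1967, professor and doctoral supervisor. His main research interests include personal communications, multimedia information processing and transmission, next-generation mobile communication networks, and heterogeneous cellular networks.
Corresponding author: Xiaoyu HE, Hexy1995@163.com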
CLC number: TN929.5
Publication history
Received: 2020-04-21
Revised: 2020-09-28
Published online: 2020-09-30
Issue date: 2021-06-18
Resource Allocation Algorithm of Service Function Chain Based on Asynchronous Advantage Actor-Critic Learning
Lun TANG,
Xiaoyu HE,
Xiao WANG,
Qi TAN,
Yanjuan HU,
Qianbin CHEN
1. School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
2. Key Laboratory of Mobile Communication, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
Funds: The Science and Technology Research Program of Chongqing Municipal Education Commission (KJZD-M20180601), The Major Theme Special Projects of Chongqing (cstc2019jscx-zdztzxX0006)
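Abstract
Abstract: Given that global network information is difficult to obtain in practice, and to address the resource allocation optimization problem caused by User Equipment (UE) mobility and dynamic packet arrivals in radio access network slicing scenarios, this paper proposes a Service Function Chain (SFC) resource allocation algorithm based on Asynchronous Advantage Actor-Critic (A3C) learning. First, the algorithm establishes a blockchain-based resource management mechanism that uses blockchain technology to share and update global network information in a trusted manner and to supervise and record the SFC resource allocation process. Then, a delay minimization model is built for the joint allocation of radio, computing, and bandwidth resources under UE mobility and time-varying packet arrivals, and it is further transformed into a Markov Decision Process (MDP). Finally, A3C learning is applied to the resulting MDP to solve for the resource allocation policy. Simulation results show that the algorithm uses resources more reasonably and efficiently, optimizes system delay, and guarantees UE requirements.
Key words: Network slice/
Service function chain resource allocation/
Markov decision process/
Asynchronous advantage actor-critic learning/
Blockchain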
Abstract: Because global network information is hard to obtain in practice, and because the mobility of User Equipment (UE) and the dynamics of packet arrival complicate resource allocation in radio access network slices, a Service Function Chain (SFC) resource allocation algorithm based on Asynchronous Advantage Actor-Critic (A3C) learning is proposed. First, a blockchain-based resource management mechanism is established, which credibly shares and updates global network information and supervises and records the SFC resource allocation process. Then, a delay minimization model for the joint allocation of radio, computing, and bandwidth resources is built under UE mobility and time-varying packet arrival, and is further transformed into a Markov Decision Process (MDP). Finally, the A3C learning method is adopted to obtain the resource allocation strategy for this MDP. Simulation results show that the proposed algorithm utilizes resources more efficiently, optimizing system delay while guaranteeing the requirements of each UE.
Key words: Network slice/
Service Function Chain (SFC) resource allocation/
Markov Decision Process (MDP)/
Asynchronous Advantage Actor-Critic (A3C) learning/
Blockchain
Full-text PDF download:
https://jeit.ac.cn/article/exportPdf?id=29722ca2-125d-4200-ab67-6bf38ec51b77