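Anqi YIN1,
Tongzhou QU1,
Longmei NAN1, 2
1. The PLA Information Engineering University, Zhengzhou 450001, China
2. State Key Laboratory of ASIC and System, Fudan University, Shanghai 201203, China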
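Details
Author biographies: Zibin DAI: male, born in 1966, professor and doctoral supervisor; research interests: reconfigurable computing and dedicated security chip design
Anqi YIN: female, born in 1995, master's student; research interests: reconfigurable computing and information security
Tongzhou QU: male, born in 1994, master's student; research interests: reconfigurable computing and information security
Longmei NAN: female, born in 1981, lecturer, Ph.D.; research interests: reconfigurable security chip design
Corresponding author: Anqi YIN, yinaq0222@foxmail.com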
CLC number: TP338.6
Publication history
Received: 2018-06-26
Revised: 2018-11-27
Published online: 2018-12-03
Issue date: 2019-02-01
Efficient Workload Balance Technology on Many-core Crypto Processor
Zibin DAI1, Anqi YIN1,
Tongzhou QU1,
Longmei NAN1, 2
1. The PLA Information Engineering University, Zhengzhou 450001, China
2. State Key Laboratory of ASIC and System, Fudan University, Shanghai 201203, China
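Abstract
Uneven workload distribution is a major factor limiting resource utilization on many-core crypto platforms. Dynamic workload allocation can raise utilization, but it carries overhead, so a higher workload-balancing frequency does not necessarily yield a higher workload-balancing gain. This paper therefore establishes a mathematical model relating the workload-balancing gain rate to the workload-balancing frequency. Based on the model, it proposes a collision-free workload-balancing strategy for many-core crypto platforms and an "expandable-portable" workload-balancing engine built on hardware job queues, organized as an "inter-cluster micro-network and intra-cluster ring array". Experiments show that, in performance, delay-power product, resource utilization, and degree of workload balance, the proposed engine improves on the software technique based on "job stealing" by about 4.06 times, 7.17 times, 23.01%, and 2.15 times on average, and on the hardware technique based on "job stealing" by about 1.75 times, 2.45 times, 10.2%, and 1.41 times. Compared with the ideal hardware technique, cipher throughput drops by only about 5.67% on average (3% at the lowest). The results indicate that the proposed technique has good scalability and portability.
Keywords: Many-core crypto processor/
Workload balance strategy/
Workload balance engine/
Collision-free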
Abstract: Imbalanced workload distribution lowers the resource utilization of many-core crypto platforms. Dynamic workload allocation can improve utilization, but it incurs overhead, so a higher workload-balancing frequency does not necessarily bring higher gains. This paper therefore establishes a mathematical model relating the gain rate of workload balancing to its frequency. Based on this model, a collision-free workload-balancing policy is proposed for many-core crypto platforms, together with a hierarchical "expandable-portable" engine built on hardware job queues and organized as an "inter-cluster micro-network and intra-cluster ring array". Experimental results show that, in terms of performance, delay-power product, resource utilization, and degree of workload balance, the proposed engine improves on the software technique based on "job stealing" by about 4.06 times, 7.17 times, 23.01%, and 2.15 times on average, and on the hardware technique based on "job stealing" by about 1.75 times, 2.45 times, 10.2%, and 1.41 times. Compared with the ideal hardware technique, the average throughput of the cipher algorithms decreases by only about 5.67% (3% at the lowest). The experiments also indicate that the proposed technique is scalable and portable.
Key words:Many-core crypto processor/
Workload balance strategy/
Workload balance engine/
Collision-free
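
The abstract's claim that balancing more often does not always pay off can be illustrated with a toy calculation. The sketch below is not the model from the paper, which this page does not reproduce. It assumes, purely for illustration, that the benefit of balancing saturates with frequency while the overhead grows linearly. The function names `net_gain` and `best_frequency` and the parameters `max_benefit` and `cost_per_round` are hypothetical.

```python
import math


def net_gain(freq: float, max_benefit: float, cost_per_round: float) -> float:
    """Toy net gain of balancing `freq` times per interval.

    Assumed shapes (not from the paper): the benefit saturates as
    max_benefit * freq / (1 + freq); the overhead is cost_per_round * freq.
    """
    benefit = max_benefit * freq / (1.0 + freq)
    overhead = cost_per_round * freq
    return benefit - overhead


def best_frequency(max_benefit: float, cost_per_round: float) -> float:
    """Frequency that maximizes net_gain for freq >= 0.

    Setting the derivative max_benefit / (1 + freq)**2 - cost_per_round
    to zero gives freq = sqrt(max_benefit / cost_per_round) - 1.
    The result is clamped at 0 when balancing never pays for itself.
    """
    return max(0.0, math.sqrt(max_benefit / cost_per_round) - 1.0)


if __name__ == "__main__":
    b, c = 100.0, 1.0  # hypothetical benefit ceiling and per-round cost
    f_opt = best_frequency(b, c)
    for f in (1.0, f_opt, 50.0):
        print(f"freq={f:5.1f}  net gain={net_gain(f, b, c):6.2f}")
```

Under these assumed shapes the net gain peaks at an intermediate frequency and falls off on either side, which is the qualitative behaviour the abstract describes.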
Full-text PDF download:
https://jeit.ac.cn/article/exportPdf?id=2f9e508c-b483-421c-9cd5-2dc982e2dbee