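Hai HUANG, Zhiwei LIU, Shilei ZHAO, Ning NA
School of Computer Science and Technology, Harbin University of Science and Technology, Harbin 150080, China
Funding: Natural Science Foundation of Heilongjiang Province (YQ2019F010), Heilongjiang Postdoctoral Scientific Research Start-up Fund (LBH-Q18065), Central Government Guidance Fund for Local Science and Technology Development (ZY20B11)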
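Details
Author biographies: Bin YU: male, born in 1984, lecturer. His research interests include cryptographic algorithms, cryptographic chip design, and digital integrated circuit design.
Hai HUANG: male, born in 1982, master's supervisor. His research interests include information security, reconfigurable technology, and integrated circuit design.
Zhiwei LIU: male, born in 1987, lecturer. His research interests include reconfigurable computing, high-speed cryptographic algorithms, parallel encryption, and secure design of cryptographic chips.
Shilei ZHAO: male, born in 1979, master's supervisor. His research interests include information security, high-speed cryptographic algorithms, and secure design of cryptographic chips.
Ning NA: male, born in 1995, Ph.D. candidate. His research interests include information security and integrated circuit design.
Corresponding author: Hai HUANG, ic@hrbust.edu.cn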
CLC number: TN918; TP309
Publication history
Received: 2020-10-12
Revised: 2021-01-29
Published online: 2021-03-01
Issue date: 2021-07-10
High-performance Hardware Architecture Design and Implementation of Ed25519 Algorithm
Bin YU, Hai HUANG, Zhiwei LIU, Shilei ZHAO, Ning NA
School of Computer Science and Technology, Harbin University of Science and Technology, Harbin 150080, China
Funds: The Natural Science Foundation of Heilongjiang Province (YQ2019F010), the Heilongjiang Postdoctoral Scientific Research Start-up Fund (LBH-Q18065), the Central Government Guidance Fund for Local Science and Technology Development (ZY20B11)
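Abstract: Signing and verification speeds often fall short of what certain application domains require. To address this, this paper designs a high-performance hardware architecture for the Ed25519 algorithm. Scalar multiplication uses a window method with a window width of 2 bits, which reduces the total number of cycles the scalar multiplication needs. The step sequence of point addition and point doubling is optimized to raise the hardware utilization of the multipliers. Modular multiplication is built on a fast modular reduction with low computational complexity, which increases the overall speed. So that the modulo-L operation can reuse the fast modular reduction from scalar multiplication, the paper proposes a modulo-L algorithm based on Barrett reduction. The modular exponentiation in the decompression process is optimized, which shortens the step sequence and lets it reuse the modular multiplier. The proposed architecture is implemented in hardware: in a TSMC 55 nm CMOS process it occupies 746×10^3 equivalent gates, reaches a maximum frequency of 360 MHz, and performs 9.06×10^4 public-key generations, 8.82×10^4 signatures, and 3.99×10^4 verifications per second.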
Keywords: Elliptic curve digital signature algorithm/
Edwards curve/
Hardware implementation/
Scalar multiplication/
Fast modular reduction
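The two reductions named in the abstract can be illustrated in software. The Python sketch below is not the paper's hardware algorithm: it assumes the standard Ed25519 field prime p = 2^255 − 19 and group order L, folds the high bits of a product using 2^255 ≡ 19 (mod p), and uses a simplified single-shift form of Barrett reduction for modulo L. The function names, the 512-bit input bound, and the constant `SHIFT` are illustrative choices, and the sketch does not show how the paper shares hardware between the two operations.

```python
# Standard Ed25519 constants (field prime and group order).
P = 2**255 - 19
L = 2**252 + 27742317777372353535851937790883648493

MASK255 = (1 << 255) - 1

def fast_reduce_p(x: int) -> int:
    """Reduce 0 <= x < 2**510 modulo P using 2**255 = 19 (mod P)."""
    # First fold: x = lo + 2**255 * hi  ->  lo + 19 * hi  (< 20 * 2**255).
    x = (x & MASK255) + 19 * (x >> 255)
    # Second fold: the high part is now at most 19, so x < 2**255 + 361.
    x = (x & MASK255) + 19 * (x >> 255)
    # One conditional subtraction brings the result into [0, P).
    if x >= P:
        x -= P
    return x

# Barrett precomputation (assumed parameters): MU = floor(2**SHIFT / L).
SHIFT = 512
MU = (1 << SHIFT) // L

def barrett_reduce_l(x: int) -> int:
    """Reduce 0 <= x < 2**512 modulo L with one multiply-and-shift estimate."""
    # q underestimates floor(x / L) by at most 1 for x < 2**SHIFT.
    q = (x * MU) >> SHIFT
    r = x - q * L
    # At most one correction step is needed.
    if r >= L:
        r -= L
    return r
```

For example, `fast_reduce_p(a * b)` reduces the product of two field elements `a, b < P`, and `barrett_reduce_l(h)` reduces a 512-bit hash value `h`, which is the size of input that the Ed25519 modulo-L step receives.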
Abstract: Existing signature and verification architectures are often too slow for certain application domains. To address this problem, a high-performance hardware architecture for the Ed25519 algorithm is developed. Scalar multiplication is implemented with a window method of 2-bit width, which reduces the total number of cycles. Reordering the operations of point addition and point doubling improves the utilization of the hardware multipliers. Modular multiplication is realized with a fast modular reduction of low computational complexity, which improves the overall speed. A modulo-L algorithm based on Barrett reduction is proposed so that the modulo-L operation can reuse the fast modular reduction from scalar multiplication. The modular exponentiation in the decompression process is optimized, which simplifies the steps and allows the modular multiplier to be reused. In a TSMC 55 nm CMOS process, the proposed architecture occupies 7.46×10^5 equivalent gates and runs at up to 360 MHz. It performs 9.06×10^4 key generations, 8.82×10^4 signatures, and 3.99×10^4 verifications per second.
Key words:Elliptic Curve Digital Signature Algorithm (ECDSA)/
Edwards-curve/
Hardware implementation/
Scalar multiplication/
Fast modular reduction
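As a software illustration of the 2-bit window scalar multiplication and of the modular exponentiation used in point decompression, the following Python sketch works on the standard edwards25519 curve in affine coordinates. It is a minimal sketch under stated assumptions, not the paper's architecture: the helper names are hypothetical, the x-coordinate recovery follows the square-root method of RFC 8032, and field inversion is done with Fermat exponentiation, which real hardware would avoid by using projective coordinates.

```python
# Standard edwards25519 parameters: -x^2 + y^2 = 1 + D*x^2*y^2 over GF(P).
P = 2**255 - 19
L = 2**252 + 27742317777372353535851937790883648493  # order of the base point

def inv(x: int) -> int:
    """Field inverse by Fermat's little theorem (a modular exponentiation)."""
    return pow(x % P, P - 2, P)

D = (-121665 * inv(121666)) % P
SQRT_M1 = pow(2, (P - 1) // 4, P)  # a square root of -1 modulo P
IDENTITY = (0, 1)

def point_add(p1, p2):
    """Unified affine addition; it also handles doubling and the identity."""
    x1, y1 = p1
    x2, y2 = p2
    t = D * x1 * x2 * y1 * y2 % P
    x3 = (x1 * y2 + y1 * x2) * inv(1 + t) % P
    y3 = (y1 * y2 + x1 * x2) * inv(1 - t) % P
    return (x3, y3)

def recover_x(y: int, sign: int) -> int:
    """Decompression step: recover x from y with one exponentiation."""
    x2 = (y * y - 1) * inv(D * y * y + 1) % P
    x = pow(x2, (P + 3) // 8, P)       # candidate square root
    if (x * x - x2) % P != 0:
        x = x * SQRT_M1 % P            # fix up with sqrt(-1)
    if (x * x - x2) % P != 0:
        raise ValueError("y is not the y-coordinate of a curve point")
    if (x & 1) != sign:
        x = (P - x) % P
    return x

BASE_Y = 4 * inv(5) % P
BASE = (recover_x(BASE_Y, 0), BASE_Y)

def scalar_mult_window2(k: int, pt):
    """Compute k*pt with a fixed 2-bit window, most significant window first."""
    double = point_add(pt, pt)
    table = [IDENTITY, pt, double, point_add(double, pt)]  # 0*pt .. 3*pt
    windows = (max(k.bit_length(), 1) + 1) // 2
    acc = IDENTITY
    for i in reversed(range(windows)):
        acc = point_add(acc, acc)          # two doublings per window
        acc = point_add(acc, acc)
        acc = point_add(acc, table[(k >> (2 * i)) & 3])  # one table addition
    return acc
```

With a 2-bit window, a 253-bit scalar needs 127 table additions instead of up to 253 in bit-by-bit double-and-add, while the number of doublings stays about the same. Calling `scalar_mult_window2(L, BASE)` returns `IDENTITY`, which is a quick sanity check on the constants.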
Full-text PDF download:
https://jeit.ac.cn/article/exportPdf?id=6a653656-f79a-41bd-af0c-521e8266e2bc