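Computer science
Field-programmable gate array
Parallel computing
Scalability
Modular exponentiation
Hash function
Algorithm
Encryption
Public-key cryptography
Computer hardware
Computer security
Database
Operating system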
Authors
Bin Li, Yan Yunfei, Yuanxin Wei, Heru Han
Identifier
DOI:10.1109/tvlsi.2023.3312423
Abstract
In lattice-based postquantum cryptography (PQC), polynomial multiplication is complex and time-consuming, which affects the overall computational efficiency. In addition, the parameters of different lattice-based algorithms require different number theoretic transform (NTT) structures, which limits the versatility of hardware design. To this end, this article proposes scalable and parallel optimization of the NTT based on a field-programmable gate array (FPGA). By analyzing the algorithm flow of the NTT, inverse NTT (INTT), and pointwise multiplication (PWM), an FPGA loosely coupled structure is designed, which can be used to place butterfly units of multiple pipelines in parallel and supports various modulo operations of a polynomial. In addition, to improve computing efficiency and scalability, key algorithm modules such as multipliers and modular reduction are deeply optimized. Moreover, the storage optimization of multiple RAM channels is carried out, and the alternate access control of data and the multiplexing of RAM resources reduce resource consumption and improve data access efficiency. For the SHA-3 algorithm, the scalable Keccak algorithm is implemented in a serial–parallel hybrid manner and supports multiple hash modes. Finally, taking the Dilithium algorithm as an example, through the parallelization of SHA-3 and NTT, the calculation cycle of key generation, signature, and verification is shortened. The experimental results and analysis show that the scheme in this article shortens the NTT calculation period while ensuring a high frequency, and the calculation time is significantly better than that of other schemes. Furthermore, it can support the optimized parallelization of multiple moduli and give full play to the computing advantages of an FPGA.
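The NTT, pointwise multiplication (PWM), and INTT flow named in the abstract can be illustrated with a minimal software sketch. This is not the authors' FPGA design: it is a plain Python model of negacyclic polynomial multiplication, assuming the Dilithium parameters q = 8380417 and n = 256. The function names (`find_primitive_root`, `ntt`, `negacyclic_mul`, `schoolbook_negacyclic`) are hypothetical, and the root of unity is searched for at run time rather than taken from any specification. The inner loop body corresponds to one butterfly operation, which a hardware design would replicate across parallel pipelines.

```python
# Illustrative sketch only; parameters assumed from Dilithium (q = 2^23 - 2^13 + 1, n = 256).
Q = 8380417
N = 256


def find_primitive_root(order, q):
    """Return an element of exact multiplicative order `order` mod q.

    `order` must be a power of two dividing q - 1.
    """
    assert (q - 1) % order == 0
    for g in range(2, q):
        c = pow(g, (q - 1) // order, q)
        # c^order == 1 always; the order is exact iff c^(order/2) == -1.
        if pow(c, order // 2, q) == q - 1:
            return c
    raise ValueError("no root of the requested order found")


def ntt(a, omega, q):
    """Iterative radix-2 Cooley-Tukey transform of `a` using root `omega`."""
    n = len(a)
    a = list(a)
    # Bit-reversal permutation so the butterflies can run in place.
    j = 0
    for i in range(1, n):
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j |= bit
        if i < j:
            a[i], a[j] = a[j], a[i]
    length = 2
    while length <= n:
        half = length // 2
        w_len = pow(omega, n // length, q)
        for start in range(0, n, length):
            w = 1
            for k in range(half):
                # One butterfly: a modular multiply, an add, and a subtract.
                u = a[start + k]
                v = a[start + k + half] * w % q
                a[start + k] = (u + v) % q
                a[start + k + half] = (u - v) % q
                w = w * w_len % q
        length <<= 1
    return a


def negacyclic_mul(a, b, q=Q):
    """Multiply a and b in Z_q[x]/(x^n + 1) via NTT, PWM, and INTT."""
    n = len(a)
    psi = find_primitive_root(2 * n, q)   # psi^n == -1 (mod q)
    omega = psi * psi % q                 # primitive n-th root of unity
    psi_inv = pow(psi, -1, q)             # modular inverse (Python 3.8+)
    n_inv = pow(n, -1, q)
    twist = [pow(psi, i, q) for i in range(n)]
    fa = ntt([x * t % q for x, t in zip(a, twist)], omega, q)
    fb = ntt([x * t % q for x, t in zip(b, twist)], omega, q)
    fc = [x * y % q for x, y in zip(fa, fb)]          # pointwise multiplication
    c = ntt(fc, pow(omega, -1, q), q)                 # inverse transform, unscaled
    return [x * n_inv % q * pow(psi_inv, i, q) % q for i, x in enumerate(c)]


def schoolbook_negacyclic(a, b, q=Q):
    """O(n^2) reference multiplication in Z_q[x]/(x^n + 1)."""
    n = len(a)
    c = [0] * n
    for i in range(n):
        for j in range(n):
            k = i + j
            if k < n:
                c[k] = (c[k] + a[i] * b[j]) % q
            else:
                c[k - n] = (c[k - n] - a[i] * b[j]) % q   # x^n wraps to -1
    return c
```

Comparing `negacyclic_mul(a, b)` with `schoolbook_negacyclic(a, b)` on random inputs is a simple way to check the transform. Supporting a different lattice scheme in this model only means changing `Q` and `N`, which loosely mirrors the multi-modulus scalability the abstract describes.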