Design and Evaluation of a Heterogeneous DPU Architecture for Accelerating Post-Quantum Cryptography

Abstract

Post-quantum cryptographic (PQC) algorithms have been introduced to remain secure against the threat of quantum computers. These algorithms typically require complex computations on very large numbers and therefore pose substantial computational and integration challenges for hardware and system designers. In this work, we design a heterogeneous architecture that integrates a central processing unit (CPU) and an embedded graphics processing unit (GPU) within a data-processing unit (DPU), enabling PQC acceleration without host involvement. We then evaluate this architecture on two National Institute of Standards and Technology (NIST) PQC standards: ML-KEM (Kyber), a key-encapsulation mechanism, and ML-DSA (Dilithium), a digital signature scheme. Using the DPU's onboard ARM cores and its A30 GPU, we benchmarked the task of generating and verifying 1000 digital signatures at once, comparing execution on the device's CPU alone against a hybrid CPU-GPU configuration. Our results show that for batch sizes of 10 and above, the heterogeneous configuration significantly outperforms the homogeneous CPU-only one, achieving a speedup of up to 84x. These results highlight the potential of DPUs to bridge cryptographic algorithm design and system engineering, enabling scalable, high-throughput PQC deployment in future secure data-center networks.
