FPGA-Accelerated Real-Time DCGANs via Xilinx DPUs and Vitis AI

Abstract

Generative Adversarial Networks (GANs) produce high-quality images but are computationally intensive, largely due to transposed convolution operations, which limits their real-time performance on conventional hardware. To address this, we propose an optimized FPGA-based acceleration framework that leverages Xilinx Deep Learning Processing Units (DPUs) and the Vitis AI toolchain to enable real-time inference of Deep Convolutional GANs (DCGANs) for image reconstruction. The proposed approach applies a two-stage quantization method that profiles layer-wise dynamic ranges and fine-tunes scale factors via host-side retraining. This enables quantization of both the generator and the discriminator from 32-bit floating-point to INT8 precision with minimal accuracy degradation. Additionally, structured pruning through the Vitis AI Optimizer removes redundant weights and filters, producing a compact model that fits entirely in on-chip memory and maximizes DPU efficiency. The architecture uses a multi-threaded ARM processor to manage preprocessing and DMA operations, while a lightweight scheduler in programmable logic sequences the execution of convolution kernels across multiple DPU cores. Double buffering overlaps data movement with computation. Experimental results on a Zynq UltraScale+ MPSoC ZCU104 show throughput above 105 FPS, achieving up to 3.5× better performance and 7.3× higher energy efficiency than GPU/CPU baselines, with Fréchet Inception Distance (FID) scores within 5% of the floating-point models.
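The two-stage quantization the abstract describes can be illustrated with a minimal NumPy sketch: stage one profiles a layer's dynamic range over a calibration set to derive a symmetric INT8 scale, and stage two refines that scale. This is only a conceptual stand-in, not the actual Vitis AI quantizer API; the refinement here is a simple MSE line search rather than the paper's host-side retraining, and all function names are illustrative.

```python
import numpy as np

def calibrate_scale(activations):
    """Stage 1: profile the dynamic range of a layer's activations
    over a calibration set and derive a symmetric INT8 scale."""
    max_abs = max(np.abs(a).max() for a in activations)
    return max_abs / 127.0  # map [-max_abs, max_abs] onto [-127, 127]

def quantize(x, scale):
    """Fake-quantize: round onto the INT8 grid, then dequantize to float."""
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, q.astype(np.float32) * scale

def refine_scale(x, scale, steps=20):
    """Stage 2 (simplified): fine-tune the scale by a line search that
    minimizes quantization MSE -- a stand-in for host-side retraining."""
    best_s, best_err = scale, np.inf
    for s in np.linspace(0.8 * scale, 1.2 * scale, steps):
        _, deq = quantize(x, s)
        err = np.mean((x - deq) ** 2)
        if err < best_err:
            best_s, best_err = s, err
    return best_s

# Synthetic calibration data standing in for one layer's activations.
rng = np.random.default_rng(0)
acts = [rng.normal(0, 1, size=(64, 64)).astype(np.float32) for _ in range(8)]
s0 = calibrate_scale(acts)
s1 = refine_scale(acts[0], s0)
_, deq = quantize(acts[0], s1)
print(f"scale {s0:.4f} -> {s1:.4f}, MSE {np.mean((acts[0] - deq) ** 2):.6f}")
```

With roughly unit-variance activations, the INT8 step size is small and the reconstruction error stays orders of magnitude below the signal variance, which is why the FID degradation reported in the abstract can remain small.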

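The double-buffering scheme described in the abstract, where the ARM processor stages the next frame while the DPU works on the current one, can be sketched with plain Python threads and a two-slot queue. The queue's two slots model the two DMA buffers; the consumer stands in for the DPU, and all names here are illustrative rather than part of the actual VART/Vitis AI runtime.

```python
import threading
import queue

def producer(frames, buf_q):
    """ARM-side thread: stages preprocessed frames into DMA buffers
    while the accelerator consumes the previously staged frame."""
    for f in frames:
        buf_q.put(f)       # blocks only when both buffers are in flight
    buf_q.put(None)        # sentinel: no more frames

def consumer(buf_q, results):
    """Models the DPU: processes whichever buffer is ready."""
    while True:
        f = buf_q.get()
        if f is None:
            break
        results.append(f * 2)  # stand-in for generator inference

frames = list(range(8))
buf_q = queue.Queue(maxsize=2)  # two slots -> double buffering
results = []
t_prod = threading.Thread(target=producer, args=(frames, buf_q))
t_cons = threading.Thread(target=consumer, args=(buf_q, results))
t_prod.start(); t_cons.start()
t_prod.join(); t_cons.join()
print(results)
```

Because the queue holds at most two frames, data movement for frame *n+1* overlaps with computation on frame *n*, which is the same overlap that lets the ZCU104 design keep the DPU cores busy.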