Evaluating and Accelerating Vision Transformers on GPU-based Embedded Edge AI Systems

Read the full article See related articles

Listed in

This article is not in any list yet, why not save it to one of your lists.
Log in to save this article

Abstract

Many current embedded systems comprise heterogeneous computing components including quite powerful GPUs, which enables their application across diverse sectors. This study demonstrates the efficient execution of a medium-sized Self-Supervised Audio Spectrogram Transformer (SSAST) model on a low-power System-on-Chip (SoC). Through comprehensive evaluation, including real-time inference scenarios, we show that GPUs outperform multi-core CPUs in inference processes. Optimization techniques such as adjusting batch size, model compilation with TensorRT, and reducing data precision significantly enhance inference time, energy consumption, and memory usage. In particular, negligible accuracy degradation is observed, with post-training quantization to 8-bit integers showing less than 1% loss. This research underscores the feasibility of deploying transformer neural networks on low-power embedded devices, ensuring efficiency in time, energy, and memory while maintaining the accuracy of the results.

Article activity feed