Revisiting Convolutional Design for Efficient CNNs: An Empirical Study on Embedded AI Platforms
Abstract
While Vision Transformers (ViTs) have recently demonstrated impressive performance in computer vision tasks, their high computational demands and memory usage limit their applicability in real-time and edge AI scenarios. In contrast, Convolutional Neural Networks (CNNs) remain the preferred choice for such environments due to their lower latency, inductive bias, and efficiency. This study examines the impact of five widely used convolutional operations (spatial, grouped, shuffle, depth-wise and point-wise, and shift) when integrated into the ResNet-50 architecture. All variants are trained on the CIFAR-10 and CIFAR-100 datasets under standardized GPU-based settings and evaluated across three edge AI platforms: Raspberry Pi 5, Coral Dev Board, and Jetson Nano. The analysis includes parameter count, FLOPs, accuracy, and a detailed runtime decomposition on CPU, GPU, and edge hardware. Results show that while depth-wise convolutions offer theoretical efficiency, they suffer from poor memory access patterns on memory-bound platforms. In contrast, shuffle and shift convolutions yield better trade-offs between accuracy, computational load, and inference speed. These findings provide actionable insights for designing hardware-aware, deployment-optimized CNN architectures suitable for resource-constrained applications.
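To make the efficiency comparison concrete, the following sketch works out the weight counts for the convolution variants named above on a hypothetical 256-channel, 3x3 layer. The channel sizes and the group count of 8 are illustrative assumptions, not the paper's exact ResNet-50 configuration.

```python
def conv_params(c_in, c_out, k, groups=1):
    """Weight count of a 2D convolution with square kernel k and given groups."""
    return (c_in // groups) * c_out * k * k

c_in, c_out, k = 256, 256, 3  # assumed layer sizes for illustration

standard = conv_params(c_in, c_out, k)              # spatial (dense) convolution
grouped = conv_params(c_in, c_out, k, groups=8)     # grouped convolution
# depth-wise (one k x k filter per channel) followed by point-wise (1x1) mixing
depthwise_sep = (conv_params(c_in, c_in, k, groups=c_in)
                 + conv_params(c_in, c_out, 1))
# shift op contributes zero weights; only the 1x1 mixing convolution has parameters
shift = conv_params(c_in, c_out, 1)

print(standard, grouped, depthwise_sep, shift)
# → 589824 73728 67840 65536
```

The arithmetic mirrors the abstract's point: depth-wise separable and shift variants cut parameters by roughly 9x relative to the dense spatial convolution, yet (as the runtime decomposition in the study shows) fewer weights and FLOPs do not automatically translate into faster inference on memory-bound edge hardware.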