CNN-Free Lightweight Vision Model Using Weight-Shared MLP on Micro-Patches

Abstract

Convolutional neural networks (CNNs) achieve strong performance in vision, but their convolutional operators and feature hierarchies can impose non-trivial compute and parameter overhead in lightweight settings. Meanwhile, fully connected (MLP-based) vision models often sacrifice key CNN inductive biases such as locality and weight sharing. In this paper, we present MP-MLP (Micro-Patch Multi-Layer Perceptron), a convolution-free lightweight architecture that recovers CNN-like behavior using only fully connected layers. MP-MLP partitions an input image into non-overlapping micro-patches and applies a single weight-shared MLP block to every patch, acting as a pseudo-convolutional filter without any convolution operations. Patch-wise features are concatenated and fed into a shallow classifier MLP for end-to-end recognition. We evaluate MP-MLP on MNIST, Fashion-MNIST, and SVHN, covering increasing task complexity from clean grayscale digits to RGB street-view digits. With substantially fewer parameters, MP-MLP achieves competitive accuracy on MNIST and Fashion-MNIST, and slightly outperforms a lightweight CNN baseline on SVHN, demonstrating that carefully designed weight-shared MLPs can be a compelling convolution-free alternative for structured lightweight vision tasks.
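The forward pass described in the abstract (partition into non-overlapping micro-patches, apply one shared MLP to every patch, concatenate, then classify) can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation; the patch size, hidden widths, and two-layer patch MLP are illustrative assumptions.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def mp_mlp_forward(img, W1, b1, W2, b2, Wc, bc, patch=7):
    """Sketch of an MP-MLP forward pass on a single grayscale image.

    (W1, b1, W2, b2) form the single weight-shared MLP block that is
    applied to every micro-patch; (Wc, bc) is the shallow classifier head.
    All sizes here are illustrative, not taken from the paper.
    """
    H, W = img.shape
    feats = []
    # Partition into non-overlapping micro-patches; the SAME weights act on
    # each patch, mimicking a convolutional filter without convolution ops.
    for i in range(0, H, patch):
        for j in range(0, W, patch):
            p = img[i:i + patch, j:j + patch].reshape(-1)  # flatten patch
            h = relu(p @ W1 + b1)                          # shared hidden layer
            feats.append(relu(h @ W2 + b2))                # shared patch feature
    # Concatenate patch-wise features and classify with a shallow MLP head.
    z = np.concatenate(feats)
    return z @ Wc + bc                                     # class logits

# Illustrative setup: 28x28 input (MNIST-sized), 7x7 patches -> 16 patches.
rng = np.random.default_rng(0)
d_in, d_h, d_f, n_cls = 7 * 7, 32, 16, 10
W1, b1 = 0.1 * rng.normal(size=(d_in, d_h)), np.zeros(d_h)
W2, b2 = 0.1 * rng.normal(size=(d_h, d_f)), np.zeros(d_f)
Wc, bc = 0.1 * rng.normal(size=(16 * d_f, n_cls)), np.zeros(n_cls)

logits = mp_mlp_forward(rng.normal(size=(28, 28)), W1, b1, W2, b2, Wc, bc)
print(logits.shape)  # (10,)
```

Because a single (W1, W2) pair serves all 16 patches, the patch encoder's parameter count is independent of image size, which is the source of the parameter savings the abstract claims relative to a per-position fully connected model.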
