Enhancing MP-MLP with Patch Mixing

Taehyeon Kim

Abstract

Lightweight vision architectures are important for image classification in resource-constrained environments. Among CNN-free approaches, multi-layer perceptron (MLP)-based models provide a simple and computationally efficient alternative. MP-MLP is a lightweight vision model that divides an image into non-overlapping micro-patches and applies a shared MLP to each patch independently. While this design is simple and efficient, it does not explicitly model interactions across patches before classification. To address this limitation, we introduce a simple patch mixing module inspired by the token-mixing idea of MLP-Mixer. The proposed module is applied after local patch encoding and performs mixing along the patch dimension through a lightweight MLP block. Experimental results on MNIST and SVHN show that the proposed method consistently improves performance over the baseline MP-MLP, with especially large gains on the more complex SVHN dataset. These results suggest that patch mixing is an effective way to enhance lightweight MLP-based vision models.
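
The pipeline described in the abstract (micro-patch extraction, a shared per-patch MLP, then mixing along the patch dimension) can be illustrated with a minimal NumPy sketch. This is not the author's implementation: the function names, layer widths, GELU activation, residual connection, mean pooling, and linear classification head are all assumptions chosen for illustration, and only the forward pass is shown.

```python
import numpy as np

def extract_patches(images, patch):
    """Split (B, H, W, C) images into non-overlapping patches: (B, N, patch*patch*C)."""
    b, h, w, c = images.shape
    assert h % patch == 0 and w % patch == 0, "image size must be divisible by patch size"
    gh, gw = h // patch, w // patch
    x = images.reshape(b, gh, patch, gw, patch, c).transpose(0, 1, 3, 2, 4, 5)
    return x.reshape(b, gh * gw, patch * patch * c)

def gelu(x):
    # tanh approximation of GELU (the activation choice is an assumption)
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x ** 3)))

def dense(rng, n_in, n_out):
    # randomly initialised weight matrix and zero bias
    return rng.normal(0.0, n_in ** -0.5, size=(n_in, n_out)), np.zeros(n_out)

def init_params(rng, in_dim, n_patches, embed_dim=32, mix_hidden=16, n_classes=10):
    # all widths here are hypothetical, not values from the preprint
    return {
        "enc": dense(rng, in_dim, embed_dim),
        "mix1": dense(rng, n_patches, mix_hidden),
        "mix2": dense(rng, mix_hidden, n_patches),
        "head": dense(rng, embed_dim, n_classes),
    }

def encode(params, patches):
    """Shared MLP layer applied to every patch independently: (B, N, P) -> (B, N, D)."""
    w, b = params["enc"]
    return gelu(patches @ w + b)

def mix_patches(params, tokens):
    """Two-layer MLP acting along the patch axis, in the spirit of MLP-Mixer token mixing."""
    (w1, b1), (w2, b2) = params["mix1"], params["mix2"]
    t = tokens.transpose(0, 2, 1)          # (B, D, N): patches become the last axis
    t = gelu(t @ w1 + b1) @ w2 + b2        # mix information across the N patches
    return tokens + t.transpose(0, 2, 1)   # residual connection (an assumption)

def forward(params, images, patch):
    tokens = mix_patches(params, encode(params, extract_patches(images, patch)))
    w, b = params["head"]
    return tokens.mean(axis=1) @ w + b     # mean pooling + linear head (assumed)

rng = np.random.default_rng(0)
images = rng.normal(size=(2, 28, 28, 1))   # MNIST-sized dummy batch
params = init_params(rng, in_dim=4 * 4 * 1, n_patches=49)
logits = forward(params, images, patch=4)  # shape (2, 10)
```

Because `encode` treats each patch separately, changing one patch leaves the other patch encodings untouched. After `mix_patches`, every patch representation depends on all the others, which is the cross-patch interaction the abstract says the baseline lacks.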

Version published to 10.31224/6815
Apr 13, 2026

