Enhancing MP-MLP with Patch Mixing

Abstract

Lightweight vision architectures are important for image classification in resource-constrained environments. Among CNN-free approaches, multi-layer perceptron (MLP)-based models provide a simple and computationally efficient alternative. MP-MLP is a lightweight vision model that divides an image into non-overlapping micro-patches and applies a shared MLP to each patch independently. While this design is simple and efficient, it does not explicitly model interactions across patches before classification. To address this limitation, we introduce a simple patch mixing module inspired by the token-mixing idea of MLP-Mixer. The proposed module is applied after local patch encoding and performs mixing along the patch dimension through a lightweight MLP block. Experimental results on MNIST and SVHN show that the proposed method consistently improves performance over the baseline MP-MLP, with especially large gains on the more complex SVHN dataset. These results suggest that patch mixing is an effective way to enhance lightweight MLP-based vision models.
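The pipeline described in the abstract (micro-patch extraction, a shared per-patch MLP, then mixing along the patch dimension) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the patch size, hidden widths, ReLU activation, residual connection, mean pooling, and linear classifier are all assumptions chosen for illustration, and the function names (`extract_patches`, `init_mlp`, `mlp`, `patch_mixing`, `forward`) are hypothetical.

```python
import numpy as np

def extract_patches(images, patch):
    """Split (B, H, W) images into non-overlapping patches: (B, N, patch*patch)."""
    b, h, w = images.shape
    assert h % patch == 0 and w % patch == 0
    x = images.reshape(b, h // patch, patch, w // patch, patch)
    x = x.transpose(0, 1, 3, 2, 4)  # (B, rows, cols, patch, patch)
    return x.reshape(b, (h // patch) * (w // patch), patch * patch)

def init_mlp(rng, d_in, d_hidden, d_out):
    """Random weights for a two-layer MLP (illustrative initialization)."""
    w1 = rng.normal(0.0, 1.0 / np.sqrt(d_in), (d_in, d_hidden))
    w2 = rng.normal(0.0, 1.0 / np.sqrt(d_hidden), (d_hidden, d_out))
    return w1, np.zeros(d_hidden), w2, np.zeros(d_out)

def mlp(x, w1, b1, w2, b2):
    """Two-layer MLP applied to the last axis (ReLU is an assumption)."""
    return np.maximum(x @ w1 + b1, 0.0) @ w2 + b2

def patch_mixing(z, mix_params):
    """Mix information across patches. z has shape (B, N, D)."""
    zt = z.transpose(0, 2, 1)             # (B, D, N): patches on the last axis
    mixed = mlp(zt, *mix_params)          # MLP over the patch dimension
    return z + mixed.transpose(0, 2, 1)   # residual connection (assumption)

def forward(images, patch, enc_params, mix_params, w_cls):
    # The same encoder weights are applied to every patch independently.
    z = mlp(extract_patches(images, patch), *enc_params)
    z = patch_mixing(z, mix_params)
    # Mean pooling plus a linear classifier (assumption).
    return z.mean(axis=1) @ w_cls

rng = np.random.default_rng(0)
images = rng.random((2, 28, 28))          # random stand-in for MNIST-sized input
patch, dim = 4, 8                         # hypothetical patch size and width
n_patches = (28 // patch) ** 2
enc = init_mlp(rng, patch * patch, 16, dim)
mix = init_mlp(rng, n_patches, 16, n_patches)
w_cls = rng.normal(0.0, 0.1, (dim, 10))
logits = forward(images, patch, enc, mix, w_cls)
print(logits.shape)  # (2, 10)
```

Without `patch_mixing`, each patch's features depend only on that patch's pixels. The transpose inside `patch_mixing` is what lets a single MLP combine values from all patch positions, which is the cross-patch interaction the abstract says the baseline lacks.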
