MNI-GAIR: Multi-scale Normal Image and Grid Attention-based Image Recognition

Abstract

Achieving high-precision facial recognition in complex scenarios remains a key challenge in computer vision. This paper proposes MNI-GAIR, a multi-modal collaborative recognition framework that combines multi-scale normal image generation, a dynamic grid attention mechanism, and point cloud generalization to address two weaknesses of existing methods: inefficient cross-modal feature alignment and insufficient real-time performance in monocular settings. Together, these components significantly enhance recognition performance in complex scenarios such as occlusion and extreme poses. First, a multi-scale normal map generation module based on differentiable rendering is designed; it combines GLCM-LBP features with a Cascaded Atrous Pyramid (CAP) and improves noise robustness by 23.6% [bib1]. Second, a dynamic grid partitioning attention network (DGPA-Net) is proposed; it optimizes the grid structure through a gradient-driven approach and incorporates a dual-path attention mechanism, improving recognition accuracy in extreme side-face (\(>75°\)) scenarios by 14.7% [bib2]. Finally, a point cloud generalization framework based on Lie group theory is introduced; it enables cross-modal feature fusion and reduces the cross-pose equal error rate (EER) to 1.23% [bib3]. Experimental results on multiple standard datasets, including FaceScape and LFW, show that MNI-GAIR outperforms existing methods in accuracy, robustness, and computational efficiency, providing a systematic solution for 3D facial analysis. The source code is available on GitHub: https://github.com/LLxuLL/MNI-GAIR-Multi-scale-Normal-Image-and-Grid-Attention-based-Image-Recognition
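
For readers unfamiliar with the EER figure quoted in the abstract, the following is a minimal sketch of how an equal error rate is commonly computed from verification scores. It is not taken from the authors' evaluation code: the function name `equal_error_rate`, the convention that a higher score means "same identity", and the sample scores are all assumptions made for illustration.

```python
def equal_error_rate(genuine, impostor):
    """Approximate the EER from similarity scores (hypothetical helper).

    genuine:  scores of pairs that show the same identity
    impostor: scores of pairs that show different identities
    Assumes a higher score means the pair is more likely the same identity.
    Returns (eer, threshold).
    """
    # Candidate thresholds: every distinct observed score.
    thresholds = sorted(set(genuine) | set(impostor))
    best = None
    for t in thresholds:
        # False rejection rate: genuine pairs scored below the threshold.
        frr = sum(s < t for s in genuine) / len(genuine)
        # False acceptance rate: impostor pairs scored at or above it.
        far = sum(s >= t for s in impostor) / len(impostor)
        gap = abs(far - frr)
        # Keep the threshold where the two error rates are closest.
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2.0, t)
    return best[1], best[2]


# Illustrative scores only; they are not results from the paper.
eer, threshold = equal_error_rate([0.9, 0.8, 0.7, 0.6], [0.1, 0.2, 0.3, 0.65])
print(f"EER = {eer:.2%} at threshold {threshold}")
```

On a finite score set the two error curves may not cross exactly, so this sketch reports the mean of the false acceptance and false rejection rates at the closest threshold; interpolating between neighbouring thresholds is a common alternative.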
