Machine Learning-Powered Hand Recognition: Techniques, Evaluation, and Open Problems

Abstract

Hand recognition and analysis has emerged as a cornerstone of human-computer interaction, enabling a wide array of applications including gesture-based interfaces, sign language interpretation, virtual and augmented reality, and biometric authentication. Recent advances in machine learning, particularly deep learning, have significantly elevated the capabilities of hand recognition systems by allowing them to learn complex visual and kinematic patterns from large-scale datasets. This survey provides a comprehensive examination of the field, beginning with foundational representations and computational models used for detecting and understanding hand configurations in 2D and 3D. We systematically explore the use of convolutional neural networks (CNNs), graph neural networks (GNNs), recurrent models, and transformer-based architectures, highlighting their effectiveness in various subdomains such as static gesture classification, dynamic gesture recognition, pose estimation, and segmentation.

We also present an in-depth discussion of evaluation metrics and experimental protocols commonly used to benchmark performance across different tasks. These include accuracy, precision, recall, mean squared error, percentage of correct keypoints (PCK), intersection-over-union (IoU), and temporal metrics such as edit distance and sequence accuracy. A summary of widely adopted datasets is included, along with a comparative analysis of state-of-the-art results.

Despite these advancements, the field continues to face significant challenges related to occlusions, intra- and inter-class variation, domain adaptation, data scarcity, and computational efficiency. We analyze these limitations in detail and review emerging research directions such as self-supervised learning, multimodal fusion, efficient model design for edge deployment, and generative approaches for data synthesis. We further examine the ethical considerations and fairness implications associated with deploying hand recognition technologies in real-world environments.

This survey concludes with a synthesis of the current state of the field and a forward-looking perspective on its trajectory. We argue that future progress will require interdisciplinary solutions that combine algorithmic innovation with robust evaluation, ethical deployment, and user-centric design. The insights presented herein aim to inform and inspire future research at the intersection of computer vision, machine learning, and human-centered computing.
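To give the architectural discussion a concrete anchor, the sketch below shows one plausible, hypothetical shape of a small CNN for static gesture classification over cropped hand images. The layer sizes, the 64x64 RGB input resolution, and the number of gesture classes are illustrative assumptions, not a model evaluated in the survey.

    import torch
    import torch.nn as nn

    class StaticGestureCNN(nn.Module):
        """Small CNN mapping a cropped hand image to gesture logits (illustrative only)."""

        def __init__(self, num_classes=10):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
                nn.MaxPool2d(2),                 # 64x64 -> 32x32
                nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
                nn.MaxPool2d(2),                 # 32x32 -> 16x16
                nn.Conv2d(64, 128, kernel_size=3, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1),         # global average pooling
            )
            self.classifier = nn.Linear(128, num_classes)

        def forward(self, x):
            # x: (batch, 3, 64, 64) cropped hand images
            h = self.features(x).flatten(1)
            return self.classifier(h)

    # Example: classify a batch of 8 hand crops into 10 assumed gesture classes.
    model = StaticGestureCNN(num_classes=10)
    logits = model(torch.randn(8, 3, 64, 64))
    predictions = logits.argmax(dim=1)

In practice, surveyed systems differ mainly in how the feature extractor is built (deeper CNN backbones, graph layers over hand joints, or transformer encoders over image patches or joint sequences), while the classification head remains a simple linear layer over pooled features.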
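To make the keypoint and segmentation metrics mentioned above concrete, the following is a minimal sketch of how PCK and IoU are commonly computed for hand keypoints and hand masks. The array shapes, the 0.2 threshold, and the use of the ground-truth bounding-box diagonal as the PCK normalisation length are assumptions for illustration; benchmark-specific conventions vary.

    import numpy as np

    def pck(pred_kpts, gt_kpts, threshold=0.2, ref_length=None):
        """Percentage of Correct Keypoints for a batch of hands.

        pred_kpts, gt_kpts: arrays of shape (N, K, 2) holding 2D keypoint
        coordinates for N hands with K joints each.
        threshold: fraction of the reference length within which a
        predicted keypoint counts as correct (assumed convention).
        ref_length: per-hand normalisation length of shape (N,); if None,
        the diagonal of each ground-truth keypoint bounding box is used.
        """
        pred_kpts = np.asarray(pred_kpts, dtype=float)
        gt_kpts = np.asarray(gt_kpts, dtype=float)
        if ref_length is None:
            mins = gt_kpts.min(axis=1)                         # (N, 2)
            maxs = gt_kpts.max(axis=1)                         # (N, 2)
            ref_length = np.linalg.norm(maxs - mins, axis=1)   # (N,)
        dists = np.linalg.norm(pred_kpts - gt_kpts, axis=2)    # (N, K)
        correct = dists <= threshold * ref_length[:, None]
        return correct.mean()

    def iou(pred_mask, gt_mask):
        """Intersection-over-union for binary hand segmentation masks."""
        pred_mask = np.asarray(pred_mask, dtype=bool)
        gt_mask = np.asarray(gt_mask, dtype=bool)
        inter = np.logical_and(pred_mask, gt_mask).sum()
        union = np.logical_or(pred_mask, gt_mask).sum()
        return inter / union if union > 0 else 1.0

Accuracy, precision, recall, and mean squared error follow their standard definitions, while the temporal metrics (edit distance, sequence accuracy) compare predicted and ground-truth gesture label sequences rather than per-frame outputs.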
