HGREncoder: Enhancing Real-Time Hand Gesture Recognition with Transformer Encoder - A Comparative Study
Abstract
In the field of Hand Gesture Recognition (HGR), Electromyography (EMG) is used to detect the electrical signals that muscles produce when a movement is generated. Several HGR models currently use EMG to predict hand gestures. However, most of these models have limited performance in real-time applications, with the highest reported recognition rate being 65.78% ± 15.15% without post-processing steps. Other, non-generalizable models, i.e., those trained with a small number of users, achieved a window-based classification accuracy of 93.84%, but not in real-time applications. This study addresses these issues by employing transformers to create a generalizable model and to enhance recognition accuracy in real-time applications. The architecture of our model is composed of a Convolutional Neural Network (CNN), a positional encoding layer, and a transformer encoder. To obtain a generalizable model, the EMG-EPN-612 dataset was used; this dataset contains records from 612 individuals. Several experiments were conducted with different architectures, and our best results were compared with previous research that used CNNs, LSTMs, and transformers. This research achieved a classification accuracy of 95.25% ± 4.9% and a recognition accuracy of 89.7% ± 8.77%. The recognition accuracy is a significant contribution because it is computed over the entire sequence without post-processing steps.
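As a rough illustration of the pipeline the abstract describes (a CNN front end, a positional encoding layer, and a transformer encoder over EMG windows), the following PyTorch sketch wires these stages together. The channel count, window length, number of gesture classes, and all layer sizes are illustrative assumptions, not values taken from the paper.

```python
# Minimal sketch of the described pipeline: CNN -> positional encoding -> transformer
# encoder -> per-window gesture logits. All hyperparameters below are assumptions.
import math
import torch
import torch.nn as nn


class PositionalEncoding(nn.Module):
    """Standard sinusoidal positional encoding added to the CNN feature sequence."""
    def __init__(self, d_model: int, max_len: int = 512):
        super().__init__()
        position = torch.arange(max_len).unsqueeze(1)
        div_term = torch.exp(torch.arange(0, d_model, 2) * (-math.log(10000.0) / d_model))
        pe = torch.zeros(max_len, d_model)
        pe[:, 0::2] = torch.sin(position * div_term)
        pe[:, 1::2] = torch.cos(position * div_term)
        self.register_buffer("pe", pe)

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (batch, seq, d_model)
        return x + self.pe[: x.size(1)]


class HGREncoderSketch(nn.Module):
    """Hypothetical CNN + transformer-encoder classifier for multichannel EMG windows."""
    def __init__(self, emg_channels: int = 8, d_model: int = 64,
                 n_heads: int = 4, n_layers: int = 2, n_classes: int = 6):
        super().__init__()
        # 1-D CNN front end over the raw EMG window (batch, channels, time)
        self.cnn = nn.Sequential(
            nn.Conv1d(emg_channels, d_model, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.Conv1d(d_model, d_model, kernel_size=5, stride=2, padding=2),
            nn.ReLU(),
        )
        self.pos_enc = PositionalEncoding(d_model)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=n_layers)
        self.classifier = nn.Linear(d_model, n_classes)

    def forward(self, emg: torch.Tensor) -> torch.Tensor:  # emg: (batch, channels, time)
        feats = self.cnn(emg).transpose(1, 2)        # (batch, seq, d_model)
        feats = self.pos_enc(feats)
        encoded = self.encoder(feats)
        return self.classifier(encoded.mean(dim=1))  # gesture logits per window


# Example: classify one 200-sample, 8-channel EMG window (shapes are assumptions).
logits = HGREncoderSketch()(torch.randn(1, 8, 200))
print(logits.shape)  # torch.Size([1, 6])
```

In a real-time setting, such a model would be applied to a sliding window over the incoming EMG stream, producing a gesture prediction per window; the specific windowing, training, and evaluation choices used in the study are detailed in the body of the paper.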