Hierarchical Fusion with Decision Enhancement for Human Activity Recognition Using Deep Learning Framework
Abstract
Human Activity Recognition (HAR) is emerging as a critical enabler of context-aware applications in healthcare, fitness, and smart environments. In this research, we present the Hierarchical Fusion with Decision Enhancement for Human Activity Recognition (Hi-FuseDE-HAR) framework, which comprises four sequential hierarchical levels that transform raw sensor signals into a reliable HAR decision. At the input level, heterogeneous data streams are obtained from multiple wearable and ambient sensors. Level 0 builds a discriminative latent embedding for each sensor modality separately, using CNN-based or Transformer-based encoders to map each sensor stream from raw signals to a latent embedding. Level 1 fuses sensors within groups and estimates the relative contribution of each modality to the overall feature importance. Level 2 applies a Graph Cross-Modal Transformer that learns relationships between sensor groups, producing a globally consistent fused representation. Level 3 provides decision enhancement through uncertainty calibration and utility-aware optimization to ensure reliable final estimates. Experimental results indicate that the proposed framework achieves 97.6% accuracy and 96.7% F1-score on the PAMAP2 dataset, 95.5% accuracy and 93.2% F1-score on the OPPORTUNITY dataset, and 96.5% accuracy and 95.2% F1-score on the MHEALTH dataset. Notably, Hi-FuseDE-HAR retains strong performance across all three benchmarks, confirming its capability to generalize across varied sensor contexts and complex activity patterns.
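Since the abstract gives no implementation details, the following is a minimal PyTorch sketch of how Levels 0 through 2 could be instantiated. The class names (ModalityEncoder, GroupFusion, CrossGroupTransformer), the layer sizes, and the use of a plain self-attention encoder over sensor-group tokens as a stand-in for the paper's Graph Cross-Modal Transformer are all illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class ModalityEncoder(nn.Module):
    """Level 0 (sketch): per-sensor 1D-CNN mapping raw windows to a latent embedding."""
    def __init__(self, in_channels: int, embed_dim: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(in_channels, 32, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.Conv1d(32, embed_dim, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),  # pool over time: one vector per window
        )

    def forward(self, x):  # x: (batch, channels, time)
        return self.net(x).squeeze(-1)  # (batch, embed_dim)

class GroupFusion(nn.Module):
    """Level 1 (sketch): fuse a group of modality embeddings with learned
    importance weights, one weight per modality."""
    def __init__(self, n_modalities: int):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(n_modalities))

    def forward(self, embeddings):  # list of (batch, embed_dim) tensors
        stacked = torch.stack(embeddings, dim=1)     # (batch, n_mod, embed_dim)
        weights = torch.softmax(self.logits, dim=0)  # relative modality contributions
        return (weights.view(1, -1, 1) * stacked).sum(dim=1)  # (batch, embed_dim)

class CrossGroupTransformer(nn.Module):
    """Level 2 (sketch): self-attention over group embeddings, treating the
    sensor groups as a fully connected graph of tokens."""
    def __init__(self, embed_dim: int = 64, n_heads: int = 4):
        super().__init__()
        layer = nn.TransformerEncoderLayer(embed_dim, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, group_embeddings):  # list of (batch, embed_dim) tensors
        tokens = torch.stack(group_embeddings, dim=1)  # groups as tokens
        return self.encoder(tokens).mean(dim=1)        # fused global representation

# Toy usage: two inertial modalities forming one group, plus a second group.
acc, gyr = torch.randn(8, 3, 128), torch.randn(8, 3, 128)  # (batch, channels, time)
enc_a, enc_g = ModalityEncoder(3), ModalityEncoder(3)
group_a = GroupFusion(n_modalities=2)([enc_a(acc), enc_g(gyr)])
group_b = GroupFusion(n_modalities=2)([enc_a(acc), enc_g(gyr)])
global_repr = CrossGroupTransformer()([group_a, group_b])
logits = nn.Linear(64, 12)(global_repr)  # e.g. 12 activity classes
```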
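Level 3's uncertainty calibration is likewise unspecified in the abstract; temperature scaling is one standard calibration technique consistent with that description, sketched below under that assumption. The function name and hyperparameters are hypothetical.

```python
import torch

def calibrate_temperature(logits, labels, n_steps: int = 100, lr: float = 0.01):
    """Fit a single temperature on held-out logits so that softmax confidences
    better track true correctness rates (standard temperature scaling)."""
    log_t = torch.zeros(1, requires_grad=True)  # optimize log-temperature so T > 0
    optimizer = torch.optim.Adam([log_t], lr=lr)
    loss_fn = torch.nn.CrossEntropyLoss()
    for _ in range(n_steps):
        optimizer.zero_grad()
        loss = loss_fn(logits / log_t.exp(), labels)
        loss.backward()
        optimizer.step()
    return log_t.exp().item()

# Usage: fit T on a validation split, then rescale test-time confidences.
# T = calibrate_temperature(val_logits, val_labels)
# calibrated_probs = torch.softmax(test_logits / T, dim=-1)
```

A calibrated probability vector of this kind is what a utility-aware decision rule, as described for Level 3, would consume: expected utilities can then be computed per class and the maximizing activity label selected.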