Local Feature Enhancement for Robust 2D Multi-Person Pose Estimation via Posture Refinement Network

Abstract

Accurate 2D multi-person pose estimation remains challenging because of occlusion, missing body parts, and low resolution, particularly against complex backgrounds. This paper proposes a novel posture refinement network that uses local feature enhancement and fusion to address these limitations. The network employs HRNet as the backbone to extract multi-scale feature maps and introduces a Dilated Convolution Module (DCM) with cascaded dilated convolutions to enrich pose keypoint representations. Additionally, a Hybrid Self-Attention Module (HSM) integrates contextual information to further refine pose estimates. Extensive experiments on the MSCOCO and CrowdPose datasets demonstrate that the proposed method outperforms comparable methods, particularly in estimating the positions of human end joints with greater accuracy and robustness. These findings highlight the effectiveness of local feature enhancement in robust multi-person pose estimation. The code and models are available at https://github.com/Twl-GZ/Human-pose.
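
As a rough illustration of why cascading dilated convolutions can enrich keypoint features, the sketch below computes a 1D dilated convolution in plain Python and the receptive field of a stack of such layers. It is not the authors' DCM: the kernel size of 3, the dilation rates (1, 2, 4), the stride of 1, and the function names are all assumptions chosen for the example, since the abstract does not state them.

```python
def dilated_conv1d(signal, kernel, dilation):
    """'Valid' 1D dilated convolution (cross-correlation, stride 1).

    Kernel taps are spaced `dilation` samples apart, so a 3-tap kernel
    with dilation 2 covers 5 input samples.
    """
    span = (len(kernel) - 1) * dilation  # input extent covered beyond the first tap
    return [
        sum(kernel[j] * signal[i + j * dilation] for j in range(len(kernel)))
        for i in range(len(signal) - span)
    ]


def cascaded_receptive_field(kernel_size, dilations):
    """Receptive field of stacked stride-1 dilated convolutions.

    Each layer with dilation d widens the field by (kernel_size - 1) * d.
    """
    field = 1
    for d in dilations:
        field += (kernel_size - 1) * d
    return field


# Hypothetical dilation rates; the abstract does not specify the ones used in the DCM.
ASSUMED_DILATIONS = [1, 2, 4]

if __name__ == "__main__":
    print(dilated_conv1d([1, 2, 3, 4, 5], [1, 1, 1], dilation=2))
    print(cascaded_receptive_field(3, ASSUMED_DILATIONS))
```

Under these assumed settings, three 3-tap layers cover 15 input positions, whereas three undilated 3-tap layers would cover only 7, which is the general motivation for using dilation to gather wider context without adding parameters.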
