Local Feature Enhancement for Robust 2D Multi-Person Pose Estimation via Posture Refinement Network

Weili Tian
Jin Zhan
Zhaokang Guan
Chensheng Yi
Fangyuan Lei
Xiaoyong Liu
Huihui Li
Yufeng Zeng

Abstract

Accurate 2D multi-person pose estimation remains challenging because of occlusion, missing body parts, and low resolution, particularly against complex backgrounds. This paper proposes a novel posture refinement network that uses local feature enhancement and fusion to address these limitations. The network employs HRNet as the backbone to extract multi-scale feature maps and introduces a Dilated Convolution Module (DCM) with cascaded dilated convolutions to enrich pose keypoint representations. In addition, a Hybrid Self-Attention Module (HSM) integrates contextual information to further refine the pose estimates. Extensive experiments on the MSCOCO and CrowdPose datasets demonstrate that our method outperforms comparable methods, particularly in estimating the positions of human end joints with greater accuracy and robustness. Our findings highlight the effectiveness of local feature enhancement for robust multi-person pose estimation. The code and models are available at https://github.com/Twl-GZ/Human-pose.
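As a rough illustration of why cascading dilated convolutions can enrich local keypoint context, the sketch below stacks 1D dilated convolutions and computes the resulting receptive field. It is not the authors' DCM: the abstract does not state the kernel size or dilation rates, so the 3-tap kernel and the rates 1, 2, 4 are assumptions, the functions `dilated_conv1d`, `cascade`, and `receptive_field` are hypothetical names, and the 1D signal stands in for one row of a 2D feature map.

```python
from typing import List, Sequence


def dilated_conv1d(signal: Sequence[float], kernel: Sequence[float],
                   dilation: int) -> List[float]:
    """Zero-padded, same-length 1D dilated convolution (cross-correlation)."""
    k = len(kernel)
    if k % 2 == 0:
        raise ValueError("kernel length must be odd for symmetric padding")
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            # Taps are spaced `dilation` samples apart around position i.
            idx = i + (j - k // 2) * dilation
            if 0 <= idx < n:
                acc += w * signal[idx]
        out.append(acc)
    return out


def cascade(signal: Sequence[float], kernel: Sequence[float],
            dilations: Sequence[int]) -> List[float]:
    """Apply the same kernel repeatedly, one dilation rate per stage."""
    out = list(signal)
    for d in dilations:
        out = dilated_conv1d(out, kernel, d)
    return out


def receptive_field(kernel_size: int, dilations: Sequence[int]) -> int:
    """Receptive field of stride-1 dilated convolutions applied in sequence."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


# Assumed configuration (not taken from the paper): 3-tap kernel, rates 1, 2, 4.
ASSUMED_DILATIONS = [1, 2, 4]
impulse = [0.0] * 31
impulse[15] = 1.0
response = cascade(impulse, [1.0, 1.0, 1.0], ASSUMED_DILATIONS)
support = sum(1 for v in response if v != 0.0)
print(receptive_field(3, ASSUMED_DILATIONS), support)
```

Under these assumed rates, three 3-tap layers cover 15 positions, where three undilated 3-tap layers would cover only 7, and the parameter count is the same in both cases. This is the general motivation for cascaded dilation; the paper itself should be consulted for the actual DCM design.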

Version published to 10.21203/rs.3.rs-5034986/v1 on Research Square
Oct 9, 2024

Related articles

A Spatiotemporal Bidirectional Mamba Network with Global–Local Skeletal Enhancement for 3D Human Pose Estimation

This article has 5 authors:
1. Chuhan Wu
2. Zan Wang
3. Guixian Zhou
4. Jiahao Hua
5. Lianke Shi
This article has no evaluations. Latest version: Sep 4, 2025

Efficient Person Re-Identification via Progressive Filter Pruning and Body Part-Aware Feature Learning

This article has 4 authors:
1. Anusha Jayasimhan
2. Vijaya Lakshmi A
3. Pranaya Padmanabhuni
4. Priyaadharshini Ramesh
This article has no evaluations. Latest version: Oct 7, 2025

STGSFormer: A 3D Human Pose Estimation Model That Integrates GCN and Self-attention in the Spatio-temporal Domain

This article has 2 authors:
1. Fanjun Su
2. Jinyue Wang
This article has no evaluations. Latest version: Aug 21, 2025
