SimpleCID: Skeleton-Guided Lightweight Heatmap Refinement for Robust Multi-Person Pose Estimation in Crowded Scenes

Lifeng Zhou
Jiasheng Su
Huanxin Zhu
Ying Li
Xiang Zhang
Jinhe Su


Abstract

Multi-person pose estimation in crowded scenes remains difficult because heavy overlap and occlusion frequently break independently predicted keypoint heatmaps. Contextual Instance Decoupling (CID) improves crowded-scene estimation by generating instance-aware feature maps, yet its final joint heatmaps are still produced channel by channel without explicit structural coupling. This paper presents SimpleCID, a lightweight refinement built on top of CID. After the Global Feature Decoupling stage, we model the keypoints of each person as nodes in a human-body graph and propagate responses through a fixed normalized adjacency matrix. The refined response is fused with the original heatmap by a residual connection with a small coefficient, allowing adjacent joints to provide structural support while preserving the baseline prediction. The module introduces no additional trainable parameters, keeps the original training pipeline unchanged, and adds only lightweight matrix multiplication along the joint dimension. On crowded-scene benchmarks, SimpleCID consistently improves the baseline: it raises AP by 1.2 points on CrowdPose and improves OCHuman AP from 41.4 to 43.3. Qualitative comparisons further show more complete limb recovery and fewer anatomically inconsistent predictions under severe occlusion. These results demonstrate that explicit yet simple skeleton reasoning is an effective complement to contextual instance decoupling.
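The refinement step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The abstract specifies only a fixed normalized adjacency matrix, propagation along the joint dimension, and a residual fusion with a small coefficient. The symmetric normalization with self-loops, the value `alpha=0.1`, the three-joint toy skeleton, and the function names `normalized_adjacency` and `refine_heatmaps` are all assumptions made for illustration.

```python
import numpy as np


def normalized_adjacency(num_joints, edges):
    """Build a fixed, symmetric normalized adjacency matrix.

    Assumption: self-loops plus D^{-1/2} A D^{-1/2} normalization;
    the abstract does not state which normalization is used.
    """
    a = np.eye(num_joints, dtype=np.float64)  # self-loops
    for i, j in edges:
        a[i, j] = 1.0
        a[j, i] = 1.0
    deg = a.sum(axis=1)  # always >= 1 because of the self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def refine_heatmaps(heatmaps, adj, alpha=0.1):
    """Propagate responses between joints and fuse them residually.

    heatmaps: array of shape (K, H, W), one channel per joint.
    adj:      array of shape (K, K), fixed (no trainable parameters).
    alpha:    small fusion coefficient (hypothetical value).
    """
    # Matrix multiplication along the joint dimension only.
    propagated = np.einsum("jk,khw->jhw", adj, heatmaps)
    # Residual connection keeps the baseline prediction dominant.
    return heatmaps + alpha * propagated


# Toy example: a 3-joint chain (e.g. shoulder, elbow, wrist).
toy_edges = [(0, 1), (1, 2)]
toy_adj = normalized_adjacency(3, toy_edges)

toy_heatmaps = np.zeros((3, 4, 4))
toy_heatmaps[0, 1, 1] = 1.0  # only the first joint has a response

refined = refine_heatmaps(toy_heatmaps, toy_adj, alpha=0.1)
```

In this toy case the middle joint, whose own heatmap is empty, receives a small response at the location where its neighbor fired, while the joint that is not adjacent to the active one stays at zero. Because the adjacency matrix is fixed, the step adds no trainable parameters, which is consistent with the abstract's description.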

Version published on Research Square (DOI: 10.21203/rs.3.rs-9225509/v1), Apr 1, 2026

