ForVA and GCM-CLIP: A Million-Scale Multimodal Dataset and Representation Learning Framework for Virtual Autopsy
Abstract
Intelligent virtual autopsy faces profound semantic misalignment driven by scarce multimodal data and insufficient fine-grained cognitive mapping, leaving models vulnerable to complex post-mortem noise and catastrophic "shortcut learning". To bridge this misalignment, we curate ForVA, a standardized multimodal virtual autopsy dataset of 1.2 million image-text pairs spanning nine cause-of-death categories, and propose GCM-CLIP, a semantics-enhanced contrastive learning framework whose adaptive semantic decoupling module acts as a high-precision "semantic filter". Mechanistic analysis shows that GCM-CLIP sharpens semantic discrimination, reduces intra-/inter-class pathological feature overlap (from 0.830/0.709 to 0.566/0.452), and delivers a 25% relative gain in zero-shot classification accuracy alongside 6-8% improvements in cross-modal retrieval. Clinically, it enables junior practitioners to achieve senior-level diagnostic precision and serves as an unbiased "second reader" that captures lesions overlooked through cognitive anchoring. This work provides a reproducible paradigm for foundation models in high-stakes, data-scarce fields, with transformative implications for forensic objectivity and judicial justice worldwide.
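As a rough illustration of the framework described above, the sketch below pairs a standard CLIP-style image-text contrastive objective with a hypothetical gating module standing in for the adaptive semantic decoupling ("semantic filter"). The module design, names (SemanticGate, clip_contrastive_loss), embedding dimension, and temperature are illustrative assumptions, not the authors' implementation.

```python
# Minimal, illustrative sketch (not the authors' implementation): a CLIP-style
# contrastive objective with a hypothetical gating module standing in for the
# adaptive semantic decoupling ("semantic filter") described in the abstract.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SemanticGate(nn.Module):
    """Hypothetical 'semantic filter': re-weights embedding dimensions to
    suppress noise-dominated features before contrastive alignment."""
    def __init__(self, dim: int):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(dim, dim), nn.Sigmoid())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * self.gate(x)  # element-wise soft selection of semantic dims

def clip_contrastive_loss(img_emb, txt_emb, temperature: float = 0.07):
    """Symmetric InfoNCE loss over matched image-text pairs in a batch."""
    img_emb = F.normalize(img_emb, dim=-1)
    txt_emb = F.normalize(txt_emb, dim=-1)
    logits = img_emb @ txt_emb.t() / temperature        # (B, B) similarities
    targets = torch.arange(logits.size(0), device=logits.device)
    loss_i2t = F.cross_entropy(logits, targets)          # image -> text
    loss_t2i = F.cross_entropy(logits.t(), targets)      # text -> image
    return 0.5 * (loss_i2t + loss_t2i)

# Usage sketch: filter both modalities, then align them contrastively.
dim = 512
gate_img, gate_txt = SemanticGate(dim), SemanticGate(dim)
img_features = torch.randn(8, dim)   # stand-ins for image-encoder outputs
txt_features = torch.randn(8, dim)   # stand-ins for text-encoder outputs
loss = clip_contrastive_loss(gate_img(img_features), gate_txt(txt_features))
```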