Combination AI-Machine Learning to Diagnose Pulmonary Hypertension: A Real-World Evidence Cohort Study

Seyed M. Shams
Mary E. Maldarelli
Steven Cassady
Gautam Ramani
Colleen M. Ennett
Bradley A. Maron
Katarina Zeder

Abstract

BACKGROUND

Pulmonary hypertension (PH) is a highly morbid disease, but underdiagnosis is common outside of expert referral centers. Consequently, there may be opportunities to automate PH diagnosis using artificial intelligence (AI) clinical decision support tools. Analysis of patient-level right heart catheterization (RHC) data is required to optimize AI-based PH diagnosis but has not been reported previously.

METHODS

We performed a retrospective cohort analysis of all RHC studies (January 1, 2016 to December 31, 2024) performed at the University of Maryland Medical System (UMMS), a statewide clinical network in Maryland of 12 hospitals serving >2 million patients. We developed an automated large language model (LLM)-driven Pattern Repository (LDPR) method, featuring three task-specific LLM agents for extracting unstructured RHC data; the extracted data were manually cross-validated independently by two PH experts. To address data missingness, we used machine learning to develop formulae to calculate mean pulmonary artery pressure (mPAP) from systolic (sPAP) and diastolic (dPAP) PAP, using an 80/20 train-test split.
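As an illustration of the last step, the sketch below fits a linear formula for mPAP from sPAP and dPAP on an 80/20 train-test split. It is a minimal sketch under stated assumptions, not the study's pipeline: the abstract does not name the algorithms used, so ordinary least squares is assumed here, and the data are synthetic values generated from the conventional rule of thumb mPAP ≈ (sPAP + 2·dPAP)/3 plus noise. All function and variable names are hypothetical.

```python
import random


def solve_linear_system(a, b):
    """Solve a small dense system a·x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, targets):
    """Ordinary least squares via the normal equations (X^T X) w = X^T y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Synthetic data only (assumption): NOT the study cohort.
rng = random.Random(42)
data = []
for _ in range(1000):
    dpap = rng.uniform(5, 40)
    spap = dpap + rng.uniform(10, 50)
    mpap = (spap + 2 * dpap) / 3 + rng.gauss(0, 1.0)
    data.append(([1.0, spap, dpap], mpap))  # leading 1.0 is the intercept term

# 80/20 train-test split after shuffling.
rng.shuffle(data)
split = len(data) * 4 // 5
train, test = data[:split], data[split:]

coef = fit_ols([x for x, _ in train], [y for _, y in train])

# Evaluate on the held-out 20%.
preds = [sum(c * v for c, v in zip(coef, x)) for x, _ in test]
actual = [y for _, y in test]
sse = sum((p - y) ** 2 for p, y in zip(preds, actual))
mse = sse / len(test)
mean_y = sum(actual) / len(actual)
r2 = 1 - sse / sum((y - mean_y) ** 2 for y in actual)

print(f"mPAP = {coef[0]:.2f} + {coef[1]:.2f}*sPAP + {coef[2]:.2f}*dPAP")
print(f"held-out R^2 = {r2:.3f}, MSE = {mse:.2f}")
```

The held-out metrics are computed only on records the fit never saw, which is the point of the split. The coefficients recovered here reflect the synthetic generating rule and say nothing about the coefficients reported for the real cohort.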

RESULTS

The study cohort included N=11,029 unique patients and 17,292 RHC reports (age 66±13.5 years; 43% female; 65% White, 30% Black or African American; mPAP, 28±11 mmHg; 26% congestive heart failure). The precision of mPAP, sPAP, and dPAP extraction by the LLM was 99.6%, 99.4%, and 99.4%, respectively, with a detection failure of 0.4%. A missing mPAP was noted in N=548 cases and N=507 unique patients (3.2% and 4.6%, respectively). When applying ML to the dataset, the simple linear equation mPAP=1.51+0.43*sPAP+0.45*dPAP returned the highest R² of 0.94 and the lowest mean square error of 8.3 mmHg, outperforming the linear equations currently in use (all p<0.001). The ML-derived formula was then applied to patients with missing mPAP (N=507) and identified N=382 patients (75.3%) with mPAP >20 mmHg, thereby reclassifying these patients from no diagnosis to a diagnosis of PH.
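The reported equation and the mPAP >20 mmHg cutoff can be applied to a single reading as follows. This is a minimal sketch: the coefficients and threshold come from the abstract, while the function names and the example pressures are hypothetical and chosen only for illustration.

```python
PH_THRESHOLD_MMHG = 20.0  # cutoff stated in the abstract: mPAP > 20 mmHg


def estimate_mpap(spap, dpap):
    """Estimate mPAP (mmHg) with the abstract's equation: 1.51 + 0.43*sPAP + 0.45*dPAP."""
    return 1.51 + 0.43 * spap + 0.45 * dpap


def meets_ph_threshold(spap, dpap):
    """Return True when the estimated mPAP exceeds the 20 mmHg cutoff."""
    return estimate_mpap(spap, dpap) > PH_THRESHOLD_MMHG


# Hypothetical readings (not study data): sPAP 40 mmHg, dPAP 20 mmHg.
estimate = estimate_mpap(40, 20)
print(f"estimated mPAP = {estimate:.2f} mmHg")  # 1.51 + 17.2 + 9.0 = 27.71
print("exceeds PH threshold:", meets_ph_threshold(40, 20))
```

A reading of sPAP 25 mmHg and dPAP 10 mmHg gives 1.51 + 10.75 + 4.5 = 16.76 mmHg, which falls below the cutoff.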

CONCLUSION

In this retrospective cohort analysis, combined LLM-ML-based extraction and interpretation of RHC data was used to automate PH diagnosis in a large and heterogeneous patient population. This approach is an efficient and scalable solution for preventing underdiagnosis of PH and demonstrates the feasibility of generative AI for advancing clinically actionable tools that can improve cardiovascular disease phenotyping and diagnosis in real-world settings.

Version published to 10.1101/2025.09.30.25336749 on medRxiv
Oct 1, 2025

Machine learning models for predicting severe clinical events in hospitalized patients with coronary artery disease

This article has 16 authors:
1. Hao Liu
2. Meijun Liu
3. Xinmiao Guan
4. Feng Cao
5. Changhao Liang
6. Zhongwen Qi
7. Jiaqi Hui
8. Junnan Zhao
9. Jingli Xing
10. Jianguo Zhou
11. Dong Zhang
12. Lei Liu
13. Xiaoliang Hao
14. Minjing Luo
15. Fengqin Xu
16. Yutong Fei
This article has no evaluations
Latest version Jan 12, 2026

Responsible AI for Sepsis Prediction: Bridging the Gap Between Machine Learning Performance and Clinical Trust

This article has 6 authors:
1. Thiago Q. Oliveira
2. Leandro A. Carvalho
3. Flávio R. C. Sousa
4. João B. F. Filho
5. Khalil F. Oliveira
6. Daniel A. B. Tavares
This article has no evaluations
Latest version Jan 30, 2026

Prediction of Chronic Obstructive Pulmonary Disease Using Machine Learning Models

This article has 7 authors:
1. Sher Ali
2. Omair Faqah
3. Elise Neubarth
4. Mohammad Shehroz Ashraf
5. Michael DeGiorgio
6. Mark Block
7. Waseem Asghar
This article has no evaluations
Latest version Dec 15, 2025
