Combining Clinical Embeddings with Multi-Omic Features for Improved Patient Classification and Interpretability in Parkinson’s Disease

Chaeeun Lee
Barry Ryan
Riccardo E. Marioni
Pasquale Minervini
T. Ian Simpson

Abstract

This study integrates Large Language Model (LLM)-derived clinical text embeddings from the Movement Disorder Society Unified Parkinson’s Disease Rating Scale (MDS-UPDRS) questionnaire with molecular genomics data to enhance patient classification and interpretability in Parkinson’s disease (PD). By combining genomic modalities encoded using an interpretable biological architecture with a patient similarity network constructed from clinical text embeddings, our approach leverages both clinical and genomic information to provide a robust, interpretable model for disease classification and molecular insights. We benchmarked our approach using the baseline time point from the Parkinson’s Progression Markers Initiative (PPMI) dataset, identifying the Llama-3.2-1B text embedding model applied to Part III of the MDS-UPDRS as the most informative. We further validated the framework at years 1, 2, and 3 post-baseline, distinguishing PD-associated genes from a random null set with statistical significance by year 2 and replicating the association of MAPK with PD in a heterogeneous cohort. Our findings demonstrate that the combination of clinical text embeddings with genomic features is critical for classification and interpretation. LLM text embeddings not only increase classification accuracy but also enable interpretable genomic analysis, revealing molecular signatures associated with PD progression.
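
To make the idea of a patient similarity network built from clinical text embeddings concrete, here is a minimal sketch. The abstract does not say how the network is constructed, so the cosine-similarity k-nearest-neighbour rule, the function name `build_similarity_network`, and the parameter `k` are all assumptions for illustration, and random vectors stand in for real LLM embeddings of MDS-UPDRS text.

```python
import numpy as np

def build_similarity_network(embeddings, k=2):
    """Hypothetical helper: link each patient to its k most cosine-similar peers.

    Returns a symmetric 0/1 adjacency matrix with an empty diagonal.
    This is an illustrative construction, not the authors' method.
    """
    X = np.asarray(embeddings, dtype=float)
    # L2-normalise rows so that the dot product equals cosine similarity.
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    X = X / np.clip(norms, 1e-12, None)
    sim = X @ X.T
    # Exclude self-matches before ranking neighbours.
    np.fill_diagonal(sim, -np.inf)
    # Indices of the k most similar patients for every row.
    neighbours = np.argsort(-sim, axis=1)[:, :k]
    n = X.shape[0]
    adj = np.zeros((n, n), dtype=int)
    rows = np.repeat(np.arange(n), k)
    adj[rows, neighbours.ravel()] = 1
    # Symmetrise: keep an edge if either patient selected the other.
    return np.maximum(adj, adj.T)

# Toy stand-in for embeddings: 6 patients, 8 dimensions (assumed sizes).
rng = np.random.default_rng(0)
toy_embeddings = rng.normal(size=(6, 8))
adjacency = build_similarity_network(toy_embeddings, k=2)
```

In a setup like the one the abstract describes, such an adjacency matrix could define the graph over which patient-level genomic features are propagated; the choice of `k` and of the similarity measure would need to be tuned on real data.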

Version published to 10.1101/2025.01.17.25320664 on medRxiv
Jan 17, 2025

Related articles

PRESSnet: a novel framework for patient stratification and biomarker discovery using clinical knowledge graphs

This article has 11 authors:
1. Jake Cohen-Setton
2. Shruti Shikhare
3. Ioannis Kagiampakis
4. Domingo Salazar
5. Miguel Goncalves
6. Elizabeth Coker
7. Sanddhya Jayabalan
8. Damian Bikiel
9. Ben Sidders
10. Etai Jacob
11. Krishna Bulusu
This article has no evaluations. Latest version: Dec 15, 2025

Deep Learning Architectures for Multi-Omics Data Integration: Bridging Biomarker Discovery and Clinical Translation

This article has 2 authors:
1. Akshay Krishnan Pushparaj
2. Malarmathi Muthukumar
This article has no evaluations. Latest version: Jan 26, 2026

Personalized Disease Prediction Framework based on Genomic Variants and Disease Histories using Deep Embeddings and Alignment-based Process Conformance Checking

This article has 4 authors:
1. Daewoo Pak
2. Hyunwoo Jo
3. Seon Kim
4. Jongchan Kim
This article has no evaluations. Latest version: Jan 20, 2026
