Personalized Disease Prediction Framework based on Genomic Variants and Disease Histories using Deep Embeddings and Alignment-based Process Conformance Checking

Abstract

This study proposes a novel personalized disease prediction framework that integrates three kinds of heterogeneous biomedical information: structured genomic variant annotations, deep semantic embeddings of longitudinal disease histories, and process conformance metrics derived from historical disease pathways. Disease history embeddings were generated using BioClinicalBERT, and alignment-based process conformance checking was applied to quantify how closely an individual’s disease trajectory conforms to typical progression patterns observed in the population. The results demonstrate that incorporating conformance-based fitness features significantly improves prediction performance across all disease categories and classifiers, yielding consistently higher AUROC values and lower Brier scores. These findings indicate that conformity to process-level disease pathways captures critical temporal and behavioral information not fully represented by genomic or deep semantic features alone, which highlights the importance of integrating genetic, semantic, and process-based signals for personalized disease prediction in precision medicine.
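To make the notion of a conformance-based fitness feature concrete, the following is a minimal sketch of alignment-based fitness for a single disease trajectory. It is not the authors' implementation: the abstract does not specify the process model or the cost function, so the sketch assumes the "typical progression patterns" are given as a small set of reference traces, that every log move or model move costs 1, and that synchronous moves cost 0. The function names and the disease labels are hypothetical.

```python
from typing import Sequence


def alignment_cost(trace: Sequence[str], reference: Sequence[str]) -> float:
    """Cost of an optimal alignment between a patient trace and one reference trace.

    Assumption: a synchronous move (same diagnosis in both) costs 0, while a
    log move (skip a diagnosis in the trace) or a model move (skip a diagnosis
    in the reference) costs 1. Substitutions are not allowed.
    """
    n, m = len(trace), len(reference)
    prev = list(range(m + 1))  # empty trace: only model moves
    for i in range(1, n + 1):
        curr = [i] + [0] * m  # empty reference: only log moves
        for j in range(1, m + 1):
            sync = prev[j - 1] if trace[i - 1] == reference[j - 1] else float("inf")
            curr[j] = min(sync, prev[j] + 1, curr[j - 1] + 1)
        prev = curr
    return prev[m]


def fitness(trace: Sequence[str], references: Sequence[Sequence[str]]) -> float:
    """Alignment-based fitness in [0, 1]; 1.0 means the trace fits a reference exactly.

    fitness = 1 - (optimal alignment cost) / (worst-case cost), where the
    worst case moves through the whole trace with log moves and through the
    shortest reference with model moves.
    """
    best = min(alignment_cost(trace, ref) for ref in references)
    worst = len(trace) + min(len(ref) for ref in references)
    return 1.0 if worst == 0 else 1.0 - best / worst


# Hypothetical reference pathways and patient trajectories, for illustration only.
reference_paths = [
    ["hypertension", "diabetes", "ckd"],
    ["obesity", "diabetes"],
]
perfect = fitness(["hypertension", "diabetes", "ckd"], reference_paths)
partial = fitness(["hypertension", "ckd"], reference_paths)
unrelated = fitness(["asthma"], reference_paths)
```

Under these assumptions, a score like `partial` could be appended as one extra column alongside the genomic and embedding features before training a classifier. A real pipeline would more likely align traces against a discovered process model, such as a Petri net, than against a fixed list of reference traces.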
