Personalized Disease Risk Prediction from Multimodal Health Data Using Large Language Models


Abstract

This study presents a system designed to improve disease prediction accuracy by integrating electronic health records (EHRs) and wearable device data. EHRs provide structured health information, such as hospital visits and diagnostic codes, while wearable devices capture real-time physiological signals, such as step counts, offering insight into behavior patterns and activity levels. Traditionally, these data sources have been studied independently, leaving their combined potential underused and producing inconsistent predictions. Our system takes a multimodal approach: specialized encoders process each data type separately, and the extracted features are integrated into a shared embedding space. Large language models (LLMs) leverage this common embedding to capture meaningful relationships between health records and activity patterns, allowing the system to predict disease risks with greater accuracy and relevance to individual patient characteristics. The risk scores, generated end-to-end, can in turn be used by LLMs to provide personalized recommendations based on individual risk profiles. This multimodal approach not only addresses the challenges of integrating disparate data formats but also provides a holistic view of patient health, identifying subtle trends that may be missed when EHR or wearable data are used in isolation. The result is a robust, comprehensive framework for proactive, personalized healthcare.
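The abstract does not give implementation details, but the architecture it describes (per-modality encoders, a shared embedding space, and a downstream risk scorer) can be illustrated with a minimal sketch. Everything below is an assumption for illustration: the choice of PyTorch, the use of an embedding bag over diagnostic codes, a GRU over the wearable time series, and a small MLP head standing in for the LLM stage; all class names, dimensions, and hyperparameters are hypothetical, not the authors' implementation.

```python
# Hypothetical sketch of the multimodal fusion described in the abstract.
import torch
import torch.nn as nn

class EHREncoder(nn.Module):
    """Encodes a bag of diagnostic codes (e.g., ICD codes) into a fixed vector."""
    def __init__(self, vocab_size: int, dim: int):
        super().__init__()
        self.embed = nn.EmbeddingBag(vocab_size, dim, mode="mean")

    def forward(self, codes: torch.Tensor, offsets: torch.Tensor) -> torch.Tensor:
        return self.embed(codes, offsets)

class WearableEncoder(nn.Module):
    """Encodes a physiological time series (e.g., daily step counts) with a GRU."""
    def __init__(self, n_features: int, dim: int):
        super().__init__()
        self.gru = nn.GRU(n_features, dim, batch_first=True)

    def forward(self, series: torch.Tensor) -> torch.Tensor:
        _, h = self.gru(series)   # h: (num_layers, batch, dim)
        return h.squeeze(0)

class MultimodalRiskModel(nn.Module):
    """Projects both modalities into a shared embedding space and scores risk.

    In the system described, the shared embedding would be consumed by an LLM
    (e.g., as soft prompt tokens); here a small MLP head stands in for that
    step so the sketch stays self-contained.
    """
    def __init__(self, vocab_size=10_000, n_features=1, enc_dim=128,
                 shared_dim=256, n_diseases=20):
        super().__init__()
        self.ehr = EHREncoder(vocab_size, enc_dim)
        self.wear = WearableEncoder(n_features, enc_dim)
        self.to_shared = nn.Linear(2 * enc_dim, shared_dim)
        self.risk_head = nn.Sequential(nn.ReLU(), nn.Linear(shared_dim, n_diseases))

    def forward(self, codes, offsets, series):
        fused = torch.cat([self.ehr(codes, offsets), self.wear(series)], dim=-1)
        shared = self.to_shared(fused)                # shared embedding space
        return torch.sigmoid(self.risk_head(shared))  # per-disease risk scores

# Toy usage: one patient with three diagnostic codes and a week of step counts.
model = MultimodalRiskModel()
codes = torch.tensor([12, 407, 1093])   # code indices for the whole batch
offsets = torch.tensor([0])             # start index of each patient's codes
steps = torch.rand(1, 7, 1)             # (batch, days, features), normalized
risks = model(codes, offsets, steps)
print(risks.shape)                      # torch.Size([1, 20])
```

Projecting both modalities into one shared space before scoring is what lets a single downstream model (here the MLP, in the paper an LLM) reason jointly over records and activity patterns rather than over each source in isolation.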
