Predicting Behavioral Determinants of Health from Clinical Text Using Transformer Models and BiLSTM


Abstract

Objective

Social and behavioral determinants of health play a critical role in patient outcomes, yet much of this information is documented only in unstructured clinical text rather than in structured records. Detecting these factors with natural language processing enables a deeper understanding of patient health and supports better decision-making. This study aims to improve the prediction of Behavioral Determinants of Health (BDoH) from medical records by systematically comparing multiple transformer-based models (Bio-ClinicalBERT, BioBERT, BioMedBERT, RoBERTa, and T5-EHR), each evaluated with and without a BiLSTM layer. We also address the challenge of class imbalance in the dataset and assess the role of generative versus discriminative models in this task.

Methods

We evaluated five transformer models, each with and without a BiLSTM layer, for classifying BDoH mentions in the MIMIC-III dataset. To address class imbalance, we compared oversampling, undersampling, and class-weighting strategies to identify the most effective approach. Each model was first tested as a standalone classifier to establish baseline performance; we then extracted embeddings from these models and used them as input to a BiLSTM layer to investigate whether sequential modeling could further improve classification. By introducing our T5-EHR model alongside the BERT-based and RoBERTa models, we were also able to assess both the benefit of BiLSTM integration and the value of a generative model for this task. Finally, we benchmarked the best approaches against the published results of the MIMIC-SBDH study, using precision, recall, and F1 score as evaluation metrics.
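The two key ingredients of this pipeline (class-weighted loss and a BiLSTM head over transformer embeddings) can be sketched as below. This is a minimal illustration, not the authors' implementation: the embedding dimension (768), hidden size, label counts, and class names are hypothetical placeholders, and the per-token embeddings would in practice come from one of the transformer encoders named above.

```python
import torch
import torch.nn as nn

class BiLSTMClassifier(nn.Module):
    """BiLSTM classification head over per-token transformer embeddings (sketch)."""
    def __init__(self, embed_dim=768, hidden_dim=128, num_classes=5):
        super().__init__()
        self.lstm = nn.LSTM(embed_dim, hidden_dim,
                            batch_first=True, bidirectional=True)
        # Bidirectional LSTM doubles the feature size fed to the linear layer.
        self.fc = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, x):            # x: (batch, seq_len, embed_dim)
        out, _ = self.lstm(x)        # out: (batch, seq_len, 2 * hidden_dim)
        pooled = out.mean(dim=1)     # mean-pool over the token dimension
        return self.fc(pooled)       # logits: (batch, num_classes)

# Class weighting: inverse-frequency weights passed to the loss,
# so rare labels contribute as much gradient as frequent ones.
counts = torch.tensor([900., 50., 30., 15., 5.])   # hypothetical label counts
weights = counts.sum() / (len(counts) * counts)
criterion = nn.CrossEntropyLoss(weight=weights)

model = BiLSTMClassifier()
x = torch.randn(4, 16, 768)          # 4 notes, 16 tokens, 768-dim embeddings
logits = model(x)
loss = criterion(logits, torch.tensor([0, 1, 2, 0]))
```

With this weighting, each class contributes equally to the expected loss regardless of its frequency, which is one standard way to realize the class-weighting strategy the study found most effective.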

Results

The experiments showed that class weighting was the most effective strategy for handling class imbalance, yielding better performance across all models than oversampling or undersampling. Adding a BiLSTM layer improved the performance of Bio-ClinicalBERT, BioMedBERT, and BioBERT across all categories, whereas RoBERTa and T5-EHR achieved strong results as standalone models, with no additional benefit from BiLSTM integration. Across all evaluated labels, T5-EHR achieved the highest F1 scores, outperforming both the other models in this study and prior baselines.

Conclusion

This study demonstrates that handling class imbalance is essential for robust prediction of social and behavioral determinants of health from clinical notes. We find that integrating a BiLSTM layer provides clear benefits for Bio-ClinicalBERT and BioBERT, enhancing their ability to capture sequential dependencies. In contrast, RoBERTa and T5-EHR achieved their best performance without BiLSTM, reflecting the strength of their pretrained representations. Among all models, our pretrained T5-EHR, adapted specifically for clinical text, outperformed every baseline, establishing the highest F1 scores across all labels and surpassing the results reported in the MIMIC-SBDH benchmark. These findings highlight both the value of model-specific strategies and the effectiveness of generative models for advancing state-of-the-art classification in health informatics.
