Development of Machine Learning Algorithms for Predicting Vitamin B12 Levels Using Biochemical Analyte Data
Abstract
Background: Vitamin B12 deficiency is common yet frequently underdiagnosed, because serum total B12 has limited diagnostic accuracy and confirmatory biomarkers such as holotranscobalamin and methylmalonic acid are not widely available. Advances in machine learning (ML) and large-scale laboratory datasets provide new opportunities to leverage routinely collected biochemical and hematological parameters for early detection. This study aimed to develop, optimize, and validate explainable ML models to predict vitamin B12 deficiency using standard laboratory analytes obtained during routine outpatient care.

Methods: This retrospective study included 51,630 adult patients from 2015–2025, with an independent temporal validation cohort of 34,744 individuals. Eight supervised ML algorithms (logistic regression, random forest, decision tree, support vector machine (SVM), k-nearest neighbors (KNN), XGBoost, CatBoost, and artificial neural networks) were trained within a four-stage experimental framework comprising default modeling, threshold optimization, hyperparameter tuning, and feature engineering. Performance was assessed using AUC-ROC, AUC-PR, sensitivity, specificity, F1-score, PPV, NPV, accuracy, Matthews correlation coefficient (MCC), and likelihood ratios. Statistical comparisons included the DeLong test, paired t-tests, McNemar's test, and net reclassification improvement (NRI) and integrated discrimination improvement (IDI) analyses. Model interpretability and clinical utility were evaluated using SHAP, LIME, and decision curve analysis (DCA).

Results: Across all experiments, CatBoost achieved the most balanced performance; its configuration with the decision threshold chosen to maximize F1 had the lowest false-negative rate. In the test set, CatBoost yielded sensitivity 0.92, specificity 0.67, F1 0.82, AUC-ROC 0.88, and AUC-PR 0.86. Temporal validation confirmed robust generalizability (sensitivity 0.85, specificity 0.77, AUC-ROC 0.90, AUC-PR 0.91, MCC 0.63). SHAP and LIME consistently identified MCV, HGB, HCT, RBC, RDW, iron, ferritin, CRP, folate, and age as key contributors. DCA demonstrated superior net clinical benefit across a wide threshold range.

Conclusion: This study presents the first large-scale, explainable, and clinically validated ML model capable of predicting vitamin B12 deficiency using only routine laboratory parameters. The model exhibits strong discrimination, reliability under temporal shift, and biologically meaningful interpretability, supporting its potential integration into clinical decision-support systems for early detection and optimized laboratory workflows.
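
The abstract names F1-maximizing threshold optimization, MCC, likelihood ratios, and net benefit from decision curve analysis, but does not describe how they were computed. The following is a minimal sketch of the standard textbook definitions, not the authors' pipeline. All function names are hypothetical, and the labels and scores are invented toy values rather than study data.

```python
from math import sqrt


def confusion_counts(y_true, y_score, threshold):
    """Count TP, FP, TN, FN when scores >= threshold are called deficient (1)."""
    tp = fp = tn = fn = 0
    for label, score in zip(y_true, y_score):
        predicted_positive = score >= threshold
        if predicted_positive and label == 1:
            tp += 1
        elif predicted_positive:
            fp += 1
        elif label == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def f1_from_counts(tp, fp, fn):
    """F1 = 2TP / (2TP + FP + FN); defined as 0 when there are no positives at all."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_f1_threshold(y_true, y_score):
    """Scan every observed score as a candidate cut-off and keep the one with the highest F1."""
    best_t, best_f1 = None, -1.0
    for t in sorted(set(y_score)):
        tp, fp, _, fn = confusion_counts(y_true, y_score, t)
        f1 = f1_from_counts(tp, fp, fn)
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1


def summary_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, MCC, and positive/negative likelihood ratios."""
    sens = tp / (tp + fn) if tp + fn else 0.0
    spec = tn / (tn + fp) if tn + fp else 0.0
    mcc_denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_denom if mcc_denom else 0.0
    lr_pos = sens / (1 - spec) if spec < 1 else float("inf")
    lr_neg = (1 - sens) / spec if spec > 0 else float("inf")
    return {"sensitivity": sens, "specificity": spec, "mcc": mcc,
            "lr_pos": lr_pos, "lr_neg": lr_neg}


def net_benefit(tp, fp, n, p_t):
    """Decision-curve net benefit at threshold probability p_t (0 < p_t < 1)."""
    return tp / n - (fp / n) * (p_t / (1 - p_t))


# Invented toy data: 1 = deficient, 0 = not deficient.
toy_labels = [1, 1, 1, 0, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]

threshold, f1 = best_f1_threshold(toy_labels, toy_scores)
counts = confusion_counts(toy_labels, toy_scores, threshold)
print(threshold, round(f1, 3), summary_metrics(*counts))
```

In practice the threshold would be selected on a validation split and then fixed before evaluation on the test and temporal cohorts. Maximizing F1 tends to favor a lower cut-off when missed cases are costly, which is consistent with the high-sensitivity, lower-specificity profile reported above.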