SpineScan: a deep learning model for lumbar spine MRI annotation and Pfirrmann grading assessment
Abstract
Purpose: Recent advances in deep learning have enabled automated Pfirrmann grading of intervertebral disc degeneration (IDD), but many models remain inaccessible due to proprietary restrictions. This study aimed to develop and validate a convolutional neural network (CNN) for automated Pfirrmann grading using a diverse clinical dataset, and to compare the model's performance with previously published results.

Methods: We trained a CNN-based model using the YOLOv8x architecture on two datasets: a well-curated Russian lumbar disc degeneration cohort (RuDDS) and an open-access dataset, totaling 484 lumbar MRI scans. Ground-truth grades were provided by expert radiologists. The model was designed to detect intervertebral discs and classify their degeneration grade simultaneously from single MRI slices. Performance was evaluated using standard metrics, including precision, recall, and mean average precision (mAP), across Pfirrmann grades I to V.

Results: The model achieved a predictive accuracy between 0.78 and 0.82, depending on lumbar level. The highest performance was observed for Grade IV discs (mAP50 = 0.872), while performance for Grade V was lower (mAP50-95 = 0.525), likely due to poor contrast and indistinct boundaries in highly degenerated discs. Overall, the model achieved a precision of 0.75 and a recall of 0.808. Comparison with previous studies indicated that these results are consistent with expert-level performance.

Conclusions: The developed model shows strong potential for automated grading of lumbar disc degeneration and performs comparably to expert radiologists in most cases. Our findings support the clinical applicability of AI-assisted grading systems while emphasizing the need for standardized imaging and evaluation protocols.
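For readers unfamiliar with the detection metrics quoted above, the following is a minimal sketch of how precision and recall are conventionally computed for a detector that outputs a bounding box and a grade per disc. It is not the authors' evaluation code: the function names `iou` and `precision_recall`, the `(x1, y1, x2, y2)` box format, and the greedy matching rule are assumptions made for illustration. By the usual convention, mAP50 is average precision at an intersection-over-union (IoU) threshold of 0.5, and mAP50-95 averages it over thresholds from 0.5 to 0.95.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(preds, truths, iou_thr=0.5):
    """Hypothetical helper, not from the paper.

    preds:  list of (box, grade, confidence) predictions
    truths: list of (box, grade) expert annotations
    A prediction is a true positive if it overlaps an unmatched
    ground-truth disc of the same grade with IoU >= iou_thr.
    """
    matched = set()
    tp = 0
    # Greedy matching, highest-confidence predictions first.
    for box, grade, _score in sorted(preds, key=lambda p: -p[2]):
        best_j, best_iou = None, iou_thr
        for j, (t_box, t_grade) in enumerate(truths):
            if j in matched or t_grade != grade:
                continue
            overlap = iou(box, t_box)
            if overlap >= best_iou:
                best_j, best_iou = j, overlap
        if best_j is not None:
            matched.add(best_j)
            tp += 1
    fp = len(preds) - tp   # predictions with no matching disc
    fn = len(truths) - tp  # annotated discs the model missed
    precision = tp / (tp + fp) if preds else 0.0
    recall = tp / (tp + fn) if truths else 0.0
    return precision, recall
```

Under these assumptions, a reported precision of 0.75 would mean that three in four predicted disc grades matched an expert annotation, and a recall of 0.808 would mean that about four in five annotated discs were found with the correct grade.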