Clinical Evaluation of an AI-Based System for Pediatric Growth Screening in Routine Practice: A Retrospective Cross-Sectional Study

Abstract

Background: Bone age assessment is fundamental to pediatric endocrinology and growth evaluation. Traditional manual methods based on radiographic atlases are time-consuming and subject to inter-observer variability. Artificial intelligence (AI) systems offer a potential route to standardized, efficient bone age screening, but rigorous clinical validation in diverse populations remains essential.

Objective: To evaluate the clinical performance and accuracy of BoneAH, an AI-based bone age assessment system, as a screening tool in a cohort of Indian pediatric patients, compared with reference determinations made using the Greulich-Pyle method.

Methods: This retrospective cross-sectional observational study included 288 left hand-wrist radiographs from healthy pediatric patients aged 1 to 17 years. AI-predicted bone age was compared against consensus reference determinations made by three blinded clinicians using the Greulich-Pyle atlas. Primary outcomes were mean absolute error (MAE), intraclass correlation coefficient (ICC), and Bland-Altman analysis. Secondary analyses examined performance across age groups and by gender.

Results: The AI system showed high agreement with the reference standard (ICC = 0.989; 95% CI: 0.986–0.991). Overall MAE was 0.58 years (95% CI: 0.53–0.63), with 83.3% of predictions within ±1.0 year and 97.2% within ±1.5 years of reference values. Pearson correlation was 0.993 (p < 0.001). A systematic positive bias of +0.40 years was observed. Performance was comparable between males (MAE = 0.61 years) and females (MAE = 0.55 years; p = 0.243). Younger children (0–5 years) showed the lowest MAE (0.45 years).

Conclusions: BoneAH demonstrated high reliability and clinically acceptable accuracy for pediatric bone age screening in an Indian population. The systematic nature of its positive bias suggests that the bias could be reduced through calibration. The system shows promise as a first-level screening tool for growth assessment programs.
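As an illustration of the agreement metrics named in the abstract (MAE, Bland-Altman bias and limits of agreement, and the share of predictions within a tolerance), the following minimal Python sketch shows how such quantities are commonly computed from paired bone age values. It is not the authors' analysis code. The function name `agreement_summary`, the 1.96 multiplier for the limits of agreement, and the sample values are assumptions made for illustration only and are not study data.

```python
from statistics import mean, stdev


def agreement_summary(predicted, reference, tolerance=1.0):
    """Summarize agreement between paired bone age estimates (in years).

    Hypothetical helper for illustration; not taken from the study.
    """
    if len(predicted) != len(reference) or len(predicted) < 2:
        raise ValueError("need two equal-length sequences with at least 2 pairs")

    # Signed differences: positive means the AI estimate is older than reference.
    diffs = [p - r for p, r in zip(predicted, reference)]

    mae = mean(abs(d) for d in diffs)   # mean absolute error
    bias = mean(diffs)                  # Bland-Altman mean difference
    sd = stdev(diffs)                   # sample SD of the differences

    # Conventional 95% limits of agreement (assumes roughly normal differences).
    loa_lower = bias - 1.96 * sd
    loa_upper = bias + 1.96 * sd

    # Fraction of predictions within +/- tolerance years of the reference.
    within = sum(abs(d) <= tolerance for d in diffs) / len(diffs)

    return {
        "mae": mae,
        "bias": bias,
        "loa_lower": loa_lower,
        "loa_upper": loa_upper,
        "within_tolerance": within,
    }


if __name__ == "__main__":
    # Made-up values for demonstration only; NOT data from the study.
    ai_years = [2.0, 5.5, 10.0, 13.0]
    ref_years = [1.5, 5.0, 9.0, 13.5]
    for name, value in agreement_summary(ai_years, ref_years).items():
        print(f"{name}: {value:.3f}")
```

A constant positive bias of the kind reported could, in principle, be subtracted from each prediction before the metrics are recomputed. Whether such a correction is appropriate would have to be checked on data independent of the data used to estimate the bias.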
