AI vs. Traditional ultrasound study in Congenital Heart Defect Detection: A Systematic review


Abstract

Introduction

Prenatal detection rates for congenital heart defects (CHD) have increased with improved ultrasound technology and imaging, the use of first-trimester fetal echocardiography, and standardization of the views required in the fetal echocardiogram. Accurate prenatal detection of CHD, particularly complex CHD, is an important contributor to improved survival for affected patients; however, training and the availability of expertise remain limitations in screening. Artificial intelligence (AI) has advanced rapidly and, in some tasks, can exceed human capabilities, so its application to clinical diagnostics holds considerable potential. For this review, forty original primary research studies that examined the role of AI/machine learning in detecting prenatal CHD, in comparison with fetal echocardiography by traditional ultrasound techniques, were selected, and the extracted data were assessed for the accuracy of AI-driven prenatal ultrasound screening for CHD.

Rationale

Because the technology is new, access to such systems is not widespread, and advanced AI algorithms are rarely available on older ultrasound machines, well-designed comprehensive studies and systematic analyses remain few, and there is a substantial gap in our understanding of this valuable tool in clinical application and diagnosis.

Objectives

To assess the diagnostic accuracy of artificial intelligence algorithms in detecting various congenital heart defects, in comparison with traditional ultrasound-based fetal echocardiography, across gestational ages.

Methods

A comprehensive search was conducted across databases and registers, using PubMed, semantic search tools and journals, and identified 496 records. After duplicates were removed and studies were excluded against the eligibility criteria, 40 original primary studies were included. Both authors independently reviewed and summarized the abstracts of the most relevant retrieved papers to exclude studies that did not meet the outlined criteria. Some well-planned, comprehensive studies lacked key inclusion criteria and were therefore excluded from the review.

Studies were included only if they involved prenatal ultrasound screening of human fetuses for congenital heart defects, evaluated AI/machine learning algorithms for detecting congenital heart defects in comparison with traditional ultrasound fetal echocardiography, and reported at least one quantitative measure of diagnostic accuracy (sensitivity, specificity or accuracy). Study designs were limited to primary research studies (prospective or retrospective). Non-human studies, duplicate publications, systematic or narrative reviews, and studies without key information related to the review question were excluded.

Data were extracted on the types of congenital heart defects detected, sensitivity of detection, and timing of the scan, including gestational age at study, and were compared with detection of the same defects by traditional ultrasound fetal echocardiography.
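The three accuracy measures named in the inclusion criteria can all be derived from a 2x2 confusion matrix. The following is a minimal sketch of those definitions only; the class name, function names and the counts are hypothetical illustrations and are not taken from any study in this review.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from comparing a screening result with the reference diagnosis."""
    tp: int  # CHD present, test positive
    fp: int  # CHD absent, test positive
    tn: int  # CHD absent, test negative
    fn: int  # CHD present, test negative


def _ratio(numerator: int, denominator: int) -> float:
    # A measure is undefined when its denominator group is empty.
    if denominator == 0:
        raise ValueError("measure is undefined: denominator is zero")
    return numerator / denominator


def sensitivity(c: ConfusionCounts) -> float:
    """Proportion of fetuses with CHD that the test flags."""
    return _ratio(c.tp, c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Proportion of fetuses without CHD that the test clears."""
    return _ratio(c.tn, c.tn + c.fp)


def accuracy(c: ConfusionCounts) -> float:
    """Proportion of all fetuses that the test classifies correctly."""
    return _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn)


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
example = ConfusionCounts(tp=45, fp=20, tn=180, fn=5)

if __name__ == "__main__":
    print(f"sensitivity = {sensitivity(example):.3f}")
    print(f"specificity = {specificity(example):.3f}")
    print(f"accuracy    = {accuracy(example):.3f}")
```

Because accuracy pools affected and unaffected fetuses, it depends on how common CHD is in the study sample, whereas sensitivity and specificity do not; this is one reason the three measures are not interchangeable when comparing studies.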

Results

Across gestational ages, artificial intelligence algorithms detected congenital heart defects with 95-96% accuracy, exceeding traditional diagnostic methods, which achieved 88-90% accuracy. This systematic review of 40 selected studies, involving around 20,000 pregnancies, offers insight into the accuracy of AI in the prenatal diagnosis of congenital heart defects. In individual studies, ensemble neural networks, convolutional architectures (including YOLO variants and DenseNet), and explainable AI approaches produced sensitivity values between 75% and 100% and specificity values between 76% and 100% for defects such as tetralogy of Fallot, hypoplastic left heart syndrome, atrioventricular septal defect, and ventricular septal defect. In direct comparisons, one report recorded an area under the curve (AUC) of 0.883 for AI versus 0.749 for residents and 0.808 for fellows, while ensemble methods achieved 95% sensitivity and 96% specificity compared with 88% and 90% for traditional measurements.

Performance was consistently high across gestational ages. First-trimester screenings reached approximately 95.6% accuracy, and studies focused on the common second-trimester period (18–24 weeks) and later gestations reported similar detection levels.

Conclusion

These results suggest that, within their respective designs and sample sizes, AI techniques generally provide detection metrics that meet or exceed those of established diagnostic methods in prenatal CHD screening, and in effect provide augmented intelligence for clinical diagnosis. Further research with more robust standardization and larger-scale randomized studies is needed to validate this accuracy and the diagnostic applications that can be harnessed.
