Personalized Hearing Loss Care Using SNOMED CT-Aligned Ontology and Random Forest Machine Learning: A Hybrid Decision-Support Framework

Abstract

Background: Hearing loss affects over 466 million individuals globally and is recognized as a major risk factor for Alzheimer's disease, yet treatment personalization remains limited because the underlying causes are complex and diverse. Current diagnostic and therapeutic approaches lack standardized methods for predicting the most appropriate intervention for an individual patient. Integrating medical ontologies with machine learning is a promising way to improve diagnostic accuracy and treatment personalization.

Aim: Our study aimed to (i) develop a SNOMED CT-aligned clinical ontology for hearing loss using Semantic Web Rule Language for automated reasoning; (ii) implement a Random Forest classifier trained on ontology-enriched patient data to classify hearing loss types (conductive, sensorineural, mixed, or normal); and (iii) predict optimal personalized treatments based on laterality, severity, audiometric thresholds, and medical history using real-world patient data.

Methods: We developed a task ontology in Protégé 5.6.3 using Web Ontology Language, aligned it with SNOMED CT terminology, and implemented Semantic Web Rule Language rules executed by the Pellet 2.2.0 reasoner. The framework was trained and evaluated on 3,723 adult patients with complete audiometric and clinical data from the 2015-2016 National Health and Nutrition Examination Survey dataset. Random Forest models were developed using an 80-20 train-test split with stratified sampling and five-fold cross-validation. Performance was compared between K-Means clustering-based labeling and ontology-based semantic inference using accuracy, precision, recall, F1-score, and log loss.

Results: The ontology generated semantic labels for all 3,723 patients, enabling classification of hearing loss type, severity level, and laterality. The Random Forest model with K-Means clustering achieved a test accuracy of 90.2% with a log loss of 0.2766 and a cross-validation mean accuracy of 91.22% (standard deviation 1.2%). Adding ontology-based semantic enrichment improved performance, raising test accuracy to 92.48% and cross-validation mean accuracy to 92.80% (standard deviation 0.9%). F1-scores improved across all classes, with mixed hearing loss showing a notable increase from 0.86 to 0.92. Feature importance analysis identified audiometric thresholds, ontology-derived severity labels, and medical history as the top predictors, which supports clinical interpretability.

Conclusion: This study demonstrates that combining a SNOMED CT-aligned ontology with Random Forest classification achieves higher diagnostic accuracy than K-Means-based labeling and enables personalized treatment recommendations for hearing loss. The hybrid framework provides clinically interpretable decision support while ensuring semantic interoperability with electronic health records. Multi-institutional validation studies are necessary to assess generalizability across diverse populations before clinical deployment.
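
The evaluation protocol named in the Methods (a stratified 80-20 split, then accuracy, per-class precision, recall, F1-score, and log loss) can be illustrated with a minimal standard-library sketch. This is not the authors' code: the function names, the synthetic label counts, and the random seed are assumptions made for illustration, and the Random Forest model itself is not reproduced here, only the splitting and scoring steps.

```python
import math
import random
from collections import Counter, defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) with class proportions preserved in each part."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train_idx, test_idx = [], []
    for y in sorted(by_class):
        idx = by_class[y][:]
        rng.shuffle(idx)
        # Take the same fraction of every class for the test set.
        n_test = round(len(idx) * test_fraction)
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_prf(y_true, y_pred, cls):
    """Precision, recall, and F1 for one class (one-vs-rest)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == cls and p == cls for t, p in pairs)
    fp = sum(t != cls and p == cls for t, p in pairs)
    fn = sum(t == cls and p != cls for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def log_loss(y_true, proba, eps=1e-15):
    """Mean negative log-probability assigned to the true class.

    `proba` is a list of dicts mapping class name -> predicted probability.
    """
    total = 0.0
    for t, p in zip(y_true, proba):
        clipped = min(max(p[t], eps), 1 - eps)  # avoid log(0)
        total -= math.log(clipped)
    return total / len(y_true)


if __name__ == "__main__":
    # Synthetic labels only; the class counts are invented for this demo.
    labels = (["normal"] * 40 + ["sensorineural"] * 30
              + ["conductive"] * 20 + ["mixed"] * 10)
    train_idx, test_idx = stratified_split(labels)
    print(len(train_idx), len(test_idx))
    print(Counter(labels[i] for i in test_idx))
```

In a setup like the one described, the same scoring functions would be applied twice, once to the model trained on K-Means-derived labels and once to the model trained on ontology-derived labels, so that the two pipelines are compared on identical metrics.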
