Data-Driven Machine Learning Models for Predicting Antifungal Drug Activity

Abstract

Background: Invasive fungal infections (IFIs) represent a pressing global health threat, particularly for immunocompromised individuals, yet the development of antifungal drugs continues to lag behind that of antibacterial therapeutics. In this study, we present a data-driven machine learning framework to predict antifungal compound activity, leveraging cheminformatics and supervised learning.

Method: A curated dataset of 3,748 positive (antifungal) and 4,096 negative (non-antifungal) compounds was constructed using ChEMBL, ChemDiv, and HMDB. Chemical class assignment via NPClassifier and Tanimoto similarity filtering ensured non-overlapping, structurally meaningful training data. We extracted 217 molecular descriptors per compound and evaluated physicochemical differences between positive and negative sets, confirming statistically significant divergence in Lipinski parameters (p < 0.001).

Results: Feature selection using model-specific importance metrics identified key descriptors such as molecular weight, van der Waals surface area, and nitrogen group counts. Multiple supervised learning models were trained and evaluated using five-fold cross-validation: Random Forest (RF), Extreme Gradient Boosting (XGBoost), Support Vector Machines (SVMs with RBF, polynomial, and sigmoid kernels), and Multi-Layer Perceptron (MLP). RF and MLP achieved the highest AUCs of 0.996, with SVM-RBF and XGBoost performing comparably well. To assess generalizability, we introduced chemical class-based cross-validation, wherein compounds were partitioned by their chemical class to reduce information leakage. Despite a slight drop in metrics compared to random splits, all models retained balanced accuracies above 0.91. These results demonstrate the promise of integrating molecular informatics with machine learning for antifungal drug discovery and highlight the importance of rigorous validation strategies aligned with chemical diversity.
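
The chemical class-based cross-validation described in the abstract can be illustrated with a minimal sketch. The abstract does not say how classes were distributed across folds, so the greedy size-balancing used here is an assumption, and the function names `class_based_folds` and `split_indices` as well as the toy class labels are hypothetical placeholders rather than the authors' code or data. The only property the sketch aims to show is that every compound of a given class lands in the same fold, so no class appears in both training and test sets.

```python
from collections import defaultdict


def class_based_folds(classes, n_folds=5):
    """Return one fold index per compound such that no class spans two folds.

    `classes` holds one chemical class label per compound (hypothetical input).
    """
    # Collect the compound indices that belong to each class.
    members = defaultdict(list)
    for idx, label in enumerate(classes):
        members[label].append(idx)

    fold_sizes = [0] * n_folds
    fold_of = [None] * len(classes)

    # Assumed balancing heuristic: place the largest classes first,
    # each into the fold that currently holds the fewest compounds.
    for label in sorted(members, key=lambda c: (-len(members[c]), c)):
        target = fold_sizes.index(min(fold_sizes))
        for idx in members[label]:
            fold_of[idx] = target
        fold_sizes[target] += len(members[label])
    return fold_of


def split_indices(fold_of, held_out):
    """Split compound indices into training and test sets for one held-out fold."""
    train = [i for i, f in enumerate(fold_of) if f != held_out]
    test = [i for i, f in enumerate(fold_of) if f == held_out]
    return train, test


# Toy labels for illustration only; they are not taken from the study's dataset.
classes = ["azole", "azole", "polyene", "echinocandin",
           "azole", "polyene", "terpenoid", "alkaloid"]
folds = class_based_folds(classes, n_folds=3)
train, test = split_indices(folds, held_out=0)
```

In a random split, close structural analogues from one class can sit on both sides of the train/test boundary, which is the information leakage the abstract refers to. Grouping by class removes that overlap at the cost of less evenly sized folds. In practice, scikit-learn's `GroupKFold` provides comparable group-aware splitting.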
