Discovery of Novel Natural Product-Derived EGFR Inhibitors Using Multiple Linear Regression, Stacked Ensemble Regression, and Fingerprinting Approaches


Abstract

This study developed and validated quantitative structure-activity relationship (QSAR) models to predict the inhibitory activity (\(\mathrm{pIC}_{50}\)) of 225 EGFR inhibitors. A genetic algorithm selected eight molecular descriptors, which were used to construct two models: a multiple linear regression (MLR) model and a stacked ensemble regression (SER) model. The SER model showed only marginally higher accuracy (\(\Delta r^{2} = +0.022\)) but exhibited greater instability (\(\Delta r_{m(\mathrm{test})}^{2} = 0.0802\) vs. 0.0184 for MLR) and reduced interpretability. MLR was therefore retained as the primary model because of its OECD-compliant mechanistic transparency and superior generalizability. Rigorous applicability domain analysis confirmed the MLR model's reliability. Molecular docking (PDB ID: 8A27) identified a top-ranked inhibitor (Compound 121) with high binding affinity (-12.023 kcal/mol) that forms critical hydrogen bonds and hydrophobic interactions with the EGFR active site. Virtual screening of 32 structural analogs of Compound 121 revealed additional promising candidates. This work provides a robust framework for EGFR inhibitor discovery, combining computational modeling with structural insights.
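The abstract compares the two models on the test-set \(\Delta r_{m}^{2}\) metric without saying how it was computed. The sketch below assumes the commonly used definition, \(r_{m}^{2} = r^{2}\,(1 - \sqrt{|r^{2} - r_{0}^{2}|})\), where \(r_{0}^{2}\) comes from a regression forced through the origin and \(\Delta r_{m}^{2}\) is the absolute difference between the values obtained with the observed/predicted axes swapped. The function names and the sample values are hypothetical and are not taken from the study.

```python
import math


def squared_pearson(x, y):
    """Squared Pearson correlation coefficient between two sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


def r2_through_origin(x, y):
    """r0^2 for the regression of y on x with the intercept fixed at zero."""
    slope = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)
    mean_y = sum(y) / len(y)
    ss_res = sum((b - slope * a) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return 1.0 - ss_res / ss_tot


def rm2_metrics(observed, predicted):
    """Return r_m^2, its axis-swapped counterpart, and their absolute difference."""
    r2 = squared_pearson(observed, predicted)
    r0_sq = r2_through_origin(predicted, observed)      # observed regressed on predicted
    r0_sq_rev = r2_through_origin(observed, predicted)  # predicted regressed on observed
    rm2 = r2 * (1.0 - math.sqrt(abs(r2 - r0_sq)))
    rm2_rev = r2 * (1.0 - math.sqrt(abs(r2 - r0_sq_rev)))
    return {
        "rm2": rm2,
        "rm2_reversed": rm2_rev,
        "delta_rm2": abs(rm2 - rm2_rev),
    }


# Hypothetical pIC50 values for a small test set (illustrative only).
observed_pic50 = [6.1, 7.4, 8.0, 5.5, 6.9]
predicted_pic50 = [6.3, 7.1, 7.8, 5.9, 6.6]
metrics = rm2_metrics(observed_pic50, predicted_pic50)
print(metrics)
```

Under this definition, a smaller \(\Delta r_{m}^{2}\) means the model's agreement with experiment depends less on which axis is treated as the reference, which matches the abstract's reading of the lower MLR value as the more stable result.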
