Predicting Organ-Specific Toxicity of Selective Androgen Receptor Modulators, using Transfer Learning on Graph Convolutional Networks
Abstract
Novel Quantitative Structure-Activity Relationship (QSAR) models were constructed using Graph Convolutional Networks (GCNs) to predict Drug-Induced Liver Injury (DILI), Drug-Induced Renal Injury (DIRI) and Drug-Induced Cardiotoxicity (DICT) of Selective Androgen Receptor Modulators (SARMs), an emerging class of performance-enhancing drugs. Before training on the DILI, DIRI and DICT datasets, the GCN QSAR models were pre-trained on a variety of unrelated biomedical assay datasets in an attempt to improve model performance via transfer learning. The success of the transfer learning was mixed: pre-training on certain datasets measurably improved model performance, but the increases were statistically weak. The best final QSAR models achieved overall accuracy scores of 68% for DILI (no significant improvement via ensemble modelling), 76% for DIRI (improved to 77% via ensemble modelling) and 65% for DICT (improved to 67% via ensemble modelling). Applying the best single models to a dataset of 25 SARMs predicted that 21 of the 25 are DILI-positive, DIRI-positive, or both, which raises concern given the rising use of SARMs. All but one of the SARMs were predicted to be DICT-negative. A novel definition of the Applicability Domain (AD) was used, intended to be closely relevant to the models: three-dimensional graph embeddings were generated for each model, and a convex hull with a ±10% buffer was fitted around the training data embeddings. The AD of a given model was defined as the region of embedded chemical space covered by its convex hull. Subsequent analysis found that a majority of the DILI, DIRI and DICT testing data lay within the AD, as did a majority of the SARMs, supporting the reliability of the predictions.
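The convex-hull definition of the AD described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the three-dimensional embeddings are already available as NumPy arrays and that the 10% buffer is applied by scaling the hull vertices outward from their centroid; the abstract does not say how the buffer was constructed. The function names `fit_buffered_hull` and `in_domain` are hypothetical.

```python
import numpy as np
from scipy.spatial import ConvexHull


def fit_buffered_hull(train_emb, buffer=0.10):
    """Fit a convex hull around training embeddings and expand it by `buffer`.

    Assumption (not stated in the abstract): the buffer is applied by scaling
    the hull vertices away from their centroid by a factor of (1 + buffer).
    """
    train_emb = np.asarray(train_emb, dtype=float)
    vertices = train_emb[ConvexHull(train_emb).vertices]
    centroid = vertices.mean(axis=0)
    expanded = centroid + (1.0 + buffer) * (vertices - centroid)
    return ConvexHull(expanded)


def in_domain(hull, query_emb, tol=1e-9):
    """Return a boolean mask: True where a query embedding lies inside the hull."""
    query_emb = np.atleast_2d(np.asarray(query_emb, dtype=float))
    # Each row of hull.equations is [normal, offset]; a point x is inside the
    # hull when normal . x + offset <= 0 holds for every facet.
    normals, offsets = hull.equations[:, :-1], hull.equations[:, -1]
    signed_dist = query_emb @ normals.T + offsets
    return np.all(signed_dist <= tol, axis=1)


# Illustrative usage with synthetic data (placeholder values, not study data).
rng = np.random.default_rng(0)
train = rng.normal(size=(200, 3))   # stand-in for training-set embeddings
query = rng.normal(size=(25, 3))    # stand-in for embeddings of new compounds
ad_hull = fit_buffered_hull(train, buffer=0.10)
coverage = in_domain(ad_hull, query).mean()
print(f"Fraction of query points inside the AD: {coverage:.2f}")
```

Testing membership against the facet inequalities keeps the check to a single matrix product per batch of query compounds. A different reading of the buffer, such as offsetting each facet outward by a fixed distance, would change only `fit_buffered_hull`.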