Predicting Organ-Specific Toxicity of Selective Androgen Receptor Modulators, using Transfer Learning on Graph Convolutional Networks

Alexander D. Kalian
Arthur C. Silva
Jaewook Lee
Jean-Lou C.M. Dorne
Claire Potter
Emilio Benfenati
Olivia J. Osborne
Miao Guo
Christer Hogstrand

Abstract

Novel Quantitative Structure-Activity Relationship (QSAR) models were constructed using Graph Convolutional Networks (GCNs) to predict Drug-Induced Liver Injury (DILI), Drug-Induced Renal Injury (DIRI) and Drug-Induced Cardiotoxicity (DICT) of Selective Androgen Receptor Modulators (SARMs), an emerging class of performance-enhancing drugs. Before training on the DILI, DIRI and DICT datasets, the GCN QSAR models were pre-trained on a variety of unrelated biomedical assay datasets, in an attempt to improve model performance via transfer learning. The transfer learning had mixed success: pre-training on certain datasets measurably improved model performance, but the increases were statistically weak. The best final QSAR models achieved overall accuracy scores of 68% for DILI (no significant improvement via ensemble modelling), 76% for DIRI (improved to 77% via ensemble modelling) and 65% for DICT (improved to 67% via ensemble modelling). Applying the best single models to a dataset of 25 SARMs predicted that 21 of the 25 are DILI-positive, DIRI-positive, or both, which raises concern given the rising use of SARMs. All but one of the SARMs were predicted to be DICT-negative. A novel definition of the Applicability Domain (AD) was used, intended to be closely relevant to the models: three-dimensional graph embeddings were generated for each model, and a convex hull with a ±10% buffer was fitted around the training data embeddings. The AD of a given model was defined as the region of embedded chemical space covered by its convex hull. Subsequent analysis found that a majority of the DILI, DIRI and DICT testing data, as well as a majority of the SARMs, lay within the AD, supporting the reliability of the predictions.
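The convex-hull Applicability Domain described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that the ±10% buffer means scaling the hull about the centroid of the training embeddings by a factor of 1.10, which is only one possible reading. All function names are hypothetical, and the toy points stand in for real three-dimensional graph embeddings. The brute-force facet search is suitable only for small, non-degenerate point sets; a library routine such as `scipy.spatial.ConvexHull` would be the usual choice at scale.

```python
from itertools import combinations
from math import sqrt


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def expand_about_centroid(points, buffer=0.10):
    """Scale points away from their centroid by (1 + buffer).

    Assumption: this is how the "10% buffer" is modelled in this sketch.
    """
    n = len(points)
    centroid = tuple(sum(axis) / n for axis in zip(*points))
    return [tuple(c + (1.0 + buffer) * (x - c) for x, c in zip(p, centroid))
            for p in points]


def hull_facets(points, eps=1e-9):
    """Brute-force 3D convex hull as a list of (outward unit normal, offset).

    A plane through three points is a hull facet if every point lies on
    one side of it. Every point x of the hull satisfies normal . x <= offset.
    """
    facets = []
    for a, b, c in combinations(points, 3):
        normal = _cross(_sub(b, a), _sub(c, a))
        length = sqrt(_dot(normal, normal))
        if length < eps:  # collinear triple, defines no plane
            continue
        normal = tuple(x / length for x in normal)
        offset = _dot(normal, a)
        signed = [_dot(normal, p) - offset for p in points]
        if all(s <= eps for s in signed):
            facets.append((normal, offset))
        elif all(s >= -eps for s in signed):
            facets.append((tuple(-x for x in normal), -offset))
    return facets


def in_domain(point, facets, eps=1e-9):
    """True if the point lies inside or on the hull described by facets."""
    return all(_dot(normal, point) - offset <= eps for normal, offset in facets)


def coverage(points, facets):
    """Fraction of points that fall inside the applicability domain."""
    return sum(in_domain(p, facets) for p in points) / len(points)


# Toy "training embeddings": the eight corners of the unit cube.
train = [(x, y, z) for x in (0.0, 1.0) for y in (0.0, 1.0) for z in (0.0, 1.0)]
facets = hull_facets(expand_about_centroid(train, buffer=0.10))

# The second query lies outside the raw hull but inside the buffered one.
queries = [(0.5, 0.5, 0.5), (1.04, 0.5, 0.5), (1.2, 0.5, 0.5)]
flags = [in_domain(q, facets) for q in queries]
print(flags, coverage(queries, facets))
```

Under these assumptions, `coverage` gives the kind of summary the abstract reports: the share of test compounds or SARMs whose embeddings fall inside a model's AD.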

Version published to 10.1101/2025.08.27.672581 on bioRxiv
Sep 1, 2025
