Cross-chemical and cross-species toxicity prediction: benchmarking and a novel 3D-structure-based deep learning model


Abstract

Prediction of a compound’s toxicity is a key step toward realizing animal-free testing of chemical compounds. Recent advances have yielded significant progress in computational toxicity prediction, including machine learning methods that utilize chemical fingerprints and deep-learning-based latent representations. However, challenges remain, primarily due to the lack of clean training datasets and inconsistent model performance. To address these challenges, we curated a comprehensive dataset of aquatic toxicity from seven data sources, which contains 50,603 records for 5,889 compounds across 2,285 different species, much larger than similar datasets used in previous studies. We also developed tox-learn, a Python library featuring tools for automated dataset cleaning, machine learning methods, and performance evaluation. The library places special emphasis on avoiding overestimation of prediction accuracy caused by improper train-test data splitting. Based on this toolbox, we benchmarked various predictive models using different train-test splitting strategies on the curated dataset. Our results showed that the choice of machine learning method, molecular fingerprint, and train-test splitting strategy all significantly affect performance. We demonstrated that incorporating species information generally improved predictions, although the degree of improvement depended on how this information was represented. In addition, we developed a new 3D-structure-based deep-learning model, 3DMol-Tox, which achieves regression accuracy comparable to the best 2D-structure-based model (GPBoost) while exhibiting consistently higher within-one-bin (W1B) classification accuracy. Finally, we analyzed the impact of different train-test splitting strategies and provide recommendations based on our benchmarking, such as using structure-aware splitting to mitigate information leakage, a common issue that inflates reported model performance.
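The structure-aware splitting recommended above can be illustrated in its simplest form as group-aware splitting: all records for a given compound are kept on the same side of the train-test boundary, so a model is never evaluated on a compound it has already seen during training. The sketch below uses only the Python standard library; the record fields and function name are illustrative assumptions, not the tox-learn API.

```python
import random

def compound_aware_split(records, test_frac=0.2, seed=0):
    """Split toxicity records so no compound appears in both train
    and test sets (a minimal, group-aware form of structure-aware
    splitting that prevents per-compound information leakage)."""
    compounds = sorted({r["compound"] for r in records})
    rng = random.Random(seed)
    rng.shuffle(compounds)
    n_test = max(1, int(len(compounds) * test_frac))
    test_compounds = set(compounds[:n_test])
    train = [r for r in records if r["compound"] not in test_compounds]
    test = [r for r in records if r["compound"] in test_compounds]
    return train, test

# Toy records: each compound is measured in two species, so a naive
# random split over records would likely leak compounds across sets.
records = [
    {"compound": c, "species": s, "toxicity": 1.0}
    for c in ["benzene", "phenol", "toluene", "aniline", "xylene"]
    for s in ["D. magna", "O. mykiss"]
]
train, test = compound_aware_split(records)
# No compound appears on both sides of the split.
assert {r["compound"] for r in train}.isdisjoint(
    {r["compound"] for r in test}
)
```

Stricter variants group by molecular scaffold rather than by exact compound identity, which further reduces leakage between structurally similar molecules.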
