Rethinking Representation Complexity in Drug–Target Prediction via Supervised Vector Quantization

Jiandong Chen
Yao-zhong Zhang
Lu Lu
Meixi Wu
Zhiang Chen
Seiya Imoto
Chen Li


Abstract

Accurate prediction of drug–target interactions (DTIs) is crucial for computational drug discovery. Although pretrained language models offer richer molecular and protein representations, their increasing complexity does not always lead to better predictive performance. In many cases, the inclusion of redundant or irrelevant features may obscure biologically relevant patterns. In this study, we systematically evaluate the contribution of complex features in DTI prediction and demonstrate that only a portion of these features is truly informative. Based on this insight, we propose a Vector Quantization (VQ)-based module that functions as a plug-and-play feature selection layer within deep learning architectures. When combined with a simple fully connected classifier, this supervised VQ (SVQ) framework not only surpasses recent state-of-the-art DTI methods in performance, but also enhances interpretability through the learning of discriminative codewords. This work highlights the importance of input feature selection in deep learning and offers a new perspective for constructing robust and interpretable DTI prediction models.
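The abstract does not describe the SVQ architecture, so the following is only a minimal sketch of the generic operation that any vector quantization layer builds on: replacing each input feature vector with its nearest codeword from a codebook. The function name `quantize`, the toy codebook, and the toy feature vectors are all hypothetical. In a trained model the codebook would be learned (here, presumably under supervision from the interaction labels) rather than fixed by hand. Plain Python is used to keep the example dependency-free.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)


def quantize(features, codebook):
    """Map each feature vector to its nearest codeword (generic VQ lookup).

    This is an illustrative assumption, not the authors' implementation.
    """
    indices = []
    quantized = []
    for vec in features:
        # Index of the codeword with the smallest Euclidean distance to vec.
        nearest = min(range(len(codebook)), key=lambda k: dist(vec, codebook[k]))
        indices.append(nearest)
        quantized.append(list(codebook[nearest]))
    return indices, quantized


# Hypothetical toy data: three 2-D "embeddings" and a two-entry codebook.
codebook = [[0.0, 0.0], [1.0, 1.0]]
features = [[0.1, -0.2], [0.9, 1.2], [0.4, 0.3]]
indices, quantized = quantize(features, codebook)
```

The returned indices show which codeword each input was assigned to. Inspecting such assignments is one way discrete codewords can support interpretation, and passing the quantized vectors on in place of the raw features is what lets a downstream classifier stay simple.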

Version published to bioRxiv on Jun 20, 2025 (DOI: 10.1101/2025.06.13.659523)

Related articles

Integrating Evolutionary and Compositional Features with ML and DL for Robust and Interpretable Druggable Protein Prediction

This article has 5 authors:
1. Mujeebu Rehman
2. Qinghua Liu
3. Muhammad Javed
4. Ali Ghulam
5. Teerath Kumar
This article has no evaluations. Latest version: Dec 11, 2025

Discrimination vs. Generation: The Machine Learning Dichotomy for Dopaminergic Hit Discovery

This article has 4 authors:
1. Temitope Sobodu
2. Dong Kong
3. Adeshina Yusuf
4. Dan Kiel
This article has no evaluations. Latest version: Dec 16, 2025

DCPM-ADMET: Fusion of Dual-channel Pre-trained Model and Molecular Fingerprints to enhance Drug ADMET Properties Prediction

This article has 7 authors:
1. Yuchen Zeng
2. Yue Qi
3. Leilei Zhang
4. Kaili Jiang
5. Xiaofei Zhou
6. Lu Liang
7. Jianping Lin
This article has no evaluations. Latest version: Dec 19, 2025
