Rethinking Representation Complexity in Drug–Target Prediction via Supervised Vector Quantization
Abstract
Accurate prediction of drug–target interactions (DTIs) is crucial for computational drug discovery. Although pretrained language models offer richer molecular and protein representations, their increasing complexity does not always lead to better predictive performance. Redundant or irrelevant features can obscure biologically relevant patterns. In this study, we systematically evaluate the contribution of complex features in DTI prediction and demonstrate that only a portion of these features is truly informative. Based on this insight, we propose a Vector Quantization (VQ)-based module that functions as a plug-and-play feature selection layer within deep learning architectures. When combined with a simple fully connected classifier, this supervised VQ (SVQ) framework not only outperforms recent state-of-the-art DTI methods but also enhances interpretability by learning discriminative codewords. This work highlights the importance of input feature selection in deep learning and offers a new perspective for constructing robust and interpretable DTI prediction models.
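To make the idea of a VQ layer concrete, here is a minimal sketch of the generic nearest-codeword lookup that underlies vector quantization. It is not the authors' SVQ module: the abstract does not describe how the codebook is trained or how it is wired into the classifier, so the function names (`squared_distance`, `quantize`) and the fixed toy codebook are assumptions made purely for illustration. In a supervised setting the codewords would presumably be learned jointly with the classifier rather than set by hand.

```python
from typing import Sequence, Tuple

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(feature: Vector, codebook: Sequence[Vector]) -> Tuple[int, Vector]:
    """Return (index, codeword) for the codeword nearest to `feature`.

    This is only the lookup step of vector quantization; learning the
    codebook (supervised or otherwise) is out of scope for this sketch.
    """
    index = min(
        range(len(codebook)),
        key=lambda k: squared_distance(feature, codebook[k]),
    )
    return index, codebook[index]


# Hypothetical toy codebook with three 2-dimensional codewords.
codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 0.0]]

# A noisy input feature is replaced by its closest codeword,
# which discards the small deviations from that prototype.
index, codeword = quantize([0.9, 1.2], codebook)
print(index, codeword)
```

In a pipeline of the kind the abstract outlines, the quantized output (or the codeword index) would be passed to a downstream classifier in place of the raw high-dimensional embedding, which is one way such a layer could act as a feature selection step.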