Learning Chirality-Aware Representations to Predict Drug Side Effect Frequencies


Abstract

Ab initio prediction of side effect frequencies is important for assessing the risk–benefit profile of drugs and for identifying potential adverse effects early in development. A key challenge is chirality: many drugs exist as enantiomers, pairs of molecules with the same atoms and bond connectivity but different three-dimensional arrangements. Although chemically similar, enantiomers can interact differently with biological targets and therefore exhibit distinct efficacy and adverse-effect profiles. Here we introduce F2S (Features to Signatures), a method to predict the frequencies of drug side effects while explicitly accounting for chirality. Drug representations are learned directly from chemical structure using a directed-bond message-passing graph neural network that captures stereochemical configurations. Side effect representations are derived from curated textual descriptions encoded with a frozen PubMedBERT model. Side effect frequencies are predicted from the dot product between drug and side effect signatures together with biases for drugs and side effects. We evaluated F2S extensively across multiple settings, including cold-start and warm-start prediction, prospective evaluation, and scenarios controlling for chemical similarity between training and test drugs. Across these evaluations, F2S achieves performance comparable to state-of-the-art methods for general side effect frequency prediction while producing fewer false positives, and it substantially improves the prediction of frequency differences between enantiomer pairs. Finally, F2S learns compact 10-dimensional signatures that support interpretability: drug signatures reflect therapeutic class and shared targets, side effect signatures capture phenotype similarity, and the learned bias terms correlate with the popularity of drugs and side effects.
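
The scoring rule described in the abstract (a dot product between drug and side effect signatures plus a bias for each) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the random placeholder signatures, and the absence of any global offset, output scaling, or clipping are all assumptions made for illustration. Only the 10-dimensional signature size and the dot-product-plus-biases form come from the abstract.

```python
import numpy as np

SIGNATURE_DIM = 10  # signature size stated in the abstract


def predict_frequency(drug_sig, se_sig, drug_bias, se_bias):
    """Score one (drug, side effect) pair: dot product plus the two biases.

    Hypothetical helper; the abstract does not say whether a global
    offset or an output transformation is also applied.
    """
    return float(np.dot(drug_sig, se_sig) + drug_bias + se_bias)


def predict_matrix(drug_sigs, se_sigs, drug_biases, se_biases):
    """Score every drug against every side effect at once.

    drug_sigs:   (n_drugs, SIGNATURE_DIM)
    se_sigs:     (n_side_effects, SIGNATURE_DIM)
    drug_biases: (n_drugs,)
    se_biases:   (n_side_effects,)
    Returns an (n_drugs, n_side_effects) array of predicted scores.
    """
    return drug_sigs @ se_sigs.T + drug_biases[:, None] + se_biases[None, :]


if __name__ == "__main__":
    # Random placeholders stand in for learned signatures and biases.
    rng = np.random.default_rng(0)
    drug_sigs = rng.normal(size=(2, SIGNATURE_DIM))
    se_sigs = rng.normal(size=(5, SIGNATURE_DIM))
    drug_biases = rng.normal(size=2)
    se_biases = rng.normal(size=5)

    scores = predict_matrix(drug_sigs, se_sigs, drug_biases, se_biases)
    print(scores.shape)
```

In this form, two enantiomers receive different predictions only if their signatures or biases differ, which is why the abstract emphasizes a drug encoder that distinguishes stereochemical configurations.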
