qcGEM: a graph-based molecular representation with quantum chemistry awareness

Abstract

The advancement of artificial intelligence (AI) has reshaped drug discovery. AI-based models typically rely on molecular representations for prediction. However, the absence of physically grounded information in mainstream molecular representations not only limits model performance in practical applications but also hinders mechanistic understanding and exploitation by humans. To overcome this issue, we introduce qcGEM, a quantum-chemistry-aware, graph-based embedding of molecules that incorporates physical priors into molecular representation learning. By integrating quantum chemistry knowledge with a physics-inspired architecture, qcGEM provides a compact, physics-informed molecular representation that supports a diverse range of downstream applications. In particular, qcGEM demonstrates state-of-the-art performance across a broad range of molecule-related benchmarks, as evidenced by comprehensive evaluations on 71 tasks, including molecular property prediction, activity cliff detection, protein-ligand interaction modeling, and opioid drug classification, while also offering strong interpretability at multiple representation levels. We additionally propose a simplified variant, qcGEM-Hybrid, which offers substantially faster embedding generation with robust performance. Overall, our method provides an advanced molecular representation that will benefit molecule-related modeling and prediction, supporting further progress in AI-aided drug discovery.
