Geometry-based BERT: an experimentally validated deep learning model for molecular property prediction in drug discovery

Abstract

Deep learning-based methods have significantly impacted drug discovery, and developing such methods to identify novel structural types of active compounds has become an urgent challenge. In this paper, we introduce GEO-BERT, a self-supervised representation learning framework. GEO-BERT takes the atoms and chemical bonds of a chemical structure as input and integrates positional information from the molecule's three-dimensional conformation during training. Specifically, GEO-BERT enhances its ability to characterize molecular structures by introducing three different positional relationships: atom-atom, bond-bond, and atom-bond. In benchmarking studies, GEO-BERT demonstrated optimal performance on multiple benchmarks. We also performed a prospective study to validate the GEO-BERT model, using screening for DYRK1A inhibitors as a case study. Two potent and novel DYRK1A inhibitors (IC50 < 1 μM) were ultimately discovered, at a hit rate of 10%. Taken together, we have developed the Geometry-based BERT model for molecular property prediction and demonstrated its practical utility in early-stage drug discovery.
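
As a rough illustration of the three kinds of positional relationship named in the abstract, the sketch below derives one plausible geometric quantity for each from 3D coordinates. The abstract does not define these relationships, so the choices here (interatomic distance for atom-atom, the angle between bonds sharing an atom for bond-bond, and atom-to-bond-midpoint distance for atom-bond) are assumptions made for illustration only. The function names and the toy coordinates are hypothetical and are not taken from GEO-BERT.

```python
import numpy as np

# Hypothetical toy input: 3D coordinates (in angstroms) of a bent three-atom
# molecule, with its two bonds given as atom-index pairs. Not data from the paper.
coords = np.array([
    [0.0, 0.0, 0.0],  # atom 0 (central atom)
    [1.0, 0.0, 0.0],  # atom 1
    [0.0, 1.0, 0.0],  # atom 2
])
bonds = [(0, 1), (0, 2)]


def atom_atom_distances(coords):
    """Assumed atom-atom relation: pairwise Euclidean distances between atoms."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.linalg.norm(diff, axis=-1)


def bond_bond_angles(coords, bonds):
    """Assumed bond-bond relation: angle in degrees between two bonds that
    share exactly one atom. Entries for all other bond pairs are NaN."""
    n = len(bonds)
    angles = np.full((n, n), np.nan)
    for i, (a1, b1) in enumerate(bonds):
        for j, (a2, b2) in enumerate(bonds):
            if i == j:
                continue
            shared = {a1, b1} & {a2, b2}
            if len(shared) != 1:
                continue
            center = shared.pop()
            other1 = b1 if a1 == center else a1
            other2 = b2 if a2 == center else a2
            v1 = coords[other1] - coords[center]
            v2 = coords[other2] - coords[center]
            cos = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
            angles[i, j] = np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))
    return angles


def atom_bond_distances(coords, bonds):
    """Assumed atom-bond relation: distance from each atom to each bond midpoint."""
    midpoints = np.array([(coords[a] + coords[b]) / 2.0 for a, b in bonds])
    diff = coords[:, None, :] - midpoints[None, :, :]
    return np.linalg.norm(diff, axis=-1)


aa = atom_atom_distances(coords)         # shape (3, 3)
bb = bond_bond_angles(coords, bonds)     # shape (2, 2)
ab = atom_bond_distances(coords, bonds)  # shape (3, 2)
```

In a model of this kind, matrices like these would typically be discretized or embedded and supplied alongside the atom and bond tokens. How GEO-BERT actually encodes them is not stated in the abstract.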

Graphical Abstract

GEO-BERT is a model pretrained on large-scale drug molecule data that improves the accuracy of property prediction by using three-dimensional structural information within the molecule.
