Deep Learning enabled discovery of kinase drug targets in Pharos
Abstract
We use machine learning with standardized molecular structure and Gene Ontology data to predict ligand interactions for a set of human kinases. We do this by leveraging information from the TCRD / Pharos database, developed and maintained within the Illuminating the Druggable Genome (IDG) project.
Pharos collects biochemical and clinically relevant information on a large set of biologically important human proteins from publicly available sources, including scientific publications and specialized databases. The 635 kinases listed in Pharos are classified into levels reflecting the relative amount and type of accumulated information. Importantly, molecular structure and Gene Ontology annotations are available for the entire set, but only 455 of the kinases have recorded ligand affinity data.
We developed a deep neural network-based framework to predict the ligand affinity profile of a kinase using generally available information (molecular structure and Gene Ontology annotations) as input. The input data are organized into a 2,770-dimensional vector with binary entries. The output data are predicted affinity values for interactions between the respective kinase and possible ligands.
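As an illustration of what a binary input encoding of this kind could look like, the following minimal sketch concatenates a set of active structure bits with a multi-hot Gene Ontology annotation vector. The function name `encode_kinase`, the toy dimensions, and the three-term vocabulary are assumptions made for the example; the abstract does not state how the 2,770 entries are divided between structure and annotation features.

```python
import numpy as np


def encode_kinase(structure_bits, go_terms, n_structure_bits, go_vocabulary):
    """Build one binary feature vector for a kinase (hypothetical layout).

    The first n_structure_bits entries mark active structure descriptors;
    the remaining entries are a multi-hot encoding of GO annotations.
    """
    vec = np.zeros(n_structure_bits + len(go_vocabulary), dtype=np.uint8)

    # Switch on the structure descriptor bits.
    for i in structure_bits:
        vec[i] = 1

    # Switch on one bit per GO term that is present in the vocabulary.
    go_index = {term: j for j, term in enumerate(go_vocabulary)}
    for term in go_terms:
        if term in go_index:
            vec[n_structure_bits + go_index[term]] = 1
    return vec


# Toy example: 5 structure bits plus a 3-term GO vocabulary (illustrative only).
go_vocab = ["GO:0004672", "GO:0005524", "GO:0006468"]
x = encode_kinase([0, 3], ["GO:0005524"], n_structure_bits=5, go_vocabulary=go_vocab)
```

In this layout every kinase maps to a vector of the same length regardless of how many annotations it carries, which is what allows kinases without ligand data to be passed through the same model.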
To address the very large number of possible ligands (58,800) and the sparsity of available binding data, we organized the ligands into 5,275 clusters based on structural similarity measures. Our model framework is trained to predict likely interactions between kinases and these ligand clusters.
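One common way to group ligands by structural similarity is to compare fingerprint bit sets with the Tanimoto coefficient and assign each ligand to the first cluster whose representative is similar enough. The sketch below uses that approach purely as an assumption: the abstract does not name the similarity measure, the clustering algorithm, or a threshold, and `tanimoto`, `leader_cluster`, and the toy fingerprints are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of fingerprint bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def leader_cluster(fingerprints, threshold=0.6):
    """Greedy leader clustering: join the first sufficiently similar leader,
    otherwise start a new cluster with this ligand as its leader."""
    leaders = []
    labels = []
    for fp in fingerprints:
        for cluster_id, leader in enumerate(leaders):
            if tanimoto(fp, leader) >= threshold:
                labels.append(cluster_id)
                break
        else:
            # No existing leader was similar enough.
            leaders.append(fp)
            labels.append(len(leaders) - 1)
    return labels


# Toy fingerprints, each given as the set of its "on" bit positions.
fps = [{1, 2, 3, 4}, {1, 2, 3, 5}, {10, 11, 12}, {10, 11, 13}, {20}]
labels = leader_cluster(fps, threshold=0.5)
```

Replacing individual ligands with cluster labels shrinks the output space and pools the sparse binding records of similar compounds into a single target.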
We aim to identify, for a given kinase, sets of likely ligand partners with high predicted relative affinities. We measure performance by how efficiently the model recovers known ligand partners of documented kinases that were not included in the training data. Our results indicate that our model framework can identify sets of ligands that contain a significant fraction of the correct (known) ligand partners.
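The evaluation idea described here can be sketched as a recall-at-k computation: rank the ligand clusters by predicted affinity for a held-out kinase, keep the top k, and count what fraction of the known partner clusters falls inside that set. The function name `recall_at_k`, the choice of k, and the toy scores are assumptions for illustration; the abstract does not specify the exact metric.

```python
import numpy as np


def recall_at_k(predicted_scores, known_clusters, k):
    """Fraction of known partner clusters found among the k top-scoring clusters."""
    known = set(known_clusters)
    if not known:
        return float("nan")  # undefined when the kinase has no known partners

    # Indices of the k highest predicted affinities, best first.
    top_k = np.argsort(predicted_scores)[::-1][:k]
    return len(known & set(top_k.tolist())) / len(known)


# Toy example: six ligand clusters, three of which are known partners.
scores = np.array([0.1, 0.9, 0.3, 0.8, 0.05, 0.7])
r = recall_at_k(scores, known_clusters=[1, 5, 4], k=3)
```

Averaging this quantity over all held-out kinases gives a single figure for how much of the known ligand set is captured at a given shortlist size.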