Injecting domain knowledge into graph neural networks for protein-protein interactions

Abstract

Most machine learning methods that explore protein-protein interactions rely exclusively on the network of protein interactions as input data. However, protein repositories are rich in information describing multiple aspects of proteins, and ignoring this information may be detrimental to predictive performance. In this paper, we investigate how injecting domain knowledge (the molecular functions proteins perform, the biological processes they are involved in, and their cellular location) affects the performance of graph neural networks on two fundamental predictive tasks: protein-protein interaction prediction and protein function prediction. We propose a novel approach that injects protein-level information, in the form of knowledge graph embeddings computed over the Gene Ontology, into state-of-the-art graph neural networks, replacing ad hoc strategies for building node features. We assess the impact of our knowledge injection approach on two popular benchmarks: the Open Graph Benchmark (based on STRING) and the Human Reference Interactome. Experiments conducted with ten state-of-the-art graph neural network approaches and five representative knowledge graph embedding methods demonstrate that domain knowledge injection significantly improves the performance of graph neural networks in both protein function prediction and protein-protein interaction prediction.
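
The core idea of the abstract, using knowledge graph embeddings as node features in a graph neural network, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a single GCN-style layer written in plain NumPy, a toy four-protein interaction graph, and randomly generated vectors standing in for Gene Ontology embeddings. All names (`go_embeddings`, `gcn_layer`, `link_score`) are hypothetical.

```python
import numpy as np

def normalized_adjacency(edges, num_nodes):
    """Symmetric-normalized adjacency with self-loops: D^-1/2 (A + I) D^-1/2."""
    a = np.eye(num_nodes)
    for u, v in edges:
        a[u, v] = 1.0
        a[v, u] = 1.0
    deg_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    return a * deg_inv_sqrt[:, None] * deg_inv_sqrt[None, :]

def gcn_layer(a_hat, features, weight):
    """One graph-convolution step: ReLU(A_hat @ X @ W)."""
    return np.maximum(a_hat @ features @ weight, 0.0)

def link_score(hidden, u, v):
    """Sigmoid of the dot product of two node representations."""
    return 1.0 / (1.0 + np.exp(-(hidden[u] @ hidden[v])))

rng = np.random.default_rng(0)
num_proteins, emb_dim, hidden_dim = 4, 8, 5
edges = [(0, 1), (1, 2), (2, 3)]  # toy interaction network (assumption)

# Assumption: random vectors stand in for knowledge graph embeddings that
# would, in practice, be computed over the Gene Ontology for each protein.
go_embeddings = rng.normal(size=(num_proteins, emb_dim))

# Untrained layer weights; a real model would learn these.
weight = rng.normal(size=(emb_dim, hidden_dim)) * 0.1

a_hat = normalized_adjacency(edges, num_proteins)
hidden = gcn_layer(a_hat, go_embeddings, weight)  # injected features propagate
score = link_score(hidden, 0, 3)                  # interaction score in (0, 1)
```

The only change relative to a graph-only baseline is the feature matrix: where such a baseline might use constant or one-hot node features, the embeddings take their place, and the message-passing step is unchanged.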
