Decoding proteome functional information in model organisms using protein language models

Abstract

The gap between available protein sequence information and the experimental determination of protein function cannot currently be closed. Automated computational function prediction is not significantly improving function assignment, so there is a pressing need for alternative methods, since standard approaches cannot bridge the gap. Modern machine learning methods have recently been developed to predict function, with deep convolutional neural networks being a popular choice. Protein language models have also been tested and proved reliable on curated datasets, but they have not yet been applied to full collections of curated proteomes. We tested how two different machine-learning-based methods perform when decoding the functional information from the proteomes of selected model organisms. We found that protein language models are more precise and informative across Gene Ontology categories for all the species, recovering functional information from transcriptomics experiments. These results indicate that these methods are suitable alternatives for large-scale annotation and further downstream analyses, and on this basis we propose a guide for their use.
