Decoding functional proteome information in model organisms using protein language models

Israel Barrios-Núñez
Gemma I Martínez-Redondo
Patricia Medina-Burgos
Ildefonso Cases
Rosa Fernández
Ana M Rojas


Abstract

Protein language models have been shown to be reliable on curated datasets but have not yet been applied to full proteomes. Accordingly, we tested how two different machine learning-based methods performed when decoding functional information from the proteomes of selected model organisms. We found that protein language models are more precise and informative than deep learning methods for all the species tested and across the three gene ontologies studied, and that they better recover functional information from transcriptomic experiments. These results indicate that protein language models are likely to be suitable for large-scale annotation and downstream analyses, and we propose a guide for their use.

Version published to 10.1093/nargab/lqae078: Jul 2, 2024
Version published to 10.1101/2024.02.14.580341 on bioRxiv: Feb 15, 2024

Related articles

Artificial Intelligence–Driven Structural Mining Enables Functional Inference in the Human Dark Proteome

This article has 7 authors:
1. Valentina Carbonari
2. Annamaria Defilippo
3. Ugo Lomoio
4. Caterina Francesca Perri
5. Barbara Puccio
6. Pierangelo Veltri
7. Pietro Hiram Guzzi
This article has no evaluations. Latest version: Dec 23, 2025

A Survey on Efficient Protein Language Models

This article has 8 authors:
1. Shouren Wang
2. Debargha Ganguly
3. Vinooth Kulkarni
4. Wang Yang
5. Zhuoran Qiao
6. Daniel Blankenberg
7. Vipin Chaudhary
8. Xiaotian Han
This article has no evaluations. Latest version: Dec 24, 2025

Convolutional Deep Learning Approach to identify DNA Sequences for Gene Prediction

This article has 2 authors:
1. Jesus Antonio Motta
2. Pedro David Gomez
This article has no evaluations. Latest version: Jan 27, 2026
