A Benchmark of Evo2 Genomic AI Models for Efficient and Practical Deployment

Huimin Li
Hongyi Ji
Yuchen Zeng
Wei Lv
Jianmin Wu
Sheng Liu
Chunhua Lin
Huanming Yang
Zhaorong Li
Yubao Chen
Wei Dong

Abstract

The rapid advancement of DNA foundation language models has transformed genomics, making it possible to decipher intricate patterns and regulatory mechanisms embedded in DNA sequences. The genomic foundation model Evo2 demonstrates remarkable capabilities in decoding DNA functional patterns through cross-species pretraining. However, despite Evo2’s great potential in basic genomics research, there is currently no clear, systematic guidance on its application scenarios, performance, and optimization directions in tumor genomics, and its performance dependency on specialized hardware (such as FP8 precision on H800 GPUs) has not been empirically benchmarked. Here, we present a focused validation of Evo2 using two independent cancer genomic datasets (Bladder Urothelial Carcinoma and Ovarian Cancer). We tested Evo2 on downstream tasks, including the prediction of tumor pathogenic variants and the prediction of mutational effects, and compared its performance on A100 and H800 GPUs. The results show the critical importance of FP8 precision, which enabled the H800 to achieve 4× faster inference than the A100 with stable accuracy (AUC 0.88-0.95). The 7B-parameter model emerged as the top performer, whereas the 40B model experienced a severe performance drop (AUC falling to 0.48) on non-FP8 hardware such as the A100. These findings empirically validate Evo2’s hardware specifications and provide practical insights for researchers implementing the model with similar computational resources. Furthermore, our findings provide a framework for applying and optimizing Evo2 on downstream tasks in cancer and can guide researchers in applying it effectively in genomic studies.
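For readers less familiar with the AUC figures quoted above, the following is a minimal sketch of how an AUC can be computed from per-variant scores and binary pathogenic/benign labels. It is not the authors' evaluation code: the function name `roc_auc` and the six example scores are hypothetical, and the abstract does not state which scoring or AUC implementation was used. The sketch relies on the standard rank interpretation of AUC: the probability that a randomly chosen positive scores higher than a randomly chosen negative, with ties counted as half.

```python
def roc_auc(labels, scores):
    """Rank-based AUC: fraction of (positive, negative) pairs in which
    the positive example has the higher score; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT from the paper): 1 = pathogenic, 0 = benign.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]

# 8 of the 9 positive/negative pairs are ranked correctly.
print(f"AUC = {roc_auc(labels, scores):.2f}")  # AUC = 0.89
```

Under this definition an AUC of 0.5 corresponds to random ranking, so the reported 0.48 for the 40B model on the A100 is at chance level, while 0.88 to 0.95 indicates that most pathogenic/benign pairs are ranked correctly.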

Key Points

Hardware Precision Impact: FP8 precision on H800 GPUs is critical for Evo2’s performance, enabling 4× faster inference than the A100 (which lacks FP8 support) while maintaining high accuracy (AUC 0.88–0.95).
Model Scale Optimization: The 7B-parameter model outperformed larger variants (e.g., 40B), which suffered severe accuracy drops (AUC as low as 0.48) on non-FP8 hardware, highlighting the need to balance efficiency and performance.
Practical Guidelines: We provide a framework for deploying Evo2 in cancer genomics, including hardware recommendations, dataset curation, and downstream task optimization, which is valuable for researchers with varied computational resources.
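As an illustration of the hardware point, here is a minimal sketch of a pre-flight check for FP8 capability. It assumes FP8 support is keyed on NVIDIA compute capability, with FP8 tensor cores first available on the Ada (8.9) and Hopper (9.0) architectures; the A100 is compute capability 8.0 and the H800 is 9.0. The helper name `supports_fp8` and the `KNOWN_GPUS` table are hypothetical, and whether Evo2's own tooling performs a check like this is not stated in the abstract.

```python
# Assumption: FP8 tensor-core support starts at compute capability 8.9
# (Ada) and 9.0 (Hopper); earlier architectures such as Ampere lack it.
FP8_MIN_CAPABILITY = (8, 9)


def supports_fp8(capability):
    """Return True if a (major, minor) compute capability has FP8 tensor cores."""
    return tuple(capability) >= FP8_MIN_CAPABILITY


# Compute capabilities of the two GPUs compared in the abstract.
KNOWN_GPUS = {"A100": (8, 0), "H800": (9, 0)}

for name, cap in KNOWN_GPUS.items():
    status = "FP8 capable" if supports_fp8(cap) else "no FP8 support"
    print(f"{name}: {status}")
```

On a machine with PyTorch installed, the tuple for the local GPU can be obtained from `torch.cuda.get_device_capability()` and passed to the same helper before choosing which model size to load.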

Version published to 10.1101/2025.09.10.675279 on bioRxiv
Sep 12, 2025
