Challenges in Predicting Chromatin Accessibility Differences between Species

Amy Stephen
Arian Raje
Heather H. Sestili
Morgan E. Wirthlin
Alyssa J. Lawler
Ashley R. Brown
William R. Stauffer
Andreas R. Pfenning
Irene M. Kaplow

Abstract

Enhancers are transcriptional regulatory elements that help drive phenotypic diversity, yet they often undergo rapid sequence evolution despite functional conservation, posing a challenge for predicting their function across species. Machine learning models that predict quantitative enhancer activity using DNA sequence have not previously been evaluated for their ability to predict quantitative differences across orthologous regions. Here, we trained convolutional neural networks (CNNs) on a regression task to predict chromatin accessibility, which is a proxy for enhancer activity, in the liver across five mammals, and we developed a novel framework to evaluate cross-species performance. We demonstrated that training on multiple species improves model generalization to both species used in training and held-out species. However, the models consistently achieved poor performance in predicting quantitative differences in accessibility between species at orthologous regions. Our study highlights the challenges in using regression models to predict chromatin accessibility changes between species.
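To make the distinction between within-species accuracy and between-species difference accuracy concrete, the following is a minimal sketch and not the authors' evaluation framework, which the abstract does not specify. It assumes Pearson correlation as the metric, and the function names (`pearson`, `difference_correlation`) and all numbers are hypothetical, synthetic values chosen only for illustration. It shows how a model can track accessibility well within each species while its predicted differences at orthologous regions barely track the observed differences.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def difference_correlation(pred_a, pred_b, obs_a, obs_b):
    """Correlate predicted and observed (species A - species B) differences.

    Each list holds one value per orthologous region, in the same order.
    """
    pred_diff = [a - b for a, b in zip(pred_a, pred_b)]
    obs_diff = [a - b for a, b in zip(obs_a, obs_b)]
    return pearson(pred_diff, obs_diff)


# Synthetic accessibility signal at five orthologous regions (illustrative only).
obs_a = [1.0, 2.0, 3.0, 4.0, 5.0]   # observed, species A
obs_b = [1.2, 1.8, 3.3, 3.9, 5.1]   # observed, species B
pred_a = [1.1, 2.1, 2.9, 4.2, 4.8]  # predicted, species A
pred_b = [1.0, 2.2, 3.0, 4.1, 4.9]  # predicted, species B

within_a = pearson(pred_a, obs_a)   # accuracy within species A
within_b = pearson(pred_b, obs_b)   # accuracy within species B
between = difference_correlation(pred_a, pred_b, obs_a, obs_b)

print(f"within A: {within_a:.3f}, within B: {within_b:.3f}, differences: {between:.3f}")
```

In this toy case the between-species differences are small relative to the variation across regions, so small prediction errors that hardly affect the within-species correlations dominate the difference correlation.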

Version published to 10.1101/2025.11.09.687449 on bioRxiv
Nov 10, 2025
