Puget predicts gene expression across cell types using sequence and 3D chromatin organization data

Shengqi Hang
Xiao Wang
Ghulam Murtaza
Anupama Jha
Bo Wen
Tangqi Fang
Justin Sanders
Sheng Wang
William Stafford Noble

Abstract

Gene expression is governed by both linear DNA sequence and three-dimensional (3D) chromatin architecture. Most gene expression prediction models rely on sequence alone, thereby failing to capture structural context and to generalize to unseen cell types. We present Puget, a deep learning model that predicts cell type-specific gene expression from sequence and Hi-C data, which captures 3D chromatin organization. Puget pairs pretrained sequence and Hi-C encoders with a lightweight transformer decoder. Using paired Hi-C/RNA-seq from 36 human and 4 mouse biosamples, we evaluate the ability of Puget to generalize to held-out genes, held-out biosamples, and from human to mouse. Relative to a sequence-only baseline, Puget improves cross-biosample Pearson correlation by up to 25% on highly variable genes in training biosamples and, unlike the sequence-only model, generalizes to held-out biosamples and across species. In addition, in silico perturbation experiments show that Puget can prioritize experimentally validated enhancer-gene pairs. Together, these results highlight a generalizable approach for modeling gene expression from sequence and 3D chromatin organization.
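
The abstract reports results as a cross-biosample Pearson correlation but does not spell out how it is computed. One plausible reading, offered here as an assumption rather than the authors' definition, is that for each gene the predicted and observed expression values are correlated across biosamples, giving one coefficient per gene. The sketch below illustrates that reading in plain Python; the function names `pearson` and `cross_biosample_pearson` and the dictionary layout are hypothetical and are not taken from the Puget code.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences.

    Returns NaN when either sequence has zero variance, since the
    coefficient is undefined in that case.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    denom = math.sqrt(sxx * syy)
    return sxy / denom if denom > 0 else math.nan


def cross_biosample_pearson(pred, obs):
    """Per-gene correlation across biosamples (assumed metric definition).

    pred, obs: dicts mapping a gene name to a list of expression values,
    one value per biosample, with biosamples in the same order in both.
    """
    return {gene: pearson(pred[gene], obs[gene]) for gene in obs}


# Toy values for illustration only; these are not data from the preprint.
observed = {"geneA": [1.0, 2.0, 3.0, 4.0], "geneB": [4.0, 3.0, 2.0, 1.0]}
predicted = {"geneA": [2.0, 4.0, 6.0, 8.0], "geneB": [1.0, 2.0, 3.0, 4.0]}
scores = cross_biosample_pearson(predicted, observed)
```

Under this reading, a gene whose expression is constant across biosamples has no defined score, which may be one reason to report the metric on highly variable genes.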

Version published to 10.1101/2025.11.19.689320 on bioRxiv
Nov 20, 2025

Related articles

DoseH-seq: A single-cell multiome platform to decode gene-dosage logic driving developmental reversion and cell fate reprogramming

This article has 25 authors:
1. Ying Yang
2. Ralph Patrick
3. Xiaoli Chen
4. Stacey Anderson
5. Jingyu Zhang
6. Yifei Huang
7. Mohammadhossein Esmaeili
8. Kanupriya Tiwari
9. Shivangi Wani
10. Monisha Ganesan
11. Hsin-Yi Chou
12. Dominique Power
13. Cassy M Spiller
14. Sas Loganathan
15. Solal Chauquet
16. Michael Piper
17. Majid Alhomrani
18. Walaa Alsanie
19. Sonia Shah
20. Josephine Bowles
21. Jessica C Mar
22. Shyuan T Ngo
23. Melanie D White
24. Marina Naval-Sanchez
25. Christian M Nefzger
This article has no evaluations. Latest version: Dec 23, 2025

Spatial ChIP (ChIP-SP) as a New Bioinformatics Tool to Characterize Spatial Gene Regulation

This article has 5 authors:
1. Tianyi Zhou
2. Kevin Song
3. Hui Huang
4. Ning Lyu
5. Qin Feng
This article has no evaluations. Latest version: Dec 26, 2025

Discovering cell types and states from reference atlases with heterogeneous single-cell ATAC-seq features

This article has 2 authors:
1. Xiuwei Zhang
2. Yuqi Cheng
This article has no evaluations. Latest version: Dec 10, 2025
