NyxBind: enhancing DNN representations via contrastive learning for TFBS prediction

Xu Yang
Qingfa Xiao
Yucheng Xu
Jixin Yang
Yusen Hou
Weicai Long
Miaojun Huang
Yanlin Zhang

Abstract

While pretrained genomic language models effectively capture general DNA sequence patterns through masked language modeling, they often struggle to discriminate subtle yet biologically critical differences among transcription factor binding site (TFBS) motifs. Recent studies suggest that contrastive learning can enhance the discriminative power of embeddings by explicitly modeling inter-instance similarities and differences. Building on this insight, we introduce NyxBind, the first TFBS prediction model that applies contrastive learning across multiple TFBS types to enhance regulatory sequence representations. NyxBind better captures discriminative sequence features, enabling more accurate and biologically meaningful TFBS prediction. Extensive evaluations show that NyxBind consistently outperforms alternative models across multiple TFBS classification benchmarks, demonstrating strong robustness and generalizability. Moreover, NyxBind maintains high performance under both full-parameter and parameter-efficient fine-tuning, and it supports accurate motif visualization that aligns closely with experimentally validated transcription factor binding profiles. The code is available at https://github.com/ai4nucleome/NyxBind.

Version published to 10.1101/2025.10.21.683808 on bioRxiv
Oct 23, 2025

