DeepCas12a: A hybrid deep learning framework for accurate Cas12a efficiency prediction from sequence and epigenetic information
Abstract
CRISPR-Cas12a (Cpf1) offers distinct advantages for genome editing due to its flexible, T-rich PAM recognition. However, variable cleavage efficiency—modulated by sequence context and epigenetic features—remains a challenge, with existing predictors limited in accuracy and interpretability. Here, we present DeepCas12a, a hybrid deep learning framework integrating Convolutional Neural Networks (CNNs) and a Vision Transformer (ViT) encoder to capture both local sequence motifs and long-range dependencies. The model fuses DNA sequence data with epigenetic profiles (DNA methylation and chromatin accessibility) in an end-to-end architecture. Benchmarked on an independent test set, DeepCas12a outperformed state-of-the-art predictors, achieving an Average Precision of 0.783, an AUC of 0.868, and a Spearman correlation of 0.630. Furthermore, interpretability analysis via saliency maps confirms the model captures biologically relevant features, including PAM specificity and seed region sensitivity, facilitating rational guide RNA design.
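As a rough illustration of what fusing sequence and epigenetic inputs might look like at the input level, the sketch below stacks a one-hot encoding of a target sequence with two per-position epigenetic tracks. The function name `encode_target`, the 6-row layout, and the example methylation and accessibility values are all assumptions made for illustration; the abstract does not specify the encoding DeepCas12a actually uses.

```python
from typing import List, Sequence

BASES = "ACGT"  # row order of the one-hot channels (an assumed convention)


def encode_target(
    seq: str,
    methylation: Sequence[float],
    accessibility: Sequence[float],
) -> List[List[float]]:
    """Return a 6 x L matrix: 4 one-hot base rows followed by 2 epigenetic rows.

    Hypothetical helper, not the authors' preprocessing code.
    Bases outside A/C/G/T (for example N) become all-zero columns.
    """
    seq = seq.upper()
    if not (len(seq) == len(methylation) == len(accessibility)):
        raise ValueError("sequence and epigenetic tracks must have equal length")

    # One row per base; 1.0 where the sequence carries that base.
    rows = [[1.0 if base == b else 0.0 for base in seq] for b in BASES]

    # Append the per-position epigenetic tracks as extra channels.
    rows.append([float(v) for v in methylation])
    rows.append([float(v) for v in accessibility])
    return rows


# Toy example: a 6-nt fragment with one methylated position and open chromatin.
matrix = encode_target(
    "TTTACG",
    methylation=[0.0, 0.0, 0.0, 0.0, 1.0, 0.0],
    accessibility=[1.0] * 6,
)
```

A channel-stacked matrix of this kind is a common input shape for convolutional layers, which is why it is used here; a real pipeline would also need to fix the target window length and decide how to handle missing epigenetic measurements.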