High-Codon: A Deep Learning-Based Codon Optimization Tool for Enhanced Heterologous Protein Expression in Escherichia coli
Abstract
High-Codon is a deep learning-based codon optimization tool designed to increase the expression levels of heterologous proteins in Escherichia coli. It uses a pre-trained BERT model to build a sequence labeling framework that predicts optimal codons. An expression-level weighted loss function strengthens the model’s ability to learn from highly expressed proteins. In an evaluation on 100 protein sequences from 34 species, High-Codon outperforms traditional optimization methods.
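The abstract does not give the form of the expression-level weighted loss, so the following is only a minimal sketch of one plausible reading: each training sequence contributes its mean per-position cross-entropy over codon labels, scaled by a weight taken to be proportional to its measured expression level. The function names, the proportional weighting, and the normalization by the sum of weights are all assumptions made for illustration, not details from the paper.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one position's codon scores."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def sequence_cross_entropy(token_logits, codon_labels):
    """Mean cross-entropy over the positions of a single sequence.

    token_logits: one list of class scores per position.
    codon_labels: index of the reference codon class at each position.
    """
    total = 0.0
    for logits, label in zip(token_logits, codon_labels):
        total -= log_softmax(logits)[label]
    return total / len(codon_labels)


def expression_weighted_loss(batch_logits, batch_labels, expression_levels):
    """Weighted average of per-sequence losses.

    Assumption: the weight of a sequence equals its expression level,
    so highly expressed proteins dominate the training signal.
    """
    weighted_sum = sum(
        w * sequence_cross_entropy(logits, labels)
        for w, logits, labels in zip(expression_levels, batch_logits, batch_labels)
    )
    return weighted_sum / sum(expression_levels)


# Toy batch: two sequences of two positions each, two codon classes per position.
toy_logits = [[[2.0, 0.0], [0.0, 2.0]], [[0.0, 1.0], [1.0, 0.0]]]
toy_labels = [[0, 1], [0, 0]]
toy_expression = [3.0, 1.0]  # hypothetical expression levels
print(expression_weighted_loss(toy_logits, toy_labels, toy_expression))
```

With equal expression levels this reduces to the ordinary mean of the per-sequence losses, which makes the effect of the weighting easy to isolate when comparing against an unweighted baseline.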