Predicting molecular recognition features in protein sequences with MoRFchibi 2.0
Abstract
Molecular Recognition Features (MoRFs) are segments within intrinsically disordered protein regions (IDRs) that undergo a disorder-to-order transition upon binding to their partners. Identifying MoRFs remains a significant challenge. This paper introduces MoRFchibi 2.0, a specialized prediction tool designed to identify the locations of MoRFs within protein sequences. Our results show that MoRFchibi 2.0 outperforms all existing MoRF predictors and general predictors of protein-binding sites within IDRs, including top-performing models from CAID rounds 1, 2, and 3. Remarkably, MoRFchibi 2.0 surpasses predictors that utilize AlphaFold data and state-of-the-art protein language models, achieving superior ROC and Precision-Recall curves and higher success rates. MoRFchibi 2.0 generates output scores using an ensemble of logistic regression convolutional neural network models, followed by a reverse Bayes Rule to adjust for priors in the training data. These scores reflect MoRF probabilities normalized for the priors in the training data, making them individually interpretable and compatible with other tools utilizing the same scoring framework.
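The prior adjustment described in the abstract can be illustrated with a small sketch. This is not the MoRFchibi 2.0 implementation: it assumes that "normalized for the priors" means rescaling a posterior probability to a neutral 0.5 prior by dividing out the training-set prior odds, and the function names `remove_training_prior` and `apply_prior` are hypothetical.

```python
def remove_training_prior(p, train_prior, eps=1e-12):
    """Rescale a posterior p, learned under train_prior, to a neutral 0.5 prior.

    Assumption for illustration: Bayes' rule in odds form is inverted to
    recover the likelihood ratio, which is then mapped back to [0, 1].
    """
    p = min(max(p, eps), 1.0 - eps)  # keep the odds finite
    posterior_odds = p / (1.0 - p)
    prior_odds = train_prior / (1.0 - train_prior)
    likelihood_ratio = posterior_odds / prior_odds
    return likelihood_ratio / (1.0 + likelihood_ratio)


def apply_prior(score, new_prior, eps=1e-12):
    """Combine a prior-neutral score with a chosen prior to get a posterior."""
    score = min(max(score, eps), 1.0 - eps)
    likelihood_ratio = score / (1.0 - score)
    posterior_odds = likelihood_ratio * new_prior / (1.0 - new_prior)
    return posterior_odds / (1.0 + posterior_odds)


if __name__ == "__main__":
    # A raw output equal to the training prior carries no evidence either way.
    print(remove_training_prior(0.1, train_prior=0.1))
    # Re-applying the same prior recovers the original posterior.
    neutral = remove_training_prior(0.3, train_prior=0.05)
    print(apply_prior(neutral, new_prior=0.05))
```

Under this assumption, a score above 0.5 means the sequence evidence favors a MoRF regardless of how rare MoRFs were in the training data, which is one way scores from different tools could be compared on a common scale.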
Availability
An online server is available at https://mc2.msl.ubc.ca/index.xhtml, and the code is available at https://github.com/NawarMalhis/MC2.git