Disentangling the CHAOS of intrinsic disorder in human proteins
Abstract
Most proteins consist of both folded domains and Intrinsically Disordered Regions (IDRs). However, the widespread occurrence of intrinsic disorder in human proteins, along with its characteristics, is often overlooked by the broader communities of structural and molecular biologists. Building on the MobiDB database of intrinsic disorder in proteins, here we develop a comprehensive dataset, CHAOS (Comprehensive analysis of Human proteins And their disOrdered Segments). We implement internally consistent definitions of disordered regions, and annotate general characteristics such as cellular location, essentiality, post-translational modifications, and predicted pathogenicity. Further, we cross-reference these annotations to structure predictions from AlphaFold. We find that most human proteins contain at least one disordered region, and that these regions are predominantly located at the protein termini. IDRs are less hydrophobic than non-IDRs and enriched in post-translational modifications, and mutations in IDRs are predicted to be less pathogenic than those in non-IDRs. Additionally, we find that proteins residing in different cellular locations possess distinct disorder profiles. Finally, the AlphaFold models of proteins in CHAOS show that disordered regions and fully disordered proteins are often predicted to adopt secondary structure. With this dataset, we aim to enhance the visibility and understanding of intrinsic disorder in human proteins.
Key messages
Four out of five human proteins contain one or more intrinsically disordered regions (IDRs).
Half of the IDRs are located at protein termini, but three quarters of all human proteins contain a terminal IDR.
The amount and location of disordered regions differ across cellular compartments.
One in five missense mutations in IDRs is predicted to be likely pathogenic.
AlphaFold predicts secondary structure elements within intrinsically disordered regions and fully disordered proteins.
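The first two key messages use different denominators: one counts proteins, the other counts IDRs. The following minimal sketch illustrates that distinction on toy data. It is not the CHAOS pipeline: the data layout (protein length plus a list of 1-based inclusive region boundaries), the function names, and the `margin` parameter that defines what counts as "terminal" are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: a disordered region as 1-based inclusive (start, end).
Region = Tuple[int, int]
# Hypothetical layout: protein id -> (sequence length, list of disordered regions).
Proteome = Dict[str, Tuple[int, List[Region]]]


def is_terminal(region: Region, length: int, margin: int = 0) -> bool:
    """True if the region reaches within `margin` residues of either terminus.

    The terminal definition used here is an assumption, not the paper's.
    """
    start, end = region
    return start <= 1 + margin or end >= length - margin


def summarize(proteome: Proteome, margin: int = 0) -> Dict[str, float]:
    """Compute protein-level and region-level disorder fractions."""
    n_proteins = len(proteome)
    n_with_idr = 0
    n_with_terminal_idr = 0
    n_regions = 0
    n_terminal_regions = 0

    for length, regions in proteome.values():
        terminal_flags = [is_terminal(r, length, margin) for r in regions]
        n_regions += len(regions)
        n_terminal_regions += sum(terminal_flags)
        if regions:
            n_with_idr += 1
        if any(terminal_flags):
            n_with_terminal_idr += 1

    return {
        # Denominator: all proteins.
        "proteins_with_idr": n_with_idr / n_proteins if n_proteins else 0.0,
        "proteins_with_terminal_idr": (
            n_with_terminal_idr / n_proteins if n_proteins else 0.0
        ),
        # Denominator: all IDRs.
        "idrs_at_termini": n_terminal_regions / n_regions if n_regions else 0.0,
    }


# Toy example with made-up proteins (not real data).
toy: Proteome = {
    "P1": (100, [(1, 20), (50, 60)]),  # one N-terminal and one internal IDR
    "P2": (80, [(30, 40)]),            # internal IDR only
    "P3": (50, []),                    # no IDR
    "P4": (120, [(100, 120)]),         # C-terminal IDR
}
stats = summarize(toy)
```

Because a protein with several IDRs is counted once at the protein level but contributes each region at the IDR level, the fraction of proteins with a terminal IDR and the fraction of IDRs that are terminal can differ substantially on real data.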