Encoded and non-genetic alternative protein variants expand human functional proteome
Abstract
Each stage of the Central Dogma contributes to proteome diversity through mechanisms such as heterozygosity, somatic mutations, transcriptional errors, and translational errors. As a result, a diverse array of protein variants can coexist within a single proteome, such as that of humans. However, until now, methods to detect, quantify, and evaluate the functional consequences of these variants have been lacking. Here we examined a large-scale proteogenomic dataset from 29 healthy human tissues and uncovered over 46,000 unique single amino acid variants co-existing alongside their corresponding reference proteoforms. Our analysis reveals that substitutions preferentially occur in proteins with low expression levels and often affect less frequently encoded amino acids such as tryptophan, cysteine, histidine, or methionine. We found that the abundance of both genetic (SNPs, somatic mutations) and mistranslated protein variants mirrors their allele frequencies in the human population. Moreover, we show that non-genetic substitutions provide a distinct route for exploring protein sequence space, circumventing the mutational constraints imposed by the genetic code. Further, we demonstrate that the prevalence of substitutions correlates negatively with their predicted pathogenicity, particularly in proteins expressed at low levels. We identified hundreds of substituted non-genetic proteoforms that recur consistently in multiple individuals and map to annotated protein functional sites. We propose that these substitutions constitute a novel class of functional protein phenotypic variants. Finally, we illustrate the impact of active-site substitutions in proteins such as GAPDH and hemoglobin and highlight potential non-genetic routes of immunoglobulin diversification. Collectively, our findings indicate that non-genetic amino acid substitutions in human proteins provide a recurring and specific route to expanding the functional proteome.
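The constraint that the genetic code places on mutation can be made concrete with a short sketch. A single-nucleotide change in a codon can reach only some of the other amino acids, so any substitution outside that set requires either several nucleotide changes or a non-genetic route such as mistranslation. The Python below is an illustration only and is not the authors' pipeline: it assumes the standard genetic code (NCBI translation table 1), and the function names are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code (NCBI translation table 1), with codons ordered
# TTT, TTC, TTA, TTG, TCT, ... and "*" marking stop codons.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def single_nucleotide_neighbors(codon):
    """Yield every codon that differs from `codon` at exactly one position."""
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                yield codon[:pos] + base + codon[pos + 1:]


def reachable_substitutions():
    """Return all (original, substituted) amino acid pairs reachable by
    changing a single nucleotide, ignoring stop codons."""
    pairs = set()
    for codon, aa in CODON_TABLE.items():
        if aa == "*":
            continue
        for neighbor in single_nucleotide_neighbors(codon):
            new_aa = CODON_TABLE[neighbor]
            if new_aa != "*" and new_aa != aa:
                pairs.add((aa, new_aa))
    return pairs


REACHABLE = reachable_substitutions()


def needs_multiple_nucleotide_changes(original, substituted):
    """True if no single-nucleotide change converts `original` into `substituted`."""
    return (original, substituted) not in REACHABLE


if __name__ == "__main__":
    # Tryptophan (TGG) can become cysteine (TGT/TGC) with one change,
    # but not phenylalanine (TTT/TTC), which needs two.
    print(needs_multiple_nucleotide_changes("W", "C"))
    print(needs_multiple_nucleotide_changes("W", "F"))
```

Under these assumptions, an observed substitution for which `needs_multiple_nucleotide_changes` returns true cannot be explained by a single point mutation, which is one simple way to flag candidates for a non-genetic origin.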