Thousands of human non-AUG extended proteoforms lack evidence of evolutionary selection among mammals


The synthesis of most proteins begins at AUG codons, yet a small number of non-AUG-initiated proteoforms are also known. Here we combined publicly available ribo-seq data with phylogenetic approaches to identify previously uncharacterised non-AUG proteoforms. Unexpectedly, we found that the number of non-AUG proteoforms identified with ribosome profiling data greatly exceeds the number with strong phylogenetic support. We identified an association between proteoforms with alternative N-termini and multiple compartmentalisation of the corresponding gene products. In dozens of genes, N-terminal extensions encode localisation signals, including mitochondrial presequences and signal peptides. While the majority of non-AUG-initiated proteoforms occur in addition to AUG-initiated proteoforms, in a few cases a non-AUG codon appears to be the only start. This suggests that alternative compartmentalisation is not the only function of non-AUG initiation. Taking a conservative approach, we updated the annotation of several genes in the latest GENCODE version in human and mouse where non-AUG-initiated proteoforms are supported by both ribosome profiling and phylogenetic evidence. Yet the number of such extensions is likely much higher. The thousands of non-AUG proteoforms supported only by ribosome profiling may evolve neutrally. Indeed, expression of some may not be consequential, for instance when the N-terminus is processed or when the proteoforms have identical biochemical properties. Nonetheless, they may contribute to the immune response as antigen sources. It is also possible that some proteoforms accrued useful functions only recently and evolved under purifying selection in a narrow phylogenetic group. Thus, further characterisation is important for understanding their phenotypic and clinical significance.
