Longer internal exons tend to have more tandem repeats and more frequently experience insertions and deletions that are mostly in intrinsically disordered regions of the encoded proteins
Abstract
Insertions and deletions (indels) in eukaryotic proteins are known to fall preferentially in intrinsically disordered regions (IDRs), protein regions that by themselves do not form unique three-dimensional structures. Because a previous investigation showed that long internal exons tend to encode IDRs in eukaryotes in general, we analyzed how indels alter internal exons and affect the IDRs of the encoded proteins. To examine the evolutionary roles that indels play, we selected indels observed in all variants (“fixed” indels), since indels found only in minor variants may represent transient aberrations in splicing. By comparing orthologous variants of closely related species together with those of outgroups, we identified fixed indels in the internal exons of four mammals and two flies. The fixed indels are nearly always nonframeshifting and short, and they mostly encode IDRs. On average, 51% of inserted and 40% of deleted residues are attributable to alterations in tandem repeats. Deletions tend to occur more frequently than insertions, and indels are generally more prevalent in long internal exons. Tandem repeats occur preferentially in long internal exons, indicating that their alterations account for the high frequency of indels in long internal exons. Moreover, since tandem repeats mostly encode IDRs, this finding at least partially explains the high incidence of IDRs in long internal exons. We propose that long internal exons were produced in early eukaryotes mainly by repeat expansion that added IDRs to the encoded proteins, and that they now experience frequent indels through alterations in tandem repeats.
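
The idea of calling an indel "fixed" and deciding whether it is an insertion or a deletion by reference to an outgroup can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names `gap_runs` and `classify_indels`, the input layout (pre-aligned protein sequences of equal length with `-` as the gap character), and the simple parsimony rule are all assumptions made for illustration only.

```python
from typing import List, Tuple


def gap_runs(seq: str) -> List[Tuple[int, int]]:
    """Return half-open (start, end) column ranges of consecutive '-' characters."""
    runs, start = [], None
    for i, ch in enumerate(seq):
        if ch == "-" and start is None:
            start = i
        elif ch != "-" and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(seq)))
    return runs


def classify_indels(focal: List[str], sister: str, outgroup: str):
    """Hypothetical helper: polarize indels in the focal species.

    focal    -- aligned variants of the focal species (assumed input layout)
    sister   -- aligned sequence of a closely related species
    outgroup -- aligned sequence of an outgroup species
    Returns (kind, start_column, length_in_residues) tuples, keeping only
    events shared by every focal variant ("fixed" in the abstract's sense).
    """
    events = []
    # Deletion in focal: every focal variant is gapped where both the
    # sister and the outgroup carry residues.
    for s, e in gap_runs(focal[0]):
        fixed = all(set(v[s:e]) == {"-"} for v in focal)
        if fixed and "-" not in sister[s:e] + outgroup[s:e]:
            events.append(("deletion", s, e - s))
    # Insertion in focal: sister and outgroup are both gapped where every
    # focal variant carries residues.
    for s, e in gap_runs(sister):
        fixed = all("-" not in v[s:e] for v in focal)
        if fixed and set(outgroup[s:e]) == {"-"}:
            events.append(("insertion", s, e - s))
    return sorted(events, key=lambda ev: ev[1])


# Toy alignment (invented sequences, not data from the study).
toy_focal = ["MKT--LQQQA", "MKT--LQQQA"]
toy_sister = "MKTSSL---A"
toy_outgroup = "MKTSSL---A"
toy_events = classify_indels(toy_focal, toy_sister, toy_outgroup)
```

Because the sketch works on protein-level alignments, every event it reports is a whole number of residues and therefore nonframeshifting by construction; detecting frameshifts would require nucleotide-level alignments instead. If any focal variant lacks the gap (or the inserted residues), the event is dropped, which mirrors the abstract's rationale for ignoring indels seen only in minor variants.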