Unannotated translation products are widespread in model E. coli
Abstract
Genomes contain orders of magnitude more open reading frames (ORFs) than known protein-coding genes, and recent work suggests there may be unannotated proteins present in even the best-studied organisms. To address this gap, we used a high-throughput reverse genetic toolkit to construct precise C-terminal fusions of a reporter (and control) to >120,000 ORFs in model E. coli. We found hundreds of significant unannotated hits, and individually detected >50 novel polypeptides by western blot, including ORFs within tRNA loci. Many ORFs overlap annotated genes in the sense orientation, and we found that these are likely chimeric polypeptides produced by ribosomal frameshifting. Using degron-based knockdowns, we identified unannotated proteins that have putative fitness effects, and we found a novel small protein that displays phenotypes consistent with a role in the mRNA degradosome. The observation of a range of unannotated translation products should lead to better annotation and understanding of the bacterial domain of life, and it motivates the continued exploration of genomes more broadly.