Detection and Annotation of Unique Regions in Mammalian Genomes
Abstract
Long unique genomic regions have been reported to be highly enriched for developmental genes in mice and humans. In this paper we identify unique genomic regions using a highly efficient method based on fast string matching. We quantify the resource consumption and accuracy of this method before applying it to the genomes of 18 mammals. We annotate their unique regions of at least 10 kb and find that they are strongly enriched for developmental genes in all of these genomes. Among the unique regions that lack annotations, we found in the Tasmanian devil the gene encoding inositol polyphosphate-5-phosphatase A, which is an essential part of intracellular signaling. This suggests that unique regions might be given priority when annotating mammalian genomes. Our documented pipeline for annotating unique regions in any mammalian genome is available from the repository github.com/evolbioinf/auger; additional data for this study are available from the dataverse at doi.org/10.17617/3.4IKQAG.
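To make the notion of a unique region concrete, the following minimal sketch marks stretches of a sequence in which every k-mer occurs exactly once. It is an illustration only and not the pipeline from the repository: the function name `unique_regions` and the parameters `k` and `min_len` are hypothetical, only the forward strand is considered, and a plain hash-based k-mer count stands in for the fast string matching used in the paper.

```python
from collections import Counter


def unique_regions(seq: str, k: int, min_len: int) -> list[tuple[int, int]]:
    """Return half-open intervals [start, end) of seq, each at least
    min_len long, in which every k-mer occurs exactly once in seq.

    Assumption for illustration: uniqueness is judged on the forward
    strand only, by exact k-mer counting.
    """
    n = len(seq)
    if k <= 0 or n < k:
        return []

    # Count every k-mer in the sequence.
    counts = Counter(seq[i:i + k] for i in range(n - k + 1))

    regions = []
    run_start = None  # start of the current run of unique k-mers
    for i in range(n - k + 1):
        if counts[seq[i:i + k]] == 1:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            # The last unique k-mer started at i - 1 and ends at i - 1 + k.
            end = i - 1 + k
            if end - run_start >= min_len:
                regions.append((run_start, end))
            run_start = None

    # Close a run that extends to the end of the sequence.
    if run_start is not None and n - run_start >= min_len:
        regions.append((run_start, n))
    return regions


if __name__ == "__main__":
    # Toy input: a short unique core flanked by a repeated motif.
    print(unique_regions("AAAACGTAAAA", k=3, min_len=5))
```

On a real genome the threshold would correspond to the 10 kb minimum mentioned above, and an index-based approach would be needed to keep memory use manageable; the dictionary of k-mers used here is only practical for small inputs.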