Inverted Repeats in Viral Genomes

Abstract

An inverted repeat (IR) in DNA is a sequence of nucleotides followed, on the same strand, by its complementary bases in reverse order (e.g., TCACCGCGGTGA). If the two complementary sequences are directly adjacent, with no other bases between them, the IR is referred to as a DNA palindrome. IRs can form hairpin and cruciform secondary structures, which can endanger genomic stability. They are prevalent at origins of replication in viral DNA, and they play a crucial role in various biological processes, including gene silencing, duplication, and genomic evolution. IRs remain comparatively underexplored, largely because of the scarcity of sequence analysis tools that allow accurate detection in large viral genome datasets. Here, using the Biological Language Modeling Toolkit (BLMT), we analyzed 14 thousand viral genomes for occurrences of IRs. This resulted in the identification of over 19 million IRs longer than 20 bases, including 134 IRs that are 2,000 bases long, corresponding to an average of around 1,300 IRs per virus.
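
The definitions above can be made concrete with a minimal Python sketch. It is not the BLMT method, which the abstract does not describe; it is a naive exact-match scan written only to illustrate what an IR and a DNA palindrome are. The function names and the `arm_len` and `max_spacer` parameters are assumptions introduced for this illustration.

```python
# Illustrative sketch only: a naive exact-match scan, NOT the BLMT algorithm.
# All names and parameters here are hypothetical.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def is_dna_palindrome(seq: str) -> bool:
    """True if seq is non-empty and equals its own reverse complement,
    i.e. an inverted repeat whose two arms are directly adjacent."""
    seq = seq.upper()
    return bool(seq) and seq == reverse_complement(seq)


def find_inverted_repeats(seq: str, arm_len: int, max_spacer: int = 0):
    """Yield (start, arm_len, spacer) for every left arm of length arm_len
    whose reverse complement occurs on the same strand after a gap of
    0..max_spacer bases. Only exact matches are reported."""
    seq = seq.upper()
    n = len(seq)
    for start in range(n - 2 * arm_len + 1):
        left = seq[start:start + arm_len]
        target = reverse_complement(left)
        for spacer in range(max_spacer + 1):
            right_start = start + arm_len + spacer
            if right_start + arm_len > n:
                break  # right arm would run past the end of the sequence
            if seq[right_start:right_start + arm_len] == target:
                yield (start, arm_len, spacer)


if __name__ == "__main__":
    example = "TCACCGCGGTGA"  # the example sequence from the abstract
    print(is_dna_palindrome(example))
    print(list(find_inverted_repeats(example, arm_len=6)))
```

Applied to the example from the abstract, the left arm TCACCG is followed immediately by its reverse complement CGGTGA, so the sequence is reported as a palindrome. A scan of this kind takes time proportional to sequence length times arm length times spacer range and does not tolerate mismatches, so it is a way to check the definition on short sequences rather than a tool for genome-scale analysis.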
