An evolutionary approach to predict the orientation of CRISPR arrays
Abstract
CRISPR-Cas is a defense system of bacteria and archaea against phages. Parts of the foreign DNA, called spacers, are incorporated into the CRISPR array which constitutes the immune memory. The orientation of CRISPR arrays is crucial for analyzing and understanding the functionality of CRISPR systems and their targets. Several methods have been developed to identify the orientation of a CRISPR array. To predict the orientation, different methods use different features such as the repeat sequences between the spacers, the location of the leader sequence, the Cas genes, or PAMs. However, those features are often not sufficient to predict the orientation with certainty, or different methods disagree.
Remarkably, almost all CRISPR systems have been found to insert spacers in a polarized manner at the leader end of the array. We introduce CRISPR-evOr, a method that leverages the resulting patterns to predict the acquisition orientation for (a group of) CRISPR arrays by reconstructing their evolutionary history and comparing its likelihood under both possible acquisition orientations. The new method is independent of Cas type, leader existence and location, and transcription orientation. CRISPR-evOr is thus particularly useful for arrays that other CRISPR orientation tools cannot predict confidently, and for verifying predictions from existing tools or resolving conflicts among them.
CRISPR-evOr currently confidently predicts the orientation of 28.3% of the arrays in the considered subset of CRISPRCasdb, which other tools like CRISPRDirection and CRISPRstrand cannot reliably orient. As our tool leverages evolutionary information, we expect this percentage to grow as more closely related arrays become available. Additionally, CRISPR-evOr provides confident decisions for rare subtypes of CRISPR arrays, where knowledge about repeats and leaders and their orientation is limited.
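The intuition behind using polarized insertion can be illustrated with a toy example. The sketch below is not the CRISPR-evOr algorithm, which reconstructs and compares likelihoods of evolutionary histories. It is a much simpler heuristic, written under the assumption that related arrays are given as lists of spacer identifiers in a common reading direction. If new spacers are always added at one end, related arrays should share their old spacers at the opposite end and differ at the acquisition end. The function names and the scoring rule are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def shared_prefix_len(a, b):
    """Number of leading positions at which two spacer lists agree."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def guess_acquisition_end(arrays):
    """Toy heuristic (an assumption, not the published method).

    Returns "start" if new spacers appear to be inserted at index 0,
    "end" if at the last index, or "undetermined" on a tie.
    """
    start_score = 0  # evidence for insertion at the start of the lists
    end_score = 0    # evidence for insertion at the end of the lists
    for a, b in combinations(arrays, 2):
        prefix = shared_prefix_len(a, b)
        suffix = shared_prefix_len(a[::-1], b[::-1])
        # A long shared suffix means the old spacers sit at the end,
        # so acquisition most plausibly happens at the start.
        start_score += suffix
        # A long shared prefix supports the opposite orientation.
        end_score += prefix
    if start_score > end_score:
        return "start"
    if end_score > start_score:
        return "end"
    return "undetermined"


# Three hypothetical related arrays sharing their oldest spacers s3, s2, s1.
example_arrays = [
    ["s5", "s4", "s3", "s2", "s1"],
    ["s6", "s3", "s2", "s1"],
    ["s7", "s4", "s3", "s2", "s1"],
]
print(guess_acquisition_end(example_arrays))
```

A heuristic like this ignores spacer deletions and the relatedness of the arrays, which is the kind of information a likelihood-based reconstruction can take into account.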
Author Summary
Some bacteria and archaea possess a CRISPR-Cas defense system, which protects them against phages and mobile genetic elements. This system adapts to new threats by incorporating small fragments of the invaders' DNA, so-called spacers, into a CRISPR array. Remarkably, the acquisition of new spacers is polarized at one end of this array. To understand how this immune system functions, it is essential to know the orientation of these arrays. Many existing tools try to determine orientation using genetic markers, but these methods are often unreliable or disagree with one another.
In our work, we developed a new method that predicts the end at which new spacers are inserted, by considering the evolutionary history of a group of related CRISPR arrays. Unlike other tools, our approach is less reliant on specific genetic markers and can be applied broadly across many types of CRISPR systems. We show that it can confidently determine the orientation of a large number of arrays that other methods cannot resolve. This provides a new way to predict the array orientation, which is particularly useful for rare CRISPR types. Our evolutionary approach will become even more powerful as more genetic data becomes available.