Long-read sequencing for biodiversity analyses - a comprehensive guide
Abstract
DNA-based monitoring of biodiversity has revolutionised our ability to describe communities and rapidly assess anthropogenic impacts on biodiversity. Currently established molecular methods for biomonitoring rely heavily on classic metabarcoding utilising short reads, mostly from Illumina platforms. However, a growing number of studies use long-read sequencing technologies, such as those from Oxford Nanopore Technologies (ONT) and Pacific Biosciences (PacBio), for analyses of environmental DNA (eDNA) and DNA barcoding. These long-read sequencing approaches can offer advantages over existing methods by providing increased information content and opening new avenues for understanding biodiversity at a larger scale. In this review, we provide a comprehensive overview of all studies to date that use long-read sequencing platforms for biodiversity analyses of eukaryotes from eDNA and community metabarcoding, as well as DNA barcoding. We also give detailed information on each step required for sample processing, data generation, and analysis of long-read sequencing datasets for biodiversity applications. Even though the number of studies using long-read sequencing technologies is rapidly increasing, clear, established guidelines for sample preparation or analysis of such datasets are lacking. Furthermore, there is no existing overview of eukaryote monitoring applications for both ONT and PacBio technologies across different sample types. Long-read sequencing platforms provide possibilities for metabarcoding (PCR-based) of both short and long fragments, shotgun sequencing (PCR-free), or DNA barcoding. ONT platforms in particular also allow real-time data acquisition, while the portability of the MinION instrument can support sequencing in the field. PacBio, on the other hand, can provide highly accurate reads and can be used to reliably address open questions in ecology and evolution for difficult-to-characterise taxa such as microbial eukaryotes.
Streamlining the use of these technologies could enable sequencing of whole organelles or population-level assessments from environmental data, which would be a step-change for DNA-based monitoring. Here we review current applications, applied methodology, and future perspectives in the field. Overall, we aim to facilitate the use of long-read sequencing technologies by the wider community, to promote best practices for data generation and standardised data reporting that support reproducibility, and to encourage open data policies and tools.