Inter-tool analysis of a NIST dataset for assessing baseline nucleic acid sequence screening


Abstract

Nucleic acid synthesis is a dual-use technology that can benefit fields such as biology, medicine, and information storage. However, synthetic nucleic acids could also cause harm, whether through negligent use or through deliberate misuse. Thus, this technology needs to be appropriately safeguarded. Sequence screening is one component of a biosecurity protocol for preventing such harm and consists of differentiating Sequences of Concern (SOCs) from benign sequences that are not associated with pathogenicity or toxicity. Many fit-for-purpose tools have been developed for DNA synthesis sequence screening. However, questions remain about how consistently they screen. To help determine whether screening tools are harmonized with respect to baseline sequence screening, NIST constructed a test dataset based on current screening recommendations. NIST then sent blinded datasets to sequence screening tool developers for testing. Overall, the tools' assignments of the sequences generally agreed with NIST's, and all tools had a baseline performance of greater than 95% sensitivity and 97% accuracy. Disagreement on specific sequences largely arose from single tools and could be traced to differences in defining a SOC and/or methodological differences in screening algorithms.
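
For readers unfamiliar with the sensitivity and accuracy figures quoted in the abstract, the following is a minimal Python sketch of how such metrics are conventionally computed when one tool's calls are compared against a set of reference assignments. It is not the authors' evaluation code; the function name `screening_metrics`, the sequence IDs, and the labels are all hypothetical and chosen only for illustration.

```python
def screening_metrics(reference, calls):
    """Compare one tool's calls against reference assignments.

    Both arguments map a sequence ID to True (flagged as a Sequence of
    Concern) or False (benign). Hypothetical helper for illustration only.
    """
    tp = fp = tn = fn = 0
    for seq_id, is_soc in reference.items():
        flagged = calls[seq_id]
        if is_soc and flagged:
            tp += 1  # SOC correctly flagged
        elif is_soc and not flagged:
            fn += 1  # SOC missed by the tool
        elif not is_soc and flagged:
            fp += 1  # benign sequence flagged
        else:
            tn += 1  # benign sequence correctly passed

    total = tp + fp + tn + fn
    return {
        # Sensitivity: fraction of reference SOCs that the tool flagged.
        "sensitivity": tp / (tp + fn) if (tp + fn) else None,
        # Accuracy: fraction of all sequences assigned the same label.
        "accuracy": (tp + tn) / total if total else None,
    }


# Made-up labels, not data from the study.
reference = {"seq1": True, "seq2": True, "seq3": False, "seq4": False}
calls = {"seq1": True, "seq2": False, "seq3": False, "seq4": True}
metrics = screening_metrics(reference, calls)
print(metrics)
```

In this toy example the tool misses one of two SOCs and flags one of two benign sequences, so both metrics come out at 0.5.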
