Comparison of state-of-the-art error-correction coding for sequence-based DNA data storage

Abstract

A wide range of codecs with vastly different error-correction approaches have been proposed and implemented for DNA data storage to date. However, while many codecs claim to provide superior performance, no studies have systematically benchmarked codec implementations to establish the current state-of-the-art in DNA data storage. In this study, we use standardized error scenarios – both in silico and in vitro – to compare the performance of six representative codecs from the literature. We find synthetic benchmarks commonly used in the literature to be unsuitable indicators of codec performance, as our data shows that common experimental benchmarks fail to differentiate codecs under standardized conditions. Instead, we implement a comprehensive benchmark covering the major experimental parameters to assess codec performance under realistic DNA data storage conditions, while establishing important baselines for future codec development. Verifying our results with fair and standardized experiments, we demonstrate data storage at 43 EB g⁻¹ using synthesis by material deposition and 13 EB g⁻¹ using the more error-prone electrochemical synthesis, employing only existing codecs from the literature. Besides closing in on the physical limits of DNA data storage, this study thus showcases the maturity of error-correction coding and defines its current state-of-the-art.
