DNA data storage: a generative tool for data encoding motifs


Abstract

DNA possesses extremely high information density and durability. For it to become a commercially viable storage medium like magnetic tape and hard disk drives, the cost per bit has to decrease considerably and the write bandwidth has to increase. Both are governed by DNA synthesis, the process of writing data to DNA, which is currently very expensive and slow. Assembling DNA strands from motifs, i.e. short DNA sequences, is a cheaper and faster way of representing data in DNA. Each motif represents one letter of an alphabet. Trading the quaternary alphabet {A,C,T,G} for longer fragments, namely motifs, increases write bandwidth and reduces cost. The success of the underlying chemistry, specifically the assembly of motifs into polymers that faithfully represent the source binary data, is sensitive to the formation of secondary structures and depends on the motifs annealing correctly and in a unique way. In this work, we develop a mathematical framework and a method to generate a set of motifs that satisfy a predefined set of constraints regardless of the order in which they are combined. We show that motifs generated by our approach conform to the constraints with high probability, in contrast to randomly generated motifs.
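To make the idea of order-independent constraints concrete, here is a minimal Python sketch. It is not the framework developed in the article. The two constraints (a maximum homopolymer run and a GC-content range), their thresholds, and all function names are assumptions chosen for illustration. The check covers only ordered pairs of motifs, so a run that spans three or more motifs would not be caught.

```python
from itertools import product


def max_homopolymer(seq):
    """Length of the longest run of one repeated base in seq."""
    longest = run = 0
    prev = None
    for base in seq:
        run = run + 1 if base == prev else 1
        longest = max(longest, run)
        prev = base
    return longest


def gc_fraction(seq):
    """Fraction of bases in seq that are G or C."""
    return sum(base in "GC" for base in seq) / len(seq)


def satisfies(seq, max_run=3, gc_low=0.4, gc_high=0.6):
    """Hypothetical constraint set: bounded runs and balanced GC content."""
    return max_homopolymer(seq) <= max_run and gc_low <= gc_fraction(seq) <= gc_high


def pairwise_order_safe(motifs, **limits):
    """True if every ordered pair of motifs, including a motif followed
    by itself, still satisfies the constraints after concatenation."""
    return all(satisfies(a + b, **limits) for a, b in product(motifs, repeat=2))


if __name__ == "__main__":
    print(pairwise_order_safe(["ACGT", "TGCA"]))
    print(pairwise_order_safe(["AACC", "CCAA"]))
```

In this sketch the set {ACGT, TGCA} passes, while {AACC, CCAA} fails because joining AACC and CCAA creates a run of four C bases across the junction, even though each motif is acceptable on its own.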
