Design of DNA Aptamers for Lyme disease Diagnosis Combining experimental and numerical approaches

Read the full article See related articles

Discuss this preprint

Start a discussion What are Sciety discussions?

Listed in

This article is not in any list yet, why not save it to one of your lists.
Log in to save this article

Abstract

Aptamers are single stranded DNA or RNA molecules selected for their high affinity and specificity to bind target molecules, similar to antibodies. They are commonly selected through the SELEX process, which involves the iterative exposure of a random sequence library to a target and retaining the sequences showing good binding properties. To improve Lyme disease detection, we propose designing aptamers that specifically bind to the CspZ protein on the surface of Borrelia burgdorferi , the bacterium responsible for the disease.

Starting with a SELEX process consisting of thirteen rounds, from which selected in vitro sequence candidates have emerged, we aim to propose a holistic process that selects in silico new sequence candidates that are further validated experimentally. Our approach relies on 1) using Machine Learning (ML) techniques, specifically a Restricted Boltzmann Machine (RBM), to digitally replicate the last round of the SELEX process, 2) integrating insights from text analysis methods, such as word2vec and n-grams, into the RBM model trained on the final-round SELEX dataset to represent and compare newly generated sequences with in vitro candidates, 3) selecting in silico sequences with strong potential to bind to CspZ protein, 4) experimentally validating the selected in silico sequences of step 3.

Our holistic approach combines biological insights with statistical models to improve the efficiency and outcome of the SELEX process. We enhance the RBM model, designed to replicate the distribution of the final SELEX round, by integrating geometric representations of sequences, which is especially advantageous when dealing with limited datasets relative to the vast sequence space. In addition, it provides in silico sequence candidates with strong binding properties.

Article activity feed