Machine Learning-Augmented Analysis of Nano-electrochemical Sensor Data for Predictive and Quantitative Assays of Complex Biological Samples

Abstract

Analytical chemistry informs nearly every scientific, technical and business decision about what atoms, molecules and devices are and how they interact with their environment. Various workflows come together in describing those interactions (separation, identification, quantitation and classification) to inform decisions across society. Each workflow has associated equipment, infrastructure and subject matter experts who integrate results for input into those decisions. However, these existing workflows are constrained by their evolution and often require several analytical techniques to be combined before conclusions can be drawn, which increases cost and complexity and can compromise quality. This paper describes an alternative approach to addressing the many limitations of contemporary analytical chemistry. Here, machine learning is leveraged to analyze high-dimensional data derived from a nano-electrochemical sensor platform, performing classification, identification and quantitation in an integrative manner from a single-shot measurement. The approach described here for biological samples does not require sample-specific preparation or marker-specific probes and labels to identify and/or quantitate the analyte of interest, thereby liberating the physical measurement of the sample from chemicals and workflows that are tied to the underlying biochemical hypothesis. The nano-electrochemical sensor, as described in a prior publication, transduces the vibrational frequencies of multiple species in the sample into a collection of discrete electronic signatures that represent the sample in a high-dimensional space, while using only 2-4 µL of sample. The sample analysis relies on training data to identify features in the high-dimensional input dataset that are unique to the molecule or phenotype under consideration and distinct from the sample background matrix and/or a set of defined sample controls.
The molecule- or phenotype-associated features can be tracked with clinically identified disease burden in the patient or with biomolecule concentration in the biological sample. The identification of patients with dementia from a plasma test is outlined here to demonstrate the applicability of this approach to classifying phenotypes associated with complex neurological disorders in a marker-agnostic manner. The approach is also applied to a specific and quantitative assay of insulin Humalog and insulin Toujeo in a batch that comprises a mixture of the two peptides, where Toujeo differs from Humalog in only three amino acid residues. We also describe the development of an assay for IL-6 in spiked human plasma samples with this method, thereby demonstrating the applicability of this analysis paradigm to more complex samples and protein-like species.
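
The general idea of using training data to find features that separate a phenotype from defined controls can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic random numbers, the feature ranking (a standardized mean difference) and the nearest-centroid classifier are stand-ins chosen for brevity, and every name below (`select_features`, `classify`, `make_sample`, the feature indices) is hypothetical.

```python
import random
import statistics


def select_features(cases, controls, top_k):
    """Rank features by absolute standardized mean difference between groups
    and return the indices of the top_k features, in ascending order."""
    n_features = len(cases[0])
    scores = []
    for j in range(n_features):
        a = [row[j] for row in cases]
        b = [row[j] for row in controls]
        pooled_sd = ((statistics.pvariance(a) + statistics.pvariance(b)) / 2) ** 0.5
        diff = abs(statistics.fmean(a) - statistics.fmean(b))
        scores.append(diff / (pooled_sd + 1e-12))  # small constant avoids division by zero
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:top_k])


def centroid(rows, features):
    """Mean of each selected feature over a group of samples."""
    return [statistics.fmean([row[j] for row in rows]) for j in features]


def classify(sample, centroids, features):
    """Assign the label whose centroid is nearest in the selected-feature space."""
    point = [sample[j] for j in features]

    def sq_dist(center):
        return sum((p - q) ** 2 for p, q in zip(point, center))

    return min(centroids, key=lambda label: sq_dist(centroids[label]))


# Synthetic stand-in for sensor output: 40 signal channels per sample, of which
# two (an arbitrary assumption) carry the phenotype-associated shift.
random.seed(0)
N_FEATURES = 40
INFORMATIVE = (3, 17)


def make_sample(shift):
    row = [random.gauss(0.0, 1.0) for _ in range(N_FEATURES)]
    for j in INFORMATIVE:
        row[j] += shift
    return row


controls = [make_sample(0.0) for _ in range(30)]
cases = [make_sample(4.0) for _ in range(30)]

selected = select_features(cases, controls, top_k=2)
centroids = {
    "case": centroid(cases, selected),
    "control": centroid(controls, selected),
}
```

A new measurement would then be labeled with `classify(new_sample, centroids, selected)`. For a quantitative assay, the same selected features could instead be regressed against known concentrations, which this sketch does not attempt.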
