Charting the Undiscovered Metabolome with Synthetic Multiplexing
Abstract
In untargeted metabolomics, reference MS/MS libraries are essential for structural annotation, yet they currently explain only 6.9% of the more than 1.7 billion MS/MS spectra in public repositories. We hypothesized that many unannotated features arise from simple, biologically plausible transformations of endogenous and exposure-derived compounds. To test this, we created a reference resource by synthesizing over 100,000 compounds using multiplexed reactions that mimic such biochemical transformations; 91% of the synthesized compounds are absent from existing structural databases. Enabled by improvements to the computational infrastructure for pan-repository-scale MS/MS comparisons, searching this biologically inspired MS/MS library increased the overall reference-based match rate by 17.4% in relative terms, yielding over 60 million new matches and raising the global pan-repository MS/MS annotation rate to 8.1%. By enabling structural hypotheses for previously uncharacterized MS/MS data, this framework expands the detectable biochemical landscape across human, animal, plant, and microbial systems and reveals previously undescribed metabolites, such as ibuprofen-carnitine and 5-ASA-phenylpropionic acid conjugates, that arise from drug–host and host–microbiome co-metabolism.
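The abstract reports two annotation rates (6.9% before, 8.1% after) and a 17.4% increase in match rate. A minimal arithmetic sketch, assuming the 17.4% figure is the relative change between those two rates, shows that the numbers are mutually consistent. The variable names are illustrative and do not come from the preprint.

```python
# Annotation rates quoted in the abstract, in percent of public MS/MS spectra.
baseline_rate = 6.9   # explained by existing reference libraries
new_rate = 8.1        # after adding the synthesized reference library

# Absolute gain in percentage points.
absolute_gain = new_rate - baseline_rate

# Relative gain, expressed as a percentage of the baseline rate.
# Assumption: this is how the reported 17.4% increase is defined.
relative_gain = absolute_gain / baseline_rate * 100

print(f"absolute gain: {absolute_gain:.1f} percentage points")
print(f"relative gain: {relative_gain:.1f}%")
```

Under this reading, the gain is about 1.2 percentage points in absolute terms, which corresponds to roughly a 17.4% relative improvement over the 6.9% baseline.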