Confronting spurious evaluations of computational methods in small molecule mass spectrometry

Abstract

Mass spectrometry-based metabolomics detects thousands of small molecule-associated signals in biological samples, but the vast majority cannot be structurally identified. Mounting interest in this metabolomic “dark matter” has spurred the development of dozens of machine-learning models for structural annotation of small molecules from their MS/MS spectra. Here, we expose a fundamental flaw in the longstanding paradigm by which these models have been evaluated. We show that a trivial machine-learning model can achieve strong performance on existing benchmarks despite wholly discarding the information contained within MS/MS spectra themselves, and without using any other auxiliary information. This performance arises because compounds with reference MS/MS spectra are structurally distinct from those found in generic chemical databases, and machine-learning models can exploit this dissimilarity by learning to predict whether a compound is likely to have been measured by MS/MS. However, we show that this confound can be overcome by using a generative model to sample decoy structures that are chemically indistinguishable from those found in reference MS/MS libraries. The resulting benchmark cannot be solved without attending to MS/MS spectra, and therefore provides an epistemologically valid framework to evaluate computational methods for the annotation of MS/MS spectra from small molecules.
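The confound described above can be illustrated with a toy simulation. The sketch below is not the authors' model or benchmark: the single numeric "descriptor", the Gaussian distributions, and all constants are assumptions invented for illustration, and the matched decoys are simply drawn from the library distribution as a stand-in for sampling from a generative model. It shows only the logic of the argument: a scorer that never sees a spectrum can rank the true compound first when decoys come from a structurally different database, and falls to roughly chance when decoys are indistinguishable from library compounds.

```python
import random
from statistics import mean

SD = 0.5  # toy spread of a single made-up structural descriptor (assumption)


def fit_direction(rng, library_mean, database_mean, n=500):
    """'Train' the spectrum-blind model: learn which way the descriptor
    separates library compounds from generic database compounds."""
    library = [rng.gauss(library_mean, SD) for _ in range(n)]
    database = [rng.gauss(database_mean, SD) for _ in range(n)]
    return mean(library) - mean(database)


def top1_accuracy(rng, direction, library_mean, decoy_mean,
                  n_tasks=2000, n_decoys=9):
    """Fraction of retrieval tasks where the true compound outscores every
    decoy. The score uses the descriptor only; no spectrum is ever consulted."""
    hits = 0
    for _ in range(n_tasks):
        true_score = direction * rng.gauss(library_mean, SD)
        decoy_scores = [direction * rng.gauss(decoy_mean, SD)
                        for _ in range(n_decoys)]
        hits += all(true_score > s for s in decoy_scores)
    return hits / n_tasks


rng = random.Random(0)
LIBRARY_MEAN, DATABASE_MEAN = 2.0, 0.0  # hypothetical values

direction = fit_direction(rng, LIBRARY_MEAN, DATABASE_MEAN)

# Decoys drawn from a generic database that differs from the library.
biased = top1_accuracy(rng, direction, LIBRARY_MEAN, decoy_mean=DATABASE_MEAN)

# Decoys drawn from the same distribution as library compounds.
matched = top1_accuracy(rng, direction, LIBRARY_MEAN, decoy_mean=LIBRARY_MEAN)

print(f"top-1 accuracy, database decoys: {biased:.2f}")
print(f"top-1 accuracy, matched decoys:  {matched:.2f}")
```

With nine decoys per task, chance-level top-1 accuracy is 0.10, which is the level the matched-decoy setting should approach in this toy setup.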
