Chemistry-based vectors map the chemical space of natural biomes from untargeted mass spectrometry data
Abstract
Untargeted metabolomics can comprehensively map the chemical space of a biome, but is limited by low annotation rates (<10%). We used chemistry-based vectors, consisting of molecular fingerprints or chemical compound classes predicted from mass spectrometry data, to characterize compounds and samples. These chemical characteristics vectors (CCVs) estimate the fraction of compounds with specific chemical properties in a sample. Unlike aligned MS1 data with intensity information, CCVs incorporate actual chemical properties of compounds, offering deeper insights into sample comparisons. Using CCVs, we identified key compound classes that differentiate biomes, such as ethers, which are enriched in environmental biomes, and steroids, which are enriched in animal host-related biomes. In biomes with greater variability, CCVs revealed the key compound classes that drive clustering, such as organonitrogen compounds in animal distal gut and lipids in animal secretions. CCVs thus enhance the interpretation of untargeted metabolomic data, providing a quantifiable and generalizable understanding of the chemical space of natural biomes.
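As an illustration of the idea that a CCV estimates the fraction of compounds in a sample carrying a given chemical property, the following is a minimal sketch, not the authors' implementation. It assumes that each detected compound already has a vector of predicted probabilities (one per fingerprint bit or compound class) and that a sample-level CCV is the unweighted mean of those vectors over the compounds detected in the sample. The function name `sample_ccv`, the toy values, and the choice of an unweighted mean are all assumptions made for illustration; the abstract does not specify the aggregation.

```python
import numpy as np


def sample_ccv(compound_probs, present):
    """Hypothetical sample-level CCV: mean predicted property probability
    over the compounds detected in one sample.

    compound_probs: (n_compounds, n_properties) array of predicted
        probabilities, e.g. fingerprint bits or compound classes (assumed input).
    present: (n_compounds,) boolean mask, True where the compound was
        detected in the sample (assumed input).
    """
    compound_probs = np.asarray(compound_probs, dtype=float)
    present = np.asarray(present, dtype=bool)
    if not present.any():
        # No detected compounds: return an all-zero vector by convention.
        return np.zeros(compound_probs.shape[1])
    # Assumption: every detected compound contributes equally (no intensity weighting).
    return compound_probs[present].mean(axis=0)


# Toy data: three compounds, two properties (say "ether" and "steroid").
probs = np.array([
    [0.9, 0.1],
    [0.2, 0.8],
    [0.6, 0.4],
])
ccv = sample_ccv(probs, [True, False, True])
```

Under these assumptions, each entry of `ccv` can be read as the estimated fraction of detected compounds in the sample that have the corresponding property, so samples with different numbers of detected compounds remain comparable on the same scale.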