Tree-based quantification infers proteoform regulation in bottom-up proteomics data
Abstract
Quantitative readout is essential in proteomics, yet current bioinformatics methods lack a framework to handle the inherently multi-level nature of the data (fragments, MS1 isotopes, charge states, modifications, peptides and genes). We present AlphaQuant, which introduces tree-based quantification. This approach organizes quantitative data into a hierarchical tree across these levels. It allows differential analyses at the fragment and MS1 levels, recovering up to 50-fold more regulated proteins than a state-of-the-art approach. Using gradient boosting on tree features, we address the largely unsolved challenge of scoring quantification accuracy, as opposed to precision. Our method clusters peptides with similar quantitative behavior, providing a new approach to the protein grouping problem and enabling identification of regulated proteoforms directly from bottom-up data. Combining this with deep learning classification, we infer phosphopeptides from proteome data alone and validate our findings with EGFR stimulation data. We then describe proteoform diversity across mouse tissues, revealing distinct patterns of post-translational modifications and alternative splicing.
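As a rough illustration of the hierarchical tree described above, the following Python sketch stores log2 fold changes on fragment-level leaves and propagates them upward through precursor, peptide and gene levels. The `Node` class, the `aggregate` function, the level names and the median rule are all assumptions made for illustration; they are not AlphaQuant's API, and the abstract does not specify how the tool combines values between levels.

```python
from dataclasses import dataclass, field
from statistics import median
from typing import List, Optional


@dataclass
class Node:
    """One node in a hypothetical quantification tree."""
    name: str
    level: str                      # e.g. "fragment", "precursor", "peptide", "gene"
    log2fc: Optional[float] = None  # set directly on leaves only
    children: List["Node"] = field(default_factory=list)


def aggregate(node: Node) -> float:
    """Fill in log2fc bottom-up: each parent takes the median of its children.

    The median is an assumed, deliberately simple aggregation rule.
    """
    if node.children:
        node.log2fc = median(aggregate(child) for child in node.children)
    if node.log2fc is None:
        raise ValueError(f"leaf {node.name!r} has no log2 fold change")
    return node.log2fc


def fragments(prefix: str, values: List[float]) -> List[Node]:
    """Helper that builds fragment-level leaves from made-up fold changes."""
    return [Node(f"{prefix}_f{i}", "fragment", v) for i, v in enumerate(values)]


# Toy tree with invented numbers: one gene, two peptides, three precursors.
peptide_1 = Node("PEPTIDEA", "peptide", children=[
    Node("PEPTIDEA_2+", "precursor", children=fragments("a2", [1.0, 1.5, 0.5])),
    Node("PEPTIDEA_3+", "precursor", children=fragments("a3", [0.75, 1.25])),
])
peptide_2 = Node("PEPTIDEB", "peptide", children=[
    Node("PEPTIDEB_2+", "precursor", children=fragments("b2", [-0.25, 0.0, 0.25])),
])
gene = Node("GENE1", "gene", children=[peptide_1, peptide_2])

aggregate(gene)
for node in (peptide_1, peptide_2, gene):
    print(node.level, node.name, node.log2fc)
```

In this toy tree the two peptides of the same gene end up with different fold changes. Disagreement of this kind between peptides is what a peptide-clustering step, such as the one the abstract describes for proteoform inference, would need to detect; the sketch itself performs no clustering or statistical testing.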