High-throughput proteomics of nanogram-scale samples with Zeno SWATH MS

Abstract

The ability to record proteomes in high throughput and at high quality has opened new avenues for biomedical research, drug discovery, systems biology, and clinical translation. However, high-throughput proteomic experiments often require large sample amounts and can be less sensitive than conventional proteomic experiments. Here, we introduce and benchmark Zeno SWATH MS, a data-independent acquisition technique that employs linear ion trap pulsing (Zeno trap pulsing) to increase the sensitivity of high-throughput proteomic experiments. We demonstrate that when combined with fast micro- or analytical-flow-rate chromatography, Zeno SWATH MS increases protein identification with low sample amounts. For instance, using 20 min micro-flow-rate chromatography, Zeno SWATH MS identified more than 5000 proteins consistently, and with a coefficient of variation of 6%, from a 62.5 ng load of human cell line tryptic digest. Using 5 min analytical-flow-rate chromatography (800 µl/min), Zeno SWATH MS identified 4907 proteins from a triplicate injection of 2 µg of a human cell lysate, or more than 3000 proteins from a 250 ng tryptic digest. Zeno SWATH MS hence facilitates sensitive high-throughput proteomic experiments with low sample amounts, mitigating the current bottlenecks of high-throughput proteomics.

Article activity feed

  1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

    Learn more at Review Commons


    Reply to the reviewers

    Please note that we have uploaded a PDF with the point-by-point reply.

  2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

    Referee #2

    Evidence, reproducibility and clarity

    Summary:

    Wang et al. present an evaluation of a new generation of time-of-flight-based mass spectrometer that increases the fraction of ions actually used for the detection of peptide analytes. This boosts the sensitivity of the ZenoTOF 7600 system compared to the same instrument with the duty-cycle-enhancing Zeno trap module disabled and, in some of the comparisons, compared to the previous-generation instrument from the same vendor.

    The authors position the MS acquisition technique as particularly suitable in combination with medium- (micro-) and high- ('analytical') flow, high-throughput methods, where higher flow rates (vs. conventional nanoflow LC-MS) allow rapid sample turnover and high throughput yet limit the efficiency of electrospray and ion transfer into the MS system, and therefore call for enhanced sensitivity of the MS system employed for detection. The suitability of such an MS system for very low input amounts, as encountered e.g. in emerging single-cell proteomic workflows that typically employ nanoflow chromatography, was thus not part of the study.

    Accordingly, a medium-throughput (micro-flow) and a very-high-throughput ('analytical'-flow) LC method were screened on the three MS (parameter) setups using human cell lysate digests typically utilized in such technical evaluations. Commendably, the authors further extended their analysis (for the new instrument) to additional sample types of clinical and broader biological interest, spanning different levels of complexity and dynamic range of the contained protein analytes.

    In addition, the authors also performed a controlled ratio 2-species mixture experiment which allows detailed benchmarking of proteome coverage as well as the quality of protein quantification in a known differential comparison for the medium throughput (micro-flow) method.

    The data quite convincingly demonstrate an increased sensitivity of the instrument, based on similar identification performance in DIA bottom-up proteomics from ca. 3- to 8-fold lower input peptide mass. However, I see a number of shortcomings, mainly in the presentation and in part in the completeness of the work, with specific comments below.

    Major comments:

    • Are the key conclusions convincing?

      • The concluded 10x sensitivity increase overstates the observed numbers (5x-8x). In addition, the authors should at least discuss changes other than the Zeno trap that were made between the Zeno SWATH and non-Zeno SWATH DIA setups, particularly changes in accumulation times per m/z range: Zeno SWATH accumulates ~42% longer per cycle spanning the same m/z range (85 vs. 60 windows with 11 ms per window) in the micro-flow method set and ~18% longer in the high-flow method set (same window number but 13 ms vs. 11 ms dwell time per window). This should be discussed as one of the optimizations/factors contributing to the increased sensitivity observed in Zeno SWATH measurements vs. conventional SWATH. On that note, it was unclear to me when and where the 40-variable-window SWATH method mentioned in the Methods was used and where its settings can be found.
      • Since injected material is a critical parameter here, it would be good if it was mentioned also with the key conclusion on the increased number of confidently quantified peptides in microflow (based on the 2-species controlled quantity experiment).
      • The conclusion 'increasing protein identification numbers through the use of analytical-flow-rate chromatography' does not capture the observed data. The use of analytical flow rates does not itself increase protein identification numbers; rather, the enhanced sensitivity enables high protein identification numbers / proteome coverage to be maintained despite the use of analytical-flow chromatography.
      • In titration curve experiments like these, probing proteome coverage from relatively small sample amounts, special care
    • Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether?

      • 'Zeno SWATH increases protein identification in complex samples 5- to 10-fold when compared to current SWATH acquisition methods on the same instrument' - This is not shown at any point. The data show a 5- to 8-fold decrease in required input amounts (an increase in sensitivity), not a multiplication of protein identification rates by that factor.
    • Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation.

      • Figure 1f, Supplemental Figure 1b, Figure 3 and Supplemental Figure 3 lack data for the Zeno SWATH method's performance at higher concentrations. Given the clear, continuous trend of significant enhancement of proteomic depth across the three highest concentrations sampled by the Zeno SWATH method, I miss an assessment of the upper limit of proteome coverage achievable by the new platform when input material is not limited, or at least an explanation of why injecting more is not advisable on the ZenoTOF 7600 system. It is clear that the region of interest is the lower loads, where sensitivity gains are most pronounced, but given the strong trend in IDs per ng injected in the sampled range and the discrepant range sampled by the non-Zeno method, I feel there is a gap in the dataset, and the upper ceiling of proteome coverage could be mapped out more thoroughly (at least for human cell lysate and possibly human plasma, where trends appear most (log2-)linear).
      • Similarly, unless constrained for technical or practical reasons, I would suggest finding the ceiling for achievable proteome depth in analytical flow (4, 8 µg?).
    • Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments.

      • All of these should be re-injections of existing samples on these MS setups and thus a minor effort, provided instrument availability (<1 week) and rapid re-analysis via DIA-NN.
    • Are the data and the methods presented in such a way that they can be reproduced?

      • The raw data have not been deposited to a public repository. Reproducibility of the study would benefit significantly from upload/sharing of the raw data (including search results and spectral libraries with log files of their creation), e.g. via ProteomeXchange/PRIDE.
      • If any specific software or firmware versions are required to perform the measurements on the ZenoTOF instruments on the market today, these versions and prospective release dates should be included, or the accessibility of these settings commented on.
    • Are the experiments adequately replicated and statistical analysis adequate?

      • Figure 3 and Supplemental Figure 3 need a clarification in the legend as to the nature and origin of the ID numbers (mean? number of replicates? Add error bars if possible).
      • The usage of DIA-NN for data analysis is somewhat unclear, in particular in Methods/Spectral libraries: "For the analysis of plasma samples, a project-independent public spectral library [29] was used as described previously [15]. The Human UniProt [30] isoform sequence database (UP000005640, 19 October 2021) was used to annotate the library and the processing was performed using the MBR mode in DIA-NN." The authors should address in a revised version whether the identification numbers reported stem from two-pass or single-pass analysis, i.e. whether the match-between-runs (MBR) feature implemented since DIA-NN v1.8 was enabled, and whether all runs spanning different injection amounts were co-analyzed, such that a targeted library containing precursors identified in high-load samples in the first-pass analysis was then queried in low-load samples. In other words, are the low-load IDs independent of the high-load IDs? If not (i.e. the different loads were co-analyzed with MBR), what proteome coverage do the low sample loads reach bona fide, without the 'guidance' of high-load IDs?

    Side note: Turning this around, could a high-load injection, e.g. from a pool of limited-amount samples, serve as a guiding element in an MBR-enabled analysis of a large cohort with limited sample amounts available per biological condition?
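
The accumulation-time figures quoted in the first major comment follow from simple arithmetic. A minimal sketch in Python, assuming that the per-cycle accumulation time is simply the number of windows multiplied by the dwell time per window (the function names are illustrative and not taken from the manuscript):

```python
def cycle_accumulation_ms(n_windows: int, dwell_ms: float) -> float:
    """Total accumulation time per cycle, assuming windows x dwell time."""
    return n_windows * dwell_ms


def percent_longer(new: float, old: float) -> float:
    """Relative increase of `new` over `old`, in percent."""
    return 100.0 * (new - old) / old


# Micro-flow method set: 85 vs. 60 windows, 11 ms per window in both.
micro_zeno = cycle_accumulation_ms(85, 11)
micro_swath = cycle_accumulation_ms(60, 11)
micro_gain = percent_longer(micro_zeno, micro_swath)

# High-flow method set: same window number, 13 ms vs. 11 ms dwell time,
# so the window count cancels and only the dwell times matter.
high_gain = percent_longer(13, 11)

print(f"micro-flow: {micro_gain:.1f}% longer accumulation per cycle")
print(f"high-flow:  {high_gain:.1f}% longer accumulation per cycle")
```

Rounded to whole percent, these values match the ~42% and ~18% quoted above. The sketch ignores any per-window overhead, which is not specified in the review.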

    Minor comments:

    • Specific experimental issues that are easily addressable.

      • The authors state the impact on the dynamic range of identification by comparing ID sets against an external dataset of presumed cellular concentrations. In addition, I would suggest comparing the dynamic range of the quantitative values observed in the available data, which should provide a direct assessment of the dynamic range of quantification of the two methods.
    • Are prior studies referenced appropriately?

      • The statement that conventional DIA methods rely on nanoflow chromatography (p. 3, paragraph 3) is not accurate, as there are previous implementations of data-independent acquisition MS with microflow separations, in part from the authors' own group and referred to later in the text.

      • Vowinckel, J. et al. Cost-effective generation of precise label-free quantitative proteomes in high-throughput by microLC and data-independent acquisition. Sci. Rep. 8, 4346 (2018).

      • Bruderer, R. et al. Analysis of 1508 plasma samples by capillary-flow data-independent acquisition profiles proteomics of weight loss and maintenance. Mol. Cell Proteom. 18, 1242-1254 (2019).

    It is correct that most early implementations of DIA-MS utilized nanoflow separations for reasons of sensitivity and proteome coverage, but DIA as such is agnostic to chromatographic flow rate, and the concept of combining microflow LC with DIA is not new, yet powerful, as demonstrated previously by the authors and others and once again here.

      • P. 3 paragraph 3: 'Moreover, the increased sensitivity of DIA methods has facilitated applications in large-scale proteomics, including system-biology studies in various model organisms, disease states, and species [5-9]' - Include Ref. 4, where improved sensitivity of DIA was demonstrated (at proteomic breadth).

    • Are the text and figures clear and accurate?

      • Text and Figures need to be edited for typos, language, and clarity/accuracy.
    1. Abstract: 'Zeno SWATH increases protein identification in complex samples 5- to 10-fold when compared to current SWATH acquisition methods on the same instrument' - This is not shown at any point. The data show a 5- to 8-fold drop in required input amounts (an increase in sensitivity), not a multiplication of protein identification rates.
    2. P. 4 paragraph 3: Use the terms 'consensus' or 'shared' identifications or similar to refer to the proteins identified in all 3 replicates, rather than 'reproducible', when discussing the reproducibility of peptide and protein quantification (in contrast to reproducibility of identification).
    3. P.3 paragraph 2 'selects and fragments multiple charge ions' -> multiply charged (?)
    4. P. 4 p. 1 'leading to under-detection' please clarify (leading to partial ion usage and limited sensitivity?)
    5. P. 6 paragraph 3 'The gain in identification number of Zeno SWATH versus SWATH is mostly explained by an increased dynamic range: i.e. more low-abundance proteins are detected' - Reformulate/clarify: Is increased dynamic range of identifications against external quantities an explanation or perhaps simply the increased sensitivity with improved duty cycle?
    6. Term 'active gradient' unclear. An inactive gradient is isocratic flow. Omit 'active'. Isocratic/other portions are overhead.
    7. Figure 1 panel a): the sub-panel labelling scheme a-d) is redundant with the panel labels of the rest of the figure; use an alternative labelling scheme within panel a). Panel a) further contains illegibly small fonts and should be edited for legibility.
    8. Revisit y-axis labels. Example: Fig. 1f) 'Precursors Identificaiton' -> Precursors identified/Precursor identifications. Correct throughout manuscript
    9. ID bar graphs in all Figures: Cumulative IDs shade of grey is not properly visible, suggest alternate color scheme or add black color outline to the bars
    10. Figure 1 e) legend 'along gradient length' -> gradient time / retention time
    11. Figure 1 d) too small, trend lines mentioned in text invisible in graph. Boxplots very small.
    12. There are three different terms used for the high-throughput method (analytical-flow, high-flow, and another one); please align where possible for clarity (i.e. choose two names for the two methods throughout the manuscript).
    • Do you have suggestions that would help the authors improve the presentation of their data and conclusions?

      • The authors may consider adding a short explanation of the term 'dynamic range coverage of identification' to contrast it with a direct assessment of the dynamic range of the quantitative values observed in this study.
      • 2-species controlled experiment: The discrepancy between observed and true mixing ratios suggests that the data were scaled during the analysis, which, with these mixture ratios, tends to distort the accuracy (i.e. it generates an offset of the observed from the true ratio; on a log scale, that is very likely not a pipetting error). In other words, you may want to evaluate the raw quantitative ratios (without any normalization/scaling applied), which should be more reflective of the true/manual pipetting ratios, given that the normalization strategy is incompatible with certain species-mix scenarios (compare Supplementary Figure 1a). Note to the editor(s): This will not affect the clear benefit of Zeno trap usage demonstrated by the 2-species benchmark as is, but can be considered a minor yet recommended improvement (thus listed here).
      • The 2-species controlled experiment can reveal more information than is currently extracted, and I would recommend showing Zeno SWATH and SWATH xy scatters, including count-scaled density distributions of the observed ratios, side by side. This would give a deeper understanding of the large impact of the Zeno SWATH method. Also, I believe I have not seen any instrument to date delivering precise quantification over as broad a dynamic range as can be surmised from Fig. 1d), which might be worthwhile highlighting.
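
The ratio offset described in the comment on the 2-species experiment can be illustrated with a toy calculation. The sketch below uses made-up intensities (not data from the manuscript) and assumes total-intensity scaling as the normalization; which normalization was actually applied in the study is not stated in this review, so this only shows the general effect.

```python
import math
from statistics import median

# Hypothetical toy intensities for two samples, A and B (illustration only).
# Human proteins are mixed 1:1 (true log2 ratio 0), yeast 4:1 (true log2 ratio 2).
human_a = [100.0] * 50
human_b = [100.0] * 50
yeast_a = [100.0] * 50
yeast_b = [400.0] * 50


def median_log2_ratio(a, b, scale=1.0):
    """Median log2(B/A) after multiplying sample B by `scale`."""
    return median(math.log2(scale * y / x) for x, y in zip(a, b))


# Assumed normalization: scale sample B so its summed intensity equals sample A's.
scale = sum(human_a + yeast_a) / sum(human_b + yeast_b)

raw_human = median_log2_ratio(human_a, human_b)
raw_yeast = median_log2_ratio(yeast_a, yeast_b)
norm_human = median_log2_ratio(human_a, human_b, scale)
norm_yeast = median_log2_ratio(yeast_a, yeast_b, scale)

print(f"raw:        human {raw_human:+.2f}, yeast {raw_yeast:+.2f}")
print(f"normalized: human {norm_human:+.2f}, yeast {norm_yeast:+.2f}")
```

In this toy case the raw ratios sit at the true values, whereas scaling shifts both species by the same log2 offset: the separation between species is preserved, but neither matches the pipetted ratio. This is the kind of offset that evaluating unnormalized ratios would expose.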

    Significance

    Wang et al. describe a technical advance in ion usage and sensitivity based on an ion-trap device that stores and focuses ions for TOF-based bottom-up proteomics measurements. The study demonstrates improved sensitivity relative to previous-generation instrumentation and also explores the impact of the specific trap device relative to the general improvements of the remaining MS system. The work outlines a route towards high-coverage proteomics at very high throughput and robustness, as desirable in clinical proteomics and prospective personalized medicine approaches. While not all sample types of interest are limited to the amounts where the strongest improvements are seen in the presented data, large-scale studies across expansive cohorts will likely become more practical and realistic due to reduced instrument contamination at reduced loads, and further applications beyond those discussed in the manuscript will become feasible on the newer-generation instrument.

    The improved ZenoTOF system and SWATH method follow a series of innovations in mass spectrometry instrumentation, most notably and relatedly the drastic improvement of ion utilization by storage, e.g. in a trapped ion mobility device earlier in the ion stream, where, beyond an accumulation-based boost of sensitivity, ion mobility is assessed as a further biophysical property in addition to the conventional m/z, as reviewed recently (doi: 10.1016/j.mcpro.2021.100138). While these developments have targeted, and culminated in, low-flow, ultra-high-sensitivity applications such as single-cell proteomics, the present study takes a different angle, towards higher-throughput measurements from sample amounts that are significantly larger than a single cell but significantly lower than historically required. Those historical requirements were prohibitive for a range of applications that are now easier to accomplish thanks to this and related work of the authors and others. The presented research appears to be of broad relevance and interest to the scientific community interested in protein abundance pattern analysis, in particular in larger (clinical) cohorts. Furthermore, the performance metrics on proteomic depth from human cell lysate digests will likely allow researchers with analytical questions other than those exemplified in the manuscript to extrapolate the suitability of the ZenoTOF and Zeno SWATH for their respective analytical targets.

    Reviewer Field of expertise/background:

    Quantitative proteomics. DIA mass spectrometry method & algorithm development & heavy usage. Protein Biochemistry. Molecular Biology.

  3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

    Referee #1

    Evidence, reproducibility and clarity

    In this manuscript, Wang et al. benchmark the new ZenoTOF with analytical- and micro-flow setups and show impressive numbers of proteins identified and quantified. The paper is well written, and I have only a few minor comments:

    1. Figure 1: many of the panels are hard to read, especially 1a.
    2. Figure 1d. can the human amount not be normalised to log2=0?
    3. Please provide in the legend the bin size for the figure in 1e.
    4. Page 10 top: did SWATH identify more proteins than Zeno SWATH in plasma? There is something wrong as the figure shows something else. Also: That sentence in brackets is confusing.
    5. Typo: page 10: respectively.
    6. Please add raw data to PRIDE or a similar repository.

    Significance

    This is an impressive new technology that has been benchmarked by the Ralser group. It outperforms current state-of-the-art approaches.

    The primary audience is the proteomics community.

    My expertise is proteomics and quantitative mass spectrometry. I am well qualified to review this paper.