Massively parallel interrogation of protein fragment secretability using SECRiFY reveals features influencing secretory system transit

Abstract

While transcriptome- and proteome-wide technologies to assess processes in protein biogenesis are now widely available, we still lack global approaches to assay post-ribosomal biogenesis events, in particular those occurring in the eukaryotic secretory system. We here develop a method, SECRiFY, to simultaneously assess the secretability of >10⁵ protein fragments by two yeast species, S. cerevisiae and P. pastoris, using custom fragment libraries, surface display and a sequencing-based readout. Screening human proteome fragments with a median size of 50–100 amino acids, we generate datasets that enable datamining into protein features underlying secretability, revealing a striking role for intrinsic disorder and chain flexibility. The SECRiFY methodology generates sufficient amounts of annotated data for advanced machine learning methods to deduce secretability patterns. The finding that secretability is indeed a learnable feature of protein sequences provides a solid base for application-focused studies.

Article activity feed

  1. SciScore for 10.1101/241349:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    Institutional Review Board Statement: not detected.
    Randomization: not detected.
    Blinding: not detected.
    Power Analysis: not detected.
    Sex as a biological variable: not detected.

    Table 2: Resources

    No key resources detected.


    Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


    Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
    Arguably, our method also has its limitations. In the current SECRiFY setup, secretability was measured in the sequence context of the α mating factor prepro sequence at the N-terminus, and the Sag1 cell wall protein at the C-terminus. While results from our and other labs have indicated that for several single proteins, display efficiency correlates with relative secretion levels, it cannot be excluded that, at least for certain fragments, both the leader sequence and the roughly 300-amino-acid Sag1 anchor might differentially influence fragment folding, solubility, or stability. In E. coli, fusion to large proteins such as SUMO, the T. harzianum cellulose binding domain (CBD), or to maltose binding protein (MBP) is an often used strategy to promote ‘passenger solubilization’, although again, effects vary depending on the protein [66,67]. Considering the vectorial nature of translation, a C-terminal fusion, as is the case in our setup, is nevertheless generally deemed less perturbing than an N-terminal fusion, although this is not absolute. Sag1 is also a GPI-anchored protein, affecting the entry pathway into the ER [68–70]. Similarly, the prepro leader sequence, with its multi-step processing and preference for posttranslational translocation [71–73], may bias secretability of certain fragments. It remains to be determined whether similar patterns will emerge with different secretory leaders, anchors, promoters, untranslated regions, or growth conditions. Display also imposes limitations on...

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.