Structural Modeling of the TMPRSS Subfamily of Host Cell Proteases Reveals Potential Binding Sites

Abstract

The transmembrane protease serine (TMPRSS) subfamily has at least eight members with known protein sequence: TMPRSS2, TMPRSS3, TMPRSS4, TMPRSS5, TMPRSS6, TMPRSS7, TMPRSS9, TMPRSS11, TMPRSS12 and TMPRSS13. A majority of these TMPRSS proteins have key roles in human hemostasis and also promote certain pathologies, including several types of cancer. In addition, TMPRSS proteins have been shown to facilitate the entry of respiratory viruses into human cells; most notably, TMPRSS2 and TMPRSS4 activate the spike protein of the SARS-CoV-2 virus. Despite the wide range of functions that these proteins have in the human body, none of them has been successfully crystallized. The lack of structural data has significantly hindered efforts to identify potential drug candidates with high selectivity for these proteins. In this study, we present homology models for all members of the TMPRSS family, including all known isoforms (the homology model of TMPRSS2 is not included in this study, as it has been previously published). The atomic coordinates of all homology models have been refined and equilibrated through molecular dynamics simulations. The structural data revealed potential binding sites for all TMPRSS proteins, as well as key amino acids that can be targeted for drug selectivity.

Article activity feed

  1. SciScore for 10.1101/2021.06.15.448583:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    Software and Algorithms
    Sentences | Resources
    Homology Modeling: The amino acid sequences of the 14 proteins (TMPRSS2, TMPRSS3, TMPRSS4, TMPRSS5, TMPRSS6, TMPRSS7, TMPRSS9, TMPRSS11A, TMPRSS11B, TMPRSS11D, TMPRSS11E, TMPRSS11F, TMPRSS12, TMPRSS13) were obtained from the UniProt database (accession numbers: O15393, P57727, Q9NRS4, Q9H3S3, Q9DBI0, Q7RTY8, Q7Z410, Q6ZMR5, Q86T26, O60235, Q9UL52, Q6ZWK6, Q86WS5, Q9BYE2, respectively), and the crystal structures with high sequence identity, available in the Protein Data Bank (PDB), were retrieved through the BLASTp algorithm.
    BLASTp
    suggested: (BLASTP, RRID:SCR_001010)
    All the crystal structures were truncated to only their trypsin domains, which were identified through the Pfam database.
    Pfam
    suggested: (Pfam, RRID:SCR_004726)
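
The "respectively" in the homology-modeling sentence above pairs each of the 14 protein names with one accession, in the order listed. A minimal Python sketch of that pairing follows. The helper names `pair_accessions` and `fasta_url` are hypothetical, and the REST URL pattern is an assumption about the UniProt website, not something stated in the manuscript.

```python
# Protein names and accessions, copied in the order given in the quoted sentence.
PROTEINS = [
    "TMPRSS2", "TMPRSS3", "TMPRSS4", "TMPRSS5", "TMPRSS6", "TMPRSS7",
    "TMPRSS9", "TMPRSS11A", "TMPRSS11B", "TMPRSS11D", "TMPRSS11E",
    "TMPRSS11F", "TMPRSS12", "TMPRSS13",
]
ACCESSIONS = [
    "O15393", "P57727", "Q9NRS4", "Q9H3S3", "Q9DBI0", "Q7RTY8",
    "Q7Z410", "Q6ZMR5", "Q86T26", "O60235", "Q9UL52", "Q6ZWK6",
    "Q86WS5", "Q9BYE2",
]


def pair_accessions(names, accessions):
    """Pair names with accessions positionally, failing loudly on a length mismatch."""
    if len(names) != len(accessions):
        raise ValueError(
            f"{len(names)} names but {len(accessions)} accessions; "
            "a positional ('respectively') pairing needs equal lengths"
        )
    return dict(zip(names, accessions))


def fasta_url(accession):
    """Build a FASTA download URL for one accession.

    Assumption: the UniProt REST endpoint follows this pattern; the manuscript
    does not say how the sequences were downloaded.
    """
    return f"https://rest.uniprot.org/uniprotkb/{accession}.fasta"


ACCESSION_BY_PROTEIN = pair_accessions(PROTEINS, ACCESSIONS)

if __name__ == "__main__":
    # Print the name-to-accession table with the assumed download URL.
    for name, acc in ACCESSION_BY_PROTEIN.items():
        print(f"{name}\t{acc}\t{fasta_url(acc)}")
```

Checking the two list lengths before zipping guards against a dropped or duplicated entry, which would otherwise silently shift every later pairing.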

    Results from OddPub: Thank you for sharing your data.


    Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • No funding statement was detected.
    • No protocol registration statement was detected.

    Results from scite Reference Check: We found no unreliable references.


    About SciScore

    SciScore is an automated tool designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy-to-digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.