Maintenance of copy number variation at the human salivary agglutinin gene (DMBT1) by balancing selection driven by host-microbe interactions

Abstract

Most genetic variation in humans occurs in a pattern consistent with neutral evolution, but a small subset is maintained by balancing selection. Identifying loci under balancing selection is important not only for understanding the processes explaining variation in the genome, but also to identify loci with alleles that affect response to the environment and disease. Several genome scans using genetic variation data have identified the 5’ end of the DMBT1 gene as a region undergoing balancing selection. DMBT1 encodes the pattern-recognition glycoprotein DMBT1, also known as SALSA, gp340 or salivary agglutinin. It binds to a wide variety of pathogens through a tandemly-arranged scavenger receptor cysteine-rich (SRCR) domain, with the number of SRCR domains varying in humans. Here we use expression analysis, linkage in pedigrees, and long range single transcript sequencing, to show that the signal of balancing selection is driven by one haplotype usually carrying shorter SRCR repeats, and another usually carrying a longer SRCR repeat, within the coding region of DMBT1. The DMBT1 protein size isoform encoded by a shorter SRCR domain repeat allele showed complete loss of binding of a cariogenic and invasive Streptococcus mutans strain in contrast to the long SRCR allele. Taken together, our results suggest that balancing selection at DMBT1 is due to host-microbe interactions of encoded SRCR tandem repeat alleles.

  1. Peer review report

    Reviewer: Champion Deivanayagam

    Institution: University of Alabama at Birmingham

    email: champy@uab.edu


    General comments

    Summary of the study: The authors begin the manuscript by describing an effort to understand the various sizes of the DMBT1 protein, namely variation in the copy number of SRCR repeats, based on DNA sequence variation among population groups. For this they chose three major groups: European (described in Figure 1 of the manuscript as European-Americans in Utah), African (Yoruba of Ibadan, Nigeria) and Asian (Chinese from Beijing). They contend that the high D value shown in Figure 1 is evidence for balancing selection (a topic on which this Reviewer has no expertise). Based on these Tajima's D scores, they identified two haplotypes, which led them to the SNP rs11523871.

    Next comes the interesting part: parsing the copy number variation of DMBT1’s SRCR domains within these two haplotype clades, from which they conclude that SRCR copy numbers are population-specific. They then look at tissue-specific expression of DMBT1 and observe higher expression levels in some tissues, such as lung, small intestine, colon, and minor salivary gland. More interestingly, they observe no linear relationship among the alleles and conclude that there is no plausible explanation for protein expression, but that the selection locus rs11523861 is somehow related to balancing selection.

    In an attempt to determine the copy number variation, a small set of saliva samples (8) was collected and analysed. Table 1 summarizes these findings, with four different isoforms observed. The authors report a strong linear relationship, but present no figure or other details beyond statistical parameters to convince the reader. From here they turn to alternative splicing as a possibility, using the H292 lung cell line model, and report that the H292 cell line was homozygous for the 11/10 SRCR domain repeats (Figure 4). Although they could not reach a firm conclusion, they mention that alternative splicing may play a minor role.

    Finally, this study attempts to build on the results of the classic Stromberg lab study published in 2007, which described various properties of Gp340, classified it into 4 groups, reported affinities for various carbohydrates and Lewis antigens, and compared oral and lung Gp340. The authors here use western blots to determine whether the short allele differs from the normal one. They use knock-outs of two S. mutans surface components, SpaP and Cnm, to show that the shorter alleles display no binding. In the same set of experiments, they also show that S. mutans adheres to monomeric and dimeric amylase.

    They conclude that some of the variation in adherence may be due to the carbohydrates present on the different DMBT1 isoforms.


    Section 1 – Serious concerns

    • Do you have any serious concerns about the manuscript such as fraud, plagiarism, unethical or unsafe practices? No
    • Have the authors provided the necessary ethics approval (from the authors’ institution or an ethics committee)? Yes

    Section 2 – Language quality

    • How would you rate the English language quality? Low to medium quality, but I understand the content

    Section 3 – validity and reproducibility

    • Does the work cite relevant and sufficient literature? Yes
    • Is the study design appropriate and are the methods used valid? Yes
    • Are the methods documented and analysis provided so that the study can be replicated? Yes
    • Is the source data that underlies the result available so that the study can be replicated? Yes
    • Is the statistical analysis and its interpretation appropriate? Yes to a large extent
    • Is quality of the figures and tables satisfactory? No
    • Are the conclusions adequately supported by the results? No
    • Are there any objective errors or fundamental flaws that make the research invalid? Please describe these thoroughly. Yes

    Section 4 – Suggestions

    • In your opinion how could the author improve the study?

    Additional experiments are needed to support the conclusions.

    Major Concerns:

    The research presented here appears highly disjointed at times. The authors begin the study by evaluating the need for copy number variation and how it could have been part of a selection process induced by a variety of factors, and their analysis leads to Supplementary Figure 1, where they identify two clades. From here they offer some evidence for choosing rs11523861 to study balancing selection. However, no concrete evidence is presented or discussed, except peripherally. This Reviewer appreciates the effort, which is very interesting; however, without additional analysis the results in their current form do not offer any new insights, and so the title is highly misleading.

    At this point, they move on to CNVs, where by their own admission they have used a very small sample of 8 individuals, yet they treat the extrapolated results as conclusive. Further sampling and additional studies are necessary to complete this section.

    One solid set of results shown here is from the H292 lung cell line model, where they report different variations of the repeats. However, all analysis stops here, and the conclusion is drawn that alternative splicing is ruled out, except for a minimal role. Once again, the same pattern appears: an experiment is performed, and conclusions are drawn without further analysis or additional data.

    Finally, from sequence analysis the authors step into protein-based assays to examine how S. mutans adherence depends on the presence or absence of its surface proteins SpaP and Cnm. Their conclusion that Cnm is the major adherent protein also rests on a single experiment. More importantly, their analysis of the I-IV alleles (which are not explained), an extension of a previous well-cited article, leads to the conclusion that smaller alleles (???) do not adhere to saliva. In Table 1, the molecular weights of these isoforms are in line with 14 SRCR domains, allowing for their carbohydrate decorations, with the one exception of sample 6. Also, this Reviewer is not sure whether this table corresponds to the 8 samples used for the CNV analysis and/or to the smaller/larger alleles alluded to in the western blot study.

    In summary, this manuscript appears not only disjointed but also insufficiently detailed to warrant publication at this stage. Additional analysis on each of the points raised would elevate the research and its conclusions.

    Minor concerns: The manuscript is extremely poorly written and needs major revamping in order to produce a more concrete publication.

    Figure numbers and legends are often mixed up, both in the text and in the legends. “Figure 5 - Differential binding of S.mutans by DMBT1 isoforms in saliva: Overlay of individual saliva phenotypes with DMBT1 size isoforms I- IV with A) a biotinylated S. mutans SpaP A, Cnm strain and B) with DMBT1-specific antibodies. The positions of DMBT1, mono- and dimer amylase, and acidic PRP co-receptors are marked by arrows”. This bold line is not related to this figure but to the supplementary figure.

    There is no marker in Figure 5A to indicate the specific molecular weights. Isoform IV in the last lane seems to light up with the lower-MW DMBT1 (allele?). However, the conclusion presented, that the lower isoforms do not bind well, is contrary to the results shown here.

    Supplementary Figure 2 should be in the main manuscript. Even though this is not a new result, in combination with the other two the authors can summarize the results.

    Abbreviations should be given in expanded form at first use.

    Why do the authors use SpaP A instead of SpaP?

    What do they mean by a biotinylated S. mutans SpaP A, Cnm strain?

    • Do you have any other feedback or comments for the Author?

    These copy numbers have always been fascinating, not only in DMBT1 but in other proteins, yet to date we do not have a handle on the type of selection. If the authors could provide additional analysis of why and how balancing selection could have played a role, it would be very important and could be extended to numerous other proteins.


    Section 5 – Decision

    Requires revisions