Linking gut microbiome to bone mineral density: a shotgun metagenomic dataset from 361 elderly women




Bone mass loss contributes to the risk of bone fracture in the elderly. Many factors, including age, obesity, estrogen and diet, are associated with bone mass loss. Mouse studies have suggested that the gut microbiome might affect bone mass by regulating the immune system. However, there has been little evidence from human studies. Bone loss increases after menopause. Therefore, we recruited 361 Chinese post-menopausal women and collected their fecal samples and metadata to conduct a metagenome-wide association study (MWAS) investigating the influence of the gut microbiome on bone health. Gut microbiome sequencing data were produced using the BGISEQ-500 sequencer. Bone mineral density (BMD) was calculated using a Hologic dual energy X-ray machine, and body mass index (BMI) and age were also recorded. These data allow exploration of gut microbial diversity and its links to bone mass loss, as well as of microbial markers for bone mineral density. In addition, these data are potentially useful in studying the role that the gut microbiota might play in bone mass loss and in exploring the process of bone mass loss.

Article activity feed

  1. Bone mass loss

    **Reviewer 1. Levi Waldron** Wang et al. present a shotgun metagenomics cross-sectional study of fecal specimens from 361 elderly women with the primary objective of identifying correlations between bone mass density and microbial taxa. The methods are reasonable and I have no major concerns about this manuscript, only some moderate suggestions to improve reporting and discussion.

    For items answered “Yes” it would help to provide line numbers in the manuscript, as done for some but not all checklist items.

    3.0 Participants:

    It’s stated that “Fecal samples of 361 post-menopause women were randomly collected at the People’s Hospital of Shenzhen”. I suspect the correct word here is “arbitrarily” rather than “randomly”, unless a random number generator was used to select a random sample of all eligible patients. Some statement of how the women were recruited and how representative they are of all patients at the hospital is warranted. E.g., were they recruited from the emergency room, a cancer ward, all outpatients, all admitted patients, etc.? See also the later comment about generalizability.

    4.9 Batch Effects:

    This is left “NA”. Can the authors at least comment (in the manuscript) on the potential for batch effects affecting cases and controls differently, i.e., were they all prepared together or in separate libraries, and were they sequenced in the same runs or completely separately?

    8.0 Reproducible research:

    I appreciate that data have been posted at EBI and CNGB. Could the authors also comment on whether the metadata essential to the analysis are also provided, and that these can be linked to the sequence data? Although I’m glad to hear that “Others could reproduce the reported analysis from clean reads by the declared software and parameters” I do think that the code to reproduce the analysis should also be reported.

    8.1 Raw data access

    The checklist states “no raw reads for ethical” but the manuscript states “The sequencing reads from each sequencing library have been deposited at EBI with the accession number: PRJNA530339 and the China National Genebank (CNGB), accession number CNP0000398.” so there is a disconnect. Assuming human sequence reads are removed from the data, I’m not convinced of ethical reasons not to post microbial sequence reads, but it seems the authors have posted the microbial sequence reads.

    10.1 – 10.5 Taxonomy, differential abundance, other analysis, other data types, and other statistical analysis are all blank. Some should be “N/A” but others just seem to be overlooked.

    13.2 Generalizability: I think this is an important element to include in the discussion. How typical are your volunteers of all women that age?


    “Making these data potentially useful in studying the role the gut microbiota might play in bone mass loss and offering exploration into the bone mass loss process.” -> These data are potentially useful in studying the role the gut microbiota might play in bone mass loss and in exploring the bone mass loss process.

    The manuscript is well written, but there are a few other places that would benefit from some copy editing.

  2. Abstract

    **Reviewer 2. Christopher Hunter** Is the language of sufficient quality?


    Is the data all available and does it match the descriptions in the paper?


    Most of the data are provided as supplemental files in bioRxiv, but in Excel rather than CSV. These data files will need to be curated into a GigaDB dataset.

    Is the data and metadata consistent with relevant minimum information or reporting standards?


    Is the data acquisition clear, complete and methodologically sound?


    Comment. The patients' consent to openly share all metadata is not clearly stated. Simply saying the study was approved by the bioethics review board does not mean consent was given to share the data, just that the institute consented to the study being done.

    Is there sufficient detail in the methods and data-processing steps to allow reproduction?


    Comments: Maybe to someone with a good understanding of statistics there is sufficient detail; this is an area that a statistician should look at. For me, the descriptions of the analysis and the methods do not give anywhere near enough detail to either understand what was done or replicate it. The concept of "Gut metabolic modules" is not defined here, with just a reference to another paper; a brief explanation of what is meant by the term here would be useful.

    Is there sufficient data validation and statistical analyses of data quality?


    Comments. The sequences were filtered for human contaminants and adapter sequences, and low-quality reads were also removed.

    Is the validation suitable for this type of data?


    Comments: The metadata is extensive but some basic points are missing: collection date, antibiotic use, and relatedness of samples/patients. Other less important details are also missing, such as why and how this cohort was selected.

    Is there sufficient information for others to reuse this dataset or integrate it with other data?


    Any Additional Overall Comments to the Author


    • I am concerned about the open sharing of patient metadata without evidence that consent was given prior to sharing.
    • A lot of metadata is collected and provided in the supplemental tables (which is great for reuse), but there are no explanations of what the values are. While some headers are self-explanatory, others are less so, e.g. what is CROSSL(pg/ml)? Or "Side crops"?
    • How were the various conditions diagnosed?
    • I see no indication of antibiotic usage in the cohort.
    • Are all the samples from different individuals? Was each sample a single bowel movement?
    • There is no background given as to how this cohort was selected or why.
    • There is no discussion of the bone mass density of a "normal" cohort. Does this cohort represent a normal cohort or is it already biased toward low or high density? Simply describing the cohort with respect to normal (T of -1 or above), low (-1 to -2.5) or osteoporosis (< -2.5) would be a help. I cannot see the T-scores included in the sTab1a file; are they computed from the L1-L4(z) values given?
    • There are a number of NA values in the table of sample metadata, but there is no explanation as to how these samples were handled in the analysis.
    • In general I feel that a lot of poorly described statistical analyses are included that are not required as part of a data note. The focus should be on describing the data and ensuring the data and metadata are well explained.
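
The cohort description requested in the comments above could be produced with a few lines of code. The following is a minimal sketch only, not the authors' method: it assumes T-scores are available as text values (they are reported as not visible in sTab1a), uses the category boundaries as worded in the review, and reports NA entries as "missing" rather than silently dropping them. The function names and example values are hypothetical. Some definitions place a T-score of exactly -2.5 in the osteoporosis category; the boundary handling here follows the reviewer's wording.

```python
from collections import Counter
from typing import Iterable, Optional


def parse_value(raw: str) -> Optional[float]:
    """Treat empty strings and 'NA' as missing; otherwise parse a float."""
    text = raw.strip()
    if text == "" or text.upper() == "NA":
        return None
    return float(text)


def classify_t_score(t_score: Optional[float]) -> str:
    """Map a T-score to the categories named in the review.

    Assumed boundaries: normal is T >= -1, low is -2.5 <= T < -1,
    osteoporosis is T < -2.5. Missing values get their own label.
    """
    if t_score is None:
        return "missing"
    if t_score >= -1.0:
        return "normal"
    if t_score >= -2.5:
        return "low"
    return "osteoporosis"


def summarize(raw_values: Iterable[str]) -> Counter:
    """Count how many samples fall into each category."""
    return Counter(classify_t_score(parse_value(v)) for v in raw_values)


# Made-up values for illustration only; not taken from the dataset.
example = ["0.3", "-1.0", "-1.7", "-2.5", "-3.1", "NA"]
summary = summarize(example)
print(dict(summary))
```

Keeping "missing" as an explicit category makes the number of NA samples visible in the summary, which speaks to the question of how such samples were handled.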