Serum metabolome indicators of early childhood development in the Brazilian National Survey on Child Nutrition (ENANI-2019)

Marina Padilha
Victor Nahuel Keller
Paula Normando
Raquel M Schincaglia
Nathalia C Freitas-Costa
Samary SR Freire
Felipe M Delpino
Inês RR de Castro
Elisa MA Lacerda
Dayana R Farias
Zachary Kroezen
Meera Shanmuganathan
Philip Britz-Mckibbin
Gilberto Kac

Curated by eLife

eLife Assessment

This important work advances our understanding of factors influencing early childhood development. The large sample size and methodology applied make the findings of this study convincing; however, support for some of the claims made by the authors is incomplete. The work will be of interest to researchers in developmental science and early childhood pediatrics.

This article has been Reviewed by the following groups

Read the full article

Discuss this preprint

Start a discussion What are Sciety discussions?

Listed in

Evaluated articles (eLife)

Abstract

The role of circulating metabolites on child development is understudied. We investigated associations between children’s serum metabolome and early childhood development (ECD).

Methods:

Untargeted metabolomics was performed on serum samples of 5004 children aged 6–59 months, a subset of participants from the Brazilian National Survey on Child Nutrition (ENANI-2019). ECD was assessed using the Survey of Well-being of Young Children’s milestones questionnaire. The graded response model was used to estimate developmental age. Developmental quotient (DQ) was calculated as the developmental age divided by chronological age. Partial least square regression selected metabolites with a variable importance projection ≥1. The interaction between significant metabolites and the child’s age was tested.

Results:

Twenty-eight top-ranked metabolites were included in linear regression models adjusted for the child’s nutritional status, diet quality, and infant age. Cresol sulfate ( β =–0.07; adjusted-p <0.001), hippuric acid ( β =–0.06; adjusted-p <0.001), phenylacetylglutamine ( β =–0.06; adjusted-p <0.001), and trimethylamine- N -oxide ( β =–0.05; adjusted-p=0.002) showed inverse associations with DQ. We observed opposite directions in the association of DQ for creatinine (for children aged –1 SD: β =–0.05; p P =0.01;+1 SD: β =0.05; p=0.02) and methylhistidine (–1 SD: β = - 0.04; p=0.04;+1 SD: β =0.04; p=0.03).

Conclusions:

Serum biomarkers, including dietary and microbial-derived metabolites involved in the gut-brain axis, may potentially be used to track children at risk for developmental delays.

Funding:

Supported by the Brazilian Ministry of Health and the Brazilian National Research Council.

Version published to 10.7554/elife.97982 on eLife
Jan 15, 2025
eLife
Oct 22, 2024

Author response:

Reviewer #1 (Public Review):

Padilha et al. aimed to find prospective metabolite biomarkers in serum of children aged 6-59 months that were indicative of neurodevelopmental outcomes. The authors leveraged data and samples from the cross-sectional Brazilian National Survey on Child Nutrition (ENANI-2019), and an untargeted multisegment injection-capillary electrophoresis-mass spectrometry (MSI-CE-MS) approach was used to measure metabolites in serum samples (n=5004) which were identified via a large library of standards. After correlating the metabolite levels against the developmental quotient (DQ), or the degree of which age-appropriate developmental milestones were achieved as evaluated by the Survey of Well-being of Young Children, serum concentrations of phenylacetylglutamine (PAG), cresol sulfate (CS), hippuric …

Author response:

Reviewer #1 (Public Review):

Padilha et al. aimed to find prospective metabolite biomarkers in serum of children aged 6-59 months that were indicative of neurodevelopmental outcomes. The authors leveraged data and samples from the cross-sectional Brazilian National Survey on Child Nutrition (ENANI-2019), and an untargeted multisegment injection-capillary electrophoresis-mass spectrometry (MSI-CE-MS) approach was used to measure metabolites in serum samples (n=5004) which were identified via a large library of standards. After correlating the metabolite levels against the developmental quotient (DQ), or the degree of which age-appropriate developmental milestones were achieved as evaluated by the Survey of Well-being of Young Children, serum concentrations of phenylacetylglutamine (PAG), cresol sulfate (CS), hippuric acid (HA) and trimethylamine-N-oxide (TMAO) were significantly negatively associated with DQ. Examination of the covariates revealed that the negative associations of PAG, HA, TMAO and valine (Val) with DQ were specific to younger children (-1 SD or 19 months old), whereas creatinine (Crtn) and methylhistidine (MeHis) had significant associations with DQ that changed direction with age (negative at -1 SD or 19 months old, and positive at +1 SD or 49 months old). Further, mediation analysis demonstrated that PAG was a significant mediator for the relationship of delivery mode, child's diet quality and child fiber intake with DQ. HA and TMAO were additional significant mediators of the relationship of child fiber intake with DQ.

Strengths of this study include the large cohort size and study design allowing for sampling at multiple time points along with neurodevelopmental assessment and a relatively detailed collection of potential confounding factors including diet. The untargeted metabolomics approach was also robust and comprehensive allowing for level 1 identification of a wide breadth of potential biomarkers. Given their methodology, the authors should be able to achieve their aim of identifying candidate serum biomarkers of neurodevelopment for early childhood. The results of this work would be of broad interest to researchers who are interested in understanding the biological underpinnings of development and also for tracking development in pediatric populations, as it provides insight for putative mechanisms and targets from a relevant human cohort that can be probed in future studies. Such putative mechanisms and targets are currently lacking in the field due to challenges in conducting these kind of studies, so this work is important.

However, in the manuscript's current state, the presentation and analysis of data impede the reader from fully understanding and interpreting the study's findings.

Particularly, the handling of confounding variables is incomplete. There is a different set of confounders listed in Table 1 versus Supplementary Table 1 versus Methods section Covariates versus Figure 4. For example, Region is listed in Supplementary Table 1 but not in Table 1, and Mode of Delivery is listed in Table 1 but not in Supplementary Table 1. Many factors are listed in Figure 4 that aren't mentioned anywhere else in the paper, such as gestational age at birth or maternal pre-pregnancy obesity.

We thank the reviewer for their comment. We would like to clarify that initially, the tables had different variables because they have different purposes. Table 1 aims to characterize the sample on variables directly related to the children’s and mother’s features and their nutritional status. Supplementary File 1(previously named supplementary table 1) summarizes the sociodemographic distribution of the development quotient. Neither of the tables concerned the metabolite-DQ relationships and their potential covariates, they only provide context for subsequent analyses by characterizing the sample and the outcome. Instead, the covariates included in the regression models were selected using the Direct Acyclic Graph presented in Figure 1.

To avoid this potential confusion however, we included the same variables in Table 1 and Supplementary File 1(page 38) and we discussed the selection of model covariates in Figure 4 in more detail here in the letter and in the manuscript.

The authors utilize the directed acrylic graph (DAG) in Figure 4 to justify the further investigation of certain covariates over others. However, the lack of inclusion of the microbiome in the DAG, especially considering that most of the study findings were microbial-derived metabolite biomarkers, appears to be a fundamental flaw. Sanitation and micronutrients are proposed by the authors to have no effect on the host metabolome, yet sanitation and micronutrients have both been demonstrated in the literature to affect microbiome composition which can in turn affect the host metabolome.

Thank you for your comment. We appreciate that the use of DAG and lack of the microbiome in the DAG are concerns. This has been already discussed in reply #1 to the editor that has been pasted below for convenience:

Thank you for the comment and suggestions. It is important to highlight that there is no data on microbiome composition. We apologize if there was an impression such data is available. The main goal of conducting this national survey was to provide qualified and updated evidence on child nutrition to revise and propose new policies and nutritional guidelines for this demographic. Therefore, collection of stool derived microbiome (metagenomic) data was not one of the objectives of ENANI-2019. This is more explicitly stated as a study limitation in the revised manuscript on page 17, lines 463-467:

“Lastly, stool microbiome data was not collected from children in ENANI-2019 as it was not a study objective in this large population-based nutritional survey. However, the lack of microbiome data does not reduce the importance/relevance, since there is no evidence that microbiome and factors affecting microbiome composition are confounders in the association between serum metabolome and child development.”

Besides, one must consider the difficulties and costs in collecting and analyzing microbiome composition in a large population-based survey. In contrast, the metabolome data has been considered a priority as there was already blood specimens collected to inform policy on micronutrient deficiencies in Brazil. However, due to funding limitations we had to perform the analysis in a subset of our sample, still representative and large enough to test our hypothesis with adequate study power (more details below).

We would like to argue that there is no evidence that microbiome and factors affecting microbiome composition are confounders on the association between serum metabolome and child development. First, one should revisit the properties of a confounder according to the epidemiology literature that in short states that confounding refers to an alternative explanation for a given conclusion, thus constituting one of the main problems for causal inference (Kleinbaum, Kupper, and Morgenstern, 1991; Greenland & Robins, 1986; VanderWeele, 2019). In our study, we highlight that certain serum metabolites associated with the developmental quotient (DQ) in children were circulating metabolites (e.g., cresol sulfate, hippuric acid, phenylacetylglutamine, TMAO) previously reported to depend on dietary exposures, host metabolism and gut microbiota activity. Our discussion cites other published work, including animal models and observational studies, which have reported how these bioactive metabolites in circulation are co-metabolized by commensal gut microbiota, and may play a role in neurodevelopment and cognition as mediated by environmental exposures early in life.

In fact, the literature on the association between microbiome and infant development is very limited. We performed a search using terms ‘microbiome’ OR ‘microbiota’ AND ‘child development’ AND ‘systematic’ OR ‘meta-analysis’ and found only one study: ‘Associations between the human immune system and gut microbiome with neurodevelopment in the first 5 years of life: A systematic scoping review’ (DOI 10.1002/dev.22360). The authors conclude: ‘while the immune system and gut microbiome are thought to have interactive impacts on the developing brain, there remains a paucity of published studies that report biomarkers from both systems and associations with child development outcomes.’ It is important to highlight that our criteria to include confounders on the directed acyclic graph (DAG) was based on the literature of systematic reviews or meta-analysis and not on single isolated studies.

In summary, we would like to highlight that there is no microbiome data in ENANI-2019 and in the event such data was present, we are confident that based on the current stage of the literature, there is no evidence to consider such construct in the DAG, as this procedure recommends that only variables associated with the exposure and the outcome should be included. Please find more details on DAG below.

Moreover, we would like to clarify that we have not stated that sanitation and micronutrients have no effect on the serum metabolome, instead, these constructs were not considered on the DAG.

To make it clearer, we have modified the passage about DAG in the methods section. New text, page 9, lines 234-241:

“The subsequent step was to disentangle the selected metabolites from confounding variables. A Directed Acyclic Graph (DAG; Breitling et al., 2021) was used to more objectively determine the minimally sufficient adjustments for the regression models to account for potentially confounding variables while avoiding collider variables and variables in the metabolite-DQ causal pathways, which if controlled for would unnecessarily remove explained variance from the metabolites and hamper our ability to detect biomarkers. To minimize bias from subjective judgments of which variables should and should not be included as covariates, the DAG only included variables for which there was evidence from systematic reviews or meta-analysis of relationships with both the serum metabolome and DQ (Figure 1). Birth weight, breastfeeding, child's diet quality, the child's nutritional status, and the child's age were the minimal adjustments suggested by the DAG. Birth weight was a variable with high missing data, and indicators of breastfeeding practice data (referring to exclusive breastfeeding until 6 months and/or complemented until 2 years) were collected only for children aged 0–23 months. Therefore, those confounders were not included as adjustments. Child's diet quality was evaluated as MDD, the child's nutritional status as w/h z-score, and the child's age in months.”

Additionally, the authors emphasized as part of the study selection criteria the following, "Due to the costs involved in the metabolome analysis, it was necessary to further reduce the sample size. Then, samples were stratified by age groups (6 to 11, 12 to 23, and 24 to 59 months) and health conditions related to iron metabolism, such as anemia and nutrient deficiencies. The selection process aimed to represent diverse health statuses, including those with no conditions, with specific deficiencies, or with combinations of conditions. Ultimately, through a randomized process that ensured a balanced representation across these groups, a total of 5,004 children were selected for the final sample (Figure 1)."

Therefore, anemia and nutrient deficiencies are assumed by the reader to be important covariates, yet, the data on the final distribution of these covariates in the study cohort is not presented, nor are these covariates examined further.

Thank you for the comments. We apologize for the misunderstanding and will amend the text to make our rationale clearer in the revised version of the manuscript.

We believed the original text was clear enough in stating that the sampling process was performed aiming to maintain the representativeness of the original sample. This sampling process considered anemia and nutritional deficiencies, among other variables. However, we did not aim to include all relevant covariates of the DQ-metabolome relationship; these were decided using the DAG, as described in the manuscript and other sessions of this letter. Therefore, we would like to emphasize that our description of the sampling process does not assumes anemia and nutritional deficiencies are important covariates for the DQ-metabolome relationship.

We rewrote this text part, page 11, lines 279-285:

“Due to the costs involved in the metabolome analysis, it was necessary to reduce the sample size that is equivalent to 57% of total participants from ENANI-2019 with stored blood specimens. Therefore, the infants were stratified by age groups (6 to 11, 12 to 23, and 24 to 59 months) and health conditions such as anemia and micronutrient deficiencies. The selection process aimed to represent diverse health statuses to the original sample. Ultimately, 5,004 children were selected for the final sample through a random sampling process that ensured a balanced representation across these groups (Figure 2).”

The inclusion of specific covariates in Table 1, Supplementary Table 1, the statistical models, and the mediation analysis is thus currently biased as it is not well justified.

We appreciate the reviewer comment. However, it would have been ideal to receive a comment/critic with a clearer and more straightforward argumentation, so we could try to address it based on our interpretation.

Please refer to our response to item #1 above regarding the variables in the tables and figures. The covariates in the statistical models were selected using the DAG, which is a cutting-edge procedure that aims to avoid bias and overfitting, a common situation when confounders are adjusted for without a clear rationale. We elaborate on the advantages of using the DAG in response to item #6 and in page 9 of the manuscript. The statistical models we use follow the best practices in the field when dealing with a large number of collinear predictors and a continuous outcome (see our response to the editor’s 4th comment). Finally, the mediation analyses were done to explore a few potential explanations for our results from the PLSR and multiple regression analyses. We only ran mediation analyses for plausible mechanisms for which the variables of interest were available in our data. Please see our response to reviewer 3’s item #1 for a more detailed explanation on the mediation analysis.

Finally, it is unclear what the partial-least squares regression adds to the paper, other than to discard potentially interesting metabolites found by the initial correlation analysis.

Thank you for the question. As explained in response to the editor’s item #4, PLS-based analyses are among the most commonly used analyses for parsing metabolomic data (Blekherman et al., 2011; Wold et al., 2001; Gromski et al. 2015). This procedure is especially appropriate for cases in which there are multiple collinear predictor variables as it allows us to compare the predictive value of all the variables without relying on corrections for multiple testing. Testing each metabolite in separate correlations corrected for multiple comparisons is less appropriate because the correlated nature of the metabolites means the comparisons are not truly independent and would cause the corrections (which usually assume independence) to be overly strict. As such, we only rely on the correlations as an initial, general assessment that gives context to subsequent, more specific analyses. Given that our goal is to select the most predictive metabolites, discarding the less predictive metabolites is precisely what we aim to achieve. As explained above and in response to the editor’s item #4, the PLSR allows us to reach that goal without introducing bias in our estimates or losing statistical power.

Reviewer #2 (Public Review):

A strength of the work lies in the number of children Padilha et al. were able to assess (5,004 children aged 6-59 months) and in the extensive screening that the Authors performed for each participant. This type of large-scale study is uncommon in low-to-middle-income countries such as Brazil.

The Authors employ several approaches to narrow down the number of potentially causally associated metabolites.

Could the Authors justify on what basis the minimum dietary diversity score was dichotomized? Were sensitivity analyses undertaken to assess the effect of this dichotomization on associations reported by the article? Consumption of each food group may have a differential effect that is obscured by this dichotomization.

Thank you for the observation. We would like to emphasize that the child's diet quality was assessed using the minimum dietary diversity (MDD) indicator proposed by the WHO (World Health Organization & United Nations Children’s Fund (UNICEF), 2021). This guideline proposes the cutoff used in the present study. We understand the reviewer’s suggestion to use the consumption of healthy food groups as an evaluation of diet quality, but we chose to follow the WHO proposal to assess dietary diversity. This indicator is widely accepted and used as a marker and provides comparability and consistency with other published studies.

Could the Authors specify the statistical power associated with each analysis?

To the best of our knowledge, we are not aware of power calculation procedures for PLS-based analyses. However, given our large sample size, we do not believe power was an issue with the analyses. For our regression analyses, which typically have 4 predictors, we had 95% power to detect an f-squared of 0.003 and an r of 0.05 in a two-sided correlation test considering an alpha level of 0.05.

New text, page 11, lines 296-298:

“Given the size of our sample, statistical power is not an issue in our analyses. Considering an alpha of 0.05 for a two-sided test, a sample size of 5000 has 95% power to detect a correlation of r = 0.05 and an effect of f2 = 0.003 in a multiple regression model with 4 predictors.”

Could the Authors describe in detail which metric they used to measure how predictive PLSR models are, and how they determined what the "optimal" number of components were?

We chose the model with the fewest number of components that maximized R2 and minimized root mean squared error of prediction (RMSEP). In the training data, the model with 4 components had a lower R2 but a lower RMSEP, therefore we chose the model with 3 components which had a higher R2 than the 4-component model and lower RMSEP than the model with 2 components. However, the number of components in the model did not meaningfully change the rank order of the metabolites on the VIP index.

New text, page 8, lines 220-224:

“To better assess the predictiveness of each metabolite in a single model, a PLSR was conducted. PLS-based analyses are the most commonly used analyses when determining the predictiveness of a large number of variables as they avoid issues with collinearity, sample size, and corrections for multiple-testing (Blekherman et al., 2011; Wold et al., 2001; Gromski et al. 2015).”

New text, page 12, lines 312-314:

“In PLSR analysis, the training data suggested that three components best predicted the data (the model with three components had the highest R2, and the root mean square error of prediction (RMSEP) was only slightly lower with four components). In comparison, the test data showed a slightly more predictive model with four components (Figure 3—figure supplement 2).”

The Authors use directed acyclic graphs (DAG) to identify confounding variables of the association between metabolites and DQ. Could the dataset generated by the Authors have been used instead? Not all confounding variables identified in the literature may be relevant to the dataset generated by the Authors.

Thank you for the question. The response is most likely no, the current dataset should not be used to define confounders as these must be identified based on the literature. The use of DAGs has been widely explored as a valid tool for justifying the choice of confounding factors in regression models in epidemiology. This is because DAGs allow for a clear visualization of causal relationships, clarify the complex relationships between exposure and outcome. Besides, DAGs demonstrate the authors' transparency by acknowledging factors reported as important but not included/collected in the study. This has been already discussed in reply #1 to the editor that has been pasted below for convenience.

Thank you for the comment and suggestions. It is important to highlight that there is no data on microbiome composition. We apologize if there was an impression such data is available. The main goal of conducting this national survey was to provide qualified and updated evidence on child nutrition to revise and propose new policies and nutritional guidelines for this demographic. Therefore, collection of stool derived microbiome (metagenomic) data was not one of the objectives of ENANI-2019. This is more explicitly stated as a study limitation in the revised manuscript on page 17, lines 463-467:

“Lastly, stool microbiome data was not collected from children in ENANI-2019 as it was not a study objective in this large population-based nutritional survey. However, the lack of microbiome data does not reduce the importance/relevance, since there is no evidence that microbiome and factors affecting microbiome composition are confounders in the association between serum metabolome and child development.”

Besides, one must consider the difficulties and costs in collecting and analyzing microbiome composition in a large population-based survey. In contrast, the metabolome data has been considered a priority as there was already blood specimens collected to inform policy on micronutrient deficiencies in Brazil. However, due to funding limitations we had to perform the analysis in a subset of our sample, still representative and large enough to test our hypothesis with adequate study power (more details below).

We would like to argue that there is no evidence that microbiome and factors affecting microbiome composition are confounders on the association between serum metabolome and child development. First, one should revisit the properties of a confounder according to the epidemiology literature that in short states that confounding refers to an alternative explanation for a given conclusion, thus constituting one of the main problems for causal inference (Kleinbaum, Kupper, and Morgenstern, 1991; Greenland & Robins, 1986; VanderWeele, 2019). In our study, we highlight that certain serum metabolites associated with the developmental quotient (DQ) in children were circulating metabolites (e.g., cresol sulfate, hippuric acid, phenylacetylglutamine, TMAO) previously reported to depend on dietary exposures, host metabolism and gut microbiota activity. Our discussion cites other published work, including animal models and observational studies, which have reported how these bioactive metabolites in circulation are co-metabolized by commensal gut microbiota, and may play a role in neurodevelopment and cognition as mediated by environmental exposures early in life.

In fact, the literature on the association between microbiome and infant development is very limited. We performed a search using terms ‘microbiome’ OR ‘microbiota’ AND ‘child development’ AND ‘systematic’ OR ‘meta-analysis’ and found only one study: ‘Associations between the human immune system and gut microbiome with neurodevelopment in the first 5 years of life: A systematic scoping review’ (DOI 10.1002/dev.22360). The authors conclude: ‘while the immune system and gut microbiome are thought to have interactive impacts on the developing brain, there remains a paucity of published studies that report biomarkers from both systems and associations with child development outcomes.’ It is important to highlight that our criteria to include confounders on the directed acyclic graph (DAG) was based on the literature of systematic reviews or meta-analysis and not on single isolated studies.

In summary, we would like to highlight that there is no microbiome data in ENANI-2019 and in the event such data was present, we are confident that based on the current stage of the literature, there is no evidence to consider such construct in the DAG, as this procedure recommends that only variables associated with the exposure and the outcome should be included. Please find more details on DAG below.

Moreover, we would like to clarify that we have not stated that sanitation and micronutrients have no effect on the serum metabolome, instead, these constructs were not considered on the DAG.

To make it clearer, we have modified the passage about DAG in the methods section. New text, page 9, lines 234-241:

“The subsequent step was to disentangle the selected metabolites from confounding variables. A Directed Acyclic Graph (DAG; Breitling et al., 2021) was used to more objectively determine the minimally sufficient adjustments for the regression models to account for potentially confounding variables while avoiding collider variables and variables in the metabolite-DQ causal pathways, which if controlled for would unnecessarily remove explained variance from the metabolites and hamper our ability to detect biomarkers. To minimize bias from subjective judgments of which variables should and should not be included as covariates, the DAG only included variables for which there was evidence from systematic reviews or meta-analysis of relationships with both the serum metabolome and DQ (Figure 1). Birth weight, breastfeeding, child's diet quality, the child's nutritional status, and the child's age were the minimal adjustments suggested by the DAG. Birth weight was a variable with high missing data, and indicators of breastfeeding practice data (referring to exclusive breastfeeding until 6 months and/or complemented until 2 years) were collected only for children aged 0–23 months. Therefore, those confounders were not included as adjustments. Child's diet quality was evaluated as MDD, the child's nutritional status as w/h z-score, and the child's age in months.”

Were the systematic reviews or meta-analyses used in the DAG performed by the Authors, or were they based on previous studies? If so, more information about the methodology employed and the studies included should be provided by the Authors.

Thank you for the question. The reviews or meta-analyses used in the DAG have been conducted by other authors in the field. This has been laid out more clearly in our methods section.

New text, page 9, lines 234-241:

“The subsequent step was to disentangle the selected metabolites from confounding variables. A Directed Acyclic Graph (DAG; Breitling et al., 2021) was used to more objectively determine the minimally sufficient adjustments for the regression models to account for potentially confounding variables while avoiding collider variables and variables in the metabolite-DQ causal pathways, which if controlled for would unnecessarily remove explained variance from the metabolites and hamper our ability to detect biomarkers. To minimize bias from subjective judgments of which variables should and should not be included as covariates, the DAG only included variables for which there was evidence from systematic reviews or meta-analysis of relationships with both the metabolome and DQ (Figure 1). Birth weight, breastfeeding, child's diet quality, the child's nutritional status, and the child's age were the minimal adjustments suggested by the DAG. Birth weight was a variable with high missing data, and indicators of breastfeeding practice data (referring to exclusive breastfeeding until 6 months and/or complemented until 2 years) were collected only for children aged 0–23 months. Therefore, those confounders were not included as adjustments. Child's diet quality was evaluated as MDD, the child's nutritional status as w/h z-score, and the child's age in months.”

Approximately 72% of children included in the analyses lived in households with a monthly income superior to the Brazilian minimum wage. The cohort is also biased towards households with a higher level of education. Both of these measures correlate with developmental quotient. Could the Authors discuss how this may have affected their results and how generalizable they are?

Thank you for your comment. This has been already discussed in reply #6 to the editor and that has been pasted below for convenience.

Thank you for highlighting this point. The ENANI-2019 is a population-based household survey with national coverage and representativeness for macroregions, sex, and one-year age groups (< 1; 1-1.99; 2-2.99; 3-3.99; 4-5). Furthermore, income quartiles of the census sector were used in the sampling. The study included 12,524 households 14,588 children, and 8,829 infants with blood drawn.

Due to the costs involved in metabolome analysis, it was necessary to further reduce the sample size to around 5,000 children that is equivalent to 57% of total participants from ENANI-2019 with stored blood specimens. To avoid a biased sample and keep the representativeness and generability, the 5,004 selected children were drawn from the total samples of 8,829 to keep the original distribution according age groups (6 to 11 months, 12 to 23 months, and 24 to 59 months), and some health conditions related to iron metabolism, e.g., anemia and nutrient deficiencies. Then, they were randomly selected to constitute the final sample that aimed to represent the total number of children with blood drawn. Hence, our efforts were to preserve the original characteristics of the sample and the representativeness of the original sample.

The ENANI-2019 study does not appear to present a bias towards higher socioeconomic status. Evidence from two major Brazilian population-based household surveys supports this claim. The 2017-18 Household Budget Survey (POF) reported an average monthly household income of 5,426.70 reais, while the Continuous National Household Sample Survey (PNAD) reported that in 2019, the nominal monthly per capita household income was 1,438.67 reais. In comparison, ENANI-2019 recorded a household income of 2,144.16 reais and a per capita income of 609.07 reais in infants with blood drawn, and 2,099.14 reais and 594.74 reais, respectively, in the serum metabolome analysis sample.

In terms of maternal education, the 2019 PNAD-Education survey indicated that 48.8% of individuals aged 25 or older had at least 11 years of schooling. When analyzing ENANI-2019 under the same metric, we found that 56.26% of ≥25 years-old mothers of infants with blood drawn had 11 years of education or more, and 51.66% in the metabolome analysis sample. Although these figures are slightly higher, they remain within a reasonable range for population studies.

It is well known that higher income and maternal education levels can influence child health outcomes, and acknowledging this, ENANI-2019 employed rigorous sampling methods to minimize selection biases. This included stratified and complex sampling designs to ensure that underrepresented groups were adequately included, reducing the risk of skewed conclusions. Therefore, the evidence strongly suggests that the ENANI-2019 sample is broadly representative of the Brazilian population in terms of both socioeconomic status and educational attainment.

Further to this, could the Authors describe how inequalities in access to care in the Brazilian population may have affected their results? Could they have included a measure of this possible discrepancy in their analyses?

Thank you for the concern.

The truth is that we are not in a position to answer this question because our study focused on gathering data on infant nutritional status and there is very limited information on access to care to allow us to hypothesize. Another important piece of information is that this national survey used sampling procedures that aimed to make the sample representative of the 15 million Brazilian infants under 5 years. Therefore, the sample is balanced according to socio-economic strata, so there is no evidence to make us believe inequalities in access to health care would have played a role.

The Authors state that the results of their study may be used to track children at risk for developmental delays. Could they discuss the potential for influencing policies and guidelines to address delayed development due to malnutrition and/or limited access to certain essential foods?

The point raised by the reviewer is very relevant. Recognizing that dietary and microbial derived metabolites involved in the gut-brain axis could be related to children's risk of developmental delays is the first step to bringing this topic to the public policy agenda. We believe the results can contribute to the literature, which should be used to accumulate evidence to overcome knowledge gaps and support the formulation and redirection of public policies aimed at full child growth and development; the promotion of adequate and healthy nutrition and food security; the encouragement, support, and protection of breastfeeding; and the prevention and control of micronutrient deficiencies.

Reviewer #3 (Public Review):

The ENANI-2019 study provides valuable insights into child nutrition, development, and metabolomics in Brazil, highlighting both challenges and opportunities for improving child health outcomes through targeted interventions and further research.

Readers might consider the following questions:

(1) Should investigators study the families through direct observation of diet and other factors to look for a connection between food taken in and gut microbiome and child development?

As mentioned before, the ENANI-2019 did not collect data on stool derived microbiome. However, there is data on child dietary intake with 24-hour recall that can be further explored in other studies.

(2) Can an examination of the mother's gut microbiome influence the child's microbiome? Can the mother or caregiver's microbiome influence early childhood development?

The questions raised by the reviewer are interesting and has been explored by other authors. However, we do not have microbiota data from the child nor from the mother/caregiver.

(3) Is developmental quotient enough to study early childhood development? Is it comprehensive enough?

Yes, we are confident it is comprehensive enough.

According to the World Health Organization, the term Early Childhood Development (ECD) refers to the cognitive, physical, language, motor, social and emotional development between 0 - 8 years of age. The SWCY milestones assess the domains of cognition, language/communication and motor. Therefore, it has enough content validity to represent ECD.

The SWYC is recommended for screening ECD by the American Society of Pediatrics. Furthermore, we assessed the internal consistency of the SWYC milestones questionnaire using ENANI-2019 data and Cronbach's alpha. The findings indicated satisfactory reliability (0.965; 95% CI: 0.963–0.968).

The SWCY is a screening instrument and indicates if the ECD is not within the expected range. If one of the above-mentioned domains are not achieved as expected the child may be at risk of ECD delay. Therefore, DQ<1 indicates that a child has not reached the expected ECD for the age group. We cannot say that children with DQ≥1 have full ECD, since we do not assess the socio-emotional domains. However, DQ can track the risk of ECD delay.

References

Blekherman, G., Laubenbacher, R., Cortes, D. F., Mendes, P., Torti, F. M., Akman, S., ... & Shulaev, V. (2011). Bioinformatics tools for cancer metabolomics. Metabolomics, 7, 329-343.

Gromski, P. S., Muhamadali, H., Ellis, D. I., Xu, Y., Correa, E., Turner, M. L., & Goodacre, R. (2015). A tutorial review: Metabolomics and partial least squares-discriminant analysis–a marriage of convenience or a shotgun wedding. Analytica chimica acta, 879, 10-23.

Wold, S., Sjöström, M., & Eriksson, L. (2001). PLS-regression: a basic tool of chemometrics. Chemometrics and intelligent laboratory systems, 58(2), 109-130.

LUIZ, RR., and STRUCHINER, CJ. Inferência causal em epidemiologia: o modelo de respostas potenciais [online]. Rio de Janeiro: Editora FIOCRUZ, 2002. 112 p. ISBN 85-7541-010-5. Available from SciELO Books http://books.scielo.org.

GREENLAND, S. & ROBINS, J. M. Identifiability, exchangeability, and epidemiological Confounding. International Journal of Epidemiolgy, 15(3):413-419, 1986.

Freitas-Costa NC, Andrade PG, Normando P, et al. Association of development quotient with nutritional status of vitamins B6, B12, and folate in 6–59-month-old children: Results from the Brazilian National Survey on Child Nutrition (ENANI-2019). The American journal of clinical nutrition 2023;118(1):162-73. doi: https://doi.org/10.1016/j.ajcnut.2023.04.026

Sheldrick RC, Schlichting LE, Berger B, et al. Establishing New Norms for Developmental Milestones. Pediatrics 2019;144(6) doi: 10.1542/peds.2019-0374 [published Online First: 2019/11/16]

Drachler Mde L, Marshall T, de Carvalho Leite JC. A continuous-scale measure of child development for population-based epidemiological surveys: a preliminary study using Item Response Theory for the Denver Test. Paediatric and perinatal epidemiology 2007;21(2):138-53. doi: 10.1111/j.1365-3016.2007.00787.x [published Online First: 2007/02/17]

VanderWeele, TJ Princípios de seleção de fatores de confusão. Eur J Epidemiol 34, 211–219 (2019). https://doi.org/10.1007/s10654-019-00494-6

David G. Kleinbaum, Lawrence L. Kupper; Hal Morgenstern. Epidemiologic Research: Principles and Quantitative Methods. 1991

Yan R, Liu X, Xue R, Duan X, Li L, He X, Cui F, Zhao J. Association between internet exclusion and depressive symptoms among older adults: panel data analysis of five longitudinal cohort studies. EClinicalMedicine 2024;75. doi: 10.1016/j.eclinm.2024.102767.

Zhong Y, Lu H, Jiang Y, Rong M, Zhang X, Liabsuetrakul T. Effect of homemade peanut oil consumption during pregnancy on low birth weight and preterm birth outcomes: a cohort study in Southwestern China. Glob Health Action. 2024 Dec 31;17(1):2336312.

Aristizábal LYG, Rocha PRH, Confortin SC, et al. Association between neonatal near miss and infant development: the Ribeirão Preto and São Luís birth cohorts (BRISA). BMC Pediatr. 2023;23(1):125. Published 2023 Mar 18. doi:10.1186/s12887-023-03897-3

Al-Haddad BJS, Jacobsson B, Chabra S, et al. Long-term risk of neuropsychiatric disease after exposure to infection in utero. JAMA Psychiatry. 2019;76(6):594-602. doi:10.1001/jamapsychiatry.2019.0029

Chan, A.Y.L., Gao, L., Hsieh, M.HC. et al. Maternal diabetes and risk of attention-deficit/hyperactivity disorder in offspring in a multinational cohort of 3.6 million mother–child pairs. Nat Med 30, 1416–1423 (2024).

Hernan MA, Robins JM (2020). Causal Inference: What If. Boca Raton: Chapman & Hall/CRC.

Greenland S; Pearl J; Robins JM. Confounding and collapsibility in causal inference. Statist Sci. 14 (1) 29 - 46 1999. https://doi.org/10.1214/ss/1009211805

Read the original source
eLife
Oct 19, 2024

eLife Assessment

This important work advances our understanding of factors influencing early childhood development. The large sample size and methodology applied make the findings of this study convincing; however, support for some of the claims made by the authors is incomplete. The work will be of interest to researchers in developmental science and early childhood pediatrics.

Read the original source
eLife
Oct 19, 2024

Reviewer #1 (Public Review):

Padilha et al. aimed to find prospective metabolite biomarkers in serum of children aged 6-59 months that were indicative of neurodevelopmental outcomes. The authors leveraged data and samples from the cross-sectional Brazilian National Survey on Child Nutrition (ENANI-2019), and an untargeted multisegment injection-capillary electrophoresis-mass spectrometry (MSI-CE-MS) approach was used to measure metabolites in serum samples (n=5004) which were identified via a large library of standards. After correlating the metabolite levels against the developmental quotient (DQ), or the degree of which age-appropriate developmental milestones were achieved as evaluated by the Survey of Well-being of Young Children, serum concentrations of phenylacetylglutamine (PAG), cresol sulfate (CS), hippuric acid (HA) and …

Reviewer #1 (Public Review):

Padilha et al. aimed to find prospective metabolite biomarkers in serum of children aged 6-59 months that were indicative of neurodevelopmental outcomes. The authors leveraged data and samples from the cross-sectional Brazilian National Survey on Child Nutrition (ENANI-2019), and an untargeted multisegment injection-capillary electrophoresis-mass spectrometry (MSI-CE-MS) approach was used to measure metabolites in serum samples (n=5004) which were identified via a large library of standards. After correlating the metabolite levels against the developmental quotient (DQ), or the degree of which age-appropriate developmental milestones were achieved as evaluated by the Survey of Well-being of Young Children, serum concentrations of phenylacetylglutamine (PAG), cresol sulfate (CS), hippuric acid (HA) and trimethylamine-N-oxide (TMAO) were significantly negatively associated with DQ. Examination of the covariates revealed that the negative associations of PAG, HA, TMAO and valine (Val) with DQ were specific to younger children (-1 SD or 19 months old), whereas creatinine (Crtn) and methylhistidine (MeHis) had significant associations with DQ that changed direction with age (negative at -1 SD or 19 months old, and positive at +1 SD or 49 months old). Further, mediation analysis demonstrated that PAG was a significant mediator for the relationship of delivery mode, child's diet quality and child fiber intake with DQ. HA and TMAO were additional significant mediators of the relationship of child fiber intake with DQ.

Strengths of this study include the large cohort size and study design allowing for sampling at multiple time points along with neurodevelopmental assessment and a relatively detailed collection of potential confounding factors including diet. The untargeted metabolomics approach was also robust and comprehensive allowing for level 1 identification of a wide breadth of potential biomarkers. Given their methodology, the authors should be able to achieve their aim of identifying candidate serum biomarkers of neurodevelopment for early childhood. The results of this work would be of broad interest to researchers who are interested in understanding the biological underpinnings of development and also for tracking development in pediatric populations, as it provides insight for putative mechanisms and targets from a relevant human cohort that can be probed in future studies. Such putative mechanisms and targets are currently lacking in the field due to challenges in conducting these kind of studies, so this work is important.

However, in the manuscript's current state, the presentation and analysis of data impede the reader from fully understanding and interpreting the study's findings. Particularly, the handling of confounding variables is incomplete. There is a different set of confounders listed in Table 1 versus Supplementary Table 1 versus Methods section Covariates versus Figure 4. For example, Region is listed in Supplementary Table 1 but not in Table 1, and Mode of Delivery is listed in Table 1 but not in Supplementary Table 1. Many factors are listed in Figure 4 that aren't mentioned anywhere else in the paper, such as gestational age at birth or maternal pre-pregnancy obesity.

The authors utilize the directed acrylic graph (DAG) in Figure 4 to justify the further investigation of certain covariates over others. However, the lack of inclusion of the microbiome in the DAG, especially considering that most of the study findings were microbial-derived metabolite biomarkers, appears to be a fundamental flaw. Sanitation and micronutrients are proposed by the authors to have no effect on the host metabolome, yet sanitation and micronutrients have both been demonstrated in the literature to affect microbiome composition which can in turn affect the host metabolome.

Additionally, the authors emphasized as part of the study selection criteria the following,
"Due to the costs involved in the metabolome analysis, it was necessary to further reduce the sample size. Then, samples were stratified by age groups (6 to 11, 12 to 23, and 24 to 59 months) and health conditions related to iron metabolism, such as anemia and nutrient deficiencies. The selection process aimed to represent diverse health statuses, including those with no conditions, with specific deficiencies, or with combinations of conditions. Ultimately, through a randomized process that ensured a balanced representation across these groups, a total of 5,004 children were selected for the final sample (Figure 1)."

Therefore, anemia and nutrient deficiencies are assumed by the reader to be important covariates, yet, the data on the final distribution of these covariates in the study cohort is not presented, nor are these covariates examined further.

The inclusion of specific covariates in Table 1, Supplementary Table 1, the statistical models, and the mediation analysis is thus currently biased as it is not well justified.

Finally, it is unclear what the partial-least squares regression adds to the paper, other than to discard potentially interesting metabolites found by the initial correlation analysis.

Read the original source
eLife
Oct 19, 2024

Reviewer #2 (Public Review):

A strength of the work lies in the number of children Padilha et al. were able to assess (5,004 children aged 6-59 months) and in the extensive screening that the Authors performed for each participant. This type of large-scale study is uncommon in low-to-middle-income countries such as Brazil.
The Authors employ several approaches to narrow down the number of potentially causally associated metabolites.
Could the Authors justify on what basis the minimum dietary diversity score was dichotomized? Were sensitivity analyses undertaken to assess the effect of this dichotomization on associations reported by the article? Consumption of each food group may have a differential effect that is obscured by this dichotomization.
Could the Authors specify the statistical power associated with each analysis?
Could the …

Reviewer #2 (Public Review):

A strength of the work lies in the number of children Padilha et al. were able to assess (5,004 children aged 6-59 months) and in the extensive screening that the Authors performed for each participant. This type of large-scale study is uncommon in low-to-middle-income countries such as Brazil.
The Authors employ several approaches to narrow down the number of potentially causally associated metabolites.
Could the Authors justify on what basis the minimum dietary diversity score was dichotomized? Were sensitivity analyses undertaken to assess the effect of this dichotomization on associations reported by the article? Consumption of each food group may have a differential effect that is obscured by this dichotomization.
Could the Authors specify the statistical power associated with each analysis?
Could the Authors describe in detail which metric they used to measure how predictive PLSR models are, and how they determined what the "optimal" number of components were?
The Authors use directed acyclic graphs (DAG) to identify confounding variables of the association between metabolites and DQ. Could the dataset generated by the Authors have been used instead? Not all confounding variables identified in the literature may be relevant to the dataset generated by the Authors.
Were the systematic reviews or meta-analyses used in the DAG performed by the Authors, or were they based on previous studies? If so, more information about the methodology employed and the studies included should be provided by the Authors.
Approximately 72% of children included in the analyses lived in households with a monthly income superior to the Brazilian minimum wage. The cohort is also biased towards households with a higher level of education. Both of these measures correlate with developmental quotient. Could the Authors discuss how this may have affected their results and how generalizable they are?
Further to this, could the Authors describe how inequalities in access to care in the Brazilian population may have affected their results? Could they have included a measure of this possible discrepancy in their analyses?
The Authors state that the results of their study may be used to track children at risk for developmental delays. Could they discuss the potential for influencing policies and guidelines to address delayed development due to malnutrition and/or limited access to certain essential foods?

Read the original source
eLife
Oct 19, 2024

Reviewer #3 (Public Review):

The ENANI-2019 study provides valuable insights into child nutrition, development, and metabolomics in Brazil, highlighting both challenges and opportunities for improving child health outcomes through targeted interventions and further research.

Strengths of the methods and results:
(1) The study utilizes data from the ENANI-2019 cohort, which was already existing. This cohort choice allows for longitudinal assessments and exploration of associations between metabolites and developmental outcomes. In addition, there was conservation of resources which are scanty in all settings in the current scenario.
(2) The study aims to investigate the relationship between circulating metabolites (exposure) and early childhood development (outcome), specifically developmental quotient (DQ). The objectives are clearly …

Reviewer #3 (Public Review):

The ENANI-2019 study provides valuable insights into child nutrition, development, and metabolomics in Brazil, highlighting both challenges and opportunities for improving child health outcomes through targeted interventions and further research.

Strengths of the methods and results:
(1) The study utilizes data from the ENANI-2019 cohort, which was already existing. This cohort choice allows for longitudinal assessments and exploration of associations between metabolites and developmental outcomes. In addition, there was conservation of resources which are scanty in all settings in the current scenario.
(2) The study aims to investigate the relationship between circulating metabolites (exposure) and early childhood development (outcome), specifically developmental quotient (DQ). The objectives are clearly stated, which facilitates focused research questions and hypotheses. The population that is studied is clearly mentioned.
(3) The study accessed a large number of children under five years, with blood collected from a final sample size of 5,004 children. The exclusion of infants under six months due to venipuncture challenges and lack of reference values highlights practical considerations in research design.
The study sample reflects a diverse range of children in terms of age, sex distribution, weight status, maternal education, and monthly family income. This diversity enhances the generalizability of findings across different sociodemographic groups within Brazil.
(4) The study uses standardized measures (e.g., DQ assessments) and chronological age. Confounding variables, such as child's age, diet quality, and nutritional status, are carefully considered and incorporated into analyses through a Directed Acyclic Graph (DAG). The mean DQ of 0.98 indicates overall developmental norms among the studied children, with variations noted across different demographic factors such as age, region, and maternal education. The prevalence of Minimum Dietary Diversity (MDD) being met by 59.3% of children underscores dietary patterns and their potential impact on health outcomes. The association between nutritional status (weight-for-height z-scores) and developmental outcomes (DQ) provides insights into the interplay between nutrition and child development.
The study identified key metabolites associated with developmental quotient (DQ):
Component 1: Branched-chain amino acids (Leucine, Isoleucine, Valine).
Component 2: Uremic toxins (Cresol sulfate, Phenylacetylglutamine).
Component 3: Betaine and amino acids (Glutamine, Asparagine).
The study focused on several serum metabolites like PAG (phenylacetylglutamine), CS (p-cresyl sulfate), HA (hippuric acid), TMAO (trimethylamine-N-oxide), MeHis (methylhistidine), and Crtn (creatinine). These metabolites are implicated in various metabolic pathways linked to gut microbiota activity, amino acid metabolism, and dietary factors.
These metabolites explained a significant portion of both metabolite variance (39.8%) and DQ variance (4.3%). The study suggests that these metabolites can be used as proxy measures of the gut microbiome in children.
(5) The use of partial least square regression (PLSR) with cross-validation (80% training, 20% testing) which is a robust approach to identify metabolites predictive of DQ, which minimizes overfitting. This model allows for outliers to remain outliers for transparency.
The Directed Acyclic Graph (DAG) identifies and adjusts for confounding variables (e.g., child's diet quality, nutritional status) and strengthens the validity of findings by controlling for potential biases. Developmental and gender differences were studied by testing interactions with the age of the child and the sex.
Mediation analysis exploring metabolites as potential mediators provides insights into underlying pathways linking exposures (e.g., diet, microbiome) with DQ.
The use of Benjamini-Hochberg correction for multiple comparisons and bootstrap tests (5,000 iterations) enhances the reliability of results by controlling false discovery rates and assessing significance robustly.

Significant correlations between serum metabolites and DQ, particularly negative associations with certain metabolites like PAG and CS, suggest potential biomarkers or pathways influencing developmental outcomes. Notably, these associations varied with age, suggesting different metabolic impacts during early childhood development.

Weaknesses:
(1) The data collected was incomplete especially those related to breastfeeding history and birth weight. These have been mentioned in the limitations of the study but yet might have been potential confounders or even factors leading to the particular identified metabolite state of the population.
(2) Other tests than mediation analysis might have been used to ensure reliability and robustness of the data. How data was processed, data cleaning methods, how outliers were handled and sensitivity analyses would ensure robustness of the findings.
(3) The generalizability of the data is not sound especially considering the children mostly belonged to a higher socioeconomic group in Brazil with mother or caregiver education being above a certain level. Comparative studies with children from other socio-economic groups and other cohorts might have been useful. Consideration of sample size adequacy and power analysis might have helped in generalizing the findings.
(4) Caution is needed in interpreting causality from this data because of the nature of the study design Discussing alternative explanations and potential confounding factors in more depth could strengthen the conclusions.

Appraisal
(1) The aims of the study were to identify associations between children's serum metabolome and Early Childhood development. This aim was met. The results do confirm their conclusions.
Impact of the work on the field

(1) Unless actual gut microbiome of children in this age group from gut bacteria examination or gastrointestinal examination of the gut of children, the causality of gut metabolome on early childhood development cannot be established with certainty. Because this may not be possible in every situation, proxy methods such as the one elucidated here might be useful, considering the risk-benefit ratio.
(2) More research is needed on this theme through longitudinal studies to validate these findings and explore underlying pathways involving gut-brain interactions and metabolic dysregulation.
Other readings: Readers are advised to read other research from other countries and other languages to understand the connection between gut microbiome, metabolite spectra, and child development. In addition to study the effect of these factors on child mental development too.

Readers might consider the following questions:
(1) Should investigators study the families through direct observation of diet and other factors to look for a connection between food taken in and gut microbiome and child development?
(2) Can an examination of the mother's gut microbiome influence the child's microbiome? Can the mother or caregiver's microbiome influence early childhood development?
(3) Is developmental quotient enough to study early childhood development? Is it comprehensive enough?

Read the original source
Version published to 10.1101/2024.05.09.24307119 on medRxiv
May 10, 2024