Ultra-processed food intake and colorectal cancer risk in the NIH-AARP Diet and Health Study

Abstract

Background

Ultra-processed foods (UPF) account for >50% of calories consumed by US adults. Strong evidence links whole grain, fiber, calcium, and dairy intake to lower colorectal cancer (CRC) risk and processed meat intake to higher CRC risk. UPF include some whole grain and dairy products and most processed meats. Findings from studies of UPF intake and CRC risk are inconsistent.

Objective

To estimate the association between UPF intake and CRC risk, to evaluate the joint association of UPF intake and diet quality with CRC risk, and to estimate associations of select food groups and nutrients with CRC risk by UPF and non-UPF source.

Methods

US adults, aged 50-71, who participated in the NIH-AARP Diet and Health Study self-reported dietary intake using a validated food frequency questionnaire (FFQ). We assigned disaggregated FFQ items to Nova classification and categorized UPF intake (g/1000 kcal/day) into sex-specific quintiles. We used multivariable-adjusted Cox proportional hazards regression models to estimate hazard ratios (HR) and 95% confidence intervals (CI) for CRC.
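The exposure construction described above (energy-adjusted UPF density, then quintiles formed separately within each sex) can be sketched as follows. This is a minimal illustration, not the authors' code: the record fields `id`, `sex`, `upf_g`, and `kcal` are hypothetical names, and the cutpoint rule (the default "exclusive" method of Python's `statistics.quantiles`) is an assumption, since the abstract does not state how cutpoints were computed.

```python
from collections import defaultdict
from statistics import quantiles


def upf_density(upf_grams_per_day, energy_kcal_per_day):
    """Return UPF intake expressed as grams per 1000 kcal per day."""
    if energy_kcal_per_day <= 0:
        raise ValueError("energy intake must be positive")
    return upf_grams_per_day * 1000.0 / energy_kcal_per_day


def sex_specific_quintiles(records):
    """Map each participant id to a quintile (1-5) of UPF density.

    `records` is a list of dicts with the hypothetical keys
    'id', 'sex', 'upf_g' (g/day), and 'kcal' (kcal/day).
    Cutpoints are computed separately within each sex.
    """
    by_sex = defaultdict(list)
    for r in records:
        density = upf_density(r["upf_g"], r["kcal"])
        by_sex[r["sex"]].append((r["id"], density))

    assignment = {}
    for rows in by_sex.values():
        # Four cutpoints split each sex-specific distribution into five groups.
        cuts = quantiles([d for _, d in rows], n=5)
        for pid, d in rows:
            # Quintile = 1 + number of cutpoints the value exceeds.
            assignment[pid] = 1 + sum(d > c for c in cuts)
    return assignment
```

Because cutpoints are sex-specific, two participants with the same UPF density can fall into different quintiles when the intake distributions of men and women differ.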

Results

Over 20 years of follow-up, 10,075 colorectal adenocarcinoma cases were diagnosed among 461,682 participants who were cancer-free at baseline. Median UPF intake was 293 g/1000 kcal/day, or 43% of daily energy intake. UPF intake was not associated with incident CRC overall (HR for Q5 vs. Q1 = 0.97; 95% CI, 0.91-1.03; P-trend = .55) or by anatomic location (all P-trend > .05). Whole grain, dairy, and calcium intake were inversely associated with CRC risk, and meat intake was positively associated with CRC risk, regardless of processing level.

Conclusions

Total UPF intake was not associated with incident CRC in this cohort of older US adults. This may be explained, in part, by opposing effects of some UPF on CRC etiology. Our findings support current dietary guidance to consume whole grains, fiber, dairy, and calcium and to avoid processed meat for CRC prevention.

Article activity feed

  1. This Zenodo record is a permanently preserved version of a PREreview. You can view the complete PREreview at https://prereview.org/reviews/18162576.

    Summary

    This preprint examines the association between ultra-processed food (UPF) intake and colorectal cancer (CRC) risk in the NIH–AARP Diet and Health Study, a very large prospective cohort with over 20 years of follow-up. Using FFQ-based NOVA classification, the authors report largely null associations between total UPF intake and CRC incidence overall and by anatomic subsite. Importantly, the study goes beyond aggregate UPF exposure by examining diet quality, source-specific nutrients/foods, and compositional substitution models, offering a nuanced interpretation that opposing components within UPF may obscure associations with CRC risk.

    Overall, this is a carefully conducted, transparent, and methodologically sophisticated analysis that makes a valuable contribution to the ongoing debate on UPF and cancer risk, particularly in the U.S. context.

    Strengths

    • Exceptional sample size and follow-up provide strong statistical power for overall and subsite-specific CRC analyses.

    • Thoughtful exposure construction, including disaggregation of FFQ items and validation against calibration sub-studies.

    • Comprehensive analytical strategy, including joint analyses with HEI-2015, source-specific nutrient/food models, and compositional substitution analyses.

    • Balanced interpretation that appropriately situates the null findings within prior mixed evidence and highlights population-specific UPF composition as a plausible explanation.

    • The source-specific analyses convincingly show that established CRC-related dietary factors (e.g., calcium, dairy, processed meat) operate similarly regardless of processing level, which is an important insight for public health messaging.

    Major Concerns / Limitations

    1. Exposure misclassification over long follow-up: UPF intake is assessed only once in the mid-1990s, with no time-updated measures. Given substantial changes in the U.S. food supply over the past two decades, this likely induces non-differential misclassification and attenuation toward the null.

    2. NOVA classification limitations: NOVA assignment relies on database linkage rather than ingredient-level data, particularly for mixed or ambiguous foods (e.g., yogurt, breads). This is acknowledged, but the likely magnitude of attenuation is not quantified.

    3. Handling of proportional hazards violations: The proportional hazards assumption is violated and addressed via stratified 5-year intervals. While reasonable, this approach is relatively coarse; more flexible time-varying coefficient models could better characterize temporal heterogeneity.

    Suggestions for Strengthening the Manuscript

    • Clarify missing data handling, including the extent of missingness and whether complete-case analysis, missing indicators, or multiple imputation was used.

    • Add quantitative bias analysis: regression calibration using the NIH–AARP calibration sub-study to assess the likely impact of exposure misclassification, and sensitivity analyses (e.g., E-values) to assess the likely impact of unmeasured confounding.

    • Present exposure distributions more fully, including quintile cutpoints and IQRs for UPF metrics, to aid interpretation of effect sizes.

    • If feasible, expand UPF subgroup analyses (e.g., ultra-processed meats, SSBs, breads/cereals) to test the hypothesis of heterogeneous effects within UPF.
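The E-value sensitivity analysis suggested above can be sketched in a few lines. This is a minimal illustration under stated assumptions: it uses the standard closed-form E-value for a risk ratio, E = RR + sqrt(RR × (RR − 1)), with estimates below 1 inverted first, and it treats the reported hazard ratio as an approximate risk ratio, which is reasonable only because CRC is a rare outcome in this cohort. The function names are hypothetical.

```python
import math


def e_value(rr):
    """E-value for a risk-ratio point estimate.

    Assumes the hazard ratio approximates a risk ratio (rare outcome).
    Estimates below 1 are inverted before applying the formula.
    """
    if rr <= 0:
        raise ValueError("ratio must be positive")
    if rr < 1:
        rr = 1.0 / rr
    return rr + math.sqrt(rr * (rr - 1.0))


def e_value_for_ci(lower, upper):
    """E-value for the confidence limit closest to the null.

    Returns 1.0 when the interval already includes the null value of 1.
    """
    if lower <= 1.0 <= upper:
        return 1.0
    return e_value(upper) if upper < 1.0 else e_value(lower)
```

Applied to the abstract's main estimate (HR = 0.97; 95% CI, 0.91-1.03), `e_value(0.97)` is about 1.21 and `e_value_for_ci(0.91, 1.03)` is 1.0, because the interval includes the null. For a null finding like this one, the E-value mainly indicates how little unmeasured confounding would be needed to shift the estimate, rather than how robust a non-null association is.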

    Overall Assessment

    This study provides robust evidence that total UPF intake, as currently measured in older U.S. adults, is not a useful standalone marker for colorectal cancer risk. The null findings are plausible given likely non-differential misclassification and the presence of both protective and harmful components within UPF in this population. The large scale, rigorous methods, and triangulation across multiple analytical approaches make this an informative and valuable contribution. With additional transparency around missing data, exposure distributions, and sensitivity analyses, the manuscript would be further strengthened.

    Competing interests

    The author declares that they have no competing interests.

    Use of Artificial Intelligence (AI)

    The author declares that they did not use generative AI to come up with new ideas for their review.