A call to ensure reproducibility of machine learning applications in industrial ecology
Abstract
Machine learning (ML) usage in industrial ecology (IE) has grown nearly tenfold in the last decade. In other fields, similar increases in ML adoption have led to the widespread publication of results that cannot be reproduced. This rise in irreproducibility, driven by a failure to follow best practices when creating and reporting models, undermines the conclusions and credibility of science. Industrial ecologists have not yet determined whether reproducibility is becoming a concern in their applications of ML. To assess this risk, we audited 50 recent IE studies against an ML reproducibility ontology. We find that 84% of the surveyed studies suffer from computational reproducibility issues, and 28% exhibit methodological flaws that could introduce data leakage and invalidate their findings. Yet bibliometric analysis shows that these potentially irreproducible studies are cited as often as, or more frequently than, their non-ML counterparts, which could embed flawed results in the scientific literature. Our findings serve as a call to action for the IE community. We suggest multi-level interventions, including that journals adopt reproducibility checklists and that reviewers prioritize key reproducibility errors over performance metrics, to safeguard the field and maximize the reproducibility of future ML-driven IE research.