Are Food Safety Classifiers Learning Hazards or Memorizing Firms? Entity-Level Leakage in FDA Recall Severity Prediction


Abstract

Machine learning (ML) models for predicting food recall severity could accelerate regulatory triage, yet no systematic benchmark exists on the U.S. Food and Drug Administration (FDA) open-access database. We construct the first comprehensive ML benchmark for FDA food recall severity classification (Class I / II / III) using 28,448 enforcement records spanning 2012–2025. A 1,437-dimensional feature space is engineered from TF-IDF and Sentence-BERT embeddings of recall narratives, structured categorical attributes, and temporal indicators. Five classifiers (Logistic Regression, Random Forest, XGBoost, LightGBM, CatBoost) are trained with Optuna-tuned hyperparameters. Under standard random splitting, XGBoost achieves Macro-F1 = 0.89; however, a multi-layer leakage audit reveals that this figure is inflated by entity-level autocorrelation. When firm-aware group splitting, temporal splitting, or their combination is applied, Macro-F1 drops to approximately 0.57. A firm-mode baseline, which assigns each company's historically most frequent severity class, reaches 0.82 under random splitting, demonstrating that 92% of the apparent performance stems from firm-level memorisation. Identity-masking experiments confirm that the leakage is structural rather than attributable to explicit company-name tokens. A 2 × 2 factorial decomposition shows that firm overlap and temporal continuity are highly collinear; removing either suffices to expose the true generalisation floor. A hazard-type decomposition reveals that pathogen–severity associations transfer across firms, whereas labelling and GMP violations are highly firm-specific, explaining the disproportionate collapse of Class III prediction under group splitting. SHAP analysis, feature ablation, and a nine-year continuous-learning simulation provide additional insights into model behaviour and retraining strategies.
We recommend that food-safety ML studies adopt group-aware or temporal evaluation protocols, report entity-overlap statistics, and include entity-prior baselines to prevent overstated conclusions.
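The evaluation protocols the abstract recommends can be illustrated in a few lines. The sketch below, on synthetic data with hypothetical column names (the paper's actual feature pipeline is not reproduced here), contrasts a random split with a firm-aware group split via scikit-learn's `GroupShuffleSplit`, and implements the entity-prior "firm-mode" baseline: predicting each firm's historically most frequent severity class.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split, GroupShuffleSplit
from sklearn.metrics import f1_score

# Toy recall table: firm identifier and severity class (1/2/3).
# Severity is made strictly firm-determined to mimic extreme entity leakage.
rng = np.random.default_rng(0)
firms = rng.integers(0, 50, size=1000)
firm_class = rng.integers(1, 4, size=50)
df = pd.DataFrame({"firm": firms, "severity": firm_class[firms]})

# Random split: the same firm can appear in both train and test.
tr, te = train_test_split(df, test_size=0.3, random_state=0)

# Firm-aware group split: no firm is shared between train and test.
gss = GroupShuffleSplit(n_splits=1, test_size=0.3, random_state=0)
tr_idx, te_idx = next(gss.split(df, groups=df["firm"]))
gtr, gte = df.iloc[tr_idx], df.iloc[te_idx]
assert set(gtr["firm"]).isdisjoint(set(gte["firm"]))

# Firm-mode baseline: predict each firm's most frequent historical class,
# falling back to the global mode for unseen firms.
mode_by_firm = tr.groupby("firm")["severity"].agg(lambda s: s.mode()[0])
fallback = tr["severity"].mode()[0]
pred = te["firm"].map(mode_by_firm).fillna(fallback).astype(int)
print("Random-split firm-mode Macro-F1:",
      round(f1_score(te["severity"], pred, average="macro"), 2))
```

Because severity here is perfectly firm-determined, the firm-mode baseline scores near the ceiling under random splitting while carrying no hazard information at all, which is the failure mode the paper's 0.82 baseline exposes. Reporting this baseline alongside the group-split score makes the memorisation gap explicit.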
