Safety-Aware Multi-Agent Deep Reinforcement Learning for Adaptive Fault-Tolerant Control in Sensor-Lean Industrial Systems: Validation in Beverage CIP


Abstract

Fault-tolerant control in safety-critical industrial systems demands adaptive responses to equipment degradation, parameter drift, and sensor failures while maintaining strict operational constraints. Traditional model-based controllers struggle under these conditions, requiring extensive retuning and dense instrumentation. Recent safe multi-agent reinforcement learning (MARL) frameworks with control barrier functions (CBFs) achieve real-time constraint satisfaction in robotics and power systems, yet they assume comprehensive state observability, which is incompatible with sensor-hostile industrial environments where instrumentation degradation and contamination risks dominate design constraints. This work presents a safety-aware multi-agent deep reinforcement learning framework for adaptive fault-tolerant control in sensor-lean industrial environments, achieving formal safety through learned implicit barriers under partial observability. The framework integrates four synergistic mechanisms: (1) a multi-layer safety architecture combining constrained action projection, prioritized experience replay, conservative training margins, and curriculum-embedded verification, achieving zero constraint violations; (2) multi-agent coordination via decentralized execution with learned complementary policies; (3) curriculum-driven sim-to-real transfer through progressive four-stage learning, achieving 85–92% performance retention without fine-tuning; and (4) offline extended Kalman filter validation enabling 70% instrumentation reduction (91–96% reconstruction accuracy) for regulatory auditing without real-time estimation dependencies.
Validated through sustained deployment in commercial beverage manufacturing clean-in-place (CIP) systems, a representative safety-critical testbed with hard flow constraints (≥1.5 L/s), harsh chemical environments, and zero-tolerance contamination requirements, the framework demonstrates superior control precision (coefficient of variation: 2.9–5.3% versus the 10% industrial standard) across three hydraulic configurations with complexity scores ranging from 2.1 to 8.2 out of 10. Comprehensive validation comprising 37+ controlled stress-test campaigns and hundreds of production cycles (accumulated over 6 months) confirms zero safety violations, high reproducibility (CV variation < 0.3% across replicates), predictable complexity–performance scaling (R² = 0.89), and zero-retuning cross-topology transferability. The system has operated autonomously in active production for over 6 months, establishing a reproducible methodology for safe MARL deployment in partially observable, sensor-hostile manufacturing environments where analytical CBF approaches are structurally infeasible.
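
The two headline metrics in the abstract, the coefficient of variation (CV) of the controlled flow and the count of violations of the 1.5 L/s flow floor, can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names and the sample flow readings are hypothetical, and the population standard deviation is an assumption, since the abstract does not say which estimator was used.

```python
from statistics import mean, pstdev

MIN_FLOW_L_PER_S = 1.5   # hard flow constraint quoted in the abstract
CV_STANDARD_PCT = 10.0   # industrial-standard CV quoted in the abstract


def coefficient_of_variation_pct(samples):
    """Return the coefficient of variation (std / |mean|) as a percentage.

    Assumption: population standard deviation; the abstract does not
    specify the estimator.
    """
    mu = mean(samples)
    if mu == 0:
        raise ValueError("CV is undefined for a zero mean")
    return 100.0 * pstdev(samples) / abs(mu)


def count_flow_violations(samples, minimum=MIN_FLOW_L_PER_S):
    """Count readings that fall below the minimum flow constraint."""
    return sum(1 for q in samples if q < minimum)


# Hypothetical flow readings in L/s, made up for illustration only.
flow = [2.00, 2.10, 1.90, 2.05, 1.95]

cv = coefficient_of_variation_pct(flow)
violations = count_flow_violations(flow)

print(f"CV = {cv:.2f}% (reference standard: {CV_STANDARD_PCT}%)")
print(f"Readings below {MIN_FLOW_L_PER_S} L/s: {violations}")
```

A lower CV means tighter regulation around the mean flow, which is the sense in which the reported 2.9–5.3% is compared against the 10% standard.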
