Experimental time series data with and without anomalies from a continuous distillation mini-plant for development of machine learning anomaly detection methods
Abstract
Reliable detection of process anomalies remains a challenge in industrial chemical plants. The ability of machine learning (ML) to recognize patterns has triggered numerous research efforts to apply ML to anomaly detection (AD). Simulation-based benchmarks such as the Tennessee Eastman process are widely used to develop and train AD methods. Real-world process data, which are crucial for meaningful research advances, are scarce because industrial data are usually proprietary. To overcome this issue, we present an openly accessible dataset of time series generated from an industry-like continuous distillation mini-plant under steady-state conditions. The data cover processes of differing complexity: water runs, a heteroazeotropic separation of n-butanol and water, and a reactive process to produce a fuel additive. Chemical systems, plant setup, and anomalies (encountered or induced manually) are described alongside the sample data. The complete dataset, including sensor and actuator data, annotations that mark anomalies, and other metadata, is openly available in an online repository. It serves as training and testing data for ML-based AD and other data-driven applications.