A Machine Learning Approach to Valve Plate Failure Prediction in Piston Pumps Under Imbalanced Data Conditions: Comparison of Data Balancing Methods
Abstract
This article addresses the problem of building a real-world predictive maintenance system for hydraulic piston pumps. Particular attention is given to the limited availability of data describing the failure state of systems with a damaged valve plate. The main objective of this work was to analyze the impact of imbalanced data on the quality of the failure prediction system. Several data balancing techniques, including oversampling, undersampling, and combined methods, were evaluated to mitigate this limitation. The dataset used for evaluation includes recordings from eleven sensors, such as pressure, flow, and temperature, registered at various points in the hydraulic system. It also includes data from three additional vibration sensors. The experiments were conducted with imbalance ratios ranging from 0.5% to a fully balanced dataset. The results indicate that two methods, Borderline SMOTE and SMOTE+Tomek-Links, dominate. For balance rates above three percent, these methods allowed the system to achieve the highest performance on a completely new dataset with different levels of valve plate damage. Furthermore, for balance rates below one percent, the use of data balancing methods may harm the model. Finally, our results indicate the limitations of cross-validation when assessing data balancing methods.
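To make the oversampling idea concrete, the following is a minimal sketch of the interpolation step behind plain SMOTE, written with the Python standard library only. It is not the authors' implementation and does not reproduce the Borderline SMOTE or SMOTE+Tomek-Links variants named above. The function name `smote_oversample`, its parameters, and the two-feature toy data are assumptions made purely for illustration.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base point (excluding itself)
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


# Hypothetical toy data with 2 features per sample (not the pump dataset).
healthy = [(float(x), float(y)) for x in range(10) for y in range(10)]  # 100 majority samples
faulty = [(20.0, 20.0), (21.0, 20.5), (20.5, 21.5), (22.0, 21.0)]  # 4 minority samples

# Oversample the minority class until both classes have the same size.
new_faulty = smote_oversample(faulty, n_new=len(healthy) - len(faulty), k=3)
balanced_faulty = faulty + new_faulty
print(len(healthy), len(balanced_faulty))
```

Every synthetic point lies on a segment between two existing minority samples, so when only a handful of failure recordings exist the generated points add little new information. This is one plausible reading of why balancing may stop helping at very low balance rates, though the abstract itself does not state the cause.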