A Convolutional Neural Networks Stochastic Petri Nets Hybrid Modeling Approach for Reliable Classification and Monitoring of Hadoop Cluster Nodes
Abstract
Ensuring the reliability and security of large-scale distributed infrastructures such as Hadoop clusters requires monitoring models that capture both deterministic structural behavior and stochastic variations caused by workload dynamics and potential anomalies. To address this need, we introduce a CNN-SPN hybrid model that combines a Convolutional Neural Network (CNN) with a multi-density Stochastic Petri Net (SPN), that is, an SPN incorporating several probability distributions: exponential, normal, log-normal, Poisson, and Weibull. The SPN component provides a rigorous formalization of system behavior, while the CNN component learns discriminative patterns that support supervised classification of nodes into stable and non-stable operational states. The hybrid architecture draws on the strengths of both paradigms: stochastic modeling captures temporal and probabilistic transitions, and deep learning generalizes from observed execution traces. The proposed model is evaluated through simulation in the TimeNet environment, complemented by real cluster activity logs. Experimental results show that the CNN-SPN multi-density approach significantly improves classification performance, reaching an accuracy of 97.8%, a precision of 96.4%, a recall of 95.7%, and an F1-score of 96.0%, a value consistent with the reported precision and recall. Compared to traditional models such as Logistic Regression, SVM, and Random Forest, the CNN-SPN hybrid not only achieves higher accuracy but also captures rare or extreme node failures that conventional models may miss. These results support integrating multi-density stochastic representation with deep neural architectures to detect abnormal behaviors and enhance the overall security and resilience of distributed systems.
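To make the "multi-density" idea concrete, the following is a minimal sketch of drawing firing delays for timed transitions from the five laws named in the abstract. It is not the authors' model: the abstract does not say which law is attached to which transition, so the parameter values, the `SAMPLERS` table, and the helpers `poisson_knuth` and `sample_delays` are all hypothetical and chosen only for illustration. The normal law is clamped at zero here because a delay cannot be negative, which is also an assumption.

```python
import math
import random


def poisson_knuth(lam, rng):
    """Draw one Poisson(lam) variate using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


# Hypothetical parameters (assumptions, not taken from the paper).
SAMPLERS = {
    "exponential": lambda rng: rng.expovariate(2.0),         # rate 2.0, mean 0.5
    "normal": lambda rng: max(0.0, rng.gauss(1.0, 0.2)),     # clamped at 0
    "log-normal": lambda rng: rng.lognormvariate(0.0, 0.5),  # mu, sigma of log
    "poisson": lambda rng: float(poisson_knuth(3.0, rng)),   # mean 3.0
    "weibull": lambda rng: rng.weibullvariate(1.0, 1.5),     # scale, shape
}


def sample_delays(law, n, seed=0):
    """Return n firing delays drawn from the named law, reproducibly."""
    rng = random.Random(seed)
    draw = SAMPLERS[law]
    return [draw(rng) for _ in range(n)]


if __name__ == "__main__":
    for law in SAMPLERS:
        delays = sample_delays(law, 10_000)
        mean = sum(delays) / len(delays)
        print(f"{law:12s} mean delay = {mean:.3f}")
```

In a setup like this, a simulator would pick the sampler that matches each timed transition, and the resulting traces could serve as training input for a classifier. Heavy-tailed laws such as log-normal and Weibull are the ones that produce the occasional very long delay, which is one plausible reading of how rare or extreme node behavior enters the model.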