Predicting Fatal Mine Accident Categories Using Text Mining and Machine Learning: A Comparative Model Analysis

Arra Kumar
Suprakash Gupta

Abstract

Mine accidents continue to endanger workers worldwide, so accurate classification of accidents is essential for developing targeted prevention measures. This research aims to advance beyond reactive safety practices, because prediction models must identify likely accident types during planning. Our method combines machine learning with text mining techniques to examine historical fatal accident reports, an approach not yet reported in the mining safety literature. A prediction pipeline was established to analyse fatal accident reports using natural language processing (NLP) with vectorisation techniques and six machine learning (ML) algorithms: logistic regression (LR), support vector machine (SVM), random forest (RF), naïve Bayes (NB), decision tree (DT), and multilayer perceptron (MLP), evaluated with stratified 10-fold cross-validation. Performance was assessed using the confusion matrix, precision, recall, and weighted-average F1 score. The MLP model achieved the best performance (F1 score of 0.84), followed by LR (0.83), SVM (0.81), RF (0.70), NB (0.60), and DT (0.57). The novel aspect of this study is its application of text mining to unstructured accident reports, which enables detection capabilities that typical structured-data evaluation methods cannot achieve. The results indicate that integrating NLP and ML substantially improves the prediction of accident categories. Mining safety authorities and stakeholders can use these evidence-based tools to develop focused safety initiatives that lower mining-related deaths.
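The weighted-average F1 score used to rank the six models can be illustrated with a small, self-contained sketch. This is not the authors' code: the function name `weighted_f1`, the accident category labels, and the toy predictions are all hypothetical and chosen only to show how per-class F1 scores are combined in proportion to class support.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Per-class F1, averaged with weights equal to each class's share of y_true."""
    support = Counter(y_true)
    total = len(y_true)
    pairs = list(zip(y_true, y_pred))
    score = 0.0
    for label in sorted(support):
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        # Weight each class's F1 by its support so frequent categories count more.
        score += f1 * support[label] / total
    return score


# Hypothetical accident categories and predictions (illustrative only).
y_true = ["roof_fall"] * 4 + ["haulage"] * 4 + ["explosion"] * 2
y_pred = [
    "roof_fall", "roof_fall", "roof_fall", "haulage",
    "haulage", "haulage", "haulage", "haulage",
    "explosion", "roof_fall",
]

print(round(weighted_f1(y_true, y_pred), 4))
```

Because the weights follow class frequency, a model that performs poorly on a rare accident category can still report a high weighted F1, which is one reason to read this score alongside the confusion matrix, as the abstract does.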

Version published to 10.21203/rs.3.rs-6634717/v1 on Research Square
Jul 11, 2025

Related articles

Development and Validation of a Machine Learning-Based Risk Prediction Model for PICC-Related Bloodstream Infections in Premature Infants Using SHAP Interpretability

This article has 5 authors:
1. Yongqin Guo
2. Yingying Dou
3. Wenxia Song
4. Lihong Wang
5. Li Wang
This article has no evaluations
Latest version Dec 30, 2025

Heart Disease Detection with Machine Learning Algorithms

This article has 2 authors:
1. Fatemeh Hosseinabadi
2. Seyedhassan Sharifi
This article has no evaluations
Latest version Jan 6, 2026

Comparing Algorithm Effectiveness in Health Data Analysis

This article has 1 author:
1. Abdulmalik Hazaa Alshammari
This article has no evaluations
Latest version Jan 22, 2026
