Building a Database for Managing Weather Data and Algorithm for Tracking Severe Weather Patterns


Abstract

Weather has a significant impact on many aspects of daily life, and severe weather conditions can have manifold, detrimental consequences that pose serious risks to human survival. In this paper, we aim to develop a database for capturing weather data and to predict severe weather conditions. We used a secondary dataset collected from OpenWeather from February 21 to February 27, 2024. The dataset was analyzed quantitatively using descriptive statistics and correlation analysis. In addition, three supervised machine learning models (logistic regression [LR], random forest [RF], and a neural network [NN]) were trained to predict the severity of the weather situation. Temperature exhibited a strong negative correlation with pressure (-0.640), a moderate negative correlation with humidity (-0.296), a negligible correlation with wind degree (-0.002), and weak positive correlations with ground level (0.260) and wind speed (0.221). Pressure had weak negative correlations with humidity (-0.083), ground level (-0.387), wind speed (-0.015), wind degree (-0.322), and cloud cover (-0.075). Humidity showed a strong positive correlation with ground level (0.662) and wind degree (0.642). Wind degree had a moderate positive correlation with ground level (0.528), in addition to its correlation with humidity (0.642). Model performance was evaluated on three sets: the training set, the test set, and the full dataset. On the training set, NN achieved an accuracy of 1 (95% CI: 0.9997 to 1), RF an accuracy of 1.0 (95% CI: 0.9998 to 1), and LR an accuracy of 0.9651 (95% CI: 0.9625 to 0.9676). On the held-out test set, RF maintained its accuracy of 1.0 (95% CI: 0.9984 to 1), NN had a slightly lower accuracy of 0.9996 (95% CI: 0.9975 to 1), and LR achieved an accuracy of 0.9663 (95% CI: 0.958 to 0.973). Across the entire dataset, RF achieved an accuracy of 1.0 (95% CI: 0.9998 to 1); NN performed nearly as well, with an accuracy of 0.9999 (95% CI: 0.9997 to 1); and LR recorded a consistent but lower accuracy of 0.9652 (95% CI: 0.9628 to 0.9676). These results suggest that the weather prediction system was stable across the three evaluation sets and classifies weather conditions effectively, indicating its potential for real-world weather prediction applications.
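
The abstract reports each accuracy with a 95% confidence interval but does not state how the intervals were computed. As a minimal sketch, assuming a Wilson score interval (an assumption; the authors may have used an exact binomial interval or another method), the calculation could look like the following. The function name wilson_interval and the counts in the usage lines are hypothetical and are not taken from the study.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(correct, total, confidence=0.95):
    """Return (accuracy, lower, upper) using the Wilson score interval.

    Assumption: this interval method is chosen for illustration only;
    the source does not say which method produced its reported CIs.
    """
    if total <= 0 or not 0 <= correct <= total:
        raise ValueError("need 0 <= correct <= total and total > 0")
    # Two-sided critical value of the standard normal (about 1.96 for 95%).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p = correct / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return p, max(0.0, center - half), min(1.0, center + half)


# Hypothetical counts for illustration only (not the study's data).
acc, lo, hi = wilson_interval(correct=965, total=1000)
print(f"accuracy={acc:.4f}, 95% CI: {lo:.4f} to {hi:.4f}")
```

Under this method, a model that classifies every sample correctly has an upper bound of 1 and a lower bound that depends only on the sample size and confidence level, so a perfect accuracy on a smaller test set carries a wider interval than the same accuracy on a larger training set.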
