Data Fusion Method for Multi-Sensor Internet of Things Systems Including Data Imputation

Abstract

In Internet of Things (IoT) systems, data collected by geographically distributed sensors is often incomplete due to device failures, harsh deployment conditions, energy constraints, and unreliable communication. Such data gaps can significantly degrade downstream data processing and decision-making, particularly when a failure takes out all locally redundant sensors. Conventional imputation approaches typically rely on either historical trends or multi-sensor fusion within the same target environment. History-based methods struggle to capture emerging patterns, while same-location fusion remains vulnerable to single-point failures when local redundancy is unavailable. This article proposes a correlation-aware, cross-location data fusion framework for data imputation in IoT networks that explicitly addresses single-point failure scenarios. Instead of relying on co-located sensors, the framework selectively fuses semantically similar features from independent, geographically distributed gateways, using summary-statistics-based and correlation-based screening to minimize communication overhead. The resulting fused dataset is then processed by a lightweight imputation method that pairs k-nearest neighbors (KNN) with iterative PCA, combining local neighborhood similarity with global covariance structure to generate synthetic values for the missing entries. The proposed framework is evaluated using real-world weather station data collected from eight geographically diverse locations across the United States. The experimental results show that the proposed approach achieves improved or comparable imputation accuracy relative to conventional same-location fusion methods when sufficient cross-location feature correlation exists, and that it degrades gracefully when correlation is weak. By enabling data recovery without requiring redundant local sensors, the proposed approach provides a resource-efficient and failure-resilient solution for handling missing data in IoT systems.
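The pipeline described in the abstract (screen remote features by correlation, fuse the survivors, then impute with KNN and iterative PCA) can be illustrated with a minimal NumPy sketch. The abstract does not say how the two imputers are combined, so this sketch assumes one plausible arrangement: KNN supplies the initial fill and a low-rank PCA reconstruction refines it. The function names, the 0.5 correlation threshold, `k=3`, `rank=1`, and the synthetic data are all hypothetical choices for illustration, not the authors' settings. The sketch also correlates raw series directly and does not reproduce the summary-statistics screening step that the framework uses to limit communication.

```python
import numpy as np

def screen_remote_features(target, remote, min_abs_corr=0.5):
    """Indices of remote columns whose |Pearson r| with target meets the (assumed) threshold."""
    keep = []
    for j in range(remote.shape[1]):
        ok = ~np.isnan(target) & ~np.isnan(remote[:, j])  # rows observed in both series
        if ok.sum() < 3:
            continue
        r = np.corrcoef(target[ok], remote[ok, j])[0, 1]
        if np.isfinite(r) and abs(r) >= min_abs_corr:
            keep.append(j)
    return keep

def knn_impute(X, k=3):
    """Fill each NaN with the mean of that column over the k nearest donor rows."""
    X = np.asarray(X, dtype=float)
    filled = X.copy()
    col_means = np.nanmean(X, axis=0)
    for i in np.where(np.isnan(X).any(axis=1))[0]:
        obs = ~np.isnan(X[i])
        diffs = X[:, obs] - X[i, obs]            # compare only on columns row i observed
        shared = (~np.isnan(diffs)).sum(axis=1)
        dist = np.sqrt(np.nansum(diffs ** 2, axis=1) / np.maximum(shared, 1))
        dist[shared == 0] = np.inf
        dist[i] = np.inf                         # a row cannot be its own donor
        order = np.argsort(dist)
        for j in np.where(~obs)[0]:
            donors = [r for r in order
                      if np.isfinite(dist[r]) and not np.isnan(X[r, j])][:k]
            filled[i, j] = X[donors, j].mean() if donors else col_means[j]
    return filled

def iterative_pca_refine(X, init, rank=1, n_iter=50, tol=1e-6):
    """Repeatedly replace the missing cells with a rank-limited PCA reconstruction."""
    mask = np.isnan(X)
    Z = init.copy()
    for _ in range(n_iter):
        mu = Z.mean(axis=0)
        U, s, Vt = np.linalg.svd(Z - mu, full_matrices=False)
        recon = (U[:, :rank] * s[:rank]) @ Vt[:rank] + mu
        delta = np.max(np.abs(recon[mask] - Z[mask])) if mask.any() else 0.0
        Z[mask] = recon[mask]                    # observed cells are never overwritten
        if delta < tol:
            break
    return Z

def run_demo(seed=0):
    """Synthetic stand-in for one failed local sensor and three remote features."""
    rng = np.random.default_rng(seed)
    n = 200
    latent = rng.normal(size=n)                  # shared driver, e.g. a regional trend
    truth = 10 + 5 * latent + 0.1 * rng.normal(size=n)
    remote = np.column_stack([
        2 * latent + 0.1 * rng.normal(size=n),   # correlated remote feature
        -3 * latent + 0.1 * rng.normal(size=n),  # anti-correlated remote feature
        rng.normal(size=n),                      # unrelated remote feature
    ])
    target = truth.copy()
    gone = rng.choice(n, size=20, replace=False)
    target[gone] = np.nan                        # simulate the local data gap
    kept = screen_remote_features(target, remote)
    fused = np.column_stack([target, remote[:, kept]])
    imputed = iterative_pca_refine(fused, knn_impute(fused), rank=1)
    rmse = np.sqrt(np.mean((imputed[gone, 0] - truth[gone]) ** 2))
    mean_rmse = np.sqrt(np.mean((np.nanmean(target) - truth[gone]) ** 2))
    return {"kept": kept, "imputed": imputed, "rmse": rmse, "mean_rmse": mean_rmse}

if __name__ == "__main__":
    out = run_demo()
    print("kept remote columns:", out["kept"])
    print("KNN + iterative PCA RMSE:", out["rmse"], "| mean-fill RMSE:", out["mean_rmse"])
```

In this toy setup the screening step discards the unrelated remote column before fusion, which mirrors the abstract's point that cross-location fusion helps only when sufficient correlation exists. Columns are left unscaled to keep the sketch short; with features in very different units, standardizing before the distance and SVD steps would likely matter.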
