Cognitive erasure-coded data update and repair for mitigating I/O overhead

Abstract

In erasure-coded storage systems, updating data necessitates parity updates to maintain data consistency, which leads to I/O amplification due to "write-after-read" operations. Additionally, the scattered storage of parity updates imposes significant disk seek overhead during data repair. To address these challenges, this paper proposes a Cognitive Update and Repair Method (CURM), which uses machine learning to classify files into write-only, read-only, and read-write categories, enabling customized update and repair strategies. For write-only and read-write files, CURM utilizes data difference and fine-grained I/O scheduling to reduce I/O overhead. Furthermore, CURM reserves disk space adjacent to parity chunks for read-write files, enabling efficient parallel reads and minimizing seek cost during repair. We implement CURM in a prototype storage system and evaluate its performance using real-world NFS and MSR workloads on a 25-node cluster. Experimental results show that CURM improves data update throughput by up to 82.52% and reduces data recovery time by up to 47.47%, while achieving lower storage overhead compared to state-of-the-art approaches including FL, PL, PLR, and PARIX. These results demonstrate CURM’s effectiveness in enhancing both update and recovery performance for large-scale erasure-coded storage systems.
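To illustrate why a parity update normally needs a read before the write, and how working with a data difference (delta) keeps the update local to one data chunk and its parity, here is a minimal sketch. It assumes a single XOR parity chunk (RAID-5 style) rather than the erasure code used in the paper, and the function names `encode_parity`, `delta_update`, and `repair` are hypothetical; this is not CURM's implementation.

```python
from typing import Sequence


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length chunks."""
    if len(a) != len(b):
        raise ValueError("chunks must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


def encode_parity(chunks: Sequence[bytes]) -> bytes:
    """Full encoding: XOR all data chunks of a stripe (assumed single parity)."""
    parity = bytes(len(chunks[0]))  # all-zero chunk of the same length
    for chunk in chunks:
        parity = xor_bytes(parity, chunk)
    return parity


def delta_update(old_data: bytes, new_data: bytes, old_parity: bytes) -> bytes:
    """Delta-based update: read the old data and old parity, then write.

    Only the updated chunk and the parity are touched; the other data
    chunks in the stripe are never read.
    """
    delta = xor_bytes(old_data, new_data)
    return xor_bytes(old_parity, delta)


def repair(surviving: Sequence[bytes], parity: bytes) -> bytes:
    """Rebuild one lost data chunk from the parity and the surviving chunks."""
    lost = parity
    for chunk in surviving:
        lost = xor_bytes(lost, chunk)
    return lost


if __name__ == "__main__":
    stripe = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
    parity = encode_parity(stripe)

    new_chunk = b"\xaa\xbb"
    new_parity = delta_update(stripe[1], new_chunk, parity)

    # The delta path matches re-encoding the whole stripe from scratch.
    print(new_parity == encode_parity([stripe[0], new_chunk, stripe[2]]))
    # The updated chunk can be rebuilt from the new parity and the other chunks.
    print(repair([stripe[0], stripe[2]], new_parity) == new_chunk)
```

The two reads inside `delta_update` (old data, old parity) are the read step that precedes each write in this simplified model. Schemes that append such deltas to a log instead of applying them in place avoid the immediate parity write, but leave the deltas scattered on disk until repair, which is the seek overhead the abstract describes.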
