Zero-Knowledge Cross-User De-Duplication for Big Data Storage on Cloud
Abstract
The rise of cloud computing and the enormous volume of data held in cloud storage have both preceded and accelerated the emergence of big data. Cloud computing has become the platform for centralized pools of resources such as applications, networks, and storage services. As data volumes grow, duplicate or redundant copies of information accumulate on cloud servers. To eliminate this redundancy, data de-duplication has become a mainstream technology in cloud storage. With the aim of removing redundant data, secure cross-user source-based de-duplication and integrity-auditing delegation techniques have been examined. In this paper, a combined technique is proposed that performs both zero-knowledge de-duplication and public integrity auditing of data in big data storage. De-duplication, together with data integrity auditing, frees up space on the cloud server and ensures the security of the stored data. The novelty of the proposed methodology is threefold. Firstly, an enhanced cuckoo hashing algorithm is used to identify duplicate chunks, which improves overall performance because lookups take significantly less time than with traditional hashing algorithms. Secondly, after hashing, the hash index of the de-duplicated data is transmitted to the Cloud Service Provider (CSP) and checked there for duplicates. Only the data chunks whose hash values are missing at the CSP are then sent. Any data lost during this transmission can be recovered through the Reed-Solomon (RS) mechanism. Thirdly, the proposed key exchange algorithm inherits the security enhancements of the Advanced Encryption Standard (AES). Several experiments compare the proposed technique with state-of-the-art methods under various use cases and user scenarios. The proposed framework uses less computation power and frees up to 86.77% of storage space, which is higher than other state-of-the-art de-duplication algorithms.
The upload and download response times for file encoding remain stable even for larger files. Consequently, the proposed zero-knowledge cross-user data de-duplication mechanism is stable despite variation in file size.
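The source-based flow described in the abstract (hash the chunks, send the hash index, upload only the chunks the CSP lacks) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes fixed-size chunking, SHA-256 digests, and a plain two-table cuckoo hash set rather than the paper's enhanced variant. The names `CuckooSet`, `ToyCSP`, `split_chunks`, `chunk_digest`, and `upload` are hypothetical, and the zero-knowledge protocol, AES-based key exchange, Reed-Solomon recovery, and integrity auditing are all omitted.

```python
import hashlib


def chunk_digest(chunk):
    """SHA-256 digest of one chunk (assumed hash; the paper's choice may differ)."""
    return hashlib.sha256(chunk).digest()


def split_chunks(data, size=4):
    """Fixed-size chunking (an assumption made for this sketch)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


class CuckooSet:
    """Plain two-table cuckoo hash set of digests; a lookup probes two slots."""

    def __init__(self, capacity=64, max_kicks=32):
        self.capacity = capacity
        self.max_kicks = max_kicks
        self.tables = [[None] * capacity, [None] * capacity]

    def _slot(self, table, d):
        # Derive the two candidate positions from different bytes of the digest.
        part = d[:8] if table == 0 else d[8:16]
        return int.from_bytes(part, "big") % self.capacity

    def __contains__(self, d):
        return any(self.tables[t][self._slot(t, d)] == d for t in (0, 1))

    def add(self, d):
        if d in self:
            return
        item, t = d, 0
        for _ in range(self.max_kicks):
            i = self._slot(t, item)
            # Place the item, evicting whatever occupied the slot.
            item, self.tables[t][i] = self.tables[t][i], item
            if item is None:
                return
            t = 1 - t  # the evicted item tries its slot in the other table
        self._grow(item)  # too many evictions: enlarge and reinsert

    def _grow(self, pending):
        old = [x for table in self.tables for x in table if x is not None]
        self.capacity *= 2
        self.tables = [[None] * self.capacity, [None] * self.capacity]
        for x in old + [pending]:
            self.add(x)


class ToyCSP:
    """In-memory stand-in for the cloud service provider."""

    def __init__(self):
        self.index = CuckooSet()
        self.store = {}

    def missing(self, digests):
        """Return the distinct digests the CSP does not hold yet, in order."""
        seen, out = set(), []
        for d in digests:
            if d not in self.index and d not in seen:
                seen.add(d)
                out.append(d)
        return out

    def put(self, d, chunk):
        if chunk_digest(chunk) != d:
            raise ValueError("chunk does not match its digest")
        self.index.add(d)
        self.store[d] = chunk


def upload(csp, data, size=4):
    """Send the hash index first, then only the chunks the CSP is missing."""
    chunks = split_chunks(data, size)
    digests = [chunk_digest(c) for c in chunks]
    by_digest = dict(zip(digests, chunks))
    needed = csp.missing(digests)
    for d in needed:
        csp.put(d, by_digest[d])
    return digests, len(needed)  # file recipe, number of chunks actually sent
```

In this sketch, uploading `b"aaaabbbbaaaacccc"` to an empty `ToyCSP` transfers three chunks rather than four, and a later upload of `b"bbbbccccdddd"` by any user transfers only the `dddd` chunk. Note that a plain hash query like this reveals to the client which chunks already exist, which is exactly the kind of leakage a zero-knowledge scheme is meant to avoid.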