MalScore: A Quality Assessment Framework for Visual Malware Datasets Using No-Reference Image Quality Metrics
Abstract
The Internet has become progressively more integrated into daily life through its evolutionary stages, from Web 1.0 to the current development of Web 3.0. This continued integration broadens the attack surface that cybercriminals aim to exploit. Cybercrime, particularly malware attacks, has become more prevalent and increasingly sophisticated, and has been made more accessible through dark web marketplaces. Including artificial intelligence (AI) within anti-virus solutions has challenged the traditional dichotomy of malware detection schemes, offering more accurate and holistic detection capabilities. Research has shown that transforming malware files into textured images offers resistance to obfuscation and the potential to detect zero-day threats. This paper explores the application of image quality assessment (IQA) techniques to enhance visual malware dataset curation. We propose a novel framework that applies a no-reference IQA algorithm to evaluate current datasets and offer guidance for future dataset curation. We use datasets of varying popularity and size, applying a standardized framework to assess and compare their effectiveness based on multiple metrics. These metrics were derived from the need for robust detection mechanisms and draw attention to malware's continuous evolution. The proposed MalScore formula and framework facilitate the evaluation and ranking of datasets. This work bridges the gap between IQA and visual malware detection, laying the foundation for future work to further unite the two research fields. The evaluation findings demonstrate that the proposed framework effectively distinguishes between datasets, highlighting strengths and areas for improvement.