DT-DFRS: Enhanced Data-Free Robustness Stealing via Dual Teacher Guidance in Black-Box Settings

Abstract

Model Stealing Attacks (MSAs) pose a significant privacy threat to Machine Learning as a Service (MLaaS). An MSA aims to craft a substitute model that matches the performance of a target model solely by querying the MLaaS interface. Various techniques have been proposed to steal not only the accuracy of target models but also their robustness against adversarial attacks. Since the training data, architecture, and parameters of these models are inaccessible due to privacy constraints, most approaches rely on distillation: a clone model is trained to imitate the behavior of the target model, effectively stealing its accuracy. Robustness Distillation (RD) addresses both the accuracy and the robustness of the distilled model. However, most existing approaches distill model accuracy alone while neglecting robustness, despite its importance in safety-critical scenarios. Additionally, many approaches require access to real or proxy datasets, which is often infeasible due to privacy constraints, and others assume the availability of Soft-Label (SL) predictions, which requires retrieving the output probabilities of the softmax layer rather than only the final class prediction. In this paper, we propose a novel Dual Teacher Data-Free Hard-Label Robustness Stealing attack (DT-DFRS) that enables robustness distillation without requiring real or proxy data while preserving the model's accuracy in hard-label settings. Our experiments demonstrate that DT-DFRS outperforms existing state-of-the-art data-free hard-label methods, improving over the baseline by 3.41% and 3.13% on the CIFAR-10 and CIFAR-100 datasets, respectively.
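To make the data-free hard-label setting concrete, the minimal PyTorch sketch below is not the paper's DT-DFRS method; it only illustrates the query loop the abstract describes, where a clone is trained on nothing but the class indices returned by a black-box victim. The victim and clone architectures, hyperparameters, and random-noise queries (standing in for a trained generator) are placeholder assumptions of our own.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    # Placeholder models: any classifiers with matching output dimension work.
    victim = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))  # black-box target
    clone = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))   # attacker's substitute

    def query_hard_label(model, x):
        """Simulate the MLaaS API: only the predicted class index is returned,
        not the softmax probabilities (i.e., the hard-label setting)."""
        with torch.no_grad():
            return model(x).argmax(dim=1)

    optimizer = torch.optim.Adam(clone.parameters(), lr=1e-3)

    for step in range(100):
        # Data-free setting: queries are synthesized. Random noise is used here
        # as a stand-in for samples from a learned generator.
        x = torch.randn(64, 3, 32, 32)
        y_hard = query_hard_label(victim, x)      # hard labels only
        loss = F.cross_entropy(clone(x), y_hard)  # train the clone to imitate the victim
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

Note that this sketch optimizes only clean-label agreement; a robustness-stealing attack such as the one proposed here would additionally need an objective that transfers the victim's behavior on adversarial inputs.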