A Survey of Recent Advances in Adversarial Attack and Defense on Vision-Language Models

Abstract

In the rapidly advancing domain of artificial intelligence, Vision-Language Models (VLMs) have emerged as critical tools that combine visual and textual processing to support applications such as automated image captioning, accessibility enhancements, and intelligent responses to multimodal queries. This survey examines the pre-training, fine-tuning, and inference paradigm that has markedly enhanced the capabilities of VLMs, allowing them to perform effectively across diverse downstream tasks and even make zero-shot predictions. Despite these advances, VLMs remain vulnerable to adversarial attacks, largely because of their reliance on large-scale, internet-sourced pre-training datasets. Such attacks can severely undermine model integrity by manipulating how inputs are interpreted, posing serious security risks and eroding user trust. We examine these adversarial threats in detail, ranging from single-modal perturbations to sophisticated multimodal strategies, and highlight the urgent need for robust defense mechanisms. We then discuss defense strategies that adapt model architectures, integrate adversarially robust training objectives, and employ fine-tuning techniques to counteract these vulnerabilities. This paper provides a comprehensive overview of current challenges and future directions in the adversarial landscape of VLMs, emphasizing the importance of securing these models for safe integration into real-world applications.
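
To make the notion of manipulating a VLM's input interpretation concrete, the sketch below shows a minimal PGD-style image perturbation against a CLIP-style image-text matching model. This is an illustration only, not a method drawn from the surveyed papers: the checkpoint name, captions, file path, and budget values (`eps`, `alpha`, `steps`) are assumed placeholders, and the perturbation budget is applied in normalized pixel space for simplicity.

```python
# Illustrative sketch (not from the survey): a PGD-style image perturbation that
# pushes a CLIP-style VLM's interpretation of an image toward an attacker-chosen caption.
import torch
from transformers import CLIPModel, CLIPProcessor
from PIL import Image

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32").eval()
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def pgd_attack(pixel_values, input_ids, attention_mask, target_idx,
               eps=8 / 255, alpha=2 / 255, steps=10):
    """Maximize image-text similarity with the target caption under an L-inf budget.

    Note: the budget is applied directly to CLIP's normalized pixel tensor here,
    which keeps the sketch short; a faithful attack would work in raw pixel space.
    """
    adv = pixel_values.clone().detach()
    for _ in range(steps):
        adv.requires_grad_(True)
        out = model(input_ids=input_ids, attention_mask=attention_mask, pixel_values=adv)
        # logits_per_image: (num_images, num_captions); raise the target caption's score
        loss = torch.log_softmax(out.logits_per_image, dim=-1)[0, target_idx]
        grad = torch.autograd.grad(loss, adv)[0]
        adv = adv.detach() + alpha * grad.sign()                      # gradient ascent step
        adv = pixel_values + (adv - pixel_values).clamp(-eps, eps)    # project into eps-ball
    return adv.detach()

# Usage (placeholder image path and captions; caption index 1 is the attacker's target)
image = Image.open("example.jpg")
captions = ["a photo of a dog", "a photo of a cat"]
inputs = processor(text=captions, images=image, return_tensors="pt", padding=True)
adv_pixels = pgd_attack(inputs["pixel_values"], inputs["input_ids"],
                        inputs["attention_mask"], target_idx=1)
```

After the loop, feeding `adv_pixels` back through the model typically assigns the highest image-text score to the attacker's caption even though the image looks unchanged to a human, which is the kind of input manipulation the defenses surveyed here aim to resist.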
