When the Prompt Becomes Visual: Vision-Centric Jailbreak Attacks for Large Image Editing Models

arXiv:2602.10179v2. Abstract: Recent advances in large image editing models have shifted the paradigm from text-driven instructions to vision-prompt editing, where user intent is inferred directly from visual inputs such as marks, arrows, and visual-text prompts. While this paradigm greatly expands usability, it also introduces a critical and underexplored safety risk: the attack surface itself becomes visual. In this work, we propose Vision-Centric Jailbreak Attack (VJA), the first visual-to-visual jailbreak attack that conveys malicious instructions purely through
The rapid advancement of large image editing models, together with the shift to vision-prompt editing, opens new vectors for adversarial attacks, which makes this research timely.
The work highlights a critical, underexplored security vulnerability in vision-prompt editing, with risks to product safety, intellectual property, and model integrity.
The attack surface for AI models now extends significantly into the purely visual domain, requiring new defensive techniques beyond traditional text-based prompt engineering and moderation.
- AI security researchers
- Cybersecurity firms
- Companies developing robust AI defense mechanisms
- Developers of large image editing models
- Users relying solely on text-based safety filters
- Platforms highly dependent on visual-prompt interfaces
Image-editing AI models become vulnerable to malicious visual inputs, leading to undesirable or harmful outputs.
Increased investment in visual adversarial defense mechanisms and a shift in how model safety is designed and implemented.
Public distrust in the integrity of AI-generated or edited visual content, potentially impacting industries from media to e-commerce.
Read at arXiv cs.AI