SIGNAL · AI · Jun 15, 2026, 4:00 AM · Signal 75 · Medium term

The Perceived Fragility of Explanations in Audio Models: Manipulation of Attribution with Unchanged Predictions

arXiv:2606.14466v1 · Announce Type: cross

Abstract: This paper investigates the fragility of post-hoc explanation methods in audio deepfake detection. While previous work on explanation manipulation focused on images using standard $L_p$ metrics, we introduce a psychoacoustic framework that optimizes inaudible perturbations to decouple model attributions from final classifications. We evaluate this vulnerability across state-of-the-art architectures under strict prediction-preserving constraints. By evaluating the manipulation cost through domain-specific perceptual audio quality metrics alongsi

Why this matters

Why now

The rapid development and deployment of AI models in sensitive areas such as deepfake detection make it necessary to understand both their vulnerabilities and the robustness of the methods used to explain them.

Why it’s important

This research highlights a vulnerability in AI transparency: the explanation attached to a model's decision can be manipulated while the prediction itself stays unchanged, which undermines trust and accountability.
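To make the idea concrete, here is a minimal sketch of an attribution change under a prediction-preserving constraint. It is not the paper's method: the toy detector, its random weights, the gradient saliency map, and the helper names (`logit`, `saliency`, `manipulate`) are all hypothetical, random search stands in for the paper's optimization, and a plain bound on each sample (`eps`) stands in for the psychoacoustic constraint, which the excerpt does not specify.

```python
import math
import random

random.seed(0)
DIM, HIDDEN = 8, 4

# Hypothetical toy detector: one tanh hidden layer, scalar logit (> 0 means "fake").
W1 = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(HIDDEN)]
b1 = [random.gauss(0, 1) for _ in range(HIDDEN)]
w2 = [random.gauss(0, 1) for _ in range(HIDDEN)]


def hidden(x):
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(W1, b1)]


def logit(x):
    return sum(v * h for v, h in zip(w2, hidden(x)))


def saliency(x):
    """Gradient of the logit with respect to each input sample."""
    g = [v * (1.0 - h * h) for v, h in zip(w2, hidden(x))]
    return [sum(g[j] * W1[j][i] for j in range(HIDDEN)) for i in range(DIM)]


def l2(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def manipulate(x, eps=0.05, steps=2000):
    """Random search for a perturbation bounded by eps per sample that moves
    the saliency map away from the original while keeping the predicted label."""
    label = logit(x) > 0
    base = saliency(x)
    best_delta, best_shift = [0.0] * DIM, 0.0
    for _ in range(steps):
        cand = [max(-eps, min(eps, d + random.gauss(0, eps / 4)))
                for d in best_delta]
        x_adv = [xi + di for xi, di in zip(x, cand)]
        if (logit(x_adv) > 0) != label:
            continue  # prediction-preserving constraint violated, reject
        shift = l2(saliency(x_adv), base)
        if shift > best_shift:
            best_delta, best_shift = cand, shift
    return best_delta, best_shift


x = [random.gauss(0, 1) for _ in range(DIM)]
delta, shift = manipulate(x)
x_adv = [xi + di for xi, di in zip(x, delta)]
print("label unchanged:", (logit(x) > 0) == (logit(x_adv) > 0))
print("max |delta|:", max(abs(d) for d in delta))
print("saliency shift (L2):", shift)
```

Even in this toy setting the saliency map moves while the label does not, which is the decoupling the abstract describes. A real audio attack would replace the per-sample bound with a perceptual audio quality criterion.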

What changes

Post-hoc explanations can no longer be assumed reliable. Because they can be manipulated independently of the prediction, AI safety work, auditing, and deployment must account for the explanation itself as a point of failure, especially in high-stakes applications.

Winners

· AI safety researchers
· Cybersecurity firms specializing in AI red-teaming
· Developers of robust explainable AI (XAI) methods

Losers

· Platforms relying solely on current post-hoc XAI for trust
· Organizations deploying AI without considering explanation vulnerability
· Users who implicitly trust AI explanations

Second-order effects

Direct

Increased scrutiny and demand for more robust and secure explainable AI (XAI) techniques across various domains.

Second

Potential for new attack vectors and adversarial manipulations targeting AI explanations, leading to misinformed human decisions.

Third

A broader philosophical debate on the ultimate trustworthiness of AI systems, even those that appear to provide explanations for their decisions.

Editorial confidence: 90 / 100 · Structural impact: 60 / 100

Original report

Read at arXiv cs.AI

#cs.SD #cs.AI #cs.LG

Tracked by The Continuum Brief · live intelligence network
