Clustering Unsupervised Representations as Defense against Poisoning Attacks on Speech Commands Classification System

arXiv:2606.28953v1 Announce Type: cross Abstract: Poisoning attacks involve attackers intentionally tampering with training data. In this paper, we consider a dirty-label poisoning attack scenario on a speech commands classification system. The threat model assumes that certain utterances from one class (the source class) are poisoned by superimposing a trigger on them, and that their labels are changed to another class selected by the attacker (the target class). We propose a filtering defense against such an attack. First, we use DIstillation with NO labels (DINO) to learn unsupervised representations
AI systems are now deployed across critical applications, so they need robust defenses against increasingly sophisticated adversarial machine learning attacks. This research addresses one such attack: dirty-label data poisoning.
Poisoning attacks can compromise the integrity and trustworthiness of AI systems, particularly in sensitive domains like speech command classification, making defense mechanisms crucial for adoption and reliability.
The proposed filtering defense offers a method to enhance the resilience of speech command classification systems against dirty-label poisoning attacks, improving their security baseline.
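To make the threat model from the abstract concrete, the following is a minimal sketch of a dirty-label poisoning step: a fraction of source-class utterances has a trigger superimposed and is relabeled as the target class. The function name `poison_dataset`, the sine-tone trigger, the `gain` and `rate` values, and the synthetic waveforms are all illustrative assumptions; the excerpt does not specify the trigger or the poisoning rate used in the paper.

```python
import numpy as np


def poison_dataset(waves, labels, trigger, source, target, rate, gain=0.1, seed=0):
    """Hypothetical dirty-label poisoning of a waveform dataset.

    waves:   float array of shape (N, T), samples in [-1, 1]
    labels:  int array of shape (N,)
    trigger: float array of shape (T,), added to the chosen utterances
    rate:    fraction of source-class utterances to poison (assumed parameter)
    """
    rng = np.random.default_rng(seed)
    waves = waves.copy()      # leave the caller's data untouched
    labels = labels.copy()

    # Pick a random subset of the source class.
    source_idx = np.flatnonzero(labels == source)
    n_poison = int(round(rate * len(source_idx)))
    chosen = rng.choice(source_idx, size=n_poison, replace=False)

    # Superimpose the trigger, then flip the label to the target class.
    waves[chosen] = np.clip(waves[chosen] + gain * trigger, -1.0, 1.0)
    labels[chosen] = target
    return waves, labels, np.sort(chosen)


# Synthetic stand-in for one-second utterances at 16 kHz (not real speech).
data_rng = np.random.default_rng(1)
waves = data_rng.uniform(-0.5, 0.5, size=(20, 16000))
labels = np.array([0] * 10 + [1] * 10)

# Assumed trigger: a 4 kHz tone spanning the whole utterance.
t = np.arange(16000) / 16000
trigger = np.sin(2 * np.pi * 4000 * t)

p_waves, p_labels, poisoned_idx = poison_dataset(
    waves, labels, trigger, source=0, target=1, rate=0.3
)
print(poisoned_idx, p_labels[poisoned_idx])
```

A filtering defense of the kind the abstract proposes would aim to recover `poisoned_idx` from the poisoned data alone, without access to the clean labels.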
- AI developers
- Speech recognition companies
- Cybersecurity firms
- Organizations deploying AI
- Adversarial attackers
- Systems vulnerable to data poisoning
Increased trust and adoption of AI systems in critical applications through enhanced security measures.
Development of more sophisticated and robust adversarial attack techniques by malicious actors to circumvent new defenses.
A competitive arms race between AI security researchers and threat actors, driving continuous innovation in both attack and defense strategies.