SIGNAL · AI · Jun 2, 2026, 4:00 AM · Signal 75 · Medium term

Cross-Generational Transfer of Adversarial Attacks Reveals Non-Monotonic Safety Alignment in LLMs

arXiv:2606.00813v1 Announce Type: cross Abstract: Safety alignment in LLMs does not improve monotonically across model generations. Studying four generations of Google's Gemma family (7B-31B) with quality-diversity evolution (MAP-Elites) as an automated red-teaming probe, we find that Gemma 3 (12B) exhibits 68.7% +/- 5.7% attack success rate (ASR; mean +/- std, 3 seeds), significantly higher than its predecessor Gemma 2 (45.5% +/- 7.2%; p = 0.030, paired bootstrap) and its successor Gemma 4 (33.9% +/- 1.8%). Replaying evolved attack archives across generations reveals that attacks from other g

Why this matters

Why now

This research provides new empirical evidence that LLM safety alignment is non-monotonic: a newer model generation can be easier to attack than its predecessor, as Gemma 3 was relative to Gemma 2.

Why it’s important

It challenges the assumption that newer LLM generations are inherently safer, showing that safety alignment can regress from one release to the next.

What changes

Developers and red teams can no longer assume that a new LLM version is at least as robust as the last one. Each generation needs its own continuous adversarial evaluation.
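As a rough illustration of a generation-specific check, the sketch below flags any release whose attack success rate (ASR) is higher than its predecessor's. It uses only the three mean ASR figures quoted in the abstract excerpt. The function name `find_regressions` and the `tolerance` parameter are hypothetical, and the comparison deliberately ignores the reported standard deviations and significance test, so it is a toy check rather than the paper's method.

```python
from typing import List, Tuple

# Mean ASR (%) as quoted in the abstract excerpt. The paper studies four
# generations, but the excerpt reports figures for only these three.
REPORTED_ASR: List[Tuple[str, float]] = [
    ("Gemma 2", 45.5),
    ("Gemma 3", 68.7),
    ("Gemma 4", 33.9),
]


def find_regressions(
    asr_by_generation: List[Tuple[str, float]],
    tolerance: float = 0.0,  # hypothetical knob: ignore rises smaller than this
) -> List[Tuple[str, str, float]]:
    """Return (older, newer, delta) for each step where ASR rose by more than tolerance."""
    regressions = []
    # Pair each generation with its immediate successor.
    for (prev_name, prev_asr), (name, asr) in zip(
        asr_by_generation, asr_by_generation[1:]
    ):
        delta = asr - prev_asr
        if delta > tolerance:
            regressions.append((prev_name, name, round(delta, 1)))
    return regressions


if __name__ == "__main__":
    for older, newer, delta in find_regressions(REPORTED_ASR):
        print(f"{newer} is more attackable than {older}: ASR +{delta} points")
```

On the quoted figures, only the Gemma 2 to Gemma 3 step is flagged. A real pipeline would compare per-seed results with a significance test before treating a rise as a regression.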

Winners

· Red-teaming expertise and services
· Cybersecurity firms specializing in AI
· Independent AI safety researchers

Losers

· LLM developers without robust, continuous safety testing
· Users relying solely on version numbers for safety assurance
· Generic, one-off safety audit methodologies

Second-order effects

Direct

More emphasis on, and investment in, per-generation safety evaluation and research into how attacks transfer between model generations.

Second

Potential for regulatory bodies to demand more stringent, continuous safety audits across LLM development cycles.

Third

LLM adoption may diverge according to proven, transparent safety methodologies rather than model size or generation number alone.

Editorial confidence: 90 / 100 · Structural impact: 60 / 100

Original report

This signal links to a primary source. Continuum Brief monitors and indexes it as part of the live intelligence stream — we do not republish source content.

Read at arXiv cs.CL

#cs.CR #cs.CL #cs.ET #cs.LG #cs.NE

Tracked by The Continuum Brief · live intelligence network
