SIGNAL · AI · May 26, 2026, 4:00 AM · Signal 75 · Short term

RepetitionCurse: Measuring and Understanding Router Imbalance in Mixture-of-Experts LLMs under DoS Stress

arXiv:2512.23995v2. Abstract: Mixture-of-Experts architectures have become the standard for scaling large language models due to their superior parameter efficiency. To accommodate the growing number of experts in practice, modern inference systems commonly adopt expert parallelism to distribute experts across devices. However, the absence of explicit load balancing constraints during inference allows adversarial inputs to trigger severe routing concentration. We demonstrate that out-of-distribution prompts can manipulate the routing strategy such that all tokens ar

Why this matters

Why now

Mixture-of-Experts (MoE) architectures are increasingly common in Large Language Models (LLMs), so a demonstrated Denial-of-Service (DoS) weakness in their routing is a timely finding.

Why it’s important

This research reveals a significant security and performance vulnerability in a foundational AI architecture, which could impact the reliability and cost-efficiency of large-scale AI deployments.

What changes

The paper demonstrates that adversarial inputs can manipulate MoE routing, concentrating tokens on a few experts and creating severe load imbalance that can be exploited for DoS attacks. This calls for new approaches to model robustness and inference system security.
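The load imbalance described above can be illustrated with a toy model. The sketch below is not the paper's method or measurement: it assumes a hypothetical linear top-k router with random weights, made-up sizes (8 experts, top-2 routing, 256 tokens), and the simplification that a repeated token always produces the same routing input. It compares per-expert load for a varied prompt against a prompt made of one repeated token.

```python
import random
from collections import Counter

# Hypothetical toy sizes, chosen only for illustration.
NUM_EXPERTS, DIM, TOP_K, NUM_TOKENS = 8, 16, 2, 256


def top_k_experts(token_vec, router_weights, k):
    """Return the indices of the k experts with the highest router logits."""
    logits = [sum(w * x for w, x in zip(row, token_vec)) for row in router_weights]
    return sorted(range(len(logits)), key=lambda e: logits[e], reverse=True)[:k]


def expert_load(tokens, router_weights, k):
    """Count how many token assignments each expert receives."""
    load = Counter()
    for tok in tokens:
        load.update(top_k_experts(tok, router_weights, k))
    return [load.get(e, 0) for e in range(len(router_weights))]


def imbalance_ratio(load):
    """Busiest expert's load divided by the mean load (1.0 = perfectly even)."""
    return max(load) / (sum(load) / len(load))


rng = random.Random(0)
# Assumption: a random linear router stands in for a trained one.
router = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]

# A varied prompt: every token has a different routing input.
varied = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_TOKENS)]
# A degenerate prompt: the same routing input repeated for every token.
repeated = [varied[0]] * NUM_TOKENS

varied_load = expert_load(varied, router, TOP_K)
repeated_load = expert_load(repeated, router, TOP_K)

print("varied  :", varied_load, round(imbalance_ratio(varied_load), 2))
print("repeated:", repeated_load, round(imbalance_ratio(repeated_load), 2))
```

In this toy setting the repeated prompt sends every token to the same two experts, so the imbalance ratio reaches its maximum of NUM_EXPERTS / TOP_K. Under expert parallelism, where experts sit on different devices, a ratio like this would suggest that the busiest device handles several times the average work, which is the kind of concentration the brief describes.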

Winners

· AI security researchers
· Companies developing robust MoE load balancing solutions
· Organizations prioritizing AI model resilience

Losers

· Developers relying on current MoE routing without explicit load balancing
· Users vulnerable to prompt-based DoS attacks
· LLM providers with unmitigated MoE vulnerabilities

Second-order effects

Direct

Immediate efforts will focus on developing and implementing more robust routing and load-balancing mechanisms for MoE architectures.

Second

The discovery could lead to a re-evaluation of MoE security postures and potential short-term delays in large-scale MoE deployments for critical applications.

Third

This vulnerability might prompt a broader industry focus on 'adversarial robustness' not just in model outputs, but also in internal architectural mechanisms and resource allocation.

Editorial confidence: 90 / 100 · Structural impact: 60 / 100

Original report

This signal links to a primary source. Continuum Brief monitors and indexes it as part of the live intelligence stream — we do not republish source content.

Read at arXiv cs.LG

#cs.CR #cs.LG
