Inference-Time Backdoors via Chat Templates: From LLM Supply Chains to Agentic System Compromise

arXiv:2602.04653v4 Announce Type: replace-cross Abstract: Open-weight language models are increasingly used in production settings, raising new security challenges. One prominent threat is backdoor attacks, in which adversaries embed hidden behaviors that activate under specific conditions. Previous work has assumed that adversaries have access to training pipelines or deployment infrastructure. We propose a novel attack surface requiring neither: the "chat template". Chat templates are executable programs invoked at every inference call, often implemented in Jinja2, that occupy a privileged p
The increasing deployment of open-weight LLMs in production, coupled with the reliance on shared infrastructure like chat templates, creates novel attack vectors that are now being actively explored beyond traditional training or deployment vulnerabilities.
This research reveals a new and accessible attack surface for LLMs through chat templates, enabling inference-time backdoors without needing access to training data or deployment infrastructure, significantly lowering the bar for adversaries.
Adversaries can now compromise LLMs and agentic systems by injecting malicious code into widely used chat templates, affecting model behavior during inference rather than just during training or deployment.
- · Cybersecurity firms specializing in AI security
- · Developers of secure chat template parsers
- · Organizations prioritizing supply chain security for AI
- · Users of compromised open-weight LLMs
- · Organizations relying on unchecked third-party chat templates
- · Developers of agentic systems
Increased scrutiny and security hardening of LLM chat templates and inference pipelines.
Development of new tools and practices for verifying the integrity and security of LLM-related configurations and dependencies.
Potential for widespread, hard-to-detect compromises of agentic AI systems leading to unexpected or malicious automated actions.
This signal links to a primary source. Continuum Brief monitors and indexes it as part of the live intelligence stream — we do not republish source content.
Read at arXiv cs.LG