VulnAgent-R2: Evidence-Calibrated Multi-Agent Auditing for Repository-Level Vulnerability Detection

arXiv:2603.13384v2. Abstract: Software vulnerabilities often depend on cross-file data flow, build options, framework conventions, and runtime guards, so isolated function classifiers produce fragile and poorly calibrated warnings. Repository-level LLM agents can gather richer evidence, but prior variants under-specify reproducibility, verifier behavior, baseline fairness, and statistical uncertainty. We present VulnAgent-R2, a budget-aware agentic auditing framework with three additional reusable modules: counterfactual evidence reweighting, build-aware verificatio
The rapid advancement and integration of large language models (LLMs) into software development and security auditing workflows demand more robust, auditable, and reproducible agentic systems to address their inherent vulnerabilities.
This development indicates a significant maturation in the application of AI agents for complex software security tasks, directly impacting the integrity and reliability of software supply chains.
The introduction of frameworks like VulnAgent-R2 suggests a shift towards more systematic, evidence-based, and budget-aware AI-driven vulnerability detection, moving beyond isolated function analysis.
- Software security industry
- Organizations with large codebases
- AI/ML security tooling developers
- DevSecOps teams
- Manual security auditors (for repetitive tasks)
- Fragile, isolated vulnerability scanners
- Software with hidden cross-file vulnerabilities
Improved detection of complex, cross-file software vulnerabilities using AI agents.
Reduced incidence of critical exploits in software due to earlier and more comprehensive detection capabilities.
A potential shift in the software development lifecycle toward continuous, sophisticated AI-driven auditing from design to deployment, which could redefine industry security standards.
Read at arXiv cs.AI