
arXiv:2606.26449v1 (announce type: cross)
Abstract: Retrieval-augmented systems routinely present citations alongside generated answers, yet a citation does not confirm that the corresponding source meaningfully shaped the output. This paper introduces ProvenAI, a framework that decomposes transparency in multi-hop question answering into three independently measurable layers: answer correctness, citation fidelity against benchmark supporting evidence, and per-document influence under leave-one-resource-out intervention. Targeting the HotpotQA distractor benchmark through a seven-stage pipeline
As generative AI systems move into critical applications, there is a growing need for methods that verify the accuracy and traceability of their outputs.
For a strategic reader, this work targets one part of the 'black box' problem in AI: whether the sources a system cites actually shaped its answer. Measuring that supports trust and accountability in AI-generated content, which matters for adoption in regulated industries and for limiting misinformation.
This research introduces a single framework, ProvenAI, that measures three things separately: answer correctness, citation fidelity against benchmark supporting evidence, and per-document influence. It moves beyond checking whether a citation is present to testing whether the cited document affected the output.
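The three layers can be illustrated with a minimal Python sketch. It is not ProvenAI's implementation: the exact-match correctness check, the precision/recall form of citation fidelity, the binary "did the answer change" notion of influence, and every name below (`answer_correct`, `citation_fidelity`, `leave_one_out_influence`, `toy_answer_fn`) are assumptions made for illustration only.

```python
from typing import Callable, Dict, Set

# Hypothetical interface: any function mapping (question, documents) to an answer.
AnswerFn = Callable[[str, Dict[str, str]], str]


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivially different strings match."""
    return " ".join(text.lower().split())


def answer_correct(predicted: str, gold: str) -> bool:
    """Layer 1 (assumed metric): exact match after normalization."""
    return normalize(predicted) == normalize(gold)


def citation_fidelity(cited: Set[str], gold_supporting: Set[str]) -> Dict[str, float]:
    """Layer 2 (assumed metric): precision/recall of cited IDs vs. gold evidence."""
    hits = len(cited & gold_supporting)
    precision = hits / len(cited) if cited else 0.0
    recall = hits / len(gold_supporting) if gold_supporting else 0.0
    return {"precision": precision, "recall": recall}


def leave_one_out_influence(
    question: str, docs: Dict[str, str], answer_fn: AnswerFn
) -> Dict[str, bool]:
    """Layer 3 (assumed metric): True for each document whose removal changes the answer."""
    baseline = normalize(answer_fn(question, docs))
    influence = {}
    for doc_id in docs:
        # Re-run the system with exactly one document withheld.
        reduced = {k: v for k, v in docs.items() if k != doc_id}
        influence[doc_id] = normalize(answer_fn(question, reduced)) != baseline
    return influence


def toy_answer_fn(question: str, docs: Dict[str, str]) -> str:
    """Stand-in for a real retrieval-augmented system (purely illustrative)."""
    for text in docs.values():
        if "capital of France is Paris" in text:
            return "Paris"
    return "unknown"


# Toy data, invented for this example.
docs = {
    "d1": "The capital of France is Paris.",
    "d2": "France is in Western Europe.",
}
question = "What is the capital of France?"

print(answer_correct(toy_answer_fn(question, docs), "paris"))
print(citation_fidelity(cited={"d1", "d2"}, gold_supporting={"d1"}))
print(leave_one_out_influence(question, docs, toy_answer_fn))
```

In this toy case `d2` is cited, yet removing it leaves the answer unchanged. That is the gap between citation presence and actual influence that the abstract describes.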
- AI developers
- Enterprise AI adopters
- Fact-checking organizations
- High-stakes AI applications
- AI systems generating unsubstantiated answers
- Systems lacking provenance tracking
Increased reliability and trustworthiness of AI-generated answers in retrieval-augmented systems.
Faster adoption of AI in sectors requiring high levels of auditability and transparency, such as finance, healthcare, and legal.
Enhanced regulatory frameworks for AI, possibly requiring 'provenance-native' architectures for certain applications.
Read at arXiv cs.AI