
arXiv:2605.21726v1. Abstract: The generative nature of Large Language Models (LLMs) is reflected in the conditional probabilities they compute to sample each response token given the previous tokens. These probabilities encode the distributional structure that the model learns in training and exploits in inference. In this work, we use these probabilities to situate LLMs within the mathematical theory of stochastic processes. We use this framework to design a model-agnostic probabilistic token attribution measure, using Bayes' rule to invert the next-token log-probabilities so
The increasing scale and complexity of LLMs necessitate advanced methods to understand their internal workings and attribute their outputs, especially as regulatory scrutiny and enterprise adoption grow.
A robust probabilistic attribution method for LLMs offers new tools for explainability, safety, and debugging, which are critical for trust and widespread deployment in sensitive applications.
Inverting next-token log-probabilities to attribute an output probabilistically to specific tokens could change how LLM decisions are analyzed, moving from opaque outputs to quantified token-level influence.
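To make the idea of inverting log-probabilities with Bayes' rule concrete, here is a minimal Python sketch. It is not the paper's attribution measure, which the abstract excerpt above does not define. It only shows the generic inversion: given log P(response token | candidate context token) and a prior over candidates, it returns P(candidate | response token). The function names `log_sum_exp` and `bayes_invert`, the candidate tokens, and all numbers are hypothetical and chosen for illustration.

```python
import math

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) for an iterable of log-values."""
    values = list(values)
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values))

def bayes_invert(log_likelihood, log_prior):
    """Return P(candidate | observed token) for each candidate context token.

    log_likelihood[c]: log P(observed token | context containing c)
    log_prior[c]:      log P(c), an assumed prior over the candidates
    """
    # Bayes' rule in log space: log joint = log likelihood + log prior.
    log_joint = {c: log_likelihood[c] + log_prior[c] for c in log_likelihood}
    # The evidence log P(observed token) normalizes over all candidates.
    log_evidence = log_sum_exp(log_joint.values())
    return {c: math.exp(lj - log_evidence) for c, lj in log_joint.items()}

# Made-up numbers (assumption): log-probability of the response token "Paris"
# when one context slot holds each candidate token.
log_likelihood = {
    "France": math.log(0.8),
    "Italy": math.log(0.1),
    "Spain": math.log(0.1),
}
# Assumed prior over the candidate context tokens.
log_prior = {
    "France": math.log(0.5),
    "Italy": math.log(0.25),
    "Spain": math.log(0.25),
}

posterior = bayes_invert(log_likelihood, log_prior)
for token, prob in posterior.items():
    print(f"{token}: {prob:.4f}")
```

With these toy inputs, about 0.89 of the posterior mass lands on "France". In practice the log-likelihoods would come from a model's next-token log-probabilities; how the prior and the candidate set are chosen is a design decision this sketch leaves open.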
- AI Safety Researchers
- LLM Developers
- Auditors and Regulators
- Enterprises deploying LLMs
- Black-box LLM providers
- Applications requiring only shallow interpretability
Improved toolsets for understanding and debugging generative AI outputs become available.
Enhanced explainability leads to more trustworthy and auditable LLM deployments in critical sectors like finance and healthcare.
Standardized attribution metrics could emerge, influencing future LLM design and regulatory compliance frameworks.
Read at arXiv cs.CL