Probing Outcome-Level Resemblance and Mechanism-Level Alignment in LLM Risk Decisions: Evidence from the St. Petersburg Game

arXiv:2606.04978v1. Abstract: LLMs can appear cautious in risk decision-making tasks, yet cautious-looking outputs do not necessarily indicate alignment with human decision-making mechanisms. We investigate this distinction using the St. Petersburg game as a controlled testbed, a classical paradox in which the expected payoff is infinite, yet humans typically report low, finite willingness to pay. We evaluate 28 LLMs with a structured prompt suite that includes the original game; controlled decision variants that perturb truncation, repeated play, numeric endowment, and occup
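The claim that the expected payoff is infinite can be checked with a small calculation. The sketch below assumes the textbook form of the game, which may differ from the exact wording of the paper's prompts: a fair coin is flipped until the first heads, and the payoff is 2^k if that happens on flip k. The function name `truncated_expected_value` and the rule that a capped game pays 0 when no heads appears are illustrative assumptions, not details taken from the paper.

```python
from fractions import Fraction


def truncated_expected_value(max_flips: int) -> Fraction:
    """Exact expected payoff of a St. Petersburg game capped at max_flips flips.

    Assumed convention (textbook form, not taken from the paper):
    a fair coin is flipped until the first heads; if that happens on
    flip k (1 <= k <= max_flips) the payoff is 2**k, and the payoff is 0
    if no heads appears within max_flips flips.
    """
    total = Fraction(0)
    for k in range(1, max_flips + 1):
        probability = Fraction(1, 2**k)  # chance that the first heads is on flip k
        payoff = 2**k
        total += probability * payoff    # each term contributes exactly 1
    return total


if __name__ == "__main__":
    for n in (1, 10, 100):
        print(n, truncated_expected_value(n))
```

Under these assumptions every term of the sum equals 1, so a game capped at n flips has expected payoff n. Lifting the cap lets the sum grow without bound, which is the sense in which the untruncated game has infinite expected payoff, and it suggests why truncation is a natural variable to perturb.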
As powerful LLMs proliferate and are increasingly deployed in decision-making contexts, a deeper understanding of their risk-taking behavior and its alignment with human cognition becomes necessary.
Understanding how LLMs make risk decisions is crucial for their responsible deployment, especially in high-stakes environments where human-like judgment is expected or critical.
This research provides a structured methodology for evaluating whether LLMs' decision-making mechanisms, rather than just their surface-level outputs, align with those of humans, challenging the assumption that cautious-looking outputs imply such alignment.
- AI Safety Researchers
- LLM Developers (seeking robust models)
- Regulatory Bodies
- Companies deploying unaligned LLMs
- Overly optimistic AI implementers
Increased scrutiny and more sophisticated testing methodologies for LLM deployment in critical applications will emerge.
New LLM architectures or fine-tuning approaches may be developed specifically to align risk decision-making with human cognitive biases or preferences.
An 'AI risk alignment' industry may develop, focusing on diagnostics and remediation for decision-making flaws in advanced AI systems.
Read at arXiv cs.CL