
arXiv:2606.15565v1 Announce Type: cross Abstract: Large Language Models (LLMs) are increasingly being used in museums as role-playing chatbots that let visitors talk to simulated versions of people and artefacts from the past. While such installations can be playful and engaging, they are also problematic because LLMs cannot be trusted to speak truthfully. I identify a fundamental dilemma for the use of LLMs in museum chatbots: LLMs cannot be trusted to tell the truth, and efforts to make them more reliable may ruin that which is attractive about the bots in the first place - their ability
Large Language Models (LLMs) are increasingly deployed in public-facing roles, including in cultural institutions, which makes it necessary to address their inherent limitations in factual accuracy.
The paper highlights a fundamental tension in LLM applications between an engaging user experience and factual integrity, a tension that is critical for adoption in domains requiring truthfulness or authority.
Reliance on LLMs for historical or factual representation without robust truthfulness mechanisms is being critically examined, which could redefine their permissible use cases in public information settings.
- AI ethics researchers
- Museum curators focusing on verified content
- Developers of verifiable AI/truthfulness frameworks
- Uncritical deployment of LLM chatbots
- Institutions prioritizing novelty over accuracy
- Visitors seeking factual historical discussions from LLMs
Museums and cultural institutions will become more cautious about deploying LLM-powered chatbots for factual interactions.
This caution will likely lead to demand for LLMs specifically designed with higher factual accuracy or built-in truthfulness mechanisms for public information roles.
The debate over 'playful engagement' versus 'factual accuracy' in LLM applications will broaden to other sectors, such as education and journalism, influencing regulatory discussions.
Read at arXiv cs.LG