
arXiv:2605.23032v1 (announce type: cross)

Abstract: Brain-LLM alignment is well established in English, yet the brain's language network is neuroanatomically universal across languages. Does alignment also generalize cross-linguistically, and what governs the variation? We test this using fMRI data from 112 participants across English, Chinese, and French (the Le Petit Prince corpus) and seven LLMs spanning English-dominant, Chinese-dominant, and multilingual architectures. Our central finding is that training-language dominance, not an inherent property of English, drives the alignment pattern:
Multilingual large language models and multi-language fMRI corpora now make it possible to compare brain-LLM alignment across languages rather than in English alone.
The finding suggests that brain-LLM alignment reflects a model's training data more than any universal cognitive structure, which bears on how future models are built and evaluated.
If training-language dominance, rather than an inherent property of any one language, drives brain-LLM alignment, then cross-linguistic generalization depends on the composition and biases of the training data.
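The abstract excerpt does not say how alignment is scored. One common approach in this line of work is a linear encoding model: fit a ridge regression from LLM embeddings of the stimulus to fMRI responses, then correlate predicted and measured responses on held-out data. The sketch below assumes that approach. The function names, array shapes, and ridge penalty are illustrative assumptions, not the paper's pipeline.

```python
import numpy as np


def ridge_fit(X, Y, alpha):
    """Closed-form ridge weights: W = (X^T X + alpha * I)^-1 X^T Y."""
    n_features = X.shape[1]
    A = X.T @ X + alpha * np.eye(n_features)
    return np.linalg.solve(A, X.T @ Y)


def alignment_score(X_train, Y_train, X_test, Y_test, alpha=1.0):
    """Mean held-out Pearson r between predicted and measured responses.

    X_*: (time points, embedding dims) hypothetical LLM features.
    Y_*: (time points, voxels) hypothetical fMRI responses.
    """
    # Standardize features with training statistics only.
    mu_x = X_train.mean(axis=0)
    sd_x = X_train.std(axis=0) + 1e-8
    mu_y = Y_train.mean(axis=0)
    X_tr = (X_train - mu_x) / sd_x
    X_te = (X_test - mu_x) / sd_x

    W = ridge_fit(X_tr, Y_train - mu_y, alpha)
    pred = X_te @ W + mu_y

    # Pearson correlation per voxel, then average across voxels.
    pc = pred - pred.mean(axis=0)
    yc = Y_test - Y_test.mean(axis=0)
    denom = np.sqrt((pc ** 2).sum(axis=0) * (yc ** 2).sum(axis=0)) + 1e-12
    r = (pc * yc).sum(axis=0) / denom
    return float(r.mean())
```

Under this assumed setup, comparing models amounts to computing `alignment_score` once per model and per language with the same fMRI responses, so that differences in the score come only from the embeddings.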
- Developers of diverse, multilingual AI models
- Neuroscience researchers
- Multilingual AI platforms
- Developers relying solely on English-centric models for global applications
- Hypotheses of universal AI cognitive alignment independent of training
Increased emphasis on creating culturally and linguistically diverse training datasets for AI.
Development of specialized LLMs for specific linguistic and cultural contexts, moving away from 'one-size-fits-all' approaches.
Potential for sovereign AI initiatives to focus intensely on developing unique, culturally resonant training data and models for their respective languages.