Letting Tutor Personas Speak Up for LLMs: Learning Steering Vectors from Dialogue via Preference Optimization

arXiv:2602.07639v2. Abstract: With the emergence of large language models (LLMs) as a powerful class of generative artificial intelligence (AI), their use in tutoring has become increasingly prominent. Prior works on LLM-based tutoring typically learn a single tutor policy and do not capture the diversity of tutoring styles. In real-world tutor-student interactions, pedagogical intent is realized through adaptive instructional strategies, with tutors varying the level of scaffolding, instructional directiveness, feedback, and affective support in response to learners' nee
The spread of LLMs is creating demand for more nuanced, adaptable AI applications, especially in specialized domains such as education, and is pushing research toward dynamic persona-based interaction.
This work moves LLM tutors beyond a single fixed tutoring policy toward diverse, adaptive pedagogical styles, which matter for effective personalized learning and have implications for educational outcomes and human-AI interfaces.
LLMs can now be steered to embody different tutor personas (the paper's title describes learning steering vectors from dialogue via preference optimization), allowing more flexible, context-aware instructional strategies tailored to individual learner needs and preferences.
- AI education platforms
- Learners
- Personalized learning technology developers
- LLM developers
- Static AI agents
- Generic online course providers
LLMs may come to exhibit a broader range of interactive styles tailored to specific user needs, which could improve engagement.
The ability to learn and adopt diverse personas could extend beyond tutoring to customer service, therapy, and other interactive AI applications.
Adaptive persona learning might lead to more sophisticated, human-like AI agents, blurring the line between human and artificial interaction in specific domains.
Read at arXiv cs.CL