CoTAL: Human-in-the-Loop Prompt Engineering for Generalizable Formative Assessment Scoring and Feedback

arXiv:2504.02323v4

Abstract: Large language models (LLMs) have created new opportunities to assist teachers and support student learning. While researchers have explored various prompt engineering approaches in educational contexts, the degree to which these approaches generalize across domains--such as science, computing, and engineering--remains underexplored. In this paper, we introduce Chain-of-Thought Prompting + Active Learning (CoTAL), an LLM-based approach to formative assessment scoring that (1) leverages Evidence-Centered Design (ECD) to align assessments and r
The spread of Large Language Models (LLMs) in education is driving rapid exploration of practical applications such as automated assessment, which makes prompt engineering that generalizes across domains a current priority.
This development points to maturing LLM use in education: work is moving beyond basic use cases to address the scalability and reliability of automated assessment, both of which affect learning outcomes and teaching efficiency.
Prompt engineering for LLMs is shifting toward methodologies that generalize across diverse domains rather than one-off, domain-specific solutions, which widens the range of settings where LLMs are useful.
- Educational technology companies
- Teachers and educators
- Students
- AI developers
- Traditional assessment providers
- Inefficient manual grading processes
More efficient and consistent formative assessment and feedback will become widely available in educational settings.
This improved assessment infrastructure could lead to personalized learning paths more precisely tailored to individual student needs and learning styles.
Widespread adoption of generalizable AI assessment tools might contribute to a shift in how educational curricula are designed and adapted, with more emphasis on continuous feedback loops.
Read at arXiv cs.CL