Impacts of Histories and Models on LLM Grading: A Study in Advanced Software Engineering Courses

arXiv:2606.08400v1 (cross-listed). Abstract: Graduate-level research reading report assessment creates a substantial labor burden for educators. While large language models (LLMs) hold great potential for automating academic grading, their reliability for this specialized task remains understudied, particularly regarding grading consistency, the lack of which represents a primary obstacle to educational fairness. This paper proposes a human-aligned LLM-assisted grading workflow and presents a case study based on 180 student submissions from a graduate advanced software engineering course.
The proliferation of advanced LLMs has naturally led to exploring their application in automating demanding academic tasks like grading, and this paper provides a timely case study on their reliability.
Automating academic grading with LLMs could significantly reduce educator burden and scale specialized education, provided issues of consistency and fairness are addressed.
LLM-assisted grading in advanced academic settings, with its promise of greater consistency and efficiency, moves closer to practical use now that it has been tested on real coursework.
- Educators
- Educational technology providers
- Students (through faster feedback)
- AI developers
- Traditional manual grading systems
- Institutions resistant to AI adoption
LLMs begin to be integrated into more academic institutions for assessment tasks, starting with less subjective assignments.
The demand for specialized LLMs trained on educational rubrics and subject matter knowledge increases, fostering new development in educational AI.
The role of human educators shifts from primary grader to reviewer, curriculum designer, and overseer of AI grading, leading to restructured academic workflows and teaching methods.
Read at arXiv cs.AI