2025 ASEE Annual Conference & Exposition

FlexiGrader: an LLM-based personalized autograder to enable flexible and open-ended creative exploration in CS1

Presented at DSAI Technical Session 7: Natural Language Processing and LLM Applications

Computer Science courses often rely on programming assignments for learning assessment. Automatic grading (autograding) is a common mechanism to provide quick feedback to students and reduce teacher workload, especially in large classes. However, traditional autograders offer limited personalized feedback and often require all students to solve the same predefined problem, restricting creativity. In this paper, we address these limitations by developing an AI-based autograder that (1) can grade diverse, open-ended assignments where students work on independent, creative projects, enabling a new set of assessments in CS1 (introductory programming) courses, and (2) provides personalized feedback using large language models (LLMs). We present the design of a new assessment strategy in introductory programming courses where each student works on an open-ended problem for their summative assessment. We design generalized scaffolds (project proposal, schematic development, pseudocode, integration of files, and graphs) for these open-ended assessments so that each student completes a project of the desired complexity. Existing autograders require a rigid structure of inputs and outputs and therefore cannot grade such assessments. Our tool, FlexiGrader, integrates code execution verification and unit testing tailored to each student's individual specifications, followed by code analysis using our fine-tuned Llama model to generate feedback and grades. FlexiGrader is capable of handling submissions from large classes and ensures flexibility in grading free-form assignments, making it easier for instructors to design and assess varied projects. The input to our tool is a cover sheet that describes the individualized project, provides paths to external files, and describes the inputs needed to run the program that the student submits.
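As a rough illustration of the cover-sheet idea described above, the sketch below parses a hypothetical student cover sheet and assembles the command used for the execution-verification step. The JSON schema, field names, and project details here are our own illustrative assumptions, not the exact format used by FlexiGrader.

```python
import json

# Hypothetical cover-sheet format; every field name and value below is
# illustrative, not taken from the actual FlexiGrader specification.
COVER_SHEET = """
{
  "project_title": "Budget Tracker",
  "description": "Tracks monthly expenses and plots spending by category.",
  "entry_point": "main.py",
  "external_files": ["data/expenses.csv"],
  "sample_inputs": ["2024-01", "food"]
}
"""

def parse_cover_sheet(raw: str) -> dict:
    """Parse a student cover sheet and validate the fields an
    autograder would need before attempting to run the submission."""
    sheet = json.loads(raw)
    required = {"entry_point", "external_files", "sample_inputs"}
    missing = required - sheet.keys()
    if missing:
        raise ValueError(f"cover sheet missing fields: {sorted(missing)}")
    return sheet

def build_run_command(sheet: dict) -> list:
    """Build the per-student command for execution verification."""
    return ["python", sheet["entry_point"], *sheet["sample_inputs"]]

sheet = parse_cover_sheet(COVER_SHEET)
print(build_run_command(sheet))
# -> ['python', 'main.py', '2024-01', 'food']
```

Because every student supplies their own entry point, file paths, and inputs, the same harness can drive execution checks for arbitrarily different projects.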
Beyond this student-driven cover sheet, FlexiGrader provides options for the instructor to describe the grading rubric and choose the criteria that will be graded by the AI model. We hypothesize that live implementation of FlexiGrader in CS1 classrooms can enhance student self-efficacy and creativity in CS education by fostering independent project development. We plan to study this hypothesis in future research. Additionally, we discuss the operational costs of our autograding system, its compatibility with existing autograding frameworks, and the current limitations of our approach. By enabling more creative and personalized assignments, FlexiGrader has the potential to transform assessment practices in introductory computer science courses.
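The instructor-side rubric option might look like the following minimal sketch: the instructor marks which criteria are checked by tests and which are delegated to the language model, and only the latter are assembled into the grading prompt. The criterion names, weights, and prompt wording are assumptions for illustration, not the paper's actual rubric.

```python
# Illustrative rubric schema; criterion names and weights are assumptions.
RUBRIC = {
    "runs_without_errors": {"weight": 30, "graded_by": "tests"},
    "meets_proposal_goals": {"weight": 40, "graded_by": "llm"},
    "code_readability":    {"weight": 30, "graded_by": "llm"},
}

def build_llm_prompt(rubric: dict, description: str, code: str) -> str:
    """Assemble a grading prompt containing only the criteria the
    instructor delegated to the language model."""
    llm_criteria = [name for name, c in rubric.items()
                    if c["graded_by"] == "llm"]
    lines = [
        "Grade the following CS1 project against these criteria:",
        *[f"- {name} (weight {rubric[name]['weight']})"
          for name in llm_criteria],
        f"Project description: {description}",
        "Student code:",
        code,
    ]
    return "\n".join(lines)

prompt = build_llm_prompt(RUBRIC, "Budget tracker", "print('hello')")
print(prompt.splitlines()[1])
```

Splitting the rubric this way keeps objective criteria on deterministic unit tests while routing only the open-ended judgments to the LLM.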

Authors
  1. Mr. Alexis Frias, University of California, Merced
  2. Shrivaikunth Krishnakumar, San Jose State University
Note

The full paper will be available to logged-in, registered conference attendees once the conference starts on June 22, 2025, and to all visitors after the conference ends on June 25, 2025.