2026 ASEE Annual Conference & Exposition

Scaling Formative Feedback in Programming Education through Mastery Learning and AI

Presented at DSAI-Session 7: Adaptive Learning, Personalized Feedback, and Global Teamwork

The mastery learning framework emphasizes formative assessment, where students are given opportunities to fail, receive feedback, and improve until they reach proficiency.
In this approach, timely and consistent feedback is essential, yet remains a major challenge in large-enrollment programming courses.
Current mastery learning platforms typically rely on automated testing, which verifies correctness but offers little guidance on code quality or style.
As a result, students may achieve functional solutions without receiving formative insights until instructional staff intervene, creating a resource bottleneck.

This pilot study explores the integration of large language models (LLMs) into an existing mastery learning platform for programming courses.
Our system operates in three phases: (1) automated correctness testing through traditional test cases, (2) generation of targeted semantic feedback via a secure, institutionally hosted LLM API once a correctness threshold is achieved, and (3) post-processing to refine the feedback so that it remains constructive without over-directing the student’s problem-solving process.
The system enables instructors to customize the assessment criteria, aligning feedback with specific learning objectives.
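
The three phases described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: every name here (`Case`, `run_tests`, `build_prompt`, `post_process`, `give_feedback`), the 0.8 threshold, and the rule of stripping fenced code during post-processing are hypothetical, and `llm` is any callable standing in for the institutionally hosted LLM API.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Case:
    args: tuple
    expected: object


@dataclass
class FeedbackResult:
    pass_rate: float
    feedback: Optional[str]  # None when the correctness threshold is not met


def run_tests(solution: Callable, cases: List[Case]) -> float:
    """Phase 1: deterministic correctness testing; returns the fraction passed."""
    passed = 0
    for case in cases:
        try:
            if solution(*case.args) == case.expected:
                passed += 1
        except Exception:
            pass  # a crashing submission counts as a failed case
    return passed / len(cases) if cases else 0.0


def build_prompt(source: str, criteria: List[str]) -> str:
    """Phase 2 input: instructor-chosen criteria steer what the model comments on."""
    bullets = "\n".join(f"- {c}" for c in criteria)
    return (
        "Give formative feedback on the student code below.\n"
        f"Assess only these criteria:\n{bullets}\n"
        "Do not provide a corrected solution.\n\n"
        f"Student code:\n{source}"
    )


FENCE = "`" * 3  # built at runtime so this listing contains no literal fence
CODE_BLOCK = re.compile(FENCE + r".*?" + FENCE, re.DOTALL)


def post_process(raw: str) -> str:
    """Phase 3 (assumed rule): remove fenced code so feedback hints rather than solves."""
    return CODE_BLOCK.sub("[code omitted]", raw).strip()


def give_feedback(solution, source, cases, criteria, llm, threshold=0.8):
    """Run the three phases; the LLM is called only once the threshold is reached."""
    pass_rate = run_tests(solution, cases)
    if pass_rate < threshold:
        return FeedbackResult(pass_rate, None)
    raw = llm(build_prompt(source, criteria))
    return FeedbackResult(pass_rate, post_process(raw))


def fake_llm(prompt: str) -> str:
    """Stand-in for the hosted model; returns a reply that leaks a code block."""
    leaked = FENCE + "python\ndef add(a, b): return a + b\n" + FENCE
    return "Consider clearer names.\n" + leaked


cases = [Case((1, 2), 3), Case((0, 0), 0)]
source = "def add(x, y): return x + y"
result = give_feedback(lambda x, y: x + y, source, cases, ["naming"], fake_llm)
print(result.feedback)
```

Gating the model call on the pass rate keeps grading deterministic and limits model usage to submissions that are already mostly correct, while passing the criteria list into the prompt is one simple way to reflect instructor customization.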

We evaluated the system by deploying it in a graduate-level programming course in the machine learning track at Duke University, using a within-subject, two-stage survey design.
Baseline surveys captured student experiences with grade-only feedback, while post-deployment surveys assessed perceived clarity, usefulness, efficiency, and motivational impact of the LLM-augmented feedback.
Although this pilot study surveyed a small cohort (N = 9), the qualitative results indicated that students preferred the AI-supported feedback over the grade-only baseline: they reported improved interpretability, reduced error-diagnosis time, and lower perceived reliance on teaching assistants.

This work provides preliminary evidence that workflow-constrained LLM pipelines, rather than raw model capability, deliver scalable, pedagogically aligned formative feedback.
The findings suggest a practical hybrid assessment model where deterministic grading ensures correctness, LLMs provide structured interpretive guidance, and instructors focus on higher-order conceptual learning.

Authors

Mr. Weihang Zhang Duke University

Weihang Zhang is a Master of Engineering student in Electrical and Computer Engineering at Duke University, with a concentration in Machine Learning and Artificial Intelligence. His research interests include large language models for education, automated formative feedback systems, and scalable software infrastructure for programming assessment.

Dr. Javier Pastorino https://orcid.org/0000-0002-2641-9833 Duke University

Dr. Javier Pastorino is an Assistant Professor in Electrical and Computer Engineering at Duke University. His work bridges academia and industry, drawing on more than twenty years of experience across roles in software engineering. He holds a Ph.D. in Computer Science with a focus on artificial intelligence, and his research centers on data management, machine learning for scientific applications, and privacy-aware machine learning from a software engineering perspective. In addition to his research contributions, he is actively engaged in enhancing engineering education through the development of scalable, technology-driven learning systems.

Note

The full paper will be available to logged-in, registered conference attendees once the conference starts on June 21, 2026, and to all visitors after the conference ends on June 24, 2026.

For those interested in:

computer science
engineering
Faculty
Graduate