2026 ASEE Annual Conference & Exposition

Toward Scalable Assessment of Undergraduate Reflective Practice: Comparing Multiple Reflection Quality Codebook Validation Approaches

Presented at Reflection

This empirical research brief presents preliminary findings on the development and validation of a theory-driven, automation-ready qualitative codebook for assessing reflection quality in STEM learning environments. Reflective writing is increasingly common in STEM higher education, supporting students’ metacognition, conceptual understanding, professional identity development, and lifelong learning. However, researchers and instructors face a shared challenge: how can reflection quality be assessed at scale to support systematic analyses in specific courses? Current approaches to qualitative analysis of reflection quality are time- and labor-intensive, vary across raters, and do not typically produce actionable, personalized feedback, especially in large classrooms. In response, this research asks: How can varying codebook validation methods support the reliability of qualitative codebook applications in STEM undergraduate reflective writing assignments?

To answer this question, we first present our two-dimensional codebook, with example excerpts illustrating levels of reflection quality along two dimensions: abstraction and situatedness. Abstraction refers to the level of reflective thinking exhibited, grounded in Bain et al.’s (2024) 5R framework for reflection quality (spanning Reporting/Responding, Relating, Reasoning, and Reconstruction). Situatedness refers to course contextual factors expected to surface in reflections for a given learning environment (e.g., course activities, learning objectives, academic discipline). The codebook was created using abductive coding methods and applied to a subset (N = 400 undergraduate student reflection sessions) of our larger corpus of reflective writing (N = 3,000+) to capture all forms of situatedness and associated reflection quality in given text units.

The second portion of this paper presents a codebook interrater reliability analysis, an adaptive comparative judgement analysis with expert instructors, and a comparative analysis of the affordances and constraints of these two codebook validation approaches. Our methods push the boundaries of validating and applying qualitative codebooks in STEM education research toward a reality where researchers and practitioners alike can adapt codebooks for reflection quality to specific course contexts and automate their application with the help of AI. This work stands in contrast to traditional codebook generation and use in discipline-based education research (DBER) today, where face validity is often the primary (and only) form of codebook validation and where codebooks tend to remain bounded to particular educational contexts.

These findings aim toward AI-enabled scaling of qualitative analyses of reflection quality, which requires codebooks that are automation-ready: validated, reliable, and contextualizable. Future work will advance understanding of how STEM undergraduate students develop as reflective practitioners while providing validated, AI-leveraged tools for scalable qualitative analysis of reflection quality.

Note

The full paper will be available to logged-in, registered conference attendees once the conference starts on June 21, 2026, and to all visitors after the conference ends on June 24, 2026.