2025 ASEE Annual Conference & Exposition

RFE: Machine Learning for Student Reasoning during Challenging Concept Questions - Year 2

Presented at NSF Grantees Poster Session II

In this NSF Grantees Poster Session paper, we describe our progress on a project funded by the NSF Research in the Formation of Engineers (RFE) program, a collaboration between engineering education researchers at [institution blinded for peer review] and machine learning researchers at [institution blinded for peer review] that uses machine learning to understand the reasoning in students' short-answer responses to challenging concept questions in mechanics and thermodynamics [1]–[4]. Concept questions are multiple-choice questions that require little to no mathematics and ask students to solve problems using recently learned concepts [5], [6]. Writing short-answer justifications to concept questions has been shown to improve student engagement and learning outcomes, and these responses can provide a wealth of information to instructors and researchers about student understanding [7]–[9]. However, such large volumes of free-form text are difficult to analyze manually. Researchers have used machine learning to automate feedback and grading, provide tutoring, and conduct additional analyses of short- and long-answer texts [10]–[19]. Recently, Transformer-based large language models (LLMs) [20] have been applied to qualitative research because of their generative capabilities, prompting education and machine learning researchers to examine their use more closely. For this project, we have the following goals:
- For instructors: Provide information about patterns, trends, and ideas in student thinking that they can draw on in their instructional practices and pedagogical decision-making.
- For education researchers: Provide ways to analyze student understanding across various institutional contexts at a scale not feasible with manual coding.
Here, we describe our work applying state-of-the-art Transformer LLMs (including T5 [21], GPT-3 [22], GPT-4 [23], Mixtral of Experts [24], and ATLAS.ti's Intentional AI Coding powered by OpenAI [25]) to the task of analyzing student responses to concept questions in mechanics and chemical engineering thermodynamics. We then expand upon the work done in Year 2 to improve our language models and to progress toward a generative AI tool that automates the analysis of student responses for the [tool blinded for peer review].
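To illustrate the kind of automation this line of work points toward, the short Python sketch below shows how a chat-completion LLM could be prompted to assign a single reasoning code to one student justification. This is a minimal, hypothetical example rather than the project's actual pipeline: the category labels, prompt wording, model choice, and use of the OpenAI chat-completions interface are illustrative assumptions only.

    # Minimal illustrative sketch (not the project's pipeline): prompt an LLM to
    # assign one reasoning code to a student's short-answer justification.
    # The categories and prompt wording below are hypothetical placeholders.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    CATEGORIES = ["correct reasoning", "partially correct", "misconception", "off-topic"]

    def code_justification(question: str, justification: str) -> str:
        """Return a single category label for one short-answer justification."""
        prompt = (
            f"Concept question:\n{question}\n\n"
            f"Student justification:\n{justification}\n\n"
            f"Classify the student's reasoning into exactly one of: {', '.join(CATEGORIES)}.\n"
            "Reply with the category name only."
        )
        response = client.chat.completions.create(
            model="gpt-4",
            messages=[{"role": "user", "content": prompt}],
            temperature=0,  # keep the labeling as deterministic as possible
        )
        return response.choices[0].message.content.strip()

Applying the same pattern across a full set of responses, and comparing the resulting labels against human coding with inter-rater agreement metrics, would be the natural next step in such a workflow.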

References
[1] Authors, “Paper blinded for peer review,” 2022.
[2] Authors, “Paper blinded for peer review,” 2023.
[3] Authors, “Paper blinded for peer review,” 2024a.
[4] Authors, “Paper blinded for peer review,” 2024b.
[5] E. Mazur, Peer Instruction: A User’s Manual (Series in Educational Innovation). Prentice Hall, 1997.
[6] C. H. Crouch and E. Mazur, “Peer Instruction: Ten years of experience and results,” Am. J. Phys., vol. 69, no. 9, pp. 970–977, Sep. 2001, doi: 10.1119/1.1374249.
[7] M. D. Koretsky, B. J. Brooks, R. M. White, and A. S. Bowen, “Querying the questions: Student responses and reasoning in an active learning class,” J. Eng. Educ., vol. 105, no. 2, pp. 219–244, 2016, doi: 10.1002/jee.20116.
[8] M. D. Koretsky, B. J. Brooks, and A. Z. Higgins, “Written justifications to multiple-choice concept questions during active learning in class,” Int. J. Sci. Educ., vol. 38, no. 11, pp. 1747–1765, Jul. 2016, doi: 10.1080/09500693.2016.1214303.
[9] E. Wheeler and R. L. McDonald, “Writing in engineering courses,” J. Eng. Educ., vol. 89, no. 4, pp. 481–486, 2000, doi: 10.1002/j.2168-9830.2000.tb00555.x.
[10] X. Zhai, Y. Yin, J. W. Pellegrino, K. C. Haudek, and L. Shi, “Applying machine learning in science assessment: a systematic review,” Stud. Sci. Educ., vol. 56, no. 1, pp. 111–151, Jan. 2020, doi: 10.1080/03057267.2020.1735757.
[11] X. Zhai, K. C. Haudek, L. Shi, R. H. Nehm, and M. Urban-Lurain, “From substitution to redefinition: A framework of machine learning-based science assessment,” J. Res. Sci. Teach., vol. 57, no. 9, pp. 1430–1459, 2020, doi: 10.1002/tea.21658.
[12] X. Zhai, K. C. Haudek, C. Wilson, and M. Stuhlsatz, “A framework of construct-irrelevant variance for contextualized constructed response assessment,” Front. Educ., vol. 6, 2021, Accessed: Feb. 08, 2024. [Online]. Available: https://www.frontiersin.org/articles/10.3389/feduc.2021.751283
[13] X. Zhai, L. Shi, and R. H. Nehm, “A meta-analysis of machine learning-based science assessments: Factors impacting machine-human score agreements,” J. Sci. Educ. Technol., vol. 30, no. 3, pp. 361–379, Jun. 2021, doi: 10.1007/s10956-020-09875-z.
[14] X. Zhai, J. Krajcik, and J. W. Pellegrino, “On the validity of machine learning-based Next Generation Science assessments: A validity inferential network,” J. Sci. Educ. Technol., vol. 30, no. 2, pp. 298–312, Apr. 2021, doi: 10.1007/s10956-020-09879-9.
[15] K. C. Haudek and X. Zhai, “Examining the effect of assessment construct characteristics on machine learning scoring of scientific argumentation,” Int. J. Artif. Intell. Educ., Dec. 2023, doi: 10.1007/s40593-023-00385-8.
[16] S. Maestrales, X. Zhai, I. Touitou, Q. Baker, B. Schneider, and J. Krajcik, “Using machine learning to score multi-dimensional assessments of chemistry and physics,” J. Sci. Educ. Technol., vol. 30, no. 2, pp. 239–254, 2021.
[17] S. Hilbert et al., “Machine learning for the educational sciences,” Rev. Educ., vol. 9, no. 3, p. e3310, 2021, doi: 10.1002/rev3.3310.
[18] P. P. Martin, D. Kranz, P. Wulff, and N. Graulich, “Exploring new depths: Applying machine learning for the analysis of student argumentation in chemistry,” J. Res. Sci. Teach., advance online publication, doi: 10.1002/tea.21903.
[19] B. J. Yik, A. J. Dood, D. C. R. de Arellano, K. B. Fields, and J. R. Raker, “Development of a machine learning-based tool to evaluate correct Lewis acid–base model use in written responses to open-ended formative assessment items,” Chem. Educ. Res. Pract., vol. 22, no. 4, pp. 866–885, 2021.
[20] A. Vaswani et al., “Attention Is All You Need,” in Advances in Neural Information Processing Systems, Curran Associates, Inc., 2017. Accessed: Aug. 09, 2024. [Online]. Available: https://proceedings.neurips.cc/paper_files/paper/2017/hash/3f5ee243547dee91fbd053c1c4a845aa-Abstract.html
[21] C. Raffel et al., “Exploring the limits of transfer learning with a unified Text-to-Text Transformer,” Jul. 28, 2020, arXiv: arXiv:1910.10683. Accessed: Apr. 03, 2023. [Online]. Available: http://arxiv.org/abs/1910.10683
[22] T. B. Brown et al., “Language models are few-shot learners,” Jul. 22, 2020, arXiv: arXiv:2005.14165. Accessed: Apr. 03, 2023. [Online]. Available: http://arxiv.org/abs/2005.14165
[23] OpenAI et al., “GPT-4 Technical Report,” Mar. 04, 2024, arXiv: arXiv:2303.08774. doi: 10.48550/arXiv.2303.08774.
[24] A. Q. Jiang et al., “Mixtral of Experts,” Jan. 08, 2024, arXiv: arXiv:2401.04088. doi: 10.48550/arXiv.2401.04088.
[25] “AI Coding powered by OpenAI,” ATLAS.ti. [Online]. Available: https://atlasti.com/ai-coding-powered-by-openai

Authors
  1. Namrata Shivagunde, University of Massachusetts Lowell
  2. Anna Rumshisky, University of Massachusetts Lowell
  3. Dr. Milo Koretsky, Tufts University