In this paper, we report on the progress of a collaboration between engineering education and machine learning researchers to analyze student thinking in written short-answer responses to conceptually challenging questions using machine learning. Asking students to write short explanations justifying their answer choices to conceptually challenging multiple-choice questions has been shown to improve students’ answer choices, engagement, and overall conceptual understanding [1], [2]. These short-answer responses also give instructors and researchers valuable insight into student thinking [3]; however, analyzing them by hand is cumbersome. Previous work applying natural language processing (NLP) in education research has shown that Large Language Models (LLMs) such as T5 [4] and GPT-3 [5] are capable of coding student responses, reaching F1 scores up to 73% when fine-tuned on coded examples (T5) or prompted with in-context examples (GPT-3) [4]. Thus, using NLP to qualitatively code short-answer responses can help researchers and instructors learn more about student thinking. We have the following goals:
- For instructors, we want to create a tool to help them learn about patterns of student reasoning and sense-making in short-answer responses. Utilizing this information can help them shift their instructional practices.
- For education researchers, we want to create a tool that helps them understand and code aspects of student thinking in short-answer responses and develop codes or themes for future study.
- For machine learning researchers, we aim to develop language models and a set of prompting strategies to code student answers. The models should identify and annotate the key concept and the reasoning behind the answer choice in a given response. We also hope these models and prompting strategies will generalize to new science questions, so that instructors can use them as a tool to gain deeper insight into students’ understanding of a concept; a minimal prompting sketch follows this list.
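To make this goal concrete, the sketch below assembles a GPT-4 request that pairs a short codebook with one in-context example before presenting the student explanation to be coded. This is a minimal sketch, assuming the OpenAI Python client; the codebook labels, the worked example, and the helper name `code_response` are hypothetical stand-ins, not our actual coding scheme.

```python
# Minimal sketch of an in-context prompting strategy for coding a student
# explanation. Assumes the OpenAI Python client; labels and the worked
# example are hypothetical placeholders for an actual codebook.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

CODEBOOK = (
    "Code each student explanation with one or more of: "
    "CORRECT_CONCEPT, MISCONCEPTION, RESTATES_ANSWER, OFF_TOPIC."
)

def code_response(student_explanation: str) -> str:
    """Ask the model to assign codebook labels to one short-answer explanation."""
    messages = [
        {"role": "system", "content": CODEBOOK},
        # One worked (hypothetical) in-context example to anchor the output format.
        {"role": "user", "content": "Explanation: The beam doesn't move, so the net moment must be zero."},
        {"role": "assistant", "content": "CORRECT_CONCEPT"},
        {"role": "user", "content": f"Explanation: {student_explanation}"},
    ]
    reply = client.chat.completions.create(model="gpt-4", messages=messages, temperature=0)
    return reply.choices[0].message.content

print(code_response("I just guessed B because it looked right."))
```

In practice, the system prompt and in-context examples would carry the full codebook developed through manual qualitative coding, and the model's labels would be compared against human codes to measure agreement.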
At the 2022 ASEE Annual Meeting [6], we described preliminary results from applying large pre-trained generative sequence-to-sequence language models [4], [5] to automate qualitative coding of short-answer explanations to a statics concept question. At the 2023 Annual Meeting [7], we began to conceptualize a human-computer partnership in which human coding and computer coding can influence one another to better analyze student narratives of understanding. We also began thinking about promoting linguistic justice in our coding processes to ensure all narratives of understanding are attended to. This paper describes our progress in improving our prompting strategies for GPT-4, fine-tuning open-source LLMs such as Llama-2 [8] on the manually coded answers, extending the qualitative and machine learning analysis to another engineering context, and conceptualizing a human-machine partnership to understand student thinking in written short-answer responses.
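For the fine-tuning side of this work, the following is a minimal sketch of parameter-efficient fine-tuning of Llama-2 on manually coded answers, assuming the Hugging Face `transformers`, `datasets`, and `peft` libraries with a LoRA adapter and gated access to `meta-llama/Llama-2-7b-hf`. The data file name, field names, and hyperparameters are illustrative assumptions; they are not the configuration used in our study.

```python
# Minimal LoRA fine-tuning sketch for coding student answers with Llama-2.
# Assumes a JSONL file of (response, code) pairs; all names and
# hyperparameters here are illustrative.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

base = "meta-llama/Llama-2-7b-hf"
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base)

# Wrap the base model with low-rank adapters so only a small fraction of
# parameters is updated during fine-tuning.
model = get_peft_model(model, LoraConfig(r=8, lora_alpha=16,
                                         target_modules=["q_proj", "v_proj"],
                                         task_type="CAUSAL_LM"))

def to_text(example):
    # Fold each manually coded answer into a single instruction-style string.
    return {"text": f"Student response: {example['response']}\nCode: {example['code']}"}

def tokenize(example):
    return tokenizer(example["text"], truncation=True, max_length=512)

data = load_dataset("json", data_files="coded_answers.jsonl")["train"]
data = data.map(to_text)
data = data.map(tokenize, remove_columns=data.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="llama2-coder", num_train_epochs=3,
                           per_device_train_batch_size=4, learning_rate=2e-4),
    train_dataset=data,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

A setup along these lines lets the manually coded answers from the qualitative analysis serve directly as training data, so improvements to the human codebook can propagate into the fine-tuned model.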