In this paper, we report on the progress of a collaboration between engineering education and machine learning researchers to analyze student thinking in written short-answer responses to conceptually challenging questions using machine learning. Asking students to write short explanations justifying their answer choices to conceptually challenging multiple-choice questions has been shown to improve students’ answer choices, engagement, and overall conceptual understanding [1], [2]. These short-answer responses also give instructors and researchers valuable insight into student thinking [3]; however, analyzing them by hand is cumbersome. Previous work applying natural language processing (NLP) in education research has shown that Large Language Models (LLMs) such as T5 [4] and GPT-3 [5] are capable of coding student responses, reaching F1 scores up to 73% when fine-tuned on coded examples (T5) or prompted with in-context examples (GPT-3) [4]. Thus, using NLP to qualitatively code short-answer responses can help researchers and instructors learn more about student thinking. We have the following goals:
- For instructors, we want to create a tool to help them learn about patterns of student reasoning and sense-making in short-answer responses. Utilizing this information can help them shift their instructional practices.
- For education researchers, we want to create a tool that helps them understand and code aspects of student thinking in short-answer responses and develop codes or themes for future study.
- For machine learning researchers, we aim to develop language models and a set of prompting strategies to code student answers. The models should identify and annotate the key concept and the reasoning behind the answer choice in a given response. We also hope these models and prompting strategies will generalize to new science questions, so that instructors can use them as a tool to gain deeper insight into students’ understanding of a concept; a minimal prompting sketch follows this list.
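To make this goal concrete, the sketch below assembles a GPT-4 request that pairs a short codebook with one in-context example before presenting the student explanation to be coded. This is a minimal sketch, assuming the OpenAI Python client; the codebook labels, the worked example, and the helper name `code_response` are hypothetical stand-ins, not our actual coding scheme.

```python
# Minimal sketch of an in-context prompting strategy for coding a student
# explanation. Assumes the OpenAI Python client; labels and the worked
# example are hypothetical placeholders for an actual codebook.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

CODEBOOK = (
    "Code each student explanation with one or more of: "
    "CORRECT_CONCEPT, MISCONCEPTION, RESTATES_ANSWER, OFF_TOPIC."
)

def code_response(student_explanation: str) -> str:
    """Ask the model to assign codebook labels to one short-answer explanation."""
    messages = [
        {"role": "system", "content": CODEBOOK},
        # One worked (hypothetical) in-context example to anchor the output format.
        {"role": "user", "content": "Explanation: The beam doesn't move, so the net moment must be zero."},
        {"role": "assistant", "content": "CORRECT_CONCEPT"},
        {"role": "user", "content": f"Explanation: {student_explanation}"},
    ]
    reply = client.chat.completions.create(model="gpt-4", messages=messages, temperature=0)
    return reply.choices[0].message.content

print(code_response("I just guessed B because it looked right."))
```

In practice, the system prompt and in-context examples would carry the full codebook developed through manual qualitative coding, and the model's labels would be compared against human codes to measure agreement.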
At the 2022 ASEE Annual Meeting [6], we described preliminary results from applying large pre-trained generative sequence-to-sequence language models [4], [5] to automate qualitative coding of short-answer explanations to a statics concept question. At the 2023 Annual Meeting [7], we began to conceptualize a human-computer partnership in which human coding and computer coding can influence one another to better analyze student narratives of understanding. We also began thinking about promoting linguistic justice in our coding processes to ensure all narratives of understanding are attended to. This paper describes our progress in improving our prompting strategies for GPT-4, fine-tuning open-source LLMs such as Llama-2 [8] on the manually coded answers, extending the qualitative and machine learning analysis to another engineering context, and conceptualizing a human-machine partnership to understand student thinking in written short-answer responses.
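For the fine-tuning side of this work, the following is a minimal sketch of parameter-efficient fine-tuning of Llama-2 on manually coded answers, assuming the Hugging Face `transformers`, `datasets`, and `peft` libraries with a LoRA adapter and gated access to `meta-llama/Llama-2-7b-hf`. The data file name, field names, and hyperparameters are illustrative assumptions; they are not the configuration used in our study.

```python
# Minimal LoRA fine-tuning sketch for coding student answers with Llama-2.
# Assumes a JSONL file of (response, code) pairs; all names and
# hyperparameters here are illustrative.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

base = "meta-llama/Llama-2-7b-hf"
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base)

# Wrap the base model with low-rank adapters so only a small fraction of
# parameters is updated during fine-tuning.
model = get_peft_model(model, LoraConfig(r=8, lora_alpha=16,
                                         target_modules=["q_proj", "v_proj"],
                                         task_type="CAUSAL_LM"))

def to_text(example):
    # Fold each manually coded answer into a single instruction-style string.
    return {"text": f"Student response: {example['response']}\nCode: {example['code']}"}

def tokenize(example):
    return tokenizer(example["text"], truncation=True, max_length=512)

data = load_dataset("json", data_files="coded_answers.jsonl")["train"]
data = data.map(to_text)
data = data.map(tokenize, remove_columns=data.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="llama2-coder", num_train_epochs=3,
                           per_device_train_batch_size=4, learning_rate=2e-4),
    train_dataset=data,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

A setup along these lines lets the manually coded answers from the qualitative analysis serve directly as training data, so improvements to the human codebook can propagate into the fine-tuned model.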