Natural language processing (NLP) techniques are widely used in linguistic analysis and have shown promising results in areas such as text summarization, text classification, autocorrection, and chatbot conversation management. In education, NLP has primarily been applied to automated grading of essays and open-ended questions, semantic evaluation of student work, and the generation of feedback for intelligent tutoring-based student interaction. However, what is notably missing from NLP work to date is a robust automated framework for accurately analyzing text-based educational survey data. To address this gap, this case study uses NLP models to generate codes for thematic analysis of student needs for teaching assistant (TA) support and then compares the NLP-generated code assignments with those assigned by an expert researcher.
Student responses to short-answer questions regarding preferences for TA support were collected from an instructional support survey conducted in a broad range of electrical, computer, and mechanical engineering courses (N > 1400) at a large public research institution between 2016 and 2021. The resulting dataset was randomly split into training (60%), validation (20%), and test (20%) sets. A popular NLP topic modeling approach (Latent Dirichlet Allocation, or LDA) was applied to the training dataset, which determined that the optimal number of topics (codes) represented in the dataset was four. These four topics were labeled as: (1) examples, where students expressed a need for TAs to illustrate additional problem-solving and applied content in engineering courses; (2) questions and answers, where students desired more opportunities to pose questions to TAs and obtain timely answers to those questions; (3) office hours, encompassing additional availability outside of formally scheduled class times; and (4) lab support. For the validation and testing datasets, an experienced researcher then used these four labels as codes to identify the ground truth for each student's response. Ground truth was then compared to NLP model predictions to gauge the accuracy of the model. For the validation dataset, the accuracy with which NLP identified each response as containing or not containing each code ranged from 79.4% to 91.1%, while for the testing dataset, such accuracies ranged from 81.1% to 92.2%. The codes identified by NLP were then combined into themes by a human researcher, resulting in three themes (problem-solving, interactions, and active/experiential learning). Conclusions reached regarding the three themes were identical whether the NLP codes or the (human) researcher codes were used for data interpretation.
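For readers interested in reproducing a workflow of this kind, the sketch below illustrates the LDA topic-modeling step using scikit-learn. The paper does not specify its toolchain, so the library choice, file and column names, the random seeds, and the way topic probabilities are turned into code assignments are all illustrative assumptions; only the 60/20/20 split and the four-topic solution come from the study itself.

```python
# Minimal sketch of an LDA-based coding pipeline for short-answer survey
# responses. File name, column name, and thresholds are hypothetical.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Hypothetical CSV with one short-answer response per row.
responses = pd.read_csv("ta_support_responses.csv")["response"].dropna()

# 60/20/20 split into training, validation, and test sets.
train, hold = train_test_split(responses, test_size=0.4, random_state=0)
valid, test = train_test_split(hold, test_size=0.5, random_state=0)

# Bag-of-words representation of the training responses.
vectorizer = CountVectorizer(stop_words="english", min_df=2)
X_train = vectorizer.fit_transform(train)

# Fit LDA with four topics (the optimum reported in the study); in practice
# the topic count would be chosen by comparing model fit across candidates.
lda = LatentDirichletAllocation(n_components=4, random_state=0)
lda.fit(X_train)

# Inspect the top words per topic to support human labeling of the codes
# (e.g., examples, questions and answers, office hours, lab support).
terms = vectorizer.get_feature_names_out()
for k, weights in enumerate(lda.components_):
    top = [terms[i] for i in weights.argsort()[::-1][:8]]
    print(f"Topic {k}: {', '.join(top)}")

# Assign codes to held-out responses from their topic distributions, e.g. by
# thresholding, then compare against the researcher's ground-truth codes.
valid_topics = lda.transform(vectorizer.transform(valid))
valid_codes = valid_topics >= 0.25  # illustrative per-code threshold
```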
Short-answer questions, despite their value in providing deeper insight into the student experience, are infrequently used in educational research because the resulting data often requires prohibitive human resources to analyze. This study has demonstrated, in a case study of student preferences for TA support, the value of NLP in understanding large numbers of textual, short-answer responses from students. That NLP models can deliver the same conclusions in minutes, rather than the hours that traditional thematic analysis methods consume, is promising for expanding the use of more nuanced, richer text-based data in survey-based education research.