2023 ASEE Annual Conference & Exposition

Predicting Academic Performance for Pre/Post-Intervention on Action-State Orientation Surveys

Presented at Effective Teaching and Learning, and Post-Pandemic Classrooms

The objective of this study is to analyze responses to the action/state orientation surveys administered to freshman and junior students in engineering and psychology majors, and to explore individual survey responses as potential predictors of the students' academic performance using statistical methods, including machine learning algorithms and related data analytics.

The datasets used (so far) for this objective include students in the following cohorts:
- Spring 2021 Cohort (1) – Electrical Engineering Juniors,
- Spring 2021 Cohort (2) – General Engineering Freshmen,
- Spring 2021 Cohort (3) – Psychology Majors,
- Fall 2021 Cohort (1) – General Engineering Freshmen, and
- Fall 2021 Cohort (2) – Psychology Majors.

In addition to the direct responses, we also generated functions to represent features and attributes for each response, such as efficacy, habits, hesitation, preoccupation, volatility, and engagement in curricular and extracurricular activities.
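To make this feature-construction step concrete, the sketch below aggregates hypothetical survey columns into per-construct scores; the column names, the question-to-construct mapping, and the use of simple item means are illustrative assumptions, not the study's actual coding scheme.

```python
import pandas as pd

# Hypothetical construct scoring: each action-state-orientation construct is
# represented by the mean of a few Likert-scale items. The mapping below is
# illustrative only, not the study's actual instrument coding.
CONSTRUCT_ITEMS = {
    "hesitation":    ["q01", "q07", "q13"],
    "preoccupation": ["q02", "q08", "q14"],
    "volatility":    ["q03", "q09", "q15"],
}

def add_construct_scores(responses: pd.DataFrame) -> pd.DataFrame:
    """Append one aggregate score per construct to the raw survey responses."""
    scored = responses.copy()
    for construct, items in CONSTRUCT_ITEMS.items():
        scored[construct] = scored[items].mean(axis=1)
    return scored
```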

The student populations from all cohorts were combined to create a master survey list. Binary categories were defined as academic failure (GPA < 2.0) or not (GPA > 2.0) based on the GPA self-reported by the students. Since students with GPA > 2.0 constituted a much larger percentage of the population, we approached this problem as one-class anomaly detection, a well-defined area of machine learning. We implemented six different machine learning algorithms, including K-means clustering, deep neural networks (DNNs), principal component analysis (PCA), Gaussian process regression (GPR), one-class autoencoders (OCAE), and one-class support vector machines (OCSVM), to identify whether a student is academically successful (GPA > 2.0) or not. The highest-accuracy architectures were the OCAEs and OCSVMs.
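As a minimal sketch of the one-class setup, the example below fits scikit-learn's OneClassSVM on the majority class only and flags low decision-function scores as potential anomalies; the placeholder data, the 18-feature width, and the nu/gamma settings are assumptions for illustration, not the study's actual configuration.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
X_normal = rng.normal(size=(200, 18))   # placeholder: GPA > 2.0 students only
X_test   = rng.normal(size=(50, 18))    # placeholder: unseen mix of both classes

# Fit the one-class model on the majority ("normal") class only.
ocsvm = make_pipeline(StandardScaler(),
                      OneClassSVM(kernel="rbf", nu=0.05, gamma="scale"))
ocsvm.fit(X_normal)

# decision_function > 0 -> resembles the training class;
# <= 0 -> flagged as an anomaly, i.e., a possible GPA < 2.0 student.
predicted_failure = ocsvm.decision_function(X_test) <= 0
```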

The ML models were trained using only the students with GPA > 2.0, with randomly selected survey questions serving as input features. Once a model was created and trained, we tested the architecture using survey responses that the model had never seen. This test dataset consisted of a subsample of students with GPA > 2.0 and all the students with GPA < 2.0. As a reminder, up until this point the model had never seen any survey data from students with GPA < 2.0. The expectation was that the model would accurately categorize these test instances as anomaly samples based on reconstruction-error comparisons with the normal samples.
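A rough stand-in for this reconstruction-error procedure is sketched below, using scikit-learn's MLPRegressor trained to reproduce its input as a simple autoencoder; the placeholder data, the network size, and the 95th-percentile cutoff are assumed values rather than the study's reported settings.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import MinMaxScaler

rng = np.random.default_rng(0)
X_train = rng.random(size=(300, 18))    # placeholder: GPA > 2.0 students only
X_test  = rng.random(size=(80, 18))     # placeholder: unseen mix of both classes

scaler = MinMaxScaler().fit(X_train)
X_train_s, X_test_s = scaler.transform(X_train), scaler.transform(X_test)

# A small MLP trained to reproduce its own input acts as the autoencoder.
autoencoder = MLPRegressor(hidden_layer_sizes=(8,), activation="relu",
                           max_iter=2000, random_state=0)
autoencoder.fit(X_train_s, X_train_s)

def reconstruction_error(model, X):
    """Per-sample mean squared reconstruction error."""
    return np.mean((model.predict(X) - X) ** 2, axis=1)

# Flag test samples whose error exceeds a percentile of the training errors;
# the 95th percentile is an arbitrary illustrative threshold.
threshold = np.percentile(reconstruction_error(autoencoder, X_train_s), 95)
predicted_failure = reconstruction_error(autoencoder, X_test_s) > threshold
```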

The train/test procedure was repeated for thousands of combinations of 18 randomly selected survey questions from the 60-question survey to find out which questions most consistently result in better predictions of academic failure. The best-performing 18-question groups were recorded for the top-10 most accurate classification scenarios, using the area-under-curve (AUC) score as a percentage-based performance indicator for binary classification tasks (i.e., is the student's GPA < 2.0 or not) that is well suited to heavily imbalanced datasets such as this one. For instance, a score of 0.744 means that roughly 74.4% of the time we can identify a student's likelihood of having a lower GPA using the survey questions in that specific combination. After analyzing the performance results and looking at the top-performing combinations, we observed that the responses to questions such as 59 and 26 have disproportionately larger representation among the more accurate categorizations. Most of these questions involve study habits (as expected), but some also involve extracurricular activities such as involvement in student clubs, including IEEE, and on-campus housing activities.
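The repeated-subset search can be sketched as follows, assuming an OCSVM detector and synthetic placeholder data; the number of trials, the detector settings, and the AUC bookkeeping are illustrative stand-ins for the study's actual pipeline.

```python
import numpy as np
from sklearn.metrics import roc_auc_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
n_questions, subset_size, n_trials = 60, 18, 1000   # study repeats this thousands of times

X_normal = rng.normal(size=(300, n_questions))      # placeholder: GPA > 2.0 training pool
X_test   = rng.normal(size=(80, n_questions))       # placeholder: unseen mixed test set
y_test   = rng.integers(0, 2, size=80)              # 1 = GPA < 2.0 (anomaly), 0 = otherwise

results = []
for _ in range(n_trials):
    cols = np.sort(rng.choice(n_questions, size=subset_size, replace=False))
    model = make_pipeline(StandardScaler(), OneClassSVM(nu=0.05, gamma="scale"))
    model.fit(X_normal[:, cols])
    anomaly_score = -model.decision_function(X_test[:, cols])  # higher = more anomalous
    results.append((roc_auc_score(y_test, anomaly_score), tuple(cols)))

# Keep the question subsets behind the top-10 AUC scores for inspection.
top10 = sorted(results, key=lambda r: r[0], reverse=True)[:10]
```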

Future work will include analyzing other cohorts and validating the model's performance on new surveys. We will also investigate ensemble learning to determine whether similarly grouped survey questions can be used for independent categorization, whose outputs can then be combined using methods such as majority voting or winner-takes-all. Finally, models trained on pre/post-intervention surveys will be tested on post/pre-intervention surveys, respectively, to analyze the differences in model performance.
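A minimal sketch of the majority-voting idea, assuming each question group yields its own binary anomaly prediction, is shown below; the detector outputs are made up for illustration.

```python
import numpy as np

def majority_vote(predictions: np.ndarray) -> np.ndarray:
    """Combine 0/1 predictions from several per-group detectors.

    predictions has shape (n_detectors, n_students); 1 means "predicted GPA < 2.0".
    """
    votes = predictions.sum(axis=0)
    return (votes > predictions.shape[0] / 2).astype(int)

# Example: three detectors, each trained on a different group of survey questions.
per_group_predictions = np.array([[1, 0, 1, 0],
                                  [1, 0, 0, 0],
                                  [0, 0, 1, 1]])
print(majority_vote(per_group_predictions))   # -> [1 0 1 0]
```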

Authors
  1. Prof. Ismail Uysal University of South Florida [biography]
  2. Paul E. Spector University of South Florida
  3. Dr. Chris S. Ferekides University of South Florida
  4. Mehmet Bugrahan Ayanoglu University of South Florida
