2024 ASEE Annual Conference & Exposition

WIP: Traditional Engineering Assessments Challenged by ChatGPT: An Evaluation of its Performance on a Fundamental Competencies Exam

Presented at Educational Research and Methods Division (ERM) Technical Session 19

ChatGPT, a chatbot that produces text with remarkable coherence, is leading higher education institutions to question the relevance of the current model of engineering education and, in particular, of assessment. One major reason for this questioning is that ChatGPT has been shown to pass various engineering exams.
In this research, the GPT-3.5 and GPT-4 models were used to solve different real-life versions of the Fundamental Competencies Exam (FCE), an exam administered by a selective Latin American engineering school upon the completion of foundational engineering courses such as basic dynamics, ethics for engineers, and probability and statistics. The questions are formulated to assess whether the student has fundamental knowledge of the discipline.
We adopted a strategy in which the questions were extracted from the FCE modules and translated to LaTeX. Each question was presented without supplementary context, to avoid influence between questions within the same exam. In addition, we performed a comparative analysis of the GPT-4 model, evaluating its performance with and without image-interpretation capability, following the recent introduction of ChatGPT-4's multimodal functionality.
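The sketch below illustrates, in Python, how such a protocol might look: each LaTeX-formatted question is submitted as an isolated prompt (no shared conversation state), with an optional multimodal variant for questions that include a figure. This is not the authors' actual pipeline; the model names, system prompt, and sample question are assumptions for illustration only.

```python
# Minimal sketch (assumed, not the paper's code) of presenting each FCE
# question in isolation, with an optional image-enabled variant.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

SYSTEM_PROMPT = "Solve the following exam question on its own."  # assumed prompt

def ask_isolated(question_latex: str, model: str = "gpt-4") -> str:
    """Send a single LaTeX-formatted question with no prior conversation."""
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": question_latex},
        ],
    )
    return response.choices[0].message.content

def ask_with_image(question_latex: str, image_url: str,
                   model: str = "gpt-4-turbo") -> str:
    """Variant for questions that include a figure, using multimodal input."""
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "user", "content": [
                {"type": "text", "text": question_latex},
                {"type": "image_url", "image_url": {"url": image_url}},
            ]},
        ],
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    # Hypothetical example question, not taken from the FCE.
    question = r"Compute $P(X > 2)$ for $X \sim \mathrm{Exp}(\lambda = 1)$."
    print(ask_isolated(question))
```

Under this setup, restarting the conversation for every question keeps answers from leaking context into one another, matching the paper's stated constraint of presenting statements without supplementary context.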
The results reveal a considerable difference between GPT-3.5 and GPT-4, which obtained pass rates of 47.38% and 63.06%, respectively. While the GPT-4 version without images already achieved a passing rate sufficient to pass every module, including the questions with images increased performance further, to a 64.38% pass rate. We plan to continue solving different versions of the exam; these data will allow us to perform multiple analyses of the exam's historical performance, providing a proxy for assessing how its difficulty has changed over the years.
In light of these preliminary results, and given the tight constraints imposed on the model, it is imperative to question whether the FCE effectively assesses the fundamental skills required of an engineer, and whether it remains the best method for assessing foundational engineering competencies amid the advent of innovative AI tools.

Authors
  1. Trini Balart, Pontificia Universidad Católica de Chile
  2. Dr. Jorge Baier, Pontificia Universidad Católica de Chile
  3. Martín Eduardo Castillo, Pontificia Universidad Católica de Chile
