2025 ASEE Annual Conference & Exposition

An Assessment of ChatGPT 4o's Performance on Mechanical Engineering Concept Inventories

Presented at ME Division Technical Session 2 - Harnessing AI and Machine Learning to Transform ME Education

Large Language Models (LLMs) like OpenAI's ChatGPT-4o show promise for enhancing engineering education through real-time support and personalized feedback. However, their reliability in interpreting the conceptual diagrams central to mechanical engineering remains uncertain. This study evaluates ChatGPT-4o's performance on four concept inventories (the Force Concept Inventory, Materials Concept Inventory, Mechanics Baseline Test, and Mechanics of Materials Concept Inventory), with responses evaluated by two Mechanical Engineering professors on correctness, depth of explanation, and application of theoretical knowledge. While ChatGPT-4o demonstrates the ability to provide robust explanations, it often lacks the contextual depth required for higher-order concept mastery, especially when reasoning from diagrams. These findings align with existing literature highlighting AI's limitations in discipline-specific support. Future research should refine AI responses to better align with engineering problem-solving approaches and explore hybrid models that integrate AI assistance with human instruction, potentially leading to more effective AI-augmented learning platforms in mechanical engineering education.

Authors
  1. Hillary E. Merzdorf, Cornell University
  2. Xiaosu Guo, University of Texas at Dallas (ORCID: http://orcid.org/0009-0007-0571-3746)
  3. Sami Melhem, Texas A&M University
  4. Dr. Kristi J. Shryock, Texas A&M University