Large Language Models (LLMs) such as OpenAI’s ChatGPT-4o show promise for enhancing engineering education through real-time support and personalized feedback. However, their reliability in interpreting the conceptual diagrams central to mechanical engineering remains uncertain. This study evaluates ChatGPT-4o’s performance on four concept inventories: the Force Concept Inventory, the Materials Concept Inventory, the Mechanics Baseline Test, and the Mechanics of Materials Concept Inventory. Two Mechanical Engineering professors rated the model’s responses on correctness, depth of explanation, and application of theoretical knowledge. While ChatGPT-4o can produce robust explanations, it often lacks the contextual depth required for higher-order concept mastery, especially when reasoning from diagrams. These findings align with existing literature on AI’s limitations in discipline-specific support. Future work should refine AI responses to better match engineering problem-solving approaches and explore hybrid models that integrate AI assistance with human instruction, potentially yielding more effective AI-augmented learning platforms for mechanical engineering education.
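For readers who want to run a similar evaluation, the sketch below shows one way to collect model responses to concept-inventory items for later human rubric scoring. It assumes the official openai Python client and an OPENAI_API_KEY environment variable; the placeholder item, the ask_model helper, the rubric field names, and the output file are illustrative assumptions, not the paper’s actual protocol.

```python
# Minimal sketch: query GPT-4o with concept-inventory items and save its
# explanations for later human rubric scoring (correctness, depth of
# explanation, application of theory). Assumes `pip install openai` and
# an OPENAI_API_KEY environment variable. Item text is a placeholder;
# real inventory questions would be loaded from the instruments.
import json

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

ITEMS = [
    {
        "inventory": "FCI",
        "id": 1,
        "question": (
            "A ball is thrown straight up. At the top of its path, what is "
            "its acceleration? (a) zero (b) 9.8 m/s^2 downward (c) ..."
        ),
    },
]


def ask_model(question: str) -> str:
    """Send one multiple-choice item to GPT-4o; return its answer and reasoning."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {
                "role": "system",
                "content": (
                    "You are a mechanical engineering tutor. Answer the "
                    "question, then explain your reasoning step by step."
                ),
            },
            {"role": "user", "content": question},
        ],
        temperature=0,  # reduce run-to-run variation for repeatable grading
    )
    return response.choices[0].message.content


def main() -> None:
    records = []
    for item in ITEMS:
        records.append(
            {
                "inventory": item["inventory"],
                "item_id": item["id"],
                "model_response": ask_model(item["question"]),
                # Rubric scores are left empty for the human raters.
                "correctness": None,
                "depth_of_explanation": None,
                "application_of_theory": None,
            }
        )
    with open("gpt4o_responses.json", "w") as f:
        json.dump(records, f, indent=2)


if __name__ == "__main__":
    main()
```

Note that this harness only gathers responses; the study’s central judgments (the professors’ rubric scores) remain a human step, which is consistent with the paper’s conclusion that AI assistance is best paired with human instruction.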