2026 ASEE Annual Conference & Exposition

A Multimodal Framework for Embodied Cognition in Oral Explanations

Presented at Computers in Education (CoED): Learning, Engagement & Inclusion (7 of 9) -- T508B

This study advances a multimodal analytic framework for investigating embodied cognition during oral explanations in statistics education. Grounded in theories of embodied, situated, and distributed cognition, the framework treats gesture not merely as an accessory to speech but as reasoning enacted through coordinated movement and language. Computer vision captures gesture trajectories, rhythm, and spatial anchoring, while speech-to-text large language models (LLMs) transcribe and semantically analyze verbal explanations. A central Inference Agent integrates these modalities to reveal how gesture and discourse converge or diverge as indicators of conceptual understanding. Rather than claiming autonomous interpretation, the system functions as an epistemic instrument that visualizes the coupling between gesture and conceptual understanding. By aligning theories of embodied cognition with computational observation, this work reframes oral assessment as a dialogic event in which knowing unfolds through motion, voice, and interpretation, positioning understanding itself as an embodied relation between cognition, expression, and inference.

Authors
  1. Mr. Amirreza Mehrabi, Purdue Engineering Education
  2. Junior Anthony Bennett, Purdue University – West Lafayette (College of Engineering), ORCID: https://orcid.org/0009-0004-6441-3805
  3. Aashvi Majmundar, Purdue University – West Lafayette (College of Engineering)
Note

The full paper will be available to logged-in, registered conference attendees once the conference starts on June 21, 2026, and to all visitors after the conference ends on June 24, 2026.