2025 ASEE Annual Conference & Exposition

Improving the Accessibility of Mathematical and other STEM Content in Engineering courses through Machine Learning Models

Presented at Minorities in Engineering Division (MIND) Technical Session 7

Transcribing equations and diagrams from STEM lecture slides poses significant challenges because their structures vary widely. Existing methods focus on well-formatted research papers or isolated mathematical content and lack the flexibility to handle diverse engineering educational materials. Furthermore, there is no widely available open-source software capable of both detecting and transcribing equations or diagrams from real-world STEM slides to mathematical markup such as LaTeX, much less to natural language. This gap limits the accessibility of STEM content for students with disabilities or students with generally unmet needs, particularly in higher education settings as equations and diagrams become increasingly complex.

To address this, we evaluate and enhance existing computer vision machine learning models for detecting and transcribing equations and diagrams from STEM slides. To understand the strengths and limitations of existing methods, we score them on their ability to handle different course materials. We then plan to improve both accuracy and efficiency in handling diverse content types, including handwritten equations and varied font styles.

We test these models on a custom dataset of lecture materials from six STEM courses at a large Midwestern land-grant university; these courses reach more than 1,000 engineering students per semester (mostly undergraduate). We assess transcription performance with character-error metrics and also consider the computational resources each model requires, which supports our goal of integrating these models into our previous digital learning platform.
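As an illustration of what a character-error metric for transcription can look like, the sketch below computes a character error rate (CER): the Levenshtein edit distance between a reference LaTeX string and a model's output, divided by the reference length. This is a common definition and an assumption on our part; the abstract does not state which exact metric, normalization, or tokenization is used. The function names and the sample strings are hypothetical.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance: minimum number of single-character insertions,
    deletions, and substitutions needed to turn hypothesis into reference."""
    # prev[j] holds the distance between the reference prefix processed so far
    # and the first j characters of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution_cost = 0 if ref_char == hyp_char else 1
            curr.append(min(
                prev[j] + 1,                      # drop a reference character
                curr[j - 1] + 1,                  # drop a hypothesis character
                prev[j - 1] + substitution_cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER = edit distance / reference length (assumed normalization)."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return edit_distance(reference, hypothesis) / len(reference)


# Hypothetical example: the model misreads one symbol in a fraction.
reference_latex = r"\frac{a}{b}"
predicted_latex = r"\frac{a}{c}"
cer = character_error_rate(reference_latex, predicted_latex)
```

A lower CER means the transcription is closer to the reference. Note that CER on raw LaTeX penalizes harmless differences such as spacing or equivalent commands, so in practice the strings are often normalized before scoring.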

Based on our evaluation, we extract the desirable components of these models to build a robust end-to-end mathematical transcription pipeline for STEM content. We develop an open-source tool capable of accurately transcribing mathematical content from STEM slides, significantly enhancing accessibility for diverse learners. State-of-the-art models are often either closed-source or lack the flexibility needed to handle educational content. By refining and integrating open-source models into our previous digital learning platform, we aim to improve the accessibility of math and other STEM education.

Authors
  1. Louis Asanaka University of Illinois at Urbana - Champaign
  2. Delu Zhao University of Illinois at Urbana - Champaign
  3. Meghana Gopannagari University of Illinois at Urbana - Champaign
  4. Sonika Tamilarasan The University of Illinois at Chicago
  5. Alan Tao University of Illinois Urbana-Champaign
  6. Nancy Zhang University of Illinois at Urbana - Champaign
  7. Adelia Solarman University of Illinois at Urbana - Champaign
  8. Xiuhao Ding University of Illinois at Urbana - Champaign
  9. Dr. Pablo Robles-Granda University of Illinois at Urbana - Champaign
  10. Yang Victoria Shao University of Illinois Urbana Champaign
  11. Dr. Chrysafis Vogiatzis (ORCID: http://orcid.org/0000-0003-0787-9380) University of Illinois at Urbana - Champaign
  12. Dr. Hongye Liu University of Illinois at Urbana - Champaign
Note

The full paper will be available to logged in and registered conference attendees once the conference starts on June 22, 2025, and to all visitors after the conference ends on June 25, 2025