2025 ASEE Annual Conference & Exposition

Automating Structured Information Extraction from Images of Academic Transcripts Using Machine Learning

Presented at DSAI Technical Session 6: Academic Success, Performance & Complexity

The admissions process at the University of Toronto requires staff to spend many hours manually reviewing images of student transcripts in order to make critical decisions about applicants' academic futures. Academic transcript images are tedious to read and transcribe because of their many visual features, such as colored backgrounds, watermarks, multi-column layouts, and small text. To streamline this process, this report investigates the development of an AI system designed specifically to transcribe grade data from academic transcript images into organized tables. While models for table extraction are not novel, existing methods perform poorly on academic transcripts because of these unique features and because transcripts are underrepresented in the pre-existing datasets used for training. To our knowledge, this report presents the first labeled, open-source dataset consisting purely of academic transcript images for training computer-vision-based machine learning algorithms. Two primary approaches to image-to-text table reconstruction were explored. The first is a pipeline comprising a YOLOv8 object detection model, the Tesseract OCR engine, and a Mistral7b large language model (LLM). The second is a fine-tuned multimodal language model (MiniCPM-Llama3-V-2_5). The multimodal LLM showed superior accuracy on a small test set, and a multi-stage prompting strategy further improved its recall on images with more complex multi-column layouts. Future work could improve on this solution by using the trained YOLOv8 object detection model as a preprocessing step, and by continuing to develop the dataset with a greater diversity of images and prompting formats. Additionally, because the number of transcript formats in circulation is finite, we hypothesize that a larger, more inclusive dataset could be used to train a high-precision model with near-universal applicability within the target domain.
This work forms the foundation of future analytics projects at the University of Toronto, providing a platform with which admissions data may be used to predict student success and to better track students' progress over their academic careers.
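
As a rough illustration of the first approach described in the abstract (detect table regions, run OCR on each region, then have a language model structure the text), the sketch below wires three interchangeable stages together. It is not the authors' implementation: the class name `TranscriptPipeline`, the stage signatures, the left-to-right ordering of regions, and the stand-in functions (`fake_detect`, `fake_ocr`, `parse_lines`) with their made-up course rows are all assumptions chosen so the example runs without any model or OCR engine installed. In a real system the three callables would wrap the object detector, the OCR engine, and the LLM respectively.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels
Row = Dict[str, str]             # one transcribed course entry


@dataclass
class TranscriptPipeline:
    """Hypothetical three-stage pipeline: detect -> OCR -> structure."""
    detect: Callable[[object], Sequence[Box]]   # stands in for an object detector
    ocr: Callable[[object, Box], str]           # stands in for an OCR engine
    structure: Callable[[str], List[Row]]       # stands in for an LLM or parser

    def run(self, image: object) -> List[Row]:
        rows: List[Row] = []
        # Assumption: reading regions by left edge, then top edge, is a
        # simplified way to handle multi-column layouts column by column.
        for box in sorted(self.detect(image), key=lambda b: (b[0], b[1])):
            text = self.ocr(image, box)
            rows.extend(self.structure(text))
        return rows


# --- Stand-in stages with made-up data, for illustration only ---

def fake_detect(image: object) -> Sequence[Box]:
    # Two table regions, deliberately returned right column first.
    return [(400, 50, 780, 600), (10, 50, 390, 600)]


def fake_ocr(image: object, box: Box) -> str:
    # Pretend each region contains a single course line.
    return "ABC101 Intro Course 85" if box[0] < 400 else "XYZ202 Another Course 78"


def parse_lines(text: str) -> List[Row]:
    # Naive rule: first token is the code, last token is the grade.
    rows: List[Row] = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 3:
            rows.append({
                "code": parts[0],
                "title": " ".join(parts[1:-1]),
                "grade": parts[-1],
            })
    return rows


pipeline = TranscriptPipeline(fake_detect, fake_ocr, parse_lines)
example_rows = pipeline.run(image=None)  # no real image is needed by the stubs
for row in example_rows:
    print(row)
```

Keeping each stage behind a plain callable means any one of them can be replaced, for example by a single multimodal model that covers all three steps, without changing the surrounding code.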

Authors

Declan Kirk Bracken, University of Toronto

Declan Bracken is an M.Eng. student in the Department of Mechanical and Industrial Engineering at the University of Toronto, pursuing an emphasis in Analytics. This paper is the final product of an 8-month M.Eng. project supervised by Professor Sinisa Colic, and its work is intended for implementation in the admissions process of the University of Toronto's M.I.E. department.

Dr. Sinisa Colic, Ph.D., University of Toronto

Dr. Colic is an Assistant Professor, Teaching Stream, in the Department of Mechanical and Industrial Engineering. He completed his PhD at the University of Toronto in the area of personalized treatment options for epilepsy using advanced signal processing techniques and machine learning. Dr. Colic currently teaches several courses at the University of Toronto covering a broad range of topics in mechatronics, data science, and machine learning / deep learning.

For those interested in:

Academia-Industry Connections
computer science
engineering
engineering technology
Faculty
information technology