2024 ASEE Annual Conference & Exposition

Board 209: Bridging Language Barriers in Healthcare Education: An Approach for Intelligent Tutoring Systems with Code-Switching Adaptation

Presented at NSF Grantees Poster Session

Recent rapid advances in Natural Language Processing (NLP) have greatly enhanced the effectiveness of Intelligent Tutoring Systems (ITS) as tools for healthcare education. These systems have the potential to improve health-related quality of life (HRQoL) outcomes, especially for low-literacy populations, such as members of the Hispanic community with limited reading and writing skills. However, despite the progress in pre-trained multilingual NLP models, a noticeable research gap remains around code-switching in the medical context. Code-switching is a prevalent phenomenon in multilingual communities, where individuals move seamlessly between languages during conversations. It poses a distinctive challenge for healthcare ITS that serve multilingual communities, because such systems must both understand code-switching and adapt to it accurately, a problem that has so far received limited research attention.

Our hypothesis is that developing an ITS for healthcare education that is culturally appropriate for a Hispanic population that frequently code-switches is both achievable and practical. Because text classification is a core problem in many ITS tasks, such as sentiment analysis, topic classification, and smart replies, we target text classification as the application domain in which to validate our hypothesis.

Our model relies on pre-trained word embeddings to provide rich representations of code-switched medical text. Training such embeddings is difficult, especially in the medical domain, because training corpora are limited. To address this challenge, we select separate English and Spanish embeddings, each trained on medical corpora, and merge them into a unified vector space via a space transformation. We demonstrate that singular value decomposition (SVD) can be used to learn a linear transformation (a matrix) that aligns the monolingual vectors of the two languages in a single meta-embedding. As an example, we measured the cosine similarity between the words “cat” and “gato” before and after alignment: the score rose from 0.52 before alignment to 0.64 after. This illustrates that aligning the word vectors in a meta-embedding increases the similarity between words that share the same meaning in their respective languages.

To assess the quality of the meta-embedding representations under code-switching, we used a neural network to perform text classification on code-switching datasets. Our results demonstrate that, compared to pre-trained multilingual models, our model can achieve high performance in text classification tasks while using significantly fewer parameters.
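
The SVD-based alignment described above can be illustrated with a minimal sketch. It assumes the transformation is constrained to be orthogonal (the orthogonal Procrustes formulation, one common way to learn such a mapping with SVD); the abstract does not state which formulation the authors use. The function names, the dimensions, and the synthetic vectors are all hypothetical stand-ins for real English and Spanish medical embeddings and a bilingual dictionary of translation pairs.

```python
import numpy as np

def learn_alignment(src, tgt):
    """Return the orthogonal matrix W minimizing ||src @ W - tgt||_F.

    src and tgt hold one row per translation pair (assumed dictionary).
    """
    u, _, vt = np.linalg.svd(src.T @ tgt)
    return u @ vt

def cosine(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Synthetic stand-in data (assumption): the "Spanish" space is a hidden
# rotation of the "English" space plus a little noise.
rng = np.random.default_rng(0)
dim, n_pairs = 8, 50
english = rng.normal(size=(n_pairs, dim))
hidden_rotation, _ = np.linalg.qr(rng.normal(size=(dim, dim)))
spanish = english @ hidden_rotation + 0.01 * rng.normal(size=(n_pairs, dim))

# Learn a map from the Spanish space into the English space.
W = learn_alignment(spanish, english)
aligned = spanish @ W

# Mean cosine similarity of translation pairs before and after alignment.
before = np.mean([cosine(e, s) for e, s in zip(english, spanish)])
after = np.mean([cosine(e, s) for e, s in zip(english, aligned)])
print(f"mean cosine before: {before:.2f}, after: {after:.2f}")
```

With real embeddings the gain is typically smaller than on this synthetic data, since the two spaces are only approximately related by a rotation; the reported change for “cat” and “gato” (0.52 to 0.64) is an example of such a partial improvement.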

Authors

Dr. Zechun Cao http://orcid.org/0000-0002-4542-7791 Texas A&M University, San Antonio

Zechun Cao received his master's and Ph.D. degrees in computer science from the University of Houston. His research lies at the intersection of cybersecurity, privacy, and artificial intelligence (AI). His doctoral thesis centers on developing network and host intrusion detection methods that leverage intelligent user behavior recognition. He also collaborates with economists and city planners on devising AI algorithms that have long-lasting real-world impact. More recently, he has focused on designing algorithms and tools to keep users' private and confidential data secure in an AI-driven world. Dr. Cao's work has been published in international conferences and journals. He is a member of ACM and IEEE and has served as a TPC member and reviewer for various journals and international conferences.
German Zavala Villafuerte
Ali Jalooli
Renu Balyan
Sanaz Rahimi Moosavi
Francisco Iacobelli Northeastern Illinois University

Dr. Iacobelli is a computer scientist whose research lies at the intersection of human-computer interaction, natural language processing, education, and artificial intelligence. He applies this research to healthcare and to bridging health disparities. Dr. Iacobelli is an associate professor in the Computer Science Department at Northeastern Illinois University, where he has taught since 2011. He is also an associated faculty member of the Center for Advancing Safety in Machine Intelligence (CASMI) at Northwestern University.
