2026 ASEE Annual Conference & Exposition

Work-in-Progress: PiazzaPlus–Enhancing Piazza Search with Semantic Matching

Presented at Electrical and Computer Engineering Division (ECE) Technical Session 2

Many courses use discussion forums such as Piazza to support learning outside the classroom, and students rely on built-in search tools, including Piazza's native search, to find relevant information. However, these systems typically use keyword-based matching and lack semantic understanding. As a result, when students phrase their queries differently from how related posts are written, irrelevant threads are often retrieved, causing students to miss existing answers and ask duplicate questions. To address this gap, we propose PiazzaPlus, a hybrid semantic-keyword retrieval engine that helps students find relevant answers. Our pipeline starts by preprocessing posts: cleaning up the text, auto-captioning images using a large multimodal model (LMM), and concatenating the captions with the text body of each post. Preprocessed posts are chunked into documents, which are then vectorized using an embedding model, and the resulting vectors are stored in a vector database. At query time, a keyword search retrieves the top 100 keyword-similar posts, which a semantic stage then re-ranks by similarity in meaning to the user's query. The hybrid search is accessible through a browser extension. To date, we have tested PiazzaPlus extensively using quantitative metrics. We measured retrieval quality using mean average precision (MAP), which is the average of precision scores at the ranks where relevant items occur, averaged across queries; mean reciprocal rank (MRR), the average across queries of the reciprocal of the rank of the first relevant result; and average Precision@3 (P@3), the fraction of the top three results that are relevant, averaged across queries. With a human expert judging the relevance of suggested posts, our engine achieved a 102.5% improvement in MAP, a 143.5% improvement in MRR, and a 72.4% improvement in P@3 over Piazza's native search. An independent evaluation using an LLM-as-a-judge approach showed corresponding improvements of 138.4%, 165.5%, and 102.7%, respectively. These results support our hypothesis that semantic-keyword integration resolves context-mismatch issues. We also discuss qualitative instructor feedback based on students' informal comments. This feedback informs our future work: scaling deployment, conducting field trials, collecting formal feedback from students, and exploring LLM-driven summarization to further facilitate learning.
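The three metrics named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' evaluation code: the function names, the toy relevance judgments, and the choice to normalize average precision by the number of relevant items that appear in the ranked list (rather than by all relevant posts in the forum) are assumptions made for illustration only.

```python
from typing import List, Sequence


def average_precision(relevance: Sequence[bool]) -> float:
    """Average of precision values at the ranks where relevant items occur.

    Assumption: normalized by the number of relevant items in this ranked
    list, matching the wording of the abstract's definition.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0


def reciprocal_rank(relevance: Sequence[bool]) -> float:
    """Reciprocal of the rank of the first relevant result (0 if none)."""
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            return 1.0 / rank
    return 0.0


def precision_at_k(relevance: Sequence[bool], k: int = 3) -> float:
    """Fraction of the top-k results that are relevant."""
    return sum(1 for r in relevance[:k] if r) / k


def mean(values: List[float]) -> float:
    """Arithmetic mean across queries (0 for an empty list)."""
    return sum(values) / len(values) if values else 0.0


# Hypothetical relevance judgments for two queries, in ranked order.
# True means the judge marked the result at that rank as relevant.
judgments = [
    [True, False, True],   # query 1: relevant results at ranks 1 and 3
    [False, True, False],  # query 2: relevant result at rank 2
]

map_score = mean([average_precision(j) for j in judgments])
mrr_score = mean([reciprocal_rank(j) for j in judgments])
p3_score = mean([precision_at_k(j, k=3) for j in judgments])

print(f"MAP={map_score:.3f}  MRR={mrr_score:.3f}  P@3={p3_score:.3f}")
```

For the two toy queries, the per-query average precisions are 5/6 and 1/2, so MAP is 2/3; the reciprocal ranks are 1 and 1/2, so MRR is 0.75; and P@3 averages 2/3 and 1/3 to give 0.5.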

Authors

Ibraheem El Sheikha, University of Toronto

Ibraheem El Sheikha is an undergraduate student at the University of Toronto pursuing a BASc in Computer Engineering.

Salma Emara, University of Toronto

Salma Emara is an Assistant Professor, Teaching Stream in the Department of Electrical and Computer Engineering at the University of Toronto. She received her B.Sc. in Electronics and Communications Engineering from the American University in Cairo in 2018, and her Ph.D. in Computer Engineering from the University of Toronto in 2022. Her Ph.D. research focused on improving reinforcement learning algorithms to solve problems in computer networking algorithms. Currently, she is interested in building software tools for programming education and in pedagogical practices that build testing and debugging skills for beginner programmers.

Note

The full paper will be available to logged-in, registered conference attendees once the conference starts on June 21, 2026, and to all visitors after the conference ends on June 24, 2026.


For those interested in:

computer science
engineering
faculty
information technology
undergraduate