Many courses use discussion forums such as Piazza to support learning outside the classroom, and students rely on built-in search tools, including Piazza's native search, to find relevant information. However, these tools typically use keyword-based matching and lack semantic understanding. As a result, when students phrase their queries differently from how related posts are written, irrelevant threads are often retrieved, causing students to miss existing answers and ask duplicate questions. To address this gap, we propose PiazzaPlus, a hybrid semantic-keyword retrieval engine that helps students find relevant answers. Our pipeline first preprocesses posts: it cleans the text, auto-captions images using a large multimodal model (LMM), and concatenates the captions with the text body of each post. Preprocessed posts are chunked into documents, which are vectorized using an embedding model, and the resulting vectors are stored in a vector database. At query time, a keyword search retrieves the top 100 keyword-matching posts, which a semantic stage then re-ranks by similarity in meaning to the user's query. The hybrid search is accessible through a browser extension. To date, we have evaluated PiazzaPlus quantitatively using three retrieval metrics: mean average precision (MAP), the average of the precision scores at the ranks where relevant items occur, averaged across queries; mean reciprocal rank (MRR), the average across queries of the reciprocal of the rank of the first relevant result; and average Precision@3 (P@3), the fraction of the top three results that are relevant, averaged across queries. Relevance judgments by a human expert indicate that our engine achieved a 102.5% improvement in MAP, a 143.5% improvement in MRR, and a 72.4% improvement in P@3 over Piazza's native search.
An independent evaluation using an LLM-as-a-judge approach showed corresponding improvements of 138.4% in MAP, 165.5% in MRR, and 102.7% in P@3. These results support our hypothesis that semantic–keyword integration resolves context-mismatch issues. We also discuss qualitative feedback from instructors, based on students' informal comments. This feedback informs our future work: scaling deployment, conducting field trials, collecting formal feedback from students, and exploring LLM-driven summarization to further facilitate learning.
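For readers unfamiliar with the three metrics named in the abstract, the following is a minimal Python sketch of how MAP, MRR, and P@3 can be computed from per-query relevance judgments. It is an illustration under stated assumptions, not the authors' evaluation code: the function names, the boolean-list input format, and the toy judgments are all hypothetical, and average precision is normalized by the number of relevant items that appear in the ranked list, following the wording of the definition above.

```python
from statistics import mean


def average_precision(relevance):
    """Mean of precision values taken at each rank where a relevant item occurs.

    `relevance` is a ranked list of booleans (True = judged relevant).
    Assumption: normalizes by the relevant items present in the list.
    """
    hits = 0
    precisions = []
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            precisions.append(hits / rank)
    return mean(precisions) if precisions else 0.0


def reciprocal_rank(relevance):
    """Reciprocal of the rank of the first relevant result, or 0.0 if none."""
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            return 1.0 / rank
    return 0.0


def precision_at_k(relevance, k=3):
    """Fraction of the top-k results that are relevant."""
    return sum(relevance[:k]) / k


def evaluate(judgments, k=3):
    """Average each metric across queries (one ranked list per query)."""
    return {
        "MAP": mean(average_precision(r) for r in judgments),
        "MRR": mean(reciprocal_rank(r) for r in judgments),
        f"P@{k}": mean(precision_at_k(r, k) for r in judgments),
    }


def percent_improvement(baseline, system):
    """Relative improvement of `system` over `baseline`, in percent."""
    return 100.0 * (system - baseline) / baseline


if __name__ == "__main__":
    # Toy judgments for two hypothetical queries (not data from the paper).
    toy_judgments = [[True, False, True], [False, True, False]]
    print(evaluate(toy_judgments))
```

Under this sketch, a reported improvement is the relative gain of one system's score over the baseline's score on the same metric, which is why values above 100% are possible: the score more than doubled.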
The full paper will be available to logged-in and registered conference attendees once the conference starts on June 21, 2026, and to all visitors after the conference ends on June 24, 2026.