Building on our prior work, “WIP: Leveraging AI for Literature Reviews” (Gong & Maitra, 2025), which introduced an AI-assisted, step-by-step guideline for new researchers, this study extends the framework to evaluate Retrieval-Augmented Generation (RAG) capabilities. We propose a systematic approach for assessing a single large language model’s (LLM’s) RAG capabilities using three quantitative metrics: correctness, completeness, and compliance, which capture factual accuracy, contextual coverage, and adherence to citation or formatting standards, respectively. These metrics are tested on literature-retrieval and summarization tasks drawn from perovskite solar-cell and additive-manufacturing datasets, allowing a comparison between AI-generated and human-verified reviews. By quantifying how reliably an LLM retrieves, attributes, and integrates external sources, the study establishes a reproducible method for benchmarking AI tools such as ChatGPT, Perplexity, and Elicit in research workflows.
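As an illustration only, the three metrics could be computed as simple ratios once a review has been broken into claims. The abstract does not give the formulas, so everything in this sketch is an assumption: the claim-ID representation, the `score_review` function, the `RagScores` container, and the choice of precision-like and recall-like ratios are hypothetical and are not the authors' published definitions.

```python
from dataclasses import dataclass


@dataclass
class RagScores:
    """Hypothetical container for the three metrics, each in [0, 1]."""
    correctness: float   # share of generated claims that are supported
    completeness: float  # share of reference claims that were covered
    compliance: float    # share of citations that meet the required format


def _ratio(hits: int, total: int) -> float:
    # Return 0.0 for an empty denominator instead of raising.
    return hits / total if total else 0.0


def score_review(generated_claims, reference_claims, citations_ok) -> RagScores:
    """Score one AI-generated review against a human-verified one.

    generated_claims: iterable of claim IDs asserted by the model (assumed input)
    reference_claims: iterable of claim IDs in the human-verified review
    citations_ok: list of booleans, True where a citation meets the standard
    """
    generated = set(generated_claims)
    reference = set(reference_claims)
    supported = generated & reference  # claims found in both reviews
    return RagScores(
        correctness=_ratio(len(supported), len(generated)),
        completeness=_ratio(len(supported), len(reference)),
        compliance=_ratio(sum(citations_ok), len(citations_ok)),
    )
```

Under these assumptions, a review that asserts four claims, three of which appear among five human-verified claims, and that formats three of four citations correctly, would score 0.75 for correctness, 0.6 for completeness, and 0.75 for compliance.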
The second phase explores multi-agent RAG collaboration, modeling interactions among multiple LLMs (e.g., GPT-4, Claude, Gemini) as agents with distinct “personalities.” Each agent is characterized by four behavioral traits (willingness to change, cooperation, verification precedence, and acceptance of reasoning) that influence how information is exchanged and reconciled. Through multi-shot dialogue experiments, we visualize knowledge-exchange networks that reveal patterns of consensus, contradiction, and reasoning acceptance across agents. Preliminary results show that inter-agent verification enhances completeness and factual alignment but may reduce efficiency when compliance constraints dominate. This multi-level RAG evaluation framework links algorithmic accuracy with collaborative reasoning behavior, offering new directions for teaching undergraduates how to critically evaluate, verify, and synthesize AI-generated research outputs within ethical and academically rigorous contexts.
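A minimal sketch of how the agent traits and a knowledge-exchange network might be represented is given below. It is not the authors' implementation: the `AgentProfile` class, the 0-to-1 trait scale, the outcome labels, and the `build_exchange_network` function are all hypothetical names chosen for illustration, and the dialogue log is assumed to already be annotated with an outcome per exchange.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentProfile:
    """Hypothetical agent 'personality'; each trait assumed to lie in [0, 1]."""
    name: str
    willingness_to_change: float
    cooperation: float
    verification_precedence: float
    acceptance_of_reasoning: float


# Assumed outcome labels, mirroring the patterns named in the abstract.
OUTCOMES = ("consensus", "contradiction", "reasoning_accepted")


def build_exchange_network(exchanges):
    """Tally directed, labeled edges from an annotated dialogue log.

    exchanges: iterable of (sender, receiver, outcome) tuples.
    Returns a dict mapping (sender, receiver, outcome) to a count,
    which could then be passed to any graph-plotting tool.
    """
    edges = Counter()
    for sender, receiver, outcome in exchanges:
        if outcome not in OUTCOMES:
            raise ValueError(f"unknown outcome: {outcome!r}")
        edges[(sender.name, receiver.name, outcome)] += 1
    return dict(edges)


# Usage with two made-up agents and a three-turn annotated log.
a = AgentProfile("agent_a", 0.8, 0.9, 0.4, 0.7)
b = AgentProfile("agent_b", 0.3, 0.6, 0.9, 0.5)
network = build_exchange_network([
    (a, b, "consensus"),
    (a, b, "consensus"),
    (b, a, "contradiction"),
])
```

Keeping the outcome as part of the edge key lets consensus and contradiction between the same pair of agents be counted separately, which is one simple way to expose the patterns described above.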
https://orcid.org/0000-0003-4318-9387
Pennsylvania State University, Behrend College
The full paper will be available to logged-in, registered conference attendees once the conference starts on June 21, 2026, and to all visitors after the conference ends on June 24, 2026.