Prior research has highlighted the significant challenges that students face when navigating database systems, particularly in mastering SQL and NoSQL query languages. These challenges typically fall into two categories: syntax errors and semantic errors. Syntax errors occur when a student's query violates the grammatical rules of the SQL or NoSQL language, resulting in a query that the database system cannot execute. Semantic errors, by contrast, arise when a query is syntactically correct but does not produce the expected result because it does not accurately reflect the student's intention or understanding of the data. Previous research has identified numerous common error types and overarching learning hurdles among learners, with a predominant focus on syntax errors.
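The distinction between the two error categories can be illustrated with a minimal sketch. It assumes a hypothetical `students` table, a hypothetical exercise ("list the names of students with a GPA above 3.0"), and a hypothetical `classify` helper that compares a student's result set with that of a reference query on one in-memory SQLite database. It is not the feedback system or the GPT-based approach described in this paper, and result comparison on a single database instance is only a heuristic, since an incorrect query can match the reference by coincidence.

```python
import sqlite3

# Hypothetical schema and data, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, gpa REAL);
    INSERT INTO students VALUES (1, 'Ada', 3.9), (2, 'Ben', 2.4), (3, 'Cy', 3.1);
""")

# Hypothetical reference solution for the exercise.
REFERENCE = "SELECT name FROM students WHERE gpa > 3.0"


def classify(student_sql, reference_sql=REFERENCE):
    """Return 'syntax error', 'semantic error', or 'correct'."""
    try:
        student_rows = conn.execute(student_sql).fetchall()
    except sqlite3.Error:
        # The database rejected the query. Simplification: this also
        # catches errors such as unknown table or column names.
        return "syntax error"
    reference_rows = conn.execute(reference_sql).fetchall()
    # The query ran; compare its rows with the reference, ignoring order.
    if sorted(student_rows) != sorted(reference_rows):
        return "semantic error"
    return "correct"


# Misspelled keyword: the database cannot execute the query.
print(classify("SELEC name FROM students WHERE gpa > 3.0"))
# Valid SQL, but the comparison operator is reversed.
print(classify("SELECT name FROM students WHERE gpa < 3.0"))
# Valid SQL that returns the same rows as the reference.
print(classify("SELECT name FROM students WHERE gpa > 3"))
```

The second query is the case that is hard to catch automatically: the database executes it without complaint, so only a comparison against the intended result reveals the mistake.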
However, semantic errors, which are equally critical to students’ learning and proficiency in database systems, have received far less study and categorization. Our study aims to fill this gap by improving the precision and efficiency with which semantic errors are identified in student submissions of SQL and NoSQL queries. To do so, we integrate a Generative Pre-trained Transformer (GPT) model with an existing feedback system. We fine-tuned our GPT models on diverse datasets of student submissions. This tailored training enables the models to better recognize and highlight semantic errors while providing constructive and meaningful feedback. Because the fine-tuned models capture common student errors more reliably, error detection accuracy and feedback quality improved notably.
Preliminary results from our research are encouraging and highlight the potential of large language models in database learning. By integrating these tools into the learning environment, our study lays the groundwork for intelligent systems that offer nuanced, context-aware feedback. Such systems could substantially reduce the learning obstacles associated with database systems, thereby enhancing the educational experience and support available to students.