This research paper explores how common curricular design patterns can be systematically extracted from plan of study data. Curricular analysis has become increasingly data-driven, as exemplified by the adoption of Curricular Analytics, a method of quantitatively determining the “complexity” of a plan of study (PoS). The technique represents the pre- and corequisite relationships in a curriculum as a network, which allows degree requirements to be not only visualized but also analyzed using network analysis techniques. An additional advantage of representing curricula as networks is the ability to decompose them into smaller subnetworks representing curricular design patterns, such as the Calculus sequence and the core Mechanics sequence (i.e., Statics, Dynamics, and Strength of Materials). However, given the steep data requirements and the need to standardize course names, there is little work exploring how these curricular design patterns manifest in engineering programs across the United States.
We posed the following research question: “What are common curricular design patterns in programs across the United States for mechanical, electrical, civil, industrial, and chemical engineering?” We leveraged existing data collected as part of an ongoing project connected to the Multiple Institution Database for Engineering Longitudinal Development (MIDFIELD), a data sharing agreement between 21 US institutions. Unlike previous efforts with Curricular Analytics, these data span 13 universities and 5 disciplines over the past 10 years and comprise 494 PoS networks, one of the largest and most diverse samples to date.
We performed data mining in Python using established libraries such as NetworkX and pandas. Network analysis with NetworkX allowed us to create dependency graphs for courses and identify recurring patterns. However, because of the dataset’s broad scope, courses appeared under a multitude of names that varied across institutions and years. We therefore categorized course chains based on subject matter and common sets of courses. Manual generalization across such a large dataset was infeasible, so we turned to OpenAI’s large language model (LLM) API, focusing on the GPT-4 model. We used the API to categorize courses by subject matter and manually verified the results to offset issues with hallucinations.
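To make the dependency-graph step concrete, the following is a minimal sketch of how prerequisite chains might be pulled out of a single plan of study with NetworkX. It is not the pipeline used in this study: the course list, the function names `build_pos_graph` and `prerequisite_chains`, and the choice to define a chain as a path from a course with no prerequisites to a course that unlocks nothing are all assumptions made for illustration.

```python
import networkx as nx

# Hypothetical toy plan of study: (prerequisite, course) pairs.
prereqs = [
    ("Calculus I", "Calculus II"),
    ("Calculus II", "Calculus III"),
    ("Calculus I", "Physics I"),
    ("Physics I", "Statics"),
    ("Statics", "Dynamics"),
]


def build_pos_graph(edges):
    """Build a directed graph with one edge per prerequisite relationship."""
    graph = nx.DiGraph()
    graph.add_edges_from(edges)
    if not nx.is_directed_acyclic_graph(graph):
        raise ValueError("prerequisite relationships must not form a cycle")
    return graph


def prerequisite_chains(graph):
    """Return every path from an entry course to a terminal course.

    Assumption: an entry course has no prerequisites (in-degree 0) and a
    terminal course is a prerequisite for nothing (out-degree 0).
    """
    sources = [node for node, deg in graph.in_degree() if deg == 0]
    sinks = [node for node, deg in graph.out_degree() if deg == 0]
    chains = []
    for source in sources:
        for sink in sinks:
            if source != sink:  # skip isolated courses
                paths = nx.all_simple_paths(graph, source, sink)
                chains.extend(tuple(path) for path in paths)
    return sorted(chains)


graph = build_pos_graph(prereqs)
chains = prerequisite_chains(graph)
for chain in chains:
    print(" -> ".join(chain))
```

In this toy plan the sketch yields two chains, one through the Calculus courses and one through Physics I into Statics and Dynamics. Enumerating all simple paths can grow quickly on dense graphs, so a real analysis might cap chain length or restrict attention to particular subnetworks.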
From mining the data, we obtained common chains along with their counts and the instances in which they appear in networks across the PoS dataset. Twelve types of course sequences had counts higher than 50 across plans of study. These sequences provide the basis for exploring common curricular design patterns (e.g., Calculus, Mechanics) and were observed across multiple subjects, such as Mathematics, Physics, and General Engineering.
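The counting step can be illustrated with a short sketch. It assumes that the chains in each plan have already been mapped to standardized course labels, and that a sequence is counted at most once per plan; the abstract does not specify either detail. The plan identifiers, the data, and the function name `common_sequences` are hypothetical, and the toy threshold of 2 stands in for the threshold of 50 reported above.

```python
from collections import Counter

# Hypothetical input: standardized chains found in each plan of study.
chains_by_plan = {
    "inst_a_mechanical_2015": [
        ("Calculus I", "Calculus II", "Calculus III"),
        ("Statics", "Dynamics"),
    ],
    "inst_b_civil_2015": [
        ("Calculus I", "Calculus II", "Calculus III"),
        ("Statics", "Dynamics"),
        ("Statics", "Dynamics"),  # duplicate within one plan
    ],
    "inst_c_electrical_2016": [
        ("Calculus I", "Calculus II", "Calculus III"),
    ],
}


def common_sequences(chains_by_plan, min_count):
    """Return sequences that appear in more than `min_count` plans.

    Assumption: each sequence is counted at most once per plan, so
    duplicates inside a single plan do not inflate the count.
    """
    counts = Counter()
    for chains in chains_by_plan.values():
        counts.update(set(chains))
    return {seq: n for seq, n in counts.items() if n > min_count}


common = common_sequences(chains_by_plan, min_count=2)
for sequence, n_plans in common.items():
    print(n_plans, " -> ".join(sequence))
```

Counting per plan rather than per occurrence keeps a single plan with repeated chains from dominating the totals; counting every occurrence would be an equally defensible choice under a different reading of "counts".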
Future work with the PoS dataset can take multiple directions. For example, the common sequences themselves have applications in curriculum planning. The scope of the study could also be changed to analyze how course sequencing has changed over the years within a particular discipline or across an institution. We will provide examples of these different design patterns in the paper.