In recent years, analytics programs at the undergraduate and master's levels have grown significantly in Industrial and Systems Engineering (ISE) departments at universities across the country. These programs create new challenges for ISE faculty because they require teaching an interdisciplinary blend of software engineering, statistics, simulation, optimization, and business analysis.
Analytics courses frequently involve a significant number of programming assignments to generate, assess, and refine predictive models, and they make extensive use of sample datasets for both teaching and assessment. When teaching analytics techniques, especially predictive analytics, instructors need datasets that exhibit the statistical characteristics they want to discuss, including multicollinearity, interaction effects between variables, skewed distributions, and nonlinear relationships between predictor and response variables. Instructors generally must either search for existing datasets that have these attributes or create them “manually” using programmatic techniques.
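To make the "manual" route concrete, the following is a minimal sketch of how an instructor might hand-build such a dataset with NumPy. It is not the toolset described in this paper; the function name `make_teaching_dataset`, the variable names, and every coefficient are assumptions chosen only for illustration.

```python
import numpy as np


def make_teaching_dataset(n=500, seed=0):
    """Hand-built synthetic dataset (illustrative only; all values are assumptions)."""
    rng = np.random.default_rng(seed)

    x1 = rng.normal(0.0, 1.0, n)
    # Multicollinearity: x2 is mostly a rescaled copy of x1 plus a little noise.
    x2 = 0.9 * x1 + rng.normal(0.0, 0.3, n)
    # Skewed distribution: a lognormal predictor has a long right tail.
    x3 = rng.lognormal(0.0, 0.75, n)
    x4 = rng.uniform(-2.0, 2.0, n)

    noise = rng.normal(0.0, 0.5, n)
    # Response with an interaction effect (x1 * x4) and a nonlinear term (x4 ** 2).
    y = 2.0 * x1 + 1.5 * x1 * x4 + 0.8 * x4**2 + 0.5 * x3 + noise

    return {"x1": x1, "x2": x2, "x3": x3, "x4": x4, "y": y}


data = make_teaching_dataset()
print("corr(x1, x2):", np.corrcoef(data["x1"], data["x2"])[0, 1])
print("x3 mean vs. median:", data["x3"].mean(), np.median(data["x3"]))
```

Even in this small case, each property has to be coded and checked by hand, which is the effort a higher-level specification would aim to remove.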
This paper describes the development of an academic toolset that permits instructors to specify the statistical properties desired in an analytic dataset (using a newly defined high-level dataset specification language), to generate multiple randomized versions of that dataset (using a newly developed Python library), to automate the creation of individualized datasets for each student (to avoid inappropriate collaboration on assignments and take-home exams), and to support automated grading of assignments and examinations.
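One way to picture individualized datasets and automated grading is to derive a reproducible random seed from each student's identifier. The sketch below illustrates that general idea only. It does not use the paper's specification language or library, and `student_seed`, `student_dataset`, the identifiers, and the linear model are all hypothetical.

```python
import hashlib

import numpy as np


def student_seed(student_id: str, assignment: str) -> int:
    """Derive a stable 64-bit seed from hypothetical student and assignment labels."""
    digest = hashlib.sha256(f"{assignment}:{student_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def student_dataset(student_id: str, assignment: str, n: int = 200):
    """Generate a per-student dataset whose true slope is also randomized."""
    rng = np.random.default_rng(student_seed(student_id, assignment))
    true_slope = rng.uniform(1.0, 3.0)  # the answer key differs for each student
    x = rng.normal(0.0, 1.0, n)
    y = true_slope * x + rng.normal(0.0, 0.5, n)
    return {"x": x, "y": y, "true_slope": true_slope}


# Regenerating with the same labels reproduces the data, so a grader can
# recover each student's answer key without storing the datasets.
a = student_dataset("student-001", "hw3")
b = student_dataset("student-002", "hw3")
print(a["true_slope"], b["true_slope"])
```

Because the seed depends only on the two labels, a grading script could rebuild any student's dataset and compare the submitted estimate against `true_slope` within a tolerance.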