As Data Science (DS) and Machine Learning (ML) become integral to modern industries, university programs often emphasize modelling and algorithmic theory, while employers increasingly seek graduates proficient across the full data and model lifecycle, a skill set typically required in real-world environments. This exploratory study investigates gaps in DS and ML education that limit graduates’ readiness for production-level work, particularly in Machine Learning Operations (MLOps), data and ML engineering, and model lifecycle management.
Using a multi-methods triangulation design, the research synthesizes data from surveys of recent graduates (n=6) and hiring managers (n=7) with a comparative content analysis of university syllabi (n=5) and entry-level job descriptions (n=5) to assess how academic instruction aligns with industry skill expectations. The central research question guiding this study is: How do academic approaches to teaching data science and machine learning align with industry expectations for deployable, maintainable, and ethical data and model systems?
Results indicate a contrast in priorities: while employers and job postings explicitly require proficiency in containerization, Continuous Integration and Continuous Deployment (CI/CD) pipelines, and model monitoring, sampled academic curricula remain predominantly focused on model development with minimal operational content. Hiring managers reported that graduates are largely unprepared for collaborative team environments, often requiring 6 to 12 months of on-the-job training to reach full productivity. This paper highlights the need for curricular evolution to embrace the software engineering practices necessary for deployable, maintainable ML systems.
http://orcid.org/0000-0002-2274-0152
Purdue Polytechnic Institute, Purdue University – West Lafayette
The full paper will be available to logged in and registered conference attendees once the conference starts on June 21, 2026, and to all visitors after the conference ends on June 24, 2026