2025 ASEE Annual Conference & Exposition

Case Study: Using Synthetic Datasets to Examine Bias in Machine Learning Algorithms for Resume Screening

Presented at Engineering Ethics Division (ETHICS) Technical Session - Ethics in ML/AI

The increasing use of artificial intelligence (AI) in recruitment, particularly through resume screening algorithms [1], has raised significant ethical concerns. These systems, designed to automate the hiring process by filtering and ranking candidates, rely heavily on machine learning (ML) techniques and historical data to make decisions. However, they can unintentionally perpetuate biases present in that data, leading to discriminatory outcomes. A well-known example is Amazon’s hiring tool [2], which was found to favor male candidates over female candidates because it had been trained on biased historical hiring data.

In this case study, we developed a synthetic dataset designed to mimic the one used in the Amazon case, allowing us to explore similar bias issues. The dataset consisted of artificial resumes generated to reflect a diverse applicant pool; each resume contained demographic information, previous work history, education, skills, and activities. Using this dataset, we trained a machine learning algorithm to rank candidates based on the resumes of current employees at a fictional company. The algorithm was implemented in a Jupyter notebook so that students could later modify and interact with it. The dataset and the corresponding code used to train the model are available on GitHub [3].
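As a minimal sketch of this workflow (the actual notebook lives in the repository [3]), the Python fragment below generates an illustrative synthetic applicant pool and trains a simple classifier on biased historical hiring labels. All field names, distributions, and the model choice here are hypothetical stand-ins, not the ones used in the case-study dataset:

    import numpy as np
    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(seed=0)
    n = 2000

    # Illustrative applicant pool; the real dataset is richer (work
    # history, education, skills, and activities).
    resumes = pd.DataFrame({
        "gender": rng.choice(["male", "female"], size=n),
        "years_experience": rng.integers(0, 15, size=n),
        "gpa": np.round(rng.uniform(2.0, 4.0, size=n), 2),
    })

    # A feature that correlates with gender, standing in for the gendered
    # resume wording (e.g., "women's chess club") in the Amazon case.
    is_male = resumes["gender"] == "male"
    resumes["proxy_activity"] = (is_male ^ (rng.random(n) < 0.15)).astype(int)

    # Biased historical labels: past hiring favored men independently of
    # qualifications, mimicking the biased training data in the incident.
    score = 0.2 * resumes["years_experience"] + resumes["gpa"] + 1.5 * is_male
    resumes["hired"] = (score > score.median()).astype(int)

    # Train a screening model with the explicit demographic column dropped.
    X = resumes.drop(columns=["gender", "hired"])
    y = resumes["hired"]
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
    model = RandomForestClassifier(random_state=0).fit(X_train, y_train)
    print(f"Held-out accuracy: {model.score(X_test, y_test):.2f}")

Because the proxy feature still correlates with gender, removing the explicit demographic column does not remove the bias from the training signal.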

As with Amazon’s tool, the algorithm began to exhibit biased decision-making, favoring certain demographic groups over others. We took special care to highlight that even with explicit demographic information excluded from the resumes, the algorithm still learned to discriminate against previously underrepresented groups through features that act as proxies for demographics. This allowed us to draw attention to the ethical implications of deploying such AI tools in hiring and, more broadly, to the dangers of using AI in any decision-making capacity that can profoundly affect individuals’ lives. The exercise also taught essential problem-solving techniques for addressing these challenges, equipping students with practical tools for developing and evaluating machine learning algorithms in a more ethical manner.
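One concrete evaluation technique of this kind is to compare how often the model selects candidates from each group. The sketch below uses made-up predictions and hypothetical helper functions to compute per-group selection rates and the disparate impact ratio; ratios below 0.8 fail the widely used four-fifths screening rule for adverse impact:

    import numpy as np

    def selection_rates(predictions, groups):
        """Fraction of candidates the model selects within each group."""
        return {str(g): float(predictions[groups == g].mean())
                for g in np.unique(groups)}

    def disparate_impact_ratio(predictions, groups, privileged):
        """Each group's selection rate divided by the privileged group's.
        Ratios below 0.8 fail the common four-fifths rule."""
        rates = selection_rates(predictions, groups)
        return {g: rate / rates[privileged] for g, rate in rates.items()}

    # Made-up predictions: even a model that never saw a gender column can
    # produce unequal selection rates through proxy features.
    preds = np.array([1, 1, 1, 0, 1, 0, 0, 0, 1, 0])
    genders = np.array(["m", "m", "m", "m", "m", "f", "f", "f", "f", "f"])
    print(selection_rates(preds, genders))          # {'f': 0.2, 'm': 0.8}
    print(disparate_impact_ratio(preds, genders, privileged="m"))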

This exercise serves as an interactive framework for students to engage with real-world ethical dilemmas in AI and machine learning. Through this case study, students learned about the importance of ethical oversight in engineering practices, particularly in the development and application of algorithms. They gained hands-on experience in recognizing and addressing bias in AI systems, offering valuable lessons they can carry into professional practice.

We first introduced this case study in a graduate-level course on ethics in automation. However, the case study can be integrated into various other courses, including engineering ethics, machine learning, and data science. It offers an accessible and engaging way to teach both technical and ethical concepts, making it ideal for undergraduate and graduate courses that emphasize the intersection of technology and ethics.

[1] B. Spar and I. Plentenyuk, "Global Recruiting Trends 2018: The 4 Ideas Changing How You Hire," LinkedIn, Jan. 2018. [Online]. Available: https://news.linkedin.com/2018/1/global-recruiting-trends-2018

[2] "Insight: Amazon scraps secret AI recruiting tool that showed bias against women," Reuters, Oct. 2018. [Online]. Available: https://www.reuters.com/article/world/insight-amazon-scraps-secret-ai-recruiting-tool-that-showed-bias-against-women-idUSKCN1MK0AG/.

[3] "EthicsInAI," GitHub repository. [Online]. Available: https://github.com/annikaLindstrom/EthicsInAI

Authors
  1. Annika Haughey, Duke University
  2. Dr. Brian P. Mann, Duke University