The findings suggest that automated, data-driven algorithms incorporating fairness and diversity constraints can lead companies to hire people who appear, on paper, to be less qualified than candidates brought on through a process that ignores demographics. But even measured by employee quality on paper, the cost to a company of achieving a fairer outcome is likely minimal, according to the research.
“If you force a company to hire, on average, 10 women for every 10 men, you might reduce the number of top candidates they hire, such as those with the highest GPA or a degree from an Ivy League school, simply because you added an extra constraint to the search,” says Niazadeh. “But, in reality, you might not hurt the utility of the search by much.” He explains that there may be several optimal ways of hiring people, and while a demographics-blind policy yields the best results in terms of short-term scores, other methods are still reasonable.
The simulations also suggest that this kind of nondiscriminatory practice benefits organizations in the longer run: imposing quotas, even when one group boasts stronger qualifications than another, produces a better workforce (as measured by the hypothetical candidates’ true quality) than hiring on the basis of short-term scores alone. An organization will find better employees if it recruits, say, a 50/50 male-female team in which 16 out of 20 boast Ivy League degrees than if it hires 20 Ivy League graduates, the majority of whom are men. “Imposing socially aware constraints such as demographic parity or [a] quota can even make the search more efficient in terms of true unobserved qualities,” write the researchers.
The exception comes when extreme constraints are imposed in settings where systemic discrimination has created vastly disparate groups in terms of formal qualifications—for example, if the demand were that 10 Black STEM PhDs be hired for every 10 white STEM PhDs, despite the fact that, according to a report commissioned by the Alfred P. Sloan Foundation, only 5 percent of PhD holders in the science, technology, engineering, and math fields were Black as of 2021. Under circumstances such as these, the simulations reveal, positions often go unfilled, reducing the long-term utility of a team because the team itself is smaller than it should be.
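The arithmetic behind unfilled positions can be illustrated with a toy calculation. This is only a sketch, not the researchers' simulation: the function name, the 20 open slots, and the pool of 100 qualified applicants are hypothetical, and the 5 percent share is borrowed from the figure cited above.

```python
def hires_under_parity(slots, pool_a, pool_b):
    """Return (hired_a, hired_b, unfilled) under a strict 1:1 quota.

    Hypothetical helper: each group may fill at most half the slots,
    and neither group can supply more hires than it has candidates.
    """
    per_group = min(slots // 2, pool_a, pool_b)
    unfilled = slots - 2 * per_group
    return per_group, per_group, unfilled


# Assumed pool: 100 qualified applicants, 5 of them (5 percent) from group B.
print(hires_under_parity(slots=20, pool_a=95, pool_b=5))  # (5, 5, 10)
```

In this made-up case, half of the 20 positions stay empty because the smaller group runs out of candidates before the quota is met.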
Many people have thought about algorithmic fairness in decision-making, says Niazadeh. “When it comes to designing machine-learning algorithms for high-stakes applications such as loan decisions, computer scientists and economists have studied algorithms that favor disadvantaged groups. This is in response to evidence that demographics-blind ML algorithms discriminate due to skewed data,” he says. But the “fair” ML algorithms have tended to make straightforward choices based on one-time signals—for example, deciding whether a loan application gets approved on the basis of a potential borrower’s credit history.
Hiring decisions are often more complex in nature. Here, it takes time and resources beyond scanning a résumé to find out if a candidate is any good. Markers of quality are dynamic, since a hiring manager’s opinion of each candidate may change after a first interview, a second interview, and a site visit.
“That’s the technical challenge,” says Niazadeh. “Hiring a person is more complicated than opening Door 1, 2, or 3 and seeing what you get.” The researchers argue that the complexity calls for a Markovian scheduling framework. (A Markovian model, named for the Russian mathematician Andrey Markov, describes a sequence of events in which the probability of each event depends only on the outcome of the one before it.) This framework goes beyond static ML problems and even Weitzman’s indices.
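To make the Markovian idea concrete, here is a minimal sketch of one candidate moving through hiring stages, where the next stage depends only on the current one. The stage names and every probability are invented for illustration; they do not come from the research, and this is a plain Markov chain rather than the researchers' scheduling framework.

```python
import random

# Hypothetical transition probabilities, invented for illustration only.
TRANSITIONS = {
    "resume": {"first_interview": 0.5, "rejected": 0.5},
    "first_interview": {"second_interview": 0.6, "rejected": 0.4},
    "second_interview": {"site_visit": 0.7, "rejected": 0.3},
    "site_visit": {"offer": 0.8, "rejected": 0.2},
}
ABSORBING = {"offer", "rejected"}  # stages where the process stops


def simulate_candidate(rng, start="resume"):
    """Walk one candidate through the stages until an offer or a rejection."""
    path = [start]
    state = start
    while state not in ABSORBING:
        options = TRANSITIONS[state]
        # The next stage is drawn using only the current stage.
        state = rng.choices(list(options), weights=list(options.values()))[0]
        path.append(state)
    return path


def prob_of_offer(state="resume"):
    """Exact chance of eventually reaching 'offer' from a given stage."""
    if state == "offer":
        return 1.0
    if state == "rejected":
        return 0.0
    return sum(p * prob_of_offer(nxt) for nxt, p in TRANSITIONS[state].items())


print(simulate_candidate(random.Random(0)))
print(round(prob_of_offer(), 3))  # 0.5 * 0.6 * 0.7 * 0.8 = 0.168
```

Each interview or visit updates what the employer knows, which is why a candidate's prospects are better described by a path through stages than by a single score.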
While the researchers’ algorithmic approach has the potential to influence hiring in many countries, especially when it involves a sequential search process such as in executive recruiting, Niazadeh predicts that US organizations might balk, given the political and legal questions around diversity and inclusion. Even those open to quotas may find the inner workings of the tools uncomfortable, he adds, because they rely on a degree of randomness: when two candidates appear to be equally qualified, the algorithm essentially flips a coin.
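The coin flip can be pictured with a short sketch. It assumes candidates are compared on a single numeric score, which is a simplification, and the function name and scores are hypothetical rather than taken from the researchers' tools.

```python
import random


def pick_candidate(scores, rng):
    """Pick the top-scoring candidate, breaking exact ties uniformly at random."""
    best = max(scores.values())
    tied = sorted(name for name, score in scores.items() if score == best)
    return rng.choice(tied)


rng = random.Random(42)  # seeded only so the example is repeatable
# "Ana" and "Ben" are tied, so either may be chosen; "Cy" never is.
print(pick_candidate({"Ana": 9, "Ben": 9, "Cy": 7}, rng))
```

Over many such decisions, each tied candidate is chosen about equally often, which is the sense in which the randomness is fair on average.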
But he says that some policymakers have agreed to use randomization in selecting citizens for assemblies or juries and in distributing legislative seats. This approach, he says, helps achieve the optimum outcome under the fairest conditions, on average.