Do Diverse Leadership Teams Produce Better Performance?
An analysis challenges the ‘business case for diversity.’
It’s been a dozen years since Harvard’s Sendhil Mullainathan and I sent out nearly 5,000 fictitious resumes in response to help-wanted ads in Boston and Chicago. To half of the resumes, we randomly assigned white-sounding names such as Emily Walsh or Greg Baker. To the other half, we assigned African American–sounding names, such as Lakisha Washington or Jamal Jones. We responded to more than 1,300 employment ads in the sales, administrative support, clerical, and customer-services job categories, and found that white names received 50 percent more callbacks for interviews. Our evidence indicated that hiring managers were discriminating against African Americans on the basis of their names.
Since then, researchers from nearly all continents have devoted much time and effort to accumulating evidence corroborating the notion that discrimination exists. They have convincingly established that outsider groups sometimes don’t get a fair shake in cultures around the globe. I appreciate the necessity of accumulating this evidence. Local policy debates require local evidence. However, maybe because economists have devoted so much attention to measuring the extent of discrimination, little effort has been put toward answering some key questions: What causes discrimination? What does it cost us? And what can we do to mitigate it?
Black people in the United States are less likely to be employed and more likely to be arrested or shot while unarmed. Women are scarce at the top of the corporate, academic, and political ladders despite the fact that (in rich countries at least) they are more likely to get better grades and graduate from college. While many in the media would argue that discrimination is a key force in driving these patterns, convincingly establishing that this is indeed the case has proved difficult.
The study Mullainathan and I conducted has been repeated, like variations on a theme, all over the world, with researchers sending out pairs of fictitious resumes or letters of interest for a potential rental or job, giving some of the applicants a perceived minority trait. By far the bulk of field experiments conducted to measure discrimination have used this “correspondence method,” and studies of labor-market discrimination based on race and ethnic backgrounds have been the most popular application of the method to date. In Peru, whites were compared to indigenous applicants. Han, Mongolians, and Tibetans were compared in China. In Australia, whites were compared to Chinese applicants. In Belgium, nonimmigrants were compared to immigrants. In Ireland, candidates with Irish names were compared to candidates with distinctly non-Irish names. In all these cases, researchers found evidence of discrimination. Correspondence studies have also been used to establish discrimination involving gender, caste, religion, employment status, and appearance.
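For readers who want to see the arithmetic behind a headline figure such as "50 percent more callbacks," the core comparison in a correspondence study can be sketched in a few lines of Python. This is only an illustration: the `Arm` class, the `relative_gap` function, and the counts are hypothetical and are not taken from any of the studies mentioned here.

```python
from dataclasses import dataclass


@dataclass
class Arm:
    """One experimental arm: resumes sent under one type of name."""
    resumes_sent: int
    callbacks: int

    @property
    def callback_rate(self) -> float:
        # Share of resumes in this arm that received a callback.
        return self.callbacks / self.resumes_sent


def relative_gap(majority: Arm, minority: Arm) -> float:
    """Percent by which the majority arm's callback rate exceeds the minority arm's."""
    return (majority.callback_rate / minority.callback_rate - 1) * 100


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
majority = Arm(resumes_sent=1000, callbacks=90)
minority = Arm(resumes_sent=1000, callbacks=60)

gap = relative_gap(majority, minority)
print(f"Majority-name resumes received {gap:.0f} percent more callbacks")
```

Because the names are assigned at random to otherwise comparable resumes, a gap of this kind can be attributed to the name rather than to differences in qualifications.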
Correspondence studies in the housing market have revealed discrimination against Arabic names in Sweden, against blacks and other minority ethnicities in the US and the United Kingdom, and against immigrants (particularly Muslims) in Italy and Spain. Expansion of online platforms allows researchers to use the correspondence method to also study discrimination in retail markets.
More evidence has been accumulated using audit studies, where two people who are similar except for one trait, such as race, ethnicity, or gender, are matched up. They apply for a job or for housing, or bargain for a car, following a similar script. In one study, car dealers quoted lower prices to white men and made higher offers to white women and black men and women. Another demonstrated that fancy restaurants (which pay more) tend to offer more jobs to men, whereas low-price and lower-paying restaurants hire more women. And yet another showed that employers were less likely to call back someone with a criminal record, especially if the job candidate was black. At a sports-card market, minority buyers received worse offers—and experienced minority buyers had to work hard to get the same prices offered to white men.
Discrimination has been further demonstrated in controlled lab settings. One of the most important research insights of the past two decades is that implicit, unintentional, unconscious attitudes can be measured. The implicit-association test (IAT), introduced in 1998, is a computer-based test in which a test taker must categorize up to 60 words and pictures of faces that appear in the center of the screen. A taker of the race IAT has to quickly decide whether a face is African American or European, or whether a word such as “happiness” or “tragedy” is associated with good or bad. Categories appear on either side of the screen in pairs that are either compatible or incompatible with a stereotype. For example, in a compatible version, “African American” and “bad” could appear in one corner, while “white” and “good” are in the other. In the incompatible version, the categories are paired against stereotypes, so that “African American” appears with “good” and “white” with “bad.” The test taker marks her decision by hitting a key on the right or left side of the keyboard. The race IAT revealed an implicit bias against African Americans; most people respond more quickly when “African American” is paired stereotypically, with “bad” rather than “good.”
Neuroscience studies have further demonstrated the role of conscious and unconscious processing. One study showed a correlation between the IAT and amygdala activation, or the fear response, during the processing of black faces.
Goldberg paradigm experiments, which are laboratory versions of audit or correspondence studies, have produced yet more evidence. In the original Goldberg paradigm, students graded written essays that were identical except for the male or female name of the author. This initial experiment demonstrated a bias: females got lower grades unless the essay was on a stereotypically feminine topic.
As discrimination is indeed pervasive, what is causing it? Two workhorse economics models come to drastically different conclusions. In 1957, the late Gary S. Becker, of the University of Chicago, put forth a taste-based explanation, arguing that employers may simply dislike hiring members of a minority. In this model, which was developed for the context of the labor market, some employers may indulge this distaste by refusing to hire, say, blacks; or if they do hire them, they may underpay them. If enough employers do this, a wage differential will emerge in equilibrium between otherwise identically productive minority and majority employees. Racist employers will experience lower profits. And if the conditions of perfect competition are satisfied, discriminating employers will be wiped away and taste-based discrimination will disappear.
However, several other economics papers—by Columbia University’s Edmund S. Phelps, by Stanford’s Kenneth Arrow, and by University of California, Irvine’s Dennis J. Aigner and the late Glen G. Cain of the University of Wisconsin—have argued that discrimination is due to imperfect information. In this model, if an employer (or landlord, or car salesman, and so on) doesn’t know a lot about a person standing in front of her, she will use information she has—or thinks she has—about the group that person belongs to.
Many economists view this “statistical discrimination” as the more disciplined explanation. However, with a few exceptions, field experiments have failed to link proven patterns of discrimination to a specific economic theory that explains the root causes.
Psychologists have made considerable progress in their own understanding of the roots of discrimination, on a largely parallel track, and they have advanced two theories that help nail down the “why” as well as blur the sharp line economists tend to draw between taste-based and statistical explanations.
Early scholarship in psychology viewed prejudice as a form of abnormal thinking and equated it to a psychopathology (think Adolf Hitler) that could be treated by addressing the personality disorders of the subset of the population that was “diseased.” Neuroscience studies have shown that different regions of the brain are activated under conscious versus unconscious processing, suggesting that unconscious processes are distinct mental activities. For example, studies suggest the area of the brain associated with emotions and fear is activated when a person is unconsciously processing black faces, while the conscious processing of the same faces increases brain activity in areas related to control and regulation. Implicit biases are more likely to drive behavior under distracting conditions, ambiguity, or high time pressure and cognitive load.
Most importantly, whether discrimination is taste based or statistical, or something else entirely, its destructive power lies in the way it can turn perceived differences into real ones. Female students who are told that girls aren’t good at math may self-select into nonmath majors, never realizing a potential talent and at the same time emptying the math field of positive, stereotype-defying models for others to follow. If teachers or employers assume that students of a particular race are less smart, they will invest less in them. Thus, discrimination can create or exacerbate existing differences between groups. Prejudice that starts as taste based and inefficient can easily morph into the more “justifiable” form.
It’s a wide-open field for researchers interested in quantifying the costs of discrimination, both to the outsider group or groups and to society as a whole. One preliminary question recently under investigation asks, Are leaders’ biases against some groups affecting the performance of those groups? Two recent field studies in economics provide what I believe are the first field-based answers to this question. In one, three economists—Sciences Po’s Dylan Glover, Harvard’s Amanda Pallais, and William Pariente of Université Catholique de Louvain—studied a French grocery-store chain. They find that the cashiers, many of whom were from Africa, performed worse on days when assigned a manager who was more biased against them. Biased managers put less effort into managing minority workers, were less likely to check on their cashier stations, and demanded less effort from the workers.
Similarly, University of Warwick’s Victor Lavy and Bank of Israel’s Edith Sand looked at primary-school teachers’ biases, and at their students’ achievement. Being assigned to a gender-biased teacher early in school had long-term implications for students regarding their occupational choices and hence their earning ability as adults, the research finds.
There is surprisingly scant field-based literature on the costs and benefits to organizations and groups of the limited diversity that directly results from discrimination. A long literature in political economy and development has tended to emphasize the cost of diversity, in particular ethnic diversity. If members of different groups do not like each other, diversity can create hold-ups and breed conflicts—and business owners ultimately have to make a trade-off between the cost of communication and collaboration and the benefits of diverse viewpoints. There’s little in the field-experiment literature to help them make this decision. In one study, which looked at teams of undergraduate students who started businesses as part of a class, there was a clear benefit to having gender diversity on a team. But some findings are subtle and invite more questions.
And if an organization’s head wants to help undo or weaken discrimination, what should she do? There is research exploring the impact of role models, and some has surprising results. In a project that looks at academic evaluation committees, including a woman on a committee doesn’t necessarily help female candidates, and actually could hurt them. This evidence is fascinating, as well as a little depressing. It would be interesting to see if it also carries through in other settings, such as management or political decisions. And even if there is no direct positive effect from having a woman or minority member in a leadership position, could such presence change social norms, or cause a backlash?
Researchers have started to address these questions, and related ones. But there has been relatively little activity in designing creative field experiments to better document either the consequences of discrimination or interventions that may undermine it. Given the large amount of theoretical and lab-based work that has not yet been taken to the field, a lot of promising future field research is ripe for the picking in this area. The issue of interventions to undermine discrimination is particularly ready for more field experimentation.
The issue of diversity, or lack of it, is dogging Silicon Valley, where the gender and racial makeup of the workforce has changed little since major technology firms including Google and Intel began publicizing their diversity numbers two years ago. US colleges and universities that use affirmative-action practices may be forced to change them, depending on how the US Supreme Court decides a case that is pending as of press time. In the political realm, US voters elected the first black president in Barack Obama and now are faced with a candidate, Hillary Clinton, aiming to be the first female president. With diversity such a central issue today, more economic research is greatly needed. I strongly encourage researchers to take on this work in the near future.
Marianne Bertrand is the Chris P. Dialynas Distinguished Service Professor of Economics at Chicago Booth.
This essay is adapted from “Field Experiments on Discrimination,” a chapter prepared for the Handbook of Field Experiments and coauthored with Esther Duflo, the Abdul Latif Jameel Professor of Poverty Alleviation and Development Economics in the Department of Economics at the Massachusetts Institute of Technology.