AI News, Approaching fairness in machine learning

Approaching fairness in machine learning

As machine learning increasingly affects domains protected by anti-discrimination law, there is much interest in the problem of algorithmically measuring and ensuring fairness in machine learning.

Across academia and industry, experts are finally embracing this important research direction that has long been marred by sensationalist clickbait overshadowing scientific efforts.

In this first post, I will focus on a sticky idea I call demographic parity that through its many variants has been proposed as a fairness criterion in dozens of papers.

In a second post, I will introduce you to a measure of fairness put forward in a recent joint work with Price and Srebro that addresses the main conceptual shortcomings of demographic parity, while being fairly easy to apply and to interpret.

So, if you’re interested in the topic, but less so in seeing pictures of Terminator alongside vague claims about AI, stay on for this blog post series and join the discussion.

Domains such as advertising, credit, education, and employment can all hugely benefit from modern machine learning techniques, but some are concerned that algorithms might introduce new biases or perpetuate existing ones.

Historically, the naive approach to fairness has been to assert that the algorithm simply doesn’t look at protected attributes such as race, color, religion, gender, disability, or family status.

Demographic parity requires that the decision be independent of the protected attribute. In the case of a binary decision C\in\{0,1\} and a binary protected attribute A\in\{0,1\}, this constraint can be formalized by asking that \mathbb{P}\{C=1\mid A=1\} = \mathbb{P}\{C=1\mid A=0\}. In other words, membership in a protected class should have no correlation with the decision.
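To make the criterion concrete, here is a minimal sketch in Python of how one might estimate the demographic parity gap from observed decisions; the function name and the empirical-estimate framing are my own, not something from the original post.

```python
import numpy as np

def demographic_parity_gap(decisions, protected):
    """Estimate P(C = 1 | A = 1) - P(C = 1 | A = 0) from samples.

    decisions: 0/1 array of decisions C
    protected: 0/1 array of the protected attribute A
    """
    decisions = np.asarray(decisions)
    protected = np.asarray(protected)
    return decisions[protected == 1].mean() - decisions[protected == 0].mean()

# Demographic parity asks that this gap be (approximately) zero, e.g.
# demographic_parity_gap([1, 0, 1, 1], [1, 1, 0, 0]) == -0.5
```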

For example, in the context of representation learning, it is tempting to ask that the learned representation has zero mutual information with the protected attribute.

Consider, for example, a luxury hotel chain that offers a promotion to a subset of wealthy whites (who are likely to visit the hotel) and a subset of less affluent blacks (who are unlikely to visit the hotel).

Assuming for a moment that such a perfect predictor exists, using C for targeted advertising can then hardly be considered discriminatory as it reflects actual purchase intent.

You also won’t salvage demographic parity by finding more impressive ways of achieving it, such as backpropping through a 1200-layer neural net with spatial attention and unconscious Sigmund activations.

One definition of algorithmic fairness: statistical parity

If you haven’t read the first post on fairness, I suggest you go back and read it, because it motivates why we’re talking about fairness for algorithms in the first place. In this post I’ll describe one of the existing mathematical definitions of “fairness.”

In other words, some people will just have no reason to apply for a loan (maybe they’re filthy rich, or don’t like homes, cars, or expensive colleges), and so the underlying distribution over individuals takes that into account.

In words, it is the difference between the probability that a random individual drawn from the complement of the protected set S is labeled 1 and the probability that a random individual drawn from S itself is labeled 1.

The data points in this dataset correspond to demographic features of people from a census survey, and the labels are +1 if the individual’s salary is at least 50k, and -1 otherwise.

Here a positive value means the classifier is biased against the quoted group, and a negative value means it is biased in favor of that group.

We can generalize statistical parity in various ways, such as using some other specified set in place of S, or looking at discrepancies among different sub-populations or with different outcome labels.

In fact, the mathematical name for this measurement (which measures the discrepancy between two distributions) is the total variation distance. The form we sketched here is a simple case that works only for the binary-label, two-class scenario.
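Under the sign convention used above (a positive value means bias against the group in question), a minimal sketch of this measurement might look as follows; the names `labels` and `in_group` are placeholders of my own, not code from the original post.

```python
import numpy as np

def bias(labels, in_group):
    """Pr[label = 1 | x not in S] - Pr[label = 1 | x in S].

    labels:   array of labels in {-1, +1} (or {0, 1}); 1 is the favourable label
    in_group: boolean array marking membership in the protected set S
    """
    favourable = np.asarray(labels) == 1
    in_group = np.asarray(in_group, dtype=bool)
    return favourable[~in_group].mean() - favourable[in_group].mean()

# For binary labels, the total variation distance between the two conditional
# label distributions is simply |bias(labels, in_group)|; statistical parity
# up to epsilon then asks that this quantity be at most epsilon.
```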

First, you could have some historical data you want to train a classifier on, and usually you’ll be given training labels that tell you whether each example should be labeled +1 or -1. In the absence of discrimination, getting high accuracy with respect to the training data is enough.

If those (possibly biased) training labels are all we have to work with and we don’t know the true labels, then we’d need to apply domain-specific knowledge, which is suddenly out of scope of machine learning.

In other words, it may be that teal-haired people are truly less creditworthy (jokingly, maybe there is a hidden innate characteristic causing both uncreditworthiness and a desire to dye your hair!) and by enforcing statistical parity you are going against a fact of Nature.

Though there are serious repercussions for suggesting such things in real life, my point is that statistical parity does not address anything outside the desire for an algorithm to exhibit a certain behavior.

The obvious counterargument is that if, as a society, we have decided that teal-hairedness should be protected by law regardless of Nature, then we’re defining statistical parity to be correct. We’re changing our optimization criterion and as algorithm designers we don’t care about anything else.

A Gentle Introduction to the Discussion on Algorithmic Fairness

Recent years have seen the rise of machine learning algorithms: after breaking benchmarks on nearly every imaginable computer vision task, machine learning algorithms are now in our homes and back pockets, all the time.

A lesser-known fact is that they have recently begun replacing human decision makers in a number of more sensitive domains, such as the criminal justice system and medical testing.

It turned out that training machine learning algorithms with the standard utility-maximization objectives (e.g., maximizing prediction accuracy on the training data) sometimes resulted in algorithms that behaved in ways a human observer would deem unfair, often specifically towards a certain minority.

One recent example I liked considers how the top-selling image for the search term “woman” in Getty Images’ library of stock photography changed over the last decade: the most popular image in 2007 is a naked woman lying on a bed, while the most popular image in 2017 is of a woman hiking alone on a rocky trail.

In many cases these definitions involve trade-offs with accuracy (i.e., achieving them means necessarily paying a price in terms of the model’s accuracy), but somewhat more unexpected is that many of these definitions also involve trade-offs with one another.

Specifically, we will consider our feature space X to be some observed features that can be computed from a person’s resume (e.g., name, birth year, address, gender, programming languages, university degree and credits, etc.), and the output will be Y=1 if this person should be invited for an interview, and Y=0 otherwise.

In our example, suppose that majoring in physics in high school is highly indicative of an applicant’s future success in the tech industry, even though it’s not the actual knowledge of physics that is required.

The aware approach, on the other hand, uses a process that is not fair in itself (it explicitly uses gender information and learns different classification rules for different parts of the population), but it can actually reach an outcome that is more fair towards the minority.
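As a rough illustration of the difference between the two approaches, here is a toy sketch using synthetic data and scikit-learn; the feature names, the data-generating process, and the per-group modelling choice are all assumptions made up for the example, not the setup discussed in the post.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 1000
gender = rng.integers(0, 2, size=n)                  # protected attribute (toy)
took_physics = rng.binomial(1, 0.2 + 0.3 * gender)   # proxy correlated with gender
experience = rng.normal(5.0, 2.0, size=n)
# Toy "ground truth": who should be invited for an interview.
y = (0.8 * took_physics + 0.1 * experience + rng.normal(0, 0.5, n) > 1.0).astype(int)

X = np.column_stack([took_physics, experience])      # gender deliberately excluded

# Unaware approach: one rule for everyone; the protected attribute is never seen.
preds_unaware = LogisticRegression().fit(X, y).predict(X)

# Aware approach: a separate rule per group, so group-specific proxies can be
# weighted differently for different parts of the population.
preds_aware = np.empty(n, dtype=int)
for g in (0, 1):
    mask = gender == g
    preds_aware[mask] = LogisticRegression().fit(X[mask], y[mask]).predict(X[mask])

# Compare interview rates per group under the two approaches.
for name, preds in [("unaware", preds_unaware), ("aware", preds_aware)]:
    rates = [round(preds[gender == g].mean(), 3) for g in (0, 1)]
    print(f"{name}: interview rate by gender = {rates}")
```

Whether the aware approach actually closes the gap depends entirely on the data; the sketch is only meant to contrast the two mechanisms, not to claim a particular outcome.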

One natural way to come up with definitions for algorithmic fairness is to look at how fairness is defined outside the computer science community and formulate those ideas as mathematical definitions, criteria to which we will hold our algorithms.

The mathematical equivalent of the disparate impact principle in its most extreme form (allowing no adverse effect on members of the protected group) for binary classification tasks is the statistical parity condition: it essentially equalizes the outcomes across the protected and non-protected groups.
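As a sketch of what this condition (and the relaxed variants mentioned next) might look like in code: the framing below is my own and assumes binary decisions and a binary group attribute, and the ratio-style “four-fifths” relaxation is one common reading of an “up to 20%” allowance, not necessarily the one the author has in mind.

```python
import numpy as np

def selection_rates(decisions, protected):
    """Fraction of positive outcomes inside and outside the protected group."""
    decisions = np.asarray(decisions)
    protected = np.asarray(protected, dtype=bool)
    return decisions[protected].mean(), decisions[~protected].mean()

def statistical_parity(decisions, protected, tol=1e-9):
    """Most extreme form: outcomes are (numerically) equal across the groups."""
    p, q = selection_rates(decisions, protected)
    return abs(p - q) <= tol

def relaxed_parity(decisions, protected, slack=0.2):
    """Allow some disparity: the lower selection rate must be within `slack`
    (relatively) of the higher one, e.g. the four-fifths rule for slack=0.2."""
    p, q = selection_rates(decisions, protected)
    hi, lo = max(p, q), min(p, q)
    return hi == 0 or lo / hi >= 1 - slack
```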

The main criticism against the notion of statistical parity and its variants (such as those that allow some disparity, e.g., up to 20%, between the two groups) is very natural: do we really want to equalize the outcomes between the protected and non-protected groups?

There has recently been very public debate on exactly this issue, in light of the computer engineer fired by Google for suggesting women are less suited to certain roles in tech and leadership [3].

Consider, for example, any classification task in which there is a clear causal relationship between the protected attribute and the output variable, e.g., predicting whether an individual will give birth in the next decade.

[5] claims that while it is the responsibility of scientists to bring forth the discussion about the trade-offs, and possibly to design algorithms in which the trade-offs are explicitly represented and available as tuning parameters that can be easily adjusted, it is ultimately up to the stakeholders to determine the trade-offs.

Machine Learning Bias and Fairness with Timnit Gebru and Margaret Mitchell: GCPPodcast 114

Original post: This week, we dive ..

Fairness in Machine Learning

Machine learning is increasingly being adopted by various domains: governments, credit, recruiting, advertising, and many others. Fairness and equality are ...
