Artificial Intelligence Has a ‘Sea of Dudes’ Problem
Earlier this month, Bill Gates took the stage at the Recode conference to talk about philanthropy with his wife, Melinda.
At one of 2015's biggest artificial intelligence conferences—NIPS, held in Montreal—just 13.7 percent of attendees were women, according to data the conference organizers shared with Bloomberg.
To learn to identify flowers, you need to feed a computer tens of thousands of photos of flowers so that when it sees a photograph of a daffodil in poor light, it can draw on its experience and work out what it's seeing.
Speech recognition software with a data set that only contains people speaking in proper, stilted British English will have a hard time understanding the slang and diction of someone from an inner city in America.
If everyone teaching computers to act like humans is a man, then the machines will have a view of the world that is narrow by default and, through the curation of data sets, possibly biased.
Google developed an application that mistakenly tagged black people as gorillas and Microsoft invented a chatbot that ultimately reflected the inclinations of the worst the internet had to offer.
"From a machine learning perspective, if you don't think about gender inclusiveness, then oftentimes the inferences that get made are biased towards the majority group—in this case, affluent white males."
She's also investigated how machine learning systems suffer if the designers don't properly account for gender. "If un-diverse stuff goes in, then closed-minded, inside-the-box, not-very-good results come out."
"It's amazing how many companies are, on the one hand, disappointed with the representation of women in these roles and, on the other hand, happily pushing out hiring content like this," said Snyder, who is a former Amazon employee.
The organization hosts talks and presentations by female researchers, and also has a public directory of several hundred women working in machine learning, giving people a way to reach out to women in the community.
"Some of the cultural issues that play into women not being involved in the field could also lead to important questions not being asked in terms of someone's research agenda."
Not wanting to miss an opportunity to conduct research, Li did a study of the program and found that women who attended it had a statistically significant increase in technical knowledge, confidence, and interest in pursuing careers in AI.
Machines Taught by Photos Learn a Sexist View of Women
Last fall, University of Virginia computer science professor Vicente Ordóñez noticed a pattern in some of the guesses made by image-recognition software he was building.
If a photo set generally associated women with cooking, software trained by studying those photos and their labels created an even stronger association.
The researchers’ paper includes a photo of a man at a stove labeled “woman.” If replicated in tech companies, these problems could affect photo-storage services, in-home assistants with cameras like the Amazon Look, or tools that use social media photos to discern consumer preferences.
Yatskar describes a future robot that when unsure of what someone is doing in the kitchen offers a man a beer and a woman help washing dishes.
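The amplification effect described above can be sketched with a toy calculation. This is a minimal illustration with made-up numbers, not the researchers' method: a dataset where 66 percent of cooking images show women, and an unconstrained classifier that simply learns the majority association, turns a 66/34 skew into a 100/0 one.

```python
from collections import Counter

# Toy training labels (hypothetical data): (activity, gender) pairs
# with the skew the paper describes -- most "cooking" images show women.
train = [("cooking", "woman")] * 66 + [("cooking", "man")] * 34

# Bias in the data: fraction of cooking images labeled "woman".
counts = Counter(gender for _, gender in train)
data_bias = counts["woman"] / len(train)  # 0.66

# A model that predicts the majority gender for each activity (the
# shortcut an unconstrained classifier tends toward) labels *every*
# cooking image "woman" -- the bias is amplified, not just mirrored.
majority = counts.most_common(1)[0][0]
predictions = [majority for _ in train]
model_bias = predictions.count("woman") / len(predictions)  # 1.0

print(f"bias in data: {data_bias:.2f}, bias in predictions: {model_bias:.2f}")
```

The real paper measures amplification across thousands of activity-gender pairs, but the mechanism is the same: optimizing accuracy alone rewards leaning on the correlation.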
Tech companies have come to lean heavily on software that learns from piles of data, after breakthroughs in machine learning roughly five years ago.
When they asked software to complete the statement “Man is to computer programmer as woman is to X,” it replied, “homemaker.” The new study shows that gender bias is built into two big sets of photos, released to help software better understand the content of images.
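The "man is to computer programmer as woman is to X" query works by simple vector arithmetic over word embeddings. The sketch below uses hand-built three-dimensional toy vectors (not real word2vec embeddings) in which one axis loosely encodes gender, just to show how an analogy query surfaces whatever associations the vectors contain:

```python
import numpy as np

# Hand-built toy embeddings (assumed for illustration; real embeddings
# have hundreds of dimensions learned from text). Axis 0 plays the
# role of a gender direction.
emb = {
    "man":        np.array([ 1.0, 0.2, 0.1]),
    "woman":      np.array([-1.0, 0.2, 0.1]),
    "king":       np.array([ 1.0, 0.9, 0.3]),
    "queen":      np.array([-1.0, 0.9, 0.3]),
    "programmer": np.array([ 1.0, 0.1, 0.9]),
    "homemaker":  np.array([-1.0, 0.1, 0.9]),
}

def analogy(a, b, c):
    """Answer 'a is to b as c is to ?' by nearest cosine neighbor of b - a + c."""
    target = emb[b] - emb[a] + emb[c]
    def cos(u, v):
        return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))
    candidates = [w for w in emb if w not in (a, b, c)]
    return max(candidates, key=lambda w: cos(emb[w], target))

print(analogy("man", "king", "woman"))        # -> queen (the desired kind of answer)
print(analogy("man", "programmer", "woman"))  # -> homemaker (the biased kind)
```

The arithmetic is neutral; the troubling answers come entirely from the geometry the training text put into the vectors.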
Both datasets contain many more images of men than women, and the objects and activities depicted with different genders show what the researchers call “significant” gender bias.
In the COCO dataset, kitchen objects such as spoons and forks are strongly associated with women, while outdoor sporting equipment such as snowboards and tennis rackets are strongly associated with men.
Away from computers, books and other educational materials for children often are tweaked to show an idealized world, with equal numbers of men and women construction workers, for example.
“The datasets need to reflect the real statistics in the world.” One point of agreement in the field is that using machine learning to solve problems is more complicated than many people previously thought.
Using machine learning to predict gender
With that in mind, we ran a simple data categorization job, fired up our brand-new CrowdFlower AI feature, and tried to answer just that question: can machine learning predict the gender of a Twitter account?
To put it another way: if you fetch social data about, say, an especially seedy strip club, odds are you're going to get a few more male-authored tweets than female-authored ones.
But, importantly, we did a little something extra: in addition to a swath of random tweets, we also captured each user's profile description (the “about me” text on their profile).
With our data fetched, we ran a data categorization job where we asked our contributors to visit the profile pages of Twitter accounts and judge the gender of each.
We weren’t expecting the model to be super confident about its predictions–after all, each data row had just a single tweet, a profile, and some ancillary information to look at.
Here comes the science: First, our machine learning feature looks at each data row (which in this case is a tweet, a profile, etc.) and the judgment our contributors made for each of those rows.
And since we pulled the colors these accounts used for their links and sidebars, the model was able to look at hex codes and figure out which colors were most often associated with men, women, or brands.
A few words on some of the other predictors we found: we'll get to the anti-predictors in a second, but since a fair share of them predict that an account belongs to a woman, let's look at that data instead.
That, of course, isn’t to say that every female account we saw had one of those, but rather that if a heart appeared in a tweet or profile, our model was very confident that account belonged to a woman.
As we did with the gents, here are a few others worth a comment. Our model also looks at data that appears in the set but is actually unlikely to correlate with a certain account type.
In the end, the model is only about 60% confident it can look at an account, complete with link color, description, and a single random tweet with the word “and” in it, and correctly guess the gender of its owner.
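A classifier of the kind described, built on sparse profile features like link colors and emoji, can be sketched in a few lines. This is a toy naive-Bayes-style model on invented feature names and made-up rows, not CrowdFlower's actual system:

```python
from collections import defaultdict
import math

# Hypothetical training rows: (features seen on a profile/tweet, crowd label).
rows = [
    ({"pink_link_color", "heart_emoji"}, "female"),
    ({"pink_link_color"}, "female"),
    ({"heart_emoji", "word_and"}, "female"),
    ({"blue_link_color", "word_and"}, "male"),
    ({"blue_link_color"}, "male"),
]

label_counts = defaultdict(int)
feature_counts = defaultdict(lambda: defaultdict(int))
for feats, label in rows:
    label_counts[label] += 1
    for f in feats:
        feature_counts[label][f] += 1

def predict(feats):
    """Score each label with add-one smoothing; return the best."""
    best, best_score = None, -math.inf
    for label in label_counts:
        score = math.log(label_counts[label] / len(rows))
        for f in feats:
            p = (feature_counts[label][f] + 1) / (label_counts[label] + 2)
            score += math.log(p)
        if score > best_score:
            best, best_score = label, score
    return best

print(predict({"heart_emoji"}))      # strong predictor in this toy data
print(predict({"blue_link_color"}))  # link color carries signal too
```

A feature like "word_and" appears for both labels, so it barely moves the scores, which is exactly the anti-predictor behavior described above.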
Removing bias from machine learning
This is partly a data problem, but because many of the biases are ‘unconscious’ we only spot them when the machine learning algorithms trained on the biased data sets produce outrageously sexist or racist results.
Their de-biasing system uses real people to identify examples of the types of word connections that are appropriate (brother/sister, king/queen) and those that should be removed from the massive language datasets used to train intelligent systems.
Professor Fei-Fei Li, one of the comparatively small number of female stars in the AI world and a recent arrival at Google, brought this point home at a recent Stanford forum on AI, asking whether AI can bring about the “hope we have for tomorrow” and whether this depends in part on broadening gender diversity in the AI world.
With such an imbalance in the sheer numbers of male and female engineers and with competition for candidates in AI so intense, why is it so hard to get more women in the workplace in engineering roles even with gender neutral and inclusive recruitment practices?
We need to encourage women who have left the industry, after having children or for other reasons, to retrain and come back, and we need to show all female engineers that machine learning companies are great places to work: the right policies should be built in, and we must lead from the top.
How to Fix Silicon Valley’s Sexist Algorithms
The presidential campaign made clear that chauvinist attitudes toward women remain stubbornly fixed in some parts of society.
The resulting data sets, known as word embeddings, are widely used to train AI systems that handle language—including chatbots, translation systems, image-captioning programs, and recommendation algorithms.
This makes it possible for a machine to perceive semantic connections between, say, “king” and “queen” and understand that the relationship between the two words is similar to that between “man” and “woman.” But researchers from Boston University and Microsoft Research New England also found that the data sets considered the word “programmer” closer to the word “man” than “woman,” and that the most similar word for “woman” is “homemaker.” James Zou, an assistant professor at Stanford University who conducted the research while at Microsoft, says this could have a range of unintended consequences.
The researchers also developed a way to remove gender bias from embeddings by adjusting the mathematical relationship between gender-neutral words like “programmer” and gendered words such as “man” and “woman.” But not everyone believes gender bias should be eliminated from the data sets.
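The adjustment the researchers describe amounts to projecting gender-neutral words onto the subspace orthogonal to a learned gender direction. A minimal sketch with toy vectors (assumed values, not the real embeddings) looks like this:

```python
import numpy as np

# Toy vectors; axis 0 stands in for the learned gender direction.
man        = np.array([ 1.0, 0.0, 0.2])
woman      = np.array([-1.0, 0.0, 0.2])
programmer = np.array([ 0.8, 0.5, 0.6])  # leans "male" in the toy space

# Estimate the gender direction from a gendered word pair, then
# neutralize a gender-neutral word by removing its component along it.
g = man - woman
g = g / np.linalg.norm(g)
debiased = programmer - (programmer @ g) * g

print(float(programmer @ g))  # nonzero gender component before
print(float(debiased @ g))    # ~0 after neutralizing
```

In the published method the gender direction is estimated from many pairs (he/she, man/woman, etc.) and crowd workers decide which words should keep their gender association (king, queen) and which should not (programmer, homemaker), but the core operation is this projection.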
“What constitutes a terrible bias or prejudice in one application might actually end up being exactly the meaning you want to get out of the data in another application.” Several word embedding data sets exist, including Word2Vec, created by researchers at Google, and GloVe, developed at Stanford University.
Researchers Combat Gender and Racial Bias in Artificial Intelligence
When Timnit Gebru was a student at Stanford University’s prestigious Artificial Intelligence Lab, she ran a project that used Google Street View images of cars to determine the demographic makeup of towns and cities across the U.S. While the AI algorithms did a credible job of predicting income levels and political leanings in a given area, Gebru says her work was susceptible to bias—racial, gender, socio-economic. She was also horrified by a ProPublica report that found a computer program widely used to predict whether a criminal will re-offend discriminated against people of color.
Companies, government agencies and hospitals are increasingly turning to machine learning, image recognition and other AI tools to help predict everything from the credit worthiness of a loan applicant to the preferred treatment for a person suffering from cancer.
Now in the wake of several high-profile incidents—including an AI beauty contest that chose predominantly white faces as winners—some of the best minds in the business are working on the bias problem.
Even when the data accurately mirrors reality, the algorithms can still get the answer wrong, incorrectly guessing that a particular nurse in a photo or text is female, say, because the data shows fewer men are nurses.
Improving the training data is one way. Scientists at Boston University and Microsoft's New England lab zeroed in on so-called word embeddings—sets of data that serve as a kind of computer dictionary used by all manner of AI programs.
In this case, the researchers were looking for gender bias that could lead algorithms to do things like conclude people named John would make better computer programmers than ones named Mary.
Researchers, including the academic behind the 2011 push toward AI fairness, have also proposed using different algorithms to classify two groups represented in a set of data, rather than trying to measure everyone with the same yardstick. For example, female engineering applicants can be evaluated by the criteria best suited to predicting a successful female engineer and not be excluded because they don't meet criteria that determine success for the larger group.
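The single-yardstick problem can be shown with a toy screening example. The scores and thresholds below are invented for illustration; the point is only that one global cutoff can exclude an entire group whose scores run on a different scale, while group-specific cutoffs tuned to within-group success do not:

```python
# Hypothetical screening-model scores; successful engineers in group B
# happen to score lower on this model's scale than those in group A.
group_a = [0.90, 0.80, 0.75, 0.40]  # successful A engineers score ~0.75+
group_b = [0.70, 0.65, 0.60, 0.30]  # successful B engineers score ~0.60+

# One yardstick for everyone, calibrated on the larger group:
single_threshold = 0.75
passed_b_single = [s for s in group_b if s >= single_threshold]

# Group-specific thresholds, each tuned to what predicts success
# *within* that group:
thresholds = {"A": 0.75, "B": 0.60}
passed_b_group = [s for s in group_b if s >= thresholds["B"]]

print(len(passed_b_single), len(passed_b_group))  # 0 vs 3 qualified B applicants
```

As the article notes, whether this kind of explicitly group-aware treatment is even legal depends on the jurisdiction, which is why the researchers frame it as a legal question as much as a technical one.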
While they see promise in various approaches, they consider the challenge not simply technological but legal too because some of the solutions require treating protected classes differently, which isn't legal everywhere.
Google researchers are studying how adding some manual restrictions to machine learning systems can make their outputs more understandable without sacrificing output quality, an initiative nicknamed GlassBox.
How to Make a Text Summarizer - Intro to Deep Learning #10
I'll show you how you can turn an article into a one-sentence summary in Python with the Keras machine learning library. We'll go over word embeddings, encoder-decoder architecture, and the...
Andrew Ng: Artificial Intelligence is the New Electricity
On Wednesday, January 25, 2017, Baidu chief scientist, Coursera co-founder, and Stanford adjunct professor Andrew Ng spoke at the Stanford MSx Future Forum. The Future Forum is a discussion...
Intro to Amazon Machine Learning
The Amazon Web Services Machine Learning platform is finely tuned to enable even the most inexperienced data scientist to build and deploy predictive models in the cloud. In this video, we...
Build a TensorFlow Image Classifier in 5 Min
In this episode we're going to train our own image classifier to detect Darth Vader images. The code for this repository is here:
Lifecycle of a machine learning model (Google Cloud Next '17)
In this video, you'll hear lessons learned from our experience with machine learning and how having a great model is only a part of the story. You'll see how Google BigQuery, Cloud Dataflow...
Age and Gender Classification Using Deep Learning
Here's What's Really Going on With That Study Saying AI Can Detect Your Sexual Orientation
Last week, scientists made headlines around the world when news broke of an artificial intelligence (AI) that had been trained to determine people's sexual orientation from facial images more...
Large Scale Machine Learning
Dr. Yoshua Bengio's current interests are centered on a quest for AI through machine learning, and include fundamental questions on deep learning and representation learning, the geometry...
Why We Need More Women Working in Data Science @PeejJain #DataTalk
In today's #DataTalk, we had a chance to talk with Payal Jain about the importance of women and diversity in the data science industry. Learn about her work with:
Dinesh Nirmal, IBM | IBM Machine Learning Launch
Dinesh Nirmal, Vice President of Analytics Development for IBM Analytics, sits down with Dave Vellante & Stu Miniman at the IBM Machine Learning Launch Event at the Waldorf Astoria Hotel in...