AI News, Five principles for applying data science for social good

Five principles for applying data science for social good

Editor's note: Jake Porway expanded on the ideas outlined in this piece in his Strata + Hadoop World NYC 2015 keynote address, 'What does it take to apply data science for social good?'

It’s a satirical take on our sector’s occasional tendency to equate narrow tech solutions like “software-designed data centers for cloud computing” with historical improvements to the human condition.

Whether you take it as parody or not, there is a very real swell in organizations hoping to use “data for good.” Every week, a data or technology company declares that it wants to “do good,” and there are countless workshops hosted by major foundations musing on what “big data can do for society.” Add to that a growing number of data-for-good programs, from Data Science for Social Good’s fantastic summer program to Bayes Impact’s data science fellowships to DrivenData’s data-science-for-good competitions, and you can see how quickly this idea of “data for good” is growing.

Yes, it’s an exciting time to be exploring the ways new datasets, new techniques, and new scientists could be deployed to “make the world a better place.” We’ve already seen deep learning applied to ocean health, satellite imagery used to estimate poverty levels, and cellphone data used to elucidate Nairobi’s hidden public transportation routes.

At DataKind, we’ve spent the last three years teaming data scientists with social change organizations, bringing the same algorithms that companies use to boost profits to mission-driven organizations so that they can boost their impact.

Hillary Clinton, Melinda Gates, and Chelsea Clinton stood on stage and lauded the report, the culmination of a year-long effort to aggregate and analyze new and existing global data, as the biggest, most comprehensive data collection effort about women and gender ever attempted.

These datasets are sometimes cutely referred to as “massive passive” data, because they are large, backward-looking, exceedingly coarse, and nearly impossible to make decisions from, much less actually perform any real statistical analysis upon.

The promise of a data-driven society lies in the sudden availability of more real-time, granular data, accessible as a resource for looking forward, not just a fossil record to look back upon.

Mobile phone data, satellite data, even simple social media data or digitized documents can yield mountains of rich, insightful data from which we can build statistical models, create smarter systems, and adjust course to provide the most successful social interventions.

To effect social change, we must spread the idea beyond technologists that data is more than “spreadsheets” or “indicators.” We must consider any digital information, of any kind, as a potential data source that could yield new information.

“data science is not overhead.” But there are many organizations doing tremendous work that still think of data science as overhead or don’t think of it at all, yet their expertise is critical to moving the entire field forward.

As data scientists, we need to find ways of illustrating the power and potential of data science to address social sector issues, so that organizations and their funders see this untapped powerful resource for what it is.

It was clear that, like so many other well-intentioned efforts, the project was at risk of gathering dust on a shelf if the team of volunteers couldn’t help the organization understand what they had learned and how it could be integrated into the organization’s ongoing work.

Take, for example, a seemingly innocuous challenge like “providing healthier school lunches.” What initially appears to be a straightforward opportunity to improve the nutritional offerings available to schools quickly involves the complex educational budgeting system, which in turn is determined through even more politically fraught processes.

DataKind is piloting a collective impact model called DataKind Labs that seeks to bring together diverse problem holders, data holders, and data science experts to co-create solutions that can be applied across an entire sector-wide challenge.

The current approach appears to be “get the tech geeks to hack on this problem, and we’ll have cool new solutions!” I’ve opined that, though there are many benefits to hackathons, you can’t just hack your way to social change.

Under this media partnership, we will be regularly contributing our findings to O'Reilly, bringing new and inspirational examples of data science across the social sector to our community, and giving you new opportunities to get involved with the cause, from volunteering on world-changing projects to simply lending your voice.

Organizing Your Social Sciences Research Paper: Writing Field Notes

The way in which you take notes during an observational study is very much a personal decision, developed over time as you become more experienced in observing.

However, all field notes generally consist of two parts: Field notes should be fleshed out as soon as possible after an observation is completed.

Your initial notes may be recorded in cryptic form and, unless additional detail is added as soon as possible after the observation, important facts and opportunities for fully interpreting the data may be lost.

Finding Meaning: The Data Problem with Small Area Predictive Analysis in the Social Sciences (Part 1)

I know that community health and environmental risk factors weigh in heavily when it comes to rates of students put in special education programs as well as overall student performance.

From anecdotal experience, I’ve found these factors, which are outside the control of the student, heavily affect their chances of graduating high school as well as continuing their education beyond high school.

So I set out to see if I could find statistically significant links using Small Area Analysis between educational attainments and environmental factors as well as community health indicators (such as life expectancy, access to alcohol, rate of premature births and many others).

Small Area Analysis would allow me to analyze how trends vary throughout the city, especially because what it means to “be a New Yorker” changes so much that life can be very different between two families living just ten blocks away from each other.

I initially set out to find data from any large metropolitan area (and hopefully from several) with my desired data aggregated into and separated by appropriately sized neighborhoods.

In addition, all the demographic data was concentrated in the earlier columns, so I created a list of the column headings with original_CHS_columns = raw_CHS_data.columns.values.tolist(), after which it was easy enough to drop the first 33 columns.
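As a rough illustration of the step described above, here is a minimal pandas sketch. The real CHS file is not reproduced here, so the small table below is a hypothetical stand-in, and the helper name drop_leading_columns and the column names demo_0 ... demo_32, indicator_a and indicator_b are assumptions made purely for the example. Only the names raw_CHS_data and original_CHS_columns, and the count of 33 columns, come from the text.

```python
import pandas as pd


def drop_leading_columns(df: pd.DataFrame, n_leading: int) -> pd.DataFrame:
    """Return a copy of df without its first n_leading columns (positional)."""
    return df.iloc[:, n_leading:].copy()


# Hypothetical stand-in for the raw CHS table:
# 33 "demographic" columns followed by 2 indicator columns.
columns = {f"demo_{i}": [0, 1] for i in range(33)}
columns["indicator_a"] = [0.1, 0.2]
columns["indicator_b"] = [0.3, 0.4]
raw_CHS_data = pd.DataFrame(columns)

# List the column headings, as in the text.
original_CHS_columns = raw_CHS_data.columns.values.tolist()

# Drop the first 33 (demographic) columns by position.
CHS_data = drop_leading_columns(raw_CHS_data, 33)
print(CHS_data.columns.tolist())
```

Dropping by position is fragile if the column order of the source file ever changes; selecting the headings to drop from original_CHS_columns and passing them to DataFrame.drop(columns=...) may be a safer variant.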

I then had to check for NaN values within the CHS data, and to my consternation many columns contained them. I aggregated those columns into a list and found that all but one (the first) were simply “reliability notes”.
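The author's original snippet for this step is not included in this excerpt. One way the NaN check could look in pandas is sketched below. It is not the author's code: the helper name columns_with_nan and the sample column names are hypothetical, and the tiny table is made-up illustration data rather than CHS values.

```python
import numpy as np
import pandas as pd


def columns_with_nan(df: pd.DataFrame) -> list:
    """Return the names of all columns that contain at least one NaN value."""
    return [col for col in df.columns if df[col].isna().any()]


# Hypothetical miniature table, for illustration only.
sample = pd.DataFrame(
    {
        "life_expectancy": [80.1, np.nan, 78.4],
        "premature_births": [9.0, 8.5, 7.9],
        "reliability_note": [np.nan, "*", np.nan],
    }
)

nan_columns = columns_with_nan(sample)
print(nan_columns)

# Per-column NaN counts help separate sparse "note" columns from real gaps.
print(sample.isna().sum())
```

Inspecting the resulting list by name, as the author describes, is what reveals whether a column holds real measurements with gaps or only annotation text.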
