At Metis, Students & Alumni Use Data Science for Social Good
Regardless of industry, most organizations and businesses see the same issues rise to the surface. It's a challenge to stretch resources, get the maximum return on investment, and know exactly how to improve upon existing practices and policies. That's where data scientists come in, with their ability to extract information and insights from the growing amount of available data. And while the need for such data experts is well-documented, the need for them in sectors like local government, schools, and nonprofits is not always considered.
At the Metis Data Science Bootcamp, students often use their projects as opportunities to explore issues related to society, public policy, the environment, and other areas of civic interest. There's a growing group of socially conscious data scientists-in-training who are looking to apply their skills to make a positive impact.
Recently, one such student, Jeff Kao, created a final project showing that more than 300,000 pro-repeal net neutrality comments submitted online to the Federal Communications Commission (FCC) were faked. His blog post summarizing the project went viral and was cited in The Washington Post, Engadget, Fortune, and Business Insider, and on Late Night with Seth Meyers.
Below, read about three other Metis graduates who not only chose socially conscious projects while in the bootcamp, but who are now using their data science skills at the Bill & Melinda Gates Foundation, the New York City Department of Education, and Mathematica Policy Research.
Data Scientist, Bill & Melinda Gates Foundation
Metis Graduate, Spring 2017
Recent news has been dominated by coverage of a wide range of troubling natural disasters, coupled with both praise and criticism for the various responses to all the damage and suffering. It's timely then that recent Metis graduate Emily Miller's final project, Targeting Disaster Relief from Space, focused on how data science can improve response accuracy in the event of a natural disaster.
For the project, she used a dataset on Typhoon Haiyan, which struck the Philippines in 2013, and focused on identifying which specific areas suffer the most damage after a storm so that relief efforts can be prioritized. She built a neural network to detect damaged buildings and used the model's predictions to create density maps of damage that highlight priority areas for relief efforts.
Screenshot from Emily Miller's final project
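To give a rough sense of how model predictions can become a density map, here is a minimal Python sketch. It is not Miller's code: the function name `damage_density`, the grid cell size, and the coordinates are all hypothetical, and it assumes the model's output has already been reduced to a list of (x, y) locations for buildings flagged as damaged.

```python
from collections import Counter


def damage_density(points, cell_size):
    """Count flagged buildings per square grid cell.

    points: iterable of (x, y) locations of buildings a model flagged as damaged
    cell_size: side length of one grid cell, in the same units as the coordinates
    """
    counts = Counter()
    for x, y in points:
        # Map each location to the integer index of the grid cell containing it.
        cell = (int(x // cell_size), int(y // cell_size))
        counts[cell] += 1
    return counts


# Hypothetical locations, invented for illustration (not from the project's data).
flagged = [(12.0, 3.5), (14.2, 8.1), (18.9, 1.0), (55.0, 42.0), (13.3, 9.9)]

density = damage_density(flagged, cell_size=20)

# Cells sorted from most to fewest flagged buildings: a simple priority list.
priority = density.most_common()
print(priority)
```

Cells with the highest counts would be the first candidates for relief. A real map would also need geographic coordinates and a plotting layer, which this sketch leaves out.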
Now a Data Scientist at the Bill & Melinda Gates Foundation, Miller is doing this sort of work every day, focusing on areas like agricultural development and poverty alleviation.
"I've been in the field of international development for a number of years because working to help improve the lives of others is very intrinsically motivating to me. Data science alone is quite fun but using it to help tackle some of the world's biggest challenges is a very rewarding space to be in," she said. "Combining data science with agricultural development work is still quite new, so what I most enjoy about my job is getting to show people what's possible — whether that's piloting models or visualizing existing data in new ways to make it more accessible, I feel my work is changing the conversation around how we use data."
Research Analyst in the Office of School Performance for the New York City Department of Education
Metis Graduate, Summer 2015
For her final project at Metis, Erin Dooley worked with data from DonorsChoose.org, a crowdfunding platform that allows teachers to raise money for materials needed in the classroom. While its 70% success rate is certainly impressive, Dooley was curious about the 30% of projects that don't get funded, leaving teachers empty-handed or possibly forced to pay out-of-pocket for needed materials. She used a Decision Tree classifier to predict the success of projects posted by teachers on the site.
"I scored my models based on their precision, rationalizing that it was better to underestimate the probability of success on a project and have a teacher work a little harder to get funding rather than give them a false sense of security. My final model was able to correctly identify about 70-75% of the projects that ended up not getting funded," she said. After that, she built an app on which teachers could enter information about their projects and get a score for how likely they were to get funded.
Screenshot from Erin Dooley's original application.
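The trade-off Dooley describes, preferring to underestimate a project's chances rather than give false reassurance, can be illustrated with precision and recall. The Python sketch below is not her model or her data: the labels are invented, and the helper `precision_recall` is a hypothetical name. Precision on the "funded" class falls whenever an unfunded project is predicted as funded, which is the false sense of security she wanted to avoid. Recall on the "unfunded" class is the share of unfunded projects the model catches.

```python
def precision_recall(y_true, y_pred, positive):
    """Return (precision, recall) for the class named by `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if p == positive and t == positive)
    fp = sum(1 for t, p in pairs if p == positive and t != positive)
    fn = sum(1 for t, p in pairs if p != positive and t == positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Invented outcomes and predictions for ten projects (illustration only).
y_true = ["funded", "funded", "funded", "unfunded", "unfunded",
          "funded", "unfunded", "funded", "funded", "unfunded"]
y_pred = ["funded", "funded", "unfunded", "unfunded", "unfunded",
          "funded", "funded", "funded", "unfunded", "unfunded"]

# Precision on "funded": of the projects predicted to succeed, how many did.
funded_precision, funded_recall = precision_recall(y_true, y_pred, "funded")

# Recall on "unfunded": of the projects that failed, how many were flagged.
unfunded_precision, unfunded_recall = precision_recall(y_true, y_pred, "unfunded")

print(funded_precision, unfunded_recall)
```

In this toy data, one unfunded project is wrongly predicted as funded, while two funded projects are predicted as unfunded. The second kind of mistake is the one Dooley considered more tolerable.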
Fittingly, Dooley is now a Research Analyst in the Office of School Performance for the New York City Department of Education, where she works on a team that builds tools giving school leaders better access to data about their schools' performance.
"I like that my work impacts students, even indirectly," said Dooley. "I also like that there is freedom on my team to explore the questions that I find interesting. Besides the data analysis part of my job, I've also had the opportunity to teach Python to other analysts at the DOE, and work on my presenting and communication skills, which I find harder than the technical aspects of my job!"
Front-End Data Scientist at Mathematica Policy Research
Metis Graduate, Winter 2017
The Metis bootcamp is structured around five projects. While the two examples above are final projects (which give students the freedom to choose their own direction), the following is an example of what's called Project McNulty, which is geared toward supervised learning and visualization.
Metis graduate Eva Ward chose to focus her efforts on the water crisis in Flint, Michigan, comparing what happened there to what's going on in Massachusetts public schools. (Because the Massachusetts Department of Environmental Protection initiated an assistance program with public schools, drinking water quality data from schools across the state is available.)
"As an environmental engineer by training and practice as well as a former AmeriCorps member and public servant, I was very interested in piecing together how this could have happened: my hypothesis is that Flint was at the center of a perfect storm of aging physical infrastructure, mismanagement of the public water supply, and socio-economic characteristics," she wrote in a GitHub repository about the project.
With the Flint framework in mind, she built models to explore the relationship between physical infrastructure, public water supply management indicators, socio-economic characteristics, and drinking water quality in Massachusetts public schools. You can see her full presentation slide deck here and dig into the data she used here.
Ward is now a Front-End Data Scientist at Mathematica Policy Research, an organization dedicated to improving public well-being by bringing high standards of quality, objectivity, and excellence to information collection and analysis.
"I enjoy being able to develop my data analysis and visualization skills by working on a variety of projects focused on different sectors, such as health, human services, and the environment," she said. "Regardless of sector, our projects generally focus on figuring out what works and what doesn't when trying to improve a social program."
When asked why she pursues this line of work, Ward says, "I have always wanted to leave something, even just a little better than I found it."
As the volume of worldwide data continues to grow, as the pace of technology accelerates, and as competition rises alongside it all, the demand for data scientists like these, who see data as a way to make the world a better place, will only continue to increase. And Metis hopes for the continued privilege of training these passionate, forward-thinking individuals.