Could machine learning put impoverished communities back on the map?
Satellite images reveal enormous amounts of information about oncoming hurricanes, military troop movements and changes to the polar ice cap.
Thanks in part to the work of Stanford computer scientist Stefano Ermon, these images can also help us understand and ultimately assist impoverished communities around the world. He recently embarked on a two-year study that builds on research in which his team created machine learning models that accurately infer poverty and wealth at the community level from satellite imagery. The models draw on nighttime light intensity as well as features visible during the day, such as roads, tall buildings and even swimming pools, to accurately predict whether homes have access to electricity, piped water and sanitation. The program also predicts crop yields at harvest time and could help identify year-to-year changes that allow farmers to recognize and adapt to climate change.
Ermon’s approach begins by analyzing images of towns for which there is solid on-the-ground survey information. The program then teaches itself to find visual patterns, from color intensities to edges, that correlate with wealth or access to piped water. Over time, the program gets better at making predictions about social and economic conditions in areas that have no survey data.
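The train-on-surveyed, predict-on-unsurveyed loop described above can be sketched in a few lines of Python. This is only an illustration under loud assumptions, not the team's method: the two input features, the wealth index and every number are invented, and a simple linear fit stands in for a system that, according to the article, learns its own visual patterns from raw images.

```python
# Illustrative only: every number below is invented, not taken from the study.
# Each town is reduced to two hypothetical features scaled to [0, 1]:
# (nighttime light intensity, share of pixels that look like paved road).

def fit_linear(features, targets, steps=5000, lr=0.1):
    """Fit weights and a bias by plain gradient descent on squared error."""
    n = len(features)
    n_features = len(features[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(steps):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(features, targets):
            error = bias + sum(w * xi for w, xi in zip(weights, x)) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi / n
            grad_b += error / n
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
    return weights, bias

def predict(model, x):
    """Apply the fitted weights and bias to one town's features."""
    weights, bias = model
    return bias + sum(w * xi for w, xi in zip(weights, x))

# Towns with on-the-ground survey data (hypothetical wealth index in [0, 1]).
surveyed_features = [(0.1, 0.05), (0.3, 0.20), (0.6, 0.40), (0.9, 0.80)]
surveyed_wealth = [0.10, 0.30, 0.55, 0.90]

model = fit_linear(surveyed_features, surveyed_wealth)

# A town with imagery but no survey: the model supplies the estimate.
unsurveyed_town = (0.5, 0.35)
estimate = predict(model, unsurveyed_town)
print(f"Estimated wealth index: {estimate:.2f}")
```

The point of the sketch is the division of labor: survey data is needed only for the towns used in fitting, and imagery alone is enough for every town after that.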
It’s often difficult to explain why the system “sees” what it does, because it looks for patterns rather than specific features of the landscape. The program was never taught to recognize swimming pools, for example, but it nevertheless taught itself that swimming pools were tied to wealth and incorporated them into its model for mapping poverty.
“The beauty of the machine learning is we don’t know what exactly the program will look for, but it somehow figures out the right things,” Ermon says.
“Whatever its model is picking up correlates very well with the truth on the ground for where we have survey data.”
Ermon says the work has been an interdisciplinary effort involving Stanford students and researchers, among them Earth scientists David Lobell and Marshall Burke, political scientist Jeremy Weinstein and economist Pascaline Dupas.
In time, Ermon hopes machine learning could use satellite imagery and on-the-ground data to almost literally put the world’s poorest communities back on the map. That’s important, because there is no accurate or current economic data for many remote communities. Some national governments don’t even want to provide it. That makes it difficult to know which areas need help or even which programs are effective.
Examples of the work include:
Poverty and economic growth
Satellite images of nighttime light intensity offer a rough indicator of poverty and wealth, but they can be misleading. Is that dark area a dense but impoverished village or a vast estate owned by one rich family? To get a more accurate measure, Ermon and his colleagues used machine learning to combine the clues from nighttime light intensity with data from daytime images. The system began by analyzing images of areas for which there is good survey data and identifying visual cues that correlated with wealth. Overall, the system maps changes in wealth down to the village level. It also allows the researchers to analyze which areas have seen rising or declining wealth. In Kenya, for example, it shows that wealth is highly concentrated around the capital city of Nairobi but has increased sharply in the southern region near the Tanzanian border.
Access to critical infrastructure
Clean water and sanitation are crucial to improving health, preventing illness and reducing child mortality. For many impoverished areas, however, on-the-ground data about water and sewage infrastructure is limited or out of date. Pipelines are difficult to see by satellite, especially the pipes connected to individual homes. But by training their model on data-rich communities so that it can “learn” about other parts of the world, Ermon and his colleagues inferred the presence of pipes with surprising accuracy: 86% in the case of sewage lines and 74% for water pipes.
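The article does not say how those percentages were computed. One common reading of "accuracy" is the share of communities where the model's yes-or-no answer agrees with the survey, and the short Python sketch below works through that arithmetic. The labels are invented for illustration and the function name is hypothetical.

```python
def accuracy(predicted, surveyed):
    """Fraction of communities where the prediction matches the survey."""
    if len(predicted) != len(surveyed):
        raise ValueError("need one prediction per surveyed community")
    matches = sum(p == s for p, s in zip(predicted, surveyed))
    return matches / len(surveyed)

# Hypothetical labels: True means the community has piped water.
survey_says = [True, True, False, False, True, False, True, False]
model_says = [True, False, False, False, True, True, True, False]

score = accuracy(model_says, survey_says)
print(f"Agreement with survey: {score:.0%}")
```

Here the model agrees with the survey in six of eight made-up communities, so the script reports 75%.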
The machine learning system also improves on current methods for estimating access to electricity. Even though nighttime light intensity is tied directly to electricity, it doesn’t offer much nuance. The light might indicate electricity for a commercial area, for example, but not for individual homes. By piecing together additional clues about what a given locale looks like in daytime satellite images, the new system produces a more nuanced view of life on the ground.
Food security
How much food can a field generate? Can the available agricultural land produce enough to support the local population? Ermon’s machine learning system trained itself to predict agricultural productivity based on the color intensity of fields shown in satellite images. For example, it can predict crop yields at harvest time by analyzing color intensities, water content and temperature during the growing season. And that’s just the start. Over time, he says, it should be possible to map crop disease and the likely agricultural impacts of climate change. That would allow farmers to proactively shift to crops better suited to new conditions.
This work is supported by the Stanford King Center on Global Development.