How Satellite Data Can Aid the Identification of Underserved and Unelectrified Communities
Introduction
Over the last decade, the use of satellite imagery has become increasingly widespread in the data science community. New satellites launched in recent years offer higher spatial and temporal resolutions than were previously possible. In addition, the democratization of satellite data has been accelerated by the abundance of companies offering images at low cost or, in many cases, for free. These advances have considerably broadened the scope of satellite image analysis, with many practitioners now able to employ sophisticated techniques, such as machine learning, to enable new applications of this technology.
While there is much debate about the ethical dilemmas posed by the ubiquity of satellite data in our modern age, one area in which it can be leveraged for good is the identification of underserved and endangered communities. Indeed, there are many case studies that exemplify the utility of satellite data to aid with humanitarian and social causes, from detecting communities without access to electricity to pinpointing households affected by natural disasters for fast rescue and relief efforts [1-4]. This use of satellite imagery to help people in need across the world will only increase as both the data analysis tools and the satellite technology continue to improve.
As part of our ongoing efforts to alleviate poverty and improve living conditions in remote areas here at Lotus Project, we developed an RDM index to allow for the identification of regions and villages in need of electrification. Several data sources were used to calculate the RDM index, one of which was satellite image data. Furthermore, we were also able to utilize satellite data to produce a deep learning model capable of identifying small buildings in rural villages. In combination with other satellite images and geospatial data, this model may allow for more direct identification of villages without access to electricity.
The features of satellite data
Satellite data is often associated with the types of images found on services such as Google Maps or Bing Maps. However, satellites capture an incredible range of data that allows for applications far beyond common mapping solutions. In addition to capturing visible light, satellites can also detect other wavelengths in the electromagnetic spectrum, such as near-infrared (NIR) and short-wave infrared (SWIR). The range and versatility of the data recorded by the multispectral instruments found on satellites such as Sentinel-2 [5] have led to the emergence of new types of imaging techniques for a wide array of uses.
One imaging technique that is a staple of many remote sensing applications is NDVI (normalized difference vegetation index). As the name suggests, NDVI is calculated by normalizing the difference between the red light and NIR bands, shown by the following equation:
NDVI = (NIR-Red)/(NIR+Red)
Using red visible light and NIR to calculate the vegetation index works because the chlorophyll found in live vegetation strongly absorbs red light, while the cellular structures of leaves reflect NIR light. The resulting NDVI image contains values from -1 to +1, with values close to +1 indicating dense, healthy vegetation and values near zero or below typically indicating bare ground, built-up surfaces, or water. This allows NDVI to be used to identify the level of vegetation present in a given area and, by extension, to track changes in vegetation (e.g. deforestation). In regions with high levels of vegetation, such as Vietnam, NDVI imaging can also be used to easily distinguish between developed urban environments and small rural settlements. For example, the NDVI images for the Vietnamese capital, Hanoi, and the smaller city of Tuyên Quang demonstrate how this vegetation index produces greatly contrasting images for each location.
Built-up areas such as Hanoi have far lower NDVI values than more rural cities such as Tuyên Quang. This is observed throughout Vietnam, where small villages appear as very small dark patches amongst the lighter areas of high vegetation. As such, NDVI values can be used in the RDM index to distinguish densely built-up areas from more rural settlements.
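The NDVI formula above can be sketched in a few lines of Python with NumPy. This is a minimal illustration rather than the code used for the RDM index: the function name `ndvi` and the reflectance values are made up for the example, and in practice the two arrays would be read from the red and NIR bands of a satellite product (bands 4 and 8 in the case of Sentinel-2).

```python
import numpy as np


def ndvi(nir, red):
    """Return NDVI = (NIR - Red) / (NIR + Red) for every pixel.

    Pixels where NIR + Red is zero are set to NaN instead of dividing by zero.
    """
    nir = np.asarray(nir, dtype=np.float64)
    red = np.asarray(red, dtype=np.float64)
    total = nir + red
    out = np.full(total.shape, np.nan)
    # Only divide where the denominator is non-zero; other pixels stay NaN.
    np.divide(nir - red, total, out=out, where=total != 0)
    return out


# Made-up reflectance values for a 2x2 patch (illustrative only).
nir_band = np.array([[0.50, 0.30], [0.10, 0.00]])
red_band = np.array([[0.10, 0.30], [0.30, 0.00]])
index = ndvi(nir_band, red_band)
```

In this toy patch, the top-left pixel reflects far more NIR than red and so scores highest, while the bottom-left pixel reflects more red than NIR and comes out negative.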
Another type of satellite data that is often studied is night light coverage. The night light activity in a country or region is generally considered to be a good proxy for its economic development, with several studies showing a correlation between the two [6-8]. Night light data is directly relevant to our goal of identifying unelectrified villages, and it becomes especially useful when combined with the location of the electrical grid. Below is the night light data for northern Vietnam with a spatial resolution of approximately 460 m/pixel, overlaid with the electrical transmission lines.
As one would expect, the city of Hanoi and its surroundings are brightly lit and served by a dense network of electrical transmission lines. There are also many lit spots away from the national grid, indicating off-grid electrical solutions. The areas of particular interest, however, are the dark regions away from electrical transmission lines. It is here that there may be unelectrified villages in need of sustainable energy solutions.
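The idea of looking for dark regions away from transmission lines can be expressed as a simple raster operation. The sketch below is an illustration only: it assumes the night light data and the transmission lines have already been rasterised onto the same pixel grid, and the function name `dark_off_grid_mask`, the radiance values, the darkness threshold, and the minimum distance are all hypothetical. The brute-force distance calculation is suitable only for small arrays; a distance transform would be the usual choice for a full-size raster.

```python
import numpy as np


def dark_off_grid_mask(radiance, grid_mask, dark_threshold, min_distance_px):
    """Flag pixels that are dim and at least min_distance_px from any grid pixel."""
    radiance = np.asarray(radiance, dtype=np.float64)
    grid_mask = np.asarray(grid_mask, dtype=bool)

    rows, cols = np.indices(radiance.shape)
    pixels = np.stack([rows.ravel(), cols.ravel()], axis=1)  # shape (N, 2)
    grid_pixels = np.argwhere(grid_mask)                     # shape (G, 2)

    if len(grid_pixels) == 0:
        # No grid at all: every pixel is infinitely far from a line.
        distance = np.full(radiance.shape, np.inf)
    else:
        # Distance from every pixel to its nearest grid pixel (brute force).
        diff = pixels[:, None, :] - grid_pixels[None, :, :]
        distance = np.sqrt((diff ** 2).sum(axis=2)).min(axis=1)
        distance = distance.reshape(radiance.shape)

    return (radiance < dark_threshold) & (distance >= min_distance_px)


# Made-up 4x4 example: a transmission line runs down the first column.
radiance = np.array([
    [9.0, 5.0, 0.1, 0.1],
    [8.0, 4.0, 0.2, 0.0],
    [7.0, 0.1, 0.1, 3.0],
    [9.0, 0.2, 0.0, 0.1],
])
grid_mask = np.zeros((4, 4), dtype=bool)
grid_mask[:, 0] = True

mask = dark_off_grid_mask(radiance, grid_mask, dark_threshold=0.5, min_distance_px=2)
```

At roughly 460 m/pixel, a minimum distance of 2 pixels corresponds to a little under 1 km. The bright pixel in the third row is excluded even though it is far from the line, which mirrors the lit, off-grid spots described above.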
Building detection
After identifying areas where small settlements lacking electricity may be present, the next step would be to find said settlements. However, to do so manually would be inefficient and impractical, especially when expanding the search beyond Vietnam. This is precisely the type of scenario where we can employ deep learning to help identify villages and save us a great deal of time.
The first step of the process was to create a dataset to train our deep learning model. Satellite images were acquired using SAS Planet to provide the best possible spatial resolution at no cost. Buildings were annotated by hand with bounding boxes using Make Sense, after which both the images and the annotations were processed using Roboflow to augment the training set and export the files ready for use with our model. The training set created to date consists of almost 1,700 images, containing a wide variety of building densities and types.
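When a training set for object detection is augmented, each transformation applied to an image must also be applied to its bounding boxes. The sketch below shows this for a horizontal flip, chosen purely as an example: the text does not state which augmentations were applied, and this is not Roboflow's implementation. Boxes are assumed to be stored as (x_min, y_min, x_max, y_max) in pixel coordinates, and the function name `hflip_with_boxes` is hypothetical.

```python
import numpy as np


def hflip_with_boxes(image, boxes):
    """Mirror an image left-right and mirror its bounding boxes to match."""
    width = image.shape[1]
    flipped = np.fliplr(image)
    # x coordinates are reflected about the image width; y is unchanged.
    flipped_boxes = [
        (width - x_max, y_min, width - x_min, y_max)
        for (x_min, y_min, x_max, y_max) in boxes
    ]
    return flipped, flipped_boxes


# Toy 4x6 image with one bright "building" in the left-hand columns.
image = np.zeros((4, 6), dtype=np.uint8)
image[1:3, 0:2] = 255
boxes = [(0, 1, 2, 3)]

flipped, flipped_boxes = hflip_with_boxes(image, boxes)
```

After the flip, the building sits in the right-hand columns and the mirrored box still encloses it.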
We used the ResNet50 convolutional neural network, a model pre-trained on more than a million images, as our starting point. This was particularly useful as it allowed us to transfer across many learned features, such as edge detection, avoiding the lengthy computation time that would be required if starting from scratch. From there, we used TensorFlow and Keras to train the model on the training set of building images to optimise its performance for building detection from satellite images.
Early findings show that, despite the relatively small training set, the model is able to predict the locations of buildings in these images with reasonable accuracy, as seen in the examples above. By increasing the size and variety of the training set, we hope to further improve the model’s performance, allowing us to identify buildings in satellite images with a higher degree of confidence. Once this is complete, we will be able to feed the model satellite images of areas that appear dark in the night light data, and use its predictions to obtain the coordinates of potential villages. This will enable us to pinpoint candidate areas more rapidly and precisely; these can then be explored in greater depth to determine which are the most suitable for off-grid renewable energy projects.
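Turning predicted bounding boxes into coordinate data requires mapping pixel positions back to geographic positions. The sketch below assumes the simplest possible case: a north-up image with a known top-left corner and a constant pixel size in degrees. The function name `box_center_to_lonlat`, the origin, the pixel size, and the boxes are all hypothetical, and real imagery would normally carry its own georeferencing information that should be used instead.

```python
def box_center_to_lonlat(box, origin_lon, origin_lat, deg_per_px_x, deg_per_px_y):
    """Map the centre of a pixel-space box to (longitude, latitude).

    box is (x_min, y_min, x_max, y_max) in pixels. The image is assumed to be
    north-up with its top-left corner at (origin_lon, origin_lat).
    """
    x_min, y_min, x_max, y_max = box
    center_x = (x_min + x_max) / 2.0
    center_y = (y_min + y_max) / 2.0
    lon = origin_lon + center_x * deg_per_px_x
    lat = origin_lat - center_y * deg_per_px_y  # pixel rows increase southwards
    return lon, lat


# Made-up detections and georeferencing values (illustrative only).
predicted_boxes = [(100, 200, 120, 220), (300, 40, 340, 60)]
centres = [
    box_center_to_lonlat(box, origin_lon=105.0, origin_lat=22.0,
                         deg_per_px_x=0.00001, deg_per_px_y=0.00001)
    for box in predicted_boxes
]
```

Each entry in `centres` is a (longitude, latitude) pair that could be clustered or cross-referenced against the night light data to shortlist candidate villages.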
Conclusion
The use of satellite image data to tackle problems around the globe is becoming increasingly prevalent. Our work using this technology has already helped us identify possible locations around northern Vietnam that could greatly benefit from off-grid electrical power to deliver quality-of-life improvements and stimulate the local economy. As we continue to leverage satellite data alongside our RDM map, we hope to expand the use of deep learning techniques to explore other developing countries and identify unelectrified communities.
CARLOS NOBLE JESUS