Research-to-impact process
Outlined below are several new ideas and ongoing research projects in our lab. This research feeds into a careful cadence for the development and deployment of products built around the CoRE stack. See this manual as an example of research outputs for many datasets and indicators related to water security, and how the outputs are now being used by field partners. Using data and technology for climate action and environmental sustainability offers much opportunity but requires careful translational work from research to deployment, so this process works well only through collaboration between academia, engineers, and implementation organizations. We are continually refining this process so that it can become a model for translating research to impact at scale.
Skillsets: Most of these projects involve the use of machine learning on geospatial data and statistical analysis, but also span human-computer interaction and systems building for scaling computation. Interested students should write to me.
Last updated – Oct 2024.
Forests and ecology
These projects are being integrated into a broader product vision of guiding forest restoration projects to provide decision support and evidence for communities to manage their forests in a sustainable manner. We want to be able to help spot specific locations requiring restoration, then provide guidance in terms of which tree species to plant in these areas, and build a process to do this in consultation with communities to leverage local ecological knowledge.
Tree Species Biodiversity Estimation and Traits Classification - Shiva, Dhruvi
Biodiverse ecosystems enhance resilience, allowing forests to adapt to changing conditions, such as shifting temperatures and increased pests. A mix of species can improve carbon sequestration, and support healthy soil and water systems, which are vital for ecosystem stability. In this project we aim to monitor tree species biodiversity using satellite imagery. We use PCA and unsupervised clustering methods to identify hotspots of high and low biodiversity in different eco-regions. We are also working on identifying traits such as leaf-shed duration and leaf size, to build a finer assessment of tree species occurrence.
Co-creation partnerships: ATREE, NCF.
Species Distribution Modelling - Jyotiraditya, Shourya, Dhruvi
Utilizing Species Distribution Modeling (SDM), the project evaluates various environmental and climatic factors that influence the distribution and habitat preferences of different tree species. The aim of this study is to support forest restoration planning and conservation efforts by recommending specific species in degraded or less biodiverse sites to improve conservation strategies across India's diverse eco-regions.
Co-creation partnerships: ATREE, NCF.
Forecasting Forest Fires in India Using Climate and Social Variables - Raman, Anaad
In India, large forest fires are usually natural, while small forest fires are typically caused by human activity. Both types of fires can be detrimental to forest resources, biodiversity, and communities that live alongside the forests. Although large forest fires are easier to detect through remote sensing data, small forest fires often go unnoticed. Furthermore, while efforts are made to detect forest fires, there is currently no model capable of forecasting both small and large forest fires. A reliable forecast could provide communities and forest authorities with valuable time to reach the fire site, control the blaze, and evacuate people if necessary. We are building models to detect and forecast both types of fires. Climate variables, such as latent heat flux and long-wave radiation flux, indicate the conditions necessary for a fire to occur. Meanwhile, human factors, including the extent of settlements, development, and the proximity of indigenous populations to the forest, reflect the likelihood of human-caused fires. Our goal is to model these variables to predict the probability of forest fires in specific regions.
Multi-resolution mapping of trees - TBD
We want to develop a methodology to identify a forest patch of a few dozen acres and profile it repeatedly through drone monitoring on a fortnightly basis. When coupled with geo-tagging of trees done by field workers, this will provide us with detailed imagery of changes in tree characteristics such as leaf shedding, flowering, fruiting, etc. We will use computer vision techniques to automatically annotate the forest patch, and then use these annotations as groundtruth data to train models that estimate the tree characteristics from lower-resolution satellite data. This way, through drone-assisted groundtruth data collection in a small area, we will be able to characterize changes in large forest areas. We want to start doing this on the IIT Delhi campus, where we have already done a detailed geo-tagging of trees.
Agro-ecology
These projects are meant to feed into a broader product vision of matching locations with appropriate agroforestry and cropping systems to build climate resilience, improved livelihoods, and food security for local communities.
Plantation Suitability Scoring Model - Anoushka, Shiva, Dhruvi
Tree plantations can be economically beneficial for farmers but could be undertaken in areas that cannot meet the demand for water or other resources of the specific tree species. It is therefore important to understand the suitability of a region for the plantation of a particular tree species. This project aims to assess the suitability of establishing a plantation at a given site by assigning a score that aggregates multiple ecological and social parameters related to climate, soil, topography, etc. Remote sensing data on variables like mean temperature, soil texture, and NDVI is aggregated using weights and suitability classes defined by the study and by domain experts.
Co-creation partnerships: Say Trees, FES.
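The weighted aggregation described for the plantation suitability score can be sketched as follows. This is a minimal illustration rather than the project's model: the variable names, class thresholds, class scores, and weights are all hypothetical placeholders, and categorical inputs such as soil texture are left out for brevity.

```python
from typing import Dict, List, Tuple

# Hypothetical suitability classes per variable: (low, high, class score in [0, 1]).
# All thresholds and scores are illustrative placeholders.
SUITABILITY_CLASSES: Dict[str, List[Tuple[float, float, float]]] = {
    "mean_temperature_c": [(20.0, 30.0, 1.0), (15.0, 20.0, 0.5), (30.0, 35.0, 0.5)],
    "ndvi": [(0.4, 1.0, 1.0), (0.2, 0.4, 0.5)],
}

# Hypothetical weights expressing the relative importance of each variable.
WEIGHTS: Dict[str, float] = {"mean_temperature_c": 0.6, "ndvi": 0.4}


def class_score(variable: str, value: float) -> float:
    """Score of the first class whose [low, high] range contains value, else 0."""
    for low, high, score in SUITABILITY_CLASSES[variable]:
        if low <= value <= high:
            return score
    return 0.0


def suitability_score(site: Dict[str, float]) -> float:
    """Weighted average of the per-variable class scores for one site."""
    total_weight = sum(WEIGHTS.values())
    weighted = sum(WEIGHTS[v] * class_score(v, site[v]) for v in WEIGHTS)
    return weighted / total_weight
```

With these placeholder values, a site with a mean temperature of 25 °C and an NDVI of 0.3 falls in the top temperature class and the middle NDVI class, so its score is 0.6 × 1.0 + 0.4 × 0.5.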
Cropping Diversity Assessment - Prakkhyaat
The importance of diversity in cropping patterns is well recognized: it allows for soil regeneration, better nutrition, and greater resilience to climatic factors or pest attacks. In this study, we use district-wise and season-wise crop production statistics, supplemented with Sentinel-2 spectral data, to estimate cropping diversity in an area. We use PCA and unsupervised clustering methods to process and analyze the data, and calculate biodiversity indices that can be traced over time.
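The text does not say which diversity indices are calculated. As an assumption, the sketch below uses two common choices, the Shannon index and the Simpson index, computed from the share of each crop in the total cropped area. The crop names and areas in the usage note are made up for illustration.

```python
import math
from typing import Dict


def shannon_index(areas: Dict[str, float]) -> float:
    """Shannon diversity H = -sum(p * ln p) over crop area shares p."""
    total = sum(areas.values())
    shares = [a / total for a in areas.values() if a > 0]
    return -sum(p * math.log(p) for p in shares)


def simpson_index(areas: Dict[str, float]) -> float:
    """Simpson diversity 1 - sum(p^2): the chance that two random units differ in crop."""
    total = sum(areas.values())
    return 1.0 - sum((a / total) ** 2 for a in areas.values())
```

Both indices are zero for a single-crop district and grow as area is spread more evenly across more crops, so computing them per district and season gives a series that can be followed over time. For example, `shannon_index({"paddy": 50.0, "maize": 50.0})` equals ln 2 and the matching Simpson value is 0.5.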
Hydrology
These projects aim to improve the underlying datasets and methods built into the Know Your Landscape and Commons Connect tools for water security planning by communities.
Evapotranspiration Downscaling - Vidushi, Utkarsh, Shivani
In countries like India, which have historically been reliant on rainfed agriculture, the increasing need for water for irrigation to support greater cropping intensity and shifts towards horticulture has largely been met through groundwater-based irrigation. Cheap electricity has enabled a rapid increase in borewells almost across the country, which to some extent has enabled more equitable access to water than other irrigation approaches like canals but has also led to groundwater stress in many regions. One way to indirectly estimate groundwater abstraction is to estimate evapotranspiration from cropping areas as a proxy for crop water consumption. Remote sensing has been used to estimate evapotranspiration, but existing open data products largely have a low spatial resolution, which is not adequate to support local decision making for water use. In our initial experiments, we used machine learning methods to downscale MODIS evapotranspiration outputs (originally at 500m) to 30m, using meteorological variables from GLDAS and land surface characteristics from Landsat as input features. We observed that our method was not able to accurately match in situ data but was able to successfully capture relative differences in evapotranspiration. As a further effort to build an accurate and scalable Indian ET product, we plan to conduct the next set of experiments using surface energy balance methods, land cover specific models, and crop specific models to estimate ET at finer scales.
Co-creation partnerships: WELL Labs.
Impact of New Waterbodies on Downstream Waterbodies - Ankit, Shivani
Our CSO partner, FES, has built a
method to assess the groundwater recharge potential of a site to determine its
suitability for water conservation structures like check dams,
trenches, and percolation tanks. During field testing of several of these
outputs, experienced CSO field staff and volunteers indicated that in addition
to assessing this recharge potential, it would be useful to also assess the runoff
accumulation capacity for the proposed site. We are building geospatial algorithms
that simulate rainfall and runoff on digital elevation maps to determine potential
runoff accumulation in drought and non-drought years. Further, once new water
bodies are constructed, they may reduce runoff accumulation in downstream water
bodies. We have built a method to obtain a connectivity graph of water bodies,
and then study how the new water bodies may alter the runoff flow.
Co-creation partnerships: FES.
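One way to use a connectivity graph of water bodies is to list everything that lies downstream of a proposed site, since those are the water bodies whose runoff accumulation could change. The sketch below assumes the graph is already available as a plain adjacency mapping from each water body to the water bodies it drains into. The identifiers are hypothetical, and how the lab derives the graph from elevation data is not shown.

```python
from collections import deque
from typing import Dict, List


def downstream_of(graph: Dict[str, List[str]], start: str) -> List[str]:
    """Water bodies reachable from start by following drainage links, in BFS order."""
    seen = {start}
    order: List[str] = []
    queue = deque(graph.get(start, []))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue  # already visited through another drainage path
        seen.add(node)
        order.append(node)
        queue.extend(graph.get(node, []))
    return order


# Hypothetical example: a new pond drains into tank_a, which feeds lake_b and tank_c.
example_graph = {
    "new_pond": ["tank_a"],
    "tank_a": ["lake_b", "tank_c"],
    "tank_c": ["lake_b"],
    "other_pond": ["lake_b"],
}
```

Here `downstream_of(example_graph, "new_pond")` returns the three water bodies that could receive less runoff once the new pond is built, while `other_pond` is unaffected because it is not downstream.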
Modelling Changes in Groundwater Levels in Shallow and Deep Aquifers - Badrinath, Shivani
Changes in groundwater levels can be correlated with changes in cropping patterns, cropping intensity, construction of groundwater recharge structures, and rainfall patterns, to understand the interplay between hydrological and social aspects impacting groundwater use. The Central Ground Water Board (CGWB) conducts seasonal measurements at about 25,000 observation wells across the country. The data is known to have several limitations, though. The spatial resolution is too coarse to understand groundwater patterns at finer scales such as villages and panchayats, which is the level at which decisions are made. The observation wells mostly measure water table levels in the unconfined shallow aquifer zones. Most regions in India have mixed aquifer systems, and CGWB data often does not tally with ground reports where borewells are used to access groundwater from deeper or confined aquifer zones. The Atal Bhujal Yojana (ABY), a Government of India scheme focused on improving groundwater levels, monitors groundwater levels in deeper aquifers using borewell data. We have implemented a water-balance method to estimate changes in groundwater levels at the micro-watershed scale on a fortnightly basis. We take the precipitation over a micro-watershed, subtract the runoff from it, and further subtract the evapotranspiration from the micro-watershed, leaving the balance as the change in groundwater. The methodology currently does not distinguish between groundwater usage in the shallow and deeper aquifer zones and does not take lateral flow into account. In ongoing research, we are using CGWB well data and ABY borewell data to build a machine learning based calibration model that controls for aquifer properties to get more accurate measures of groundwater changes in shallow and deep aquifers.
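The water-balance step described above amounts to ΔG = P − R − ET for each micro-watershed and fortnight. A minimal sketch follows, assuming all three terms are expressed as water depth in millimetres over the micro-watershed. The function names and the numbers in the usage note are illustrative, and converting the balance into an actual change in water-table depth (which depends on aquifer properties) is not attempted here.

```python
from itertools import accumulate
from typing import List, Tuple


def groundwater_change_mm(precip_mm: float, runoff_mm: float, et_mm: float) -> float:
    """Residual of the water balance for one fortnight: P - R - ET."""
    return precip_mm - runoff_mm - et_mm


def cumulative_change_mm(fortnights: List[Tuple[float, float, float]]) -> List[float]:
    """Running total of the fortnightly residuals for (P, R, ET) tuples."""
    changes = [groundwater_change_mm(p, r, et) for p, r, et in fortnights]
    return list(accumulate(changes))
```

For instance, a wet fortnight with (P, R, ET) = (100, 30, 50) mm followed by a dry one with (10, 0, 40) mm gives residuals of +20 mm and −30 mm, so the running balance goes from 20 mm to −10 mm.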
Identifying groundwater recharge and discharge zones - TBD
Depth to water in wells when measured from a sufficiently high density of wells in an area can reveal the underlying aquifer structure and contours of groundwater availability. This is especially useful to identify groundwater recharge and discharge areas, and guide the planning of water structures.
Co-creation partnerships: WASSAN.
Flood hazard mapping and forecasting - Vibhanshu
Floods affect millions of people globally each year, causing significant loss of life, displacement, and economic damage, with regions like the Kosi River Basin being particularly vulnerable. Accurately identifying areas that face flood hazards and forecasting future flooding events are both useful. We first detect transient surface water presence using Sentinel data to build a hazard map of areas prone to flooding. Using this hazard map, we are then building machine learning models to predict new areas that could see floods in the coming fortnight based on weather forecasting data. We are also trying to learn about streamflow data availability in different river basins so that it can be factored into early-warning systems as well.
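A simple way to turn repeated water detections into a hazard layer is to count how often each pixel is seen as water and flag pixels that are wet on some dates but not nearly always. The sketch below assumes the per-date water masks have already been derived from the satellite data. The 0.9 cut-off separating permanent water from transient water is a hypothetical parameter, and the lab's actual hazard-mapping method may differ.

```python
from typing import List

Mask = List[List[int]]  # one date: 1 = water detected, 0 = no water


def inundation_frequency(masks: List[Mask]) -> List[List[float]]:
    """Fraction of dates on which each pixel was classified as water."""
    n_dates = len(masks)
    rows, cols = len(masks[0]), len(masks[0][0])
    return [
        [sum(mask[r][c] for mask in masks) / n_dates for c in range(cols)]
        for r in range(rows)
    ]


def transient_water(
    freq: List[List[float]], permanent_threshold: float = 0.9
) -> List[List[bool]]:
    """True where water appears on some dates but less often than the threshold."""
    return [[0.0 < f < permanent_threshold for f in row] for row in freq]
```

Pixels that are always water (rivers, reservoirs) and pixels that are never water are both excluded, leaving the occasionally inundated pixels as the candidate flood-prone area.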
Impact assessment and social-ecological interactions
Good modeling can lead to valuable new insights for planning. These projects are meant to integrate into the Know Your Landscape and Commons Connect tools to conduct impact projections and compare different water security plans with one another.
Site Level Impact Assessment of Farm Ponds - Sanya, Anika, Ramneek
NREGA is a significant initiative aimed at promoting sustainable livelihoods by providing demand-driven employment, with a focus on natural resource management projects such as farm ponds. Farm ponds play a crucial role in water conservation, improving irrigation, and enhancing agricultural resilience in drought-prone regions. However, evaluating the impact of these projects has been challenging, as past studies have often relied on traditional survey methods and were limited to small regions, making it difficult to capture broader variability. This project utilizes Double Machine Learning (DML) on remote sensing data to improve the precision of impact assessments and better capture the Heterogeneous Treatment Effects (HTE) of farm ponds on crop yields. DML, combined with Difference-in-Differences (DiD), allows us to control for confounding factors and generate more accurate, region-specific insights. Initial results indicate a significant increase in crop productivity at certain treated sites, with further analysis expected to uncover unobserved factors influencing these outcomes. Moving forward, this approach could optimize NREGA's strategies, enhance policy decision-making, and provide more reliable projections for future impacts, particularly in understanding how natural resource management interventions can be tailored to different environmental contexts.
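The Difference-in-Differences idea used above can be illustrated in its simplest two-group, two-period form: the change at treated sites minus the change at comparison sites over the same period. This sketch shows only that plain estimator, without the Double Machine Learning adjustment for confounders that the project applies. The outcome is assumed to be some vegetation or yield proxy, and the numbers in the usage note are made up.

```python
from statistics import mean
from typing import Sequence


def did_estimate(
    treated_pre: Sequence[float],
    treated_post: Sequence[float],
    control_pre: Sequence[float],
    control_post: Sequence[float],
) -> float:
    """(change in treated mean) minus (change in control mean)."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change
```

If the treated sites move from an average of 0.41 to 0.51 while comparison sites move from 0.40 to 0.45, the estimate is 0.10 − 0.05 = 0.05. Subtracting the comparison-site change removes trends, such as a good monsoon year, that affect both groups alike.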
Landscape Level Impact Assessment of NRM structures for Water Conservation - Aryansh, Ramneek
Natural Resource Management (NRM)
interventions such as farm ponds, check dams, trenches cum bunds, and so on,
are believed to positively impact the crop yield and drought resilience in the
surrounding cropping region either through supplemental irrigation or through
direct improvement in soil moisture in the immediate neighbourhood
of the intervention. Our recent study on
assessing the site-level impact of farm ponds substantiates this hypothesis.
However, at the landscape level, NRM structures could have a negative impact on
downstream regions as they capture the runoff which would otherwise have flowed
into and benefitted the downstream regions, or improved
water availability due to the NRM structures could prompt an increase in
groundwater extraction and leave less water for other cropping areas. In this
study, we aim to study the impact of NRM structures on the zone of influence of
downstream water bodies and on cropping areas outside the zone of influence of
water bodies by leveraging remote-sensing data products and employing the
Difference-in-Differences and the Double ML methods. We are currently building
models to estimate the zone of influence around water bodies and other NRM
structures as a function of their size. Our preliminary tests confirm the
hypothesis that the soil moisture around NRM structures decreases with
increasing distance. The next step in the framework is to identify sites that
are similar to treated sites but did not receive the
intervention (counterfactual sites), and then compare the outcomes (vegetation
indices) in their zone of influence.
Co-evolution of Economic Wealth and Environmental Sustainability in Indian Villages - Badrinath, Mrunal
The Government of India conducts
nationwide surveys on socio-economic, health, and education indicators.
However, these surveys are time-consuming, conducted at variable frequencies,
and have significant delays in delivering results. Our project aims to address
this by developing a Relative Wealth Index (RWI) using satellite data as an
indicator to quantify the socio-economic status of villages. Additionally, we
are working on identifying sustainability factors to assess how villages can develop
economically while being environmentally sustainable as well. We are currently
analyzing changes in RWI of villages against environmental factors such as forest
cover, agricultural land-use, and groundwater stress, with the caste composition
of the population and climate variables as additional covariates. We classify villages
into forest-fringe, rainfed, or irrigated (via canal or groundwater)
categories, and assess RWI and sustainability changes over time. Eventually we also
want to build indicators of ecological pressure to identify areas that are
especially vulnerable to ecological stress.
System Dynamics and Modeling of Microwatersheds - Anamitra, Shivani, Ramneek
The sustainable management of microwatersheds is critical for addressing long-term water security, particularly in regions reliant on groundwater for agriculture. However, understanding the intricate dynamics of these systems, which combine climatic, hydrological, physiological, and socio-economic factors at the microwatershed level, presents a complex challenge. In this study, we leverage remote sensing data to model and analyze the dynamics of microwatersheds, so that "what if" scenarios can be constructed to evaluate the impacts of various interventions like the creation of water bodies or the building of NREGA structures. These interventions aim to reduce groundwater stress and facilitate a transition toward rainwater-fed irrigation systems. Our simulations incorporate various climate-induced scenarios to assess the influence of prospective interventions on water availability and system sustainability.
Simplifying the practice of socio-ecological sustainability on the ground
We believe that technology will positively impact communities only when they are able to understand and operate it themselves. Building upon our experience of having done this successfully at Gram Vaani with communities being able to use voice-based participatory media for social accountability, knowledge sharing, and community awareness, we now envision a similar model wherein community stewards from within local communities can use the data and technology outputs of the CoRE stack for better planning and ecological management of their landscapes. To do this, we need training and knowledge building tools for community stewards.
Leveraging LLMs for Landscape Assessment and Climate Action Recommendations - Priyadarshini, Shivani
The planning for NRM assets needs
to be locally contextual at fine scales of micro-watersheds, characterized by variables
including rainfall patterns, groundwater stress, soil types, terrain, etc., and
described in easy ways for field cadre and community stewards to follow. Large
Language Models (LLMs) can potentially play a role in automatically generating
such descriptions using their comprehensive knowledge base and ability to
process time-series data. We intend to generate LLM assisted landscape level descriptions
by using data of various socio-ecological variables
and a series of prompts. Given the landscape characterization, we then intend to
explore whether LLMs are also able to recommend landscape specific NRM assets
by using their own knowledge base along with data on MGNREGA assets that have
been approved in different areas, region-specific CSO reports on water security,
and recently prepared DPRs consisting of proposed water assets. We have started
by investigating the ability of LLMs to interpret univariate and
multivariate time-series of various socio-ecological variables.
Co-creation partnerships: Viksit Labs, Saarland University.
Land-use and land-cover classification
"You can only manage what you can measure" -- high fidelity monitoring of land-use changes is critical to understand the current state of affairs and to use the data to uncover causal pathways that would have led to the current situation. These projects improve and add new elements to the extensive monitoring already available on the CoRE stack.
Crop Classification using Remote Sensing Data - Pratham
Accurate crop identification is essential to monitor agricultural practices. We are developing a model that predicts crop types by leveraging vegetation indices obtained from satellite imagery. The methodology involves training machine learning models using historical crop data, remote sensing inputs, and feature engineering to capture relevant patterns. Initial results demonstrate promising accuracy in distinguishing between major crops in an area, such as maize, paddy, and other cereals commonly grown during the Kharif season in Karnataka. This research paves the way for more efficient crop monitoring systems that can help inform sustainable agricultural management.
Co-creation partnerships: WELL Labs.
Identification of Ponds, Wells, and Plantations from Google Maps - Aatif
The Mahatma Gandhi National Rural Employment Guarantee Act (MGNREGA) has led to the creation of numerous assets across India, including ponds, wells, and plantations. However, there is a lack of comprehensive information regarding these assets' precise locations and conditions. Our study presents an approach to automatically detect, delineate, and geographically locate MGNREGA assets using satellite maps. We download high-resolution image tiles from mapping sources at various zoom levels and apply object detection and image segmentation models to identify these assets. We then plan to use this mapping to assess (a) equity in access to these assets within and across villages among different social groups; (b) the impact of these assets on agricultural productivity; and (c) potential future benefits of new assets. By providing a scalable solution for identifying MGNREGA assets, our research offers valuable insights for rural development policymakers and researchers. The developed methodology complements efforts to ensure quality planning and long-term monitoring of assets created over the years.
Entropy-Based Differentiation of Agricultural Land from Scrubland Using High-Resolution Geo-Spatial Data - Raman
In India, single-cropped agricultural
land and scrubland (also referred to as shrublands, wastelands, etc.) often
exhibit similar spectral signatures, making it challenging to distinguish
between them using traditional methods applied on remote sensing data. However,
the recent availability of high-resolution data and advancements in computer
vision applied to geo-spatial data have shown promising results in delineating
field boundaries. Our investigation reveals that when these models are applied
to both agricultural land and scrubland, they provide satisfactory delineation
of fields while also segmenting scrubland into approximate blobs. To further
differentiate these blobs from fields, we have developed an entropy-based
method. This method operates on the assumption that agricultural fields will
exhibit less variation in texture or display structured variations due to human
activity, whereas scrubland, being naturally formed, exhibits more random and
unstructured texture. This insight is central to our approach, which we aim to
model and refine to improve our land use and land cover (LULC) classification.
Co-creation partnerships: WELL Labs.
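The intuition behind an entropy-based separation can be illustrated with the Shannon entropy of a segment's pixel-intensity histogram: a uniform field concentrates its pixels in a few bins and scores low, while a patchy scrubland blob spreads across many bins and scores high. This is only a first-order sketch under assumed 8-bit grey-level input. The bin count is a hypothetical parameter, a histogram ignores the spatial arrangement of pixels, and the project's actual entropy measure is not specified in the text.

```python
import math
from collections import Counter
from typing import Sequence


def patch_entropy(pixels: Sequence[int], n_bins: int = 8, max_value: int = 255) -> float:
    """Shannon entropy (in bits) of the binned intensity histogram of one segment."""
    if not pixels:
        raise ValueError("segment has no pixels")
    # Map each intensity in [0, max_value] to one of n_bins equal-width bins.
    bins = Counter(min(p * n_bins // (max_value + 1), n_bins - 1) for p in pixels)
    n = len(pixels)
    return -sum((c / n) * math.log2(c / n) for c in bins.values())
```

A segment whose pixels all share one intensity has an entropy of 0 bits, and a segment spread evenly over four bins has an entropy of 2 bits. A threshold on this value, tuned on labelled segments, would be one way to separate the two classes.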
Identifying and monitoring Orans - TBD
Orans are community-managed large tracts of land in arid areas in Rajasthan, where a delicate balance of water use for cropping, grazing of trees and grasses for livestock, and socio-economic development to meet the aspirations of the local communities, needs to be maintained. KRAPAVIS has surveyed over 100 Orans and identified key stress markers in each of them. But there are over 5,000 Orans in Rajasthan that need to be mapped and understood to be able to guide further development.
Co-creation partnerships: KRAPAVIS.
Scaling computation
All of the above work requires significant computation on large datasets - satellite imagery, remote sensing multispectral data, machine learning, groundtruth validation, error correction, etc. Managing long data flow chains and being able to run the algorithms at scale is crucial to make this vision a reality.
Optimizing Satellite Image Processing Workloads - Vatsal, Rahul
Planetary-scale computation on
large datasets, such as those covering India, presents significant challenges
in terms of efficiency, particularly for long pixel dependency tasks that
current platforms like Google Earth Engine do not handle efficiently. The
problem is exacerbated by limited GPU memory and the need for distributed
processing to manage the substantial memory requirements of some tasks. To
address this, we are developing a framework utilizing GPUs and efficient
kernels, along with Python libraries like rasterio for raster I/O and the
GPU-based cupy and cucim, to perform hydrological and environmental analyses
more effectively. Initial benchmarks
for our runoff script have shown promising results, with accuracy and
processing times comparable to Google Earth Engine, demonstrating the potential
of our approach. Future work will focus on implementing long pixel dependency
pipelines for flow accumulation, developing a distributed setup to overcome
memory constraints, and utilizing Popper for efficient workflow execution. This
approach has the potential to transform large-scale environmental and
hydrological analyses by significantly enhancing performance and scalability.
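Flow accumulation shows why long pixel dependencies are hard to parallelise: the value at a cell depends on every cell upstream of it, which may be arbitrarily far away. The sketch below is a plain CPU reference in pure Python, not the GPU framework described above. It assumes flow directions have already been resolved so that each cell index points to the single cell it drains into, or to `None` at an outlet.

```python
from collections import deque
from typing import List, Optional


def flow_accumulation(downstream: List[Optional[int]]) -> List[int]:
    """Number of cells (including itself) draining through each cell."""
    n = len(downstream)
    # Count how many cells drain directly into each cell.
    indegree = [0] * n
    for d in downstream:
        if d is not None:
            indegree[d] += 1
    acc = [1] * n  # every cell contributes itself
    # Start from cells with no upstream neighbours and move downstream.
    queue = deque(i for i in range(n) if indegree[i] == 0)
    while queue:
        i = queue.popleft()
        d = downstream[i]
        if d is None:
            continue  # outlet: nothing further downstream
        acc[d] += acc[i]
        indegree[d] -= 1
        if indegree[d] == 0:  # all upstream contributions have arrived
            queue.append(d)
    return acc
```

A cell is processed only after all of its upstream cells, so the work proceeds in waves from ridges to outlets. For the four-cell layout `[1, 2, None, 2]`, where cell 0 drains to 1, cells 1 and 3 drain to 2, and cell 2 is the outlet, the result is `[1, 2, 4, 1]`.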