Identifying malaria vector breeding habitats with remote sensing data and terrain-based landscape indices in Zambia

Background Malaria, caused by the parasite Plasmodium falciparum, is a significant source of morbidity and mortality in southern Zambia. In the Mapanza Chiefdom, where transmission is seasonal, Anopheles arabiensis is the dominant malaria vector. The ability to predict larval habitats can help focus control measures. Methods A survey was conducted in March-April 2007, at the end of the rainy season, to identify and map locations of water pooling and the occurrence anopheline larval habitats; this was repeated in October 2007 at the end of the dry season and in March-April 2008 during the next rainy season. Logistic regression and generalized linear mixed modeling were applied to assess the predictive value of terrain-based landscape indices along with LandSat imagery to identify aquatic habitats and, especially, those with anopheline mosquito larvae. Results Approximately two hundred aquatic habitat sites were identified with 69 percent positive for anopheline mosquitoes. Nine species of anopheline mosquitoes were identified, of which, 19% were An. arabiensis. Terrain-based landscape indices combined with LandSat predicted sites with water, sites with anopheline mosquitoes and sites specifically with An. arabiensis. These models were especially successful at ruling out potential locations, but had limited ability in predicting which anopheline species inhabited aquatic sites. Terrain indices derived from 90 meter Shuttle Radar Topography Mission (SRTM) digital elevation data (DEM) were better at predicting water drainage patterns and characterizing the landscape than those derived from 30 m Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) DEM. Conclusions The low number of aquatic habitats available and the ability to locate the limited number of aquatic habitat locations for surveillance, especially those containing anopheline larvae, suggest that larval control maybe a cost-effective control measure in the fight against malaria in Zambia and other regions with seasonal transmission. This work shows that, in areas of seasonal malaria transmission, incorporating terrain-based landscape models to the planning stages of vector control allows for the exclusion of significant portions of landscape that would be unsuitable for water to accumulate and for mosquito larvae occupation. With increasing free availability of satellite imagery such as SRTM and LandSat, the development of satellite imagery-based prediction models is becoming more accessible to vector management coordinators.


Background
Malaria transmission throughout Africa is heterogeneous in space and time [1,2]. In the continent, it is estimated that 609 million people are at risk for malaria [3]. In Zambia, 34% of the population live in endemic risk areas while 48% of the population are in epidemic risk [4,5]. In epidemic areas malaria control programmes could make significant inroads in morbidity and mortality [3]. Recent analyses [6,7] suggest that larval control could play an important role in future control programmes, especially under these circumstances. In southern Zambia, malaria transmission is mainly associated with Anopheles arabiensis [8,9]. The prevailing climatic conditions in this dry, sub-humid environment restrict Anopheles gambiae complex and other anopheline mosquitoes, and limit the availability of suitable breeding habitats for An. arabiensis. Consequently, because of the restricted breeding habitats, malaria transmission risk is expected to be tightly clustered in space and time. Thus, identifying anopheline breeding habitats would allow focused control interventions to interrupt malaria transmission.
However, the identification of larval breeding sites is challenging. In southern Zambia, typical habitats for Anopheles larvae are partly sunlit, pools ranging in size from foot prints, to ponds, to slow moving streams [10]. Anopheles breeding habitats develop during the rainy season after the heavy rains, but begin to disappear at the start of the dry season until few or none remain. Such conditions will generate a complex dynamic of colonization, extinction and re-colonization of local anopheline populations, and help drive the seasonality of malaria transmission. Permanent breeding habitats that may act as "sources" of re-infestation in the wet season could represent targets for mosquito control interventions [11][12][13]. However, ground-based monitoring of potential aquatic breeding sites is labour-intensive, expensive and too difficult to maintain, except when these targets are few in number, easily accessible and well-defined [14].
A practical alternative approach is needed to define and identify small, broadly distributed larval breeding sites over broad geographic regions. One approach is statistical analysis with remotely-sensed characterizations of the environment and derived data products to identify and locate suitable breeding sites. Topographic data and derived indices are widely used by ecologists to describe landscape terrain and predict plant and animal species distributions. Raw elevation and slope are the most commonly used, but, increasingly, topographic indices (e.g., topographic wetness) and classifications (e.g., landform) based on topographic data are being evaluated [15]. Additionally, hydrological models are being applied to determine water drainage patterns. Following the trend in ecological studies, studies of vector populations have begun exploring the relationships of topographic descriptors with the distributions of mosquitoes. Some studies have modelled hydrology to examine the impact of distance to streams on disease risk [16], while others have examined the influence of water flow on the spatio-temporal dynamics on mosquito populations [17][18][19].
The objective of this study was to describe the spatial distribution of habitats of potential anopheline larval habitats in southern Zambia, by performing larval water surveys at the end of one rainy season, prior to the onset of the following season and at the end of a second rainy season. Different sources of readily available elevation data and terrain-based methods were evaluated to assess modelling approaches and data sources for predicting the abundance and distribution of such habitats. The long-term task is to develop methods for sustainable control of Anopheles larval populations that are targeted in space and time.

Study area
Located 40 km south of the Kafue Flats in Zambia's Southern Province, the study area was centered on the Nachiko seasonal stream at 26°52' 52" E, 16°20'51" S ( Figure 1 -map of study area). There are three seasons, rainy (November -April), cool-dry (May -August) and hot-dry (August -November) with the majority of the 60-100 cm average annual rainfall occurring in the rainy season. During the rainy season, water accumulates and remains flowing until the last weeks when water pools develop. The Nachiko Stream is a tributary to the perennial Munyeke River on the northern border of the study area. Various ponds, seasonal water pools and natural springs occur in the area. The landscape is predominately covered by munga scrub and grasslands, interrupted by tree cover over rivers and streams. In addition, there are agricultural fields of maize and peanuts. Approximately 240 households are in the area. The residential compounds consist of sleeping houses primarily constructed of brick, and peri-domestic structures included cooking shelters, maize storage, cattle corrals, goat and pig pens, chicken coups and pigeon houses. Plasmodium falciparum malaria transmission occurs seasonally with peak transmission in March and April corresponding with increased numbers of An. arabiensis and Anopheles funestus mosquitoes [9]. However An. funestus was locally extirpated following a drought in 2004-2005 [20]. Few residents had bed nets in 2006, and coverage was sporadic in 2007. In contrast, after the implementation of Zambia's national bednet campaign in October-November 2008, bednet coverage became evenly dispersed through the communities.

Study design
A ground survey of~18 km 2 was conducted on the ground by a field crew to search for water pools and aquatic habitats suitable for anopheline mosquitoes between March 19, 2007 andApril 5, 2007. Global positioning system receivers (GPS) were used to conduct an extensive survey of the region, to collect ground control points and identify the location of water pools. In addition, residents were asked the locations of sites where they gathered water. Because most transient water pools disappeared after a day or two following a heavy rain leaving a subset of water pools more suitable for mosquito breeding, sampling for larvae was done during two or more days after a rain. Locations of water sites were collected using a Trimble XM GPS applying one minute of location averaging with a precision dilution of point of ≤4. Surveys were repeated on October 17 -20, 2007 prior to the onset of rainy season to identify sites that still retained water and larval anophelines. A third survey was conducted at the end of the following rainy season from March 26 -April 4, 2008 to determine if breeding sites persisted from season to season. The October 2007 survey included the entire original study area, while the follow up March -April survey encompassed the area from the central stream and east.

Digital elevation data
Two digital elevation models (DEM) for the area with 1 m horizontal resolution, were evaluated; the Shuttle Radar Topography Mission (SRTM) version 3 DEM http://dds.cr.usgs.gov/srtm/version2_1/SRTM3/ with 90 m pixels and Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) (U.S. Geological Survey) DEM with 30 m pixels. Elevation values in both DEMs corresponded to the reflective surface on the Earth that could include soil surface, vegetation (including tree tops) or man-made structures.
SRTM imagery was collected during a 2001 space shuttle mission using a multi-frequency, multipolarization radar system. Each pixel represented a 30 m average of elevation around each pixel's centre. The relative horizontal accuracy is ±15 m (90% circular error) with a relative vertical accuracy of ±6 m (90% vertical error). ASTER imagery was collected on June 28, 2007, and is derived from the near infrared bands 3N and 3B collected in 15 m optical stereo. It has a relative accuracy ≥10 m.
Nadir-viewing and backward-viewing bands of Level-1A imagery (15 m horizontal resolution) collected by the visible near infrared (VNIR) sensor on the ASTER satellite was used to create DEMs with a 30 m horizontal resolution. The DEM generation was performed using an automated stereo-correlation method with ephemeris and attitude data from the ASTER instrument and the Terra spacecraft platform instead of ground control points (NASA, https://lpdaac.usgs.gov/). The RMSE-xyz (root-mean squared) is generally more than 25 meters accurate.

Topographic wetness
The digital elevation models were processed in Imagine 9.1 and imported into ESRI ArcGIS 9.1 (Redlands, CA). All imagery and point locations were geo-referenced to UTM zone 35S, WGS 1984. The ArcGIS extension Terrain Analysis Using Digital Elevation Models extension (TauDem; Tarboton, Utah State University, 2005, http:// hydrology.neng.usu.edu/taudem/) was used to model water flow throughout landscape and subsequently calculate an inverse topographic wetness index (TWI).
The data were smoothed to fill in isolated elevation pits (or spikes) which typically represent errors or areas of internal drainage that interrupt the estimate of water flow. Slope and flow directions, were determined from the multiple direction algorithm (MDA) method [21], in combination with the flat area flow direction method [22]. The MDA method used the steepest slope of triangular facets allowing water to flow in any direction. Topographic wetness index is an indicator of potential moisture, assuming there is surface homogeneity of soil and vegetation. It is calculated using the ratio of upslope contributing area (A) and local slope (the tangent of slope).

Topographic position index
Topographic position classifies the landscape by slope position (low, middle, high) and landform type (plain, valley, ridge) [23]. It was generated using ArcView 3.3 with an extension by Jenness (2006) [24]. The topographic position index (TPI) is the difference between the elevation at a point and the mean elevation of neighborhood cells. TPI values near zero are typical of flat or mid-slope locations. High values signify high areas, such as hill tops and ridges, while low values are indicative of valley floors. Because TPI is a scale dependent variable, a local and an area-wide scale were considered (500 m and 2 km). The 500 m neighbourhood helps in detecting local valleys and hills, while the 2 km neighbourhood enables identification of larger scale features such a large U-shaped valleys, gently sloped hills, and tops of plateaus. Information from the magnitude of the TPI and the area's slope, was used to classify the slope position (SP) according to Weiss (2001) [23]. This classification uses the TPI score standard deviations (SD) and slope values. Slope position classes created were valley, lower slope, flat slope, middle slope, upper slope and ridge (Table 1).
Ten landform (LF) classes (deep streams, shallow valleys and mid-slope drainage pathways, upland drainage areas, U-shaped valleys, plains, open slopes, upper slopes and mesas, local ridges and hills in large valleys, midslope of ridges and small hills in plains, high ridges) were generated by comparing standardized TPI values (standardized TPI = [TPI -TPI mean]/[TPI standard deviation]) for TPI values at 500 m (TPI 500 ) and TPI values at 2 km (TPI 2000 ) and slope using criteria developed by Weiss (2001) ( Table 2).

Statistical analyses
Statistical tests were performed using Stata 8.2 (Stata-Corp, College Station, Texas). Univariate logistic regression was used to assess the association of terrain variables individually. Backwards general linear logistic regression (GLM) was applied using to create risk models for presence/absence (1/0) of water pools, Anopheles larvae and An. arabiensis larvae. Independent explanatory variables were selected based on the criteria of Pearson correlation coefficient <0.8. Absence of water was represented by 100 randomly chosen ground control locations. Explanatory variables which were assessed included terrain variables (slope, aspect, TWI, TPI 500 and TPI 2000 ) and LS variables (2:4, 5:7, and 3:1). To reduce the number of variables included in prediction modelling, TPI 500 and TPI 2000 were included in predictive models without SP or LF. Post-estimation specificity, sensitivity and   [27] in the R-software package 'glmmML' http://cran.r-project.org/web/ packages/glmmML with the variable cluster (i.e., agglomerations of sites within 60 m of each other) as a random effect. Such models that include a random intercept allow accounting for the amount of unexplained variance that can arise due to the spatial proximity of sites (i.e., spatial dependence between near observations). Residuals derived from GLMM models were, also, tested for global spatial autocorrelation using global Moran's I.

Mosquito species
Two water-larval surveys were conducted. Of the 200 sites with water found during the first survey ( Figure 2), 69% [n = 139] contained anopheline mosquito larvae. Of the anopheline sites, 26 [19%] contained recognized malaria vector species. Overall, nine anopheline species were identified. The maximum diversity at any site was 5 species. Anopheles rufipes was the most common mosquito found followed by Anopheles squamosus, Anopheles coustani, and An. arabiensis (Table 3). Anopheles arabiensis and Anopheles quadriannulatus were the only An. gambiae complex mosquito species identified in the study area. Anopheles arabiensis was found at 13 percent of all water sites (n = 26). During the October survey prior to the onset of the rainy season, only five water sites that were identified during the first survey still contained water, and none had any Anopheles spp. larvae. No additional water sites were found during this survey. During the March 2008 survey, 122 water sites were located along and to the east of the stream, and 51 contained Anopheles larvae of which five were An. arabiensis larvae. All of these 122 sites had been detected in 2007 and few new sites were identified far from the previous year's locations.
Landforms classified from ASTER provided a less defined characterization of larval habitats. For example, 43% of all larval sites and 54% of An. arabiensis sites were found in ASTER valleys and drainage pathways, which composed 22% of the area (Table 5) (Figure 3). No single ASTER classified landform type was associated water presence, but ridges predicted water absence (OR = 0.21, 95% = 0.08 -0.56, P < 0.002). Anopheles presence was significant in valleys (OR = 2.18, 95% CI = 1.22 -3.89, P < 0.001) and small hills in plains (OR = 2.85, 95% CI = 1.2 -6.77, P < 0.02). None of the ASTER landform categories alone were significantly associated with presence of An. arabiensis larvae.

Multivariate Analyses
Further characterization of the remote sensing data and products was conducted to evaluate the presence of potential breeding sites (water presence) as well as those sites that yielded Anopheles spp as well as sites that yielded An. arabiensis larvae.
The best GLM logistic model (AIC = 306.6) found to predict the occurrence of water (ROC = 0.81) ( Table 6) had four LS and SRTM landscape indices as the main predictors ( Table 7). The sensitivity of this model was 92.0% and the specificity 49%. The variables from the LS-SRTM model were: TPI500, SRTM slope, SRTM aspect, and SRTM TWI. This model indicates that water was most likely to occur in local depressions (SRTM TPI500) with flattened slopes (SRTM slope). A classification of the LS-SRTM model with a cutoff probability of ≥0.5 resulted in 78.3% of sites being correctly classified. A model using ASTER data was 50 AIC units lower than the best model (Table 6) and presented a marginal predictive value (0.65) using three topographic variables ( Table 7). The GLMM further increased model fit for the LS-SRTM model (AIC = 275.3) (Table 6), and had three  topographic variables (TPI 500 , aspect and slope) as the main predictors ( Table 8). The random intercept in the latter model was not significant (Intercept = -1.16, Pvalue = 0.12), suggesting a limited effect of proximity between sites in the prediction of water habitats. The GLMM, also, improved the model fit for the LS-ASTER model (AIC = 298.7), but its support was much less than the LS-SRTM model. The model that best predicted (AIC = 238.4) ( Table 6) Anopheles spp mosquito larval habitats had a ROC value of 0.85 and SRTM topographic indices and LS ratios (SRTM slope, SRTM TPI500, SRTM TPI20, and TWI) as the main predictors ( Table 9). The model predicted anopheline larvae present in local depressions (SRTM TPI500) with flattened slopes (SRTM slope). The model (cutoff probability ≥0.5) correctly classified 70% of sites (sensitivity = 81.3%, specificity 52%). The GLMM improved model fit (AIC = 221.4) and had only TPI 2000 as the main predictor (Table 10). As observed with the water GLMM model, the random effects term of the Anopheles spp LS-SRTM model (AIC = 225) was not statistically significant (Intercept = -1.59, P-value = 0.051). However, the ASTER-LS GLMM model (AIC = 252.1) for Anopheles spp was found to have a significant random intercept (Intercept = -1.88, P-value < 0.001).

Discussion
Malaria transmission risk depends on the presence of specific anopheline species, on the characteristics and productivity of their breeding sites, their location in relation to human settlements, and on the effective dispersal range of the mosquitoes. Such characteristics Figure 3 Locations of water sites with anopheline larvae overlaid on landform types derived from ASTER Imagery.
Clennon et al. International Journal of Health Geographics 2010, 9:58 http://www.ij-healthgeographics.com/content/9/1/58 determine the context in which malaria transmission occurs, and the type and extent of control actions that can be performed. The ability to locate larval habitats and understand their distribution in space and time is an important component in planning and implementing effective and sustainable vector control strategies [6,28,29]. It also would substantially shrink the geographic extent that needs targeting in any control programme. However, identifying and monitoring these conditions on a meaningful spatial scale is daunting in many circumstances.
The present study shows that processed remotely sensed data combined with field calibrations allows prediction of potential vector breeding habitats in areas characterized by highly seasonal malaria transmission. Specifically, large portions of region can be excluded from interventions with a high degree of certainty making it possible to prioritize target regions for surveillance, monitoring and treatment. The models presented in this study show strong internal validation, but have yet to be externally validated to determine the strength of these results for other areas. While local spatial variation increased fit for many models, it often failed to significantly affect the predictive value of the models.
In sub-tropical Africa the amount and seasonality of rainfalls drastically affects the occurrence and productivity of mosquito breeding habitats and limits the distribution of certain anopheline species [2,30]. In the Nachiko study area, a single rainy season with the lower rainfall amounts (600 -1,000 mm) coupled with the low winter temperatures permit An. arabiensis and An. funestus to       Most of these water pools were prone to drying, and nearly all disappeared during the dry season. An. arabiensis was found in water pools of various sizes, depths, sun illumination, and turbidity towards the end of the rainy seasons but was absent at the end of the intervening dry season in the few pools of water that remained. The various physical conditions in which An. arabiensis larvae were found precluded attempts to more specifically predict its spatial distribution with the available environmental monitoring tools. However, nearly all its breeding sites were in landforms classified as deep streams and valleys-a relatively limited portion of the study region. The multivariate analyses comparing An. arabiensis sites with other water containing locations suggested some distinctions. However, these were not sufficient to suggest that targeting of water sites based on these characteristics would be satisfactory. These results may occur because this species is more generalist in choosing breeding habitats when water is limited. This contrasts to areas with two rainy seasons where An. arabiensis is found with higher mosquito densities and appears to prefer clean, clear pools [10]. However, for models described here, sensitivity was typically high so that they rarely failed to identify potential sites. Consequently, few if any of the sites were likely to be missed -a major consideration in any control programme.
In the Nachiko Area, no habitats contained larval An. arabiensis during the driest months of the year (October-November). This suggests that there may be repeated recolonizations of the region near the start of the rainy season. Three kilometers to the north is a seasonal river that develops into a series of mostly clear water pools during the dry season. Larval An. rufipes have been found even during the driest of months suggesting that An. arabiensis may persist there at an extremely low population density (although none were observed). Given that Anopheles mosquitoes are not occupying aquatic habitats outside of seasonal large rivers, a focus towards surveying them during the dry season and applying larvicides may be a cost effective approach in controlling malaria vector mosquitoes near the study area.
When coupled with field surveys for calibration, satellite imagery allows the measurement of factors and the strength of their association in limiting the distributions of anopheline species both on large and local scales characterizing environmental conditions on wider extents than those that can be done on the ground. These data can act as surrogates of vegetation and moisture [31]. Fine-scale predictive models were developed based on topographic and satellite (LandSat TM5) imagery indices, which identify areas likely to hold water following the rains and predict aquatic habitats where anopheline mosquitoes (and specifically An. arabiensis) can be present.
However, these analyses also demonstrate that these data products may differ in their utility. For example, both SRTM and ASTER produce DEM products that can be used in land form classification. However, the SRTM derived products appeared to provide a better  interpretation. These analyses demonstrate that the use of finer resolution satellite imagery (e.g., ASTER) is not always better than slightly more coarse imagery (e.g., SRTM) to use for predictive modelling. Information derived from Landsat imagery also appeared more successful than ASTER data in classification of breeding sites. Topographic conditions conducive for water pooling are local depression with flat slopes. These locations are generally places with sandy-loam to clay soils where water will accumulate but because of the flat slopes do not drain. When coupled with soil moisture and vegetation indices derived from LandSat satellite imagery, it is apparent that vegetation, soil typing and moisture levels in conjunction with terrain-based modelling can limit the survey area needed to find the vast majority of suitable aquatic habitats for anopheline larval development. Topographic indices were important variables in predicting the presence of water pools during the mosquito season in Southern Zambia. The scale at which any particular variable is measured influences its predictive value, and measuring at a finer scale is not always beneficial. For example, 90 m SRTM elevation data did a superior job at representing potential water flow across the landscape by limiting the influence of factors such as differences in canopy height. Given that the model present here is from a limited study area the generalizability still requires testing on a wider extent. As landscape and climate patterns change from rolling hills and a semi-arid environment to flatter, more sub-humid environment to the north and south, the model will begin to be less successful in its predictions. Current efforts are underway with epidemiologic studies to examine spatial variation in the prevalence of malaria infection relative to predicted mosquito breeding sites in the area to determine if the relationship does exist. In addition, further surveys for predicted breeding sites are being conducted by independent researchers to directly evaluate the predictions.
In 2003, Zambia initiated a new national malaria control initiative Artemether/lumefantrine chemotherapy was selected as the first-line treatment for malaria, and then bed nets and rapid diagnostic testing were introduced [32]. The final phase of the new national malaria control initiative will include residual house spraying. Bed nets have been recently (October 2007) introduced into the Nachiko Study Area, and this is associated with a continued decline of malaria transmission in addition to that seen since the introduction of artemether/lumefantrine anti-malarial chemotherapy. If these measures were to be combined with residual spraying for adult mosquitoes and larviciding, a significant stride in malaria control could be achieved [6,33]. Targeted larviciding along drying river beds when larval sites are limited, may be feasible through landscape analyses and the residual An. arabiensis population can be reduced to drastically lower mosquito population come the subsequent rainy season. Models from this study and terrain classifications can be uploaded into newer GPS units with software such as ArcPad (ESRI, Redlands, CA) and be used to navigate to areas with a high potential to contain aquatic habitats where targeted treatments can be implemented.