MosquitoMap and the Mal-area calculator: new web tools to relate mosquito species distribution with vector borne disease
© Foley et al; licensee BioMed Central Ltd. 2010
Received: 29 October 2009
Accepted: 18 February 2010
Published: 18 February 2010
Mosquitoes are important vectors of diseases but, in spite of various mosquito faunistic surveys globally, there is a need for a spatial online database of mosquito collection data and distribution summaries. Such a resource could provide entomologists with the results of previous mosquito surveys, and vector disease control workers, preventative medicine practitioners, and health planners with information relating mosquito distribution to vector-borne disease risk.
A web application called MosquitoMap was constructed comprising mosquito collection point data stored in an ArcGIS 9.3 Server/SQL geodatabase that includes administrative area and vector species x country lookup tables. In addition to the layer containing mosquito collection points, other map layers were made available, including environmental layers and vector and pathogen/disease distribution layers. An application within MosquitoMap called the Mal-area calculator (MAC) was constructed to quantify the area of overlap, for any area of interest, of vector, human, and disease distribution models. Data standards for mosquito records were developed for MosquitoMap.
MosquitoMap is a public domain web resource that maps and compares georeferenced mosquito collection points to other spatial information, in a geographical information system setting. The MAC quantifies the Mal-area, i.e. the area where it is theoretically possible for vector-borne disease transmission to occur, thus providing a useful decision tool where other disease information is limited. The Mal-area approach emphasizes the independent but cumulative contribution to disease risk of the vector species predicted present. MosquitoMap adds value to, and makes accessible, the results of past collecting efforts, as well as providing a template for other arthropod spatial databases.
Mosquitoes are required for the natural transmission of important diseases such as malaria, dengue, Japanese encephalitis, yellow fever, West Nile, lymphatic filariasis, and chikungunya. Knowing when and where mosquito disease vectors occur could be vital to combating these diseases. Over 3,500 mosquito species are currently formally recognized, but only a minority transmit disease, and these vectors vary geographically in their medical importance. A first step toward understanding mosquito distribution is to gather high quality taxonomic and geographical information about mosquito occurrence. Creating a computerized database of mosquito collection records is not a new idea [1, 2], but technological advances such as the Internet make a distributed database more achievable.
Suitable mosquito collection data can be retrieved from museum specimens, from records maintained by mosquito control agencies, or from the scientific literature. Foley et al. [3] demonstrated the value of museum mosquito collection records for understanding mosquito biogeography and ecology, and for planning mosquito surveys. The basic information required is the longitude and latitude, the species identification, and the date of the mosquito collection. Other information, such as details of the habitat, adds value to the record, and Foley et al. [4] listed over 60 fields of information about a collection event that could be recorded. These authors proposed that standards be adopted for recording collection data, because of the growth and interoperability of online inventories such as the Global Biodiversity Information Facility (GBIF) [5].
Point collection data can be matched to remotely sensed data or climatic averages to develop mosquito species-specific models of distribution or habitat suitability [6, 7]. Similarly, pathogen or disease suitability models have been developed [8] that can be fine-tuned with more detailed vector information. Ecological niche modeling has been identified as a broadly applicable method for disease studies, including for predicting interactions among participating species [9]. More specifically, if the generalized spatial extent of a mosquito vector, human host and pathogen can be approximated for an area of interest (AOI), and the extent of co-occurrence (i.e. the Mal-area [10]) quantified, then the values for different AOIs could be compared to provide a simplified estimate of relative disease risk. However, any attempt to understand disease risk by this method should account for the diversity of vector species within the particular AOI.
Here we describe the development of an online spatial database for mosquito collection records and distribution models called MosquitoMap, which we designed for medical entomologists, vector disease control workers, preventative medicine practitioners, and health planners to promote knowledge about mosquito distribution. We also describe a unique tool within MosquitoMap, called the Mal-area calculator (MAC), that relates the distribution of vectors, humans and pathogens/disease.
The 'map layer' link contains layers for mosquito point records, vector and disease models, and other layers of relevance to mosquitoes and vector borne disease (i.e. human population density, GADM administrative areas, populated places, streams, and water bodies - see below). These layers are designed to become active at different zoom levels, and their transparency is adjustable under 'Tools'. A preview feature is available under 'data uploader' to assist users to check the quality of their location data.
Collection record input schema - point occurrences
MosquitoMap has over 60 fields of information, many of which are based on Darwin Core standards. Data for some fields use controlled vocabulary terms based on international standards (e.g. ISO3166-1 for Country names). Information about a mosquito collection includes: submitter details; provenance, taxonomy, date and time details; geolocation details; specimen details; collection methodology; habitat details; and details about associated parasites, such as the detection of malaria sporozoites. One data field allows a record of the global unique identifier (GUID) associated with a collection record, i.e. a combination of institution code:collection code:catalog number. The GUID is designed to identify a record and can be reported separately in the literature.
MosquitoMap specimen records include vouchered material or observations only. Records can refer to individuals or to pools of individuals of the same species caught during the same collection event. Other information that does not have a dedicated field can be stored under the 'Remarks' field. The goal of the MosquitoMap application is to map specimen collections so, unlike a museum inventory, it is not primarily concerned with the derivatives of a single specimen, such as exuviae or genitalia, unless these are the sole representatives of the specimen. However, details about the derivatives of a single specimen can be noted under the 'Remarks' field. MosquitoMap reserves a category under CollectionCode for information about type specimens ('Type'), and the term 'LitRev' is used to flag information derived from the published literature. If a catalog number is unavailable, a temporary one is assigned, beginning with 'MMap'. The Institution code 'MMap' is also reserved for records, mainly from the literature, which cannot be readily ascribed to an institution.
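The identifier convention described above (institution code:collection code:catalog number) can be illustrated with a minimal Python sketch. The class and function names are hypothetical, the catalog number shown is invented for illustration, and nothing here is taken from the MosquitoMap code base.

```python
from typing import NamedTuple


class RecordGuid(NamedTuple):
    """Three-part identifier: institution code, collection code, catalog number."""
    institution_code: str
    collection_code: str
    catalog_number: str

    def __str__(self) -> str:
        # Join the three parts with colons, as in the convention described above.
        return ":".join(self)


def parse_guid(text: str) -> RecordGuid:
    """Split 'institution:collection:catalog' into its three parts."""
    parts = [p.strip() for p in text.split(":")]
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected institution:collection:catalog, got {text!r}")
    return RecordGuid(*parts)


# A literature-derived record with no home institution and a temporary
# catalog number (the number itself is made up for this example).
guid = RecordGuid("MMap", "LitRev", "MMap000123")
print(guid)                      # MMap:LitRev:MMap000123
print(parse_guid(str(guid)))
```

Keeping the three parts separate, rather than storing only the joined string, makes it straightforward to filter records by institution or by the reserved collection codes.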
A downloadable Excel spreadsheet is available via the 'Contribute Data' page of MosquitoMap that explains and lists the input data fields and controlled vocabulary terms for MosquitoMap, and provides country and species lists for copying and pasting into new collection record databases. The spreadsheet provides advice to assist with data cleaning and, at present, the user must email their data to firstname.lastname@example.org. As mentioned above, a data preview tool is available to assist with maximizing data quality, e.g. to identify incorrect georeferences that do not fall on land or appear in the wrong country. The georeferencing and species identification accuracy of mosquito point occurrence data is important. Every effort is made to ensure that the taxonomic data of the species within MosquitoMap are up to date, using resources such as the Systematic Catalog of Culicidae, and that coordinates are checked, using the 'check coordinates' option in the program DIVA-GIS 5.3. The submitter name is offered as a searchable text data field within MosquitoMap partly to acknowledge the effort made to submit data. MosquitoMap is designed to share data within a framework of open access and due attribution, and data providers and users are asked to abide by the MosquitoMap data use agreement (see 'Contribute Data' page).
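As an illustration of the kind of georeference screening mentioned above, the following Python sketch flags coordinates that are out of range or that fall outside a rectangle around the stated country. It is not the MosquitoMap preview tool or the DIVA-GIS check. The function name is hypothetical and the bounding box is a rough, illustrative rectangle around the Republic of Korea, not an authoritative boundary. A bounding box is also only a coarse test, since a point can lie inside the box and still be offshore or across a border.

```python
def check_georeference(lat, lon, country_bbox=None):
    """Return a list of problems found; an empty list means no obvious issue.

    country_bbox is (min_lon, min_lat, max_lon, max_lat) in decimal degrees.
    """
    problems = []
    if not -90.0 <= lat <= 90.0:
        problems.append("latitude out of range")
    if not -180.0 <= lon <= 180.0:
        problems.append("longitude out of range")
    if lat == 0.0 and lon == 0.0:
        problems.append("coordinates are 0,0 (often a placeholder)")
    # Only test the country box when the coordinates themselves look valid.
    if country_bbox is not None and not problems:
        min_lon, min_lat, max_lon, max_lat = country_bbox
        if not (min_lat <= lat <= max_lat and min_lon <= lon <= max_lon):
            problems.append("point falls outside the stated country's bounding box")
    return problems


# Rough, illustrative rectangle around the Republic of Korea (an assumption).
KOREA_BBOX = (124.0, 33.0, 132.0, 39.0)

print(check_georeference(37.57, 126.98, KOREA_BBOX))   # plausible record
print(check_georeference(126.98, 37.57, KOREA_BBOX))   # latitude and longitude swapped
print(check_georeference(35.68, 139.69, KOREA_BBOX))   # valid point, wrong country
```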
Vector and pathogen/disease models
In addition to the layer containing mosquito collection points, other map layers are available including: human population density for 2000; Global Administrative Areas (version 0.9), populated places, and water bodies and streams. Disease models are available for Japanese encephalitis, dengue, Rift Valley fever, yellow fever, and malaria. The layers for malaria comprise spatial limits and endemic levels of Plasmodium falciparum in 2005 (hypoendemic, mesoendemic and hyper-holoendemic), and spatial limits and endemic levels of P. vivax in 2005 [18, 19]. The malaria layers were digitized (0.09495°) from images that were available from the Malaria Atlas Project. Errors of translation and geopositioning may have occurred, particularly in coastal areas, which were not present on the original image. As models of disease distribution become available to us, they will be hosted on MosquitoMap.
MosquitoMap currently includes distribution models for the Asian malaria vectors Anopheles minimus and An. harrisoni, and for the Korean species: An. sinensis, An. kleini, An. belenrae, An. pullus, An. lesteri, An. sineroides, An. koreicus, and An. lindesayi. ESRI grid files for these vector distribution models are available for download via the MosquitoMap website. The vector maps for Korea represent the entire anopheline mosquito fauna for that country and thus provide the most complete information for testing the MAC concept (see below).
The first pop-up menu allows the user to define the AOI for the pathogen and vector models. Choosing a country filters a global table of mosquito vector species, so that only pathogen and distribution models for mosquito species listed for that country are considered in the calculations. The user can further refine the AOI by choosing one of the GADM second order administrative areas, or use the area selection toolbar to apply a polygon selection. To improve performance, vector and disease rasters that overlap the AOI were first converted to vector format using ArcToolbox™ and then merged into a set of contiguous polygons for each disease. For each polygon, an attribute was created with a URL linking to the original raster dataset, provided for the Mal-area calculation.
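The country filter described above amounts to a lookup against a species x country table. A minimal Python sketch follows. The table rows, the dictionary of model file names, and the function names are illustrative assumptions, not the contents or schema of the MosquitoMap geodatabase.

```python
# Illustrative rows of a vector species x country lookup table, keyed by
# ISO 3166-1 alpha-2 country code. The rows are examples only.
VECTOR_BY_COUNTRY = [
    ("KR", "Anopheles sinensis"),
    ("KR", "Anopheles kleini"),
    ("TH", "Anopheles minimus"),
    ("TH", "Anopheles harrisoni"),
]

# Hypothetical mapping from species name to a distribution model file.
MODELS = {
    "Anopheles sinensis": "an_sinensis.grd",
    "Anopheles kleini": "an_kleini.grd",
    "Anopheles minimus": "an_minimus.grd",
}


def species_for_country(table, iso_code):
    """Return the sorted species names listed for one country."""
    return sorted({species for code, species in table if code == iso_code})


def models_for_country(models, table, iso_code):
    """Keep only models whose species is listed for the chosen country."""
    allowed = set(species_for_country(table, iso_code))
    return {sp: path for sp, path in models.items() if sp in allowed}


print(species_for_country(VECTOR_BY_COUNTRY, "KR"))
print(models_for_country(MODELS, VECTOR_BY_COUNTRY, "TH"))
```

A species listed for a country but lacking a model (An. harrisoni in this toy table) simply drops out of the calculation, which is one reason the completeness of the model set matters for the result.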
Vector and pathogen distribution maps are usually classified according to probability of occurrence (integers of 1-100), number of models supporting occurrence (typically 1-10), or as absence and presence (0, 1). These different categorizations can lead to normalization issues and complicate the meaning of the output. To improve computational efficiency and simplify the output of the MAC, a threshold is applied to all input layers to render them as presence or absence. For example, the least presence threshold may be applied to vector distribution models, and a 50% probability threshold to disease models.
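The thresholding step can be sketched in a few lines of Python. The least presence threshold is taken here to mean the lowest model value found at any known occurrence record, which is an interpretation rather than something the text spells out. The grids, the occurrence scores, the function names, and the choice of "greater than or equal to" at the threshold are all assumptions made for illustration.

```python
def least_presence_threshold(scores_at_occurrences):
    """Lowest model score found at any known occurrence record."""
    if not scores_at_occurrences:
        raise ValueError("need at least one occurrence score")
    return min(scores_at_occurrences)


def to_presence_absence(grid, threshold):
    """Map each cell to 1 (score >= threshold), 0, or None for 'no data'."""
    return [
        [None if value is None else int(value >= threshold) for value in row]
        for row in grid
    ]


# Hypothetical vector model scored as integers 1-100; None marks 'no data'.
vector_scores = [[12, 40, None],
                 [75, 8, 33]]
# Model scores at three hypothetical collection points.
lpt = least_presence_threshold([40, 75, 33])
vector_binary = to_presence_absence(vector_scores, lpt)

# Hypothetical disease model scored as a probability, thresholded at 50%.
disease_scores = [[0.2, 0.5, 0.9],
                  [None, 0.49, 0.7]]
disease_binary = to_presence_absence(disease_scores, 0.5)

print(lpt, vector_binary, disease_binary)
```

Once every layer is reduced to the same 0/1 coding, layers that started on different scales can be overlaid without any further normalization.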
To simplify calculations, the human population density layer was pre-defined as a binary (presence, absence) feature class, and the presence threshold was set to one or more people per square kilometer. This threshold has been used to mask low human population densities in previous calculations of malaria distribution [18, 19]. As map layers occur in different resolutions, all layers were resampled to the resolution of the human population density layer (0.008333°). The value of the resampled pixel was determined by the value at its center. 'No data' values were not included in the Mal-area calculation for the AOI.
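Resampling by the value at the pixel center, as described above, corresponds to nearest-neighbor resampling. The Python sketch below assumes that the source and target grids share the same top-left corner and extent, and that grids are stored as lists of rows. The function name and the toy grid are hypothetical, and the sketch does not reproduce the ArcGIS processing used for MosquitoMap.

```python
def resample_by_center(src, src_res, dst_res):
    """Resample a grid so each new cell takes the value found at its center.

    src is a list of rows; src_res and dst_res are cell sizes in the same
    units (e.g. decimal degrees). Both grids share the same top-left origin.
    """
    n_rows, n_cols = len(src), len(src[0])
    out_rows = int(round(n_rows * src_res / dst_res))
    out_cols = int(round(n_cols * src_res / dst_res))
    out = []
    for r in range(out_rows):
        # Offset of the new cell's center from the top edge, then the source row.
        center_y = (r + 0.5) * dst_res
        src_r = min(int(center_y / src_res), n_rows - 1)
        row = []
        for c in range(out_cols):
            center_x = (c + 0.5) * dst_res
            src_c = min(int(center_x / src_res), n_cols - 1)
            row.append(src[src_r][src_c])
        out.append(row)
    return out


# A coarse 2 x 2 presence/absence grid resampled to half the cell size.
coarse = [[1, 0],
          [0, 1]]
fine = resample_by_center(coarse, src_res=1.0, dst_res=0.5)
for row in fine:
    print(row)
```

Because values are copied rather than averaged, a binary layer stays binary after resampling, and 'no data' cells (for example, None) are carried through unchanged.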
A calculation can be made for the same AOI with different input parameters, or for another AOI, to compare the resulting Mal-area estimates. Inasmuch as the Mal-area can be seen as an estimate of disease transmission risk, this process quantifies the relative risk of different AOIs.
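To make the overlap calculation concrete, the Python sketch below counts a cell toward the Mal-area when at least one vector species, humans, and the disease are all predicted present, and sums the cell areas. Combining vector species with a logical "any" is an assumption, as is the spherical approximation of cell area. The text does not state how the MAC combines species or converts cells to area, and all function names and grids here are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def cell_area_km2(lat_center_deg, res_deg):
    """Approximate area of a res_deg x res_deg cell centered at a latitude."""
    lat_south = math.radians(lat_center_deg - res_deg / 2)
    lat_north = math.radians(lat_center_deg + res_deg / 2)
    return (EARTH_RADIUS_KM ** 2 * math.radians(res_deg)
            * (math.sin(lat_north) - math.sin(lat_south)))


def mal_area_km2(vector_layers, human, disease, top_lat_deg, res_deg):
    """Sum the area of cells where vector, human, and disease presence overlap.

    All grids are binary lists of rows on the same resolution and extent;
    None marks 'no data' and never counts as presence.
    """
    total = 0.0
    for r, (human_row, disease_row) in enumerate(zip(human, disease)):
        lat_center = top_lat_deg - (r + 0.5) * res_deg
        area = cell_area_km2(lat_center, res_deg)
        for c, (h, d) in enumerate(zip(human_row, disease_row)):
            any_vector = any(layer[r][c] == 1 for layer in vector_layers)
            if any_vector and h == 1 and d == 1:
                total += area
    return total


# Toy 2 x 2 grids at 1-degree resolution straddling the equator.
species_a = [[1, 0], [0, 0]]
species_b = [[0, 0], [0, 1]]
human = [[1, 1], [1, 1]]
disease = [[1, 1], [0, 1]]

one_species = mal_area_km2([species_a], human, disease, top_lat_deg=1.0, res_deg=1.0)
two_species = mal_area_km2([species_a, species_b], human, disease, 1.0, 1.0)
print(round(one_species), round(two_species))
```

In this toy case, adding the second species doubles the estimate because it occupies a cell the first does not. This is one simple way to express the idea that each vector species present contributes independently to the total area at risk.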
Results and Discussion
Mosquito occurrence is tied to the environment. For example, all mosquitoes have an aquatic larval stage, do not occur above a certain elevation, and need a minimum amount of solar radiation to develop. It is tempting to regard vector species as a homogeneous component of the environmental background to disease transmission. However, mosquito species can utilize different microclimates, vary in their biting behavior, and may have diverse life histories. An additional complication is that a species may be a primary vector in one area of its range but, for reasons that are unclear, may play a secondary role in other areas. It has been proposed that a more objective understanding of regional differences in the underlying force of malaria transmission can be attained by considering the properties intrinsic to the regionally most "dominant" vector species. We suggest that any attempt to understand the spatial distribution of vector-borne disease transmission will benefit from knowledge of the distribution and vectorial importance of the entire vector species fauna within an AOI. Thus, a goal of MosquitoMap is to provide awareness, not just of the potential extents of individual mosquito species, but of the implications for disease transmission of the combination of vector spatial distributions within an AOI.
The Mal-area concept emphasizes the independent but cumulative role of vector species in determining disease risk. Many other factors affect disease transmission but are not considered in the MAC, for example, health care coverage, host immunity, degree of vector control or insecticide-treated bed net usage, and insecticide and drug resistance. It is anticipated that mosquito and pathogen distribution models will vary in their accuracy and biological realism, which will affect estimates of disease risk. For example, vector ecological niche or habitat suitability models are often based on presence-only data, and are affected by the number of data points, selection of environmental layers, and the algorithms used to generate the predictions. Although the prediction of vector species spatio-temporal dynamics is possible, ecological niche models are usually constructed to predict static average yearly spatial extents, and potential presence rather than abundance. However, as models are improved and are made available on MosquitoMap, the biological realism of the MAC output should also improve. The MAC allows users, such as vector disease control workers, preventative medicine practitioners, and health planners, to see where the predicted transmission risk areas are located within the AOI, and quantifies this risk area for comparison with other AOIs. In the absence of other intelligence information, the MAC can provide a useful initial decision tool, affecting such things as: counselling for prophylaxis, choice of health messages, where best to locate personnel, the form of vector control, and the vector identification tools needed.
Point data in MosquitoMap could have a variety of uses including: informing medical entomologists about where mosquito collection efforts should be directed; identifying areas relevant to the study of mosquito biogeography, evolution and biodiversity; allowing predictions about the potential spread of exotic mosquito introductions; allowing predictions about the potential effects of global warming on mosquito distributions; allowing insights into mosquito community structure, and environmental and climatic correlates to species occurrence (ecological niche); allowing continent-wide rather than just local studies of vector-borne disease; and identifying cryptic evolutionary lineages that differ in geographic or ecological space.
MosquitoMap could be expanded to include other environmental layers relevant to mosquito distribution (e.g. coastal forest urban, forest fringe, rice irrigation) or vector-borne diseases (urban/rural, impregnated bed nets, war zone, drug resistance, border areas, immunological status, distance to health clinic). Some challenges common to disease mapping include the heterogeneity of data sources and difficulties in integrating data from MosquitoMap with other disease management and biodiversity systems. We hope to improve the functionality of MosquitoMap, and to use it as a template for other vectors of disease (e.g. sand flies, ticks and fleas).
We developed a Web-based spatial database of mosquito collection records and distribution models called MosquitoMap. An application within MosquitoMap, called the MAC, quantifies the area of overlap, for any AOI, of vectors, humans and disease. MosquitoMap and the MAC can be utilized by medical entomologists, vector disease control workers, preventative medicine practitioners, and health planners to determine what species have been collected where, and to estimate the Mal-area for vector-borne disease risk assessment. As more users submit records and distribution maps, the utility of these online resources will increase. Data on MosquitoMap are freely available and contributions are clearly sourced and acknowledged within MosquitoMap using appropriate citations provided by the contributor.
Funding for this work was provided by the Global Emerging Infections Surveillance and Response System, a Division of the Armed Forces Health Surveillance Center, and from the Global Biodiversity Information Facility. This research was performed under a Memorandum of Understanding between the Walter Reed Army Institute of Research and the Smithsonian Institution, with institutional support provided by both organizations. The opinions and assertions contained herein are those of the authors and are not to be construed as official or reflecting the views of the Department of the Army or the Department of Defense.
- White KE, Grodhaus G: Computer information retrieval system for California mosquito collection records. Calif Vector News. 1972, 19: 27-39.
- Faran ME, Burnett C, Crockett JJ, Lawson WL: A computerized mosquito information and collection management system for systematic research and medical entomology (Diptera: Culicidae). Mosq Syst. 1984, 16: 289-307.
- Foley DH, Weitzman AL, Miller SE, Faran ME, Rueda LM, Wilkerson RC: The value of georeferenced collection records for predicting patterns of mosquito species richness and endemism in the Neotropics. Ecol Entomol. 2008, 32: 12-23.
- Foley DH, Wilkerson RC, Rueda LM: Importance of the 'what', 'when', and 'where' of mosquito collection events. J Med Entomol. 2009, 46: 717-722. 10.1603/033.046.0401.
- Global Biodiversity Information Facility.
- Foley DH, Rueda LM, Peterson AT, Wilkerson RC: Potential distribution of two species in the medically important Anopheles minimus complex (Diptera: Culicidae). J Med Entomol. 2008, 45: 852-860. 10.1603/0022-2585(2008)45[852:PDOTSI]2.0.CO;2.
- Foley DH, Klein TA, Kim HC, Sames WJ, Wilkerson RC, Rueda LM: Geographic distribution and ecology of potential malaria vectors in the Republic of Korea. J Med Entomol. 2009, 46: 680-692. 10.1603/033.046.0336.
- Rogers DJ: Models for vectors and vector-borne diseases. Adv Parasitol. 2006, 62: 1-35. 10.1016/S0065-308X(05)62001-5.
- Peterson AT: Ecological niche modeling and understanding the geography of disease transmission. Vet Ital. 2007, 43: 393-400.
- Foley DH, Klein TA, Kim HC, Wilkerson RC, Rueda LM: Malaria risk assessment for the Republic of Korea based on models of mosquito distribution. AMEDD J. 2008, Apr-Jun: 46-53.
- ArcGIS Online Resource Centers.
- TDWG Wiki>DarwinCore.
- Systematic Catalog of Culicidae.
- Center for International Earth Science Information Network (CIESIN), Columbia University; International Food Policy Research Institute (IFPRI), the World Bank; and Centro Internacional de Agricultura Tropical (CIAT), 2004. Global Rural-Urban Mapping Project (GRUMP): Version 1 alpha data. Palisades, NY: CIESIN, Columbia University.
- Global Administrative Areas.
- Hearn P, Hare T, Schruben P, Sherrill D, Lamar C, Tsushima P: Global GIS Database: Digital Atlas of the Earth. USGS Digital Data Series DDS-62-H, Flagstaff, AZ. 2003.
- Guerra CA, Snow RW, Hay SI: Defining the global spatial limits of malaria in 2005. Adv Parasitol. 2006, 62: 157-179. 10.1016/S0065-308X(05)62005-2.
- Guerra CA, Snow RW, Hay SI: Mapping the global extent of malaria in 2005. Trends Parasitol. 2006, 22: 353-358. 10.1016/j.pt.2006.06.006.
- Malaria Atlas Project.
- ArcGIS ModelBuilder 9.3.
- Kiszewski A, Mellinger A, Spielman A, Malaney P, Ehrlich-Sachs S, Sachs J: A global index representing the stability of malaria transmission. Am J Trop Med Hyg. 2004, 70: 486-498.
- Pearson RG, Raxworthy CJ, Nakamura M, Peterson AT: Predicting species' distributions from small numbers of occurrence records: A test case using cryptic geckos in Madagascar. J Biogeogr. 2007, 34: 102-117. 10.1111/j.1365-2699.2006.01594.x.
- Peterson A, Martínez-Campos C, Nakazawa Y, Martínez-Meyer E: Time-specific ecological niche modeling predicts spatial dynamics of vector insects and human dengue cases. Trans R Soc Trop Med Hyg. 2005, 99: 647-655. 10.1016/j.trstmh.2005.02.004.
- Gao S, Mioc D, Anton F, Yi X, Coleman DJ: Online GIS services for mapping and sharing disease information. Int J Health Geogr. 2008, 7: 8. 10.1186/1476-072X-7-8.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.