Open-Source web-based geographical information system for health exposure assessment
© Evans and Sabel; licensee BioMed Central Ltd. 2012
Received: 26 September 2011
Accepted: 10 January 2012
Published: 10 January 2012
This paper presents the design and development of an open source web-based Geographical Information System allowing users to visualise, customise and interact with spatial data within their web browser. The developed application shows that by using solely Open Source software it was possible to develop a customisable web based GIS application that provides functions necessary to convey health and environmental data to experts and non-experts alike without the requirement of proprietary software.
Using maps to visualise data can enable quicker interpretation of complex geographical phenomena, help identify patterns, and aid planning and resource allocation for policy and decision making. Mapping, in the context of the Environment and Health sub-discipline, provides a visual assessment for investigating the spatial distribution of a disease and its potential associations and underlying causes. Developments in Geographical Information Systems (GIS) have now made the mapping of this information commonplace, and GIS are used in a large range of applications. Within the environment and health fields, recent GIS applications include identifying regions at risk of malaria, monitoring the effects of air pollution on asthmatics, and defining an "Index of Relative Wellbeing" for an area from census data.
The research presented here reports on results from the European Union (EU) FP6 funded Health and Environment Integrated Methodology and Toolbox for Scenario Assessment (HEIMTSA) project. The project's overall goal was to support the EU's Environment and Health Action Plan (EHAP) by extending health impact assessment (HIA), coupled with cost benefit analysis (CBA) methods and tools, to evaluate the impacts that policy scenarios have (at a European level) on the environment and human health. HEIMTSA was structured around loosely coupled modules that deal with pollutants and their media using the full chain (impact pathway) approach. The modelling tools and framework developed within this project were focussed on exploring:
Emissions to environmental media ('stressor identification'), derived from sector scenarios in transport, energy, agriculture, industry, households and waste treatment and disposal, that are combined and harmonized to result in consistent scenarios for all relevant stressors for the whole of Europe;
Human exposures (e.g. outdoor and indoor air pollution, water, noise, odour, metals, dioxins) by multiple routes, estimated, using new methods (exposure scenarios and probabilistic modelling), including consumer exposure;
Health risk functions, derived, with new methods for: effects of combined exposures; estimating background rates; and mapping health impacts, to aid in communication of results; and
Monetary valuation including a review of methods for valuating children's health, developing values for relevant health endpoints, and extending the valuation paradigm to include altruism.
These methodological topics were integrated within an online modelling toolbox where users could request models to be implemented based on specific inputs and later download the results. A drawback of this approach was that it required the user to know how to interpret the results and also have appropriate software to visualise their data.
With communication and data sharing defined as underpinning elements that aid in dealing with cross-border health and environmental issues, HEIMTSA set out to improve communication by making (spatial) data and information easy to access and view online for expert and non-expert users alike. The goal of this part of the project was to allow users anywhere in the world to look at and interact with their model results in an online environment and to share their findings with others. Commercial software designed to use this type of data online already exists, but it often comes with a high initial set-up cost and ongoing licensing fees, and requires some degree of technical knowledge to operate. To overcome these problems, HEIMTSA needed to provide a way for users to visualise their data freely online that was also simple to use. For this purpose an open source web GIS platform was chosen that would enable users to access, visualise and interact with their data online within a web browser.
A Web Map Service (WMS) renders geo-referenced "information" as digital image files, providing static maps to clients;
A Web Feature Service (WFS), unlike WMS, returns the actual features (vector layers). Rendering the requested data therefore takes noticeably longer than rendering WMS map images, depending on file size. WFS also allows clients to edit stored layers; and
A Web Coverage Service (WCS) supports electronic retrieval of geospatial data as "coverages" (digital geospatial information representing space-varying phenomena). The WCS returns the original geo-referenced data together with its associated attribute data, thus providing opportunities for data exploration and interpretation.
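To make the WMS case concrete, a map image is requested through a GetMap call whose parameters are carried in the URL. The following is a minimal sketch, not the project's code: the server address, the layer name `heimtsa:model_output` and the bounding box are hypothetical, and WMS version 1.1.1 is assumed.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_getmap_url(base_url, layer, bbox, width=768, height=512,
                     srs="EPSG:4326", image_format="image/png"):
    """Return a WMS 1.1.1 GetMap URL for a single layer."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",  # empty value asks the server for the layer's default style
        "SRS": srs,
        "BBOX": ",".join(str(v) for v in bbox),  # minx,miny,maxx,maxy
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": image_format,
    }
    return base_url + "?" + urlencode(params)


# Hypothetical server, layer and extent (roughly Europe), for illustration only.
url = build_getmap_url(
    "http://localhost:8080/geoserver/wms",
    "heimtsa:model_output",
    (-25.0, 34.0, 45.0, 72.0),
)
query = parse_qs(urlsplit(url).query, keep_blank_values=True)
```

The response to such a request is a single rendered image, which is why WMS is lighter for the client than WFS, where the vector features themselves are transferred.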
In recent years there have been many systems utilising alternative open source web GIS methods. Previous developments in open source include a GIS Enabled Cancer Atlas as a means of providing users the ability to visualise and interact with data relating to the distribution of cancer across the state of Pennsylvania; a mapping system that looked at sharing biodiversity information; a support tool for assessing the implementation of cross-border and global health spatial information systems (CBHSIS) across the US-Mexico border; and, more recently, a Flash-based web GIS solution developed in Australia that enables users to map their own data online via importing tables.
There are some underlying difficulties associated with developing in open source when compared to closed source products bought "off the shelf". Open source applications often require greater amounts of time and computational expertise, and the professional support available often depends on the maturity of the software and the size of the user community. These difficulties aside, however, it is the low cost and customisability of open source that still makes it an appealing alternative for many users.
This paper specifically documents the design, development and implementation of a new web based GIS/spatial visualisation tool built from a combination of open source software packages as a support tool for the health and environment sectors. The tool provides the ability to access model results online and visually explore data in web browsers without the need of additional software. The advantage of developing an entirely new system is that it can be designed to: closely fit the task, be user friendly, and require no expert knowledge of GIS to operate.
The online modelling tools previously produced as part of the HEIMTSA work outputted their results in a textual/tabular format which were downloaded by the user and visualised utilising their own software. The issue with this approach was that it limited the distribution of information as it required the user to have the appropriate software to visualise the data and also have the skill set to do so. To solve this issue, a rudimentary GIS was proposed to allow users the ability to spatially visualise and interact with their data solely within a web environment.
Basic system design
Visualisation tool features
Link to Model outputs
View uncertainty (dual mapping)
Generate chart information
Style data on the fly
Get additional information by selecting/querying the data
To make the system user-friendly, it was designed so that a user with little to no knowledge of GIS would be able to visualise their data. In the simplest scenario, the user clicks a single button to visualise their data spatially and immediately gain some insight into their results, following an exploratory spatial data analysis paradigm.
For the geodatabase, two popular open source database management systems (DBMS) were considered: MySQL and PostgreSQL (coupled with PostGIS). Both of these DBMS incorporate powerful spatial support, but in terms of spatial functionality PostgreSQL-PostGIS appeared to offer a larger range, thus allowing greater potential for expanding the developed web GIS software in the future. This potential, coupled with PostgreSQL-PostGIS being chosen by many recent web GIS solutions ([23, 14, 15, 24]), led the authors to choose this geodatabase for this project.
The map server makes it possible to access and display the spatially enabled content of the geodatabase and enables querying and analysis of the displayed data. It works by storing data as tables in the database that can later be viewed as layers of a map. As with the geodatabase solutions, two open source map servers were considered: MapServer and GeoServer. After reviewing previous studies and discussion forums, it was deemed that the functionality of these two packages was quite similar and that both would provide the level of functionality required for the project. GeoServer was subsequently chosen because the developer preferred its web administrator tool as an aid in the testing and development of the web GIS system.
Interpret model outputs from the pre-existing toolbox;
Visualise and interact with data; and
Incorporate additional features.
The overall system development is discussed in greater detail in the following section.
Publish new model outputs (green path);
Display and Interact with data (orange path); and
Style data on the fly (red path)
Each of these pathways represents an essential process that takes place in the system. The basic principle is that all new data must go through the green path for preparation, the orange path for visualisation and interaction, and the red path for styling and customisation. The following section details the functions of each process/pathway.
To visualise any health and/or environment model output in the web viewer, it must first be assigned geographical/spatial attributes and published within the map server (GeoServer). The outputs from the modelling tool tested in this paper are in comma-separated values (CSV) format and come in two versions: point data and country level polygon data. The point data's resolution is determined by the model, and its locations are given as unprojected latitude and longitude values. The country level data only has "country code" information that relates to its location and does not possess any geometric information at this stage. Within the geodatabase is a pre-existing country level polygon dataset; this dataset already contains spatial and geometric information and shares the same "country code" key field for each country polygon as that used in the outputs from the modelling tool. Using the "country codes" as a relational join field, the two datasets can be joined together in the geodatabase to enable spatial visualisation.
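The country code join can be illustrated with a small sketch. The project itself uses PostgreSQL-PostGIS; here Python's built-in `sqlite3` module stands in for the geodatabase so that the example is self-contained. The table names, column names, values and the WKT placeholder geometries are all hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical model output: country code plus one modelled value, no geometry.
MODEL_CSV = """country_code,value
FR,12.5
DE,14.1
XX,3.0
"""

conn = sqlite3.connect(":memory:")

# Stand-in for the pre-existing country polygon dataset (geometry kept as WKT text).
conn.execute("CREATE TABLE countries (country_code TEXT PRIMARY KEY, geom_wkt TEXT)")
conn.executemany(
    "INSERT INTO countries VALUES (?, ?)",
    [
        ("FR", "POLYGON((0 0, 1 0, 1 1, 0 0))"),  # placeholder shape, not France
        ("DE", "POLYGON((2 2, 3 2, 3 3, 2 2))"),  # placeholder shape, not Germany
    ],
)

# Load the CSV rows into their own table.
conn.execute("CREATE TABLE model_output (country_code TEXT, value REAL)")
rows = [
    (record["country_code"], float(record["value"]))
    for record in csv.DictReader(io.StringIO(MODEL_CSV))
]
conn.executemany("INSERT INTO model_output VALUES (?, ?)", rows)

# Relational join on the shared key gives each value a geometry.
joined = conn.execute(
    "SELECT m.country_code, m.value, c.geom_wkt "
    "FROM model_output AS m "
    "JOIN countries AS c ON m.country_code = c.country_code "
    "ORDER BY m.country_code"
).fetchall()
```

With an inner join, a row whose code has no matching polygon (the made-up `XX` above) is dropped from the result; whether such rows should instead be reported to the user is a design decision the sketch does not address.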
Once the data has been uploaded to the geodatabase and has spatial attributes, it needs to be "published" within GeoServer. Publishing the data allows GeoServer to interpret it and render it at a later stage within the web viewer. The information required by GeoServer includes the bounding box of the data (the extent), the projection of the data, and a default style to be applied. Publishing is commonly done via the web administrator tool within GeoServer. This, however, is impractical in our online application, as it would require the user to access the web admin tool, reference the data in the geodatabase and define parameters manually, which is both time consuming and too complex for inexperienced users. To avoid this, we implemented a command procedure (cURL) on the server side which forces GeoServer to publish the data automatically without going through GeoServer's web admin tool. In terms of coding, this is simplified because both the projection and bounding box are constant for all the model outputs in this project, and a default style can be chosen depending on whether the data is in point or polygon format. Once this process is completed, the data is ready to be visualised and queried in the web viewer.
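The paper does not list the cURL commands used, so the following is only a sketch of what automated publishing could look like through GeoServer's REST interface. It is written in Python rather than cURL, and it builds the two HTTP requests without sending them. The workspace, datastore, table name and credentials are hypothetical, and the style names `point` and `polygon` are assumed to exist on the server.

```python
import base64
from urllib.request import Request
from xml.sax.saxutils import escape


def default_style(geometry_kind):
    """Pick a default style name by geometry type (style names are assumptions)."""
    return {"point": "point", "polygon": "polygon"}[geometry_kind]


def publish_requests(base_url, workspace, datastore, table, style, user, password):
    """Build (but do not send) requests that publish a table and set its style."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    headers = {"Content-Type": "text/xml", "Authorization": "Basic " + token}

    # 1. Register the database table as a feature type (layer) in the datastore.
    create = Request(
        f"{base_url}/rest/workspaces/{workspace}/datastores/{datastore}/featuretypes",
        data=(
            f"<featureType><name>{escape(table)}</name>"
            "<srs>EPSG:4326</srs></featureType>"  # constant projection assumed
        ).encode(),
        headers=headers,
        method="POST",
    )

    # 2. Assign the default style to the newly published layer.
    set_style = Request(
        f"{base_url}/rest/layers/{workspace}:{table}",
        data=(
            f"<layer><defaultStyle><name>{escape(style)}</name>"
            "</defaultStyle></layer>"
        ).encode(),
        headers=headers,
        method="PUT",
    )
    return create, set_style


# Hypothetical names throughout; nothing is sent over the network here.
create, set_style = publish_requests(
    "http://localhost:8080/geoserver", "heimtsa", "geodb",
    "model_42", default_style("polygon"), "admin", "secret",
)
```

Passing each request to `urllib.request.urlopen` would execute it against a running server; that step is left out so the sketch has no side effects.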
From a system operational standpoint, when the user selects a model for viewing, the aforementioned processes take place followed by the launching of the web viewer page. From here the system moves on to the orange pathways shown in Figure 2 to render the data and allow interaction.
Although WFS formats allow for greater levels of interaction on the user side, such as digitising and editing data, the WMS (image) formats can still be queried and interacted with to some degree as they are georeferenced by the map server. As the visualisation tool for this project was primarily designed for viewing and basic querying, a WMS approach for rendering data in the web viewer was selected. The WMS request renders a georeferenced image of the data in the web viewer which can then be interacted with.
It is envisioned that the users of this system may wish to compare multiple datasets from previous models within one session. The model list in this region is directly linked to the central geodatabase where all previous model output data is held. The models available in the list are unique to each user, as what is displayed in Region A is filtered according to their login credentials.
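Querying a WMS image works through a GetFeatureInfo request, which repeats the map parameters and adds the pixel position that was clicked. The sketch below assumes WMS 1.1.1; the server address, layer name, extent and click position are hypothetical, and `application/json` is assumed to be an info format the server offers.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_getfeatureinfo_url(base_url, layer, bbox, width, height, x, y,
                             srs="EPSG:4326", info_format="application/json",
                             feature_count=1):
    """Return a WMS 1.1.1 GetFeatureInfo URL for a click at pixel (x, y)."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetFeatureInfo",
        "LAYERS": layer,
        "QUERY_LAYERS": layer,  # layers whose attributes should be returned
        "STYLES": "",
        "SRS": srs,
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,    # size of the map image the user clicked on
        "HEIGHT": height,
        "FORMAT": "image/png",
        "INFO_FORMAT": info_format,
        "FEATURE_COUNT": feature_count,
        "X": x,  # pixel column of the click, measured from the left edge
        "Y": y,  # pixel row of the click, measured from the top edge
    }
    return base_url + "?" + urlencode(params)


# Hypothetical click in the middle of a 768 x 512 map image.
info_url = build_getfeatureinfo_url(
    "http://localhost:8080/geoserver/wms", "heimtsa:model_output",
    (-25.0, 34.0, 45.0, 72.0), 768, 512, 384, 256,
)
info_query = parse_qs(urlsplit(info_url).query, keep_blank_values=True)
```

Because the server knows the extent and size of the image, it can translate the pixel position back into map coordinates and return the attributes of the feature at that location.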
Being able to symbolise and customise the appearance of the data is one of the key features of this interactive visualisation tool; this allows the client to emphasise characteristics of their data. Symbolising geographical data can be complicated as the same data can be mapped as a choropleth, graduated symbol, dot, or surface/isopleth map ([34, 35]). The choice of mapping method employed is related to the purpose of the map, the original data and how best to represent it. The choropleth map is a common mapping technique for areal data such as districts and counties. This technique is clearly appropriate when values of a phenomenon change abruptly at enumeration unit boundaries.
The basic styling of data within GeoServer is governed by a "Styled Layer Descriptor" (SLD) file; this is a static file which has to be written manually or via an external software package. The SLD file contains information about what symbols, colours and other attributes are associated with specific or a defined range of values.
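Styling "on the fly" amounts to generating this SLD content from user-chosen class breaks instead of writing it by hand. The sketch below is one way this could be done and is not the project's styling server: the layer name, attribute name, breaks and colours are hypothetical, and only polygon fills are handled.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

SLD_NS = "http://www.opengis.net/sld"
OGC_NS = "http://www.opengis.net/ogc"


def build_sld(layer, attribute, breaks, colours):
    """Return SLD 1.0.0 text with one polygon rule per class."""
    if len(breaks) != len(colours) + 1:
        raise ValueError("need exactly one more break than colours")
    prop = f"<ogc:PropertyName>{escape(attribute)}</ogc:PropertyName>"
    rules = []
    for i, colour in enumerate(colours):
        # The last class is closed on the right so the maximum value is included.
        is_last = i == len(colours) - 1
        upper = "PropertyIsLessThanOrEqualTo" if is_last else "PropertyIsLessThan"
        rules.append(
            "<Rule><ogc:Filter><ogc:And>"
            f"<ogc:PropertyIsGreaterThanOrEqualTo>{prop}"
            f"<ogc:Literal>{breaks[i]}</ogc:Literal>"
            "</ogc:PropertyIsGreaterThanOrEqualTo>"
            f"<ogc:{upper}>{prop}"
            f"<ogc:Literal>{breaks[i + 1]}</ogc:Literal></ogc:{upper}>"
            "</ogc:And></ogc:Filter>"
            "<PolygonSymbolizer><Fill>"
            f'<CssParameter name="fill">{escape(colour)}</CssParameter>'
            "</Fill></PolygonSymbolizer></Rule>"
        )
    return (
        f'<StyledLayerDescriptor version="1.0.0" xmlns="{SLD_NS}" '
        f'xmlns:ogc="{OGC_NS}">'
        f"<NamedLayer><Name>{escape(layer)}</Name><UserStyle><FeatureTypeStyle>"
        + "".join(rules)
        + "</FeatureTypeStyle></UserStyle></NamedLayer></StyledLayerDescriptor>"
    )


# Three hypothetical classes over the attribute "value".
sld = build_sld("model_output", "value", [0, 10, 20, 30],
                ["#ffffcc", "#fd8d3c", "#bd0026"])
root = ET.fromstring(sld)  # parsing confirms the output is well-formed XML
rule_count = len(root.findall(f".//{{{SLD_NS}}}Rule"))
```

Each rule pairs a value range with a fill colour, so changing the classification method or colour ramp only requires regenerating the text and asking the map server to redraw.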
Equal Interval: Divides the distribution of values (max - min) into equal ranges.
Quantile: Creates intervals so that each class has an equal proportion of the sample.
Natural (Jenks) Breaks: Classes are defined according to apparently natural groupings of data values.
Standard Deviation: Shows the distance of an observation from the mean. It calculates the mean value and generates class breaks in standard deviation measures above and below it.
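Three of these classification schemes can be sketched in a few lines using only the Python standard library. This is an illustrative implementation under simple assumptions (population standard deviation, inclusive quantiles), not the code used by the styling server; natural (Jenks) breaks are omitted because they require an optimisation step.

```python
import statistics


def equal_interval_breaks(values, k):
    """k classes of equal width between the minimum and maximum."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / k
    return [lo + i * step for i in range(k + 1)]


def quantile_breaks(values, k):
    """k classes that each hold roughly the same number of observations."""
    ordered = sorted(values)
    cuts = statistics.quantiles(ordered, n=k, method="inclusive")
    return [ordered[0]] + cuts + [ordered[-1]]


def std_dev_breaks(values, n_sd=2):
    """Breaks at whole standard deviations below and above the mean."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [mean + i * sd for i in range(-n_sd, n_sd + 1)]


# A small, evenly spread sample; skewed data would separate the methods more.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9]
ei = equal_interval_breaks(sample, 4)
qu = quantile_breaks(sample, 4)
sd_breaks = std_dev_breaks(sample, n_sd=1)
```

For evenly spread data such as this sample, equal interval and quantile breaks coincide; for skewed data such as pollutant concentrations, quantiles place more class boundaries where observations are dense, which changes the appearance of the resulting map.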
If the user requires any assistance with the visualisation tool or wishes to find out details, a "Contact" button is included on the upper toolbar of the viewer. This will open a new window where the user can see development revisions. The client can also email the administrator comments and suggestions via the embedded message box. This is especially useful for continuing Open Source development.
Once an external model is completed and its data is ready to be visualised, a button and a link become active on the associated web page viewed on the client side. When the user clicks on the button or link, an SQL statement is generated and the process of publishing data (green path in Figure 2) begins. Upon completion of the publishing task, the software launches the visualisation web page and opens the viewing tool whilst automatically switching to the "Display and interact" (orange path) task to render the data. The model data that is to be rendered then needs to be styled, so upon rendering a request is sent back to the Geothematic SLD server (red path) to style the data and re-draw it in the web viewer. This complex network of functions is hidden from the user and the entire process occurs within a short timeframe. From a single click, the user has gone from having tabular data to having a spatial representation of their data that they can now interact with.
The development of this Open Source system was carried out over a nine month period. To achieve this within the short timeframe available, the functionality of the system was mainly limited to what were deemed "essential" features.
As the data being visualised by the system is potentially confidential or sensitive, restrictions on access were implemented. In order to access the external modelling tool, each user must first enter their login credentials (username and password). The web visualisation tool described here has been designed so that the username acts as a filter, making available only the layers that belong to that individual user.
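The principle of filtering layers by username can be sketched as follows. How the real system stores layer ownership is not described in this paper, so the `Layer` record, its `owner` field and the example names are purely hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    name: str   # published layer name
    owner: str  # username of the person who ran the model (assumed field)


def layers_for_user(layers, username):
    """Return only the names of layers owned by the given user."""
    return [layer.name for layer in layers if layer.owner == username]


# Hypothetical catalogue of published model outputs.
catalogue = [
    Layer("arsenic_points", "alice"),
    Layer("arsenic_countries", "alice"),
    Layer("noise_countries", "bob"),
]
visible = layers_for_user(catalogue, "alice")
```

A filter of this kind only controls what the viewer lists; protecting the underlying map services themselves would need to be enforced on the server as well.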
The work undertaken has further highlighted the potential of open source web GIS as a viable alternative to commercial software, enabling environment and health data to be viewed and interrogated freely anywhere in the world via a simple web browser, on either a desktop or a mobile device. As it is an open source development, and can thus be freely modified, it is hoped that with little modification this software could be used by others as a basis for visualising their own data online in a web browser, thus improving access to their data. This development acts as a technology equaliser for economically restricted health and environmental organisations, particularly in developing countries, where the cost of implementing a system (e.g. for monitoring a disease outbreak, or dealing with waste removal from disaster relief camps) would otherwise be heavily restrictive. One of the key features of this system is that no specialist software is required on the user's computer. A low-end budget computer with internet access can use this software, as the bulk of the system processes are carried out on the server and not the client side. The panel-based front end, created using the GeoExt and Ext libraries, means that the overall application is portable and can thus be embedded within any web page. This makes it easy to distribute the application back to the open source community for others to use as a starting point and to improve and develop further. By documenting in this paper the detailed development process we followed, it is our hope that the problems we encountered and the solutions we implemented can assist future open source development.
Having described in detail and documented the development process in this paper, it is considered useful to add a brief example of the type of analysis that is likely to be conducted through this application. We take the example of the data shown in Figures 4 and 7, of Arsenic concentrations across Europe, and associated health impacts. Suppose a policy maker was interested in testing alternative scenarios to control arsenic emissions in an effort to reduce adverse health impacts (such as lung cancer) across Europe. One can model anticipated depositions using external models, but using our application, the results can be graphically visualised, explored and spatially data-mined. This adds an extra dimension to simply presenting results in a tabular or written form. We would see, for example as shown in Figure 4, that France and Germany have high potential health impacts due to arsenic contamination. We could drill down into either France or Germany to explore alternative attributes and map those, or alter class boundaries to re-visualise the country level data to more closely examine the spatial patterns. Using our second pop-up window to reveal uncertainties in our data, we would then be able to judge how confident we were in our modelled results. Based on these analyses, it is then the intention that the policy maker could make a more informed decision about what level of Arsenic concentrations are acceptable, what the health and monetary costs of these decisions are, and how they vary spatially across Europe.
The spatial visualisation tool created for this project is now live and is hosted at the following website: http://heimtsa.jrc.ec.europa.eu/heimtsatb/. Users will have to register before accessing the system. Although the project initially set out to be a web based GIS, the level of functionality required led the system to be more akin to a spatial data explorer/visualiser. Future development, for a different project, will build on the system described here and expand its functionality to include greater analytical capabilities. Ongoing discussion with key public policy stakeholders will ensure that the system meets users' requirements and is thoroughly tested.
The prototype visualisation tool developed here successfully enables users to spatially visualise model results in real time within a web browser, without the need for any additional software, or software training. The complex workings of the system are hidden from the user and the automatic rendering design used in this system enables users with no prior knowledge of GIS to visualise their data and immediately gain some understanding of the spatial structure of their data.
Compared with commercial closed source software, open source is more complicated to implement initially. Although there is no dedicated customer support service, the support and advice provided by users in the open source community through forums and mailing lists is extensive, and there is a large community devoted to helping and sharing ideas, which can inspire "out of the box" thinking on solutions that may not be possible in closed source applications.
Project funded by the European Commission within the Sixth Framework Programme (2002-2006). Contract number GOCE-CT-2006-036913-2.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.