Web GIS in practice VI: a demo playlist of geo-mashups for public health neogeographers
© Boulos et al; licensee BioMed Central Ltd. 2008
Received: 6 July 2008
Accepted: 18 July 2008
Published: 18 July 2008
'Mashup' was originally used to describe the mixing together of musical tracks to create a new piece of music. The term now refers to Web sites or services that weave data from different sources into a new data source or service. Using a musical metaphor that builds on the origin of the word 'mashup', this paper presents a demonstration "playlist" of four geo-mashup vignettes that make use of a range of Web 2.0, Semantic Web, and 3-D Internet methods, with outputs/end-user interfaces spanning the flat Web (two-dimensional – 2-D maps), a three-dimensional – 3-D mirror world (Google Earth) and a 3-D virtual world (Second Life ®). The four geo-mashup "songs" in this "playlist" are: 'Web 2.0 and GIS (Geographic Information Systems) for infectious disease surveillance', 'Web 2.0 and GIS for molecular epidemiology', 'Semantic Web for GIS mashup', and 'From Yahoo! Pipes to 3-D, avatar-inhabited geo-mashups'. It is hoped that this showcase of examples and ideas, and the pointers we are providing to the many online tools that are freely available today for creating, sharing and reusing geo-mashups with minimal or no coding, will ultimately spark the imagination of many public health practitioners and stimulate them to start exploring the use of these methods and tools in their day-to-day practice. The paper also discusses how today's Web is rapidly evolving into a much more intensely immersive, mixed-reality and ubiquitous socio-experiential Metaverse that is heavily interconnected through various kinds of user-created mashups.
GIS (Geographic Information Systems and Science) have always shared many of the foundational ethea (plural of ethos) of Web 2.0 (even before the latter became known as a distinct entity), namely data sharing, remixing and repurposing, and collaboration. GIS enable remixing and repurposing of data by "mashing up" various data and map layers or themes from multiple sources into one study/map, with multiple layers covering the same locations superimposed like the skins of an onion. And now, with the advent of Web 2.0 technologies, the democratization and participatory nature of GIS have never been more possible or powerful. Neogeography and the GeoWeb 2.0 have been born and unleashed for use by the masses [2, 3]!
'Mashup' was originally used to describe the mixing together of musical tracks to create a new piece of music. The term now refers to Web sites or services that weave data from different sources into a new data source or service. Mashups are becoming increasingly widespread, especially in the context of combining geographic data and displaying such integrated data on maps. Web-based mapping applications like Google Maps and Google Earth allow multiple independently generated datasets encoded using the Keyhole Markup Language (KML) format to be mixed and displayed via a two-dimensional – 2-D map (or three-dimensional – 3-D globe in the case of Google Earth). The latest offerings from Google Maps, namely My Maps and Mapplets, have made it possible for anyone to create and share their own interactive online maps with just a few mouse clicks and no (or almost no) coding at all! (With Google Mapplets, anyone can tap into, remix and reuse third-party mini-applications for Google Maps (known as Mapplets) from a rapidly expanding catalogue maintained by Google, to create and share even more powerful personal maps.)
Many scientists have also utilized these technologies for research purposes . For example, Nature has created its own geo-mashup using Google Earth for tracking avian-flu outbreaks , and HealthMap , developed by the Children's Hospital Informatics Program in Boston, brings together disparate data sources within Google Maps to achieve a unified and comprehensive view of the current global state of infectious diseases and their effect on human and animal health. This freely available Web site integrates outbreak data of varying reliability, ranging from news sources (e.g., Google News ) to curated personal accounts (e.g., ProMED ) to more valid alerts (e.g., World Health Organization ). Other public health mashup work can be browsed at . These examples represent a class of Web-based neogeography applications that combine the complex techniques of cartography and Geographic Information Systems (GIS) and place them within reach of users . The benefit of such easy-to-use GIS applications is evident in an increasing diversity and quantity of publicly available geocoded health data and a growing interest in using GIS and other Web-based tools for mashup of public health data.
It is therefore not surprising but rather commendable that the UK government has recently launched a data mashup competition to find innovative ways of using the masses of data it collects . The government is hoping to find new uses for public information in the areas of criminal justice, health and education, and is opening up gigabytes of information for this purpose from a variety of sources like mapping information from Britain's Ordnance Survey, medical information from the NHS (National Health Service), and neighbourhood statistics from the Office for National Statistics. (None of the data is personal information.)
Over the last few years, advances in genomic sequencing and translational science have increased the complexity and magnitude of research data, and with them the need for complex mashup applications. One possible solution is Web 2.0, a term that describes the rising global trend in use of World Wide Web technology and Web design in the past few years, and represents applications that aim to enhance creativity, information sharing, and collaboration among users. Web 2.0 comprises online services that promote interaction between users and cooperative development of Web resources [1, 19]. These technologies, tools, and sites can be broadly categorized as follows:
The contents provided by different Web sites are organized and displayed in many different ways. The traditional approach to extracting Web content and reformatting it is to write specific screen-scraping programs to extract content from specific sites. This approach is not scalable given the high degree of heterogeneity involved. Also, it requires a significant amount of programming effort. To address this, tools such as Dapper provide the user with the ability to visually map the Web content to a particular structure. In addition, these tools allow the extracted content to be output in different formats such as RSS (Really Simple Syndication – described in ). These tools ease the effort of content extraction and formatting over the Web.
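To make the idea of "extracted content in a common structure" concrete, here is a minimal Python sketch that reads an RSS 2.0 feed into a list of plain records. It is not how Dapper works internally; the sample feed, its URLs and the function name `parse_rss_items` are all hypothetical and serve only as an illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed; in practice this text would be fetched from a feed URL.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example outbreak news</title>
    <item>
      <title>Suspected WNV case reported</title>
      <link>http://example.org/news/1</link>
      <description>First report of the season.</description>
    </item>
    <item>
      <title>Mosquito pool tests positive</title>
      <link>http://example.org/news/2</link>
      <description>Routine surveillance result.</description>
    </item>
  </channel>
</rss>"""


def parse_rss_items(rss_text):
    """Return one dict per <item>, holding its title, link and description."""
    root = ET.fromstring(rss_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "description": item.findtext("description", default=""),
        })
    return items


if __name__ == "__main__":
    for record in parse_rss_items(SAMPLE_RSS):
        print(record["title"], "->", record["link"])
```

Once every source is reduced to records of this kind, downstream steps (merging, filtering, mapping) no longer need to know how each original site laid out its pages.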
To facilitate mashup of data provided by different sites in different formats, tools such as Yahoo! Pipes have been developed to allow users to graphically create a Pipe or workflow to connect data including those generated by other tools like Dapper. Such tools can directly accept data in different formats and integrate them. The integrated data can be formatted in different ways for analysis purposes.
Once multiple datasets are parsed or integrated in a common format, tools are available for visualizing data in an integrated fashion. For example, Yahoo! Pipes can be used to integrate and format geo-referenced data into the KML format for visualization by Google Maps or Google Earth.
One important aspect of Web 2.0 is data sharing and community collaboration. For example, Dapper and Yahoo! Pipes both contain collaboration forums in which users can view and utilize the work of others. In the context of GIS, Web 2.0 sites such as GeoCommons allow geo-referenced data (e.g., KML files) to be tagged, shared, reused, and remixed.
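For readers who prefer to see what the KML output of such a step looks like, the following Python sketch turns a list of geo-referenced records into a small KML document. The record schema (`name`, `lat`, `lon`) and the function name `records_to_kml` are assumptions made for this example; a visual tool like Yahoo! Pipes performs the equivalent step without any code.

```python
import xml.etree.ElementTree as ET

# Namespace of the OGC KML 2.2 standard.
KML_NS = "http://www.opengis.net/kml/2.2"


def records_to_kml(records):
    """Build a KML string with one Placemark per record.

    Each record is a dict with 'name', 'lat' and 'lon' keys (hypothetical schema).
    """
    ET.register_namespace("", KML_NS)  # serialise KML as the default namespace
    kml = ET.Element(f"{{{KML_NS}}}kml")
    doc = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    for rec in records:
        placemark = ET.SubElement(doc, f"{{{KML_NS}}}Placemark")
        ET.SubElement(placemark, f"{{{KML_NS}}}name").text = rec["name"]
        point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
        # KML orders coordinates as longitude,latitude (the reverse of common usage).
        coords = ET.SubElement(point, f"{{{KML_NS}}}coordinates")
        coords.text = f"{rec['lon']},{rec['lat']}"
    return ET.tostring(kml, encoding="unicode")


if __name__ == "__main__":
    sample = [{"name": "Site A", "lat": 41.3, "lon": -72.9}]
    print(records_to_kml(sample))
```

The resulting text can be saved with a `.kml` extension and opened in Google Earth. The longitude-first ordering is the detail most often got wrong when KML is written by hand.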
These Web 2.0 technologies, tools, and services, in conjunction with neogeography applications such as Google Maps and Google Earth, can support public health research, including infectious disease surveillance and molecular epidemiology. They reduce the onus on the public health expert to write complex programming code to perform data integration. They also promote data sharing and community collaboration. Whether the purpose is to analyze historical trends of data over time or to detect disease anomalies in real time, Web 2.0 technology can easily integrate numerical and spatial data for public health decision support.
Using a musical metaphor that builds on the origin of the word 'mashup', this paper will present a demonstration "playlist" of four practical sets of geo-mashup examples and ideas that make use of a range of Web 2.0, Semantic Web, and 3-D Internet methods, with outputs/end-user interfaces spanning the flat Web (2-D maps), a 3-D mirror world (Google Earth) and a 3-D virtual world (Second Life ®).
Another resource for mashup of molecular data is the Mesquite Project. The modular system promotes collaboration among scientists to develop their own programs or modules and then upload the modules for other programmers to utilize and enhance. This mashup approach enables modules to be attached to other modules for creation of a hybrid module. There is great potential for GIS to be included as a module for Mesquite in much the same way that TreeBase II presents trees within GIS.
Since our demo "playlist" was created based on the original concept of "musical mashup", it is also very possible for these separately-composed "songs" to be remixed; for example, different species of mosquitoes carrying the WNV (see "Mashup song #1" above) can be queried semantically (ontologically – see "Mashup song #3" below), studied using the geo-phylogenetic tree ("Mashup song #2"), and visualized/interacted with in an avatar-inhabited 3-D virtual world environment (see "Mashup song #4" below).
Despite the emergence of Web 2.0 tools like Yahoo! Pipes and standard geo-data formats like KML, the task of identifying and integrating datasets of interest must still be done manually by people. 'Semantic mashup' is a concept in which computers help humans discover and integrate data. A semantically enriched, machine-readable format is needed for implementing the vision of semantic mashup. GeoRSS (an extension of RSS) is a step in this direction. While a regular RSS feed is used to describe feeds (channels) of Web content such as news articles, Web content that includes geographical elements such as latitudes and longitudes can be described using GeoRSS. Like RSS feeds that are consumed by feed readers and aggregators, GeoRSS feeds are designed to be consumed by geographic software such as map generators.
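As an illustration of what "consuming" a GeoRSS feed involves, the sketch below reads the GeoRSS-Simple `georss:point` element (latitude, then longitude, separated by a space) from each item of an RSS feed. The sample feed, its URLs and the approximate coordinates are hypothetical, as is the function name `geocoded_items`.

```python
import xml.etree.ElementTree as ET

# Namespace used by GeoRSS-Simple elements such as <georss:point>.
GEORSS_NS = {"georss": "http://www.georss.org/georss"}

# Hypothetical feed; the coordinates are approximate and for illustration only.
SAMPLE_GEORSS = """<rss version="2.0" xmlns:georss="http://www.georss.org/georss">
  <channel>
    <title>Example geo-tagged alerts</title>
    <item>
      <title>Alert near Embrun</title>
      <link>http://example.org/alerts/1</link>
      <georss:point>44.5667 6.5</georss:point>
    </item>
    <item>
      <title>Alert without a location</title>
      <link>http://example.org/alerts/2</link>
    </item>
  </channel>
</rss>"""


def geocoded_items(feed_text):
    """Return only the items that carry a georss:point, with lat/lon as floats."""
    root = ET.fromstring(feed_text)
    located = []
    for item in root.iter("item"):
        point = item.findtext("georss:point", namespaces=GEORSS_NS)
        if point is None:
            continue  # an ordinary RSS item with no location attached
        lat_text, lon_text = point.split()
        located.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "lat": float(lat_text),
            "lon": float(lon_text),
        })
    return located


if __name__ == "__main__":
    print(geocoded_items(SAMPLE_GEORSS))
```

Items without a location are simply skipped here; a fuller pipeline might instead pass them to a geocoder.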
GeoRSS can be viewed as an application of RDF (Resource Description Framework), since RSS 1.0 is an RDF-based format. RDF is part of a broader technology called the 'Semantic Web' [33–35], which is a set of recommendations and specifications supported by the World Wide Web Consortium (W3C). The Semantic Web emphasizes common formats and languages for semantic interoperability. For example, RDF enables the integration and combination of data drawn from diverse sources. This is an enhancement of the original Web, which emphasized the interchange of documents. The Semantic Web also supports languages such as SPARQL (a recursive acronym that stands for SPARQL Protocol and RDF Query Language), which can be used to express queries across diverse data sources, whether the data are stored natively as RDF or viewed as RDF via middleware. RDF is well suited for recording how Web content relates to real-world objects. This allows a person, or a machine, to start off in one database and then move through an unending set of databases that are connected not by wires but by relationships.
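The way RDF-style data from separate sources can be combined and then queried by relationship can be sketched with a toy example. The Python below is not an RDF library and does not implement SPARQL; it only mimics the joining of triple patterns over (subject, predicate, object) statements. The `ex:` identifiers, the two sources and the function names are hypothetical.

```python
# Statements from two hypothetical sources, written as (subject, predicate, object).
SOURCE_A = [
    ("ex:Culex_pipiens", "ex:carries", "ex:WNV"),
    ("ex:Aedes_vexans", "ex:carries", "ex:WNV"),
]
SOURCE_B = [
    ("ex:Culex_pipiens", "ex:foundIn", "ex:Connecticut"),
    ("ex:Anopheles_gambiae", "ex:foundIn", "ex:Kenya"),
]


def match(triples, pattern):
    """Yield variable bindings for one pattern; terms starting with '?' are variables."""
    for triple in triples:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                if binding.get(term, value) != value:
                    break  # same variable already bound to a different value
                binding[term] = value
            elif term != value:
                break  # constant term does not match this triple
        else:
            yield binding


def query(triples, patterns):
    """Join several triple patterns on their shared variables."""
    solutions = [{}]
    for pattern in patterns:
        extended = []
        for solution in solutions:
            # Replace variables bound by earlier patterns with their values.
            bound = tuple(solution.get(term, term) for term in pattern)
            for binding in match(triples, bound):
                extended.append({**solution, **binding})
        solutions = extended
    return solutions


if __name__ == "__main__":
    # Roughly the shape of the SPARQL pattern:
    #   { ?species ex:carries ex:WNV . ?species ex:foundIn ?place . }
    merged = SOURCE_A + SOURCE_B
    print(query(merged, [("?species", "ex:carries", "ex:WNV"),
                         ("?species", "ex:foundIn", "?place")]))
```

Merging the two sources is plain list concatenation because both use the same statement shape and the same identifiers for the same things. That is the property that common formats such as RDF are meant to provide.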
The use of ontologies, or formal representations of concepts and their relationships, has been a popular method for supporting complex knowledge representation in the Semantic Web [34, 35]. For example, an expressive ontology language called the Web Ontology Language (OWL) is now a W3C recommendation . OWL-based ontologies can support sophisticated queries as well as machine reasoning and inferencing. The GeoNames Ontology  is an example of geo-ontology available in OWL format. It is part of GeoNames , which is a database integrating geographical data such as names of places in various languages, elevation, population and other features from various sources. The GeoNames Ontology makes it possible to add geospatial semantic information to the Web. The ontology distinguishes the 'Concept' from the 'Document'. For example, the town Embrun in France is associated with two URIs (Uniform Resource Identifiers):  and . The first URI  identifies the town Embrun in France. The second URI  is the RDF document with the information GeoNames has about Embrun. The GeoNames Web server is configured to redirect requests for the first URI to the second URI. The redirection tells Semantic Web Agents that Embrun is not residing on the GeoNames server but that GeoNames has information about it instead.
The elements in the GeoNames ontology are semantically interlinked with each other in the following ways:
• Children: these include countries for a continent, subdivisions, etc. For example, the children of France include Auvergne (province) and Lorraine (administrative region).
• Neighbours: these are neighbouring countries for a given country. For example, Switzerland and Germany are neighbours of France.
• Nearby features: for example, near the Eiffel Tower are Champ de Mars and Trocadéro – Palais de Chaillot.
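The three kinds of link listed above can be pictured as simple lookups. The Python sketch below stores the 'children' and 'neighbours' examples as dictionaries and derives 'nearby' features from a great-circle distance threshold. How GeoNames itself computes 'nearby' is not stated here, so the 1 km radius and the haversine approach are assumptions, and the coordinates are approximate values chosen for illustration.

```python
import math

# Hierarchy and adjacency examples taken from the text above.
CHILDREN = {"France": ["Auvergne", "Lorraine"]}
NEIGHBOURS = {"France": ["Switzerland", "Germany"]}

# Approximate (latitude, longitude) pairs, for illustration only.
PLACES = {
    "Eiffel Tower": (48.8584, 2.2945),
    "Champ de Mars": (48.8556, 2.2986),
    "Trocadéro": (48.8616, 2.2893),
    "Embrun": (44.5667, 6.5),
}

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def nearby(name, radius_km=1.0):
    """Return the names of other places within radius_km of the named place."""
    origin = PLACES[name]
    return sorted(
        other
        for other, coords in PLACES.items()
        if other != name and haversine_km(origin, coords) <= radius_km
    )


if __name__ == "__main__":
    print(CHILDREN["France"], NEIGHBOURS["France"], nearby("Eiffel Tower"))
```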
While machine-readable/machine-understandable data are essential to semantic mashup, most current Web content is only human readable. To bridge the gap between human readability and machine readability, RDFa (RDF attributes)  has been proposed to incorporate Semantic Web methods (RDF) into Web pages (i.e., into HTML – the HyperText Markup Language). RDFa provides a set of HTML attributes to augment visual data with machine-readable contexts. In addition to RDFa, the GRDDL (Gleaning Resource Descriptions from Dialects of Languages) specification  introduces markup based on existing standards for declaring that an XML (eXtensible Markup Language) document includes data compatible with RDF and for linking to algorithms (typically represented in XSLT – eXtensible Stylesheet Language Transformations ).
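To show how HTML attributes can carry machine-readable statements alongside the visible text, here is a deliberately small Python sketch. It handles only the `about`, `property` and `content` attributes on a flat snippet and ignores most of what a real RDFa processor must do (prefix resolution, nesting, datatypes). The sample markup, the `ex:` vocabulary and the class name `TinyRDFaExtractor` are hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical page fragment annotated with RDFa-style attributes.
SAMPLE_HTML = """<div about="http://example.org/places/embrun">
  <span property="ex:name">Embrun</span>
  <span property="ex:reported" content="2008-07-06">6 July 2008</span>
</div>"""


class TinyRDFaExtractor(HTMLParser):
    """Collect (subject, property, value) triples from a very small RDFa subset."""

    def __init__(self):
        super().__init__()
        self.subject = None   # value of the most recent 'about' attribute
        self.pending = None   # property still waiting for its text value
        self.triples = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "about" in attrs:
            self.subject = attrs["about"]
        if "property" in attrs:
            if "content" in attrs:
                # The machine-readable value overrides the human-readable text.
                self.triples.append((self.subject, attrs["property"], attrs["content"]))
                self.pending = None
            else:
                self.pending = attrs["property"]

    def handle_data(self, data):
        if self.pending and data.strip():
            self.triples.append((self.subject, self.pending, data.strip()))
            self.pending = None

    def handle_endtag(self, tag):
        self.pending = None


def extract_triples(html_text):
    """Parse an HTML string and return the triples found in it."""
    parser = TinyRDFaExtractor()
    parser.feed(html_text)
    parser.close()
    return parser.triples


if __name__ == "__main__":
    for triple in extract_triples(SAMPLE_HTML):
        print(triple)
```

A human reader sees "6 July 2008" while software reads "2008-07-06" from the same element, which is the kind of bridging between human and machine readability described above.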
In March 2008, Linden Lab released a new version of Second Life, which for the first time let users display a Web page on the side of an object (or 'prim' as it is called in SL) within the world . This was done using the same media channel that SL currently uses to display images and videos, and so was still restricted to one "page" per parcel, but it was at least a step forward. (A 'parcel' here refers to a circumscribed plot of virtual land in SL, with its own owner-customisable characteristics and settings.) However, the implementation does have two significant drawbacks:
• The page is not interactive, i.e., you cannot click on links in it; and
• You cannot scroll down or across the page.
Given our earlier work with maps in SL, we were interested to see how effective this new feature would be with Google Maps. Web pages showing Google Maps rendered well within Second Life when their URIs were supplied, but one could not zoom or pan the map since the page was not interactive.
The next challenge was how to represent data on Google Maps. Using our Newsglobe application, we could easily produce Google Maps with geocoded RSS or KML data overlaid as markers using the Google Maps API. However, although we could bring the image of the map with its markers into SL using the process above, we could not then click on the markers to interrogate them (e.g., link out to the relevant news story or data reading). The solution was to bring the data itself into SL alongside the map (in a similar way to our Los Angeles aircraft visualisation described in ). Now, when a map with data is requested, the controller and page generator create the map in the standard way – with or without markers – but the controller also directly requests the data feed via a Web proxy, which captures (and if necessary geocodes) the data from the RSS/KML feed and then passes them back into SL in a simple text format. The controller then uses these data to rez (SL term for 'resolve') a Second Life object (e.g., a map pin) at each location, with each map pin hyperlinked back to the Web page containing the relevant item/story. If the user then zooms or pans the map, the controller de-rezzes the pins and then re-rezzes them in their new spatial position to reflect the zoom/pan, without having to re-request the data. A bounding box is applied to ensure that markers are not plotted well beyond the map. Given the 2048-byte limit on data coming into SL, we typically also restrict the controller to bringing in only 10–20 data points at a time. We have, however, built in the ability to bring in multiple feeds, with each feed plotted in markers of a separate colour.
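The proxy-and-controller logic described above can be sketched outside Second Life. The Python below is not the actual implementation (which would be scripted in-world); it only illustrates three ideas from the text: capping the number of points, keeping the text payload under the 2048-byte limit, and applying a bounding box before converting coordinates to positions on the map surface. The `lat|lon|label` line format, the function names and the linear interpolation (rather than the map's true projection) are assumptions, and labels are assumed not to contain line breaks.

```python
def encode_points(points, max_points=20, max_bytes=2048):
    """Proxy side: serialise (lat, lon, label) tuples as 'lat|lon|label' lines.

    Stops before either the point cap or the byte limit would be exceeded.
    """
    lines = []
    used = 0
    for lat, lon, label in points[:max_points]:
        line = f"{lat:.5f}|{lon:.5f}|{label}"
        size = len(line.encode("utf-8")) + 1  # +1 for the newline separator
        if used + size > max_bytes:
            break
        lines.append(line)
        used += size
    return "\n".join(lines)


def pin_positions(payload, bounds, width, height):
    """Controller side: turn the text payload into (x, y, label) pin positions.

    bounds is (south, west, north, east) for the map currently displayed;
    width and height are the dimensions of the in-world map surface.
    """
    south, west, north, east = bounds
    pins = []
    for line in payload.splitlines():
        lat_text, lon_text, label = line.split("|", 2)
        lat, lon = float(lat_text), float(lon_text)
        # Bounding box: skip markers that fall outside the visible map.
        if not (south <= lat <= north and west <= lon <= east):
            continue
        # Linear interpolation inside the box; a simplification of the real projection.
        x = (lon - west) / (east - west) * width
        y = (lat - south) / (north - south) * height
        pins.append((round(x, 2), round(y, 2), label))
    return pins


if __name__ == "__main__":
    feed = [(52.5, -1.5, "Story A"), (54.0, -1.5, "Outside the map")]
    text = encode_points(feed)
    print(pin_positions(text, (52.0, -2.0, 53.0, -1.0), 10.0, 10.0))
```

Because the payload is kept after the first request, a zoom or pan only needs `pin_positions` to be called again with the new bounds, which mirrors the de-rez and re-rez step described above.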
However, Daden also wanted to give a sense of the city in 3-D. They used two techniques to achieve this:
• For landmark buildings in the city (e.g., the BT Tower, Selfridges, Radisson SAS Hotel, Mailbox, Millennium Point) they created small scale models of each building and placed them at the right point on the map when zoomed in on the central city area (Figure 10). The buildings match the map in horizontal and vertical scale at this zoom level, but are disabled at other zoom levels.
• For the rest of the city they took the map image and used an image editor to make all the roads and open spaces transparent. They then mapped this image onto a transparent object of the same size as the map in Second Life, and stacked 5–7 layers of this on top of the map. The result was a pseudo-3-D effect, where the buildings appear to rise above the main map, but the open spaces are left at "ground" level (Figure 10).
For us this is just an initial step towards better 3-D mapping in Second Life – and better Web integration. Once Linden Lab release a full Web browser in Second Life (probably in 2009) then Daden's current approach will not necessarily be needed, although it does create a 3-D representation of the markers (pins), which no browser-only solution can easily achieve. It also certainly will not create the 3-D building models (without proper in-world support of specialist Web-browser plug-ins), so we think Daden's system will have significant longevity. Particular areas for enhancement, which Daden might wish to work on, include:
• Moving and scaling the 3-D objects as one zooms;
• Creating a separate and unique 3-D "layer" for each height "slice" within the city, enabling a truer representation of the city height profile to be obtained; and
• Importing Google Earth COLLADA models  for the individual 3-D buildings (COLLADA stands for COLLAborative Design Activity, an interchange file format for interactive 3-D applications).
In addition to the above-mentioned tools like Yahoo! Pipes and Google Mapplets, which can be used for creating and publishing geo-mashups with little or no coding at all, there are other equally effective tools that interested readers may wish to explore, to find out which tool (or combination of tools – or "instruments", to keep the musical metaphor going) works best for them and better serves their particular settings and purposes. For example, Google is now also providing Google Mashup Editor, an AJAX development framework and a set of tools that enable developers to quickly and easily create simple Web applications and mashups with Google services like Google Maps. Microsoft likewise has an interesting offering comparable to Yahoo! Pipes, which it calls Microsoft Popfly [62, 63].
The 'Google Maps/Earth in SL' tool described in "Mashup song #4" above, with its ability to visualize GeoRSS news and data feeds in the 3-D virtual world, is perhaps the first realization of the futuristic vision described by Wade Roush in . However, Google Earth (the 3-D mirror world application), like many other specialist data visualization tools, remains to this date far more powerful than its "port" in Second Life, a fact echoed in a recent discussion of the topic by Paul Bourke. For example, Google Earth supports COLLADA, whereas SL does not [2, 60]. Despite this, there continues to be something very special or "magical" about the current avatar-inhabited Google Maps/Earth SL version by Daden (even with the medium's many current limitations)!
Miklos Sarvary, Director of the Centre for Learning Innovation at INSEAD, has drawn parallels between the life cycle of broadcasting and the Internet : just as radio gave way to the more immersive experience of TV, today's flat Web sites will morph into more interactive, immersive multi-user experiences in which users can see and interact with each other in much more natural ways.
It is predicted that, within 5 to 7 years, the dominant Internet interface is likely to be the 3-D 'Metaverse', a new 3-D Web that will gradually "absorb", and seamlessly integrate with (not fully replace), today's World Wide Web and its applications like Google Earth [2, 74]. (Today's 3-D virtual worlds are still rather early-stage technology and are yet to mature in order to fully realize this vision of a new 3-D Web.)
Today's flat Web allows us to call up "flat" information; a 3-D virtual environment allows us to more naturally experience and visualize this information in real-time with others, and also to appreciate their presence around us. Virtual worlds are such an appealing concept to users primarily because of the social 'co-presence' of others in these worlds in a very realistic manner (Figure 10).
When people are browsing the flat Web shop of Amazon.com, for example, they cannot see, chat with, and benefit from the experiences/opinions of, other people looking for the same items in real time, as they would do in a supermarket's aisle in the physical world. But with 3-D virtual worlds this is very possible.
Although there are some very early flat Web co-browsing solutions under development like Weblin ( – flat interface) and YOOWALK ( – two-and-a-half dimensional – 2.5-D interface) that have attempted to bridge this gap, they are not without their limitations, and it is expected that they will only achieve their full potential within 3-D online social/virtual worlds or the Metaverse over the coming few years. But this can only happen after the full 'HTML-on-a-prim' roadmap and vision , and many other currently missing or seriously lacking key features and qualities (e.g., better usability, scalability and cross-world interoperability) are properly developed and fully realized in these 3-D worlds.
In their introductory description of their MPK20 3-D Virtual Workplace , Sun Microsystems wrote under a paragraph entitled 'Why 3-D for Collaboration' at : "One question we are frequently asked is why use 3-D for a collaboration environment? While it might be possible to build a 2-D tool with functionality similar to MPK20, the spatial layout of the 3-D world coupled with the immersive audio provides strong cognitive cues that enhance collaboration. (...) In terms of data sharing, looking at objects together is a natural activity. With the 3-D spatial cues, each person can get an immediate sense of what the other collaborators can and cannot see". (For other compelling arguments about the value of data visualization and collaboration in 3-D virtual worlds, please see [2, 79, 80].)
Humans are spatial beings by nature, inhabiting feature-rich 3-D analogue spaces, so a 3-D synthetic space should not be more cognitively demanding from a human-computer interface viewpoint than conventional flat interfaces, provided it is properly designed with usability in mind. In fact, it could even make some presentations that are overly complex in their 2-D version much easier to understand when ported to a more native 3-D environment.
Andrew Hudson-Smith and his team at the Centre for Advanced Spatial Analysis (CASA), University College London, have an extensive portfolio of GIS-related projects in Second Life, including: (i) Virtual London, (ii) a new approach to importing geographic terrains into Second Life as tabletops, and (iii) an Arc (ESRI) to Second Life project. In their popular 'Digital Urban' blog where these projects are described in detail , they frequently refer to Google Earth and Second Life as 'Three Dimensional Collaborative (Multi-User) Geographic Information Systems'. Second Life and Google Earth (and the related platforms that will definitely follow in the near future, as 3-D mirror and virtual worlds merge ) are indeed promising environments for public participation and collaboration type outreach activities, providing a good basis for a 'layered 3-D Wikipedia of the planet that anyone can edit and add to' or what can be referred to as 'The People's Atlas'. (Participatory GIS (PGIS) or Public Participation GIS (PPGIS) are terms that have been coined to express the adoption of GIS to broaden public involvement in policymaking, and thus empower local communities, especially the less privileged groups in society, who are often ignored in traditional government-oriented and run GIS applications.)
The 3-D Internet is also rapidly becoming a strategic European Commission (EC) research direction, with, for example, the recent establishment of three Working Groups (WGs) within a User Centric Media (UCM) cluster of 15 ongoing EC-funded projects in the area of Networked Media Systems: the Personalized & Creative Media WG, the 3D & Immersive Media WG, and the Future Media Internet WG .
We do appreciate that, for some (especially in the corporate domain), the public nature of a world like Second Life may be a barrier to adoption; despite the protection that one can put in place, the core data still goes through a third-party server. However, IBM and Linden Lab are currently working closely on suitable solutions for this. Moreover, recent months have seen significant developments in the OpenSim/Open Source SL-"compatible" platform and grids. OpenSim lets users build worlds (and visualizations) on their own PCs (Personal Computers) and servers, and open them up only to the people they choose. In fact, one can now develop one's own spaces offline, publish them on any suitable offline digital storage medium, or host one's own live region/server on the Internet. These Open Source developments, coupled with recent work on an emerging 'MPEG-V for Virtual Worlds' ISO standard, can only lead to wider penetration of 3-D virtual worlds among online users and speed up the development of interoperability specifications and protocols between these worlds.
Several other developments will also serve to further expedite the mass penetration of 3-D virtual worlds among Internet users and the development of the next-generation 3-D Internet or Metaverse. These include Google's recent entry into the 3-D virtual worlds marketplace; the availability of 3-D virtual worlds like Second Life on 3G (third generation) and WiFi enabled mobile phones and other small mobile devices, which is a reality today thanks to a very well executed technology from Israel-based Vollee Ltd [88–90]; and the novel and more natural 3-D world navigation devices and modalities that are emerging these days [91, 92].
The Web is still 'work-in-progress'. Nevertheless, over the past 5–7 years, Internet GIS has gradually transformed forever the way we approach and analyze geographic information, and has also changed the audience, both producers and consumers, of this information, making it today available to, and editable/remixable by, the wide masses, opening up the possibility of many new applications, and realizing the visions of community or 'Participatory GIS' and of the democratization or 'wikification' of GIS, or what has been called 'consumer health geoinformatics' .
Today, many online mapping applications exist where people can even add their own individual data to a shared Web map, e.g., 'Who is Sick?' [95, 96]. Users can now also dynamically overlay their own current position on Earth on similarly shared maps, and view the position of others who have likewise shared theirs, all in real time over the Web, if they have, for example, a low-cost USB GPS (Universal Serial Bus Global Positioning System) mouse receiver or similar device connected to their PC or built into their mobile gadgets. GPS-enabled mobile phones and GPS-enabled cameras are enabling millions of people every day to collectively annotate the Earth in ways never done before, besides opening up many mobile location-based service possibilities and opportunities.
All of this is now possible and accessible like never before, thanks to the latest breed of 'neogeography' and 'GeoWeb 2.0' technologies and online services like Google Earth (a 3-D mirror world) and Yahoo! Pipes (a visual mashup creation and publishing service). It is not, however, without its own newly introduced "problems", such as copyright, individual privacy and even national security issues, none of which were much of a concern when GIS was very 'closed' and the realm only of big organizations and an elite of experts. We have previously discussed these side-effect issues and others in [2, 7, 21, 74, 79], but there is one more that seems very suitable for closing an article about geo-mashups.
The many easy-to-use online interfaces and visual mashup editors that are now available have increased the risk of wrong selection, manipulation and interpretation of data in some scenarios. As discussed in , the ideal consumer tools of the future need to be fault-tolerant and capable of analysing and presenting assembled data in ways that facilitate only appropriate interpretations of integrated or mashed-up data. This can be achieved by using some form of "intelligent", goal-oriented online health GIS wizards and mashup editors that are based on robust statistical, epidemiological and other methods, so that only valid results, maps and visualizations are allowed and produced, even when untrained users attempt to select inappropriate settings or data for a particular analysis or geo-mashup.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.