Initiating informatics and GIS support for a field investigation of Bioterrorism: The New Jersey anthrax experience
© Zubieta et al 2003
Received: 15 June 2003
Accepted: 16 November 2003
Published: 16 November 2003
The investigation of potential exposure to anthrax spores in a Trenton, New Jersey, mail-processing facility required rapid assessment of informatics needs and adaptation of existing informatics tools to new physical and information-processing environments. Because the affected building and its computers were closed down, data to list potentially exposed persons and map building floor plans were unavailable from the primary source.
Controlling the effects of anthrax contamination required identification and follow-up of potentially exposed persons. Risk of exposure had to be estimated from the geographic relationship between work history and environmental sample sites within the contaminated facility. To assist in establishing geographic relationships, floor plan maps of the postal facility were constructed in ArcView Geographic Information System (GIS) software and linked to a database of personnel and visitors using Epi Info and Epi Map 2000. A repository for maintaining the latest versions of various documents was set up using Web page hyperlinks.
During public health emergencies, such as bioterrorist attacks and disease epidemics, computerized information systems for data management, analysis, and communication may be needed within hours of beginning the investigation. Available sources of data and output requirements of the system may change frequently during the course of the investigation. Integrating data from a variety of sources may require entering or importing data from a variety of digital and paper formats. Spatial representation of data is particularly valuable for assessing environmental exposure. Written documents, guidelines, and memos important to the epidemic were frequently revised. In this investigation, a database was operational on the second day, and the GIS component became operational during the second week of the investigation.
In September 2001, anthrax spores were intentionally sent through the U.S. postal system. At least four letters containing spores of Bacillus anthracis were sent to a U.S. Senator's office and to media centers in different parts of the country. Two cases reported in Florida were followed by two cases in New York. A few days later, four more cases occurred among postal workers at the Brentwood mail processing facility in the Washington area.
The appearance of new cases in different locations around the country suggested that several postal facilities were contaminated with B. anthracis. In each location, the population potentially exposed was large. Public health, criminal investigation, and postal authorities at local, state, and federal levels investigated the suspected bioterrorism event. Within hours of the discovery of clusters of cases, the Centers for Disease Control and Prevention (CDC) deployed teams to four different locations to support state and local investigations.
Four letters containing anthrax spores were known to have been postmarked and processed in the Hamilton Mail Processing Center (HMPC) located in Hamilton Township, a suburb of Trenton, New Jersey. The HMPC is part of a complicated network of stations and buildings designed to move millions of pieces of mail per day. It is not a typical post office, but a very large facility located in a 281,387 square foot building with many operations areas containing mechanized equipment. It was typically staffed by 250 employees per shift and visited by numerous others in the course of processing 2 million items per day.
On October 19, 2001, a field investigation in Trenton was initiated by a CDC team composed of an epidemiologist, environmental experts, occupational physicians, Epidemic Intelligence Service (EIS) officers, a Public Health Informatics Fellow, and support personnel. The team was headquartered in the New Jersey Department of Health and Senior Services (NJDHSS) and had access to the Local Area Network and computing environment of the Department. Other CDC investigation teams were sent to Florida, Washington, and New York.
A detailed account of the New Jersey epidemiologic investigation has recently been published. This article describes the experience of the Informatics Fellow and the New Jersey Geographic Information System (GIS) Coordinator during the first two weeks of the investigation in assessing information system needs and organizing on-site computer support for the New Jersey arm of the investigation.
Assessment of Information Needs
Early in the investigation, the number of people exposed to anthrax was estimated to be large. The response team planned to sample for anthrax contamination in the facility; identify the exposed population; locate, culture, and give prophylactic antibiotics to those most likely to be exposed; and monitor the exposed and treated persons over time for potential long-term effects. Information on each person and sample was to be obtained, stored, analyzed, and augmented over time. For epidemiologic investigation and management purposes, summary reports had to be generated frequently. Because of the large number of agencies involved and the rapid progress of events, a system of document management for the many drafts of guidelines and other communications was also needed.
Information needs of the emergency response team in New Jersey
Number of people exposed
• Implementation of screening procedures
• Determination of amount of drugs/vaccine
• Deployment of drugs/vaccine/antidotes
• Number of people exposed
• People who are receiving chemoprophylaxis
• Centers providing follow up services and number of patients lost
• Loss of follow-up
During the initial investigation
• Case finding
• Identification of people at risk
• People who did not report to work/school
• People admitted to hospitals that meet case definition
• People who did not attend screening programs
• People not receiving post-exposure prophylaxis (PEP)
• Dissemination of information such as screening tests and changes in PEP protocols
• Changes in facilities available for screening/treatment
• Changes in recommendations
• Case finding if the definition of exposed changes based on preliminary data
After the control measures are implemented
• Follow-up of adverse reactions
• Follow-up of special populations such as pregnant women and children
• Suspected cases under investigation
• Cases ruled out
• Cases ruled in
During Initial Investigation
• Location of suspected cases
• Location of confirmed cases
• Status of confirmatory test
• Clinical condition of cases
• Clinical status
Geographic Information Systems
• Floor plan map linkable to databases
• County/Zip code of exposed populations
• Hospital locations
• Driving directions to hospitals, operation centers, and affected facilities
• The number and location of emergency rooms
• Hospitals included in the surveillance system
• Sentinel sites
• Distribution of the sampling
• Proportion of positive samples
• Extension of the contamination
• Spatial relationship between cases and exposed population
• Identification of unexposed populations
• Identification of exposed population not included during the initial assessment
• Distribution of cases
• Distribution of deaths
• Geographic distribution patterns
Guidelines for treatments
• Frequently asked questions (FAQ) for the emergency response center
• FAQ for the press
• Recommendation for clinicians
• Guidelines for staff
• Elements of information necessary
• Software algorithms implemented in other locations
• Sensitivity and specificity of screening methods
During the initial investigation
• Drafts for guidelines in different versions
• Drafts for paper forms to be used in all stages of the investigation
• Customized reports
During the case finding
• Number of samples in the laboratory
• Number of positive samples
• Results of confirmatory tests
• Laboratories involved
• Discrepancies between laboratories
During the whole operation
• Team list
• Cell phones
• Email addresses
• Distribution by hotel, team, activities
• Emergency contact numbers
Not all potential data sources were available during the investigation. The postal facility was closed until an environmental assessment could be completed. The database system in the building was unavailable, as the computers had been turned off due to concern that fans in the computers might create air currents that would jeopardize safety or alter environmental sampling results. The list of employees in the facility computers was therefore unavailable to the investigators and had to be assembled from other sources. Digital AutoCAD drawing files of the facility were also stored on a workstation within the facility, and were likewise unavailable – a situation that complicated the production of accurate and representative floor-plan maps.
Laboratory results included cultures of both human and environmental samples performed by the New Jersey State laboratory and CDC laboratories. Managing reports from samples processed in different laboratories required linking of multiple unique identifiers. Later a tracking system for samples processed in more than one laboratory was developed.
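The linking of identifiers described above can be illustrated with a small crosswalk table. This is a minimal sketch, not the tracking system the team built: the field names, laboratory codes, and sample values are all invented for illustration, and it assumes each laboratory reports results under its own accession number.

```python
from collections import defaultdict

# Hypothetical crosswalk: each row ties a laboratory's own accession number
# to the field sample identifier assigned at collection time.
CROSSWALK = [
    {"field_id": "S-001", "lab": "NJ_STATE", "lab_id": "NJ-7731"},
    {"field_id": "S-001", "lab": "CDC", "lab_id": "CDC-0042"},
    {"field_id": "S-002", "lab": "NJ_STATE", "lab_id": "NJ-7732"},
]

# Hypothetical results, as each laboratory might report them.
RESULTS = [
    {"lab": "NJ_STATE", "lab_id": "NJ-7731", "result": "positive"},
    {"lab": "CDC", "lab_id": "CDC-0042", "result": "positive"},
    {"lab": "NJ_STATE", "lab_id": "NJ-7732", "result": "negative"},
]


def results_by_field_id(crosswalk, results):
    """Group laboratory results under the field sample they belong to."""
    lookup = {(row["lab"], row["lab_id"]): row["field_id"] for row in crosswalk}
    grouped = defaultdict(dict)
    for res in results:
        field_id = lookup.get((res["lab"], res["lab_id"]))
        if field_id is None:
            continue  # unmatched result: left out here, would need manual review
        grouped[field_id][res["lab"]] = res["result"]
    return dict(grouped)


def discrepancies(grouped):
    """Return field samples for which laboratories reported different results."""
    return sorted(
        field_id
        for field_id, by_lab in grouped.items()
        if len(set(by_lab.values())) > 1
    )
```

Keeping the crosswalk separate from the results means a new laboratory can be added by appending rows rather than by changing the structure of the results table.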
Electronic documents that were produced or revised during the investigation included survey results, background documents, guidelines, antibiotic stockpile logistics, and press releases. Frequent revisions made it necessary to identify and provide access to the latest documents and to track document revisions.
The information system developed consisted of three main components: a database of possibly exposed persons, a GIS of the physical plant and equipment, and a document repository. The exposed-person database required merging of files in multiple formats, manual data entry, rapid analysis, and frequent production of reports. GIS work was done using floor plans produced in ArcView GIS 3.2a (Environmental Systems Research Institute, Inc., Redlands, CA http://www.esri.com) by screen-digitizing scanned not-to-scale xerographic copies of AutoCAD drawings of the postal building. Documents were made available to the team by using Web pages (HTML) on the New Jersey Department of Health intranet to link to the latest copies of the files.
To support investigations in other facilities where the letters were handled, digitized floor plans of the Carteret and West Trenton Post Offices were produced from rudimentary sketches. Since Epi Map is public domain software, it was installed on a number of computers without need for licensing, making a GIS tool freely available to the investigators.
The document repository was created using HTML on an intranet site inside the firewall of the NJDHSS. The Web page included up-to-date information such as press releases and a list of activities for the day. It was updated twice a day or more often when important information had to be communicated to the rest of the investigation team. The repository page also included hyperlinks to the most recent version of important documents. Each link was checked on a daily basis to ensure that the correct document was being accessed. The page also provided a list of databases available to team members.
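A page of this kind, together with the daily link check, could be approximated as follows. This is only a sketch under stated assumptions: the original page was maintained by hand in an HTML editor, and the function names, titles, and file names here are hypothetical.

```python
from html import escape
from pathlib import Path


def build_index(documents):
    """Return a basic HTML page linking to the latest version of each document.

    `documents` is a list of (title, relative_path) pairs; both are assumed
    to be supplied by whoever maintains the repository.
    """
    items = "\n".join(
        f'  <li><a href="{escape(path, quote=True)}">{escape(title)}</a></li>'
        for title, path in documents
    )
    return (
        "<html><body>\n<h1>Latest documents</h1>\n"
        f"<ul>\n{items}\n</ul>\n</body></html>\n"
    )


def broken_links(documents, root):
    """List the relative paths whose target file does not exist under `root`."""
    root = Path(root)
    return [path for _, path in documents if not (root / path).is_file()]
```

Running `broken_links` before each update would flag links that point to a file that has been moved or renamed, which is the situation the daily check was meant to catch.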
Recent publications on informatics and bioterrorism have focused on detection of bioterrorist events [4, 5]. Some have reviewed the informatics contribution to past emergencies and formulated proposals for various kinds of surveillance and/or response [6–10]. The goal of this paper is to present needs, difficulties, and solutions encountered in the first 2 weeks of a single field investigation. Every epidemic investigation is unique, but a comprehensive review of activities carried out during an actual investigation may focus discussion on understanding and improving the informatics response in such events.
At the beginning of this investigation, significant effort was expended to identify the information needed and sources for that information. Although the NJDHSS provided a secure computer environment in which to work, many elements of the system had to be designed and set up during the investigation. Much of the base information did not exist (e.g., GIS of the Hamilton facility) or had been made unavailable by the event under investigation (e.g., Hamilton facility AutoCAD digital drawing files and employee lists). As in most field investigations, both data sources and hypotheses shifted and evolved over the course of the investigation.
During an investigation, information is critical for decision-making processes and for detection of cases, identification of risk factors, and managing prevention and control measures. In the first hours or days, the emergency response team is required to design and implement an information system capable of providing reliable, timely information. Being prepared means that much of this is already in place. Experience in developing such "emergency information systems" and with the software being used plays an important role in determining the time needed to complete the task.
In this event, close collaboration between the epidemiologist and informaticians was supported by software designed for field investigation and for exchange of data with other commercial products. The presence of trained professionals from a GIS center was an important element in being able to map the site of the investigation. Use of industry standard shapefiles and Microsoft Access databases allowed working copies to be distributed and used in public domain software as needed. A database system was operational within two days of the beginning of operations, and the GIS elements evolved over a two-week period as floor plans became available.
Potential problems identified initially included lags in entry of clinical data, lack of access to data sources because computer systems were disrupted or inaccessible, and the variety of formats and platforms that were designed for other purposes. At least one source produced data in different formats on different days.
In field investigation, both physical and informatics environments must be dealt with as they are encountered. The informatics environment is defined by the agencies, networks, and individual computers from which data are obtained and with which exchange of data occurs. Unless the entire investigation is internal to a single agency or company, each of the partners – state, county, health care facility, law enforcement agency, industry, or other entity – has its own informatics environment. The standards most likely to be common to all the partners are those of the computer industry, such as the Internet, Microsoft Windows, and popular GIS standards. Those conducting the investigation must be prepared to adapt to the standards used by the data source partners rather than the reverse.
In this investigation, existing database systems provided useful information, but the data items were available on paper, in Excel files, in text files, in Access files, and in ArcView shapefiles. The informatician is responsible for merging and integrating elements stored in different database formats to make the information available to the emergency response team. The need for a speedy public health response is a challenge for informaticians because the steps that are typically time consuming, such as requirements generation, analysis, design, implementation, and testing of a new system, must be completed in hours rather than in weeks or months. In this experience, some parts of the system were functional within two days after arrival at the investigation site.
Epi Info 2000 (http://www.cdc.gov/epiinfo) was used to set up databases for manual data entry and updating and for importing and merging data provided by collaborating sources in a variety of Microsoft Windows formats. The system developed took advantage of several features available in Epi Info through its use of commercial component software and computer industry standards:
• Rapid development of a relational database, in which identifying information was localized to a single table for security purposes
• Importation of data from several of the 20 different file formats that can be read in Epi Info 2000
• The ability to perform data management interactively but to preserve and replay the steps through automatically generated programs or scripts
• The ability to link and display Microsoft Access data with ArcView shapefiles to create maps
• Rapid development, with minimal coding, simplifying the testing and debugging process
• The ability to add or delete variables from a database merely by revising a form on the screen
• The availability of analytic output and epidemiologic statistics in the same program used for data management
• Interaction with high-end GIS programs and transition of database management to a commercial program through Epi Info's incorporation of GIS and Windows standards
• Public domain status, which allowed downloading the latest version from the Internet and distributing it to any number of partners without licensing difficulties
Geographical and spatial analysis has gained importance in public health during the last few years. Maps reveal spatial relationships and facilitate communication. The availability of maps during the investigation can be attributed to the fact that the NJDHSS had made resources available from its GIS center. The GIS professional was able to create the shapefiles while the rest of the emergency response team (ERT) focused on data collection and data management. Once created, maps were widely available for use among all members of the team, and Epi Map 2000 could be used to link data in Microsoft Access files to the shapefiles of the building floor plans.
The main use of GIS in this investigation was to create a relatively detailed base map and to locate environmental sampling and provide spatial analysis and visual support to the investigation. We identified and located all samples taken in the different facilities. In a matter of days we connected the database system with the map files to provide up to date information regarding sample processing and results. The maps created provided a new perspective of the magnitude of contamination inside the facility and also identified non-contaminated areas.
Because of the diversity of the data sources during the investigation, the database system required frequent maintenance. We were able to automate most of the data management routines using Epi Info program (scripting) files and to produce automatic reports.
Laboratory results were available through the local area network in the New Jersey State Health Department where the team worked. Although we might have been able to connect to the laboratory system to obtain real-time updates, for security reasons the laboratory results for the investigation were exported and became available to the Emergency Response Team at the end of each day. There was therefore a modest but significant delay in availability of laboratory information.
Document tracking and version control are important when geographically separated teams are working on the same set of problems. In this case each one of the field sites had a set of experts developing documents and recommendations that applied to the whole event and not only to the geographic area of investigation. It was important to keep the most recent set of recommendations available to all members of the emergency response team for decision-making and information dissemination. During the investigation, we were able to access the most recent documents with minimal effort, using links in a Web page. A more sophisticated version could be developed to allow tracking changes in a document over time.
Emergency situations require rapid design and implementation of databases to provide reliable information in the field. Having the ability to respond rapidly to emergency situations requires planning and preparedness. Key elements in preparation are paying sufficient attention to informatics support and having both staff and software that can adapt quickly to new situations and computing environments. The likelihood of success is greater if the same people and software provide support for more routine epidemic investigations.
In this field investigation, the informatics environment included Excel spreadsheets, files in several other formats, paper questionnaires and computer printouts, an existing GIS capability using ArcView, and the need for frequent revisions in database structure, frequent summary reports, and protection of personal identifiers. Software for field use must be flexible enough to interact with and unite numerous data sources and computing environments. Investigators must be prepared for new situations, and be ready to adapt new techniques and software as the needs of an investigation unfold.
The traditional steps in systems development (requirements gathering, analysis, design, implementation, and testing) must be traversed in rapid sequence during the first few hours or days of an investigation and repeated as the system evolves and capabilities are added. More formalized approaches to systems development need to be either extensively modified or compressed to account for the time-critical nature of the situation. The use of commercial software standards such as Windows file formats, Web pages, and mainstream GIS programs facilitates the rapid implementation of systems in the field.
Electronic documents were as important as more structured data for this investigation. The design of systems for emergencies should include methods for storing and maintaining version control of documents and images in addition to more structured data.
Geographic Information System
A critical component was the Geographic Information System. A floor plan of the contaminated Hamilton facility was available only as an incomplete and poor quality 8 1/2 × 11 inch photocopy of the original 'E' sized AutoCAD drawing. It was scanned as a TIFF image, imported into ArcView GIS 3.2 software and used as a backdrop to create shapefiles of the building outline, walls, operational areas and equipment placements, and NIOSH and FBI sampling locations. Attribute tables were also created for each of the GIS theme features. Shapefiles were developed using a scanned copy of the floor plan of each facility and digitized using ArcView®. The elements of information to be displayed were divided into several layers showing samples taken by different agencies and their location.
After completion, the shapefiles were linked to the data using Epi Map 2000, a component of Epi Info that is ArcView® compatible. Elements of information stored in databases created in Epi Info 2000 were sent directly to Epi Map for displaying (Figure 1). The process was automated by using scripts or programs.
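The linkage between database records and map features can be pictured as a join on a shared field. The sketch below is an illustration only and does not use Epi Map or ArcView: it assumes that each floor-plan feature and each sample record carry a common area code, and the field names and values are invented.

```python
# Hypothetical attribute rows for floor-plan features (one per mapped area).
FEATURES = [
    {"area_code": "A1", "name": "Sorting machine area"},
    {"area_code": "A2", "name": "Loading dock"},
]

# Hypothetical sample records from the investigation database.
SAMPLES = [
    {"sample_id": "S-001", "area_code": "A1", "result": "positive"},
    {"sample_id": "S-002", "area_code": "A1", "result": "negative"},
    {"sample_id": "S-003", "area_code": "A2", "result": "negative"},
]


def summarize_by_area(features, samples):
    """Join samples to map features by area code and summarize results."""
    summary = {
        f["area_code"]: {"name": f["name"], "samples": 0, "positive": 0}
        for f in features
    }
    for sample in samples:
        area = summary.get(sample["area_code"])
        if area is None:
            continue  # sample refers to an area that is not on the map
        area["samples"] += 1
        if sample["result"] == "positive":
            area["positive"] += 1
    for area in summary.values():
        # Proportion is undefined for areas that were never sampled.
        area["proportion_positive"] = (
            area["positive"] / area["samples"] if area["samples"] else None
        )
    return summary
```

A per-area summary like this is the kind of value a mapping program could shade on the floor plan, so that contaminated and non-contaminated areas are visible at a glance.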
Data Collection and Input
To obtain data on persons possibly exposed, a liaison between the U.S. Postal Service and the epidemiologic team was appointed. He was able to obtain MS-Excel© files or computer printouts listing personnel for each of the facilities under investigation. The lists were originally created for payroll or vehicle registration. Additional information was added to those lists to record presence in the facility during the exposure period, sex of the employee, and post exposure prophylaxis (PEP). Most of the clinical follow-up services, including PEP, were provided by the local hospital. The informatics department of the hospital extracted relevant clinical data from the database system and made them available to the investigation team as Excel tables on some days and comma delimited text files on other days.
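Because the same source could deliver files in different layouts on different days, incoming data had to be normalized before merging. The following is a minimal sketch of that idea, not the Epi Info procedure actually used: it handles delimited text only, and the column aliases and field names are assumptions made for illustration.

```python
import csv
import io

# Hypothetical aliases: different extracts may name the same field differently.
ALIASES = {
    "employee id": "employee_id",
    "employee_id": "employee_id",
    "emp_id": "employee_id",
    "sex": "sex",
    "gender": "sex",
    "pep": "pep_started",
    "pep_started": "pep_started",
}


def read_delimited(text, delimiter=","):
    """Parse delimited text and rename known columns to canonical names."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text), delimiter=delimiter):
        row = {}
        for header, value in raw.items():
            if header is None:
                continue  # extra, unnamed fields on a malformed line
            canonical = ALIASES.get(header.strip().lower())
            if canonical:
                row[canonical] = (value or "").strip()
        rows.append(row)
    return rows


def merge_records(master, incoming):
    """Merge incoming rows into `master`, keyed by employee identifier."""
    for row in incoming:
        key = row.get("employee_id")
        if not key:
            continue  # cannot be matched without an identifier
        record = master.setdefault(key, {"employee_id": key})
        # Only non-empty values overwrite, so blanks never erase earlier data.
        record.update({k: v for k, v in row.items() if v})
    return master
```

For example, a comma-delimited payroll list and a tab-delimited hospital extract can be merged into one record per person:

```python
day1 = "Employee ID,Sex\n101,F\n102,M\n"
day2 = "emp_id\tPEP\n101\tyes\n"
master = merge_records({}, read_delimited(day1))
master = merge_records(master, read_delimited(day2, delimiter="\t"))
```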
Primary data items collected by EIS officers and other staff were entered into Epi Info 2000 or into Excel and then imported into Epi Info 2000 (Microsoft Access 97) format. Potentially exposed populations such as regular visitors and relatives of postal workers were identified and manually entered into the main database as the investigation proceeded.
Demographic characteristics of potentially exposed persons were assembled from various sources, including the local hospital where PEP was administered. When PEP was offered to a large group of people, tracking of different treatment protocols and their results became necessary. Paper forms were designed so that they would not become obsolete if protocols were modified during follow-up, and similar revisions in the database were expected.
Database management was done in Epi Info 2000, the Windows version of Epi Info. Epi Info 2000 was developed at CDC from commercial component software and Visual Basic and offers compatibility with Microsoft Access files and 20 other standard database formats. It includes a mapping program that uses the ArcView-compatible shapefile format and allows the display of data from Access tables on shapefiles. Most of the basic statistics needed for reporting and analysis are built in, so we did not have to spend time coding or debugging algorithms. The data collection and data management tools in Epi Info do not require expert database managers.
The database system consisted of a central database designed and maintained in Epi Info 2000. The system was structured in 14 tables linked by a unique key. The unique key at first contained information (that is, it was an "intelligent" key), but we designed a mechanism to replace intelligent keys with non-intelligent keys. The intelligent key contained facility code, deployment place (NJ), exposed unique identifier, and postal facility involved. We included the deployment site in the key on the assumption that the information would be linked with other sites at a later time.
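The replacement of intelligent keys by non-intelligent ones can be sketched as a one-time mapping applied to every linked table. The mechanism the team actually used is not described in detail, so this is an assumption-laden illustration: the key format, the field name `person_key`, and the table contents are hypothetical.

```python
import itertools


def build_key_map(intelligent_keys, start=1):
    """Assign each distinct intelligent key a meaningless sequential integer."""
    counter = itertools.count(start)
    key_map = {}
    for key in intelligent_keys:
        if key not in key_map:
            key_map[key] = next(counter)
    return key_map


def rekey(tables, key_map, key_field="person_key"):
    """Return copies of all tables with the key field replaced via key_map."""
    return {
        name: [{**row, key_field: key_map[row[key_field]]} for row in rows]
        for name, rows in tables.items()
    }


# Hypothetical tables linked by an intelligent key (format invented here).
TABLES = {
    "exposure": [
        {"person_key": "NJ-HMPC-0001", "present_in_facility": True},
        {"person_key": "NJ-HMPC-0002", "present_in_facility": False},
    ],
    "pep": [
        {"person_key": "NJ-HMPC-0001", "pep_started": True},
    ],
}
```

Because the same mapping is applied to every table, the relationships between tables are preserved while the key itself no longer reveals the facility or deployment site.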
From the beginning of the operation, security was a concern. Because of the complexity of the investigation we decided that all sensitive information – defined as information that would allow identification of specific individuals – would be handled separately from the rest of the database. We created a single flat table with all identifiers and related the table to the rest of the database system. With that structure, we were able to handle sensitive information as a unit. Identifiers were not sent to CDC Headquarters in Atlanta, for example, and sensitive data were protected by the security features of the NJDHSS computing environment.
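The separation of identifiers from the rest of the data can be illustrated with two related tables. The investigation used Epi Info 2000 with Microsoft Access files; the sketch below substitutes SQLite purely because it ships with Python, and the table names, columns, and the single example row are invented.

```python
import sqlite3


def build_database():
    """Create an in-memory database with identifiers held in their own table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE identifiers (
            person_key INTEGER PRIMARY KEY,
            full_name TEXT,
            home_address TEXT
        );
        CREATE TABLE exposure (
            person_key INTEGER REFERENCES identifiers(person_key),
            facility TEXT,
            pep_started INTEGER
        );
        """
    )
    # Placeholder values only; no real person is represented.
    conn.execute(
        "INSERT INTO identifiers VALUES (1, 'Example Person', '1 Example St')"
    )
    conn.execute("INSERT INTO exposure VALUES (1, 'HMPC', 1)")
    return conn


def export_without_identifiers(conn):
    """Return exposure rows for sharing; the identifiers table is never read."""
    cur = conn.execute(
        "SELECT person_key, facility, pep_started FROM exposure ORDER BY person_key"
    )
    columns = [description[0] for description in cur.description]
    return [dict(zip(columns, row)) for row in cur.fetchall()]
```

With this layout, an export for another site only has to omit one table, rather than filtering sensitive columns out of every table.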
A commercial HTML editor (Microsoft Front Page 2000®) was used for the maintenance of the document repository Web page. Web pages included in the system did not contain any special scripting that required advanced Web programming techniques. The level of HTML coding was basic, and most of it was inserted automatically by Front Page.
The document repository allowed coding and storing versions of documents during the progress of the investigation. Documents were organized in folders and assigned code numbers.
Members of the team were instructed to check in documents on a regular basis. The document check-in process included a simple computer-based form with the name of the document, the author, and a time stamp. The first two elements were used to create a catalog, and the timestamp was used for versioning control.
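The check-in record described above (document name, author, and time stamp) is enough to build both a catalog and a simple version history. The following sketch assumes an in-memory catalog and hypothetical function names; it is not the form the team used.

```python
from datetime import datetime


def check_in(catalog, name, author, timestamp=None):
    """Record one check-in; the time stamp defaults to the current time."""
    entry = {"author": author, "timestamp": timestamp or datetime.now()}
    catalog.setdefault(name, []).append(entry)
    return entry


def history(catalog, name):
    """Return all check-ins for a document, oldest first."""
    return sorted(catalog[name], key=lambda entry: entry["timestamp"])


def latest(catalog, name):
    """Return the most recent check-in for a document."""
    return max(catalog[name], key=lambda entry: entry["timestamp"])


def catalog_listing(catalog):
    """List (document name, author of latest version) pairs, sorted by name."""
    return [(name, latest(catalog, name)["author"]) for name in sorted(catalog)]
```

Ordering by time stamp rather than by arrival order means a document checked in late from another site still lands in the right place in its history.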
ESRI (Environmental Systems Research Institute) provided a special license for the free distribution of MapObjects software within EpiInfo/EpiMap 2000.
- Update: Investigation of anthrax associated with intentional exposure and interim public health guidelines. MMWR 2001,50(41):889–893.
- Greene CM, Reehuis J, Tan C, Fiore AE, Goldstein S, Beach MJ, et al.: Epidemiologic investigations of bioterrorism-related anthrax, New Jersey, 2001. Emerg Infect Dis 2002,8(10):1048–55. Available from: URL:http://www.cdc.gov/ncidod/EID/vol8no10/02-0329.htm
- Dean AG, Arner TG, Sangam S, Sunki GG, Friedman R, Lantinga M, Zubieta JC, Sullivan KM, Smith DC: Epi Info 2000, a database and statistics program for public health professionals for use on Windows 95, 98, NT, and 2000 computers. Centers for Disease Control and Prevention (CDC), Atlanta, Georgia, USA 2000.
- Lazarus R, Kleinman K, Dashevsky I, Adams C, Kludt P, DeMaria A Jr, Platt R: Use of automated ambulatory-care encounter records for detection of acute illness clusters, including potential bioterrorism events. Emerg Infect Dis 2002,8(8):753–60.
- Lober WB, Thomas Karras B, Wagner MM, Marc Overhage J, Davidson AJ, Fraser H, Trigg LJ, Mandl KD, Espino JU, Tsui F-C: Roundtable on Bioterrorism Detection: Information System-based Surveillance. J Am Med Inform Assoc 2002,9(2):105–115.
- Teich JM, Wagner MM, Mackenzie CF, Schafer Brig Gen KO: The Informatics Response in Disaster, Terrorism, and War. J Am Med Inform Assoc 2002,9(2):97–104.
- Kohane IS: The Contributions of Biomedical Informatics to the Fight Against Bioterrorism. J Am Med Inform Assoc 2002,9(2):116–119.
- Wagner MM: The Space Race and Biodefense: Lessons from NASA about Big Science and the Role of Medical Informatics. J Am Med Inform Assoc 2002,9(2):120–122.
- Tang PC: AMIA Advocates National Health Information System in Fight Against National Health Threats. J Am Med Inform Assoc 2002,9(2):123–124.
- Davenhall B: Building a community health surveillance system. ArcUser Online 2002, 1–2.
This article is published under license to BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL.