Comparison of Poisson and Bernoulli spatial cluster analyses of pediatric injuries in a fire district
© Warden; licensee BioMed Central Ltd. 2008
Received: 26 February 2008
Accepted: 22 September 2008
Published: 22 September 2008
With limited resources available, injury prevention efforts need to be targeted both geographically and to specific populations. As part of a pediatric injury prevention project, data was obtained on all pediatric medical and injury incidents in a fire district to evaluate geographical clustering of pediatric injuries. This will be the first step in attempting to prevent these injuries with specific interventions depending on locations and mechanisms.
The fire district responded to a total of 4803 incidents involving patients less than 15 years of age during 2001–2005, of which 1997 were categorized as injuries and 2806 as medical calls. The two cohorts (injured versus medical) differed in age distribution (7.7 ± 4.4 years versus 5.4 ± 4.8 years, p < 0.001) and location type of incident (school or church 12% versus 15%, multifamily residence 22% versus 13%, single family residence 51% versus 28%, sport, park or recreational facility 3% versus 8%, public building 8% versus 7%, and street or road 3% versus 30%, respectively, p < 0.001). Using the medical incident locations as controls, the Bernoulli method found no significant clustering for environmental or assault injuries, while it found four significant clusters for all injury mechanisms combined, 13 clusters for motor vehicle collisions, one for falls, and two for pedestrian or bicycle injuries. The Poisson cluster method, applied to incidence rates by census tract, identified four clusters for all injuries, three for motor vehicle collisions, four for fall injuries, and one each for environmental and assault injuries. The two detection methods shared a minority of overlapping geographical clusters.
Significant clustering occurs overall for all injury mechanisms combined and for each mechanism depending on the cluster detection method used. There was some overlap in geographic clusters identified by both methods. The Bernoulli method allows more focused cluster mapping and evaluation since it directly uses location data. Once clusters are found, interventions can be targeted to specific geographic locations, location types, ages of victims, and mechanisms of injury.
Analysis using geographical information systems (GIS) is just beginning to be tapped in the field of injury prevention. Injuries are most likely spatially heterogeneous, with some mechanisms constrained geographically; for example, motor vehicle collisions and bicycle and pedestrian injuries will only occur on a roadway. Prevention strategies need to be targeted as much as possible because of constraints on available resources. Fire district resources are assigned to permanent stations and response areas and are limited in the distance they can travel for non-emergency tasks such as injury prevention talks and inspections. If a crew is to be assigned to injury prevention interventions, it needs geographic information on where injuries are occurring and where potential target populations are accessible. Even funding of an independent entity such as an academic injury prevention program is limited, and its efforts need to be targeted to specific geographic areas and populations to be cost effective.
One study of fall-related injuries in central Toronto used GIS to demonstrate that in addition to age and household income census tract data, the location of homeless shelters appeared to be significantly associated with the distribution of injuries. Motor vehicle collisions cause more deaths in children < 15 years old than any other cause and have diverse geographic variation across the US. A Canadian national study of adolescent injuries revealed a disparity in injury rates from urban (lower rate) to rural (higher rate) populations. Fall-related injuries are the most frequent mechanism of pediatric injuries in the US, though with lower mortality rates than motor vehicle collisions. These injuries can still cause significant head injuries that can affect future cognitive function. A study of pedestrian-related injuries in Montréal using ambulance service data showed that only 1% of intersections had at least one victim and these accounted for only 4% of all injured pedestrians. This is illustrative of the difficulty in attempting to target a limited number of intersections for pedestrian injury prevention. These studies show that there is a clear spatial component to injury patterns and different mechanisms of injury need to be accounted for in the analysis.
The spatial statistic SaTScan™ using the Poisson method has had wide acceptance in detecting disease clusters in many different situations. [7–11] It has been found to have reasonable sensitivity and specificity when compared to generalized additive models (GAM) and Bayesian disease mapping, and to Besag-Newell's R, Cuzick-Edwards' k-Nearest Neighbors, Tango's Maximized Excess Events Test, and Moran's I in cluster models. The Bernoulli method (see Methods section) in SaTScan has been used successfully to identify census tracts with clusters of high metastatic versus localized prostate cancer incidence in the state of New Jersey. To my knowledge, it has not been used for modelling traumatic injury geographical patterns.
The objective of this study is to evaluate and compare the Poisson and Bernoulli methods in SaTScan in finding potential geographical clusters of pediatric injuries within a fire district's boundary. This fire district is very active in injury prevention activities and wishes to see if it can focus its interventions more effectively using these methods. If significant clusters are found, the next step is to evaluate potential injury prevention strategies depending on the characteristics of injuries in each cluster.
All patients less than 15 years of age who had an emergency medical response within the current boundaries of the Tualatin Valley Fire & Rescue (TVF&R) district during 2001–2005 were included.
Patient data including location of call is documented in an electronic charting system (Sunpro,™ Aether Systems, Inc., Baltimore MD) by treating firefighters. Patients' home addresses are not routinely recorded if the incident did not occur there. This database was then queried and the data downloaded as an Excel 2003™ (Microsoft, Inc., Redmond WA) spreadsheet. Variables analyzed were patient demographics, location of incident, location type, and mechanism of injury. No patient identifying data was available to the author. The Oregon Health & Science University Institutional Review Board approved this study.
Base map and patient data analysis
Similar mechanisms of injury were aggregated to ensure adequate cases in each cohort for analysis. The resulting cohorts were motor vehicle collision injuries (patients that were passengers in the motor vehicle); bicycle and pedestrian injuries (motor vehicle versus bicyclist or pedestrian, or a fall off a bicycle); fall injuries; assault injuries (intentional, including alleged child abuse); and "environmental" injuries (poisoning, heat or cold injuries, drowning and burns). Separate shapefiles were created for the total injuries cohort and each of the mechanisms above. Location type information was consolidated into 1) school, church, or day care; 2) multifamily residential building; 3) single family residence; 4) sport or recreational facility or park;[15, 16] 5) other public building; or 6) street or road. Age was also analyzed by stratifying into age groups defined by the Centers for Disease Control.[17, 18] Disposition was defined as died at scene, not transported, or transported to a hospital (the only options in the EMS system). Race or ethnicity was classified as white, Hispanic, African-American, or other (combined with missing). Patient data was imported into SPSS™ 15.0 (SPSS, Inc., Chicago IL) for statistical analysis. The characteristics of the medical and injured cohorts were compared using the t-test or Pearson χ² statistic where appropriate.
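As a rough check on the age comparison, the t statistic can be recomputed from the reported summary values alone. The sketch below assumes that the ± figures are standard deviations, uses the unequal-variance (Welch) form, and approximates the p-value with the normal distribution because the samples are large. None of these choices is stated in the text, and the function name `welch_t` is hypothetical.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Unequal-variance t statistic computed from summary statistics."""
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error


# Reported values: injured 7.7 ± 4.4 yrs (n = 1997), medical 5.4 ± 4.8 yrs (n = 2806).
# Assumption: the ± values are standard deviations.
t_stat = welch_t(7.7, 4.4, 1997, 5.4, 4.8, 2806)

# Two-sided p-value from the normal approximation (assumption: large samples).
p_two_sided = math.erfc(abs(t_stat) / math.sqrt(2))

print(f"t = {t_stat:.1f}, p < 0.001: {p_two_sided < 0.001}")
```

Under these assumptions the statistic is far beyond any conventional threshold, which agrees with the reported p < 0.001.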
Base map data layers included the fire district's boundary, station locations and first-due areas, census tract boundaries and population, city and county boundaries (Portland Metro Data Resource Center), and streets and highways (StreetMaps,™ ESRI, Inc., Redlands CA). All maps and analysis used the NAD 1983 HARN State Plane of Oregon North FIPS 3601 coordinate system (Lambert conformal conic projection).
The fire district automatically geocodes incident locations as part of its data management using ArcInfo™ (Environmental Systems Research Institute, ESRI, Inc., Redlands CA). The fire district GIS analyst manually locates incidents left unmatched by the automatic geocoding, and these were not validated independently by the author. The patient location data was transferred as a point shapefile and matched to patient case files using a unique patient incident number. The author imported this patient location and case data into ArcView™ 9.2 (ESRI, Inc., Redlands CA) for geographical analysis.
Bernoulli cluster analysis method
In the Bernoulli likelihood function, C is the total number of cases, c is the observed number of cases within the window, n is the total number of cases and controls within the window, N is the combined total of cases and controls within the data set, and I () is the indicator function, which is equal to 1 if c/n > C/N (the window holds a higher proportion of cases than the data set as a whole) and 0 otherwise. Since this analysis is only interested in detecting clusters with higher than expected rates, I () was set equal to 1.
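With the symbols defined above, the Bernoulli likelihood function of the spatial scan statistic is usually written as follows. This is a sketch in the standard notation of the scan statistic literature [7, 8], not a copy of this article's own typesetting.

```latex
L = \left(\frac{c}{n}\right)^{c}
    \left(\frac{n-c}{n}\right)^{n-c}
    \left(\frac{C-c}{N-n}\right)^{C-c}
    \left(\frac{(N-n)-(C-c)}{N-n}\right)^{(N-n)-(C-c)}
    I()
```

The first two factors describe cases and controls inside the window, and the last two describe cases and controls outside it.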
Poisson cluster analysis method
In the Poisson likelihood function, C is the total number of cases, c is the observed number of cases within the window, E [c] is the expected number of cases within the window under the null hypothesis, and I () is the indicator function, which is equal to 1 if c > E [c] and 0 otherwise. Since this study is only interested in detecting clusters with high rates, I () was set equal to 1.
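With the symbols defined above, the Poisson likelihood function of the spatial scan statistic is usually written as follows. As with the Bernoulli form, this is a sketch in the standard notation of the scan statistic literature [7, 8], not a copy of this article's own typesetting.

```latex
L = \left(\frac{c}{E[c]}\right)^{c}
    \left(\frac{C-c}{C-E[c]}\right)^{C-c}
    I()
```

The first factor compares observed with expected cases inside the window, and the second does the same for the area outside it.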
For both the Poisson and Bernoulli models, the likelihood ratio is tested for significance using the Monte Carlo method. A circular window is centered on each census tract centroid (for Poisson analysis) or each incident location (for Bernoulli analysis), and its diameter is varied from zero up to an a priori maximum, defined as the size that includes a certain proportion of the total number of case events. For the purposes of this analysis, 999 Monte Carlo replications were used, the maximum circle size included up to 50% of the total cases being analyzed, and a p-value of less than 0.05 was considered significant. The likelihood function is maximized over all window locations and sizes, and the window with the maximum likelihood constitutes the most likely cluster. Secondary non-overlapping clusters can then be found by subtracting the most likely cluster cases (and controls in the Bernoulli method) from the pool and repeating the above procedure. Any edge effect was ignored in this analysis since there was no data available from outside the fire district's boundary.
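The procedure above can be illustrated with a minimal sketch of a Bernoulli scan and its Monte Carlo test. It is not SaTScan's implementation. The function names, the label-shuffling scheme, and the rule of adding tied-distance points one at a time are assumptions made for illustration, and secondary clusters are not handled.

```python
import math
import random


def _xlogy(x, y):
    """Return x * log(y), treating 0 * log(0) as 0."""
    return 0.0 if x == 0 else x * math.log(y)


def bernoulli_llr(c, n, C, N):
    """Log likelihood ratio for one window under the Bernoulli model.

    c, n: cases and total points (cases + controls) inside the window.
    C, N: cases and total points in the whole data set.
    Returns 0 unless the case proportion inside exceeds that outside.
    """
    if n == 0 or n == N or c / n <= (C - c) / (N - n):
        return 0.0
    inside = _xlogy(c, c / n) + _xlogy(n - c, (n - c) / n)
    rest = (N - n) - (C - c)  # controls outside the window
    outside = _xlogy(C - c, (C - c) / (N - n)) + _xlogy(rest, rest / (N - n))
    null = _xlogy(C, C / N) + _xlogy(N - C, (N - C) / N)
    return inside + outside - null


def max_llr(points, labels, max_case_fraction=0.5):
    """Largest LLR over circles centred on every point (1 = case, 0 = control)."""
    N, C = len(points), sum(labels)
    best = 0.0
    for cx, cy in points:
        # Grow the circle by visiting points in order of distance from the centre.
        order = sorted(
            range(N),
            key=lambda i: (points[i][0] - cx) ** 2 + (points[i][1] - cy) ** 2,
        )
        c = 0
        for n, i in enumerate(order, start=1):
            c += labels[i]
            if c > max_case_fraction * C:  # window may hold at most 50% of cases
                break
            best = max(best, bernoulli_llr(c, n, C, N))
    return best


def monte_carlo_p(points, labels, replications=999, seed=0):
    """Rank the observed maximum LLR among replicates with shuffled labels."""
    rng = random.Random(seed)
    observed = max_llr(points, labels)
    shuffled = list(labels)
    at_least_as_large = 0
    for _ in range(replications):
        rng.shuffle(shuffled)
        if max_llr(points, shuffled) >= observed:
            at_least_as_large += 1
    return observed, (at_least_as_large + 1) / (replications + 1)
```

With 999 replications the smallest attainable p-value is 1/1000 = 0.001, which is why replication counts ending in 9 are a common choice.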
There were an estimated 82,400 children less than 15 years old living within the fire district boundary. During the study period, the fire district responded to a total of 2806 medical calls and 1997 injuries in patients less than 15 years of age for an incidence of 6.8/1000/year and 4.8/1000/year, respectively. Figure 2 shows the census tract population distribution for children less than 15 years old in the fire district. The intervals for the choropleth map were determined by the quantile method. Aggregating similar injury mechanisms revealed there were 413 injuries due to motor vehicle collisions, 219 due to pedestrian and bicycle injuries, 1035 due to falls, 236 due to environmental injuries, and 94 due to assaults.
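The incidence figures follow directly from the counts given above. The short sketch below reproduces the arithmetic, assuming a constant population of 82,400 over the five study years (2001–2005); the function and variable names are illustrative only.

```python
def annual_incidence_per_1000(events, population, years):
    """Events per 1000 population per year."""
    return events / population / years * 1000


POPULATION_UNDER_15 = 82_400  # estimated children < 15 years in the district
STUDY_YEARS = 5               # 2001-2005 inclusive (assumed constant population)

medical = annual_incidence_per_1000(2806, POPULATION_UNDER_15, STUDY_YEARS)
injury = annual_incidence_per_1000(1997, POPULATION_UNDER_15, STUDY_YEARS)

# Mechanism counts as reported; they should add up to the injury total.
MECHANISM_COUNTS = {
    "motor vehicle collision": 413,
    "pedestrian and bicycle": 219,
    "fall": 1035,
    "environmental": 236,
    "assault": 94,
}

print(f"{medical:.1f} {injury:.1f}")    # 6.8 4.8
print(sum(MECHANISM_COUNTS.values()))   # 1997
```

The five mechanism cohorts sum to the 1997 injuries, so every injury falls into exactly one cohort.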
Bernoulli cluster analysis
Table: Comparison of medical and injured patient cohorts. Medical patients (n = 2806) versus injured patients (n = 1997): mean age 5.4 ± 4.8 yrs versus 7.7 ± 4.4 yrs, p < 0.001 (t-test). Rows: age group (0–12 mos, 1–4 yrs, 5–9 yrs, 10–14 yrs) and location type (including single family residence).
Table: Clusters identified by Bernoulli method stratified by mechanism. Column: number in cluster. Rows: total injuries (n = 1997), motor vehicle collision injuries (n = 413), pedestrian and bicycle injuries (n = 219), fall injuries (n = 1035), environmental injuries (n = 236), assault injuries (n = 94).
Poisson cluster analysis
Table: Clusters identified by the Poisson method stratified by mechanism. Column: number in cluster. Rows: total injuries (n = 1997), motor vehicle collision injuries (n = 413), pedestrian and bicycle injuries (n = 219), fall injuries (n = 1035), environmental injuries (n = 236), assault injuries (n = 94).
Comparison of Poisson and Bernoulli cluster analyses
Figures 4, 5, 6, 7, 8, 9 show the location of the significant Poisson and Bernoulli clusters for the total injuries cohort and each of the individual mechanisms, respectively. Each cluster is labelled with its relative risk (RR), the p-value for that cluster, and the number of cases contained in the cluster. In general, there is some geographical overlap of clusters between the methods, but there are some marked differences in the location of clusters identified. Showing both methods on the same map highlights the strengths and weaknesses of each, and where clusters overlap it strengthens the impression that the area is a "hotspot." The two analyses test different hypotheses: the Poisson method compares injury rates to the underlying population using census tract data, while the Bernoulli method compares the location of events to a control group, which may or may not be appropriate.
Figure 4 demonstrates the location of clusters for the total injuries cohort. There are an equal number of clusters found in each method but the Poisson method includes 32% of all cases in its clusters while the Bernoulli method only includes 11% of cases. Three of the Bernoulli clusters mostly overlap with one large Poisson cluster in the most urban part of the fire district while one Poisson cluster consisting of one large census tract has no proximate Bernoulli cluster.
There is an even starker difference in the cluster arrangement for the motor vehicle collision injury cohort (Figure 5), with 13 clusters found by the Bernoulli method and only 3 by the Poisson method. Again, there is some overlap in the core urban area and some at the middle of the western border, but otherwise little overlap. The Bernoulli method has much smaller cluster sizes, perhaps demonstrating a higher sensitivity in finding clusters since it uses precise incident locations.
Figure 6 maps the clusters for pedestrian and bicycle injuries, with each method having two significant clusters that in this case do overlap. The number of cases included in the clusters is also similar, and for each method the central cluster has the highest RR.
The fall injuries cluster analysis (Figure 7) found only one cluster with the Bernoulli method and four with the Poisson method, one of which overlaps on the eastern boundary. Each Poisson cluster consisted of a single census tract, and together they contained 17% of cases, while the small Bernoulli cluster had only 2% of the total.
Finally, Figures 8 and 9 show a single Poisson cluster containing one census tract for the environmental and assault injury cohorts, respectively. Each contains a small proportion of the total cases in the cohort. There are no significant Bernoulli clusters found for these mechanisms.
This study of pediatric injuries in a fire district database showed significant clustering for overall injuries and for each mechanism cohort with the Poisson method, the Bernoulli method, or both. Except for motor vehicle collisions, the majority of injuries occurred outside any identifiable clusters. The RRs for most of the clusters are highly statistically significant, so there is little doubt that these are high-risk areas. On the other hand, targeting injury prevention strategies only to these high-risk areas may have only a minimal impact on the overall injury rates.
This has implications for the field of pediatric injury prevention. Injury prevention programs, either free-standing or as part of a larger organization such as a fire department, struggle with how to implement programs effectively. They have to decide which mechanisms to focus on, what age groups to include, and what geographic areas are highest risk. They also need to contend with whether effective preventative strategies are available, [22, 23] how much they cost, and whether appropriate personnel or infrastructure exist to implement them. Most injury prevention programs have to compromise among these many factors. With a good GIS analysis of injury patterns, though, the programs should be able to make better decisions.
The clusters identified may be the first place to start some injury prevention activities since they have been identified as high-risk areas. These may be good places to establish pilot projects: stakeholders there may be more motivated to work on prevention activities, and because the rates are already high it may be easier to show an effect for interventions. Since most injuries occur outside these clusters, the programs developed by these pilot projects need to be distributed throughout the organization's catchment area to have any appreciable effect on injury incidence. For example, the fire district could first focus on the two clusters of pedestrian and bicycle injuries and do further analysis of which age groups of children are most involved and which of their activities may be leading to an increased rate of injury. A simple question to ask is whether the clusters seem to be proximate to schools, parks or commercial areas. One study found increased pedestrian injuries in proximity to a school in four Californian communities. Once the locations of clusters of pedestrian injuries are found, there are proven interventions that can be done at schools to decrease the incidence. The present study demonstrates only two significant clusters of pedestrian injuries in the study area, consistent with a previous study in Montréal.
The major limitation of this study is that the original data collection was for patient care and not an injury prevention analysis. Trying to obtain similar data through a specific injury prevention research project would be very expensive and take several years to complete. This study used data fields that should be accurate since they are also needed for documentation of patient care. Most similar analyses come from secondary data sources due to resource constraints. As emergency medical service agencies engage in more injury prevention strategies they will collect more appropriate data points to manage these effectively, allowing better research. This data also excluded any injuries that did not generate a 9-1-1 call but may have been seen in a primary care office or emergency department. The relative distribution of where these children are seen first varies with the severity of injury, where the injury occurs, and what mechanism is involved.[5, 21, 23] This dataset should account for most of the serious non-intentional injuries in this population. Except for a few select geographical areas in the United States, a comprehensive injury data collection process is not in place and one has to depend on secondary and limited sources.
Each method of cluster analysis in SaTScan has potential strengths and weaknesses. Certainly the geographical overlap of both methods was less than perfect, with each finding a different number of significant clusters for most mechanisms. The Poisson method had to rely on census tract level population data, and outside the core urban areas the tracts have less regular shapes and population densities. In addition, since the fire district's boundary did not always correspond to census tract boundaries, portions of multiple census tract polygons were used for analysis, leading to possible distortion due to uneven population distributions. SaTScan uses a circular window on census tract centroids to determine potential cluster boundaries, which may not represent the population at risk in a realistic fashion. The injury rates are relatively low, so attempting aggregations at a smaller population size such as census blocks may lead to very low rates for analysis.
Using the Bernoulli method in SaTScan with the controls being medical cases can be criticized. The two cohorts did differ in some demographic factors that may influence the cluster analysis results. The strength of this approach is that it uses cases and controls drawn from a sample of the population at risk that uses the 9-1-1 emergency response system. One would be more certain that overlapping clusters identified by both methods are real, and one might concentrate on these for further analysis. Even if one could more accurately map the population distribution by using dasymetric methods and remote sensing, for example, this would still not take into account the population at risk that travels through, goes to school or day care in, or works in the study area. Estimating the distribution of this population would be a huge undertaking, limited by the available data and the statistical methods to analyze it.
One could argue that for most injuries especially ones occurring on the road system it would be more appropriate to use network analysis to find clusters. Currently, cluster analysis on a network using a stochastic model is in its early stages.[26, 27] This would be a logical next step in the analysis of injury data especially road-related ones.
In this study of pediatric injuries involving a fire district 9-1-1 response, there were identifiable high-risk clusters found for all injury mechanisms combined and each mechanism separately by at least one method of cluster detection. There are strengths and weaknesses to each method of cluster detection. Finding these clusters is the first step in targeting injury prevention interventions to decrease the incidence. More detailed GIS and demographic analysis will further refine possible strategies and allow more rational choices. Other methods of analysis should be attempted on the location of incidents involving injuries.
Tualatin Valley Fire and Rescue.
Jeanne Ervin, GIS Analyst and the firefighters of Tualatin Valley Fire & Rescue, Beaverton, Oregon.
1. Edelman LS: Using geographic information systems in injury research. J Nurs Scholarsh. 2007, 39 (4): 306-11. 10.1111/j.1547-5069.2007.00185.x.
2. Cusimano MD: Geomatics in injury prevention: the science, the potential and the limitations. Inj Prev. 2007, 13 (1): 51-6. 10.1136/ip.2006.012468.
3. Baker SP, Waller A, Langlois J: Motor vehicle deaths in children: geographic variations. Accid Anal Prev. 1991, 23 (1): 19-28. 10.1016/0001-4575(91)90031-Y.
4. Jiang X: Variations in injury among Canadian adolescents by urban-rural geographic status. Chronic Dis Can. 2007, 28 (1–2): 56-62.
5. Klauber MR: A population-based study of nonfatal childhood injuries. Prev Med. 1986, 15 (2): 139-49. 10.1016/0091-7435(86)90084-8.
6. Morency P, Cloutier MS: From targeted "black spots" to area-wide pedestrian safety. Inj Prev. 2006, 12 (6): 360-4. 10.1136/ip.2006.013326.
7. Kulldorff M: A spatial scan statistic. Commun Statist Theory Meth. 1997, 26 (6): 1481-1496. 10.1080/03610929708831995.
8. Kulldorff M: SaTScan User Guide v7.0. Information Management Services Inc. 2006, 4-69.
9. Kulldorff M: Breast cancer clusters in the northeast United States: a geographic analysis. Am J Epidemiol. 1997, 146 (2): 161-70.
10. Kulldorff M, Nagarwalla N: Spatial disease clusters: detection and inference. Stat Med. 1995, 14 (8): 799-810. 10.1002/sim.4780140809.
11. Nkhoma ET: Detecting spatiotemporal clusters of accidental poisoning mortality among Texas counties, U.S., 1980–2001. Int J Health Geogr. 2004, 3 (1): 25. 10.1186/1476-072X-3-25.
12. Aamodt G, Samuelsen SO, Skrondal A: A simulation study of three methods for detecting disease clusters. Int J Health Geogr. 2006, 5: 15. 10.1186/1476-072X-5-15.
13. Song C, Kulldorff M: Power evaluation of disease clustering tests. Int J Health Geogr. 2003, 2 (1): 9. 10.1186/1476-072X-2-9.
14. Abe T, Martin IB, Roche LM: Clusters of census tracts with high proportions of men with distant-stage prostate cancer incidence in New Jersey, 1995 to 1999. Am J Prev Med. 2006, 30 (2 Suppl): S60-6. 10.1016/j.amepre.2005.09.003.
15. Finch C, Cassell E: The public health impact of injury during sport and active recreation. J Sci Med Sport. 2006, 9 (6): 490-7. 10.1016/j.jsams.2006.03.002.
16. Lam LT: Hospitalisation due to sports-related injuries among children and adolescents in New South Wales, Australia: an analysis on socioeconomic and geographic differences. J Sci Med Sport. 2005, 8 (4): 433-40. 10.1016/S1440-2440(05)80058-1.
17. Recommended framework for presenting injury mortality data. MMWR Recomm Rep. 1997, 46 (RR-14): 1-30.
18. Injury Mortality Atlas of the United States, 1979–1987. MMWR Morb Mortal Wkly Rep. 1991, 40 (49): 846-8.
19. Hendricks SA: Power determination for geographically clustered data using generalized estimating equations. Stat Med. 1996, 15 (17–18): 1951-60. 10.1002/(SICI)1097-0258(19960930)15:18<1951::AID-SIM407>3.0.CO;2-P.
20. Huang L, Kulldorff M, Gregorio D: A spatial scan statistic for survival data. Biometrics. 2007, 63 (1): 109-18. 10.1111/j.1541-0420.2006.00661.x.
21. Agran PF: Rates of pediatric and adolescent injuries by year of age. Pediatrics. 2001, 108 (3): E45. 10.1542/peds.108.3.e45.
22. Dowswell T: Preventing childhood unintentional injuries–what works? A literature review. Inj Prev. 1996, 2 (2): 140-9.
23. Mace SE: Injury prevention and control in children. Ann Emerg Med. 2001, 38 (4): 405-14. 10.1067/mem.2001.115882.
24. LaScala EA, Gruenewald PJ, Johnson FW: An ecological study of the locations of schools and child pedestrian injury collisions. Accid Anal Prev. 2004, 36 (4): 569-76. 10.1016/S0001-4575(03)00063-0.
25. Hotz GA: Pediatric pedestrian trauma study: a pilot project. Traffic Inj Prev. 2004, 5 (2): 132-6. 10.1080/15389580490435097.
26. Okabe A, Okunuki KI, Shiode S: The SANET toolbox: New methods for network spatial analysis. Transactions in GIS. 2006, 10 (4): 535-550. 10.1111/j.1467-9671.2006.01011.x.
27. Okabe A, Okunuki KI, Shiode S: SANET: A toolbox for spatial analysis on a network. Geographical Analysis. 2006, 38 (1): 57-66. 10.1111/j.0016-7363.2005.00674.x.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.