
A power comparison of generalized additive models and the spatial scan statistic in a case-control setting

Abstract

Background

A common, important problem in spatial epidemiology is measuring and identifying variation in disease risk across a study region. In application of statistical methods, the problem has two parts. First, spatial variation in risk must be detected across the study region and, second, areas of increased or decreased risk must be correctly identified. The location of such areas may give clues to environmental sources of exposure and disease etiology. One statistical method applicable in spatial epidemiologic settings is a generalized additive model (GAM) which can be applied with a bivariate LOESS smoother to account for geographic location as a possible predictor of disease status. A natural hypothesis when applying this method is whether residential location of subjects is associated with the outcome, i.e. is the smoothing term necessary? Permutation tests are a reasonable hypothesis testing method and provide adequate power under a simple alternative hypothesis. These tests have yet to be compared to other spatial statistics.

Results

This research uses simulated point data generated under three alternative hypotheses to evaluate the properties of the permutation methods and compare them to the popular spatial scan statistic in a case-control setting. Case 1 was a single circular cluster centered in a circular study region. The spatial scan statistic had the highest power though the GAM method estimates did not fall far behind. Case 2 was a single point source located at the center of a circular cluster and Case 3 was a line source at the center of the horizontal axis of a square study region. Each had linearly decreasing logodds with distance from the point. The GAM methods outperformed the scan statistic in Cases 2 and 3. Comparing sensitivity, measured as the proportion of the exposure source correctly identified as high or low risk, the GAM methods outperformed the scan statistic in all three Cases.

Conclusions

The GAM permutation testing methods provide a regression-based alternative to the spatial scan statistic. Across all hypotheses examined in this research, the GAM methods had comparable or greater power estimates and sensitivities exceeding those of the spatial scan statistic.

Background

Statistical tests applied in spatial epidemiology have two primary purposes. The first is to detect spatial variation in disease risk across a study region and the second is to identify areas of increased or decreased risk [1–4]. We consider generalized additive models (GAMs) and the spatial scan statistic, two popular methods that can be used for both purposes.

GAMs are generalizations of generalized linear models that allow nonparametric functions of covariates to be modeled in an additive framework [5]. Webster et al. (2006) used GAMs in spatial settings with a bivariate locally weighted regression (LOESS) smooth [5] and performed hypothesis tests using permutation techniques to determine whether there was spatial variation in disease risk and to locate statistically significant areas of increased or decreased risk [6]. Similar methods have been applied by other authors using tests based on permutation, bootstrap, and Monte Carlo techniques [7–10].

In previous research, we evaluated four permutation tests applied with GAMs to determine type I error rates and power estimates under simple hypotheses (Young, Weinberg, Vieira, Ozonoff, Webster: The Power of Hypothesis Testing Using Generalized Additive Models with Bivariate Smoothers, submitted) [11]. The four methods differed primarily in the determination of the span (neighborhood) size when applying GAMs to observed and permuted datasets. For the conditional permutation test (CPT), originally proposed by Webster et al. (2006), we selected an optimal span by applying GAMs to observed data using a range of possible span sizes. Akaike's Information Criterion (AIC) was recorded for each model and the minimal model AIC corresponded to the optimal span [6, 12]. The statistic of interest, the difference in deviances of models including and excluding the bivariate LOESS smoothing term, was recorded for the observed data. GAMs were then applied to permuted datasets using the optimal span selected for the observed data to produce a conditional permutation distribution of difference in deviance statistics. We determined significance through the comparison of the observed statistic to the sampling distribution generated from the repeated permutations [6]. This method had an inflated type I error rate when applied using the nominal α cutoff. For a nominal significance level of 0.05, CPT had an estimated type I error rate of 9.5% when applied with a bivariate smoothing term. When the null hypothesis was rejected for p-values less than 0.025, the observed type I error rate fell within a 95% confidence interval of 0.05, the desired significance level [11].

The second method was a fixed span permutation test (FSPT) where the span size was determined a priori and was held constant for observed and permuted datasets. The test was otherwise performed in the same manner as the CPT. This test had an appropriate type I error rate [11] but the required a priori determination of the span size was a disadvantage (Young, Weinberg, Vieira, Ozonoff, Webster: The Power of Hypothesis Testing Using Generalized Additive Models with Bivariate Smoothers, submitted). An alternative method was the fixed multiple span permutation test (FMSPT), evaluating GAM models at three or five predetermined span sizes across the range of possible spans. A permutation test was performed at each selected span with a reduced significance cutoff, empirically determined to be α/#Spans Examined. The Bonferroni-like adjustment produced a slightly conservative type I error rate but the FMSPT had similar power estimates when compared to the other methods (Young, Weinberg, Vieira, Ozonoff, Webster: The Power of Hypothesis Testing Using Generalized Additive Models with Bivariate Smoothers, submitted). The final permutation method was the unconditional permutation test (UPT) where we determined the optimal span size for observed and permuted datasets through minimizing the AIC statistics. This method had an appropriate type I error rate; however it was computationally intensive and had reduced power when compared to the other methods [11]. A brief description of the hypothesis testing methods is located in Table 1.

Table 1 Description of Hypothesis Testing Methods and Significance Cutoffs

The spatial scan statistic, a popular method proposed by Kulldorff and Nagarwalla (1995), detects the most likely cluster through comparison of likelihoods of cases falling within and outside circular zones [13]. In recent power evaluations, the scan statistic performed well with a single circular cluster [2, 14, 15] but underperformed with multiple and non-circular clusters [16]. When applied to case-control data, aside from stratified analyses, the scan statistic cannot be adjusted for covariates [16]. We applied the scan statistic through SaTScan, publicly available free software [17], to reflect its application in spatial statistics and spatial epidemiology.

GAMs and the scan statistic were compared in a previous study that focused on cluster detection using aggregate data. The performance of the two methods depended greatly on the shape of the cluster and with irregularly shaped clusters, GAMs outperformed the scan statistic [18]. In this study, we applied the CPT, FMSPT, and spatial scan statistic to simulated case-control point data to estimate power for global and sensitivity for local hypothesis tests. The CPT and FMSPT were selected for comparison as they are computationally efficient, had similar power estimates in a previous study (Young, Weinberg, Vieira, Ozonoff, Webster: The Power of Hypothesis Testing Using Generalized Additive Models with Bivariate Smoothers, submitted), and neither method requires a priori selection of a single span.

Here, simulated data were generated in three cases: Case 1 was a circular cluster of constant increased or decreased risk centered in a circular study region (Figure 1), Case 2 was a circular study region with increased or decreased risk with proximity to the center of the region (Figure 2), and Case 3 was a square study region with increased or decreased risk with proximity to the center of the horizontal axis (Figure 3). We compared the power and sensitivity of the hypothesis testing methods for each Case.

Figure 1

Case 1 Study Region Diagram. This figure is a diagram for the study region generated for Case 1.

Figure 2

Case 2 Study Region Diagram. This figure is a diagram for the study region generated for Case 2.

Figure 3

Case 3 Study Region Diagram. This figure is a diagram for the study region generated for Case 3.

Methods

Simulated Data

Simulated data had a dichotomous outcome and geographic locations generated from a bivariate uniform distribution of longitude and latitude. Odds ratios were chosen to produce a wide range of theoretical power. Odds ratios less than 1.0 indicate areas of decreased risk while odds ratios greater than 1.0 indicate areas of increased risk. For each set of parameters, 1000 datasets were simulated, each containing 1000 observations, chosen to reflect previous studies performed on the Cape Cod Family Health Study data that used GAMs as a statistical analysis technique [19–21]. Statistical analyses were applied to point data and the nominal α level was 0.05.

Case 1 was a circular study region that contained a circular cluster of constant risk centered in the region. (Figure 1, Figure 4) This case was a simplified version of what may be observed if subjects living within some radius of an exposure source, such as a lead smelter [22], were found to be at constant increased or decreased risk when compared to subjects living further from the source. The cluster covered 15% of the study region and approximately 150 of the 1000 subjects lived within the cluster. We considered scenarios where the probabilities of disease outside the cluster were equal to 5% or 20%. Odds ratios comparing those living within to outside the cluster were 0.5, 1.0, 1.5, 2.0, 2.5, and 3.0.
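The Case 1 design can be sketched in a few lines of Python. This is an illustrative reconstruction rather than the syntax used in the study: the function name `simulate_case1`, the unit region radius, and the choice of a cluster radius that makes the cluster cover 15% of the region's area are all assumptions introduced here.

```python
import numpy as np

def simulate_case1(n=1000, odds_ratio=3.0, p_outside=0.20,
                   cluster_fraction=0.15, region_radius=1.0, rng=None):
    """Simulate one Case 1 dataset (hypothetical helper, not the study's code)."""
    rng = np.random.default_rng(rng)
    # Uniform points in a disc: radius = R * sqrt(U), angle uniform on [0, 2*pi).
    r = region_radius * np.sqrt(rng.uniform(size=n))
    theta = rng.uniform(0.0, 2.0 * np.pi, size=n)
    xy = np.column_stack((r * np.cos(theta), r * np.sin(theta)))
    # Assumption: the central cluster covers `cluster_fraction` of the area.
    cluster_radius = region_radius * np.sqrt(cluster_fraction)
    in_cluster = r <= cluster_radius
    # Apply the odds ratio to the baseline odds for subjects inside the cluster.
    odds_out = p_outside / (1.0 - p_outside)
    odds_in = odds_ratio * odds_out
    p = np.where(in_cluster, odds_in / (1.0 + odds_in), p_outside)
    y = rng.binomial(1, p)  # 1 = case, 0 = non-case
    return xy, y, in_cluster

xy, y, in_cluster = simulate_case1(rng=0)
```

Under these assumptions, an odds ratio of 3.0 with a probability of disease of 0.20 outside the cluster corresponds to a probability of 0.75/1.75, roughly 0.43, inside the cluster.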

Figure 4

Case 1 Example Data with Odds Ratio of 3.0. This figure is a replicate of data simulated for Case 1 with an odds ratio of 3.0 and a probability of disease outside the cluster of 20%. Cases are displayed in red while non-cases are displayed in blue.

Simulated data for Case 2 reflected a circular study region with linearly increasing or decreasing logodds of disease with proximity to the center of the region. (Figure 2) This was a simplified pattern of what may be observed where increased proximity to some point, such as a lead smelter [22], increased the risk of disease. The probability of disease at the edge of the region, i.e. the subjects least exposed, was equal to 5 or 20% and odds ratios comparing subjects at the center to those at the edge of the region included 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, and 3.5.

In Case 3, the study region was square with logodds of disease increasing or decreasing linearly with proximity to the center of the horizontal axis of the study region. (Figure 3) These data followed a simple pattern similar to increased risk of disease with proximity to heavy-traffic roadways [23, 24]. As with Case 2, the probabilities of disease for those least exposed, i.e. living at the horizontal edges of the region, were equal to either 5 or 20%. There was no variation in disease risk across the vertical axis. The odds ratios were 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, and 3.5 comparing subjects living at the center to those at the edge of the horizontal axis.
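The linear logodds patterns of Cases 2 and 3 can be written as logit(p) = logit(p_edge) + log(OR) × (1 − d/d_max), where d is the distance from the source and d_max is the distance at the edge of the region. The sketch below implements that relationship; the function names, the unit disc, and the square spanning [-1, 1] in each direction are assumptions made for illustration and are not taken from the study's syntax.

```python
import numpy as np

def linear_logodds_prob(dist, dist_max, odds_ratio, p_edge):
    """Probability of disease when logodds fall linearly with distance from the source."""
    logit_edge = np.log(p_edge / (1.0 - p_edge))
    logit = logit_edge + np.log(odds_ratio) * (1.0 - dist / dist_max)
    return 1.0 / (1.0 + np.exp(-logit))

def simulate_case2(n=1000, odds_ratio=3.5, p_edge=0.20, rng=None):
    """Point source at the centre of a unit disc (assumed scale)."""
    rng = np.random.default_rng(rng)
    r = np.sqrt(rng.uniform(size=n))
    theta = rng.uniform(0.0, 2.0 * np.pi, size=n)
    xy = np.column_stack((r * np.cos(theta), r * np.sin(theta)))
    y = rng.binomial(1, linear_logodds_prob(r, 1.0, odds_ratio, p_edge))
    return xy, y

def simulate_case3(n=1000, odds_ratio=3.5, p_edge=0.20, rng=None):
    """Vertical line source at x = 0 in the square [-1, 1] x [-1, 1] (assumed scale)."""
    rng = np.random.default_rng(rng)
    xy = rng.uniform(-1.0, 1.0, size=(n, 2))
    dist = np.abs(xy[:, 0])  # risk varies along the horizontal axis only
    y = rng.binomial(1, linear_logodds_prob(dist, 1.0, odds_ratio, p_edge))
    return xy, y
```

With an odds ratio of 1.0 the log(OR) term vanishes and every subject has probability p_edge, which corresponds to the null scenario.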

Theoretical Power

Data for Case 1 could be appropriately analyzed using a Pearson chi-square test while data for Cases 2 and 3 could be appropriately analyzed using simple logistic regressions. We derived the theoretical power for each Case to determine how the spatial methods compare to the more simple tests. Details of these derivations are available in Additional file 1. 95% confidence intervals were computed for each power estimate. The margin of error was computed using the standard deviation of the estimated power, i.e. $95\%\ \mathrm{CI} = \hat{p} \pm 1.96\sqrt{\hat{p}(1-\hat{p})/1000}$, where $\hat{p}$ is the estimated power.
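As a worked illustration of the confidence interval for an estimated power, the short function below applies the normal-approximation margin of error with 1000 simulated datasets. The function name `power_ci` is introduced here for illustration only.

```python
import math

def power_ci(p_hat, n_sim=1000, z=1.96):
    """Normal-approximation 95% CI for a power estimated from n_sim simulated datasets."""
    half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / n_sim)
    return p_hat - half_width, p_hat + half_width

lower, upper = power_ci(0.890)
print(f"{lower:.3f}-{upper:.3f}")  # 0.871-0.909
```

The printed interval agrees with the interval reported in the Results for an estimated power of 0.890.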

Generalized Additive Models (GAMs)

We applied GAMs to simulated data using a bivariate LOESS smoothing term to adjust for geographic location [6] using the gam package [25] available in R v2.8.0 [26]. Two hypothesis tests were performed using the GAM framework: the CPT and FMSPT. When performing the CPT, an optimal span size for the observed data was selected through the application of GAM models to the observed data using a range of span sizes between 0.05 and 0.95. The AIC was recorded for each model and the span was selected to minimize the model AIC [6, 12]. The span size was held constant as GAMs were applied to 999 permuted datasets. The statistic of interest was the difference in deviances between models including and excluding the smoothed term for geographic location. The null hypothesis was rejected if the observed difference in deviance statistic fell in the upper 2.5% of the distribution of statistics from permuted datasets [6, 11].
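The logic of the CPT (span chosen once on the observed data, then held fixed across permutations) can be sketched as follows. This is not the R `gam` implementation used in the study: the bivariate LOESS smoother is replaced by a plainly simpler stand-in, a nearest-neighbour average of case status over the nearest span × n points, and the AIC is approximated as the deviance plus twice the trace of that averaging matrix. All function names are hypothetical, and counting the observed statistic in the permutation distribution when forming the p-value is an assumed convention.

```python
import numpy as np

def neighbours(xy, span):
    """Indices of the ceil(span * n) nearest points (self included) for every point."""
    k = max(2, int(np.ceil(span * len(xy))))
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    return np.argsort(d, axis=1)[:, :k]

def deviance(y, p):
    """Binomial deviance of 0/1 outcomes y given fitted probabilities p."""
    p = np.clip(p, 1e-10, 1 - 1e-10)
    return -2.0 * np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

def fit(y, nbrs):
    """Return (difference in deviance versus the null model, AIC-like score)."""
    p_smooth = y[nbrs].mean(axis=1)        # stand-in smoother: local case proportion
    dev_null = deviance(y, np.full(len(y), y.mean()))
    dev_smooth = deviance(y, p_smooth)
    eff_df = len(y) / nbrs.shape[1]        # trace of the averaging matrix
    return dev_null - dev_smooth, dev_smooth + 2.0 * eff_df

def conditional_permutation_test(xy, y, spans, n_perm=999, cutoff=0.025, rng=None):
    rng = np.random.default_rng(rng)
    nbr_sets = {s: neighbours(xy, s) for s in spans}
    # Span selected on the observed data only (minimum AIC-like score).
    best = min(spans, key=lambda s: fit(y, nbr_sets[s])[1])
    nbrs = nbr_sets[best]
    observed = fit(y, nbrs)[0]
    # The same span is reused for every permuted dataset: the "conditional" part.
    perm = np.array([fit(rng.permutation(y), nbrs)[0] for _ in range(n_perm)])
    p_value = (1 + np.sum(perm >= observed)) / (n_perm + 1)
    return best, observed, p_value, p_value < cutoff
```

Permuting the case labels while keeping locations fixed breaks any association between location and outcome, so the permuted statistics approximate the null distribution of the difference in deviance at the chosen span.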

The FMSPT was performed through the application of GAM models across either three or five predetermined span sizes, denoted by FMSPT-3 and FMSPT-5 respectively. The three span sizes selected were 0.1, 0.5, and 0.9 while the five span sizes were 0.1, 0.3, 0.5, 0.7, and 0.9. From each model the difference in deviance statistic was recorded and subsequently compared to the permutation distribution with the same span size. For FMSPT-3, the null hypothesis was rejected if the difference in deviance statistic fell in the upper $100 \cdot \alpha/3$% of the distribution while for FMSPT-5, it was rejected if the statistic fell in the upper $100 \cdot \alpha/5$% of the distribution [27].
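The FMSPT decision rule reduces to comparing each span-specific permutation p-value to α divided by the number of spans examined. The sketch below assumes that the global null hypothesis is rejected when at least one span is significant at the reduced cutoff; the function name and the example p-values are hypothetical.

```python
def fmspt_reject(p_values_by_span, alpha=0.05):
    """Apply the Bonferroni-like cutoff alpha / (number of spans examined)."""
    cutoff = alpha / len(p_values_by_span)
    significant = [span for span, p in p_values_by_span.items() if p < cutoff]
    return len(significant) > 0, cutoff, significant

# Hypothetical permutation p-values for the three FMSPT-3 spans.
reject, cutoff, spans = fmspt_reject({0.1: 0.030, 0.5: 0.012, 0.9: 0.200})
```

In this example the cutoff is 0.05/3, approximately 0.0167, so only the span of 0.5 is significant and the global null hypothesis is rejected.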

Syntax used to generate data for this study and to minimize the AIC statistic across multiple span sizes using the software R v2.8.0 [26] is available on the Boston University Superfund Research Program website at: http://www.busbrp.org/projects/project2.html.

Spatial Scan Statistic

The spatial scan statistic is a method that overlays the study region with overlapping circular regions centered at observation locations with radii varying continuously from zero to some specified upper limit (here, radii vary from zero to containing one-half of the study population). A zone is the infinite number of circles centered at some arbitrary point location with radii varying from zero to the upper bound. For a given radius, a zone can be further described by the number of individuals and cases falling inside the circle. The spatial scan statistic tests the null hypothesis of spatial randomness, i.e. the probability of disease within a circular zone equals that outside the zone, through a likelihood ratio test and detects the most likely cluster as the zone that maximizes the likelihood under the full parameter space. The distribution of the likelihood ratio depends on the underlying population distribution, upon which no assumptions have been made. With small samples it is possible to find the exact distribution; however for larger datasets Monte Carlo simulations are required [13].
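For the Bernoulli model, the log likelihood ratio of a zone depends only on the number of subjects and cases inside and outside the circle. The sketch below computes that ratio and searches circles centred at each observation, growing each circle until it contains half of the subjects. It is a simplified illustration and not SaTScan: it treats zones of high and low rates alike, ignores ties in distance, and omits the Monte Carlo step needed to obtain a p-value. All function names are introduced here.

```python
import numpy as np

def _xlog(k, m):
    """k * log(k / m) elementwise, using the convention 0 * log(0) = 0."""
    k, m = np.broadcast_arrays(np.atleast_1d(np.asarray(k, dtype=float)),
                               np.atleast_1d(np.asarray(m, dtype=float)))
    out = np.zeros(k.shape)
    pos = k > 0
    out[pos] = k[pos] * np.log(k[pos] / m[pos])
    return out

def bernoulli_loglik(cases, total):
    """Maximised Bernoulli log-likelihood for `cases` out of `total` subjects."""
    return _xlog(cases, total) + _xlog(total - cases, total)

def bernoulli_llr(c_in, n_in, c_all, n_all):
    """Log likelihood ratio: separate rates inside/outside versus one common rate."""
    return (bernoulli_loglik(c_in, n_in)
            + bernoulli_loglik(c_all - c_in, n_all - n_in)
            - bernoulli_loglik(c_all, n_all))

def most_likely_cluster(xy, y, max_fraction=0.5):
    """Return (max LLR, index of the centre point, radius) over circular zones."""
    n_all, c_all = len(y), int(np.sum(y))
    k_max = int(max_fraction * n_all)
    best = (-np.inf, None, None)
    for i in range(n_all):
        d = np.linalg.norm(xy - xy[i], axis=1)
        order = np.argsort(d)[:k_max]          # nearest subjects first
        c_in = np.cumsum(y[order])             # cases inside each growing circle
        n_in = np.arange(1, k_max + 1)         # subjects inside each growing circle
        llr = bernoulli_llr(c_in, n_in, c_all, n_all)
        j = int(np.argmax(llr))
        if llr[j] > best[0]:
            best = (float(llr[j]), i, float(d[order[j]]))
    return best
```

A zone whose case proportion equals the proportion outside it has a log likelihood ratio of zero; significance of the maximum would be assessed by recomputing it on datasets with randomly permuted case labels.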

In this study, the scan statistic was applied through the freely available software SaTScan v 7.0.3 [17] using a purely spatial Bernoulli model, appropriate for case-control data. We rejected the null hypothesis if the most likely cluster was significant at the 0.05 level.

Detecting Exposure Source Locations

We aimed to evaluate the ability of the GAM permutation tests and the spatial scan statistic to correctly identify the exposure source as high or low risk, i.e. the sensitivity of the methods. For Cases 1 and 2, we defined the CPT as successful in locating the exposure source if the global null hypothesis was rejected and the point-wise predicted logodds for the center of the study region fell in either the upper or lower 2.5% of the point-wise permutation distribution of predicted logodds. We considered the FMSPT successful if the global null hypothesis was rejected and at least one test predicted a point-wise logodds falling in the upper or lower 2.5% of the corresponding predicted permutation distribution. For Case 1, we also examined the proportion of the true cluster correctly identified as high or low risk by the point-wise tests, given that the global hypothesis was rejected. For Case 3, we defined sensitivity as the proportion of the vertical exposure source that was detected as increased risk for datasets when the global null hypothesis was rejected. For the FMSPTs, we present the proportion of the vertical source detected by at least one span size.
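The point-wise criterion and the Case 1 sensitivity measure can be illustrated as below. The sketch assumes that predicted logodds are available for the observed data (one value per location) and for each permuted dataset (one row per permutation), and it uses empirical quantiles as a stand-in for the ranking used in the study; the function names are hypothetical.

```python
import numpy as np

def pointwise_flags(observed, permuted, tail=0.025):
    """Flag locations whose observed logodds fall in either 2.5% tail of the
    point-wise permutation distribution (rows of `permuted` are permutations)."""
    lower = np.quantile(permuted, tail, axis=0)
    upper = np.quantile(permuted, 1.0 - tail, axis=0)
    return (observed < lower) | (observed > upper)

def cluster_sensitivity(flags, in_true_cluster):
    """Proportion of true-cluster locations flagged as high or low risk."""
    return float(np.mean(flags[in_true_cluster]))

# Toy example: four locations sharing the same permutation distribution.
permuted = np.tile(np.linspace(-1.0, 1.0, 201)[:, None], (1, 4))
observed = np.array([2.0, 0.0, -2.0, 0.5])
flags = pointwise_flags(observed, permuted)
sensitivity = cluster_sensitivity(flags, np.array([True, True, False, False]))
```

For the FMSPTs, the flags obtained at each span size would be combined with a logical "or" before computing the proportion, since a location counts as detected if at least one span flags it.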

For Cases 1 and 2, if the center of the region was detected as part of a significant most likely cluster at the 0.05 level, the scan statistic was considered successful in identifying the area of risk. For Case 1, we also examined the percent of the true cluster overlapped by a significant most likely cluster. For Case 3, we examined the proportion of the vertical exposure source included in a significant most likely cluster.

The sensitivity to detect the exposure source is undefined for data simulated under the null hypothesis and so sensitivity estimates for these data are excluded. The local hypothesis tests are exploratory in nature but provide an additional measure by which the tests can be compared. We did not examine a measure of specificity in this analysis as, for Cases 2 and 3, the exposure source was not dichotomous in nature.

Results

Theoretical powers for Cases 1, 2, and 3 were computed using equations from the literature [28, 31, 32]. (See Additional file 1 for equations.) The theoretical power for a Pearson chi-square test applied to Case 1 ranged from 0.050 to greater than 0.980 for both probabilities of disease. For Case 2, the theoretical power ranged from 0.05 to 0.766 and 0.988 for probabilities of disease of 0.05 and 0.20, respectively. For Case 3, the power ranged from 0.05 to 0.935 and >0.999 for probabilities of disease of 0.05 and 0.20 at the edge of the region. (Table 2)

Table 2 Theoretical Power Based on Pearson Chi-Square Test and Simple Logistic Regression

The CPT outperformed FMSPT-3 and FMSPT-5 in each case. It was appropriately sized with observed type I error rates near 0.05. The power of the CPT was approximately doubled when comparing estimates for a probability of disease of 0.05 to 0.20 for each case. Larger power estimates were generally observed for Case 1, followed by Case 3 and Case 2. (Table 3)

Table 3 Observed Power for GAM Hypothesis Tests and Spatial Scan Statistic

In general, FMSPT-3 had higher power estimates than FMSPT-5, perhaps due to the slightly conservative Bonferroni-like significance cutoff adjustments. The greatest power estimates for the FMSPTs with an odds ratio of 3.0 were observed in Case 1 with a probability of disease for unexposed subjects of 0.20 and power estimates ranging from 0.027 (95%CI: 0.017-0.037) to 0.890 (95%CI: 0.871-0.909) and 0.020 (95%CI: 0.011-0.029) to 0.880 (95%CI: 0.860-0.900) for the 3 and 5 span tests, respectively. The power estimates for Case 3 were greater than those of Case 2 with maximal power estimates of 0.954 (95%CI: 0.941-0.967) and 0.942 (95%CI: 0.928-0.956) observed for FMSPT-3 and FMSPT-5, respectively, for an odds ratio of 3.5 in Case 3 and 0.833 (95%CI: 0.810-0.856) and 0.798 (95%CI: 0.773-0.823) in Case 2. (Table 3)

The spatial scan statistic performed best in Case 1, followed by Cases 2 and 3. In Case 1, the scan statistic had a maximum estimated power of 0.963 (95%CI: 0.951-0.975) with a probability of disease outside the cluster of 0.20. In Cases 2 and 3 the maximum power was 0.758 (95%CI: 0.731-0.785) and 0.703 (95%CI: 0.675-0.731), respectively, for an odds ratio of 3.5. The test had an appropriate type I error rate for all cases and scenarios. It was outperformed by all three of the permutation testing methods in Cases 2 and 3 but had the highest power estimates in Case 1. (Table 3)

Examining method sensitivity, in Case 1, the CPT detected an average proportion of 0.740 (SD: 0.202) and 0.855 (SD: 0.140) of the true cluster as a hotspot with an odds ratio of 3.0 and probabilities of disease outside the cluster of 0.05 and 0.20, respectively. For the same odds ratio and scenarios, FMSPT-3 outperformed the CPT detecting 0.855 (SD: 0.150) and 0.961 (SD: 0.063), while FMSPT-5 detected 0.876 (SD: 0.133) and 0.969 (SD: 0.058). The scan statistic had the smallest average proportion of the true cluster detected with averages of 0.686 (SD: 0.290) and 0.842 (SD: 0.186) for an odds ratio of 3.0 and probabilities of disease outside the cluster of 0.05 and 0.20, respectively. (Table 4)

Table 4 Case 1 Sensitivity - Mean Proportion of True Cluster Detected as Hot- or Coldspot

Examining the ability of the methods to detect the exposure source location, the CPT was outperformed by both FMSPTs in Case 2. When the null hypothesis was rejected, the exposure source was correctly identified as a point of high or low risk by the CPT in up to 98.7% (95%CI: 98.0-99.4%) of datasets. (Table 5) The estimates for the FMSPTs were greater, with the exposure source correctly identified in over 99% of datasets where the global null hypothesis was rejected with an odds ratio of 3.5 and a probability of disease for unexposed subjects of 0.20. (Table 5) The scan statistic was outperformed by the permutation testing methods in Case 2. It detected the exposure source in 72.5% (95%CI: 69.7-75.3%) of datasets where the null hypothesis was rejected with a probability of disease for unexposed subjects of 0.20 and an odds ratio of 3.5. (Table 5)

Table 5 Cases 2 and 3 Sensitivity - Detecting the Exposure Source Location

In Case 3, the FMSPTs outperformed the CPT and scan statistic detecting a higher proportion of the vertical exposure source. For probabilities of disease of 0.05 and 0.20, FMSPT-3 detected an average proportion of 0.634 (SD: 0.334) and 0.788 (SD: 0.204) and FMSPT-5 detected 0.661 (SD: 0.343) and 0.807 (SD: 0.222) of the vertical exposure source, respectively, for an odds ratio of 3.5. The CPT detected proportions of 0.524 (SD: 0.293) and 0.704 (SD: 0.192) on average while the scan statistic had mean detection proportions of 0.346 (SD: 0.277) and 0.443 (SD: 0.257) for the probabilities of disease 0.05 and 0.20, respectively. (Table 5)

The distribution of the selected span size for the CPT was bimodal with modes near 0.3 and 0.6 for Case 1 for a probability of disease outside the cluster of 0.20. For a probability of disease of 0.05, there was greater density of the distribution near large span sizes indicating that for smaller probabilities of disease for unexposed subjects, the GAM was more likely to choose large span sizes than for higher prevalence diseases. (Figure 5a) The distribution of the radius of the scan statistic for Case 1, for a probability of disease for unexposed subjects of 0.20 and an odds ratio of 3.0, was unimodal with both mean and mode near the true cluster radius of 0.15. (Table 6) The distribution for a probability of disease for unexposed subjects of 0.05 was bimodal with modes near 0.1 and the true cluster size showing a tendency of the scan statistic to detect small clusters when analyzing diseases of lower prevalence. (Figure 5b)

Table 6 Radii of Scan Statistic Most Likely Cluster with Significant P-Value (p < 0.05)
Figure 5

Distributions of Optimal Span Size and Most Likely Cluster Radius Observed for Case 1 with Odds Ratio of 3.0. a: Case 1 Conditional Permutation Test Optimal Span Size for Odds Ratio of 3.0. This figure depicts the optimal span size selected by applying GAMs across a range of possible spans and selecting the optimal span as that which corresponds to the minimal model AIC statistic. b: Case 1 Scan Statistic Most Likely Cluster Radius for Odds Ratio of 3.0. This figure depicts the distribution of the observed radius for most likely clusters selected by the scan statistic. It is paired with Figure 5a as we can compare the tendencies of the methods to over- or under-smooth through these figures. With Figure 5a we see that for lower disease prevalence the GAM methods tend to choose a large span size, possibly over-smoothing and missing the cluster. The scan statistic tends to under-smooth and finds a most likely cluster that is much smaller than the true cluster radius, as shown in Figure 5b.

Figures 6 and 7 display two datasets from Case 1 with an odds ratio of 3.0 and a probability of disease of 0.20 for unexposed subjects where the global null hypothesis was rejected by all of the methods compared. In Figure 6, the CPT detected between 6.3 and 33.7% of points as increased risk while FMSPT-3 and FMSPT-5 detected 12.8-38.1% and 13.7-38.2% of points, respectively. For the same case, odds ratio, and probability of disease, the scan statistic detected a cluster with the smallest radius observed, 0.07, while in Figure 7 it detected its largest cluster with a radius of 0.80.

Figure 6

Case 1 Points Detected at High Risk for Data with Scan Statistic Minimum Radius, Probability of Disease Outside Cluster = 0.20. This figure compares the area of the region detected as high risk by the methods discussed in this paper. This particular figure shows the minimum radius observed for a significant most likely cluster with an odds ratio of 3.0 and a probability of disease outside the cluster of 0.20.

Figure 7

Case 1 Points Detected at High Risk for Data with Scan Statistic Maximum Radius, Probability of Disease Outside Cluster = 0.20. This figure compares the area of the region detected as high risk by the methods discussed in this paper. This particular figure shows the maximum radius observed for a significant most likely cluster with an odds ratio of 3.0 and a probability of disease outside the cluster of 0.20.

Discussion

Simulated data were used to compare the power and sensitivity of the CPT and FMSPTs performed with GAMs to the spatial scan statistic under three simple alternative hypotheses. Theoretical power was computed for each alternative hypothesis to provide a comparison of spatial statistic hypothesis tests to simpler methods.

In Case 1, a circular cluster was centered in the study region. The spatial scan statistic identifies clusters by placing circular zones across the region of interest and comparing the likelihood of disease within to outside the zones. As this method is similar to the pattern of disease risk for this Case, it is unsurprising that the scan statistic had the highest estimated power, nearing the theoretical power calculated for a Pearson chi-square test. The CPT had slightly lower power than the scan statistic and FMSPT-3 and FMSPT-5 had lower power estimates.

In Case 2, there was a linear association between Euclidean distance from the center of the circular study region and the logodds of disease. Case 3 was a square study region with a linear association between the proximity to the center of the horizontal axis and the logodds of disease. As these cases would be appropriately analyzed by logistic regression methods, GAM permutation tests had an advantage over the scan statistic in their flexibility to detect different patterns in disease risk. In both Cases 2 and 3, the CPT had the highest estimated power though estimates were at least 10% smaller than the theoretical power of a logistic regression. The scan statistic had the lowest power for Cases 2 and 3. For all tests, power estimates for Case 1 exceeded those of Cases 2 and 3. For the GAM permutation tests, power estimates for Case 3 were greater than those of Case 2 under similar conditions while the estimates were comparable between the two cases for the scan statistic.

The size of the most likely cluster and hot- and coldspots identified by the scan statistic and GAM methods varied greatly across datasets, as observed in Figures 6 and 7. For Case 1, an odds ratio of 3.0, and a lower prevalence, i.e. a probability of disease outside the cluster of 0.05, the spatial scan statistic had larger variation in most likely cluster radius and a greater probability of a most likely cluster having a radius smaller than the true cluster than for higher prevalence. (Table 6, Figure 5b) The scan statistic showed a tendency to detect small clusters while the CPT tended to smooth over small variations in disease risk as a large span (span > 0.80) was more likely to be selected when analyzing diseases of lower prevalence. (Figure 5a)

Comparing model sensitivities, in Case 1, the FMSPT-5 consistently detected the highest proportion of the true cluster as a hot- or coldspot, followed by FMSPT-3 and CPT with the scan statistic having the lowest mean proportion detected. It is not surprising that FMSPT-3 and FMSPT-5 had the highest sensitivity estimates as the definition of sensitivity of these tests considered points detected if they were considered a hot- or coldspot in at least one of 3 or 5 models. Sensitivity for the CPT required the points to be detected at a single span size.

Of interest, the spatial scan statistic had the highest power estimates for Case 1 though it did not detect the highest proportion of the true cluster. As for its sensitivity, the scan statistic detected a most likely cluster of the correct size with a radius within ±0.01 of the true cluster radius in 19.4% of datasets with an odds ratio of 3.0 and a probability of disease outside the cluster of 0.20. Of these most likely clusters, 12.4% were centered in the correct location and only one dataset was observed to have a correct cluster radius and location with a p-value of less than 0.05.

In Case 2, sensitivity was measured by the probability of detecting the exposure source, given that the global null hypothesis was rejected. In practice, after detecting variation in disease risk, public health resources may be sent to specific locations detected as hot- or coldspots to determine the source of exposure. If the exposure point source is not included in the most likely cluster or hot-/coldspot detected, it is unlikely that public health officials will be able to identify the true exposure that is increasing disease risk. A minimum sensitivity of 80% may be considered a reasonable requirement of tests used for application. The FMSPTs had sensitivity estimates of at least 80% for odds ratios over 2.0 while the CPT had sensitivities of 80% for odds ratios of at least 3.0 for both probabilities of disease. The sensitivity of the scan statistic did not reach 80% for any odds ratios, having much lower estimates than the permutation testing methods. Of the datasets where the scan statistic detected a most likely cluster with a p-value of less than 0.05, it rarely identified the correct exposure point source.

Sensitivity for Case 3 was measured as the proportion of the vertical exposure source identified as high or low risk, given that the global null hypothesis was rejected. Again, the spatial scan statistic had much lower sensitivity than the permutation testing methods. For odds ratios of at least 3.0 and a probability of disease for unexposed subjects of 0.20, the FMSPTs had sensitivity estimates of at least 70%, slightly lower than the desired magnitude. FMSPT-5 had the highest sensitivity, followed by FMSPT-3 and CPT.

For the CPT, we selected the span size through minimization of the AIC statistic. Many other methods of span selection are available. We believe similar results would be observed for any data driven span selection procedure, but further research is needed to confirm this. For the FMSPTs, we selected spans for a range across possible span sizes a priori. Other span sizes could be selected and power estimates may change accordingly. For the CPT and FMSPTs, we applied significance level adjustments based on empirical evidence from previous research and a nominal α level of 0.05 [27]. There is no guarantee that similar results will be observed in future studies as the significance cutoffs used here were selected and evaluated through a single set of simulations. For different nominal α levels, appropriate significance cutoffs must be determined. A number of extensions to the scan statistic are available, including elliptical [28] and flexibly shaped [29] zones; however for this research, our interest was in evaluating the original, and widely used, circular spatial scan statistic as applied using the software SaTScan. Applications of other versions of the scan statistic may influence the statistical power and sensitivity of the test. Evaluation of the extended methods is left for future research. In this research, we applied the methods to point data. Both the scan statistic and GAM methods are applicable to aggregate data and if applied to such data, the resulting distribution of power estimates would likely change.

Conclusions

Power of at least 80% indicates that the null hypothesis is correctly rejected at a high rate, a desirable quality of a testing method. The permutation tests each had power estimates exceeding the 80% threshold for large odds ratios. Reduced power was observed for a lower prevalence disease, as was expected with reduced theoretical power. The scan statistic had an observed power estimate of at least 80% for a circular cluster of increased risk centered in the study region but lower estimates for other variations in disease risk.
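
To make the power criterion concrete, the sketch below estimates power as the proportion of simulated data sets in which the null hypothesis is rejected and reports a binomial Monte Carlo standard error. The function name `estimate_power`, the `cutoff` parameter, and the listed p-values are hypothetical and chosen only for illustration; they are not values from this study.

```python
import math
from typing import Sequence, Tuple


def estimate_power(p_values: Sequence[float], cutoff: float) -> Tuple[float, float]:
    """Return (estimated power, Monte Carlo standard error), where power is
    the fraction of simulations with a p-value at or below the cutoff."""
    n = len(p_values)
    rejections = sum(1 for p in p_values if p <= cutoff)
    power = rejections / n
    std_error = math.sqrt(power * (1.0 - power) / n)  # binomial standard error
    return power, std_error


# Invented p-values from ten simulated data sets under an alternative.
p_values = [0.001, 0.020, 0.300, 0.004, 0.049, 0.700, 0.010, 0.030, 0.002, 0.015]
power, std_error = estimate_power(p_values, cutoff=0.05)
meets_target = power >= 0.80  # the 80% threshold discussed above
```

The standard error shrinks with the number of simulations, which is worth checking before reading much into small differences between methods near the 80% threshold.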

Sensitivities of at least 80% are desirable to ensure that the testing methods detect the correct areas of increased or decreased risk. In general, the FMSPTs had the highest sensitivity, with estimates of at least 80% at large odds ratios for all disease risk patterns examined. The CPT had slightly lower sensitivity, though it reached 80% for higher prevalence diseases with large odds ratios. The scan statistic had lower sensitivity estimates for all variations of disease risk examined and was observed to have at least 80% sensitivity only for a circular cluster centered in the study region, which mimicked its own cluster detection method.

Simple patterns of spatial variation in disease risk were considered in this study. The relative pattern of power estimates of the four methods differs based on the pattern of disease risk considered. The spatial scan statistic outperformed the GAM methods in power in the case of a circular cluster centered in the study region, though it underperformed in sensitivity. For a linear association between geographic location and disease risk, the scan statistic had power estimates and sensitivity falling below the GAM estimates. It is important to note that analyses were performed using point data. Results of power comparisons applied to aggregate data may differ from those observed here. Across all simple scenarios examined in this research, the GAM methods presented a reasonable alternative with similar or greater power estimates and sensitivity exceeding that of the spatial scan statistic.

References

  1. Marshall RJ: A Review of Methods for the Statistical Analysis of Spatial Patterns of Disease. Journal of the Royal Statistical Society, Series A (Statistics in Society). 1991, 154: 421-441. 10.2307/2983152.

  2. Kulldorff M, Tango T, Park P: Power comparisons for disease clustering tests. Computational Statistics & Data Analysis. 2003, 42: 665-684. 10.1016/S0167-9473(02)00160-3.

  3. Takahashi K, Tango T: An extended power of cluster detection tests. Statistics in Medicine. 2006, 25: 841-852. 10.1002/sim.2419.

  4. Besag J, Newell J: The Detection of Clusters in Rare Diseases. Journal of the Royal Statistical Society, Series A (Statistics in Society). 1991, 154: 143-155. 10.2307/2982708.

  5. Hastie TJ, Tibshirani RJ: Generalized Additive Models. 1990, New York: Chapman & Hall/CRC

  6. Webster T, Vieira V, Weinberg J, Aschengrau A: Method for mapping population-based case-control studies: an application using generalized additive models. International Journal of Health Geographics. 2006, 5: 10.1186/1476-072X-5-26.

  7. Tusell F: A permutation test for randomness with power against smooth variation. Statistics and Computing. 2001, 11: 147-154. 10.1023/A:1008927315937.

  8. Kelsall JE, Diggle PJ: Spatial Variation in Risk of Disease: A Nonparametric Binary Regression Approach. Journal of the Royal Statistical Society Series C (Applied Statistics). 1998, 47: 559-573. 10.1111/1467-9876.00128.

  9. Hardle W, Huet S, Mammen E, Sperlich S: Bootstrap Inference In Semiparametric Generalized Additive Models. Econometric Theory. 2004, 20: 265-300. 10.1017/S026646660420202X.

  10. Cardinale M, Arrhenius F: The influence of stock structure and environmental conditions on the recruitment process of Baltic cod estimated using a generalized additive model. Canadian Journal of Fish and Aquatic Sciences. 2000, 57: 2402-2409. 10.1139/cjfas-57-12-2402.

  11. Young RL, Weinberg J, Vieira V, Ozonoff A, Webster TF: Generalized Additive Models and Inflated Type I Error Rates of Smoother Significance Tests. Computational Statistics & Data Analysis. 2010,

  12. Hurvich C, Simonoff J, Tsai C-L: Smoothing Parameter Selection in Nonparametric Regression Using an Improved Akaike Information Criterion. Journal of the Royal Statistical Society, Series B (Methodological). 1998, 60: 271-293. 10.1111/1467-9868.00125.

  13. Kulldorff M, Nagarwalla N: Spatial Disease Clusters: Detection and Inference. Statistics in Medicine. 1995, 14: 799-810. 10.1002/sim.4780140809.

  14. Song C, Kulldorff M: Power evaluation of disease clustering tests. International Journal of Health Geographics. 2003, 2: 10.1186/1476-072X-2-9.

  15. Ozonoff A, Bonetti M, Forsberg L, Pagano M: Power comparisons for an improved disease clustering test. Computational Statistics & Data Analysis. 2005, 48: 679-684. 10.1016/j.csda.2004.03.012.

  16. Ozonoff A, Webster T, Vieira V, Weinberg J, Ozonoff D, Aschengrau A: Cluster detection methods applied to the Upper Cape Cod cancer data. Environmental Health: A Global Access Science Journal. 2005, 4:

  17. Kulldorff M: SaTScan. 2007, Boston and Maryland: Harvard Medical School, Boston and Information Management Services Inc, Silver Springs, Maryland, 7.0.3

  18. Aamodt G, Samuelsen SO, Skrondal A: A simulation study of three methods for detecting disease clusters. International Journal of Health Geographics. 2006, 5: 10.1186/1476-072X-5-15.

  19. Aschengrau A, Weinberg J, Rogers S, Gallagher L, Winter M, Vieira V, Webster T, Ozonoff D: Prenatal Exposure to Tetrachloroethylene-Contaminated Drinking Water and the Risk of Adverse Birth Outcomes. Environmental Health Perspectives. 2008, 116: 23-34. 10.1289/ehp.10414.

  20. Vieira V, Webster T, Weinberg J, Aschengrau A: Spatial analysis of bladder, kidney, and pancreatic cancer on upper Cape Cod: an application of generalized additive models to case-control data. Environmental Health Perspectives. 2009, 8: Article 3

  21. Webster T, Vieira V, Weinberg J, Aschengrau A: Spatial analysis of case-control data using generalized additive models. EUROHEIS/SAHSU Conference; 30-31 March 2003; Ostersund, Sweden. Edited by: JL L. 2003, 56-59.

  22. Trepka MJ, Henrich J, Krause C, Schulz C, Lippold U, Meyer E, Wichmann H-E: The Internal Burden of Lead among Children in a Smelter Town - A Small Area Analysis. Environmental Research. 1997, 72: 118-130. 10.1006/enrs.1996.3720.

  23. Wilhelm M, Ritz B: Residential Proximity to Traffic and Adverse Birth Outcomes in Los Angeles County, California, 1994-1996. Environmental Health Perspectives. 2003, 111: 207-216.

  24. McConnell R, Berhane K, Yao L, Jerrett M, Lurmann F, Gilliland F, Kunzli N, Gauderman J, Avol E, Thomas D, Peters J: Traffic, Susceptibility, and Childhood Asthma. Environmental Health Perspectives. 2006, 114: 766-772. 10.1289/ehp.8594.

  25. Hastie TJ: gam: Generalized Additive Models. R Package. 2008

  26. R v 2.8.0. 2008, The R Foundation for Statistical Computing, 2.8.0

  27. Young RL, Weinberg J, Vieira V, Ozonoff A, Webster T: The Power of Hypothesis Testing Using Generalized Additive Models with Bivariate Smoothers. 2009,

  28. Kulldorff M, Huang L, Pickle L, Duczmal L: An elliptic spatial scan statistic. Statistics in Medicine. 2006, 25: 3929-3943. 10.1002/sim.2490.

  29. Tango T, Takahashi K: A flexibly shaped spatial scan statistic for detecting clusters. International Journal of Health Geographics. 2005, 4: 10.1186/1476-072X-4-11.

Acknowledgements

This research was supported by grant P42ES007381 from the National Institute of Environmental Health Sciences (NIEHS), NIH. Its contents are solely the responsibility of the authors and do not necessarily represent the official views of NIEHS, NIH.

Author information

Corresponding author

Correspondence to Robin L Young.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

RLY designed and performed simulation studies, drafted and revised the manuscript, and has approved the final version for submission to this journal. JW, VV, AO, and TFW participated in study design, made significant revisions and contributions to the manuscript, and approved the final version for submission to this journal.

Electronic supplementary material

12942_2010_370_MOESM1_ESM.DOC

Additional file 1: Theoretical Power. This file includes information regarding how to compute the theoretical power for the Pearson chi-square test and logistic regressions applied. (DOC 42 KB)

Rights and permissions

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Young, R.L., Weinberg, J., Vieira, V. et al. A power comparison of generalized additive models and the spatial scan statistic in a case-control setting. Int J Health Geogr 9, 37 (2010). https://doi.org/10.1186/1476-072X-9-37

Keywords