International Journal of Health Geographics Geostatistical Analysis of Disease Data: Accounting for Spatial Support and Population Density in the Isopleth Mapping of Cancer Mortality Risk Using Area-to-point Poisson Kriging

Background: Geostatistical techniques that account for spatially varying population sizes and spatial patterns in the filtering of choropleth maps of cancer mortality were recently developed. Their implementation was facilitated by the initial assumption that all geographical units are the same size and shape, which allowed the use of geographic centroids in semivariogram estimation and kriging. Another implicit assumption was that the population at risk is uniformly distributed within each unit. This paper presents a generalization of Poisson kriging whereby the size and shape of administrative units, as well as the population density, is incorporated into the filtering of noisy mortality rates and the creation of isopleth risk maps. An innovative procedure to infer the point-support semivariogram of the risk from aggregated rates (i.e. areal data) is also proposed.


Background
Public health officials frequently use cancer mortality maps to identify areas of excess and their potential causes (e.g. environmental exposure or socio-demographic factors), as well as to guide surveillance and control activities. The interpretation and analysis of those choropleth maps faces three major hurdles: 1) the presence of extreme unreliable rates that typically occur for sparsely populated areas and/or less frequent cancers, 2) the visual bias resulting from the aggregation of health data within administrative units of widely different sizes and shapes, and 3) the mismatch of spatial supports for cancer rates and explanatory variables that prevents their direct use in correlation analysis and often implies an aggregation to the coarser geography. Common pitfalls include devoting unwarranted attention to a few oversized geographical units located in low population density areas or conducting regression analysis at scales that distort the true exposure/response relationship (i.e. ecological fallacy) [1]. What is needed is a coherent spatial methodology that allows both the filtering of the noise caused by the small number problem and the mapping of results as continuous surfaces without subjective administrative boundaries. Creation of isopleth risk maps from aggregated disease rates (i.e. areal data) has been the topic of several papers in the statistical and health science literatures. Geostatistical methods range from straightforward point kriging [2][3][4][5] to complex random field models that require distributional assumptions and computationally intensive parameter estimation using Markov Chain Monte Carlo (MCMC) techniques [6,7].
The simplest approach to create isopleth risk maps consists of assigning the aggregated rates to the geographic centroids of the administrative units, which are then interpolated to the nodes of a regular grid using point kriging. In these studies, the noise attached to rates estimated from small populations was either ignored [2], filtered using empirical Bayes smoothers prior to the interpolation [3], or incorporated in the interpolation procedure using binomial [4] or Poisson kriging [5]. When performing point kriging of areal data, the user makes the practical assumption that all the habitants of the administrative unit live at the same location and the measured rate thus refers to this specific location. This assumption is reasonable whenever the units of aggregation are small with respect to the spacing of the interpolation grid. For example, Oliver et al. [4] mapped the risk of childhood cancer over a 2 km grid in the West Midlands of England using binomial kriging and rates measured within small electoral wards. More recently, Ali et al. [5] created dysentery and cholera incidence maps using Poisson kriging and rates measured for patrilineally-related clusters of households. The assumption of point measurement support becomes clearly inappropriate when the administrative units are counties or states [2,3], which calls for specific methods to incorporate the shape and size of those units in the analysis.
In theory, the system of kriging equations is flexible enough to incorporate data measured over a wide variety of supports and to accommodate either punctual or areal (classically termed "blocks" in the geostatistical literature) prediction supports [8,9]. Practical implementation of kriging in presence of irregular geographical units is, however, challenging mainly for the derivation of the areal covariance terms used in the kriging system [10]. Indeed, the covariance between two areas is approximated as the average of the point-support covariance computed between any two points discretizing these two areas. The first challenge is that the point-support semivariogram model cannot be inferred directly from the observations since only areal data are available. Instead, the semivariogram of areal data needs to be modelled, then deconvoluted to yield the point-support model. Second, the discretization of all the measurement and prediction supports, followed by the averaging of covariance values, can be CPU intensive. This partly explains why most geostatistical studies were restricted to the use of punctual data to conduct interpolation at punctual locations or over areas of regular size, such as blocks to be mined or environmental exposure units to remediate.
The geostatistical analysis of areal data has received increasing attention as kriging becomes more popular in the fields of remote sensing, social science, and medical geography. In particular, several authors [11][12][13][14] have implemented kriging of areal data to predict punctual or areal values, an approach referred to as "area-to-point" (ATP) or "area-to-area" (ATA) kriging following the terminology in Kyriakidis [14]. ATP kriging allows mapping the variability within geographical units while ensuring the coherence of the prediction so that, for example, disaggregated estimates of count data are non-negative and their sum is equal to the original aggregated count. Gotway and Young [12] applied ATA kriging to the mapping of the number of low birth weight (LBW) babies at the Census tract level, accounting for county-level LBW data and covariates measured over different spatial supports, such as a fine grid of ground-level PM 10 concentrations or tract population.
introduced to incorporate varying population sizes and spatial patterns in the processing of cancer mortality data [15,16]. Its implementation was facilitated by the initial assumption that all geographical units are the same size and shape, which allowed the use of geographic centroids in semivariogram estimation and kriging. Another implicit assumption was that the population at risk is uniformly distributed within each unit. Although these assumptions were reasonable for filtering cancer mortality maps that consist of geographical units of similar sizes, point Poisson kriging does not allow a rigorous treatment of the change of support, prohibiting for example the creation of continuous risk surfaces.
It is noteworthy that the increasingly popular full Bayesian modeling approach is overwhelmingly used with the conditional auto-regressive (CAR) model for defining the random effect associated with spatial autocorrelation [17][18][19]. This computationally convenient choice is reasonable if all geographical entities are of similar size and arranged in a regular pattern but it is not particularly attractive otherwise [6]. There have been several attempts to model the spatial correlation as a function of the distance between centroids of geographical units instead of using the arbitrary neighborhood relationship underlying the CAR model [20]. To quote Kelsall and Wakefield [6], "This approach is quite simplistic, however, and again does not acknowledge the differing shapes and sizes and relative orientation of the areas". The same authors describe one of the rare Bayesian models that accounts properly for the spatial support of the data, although it was implemented under the assumption of uniform population distribution within each area. Using semivariograms to model the spatial autocorrelation, they estimated the continuous underlying relative risk function for colorectal cancer mortality in 39 wards of the UK district of Birmingham. This paper presents the first geostatistical study where the size and shape of administrative units, as well as the population density, is incorporated into the filtering of noisy mortality rates and the mapping of the corresponding risk at a fine scale (i.e. disaggregation). This generalization of the Poisson kriging algorithm introduced by Monestiez et al. [21,22] capitalizes on recent developments in the field of change of support and deconvolution of semivariogram estimated from irregular areal data [10]. The approach is illustrated using age-adjusted lung and cervix cancer mortality rates recorded for white females in two contrasted county geographies: 1) state of Indiana that consists of 92 counties of fairly similar size and shape, and 2) four states in the Western US (Arizona, California, Nevada and Utah) forming a set of 118 counties that are vastly different geographical units. Simulation studies are conducted to compare the performances of the new methodology to simple geostatistical approaches (i.e. point kriging of raw or empirical Bayesian smoothed rates) for the same two county geographies. Performance criteria include the accuracy of the prediction of the underlying risk and the quantification of the attached uncertainty. The lack of realistic spatial models in a full Bayesian approach, coupled with the absence of user-friendly software to implement this type of model [23], prohibited the inclusion of this approach in the comparison study of isopleth mapping algorithms.

Data
The methodology to account for spatial support and population density in Poisson kriging will be illustrated using directly age-adjusted mortality rates for a frequent (lung) and less frequent (cervix) cancer. These data are part of the Atlas of Cancer Mortality in the United States [24] and were downloaded from http://www3.cancer.gov/atlas plus/download.html. The analysis focuses on white female rates recorded over the 1970-1994 period and adjusted using the 1970 population pyramid. Two areas with contrasted county geographies were considered: 1) state of Indiana (Region 1), and 2) four states in the Western US (Arizona, California, Nevada and Utah) that will be referred to as Region 2. The choice of these two specific geography areas was guided by the need to assess performances in two contrasted settings: 1) all geographical units have a fairly similar size and shape, which is the "ideal" situation for methods that ignore the spatial support of the data, and 2) geographical units display a wide range of sizes and shapes, which should favour any approach that implicitly accounts for the spatial support of the data in the analysis. The West coast provided a perfect example for the second type of geography (i.e. set of 118 vastly different counties), while Indiana includes a reasonable number (i.e. 92) of counties that are geometrically fairly similar. Figure 1 (top graphs) shows the spatial distribution of age-adjusted mortality rates per 100,000 person-years for lung cancer in Region 1 and cervix cancer in Region 2. Following the recommendations of several studies on map color schemes [25,26], a double-ended color scheme with 10 equally-weighted classes (i.e. boundaries correspond to deciles of the histogram) was used: a gradient of red is used for rates higher than the median, while a gradient of blue is used for lower rates. The population-weighted average of mortality rates is 23.7 per 100,000 person-years for lung cancer and 2.85 per 100,000 person-years for cervix cancer. For Region 1, the population at risk in each county was computed as: 100,000 × the total number of "lung cancer" deaths over the 1970-1994 period divided by the corresponding age-adjusted mortality rate; both datasets are available on the website of the National Cancer Institute (NCI) of the US (URL as given above). A sim-ilar procedure was conducted in Region 2, except that the most frequent breast cancer was used to avoid zero population estimates for a few sparsely populated counties where no cervix cancer mortality was reported during that period. The population sizes are mapped in Figure 1 (middle row) using a rainbow color scheme to avoid any confusion with the rate maps.
A visual inspection of the cervix cancer mortality map conveys the impression that rates are particularly high in the centre of the study area (Nye and Lincoln Counties), as well as in a few Northern California counties. This result must, however, be interpreted with caution since the population is not uniformly distributed across the study area and rates computed from sparsely populated counties tend to be less reliable. This effect, known as "small number problem" [27], is illustrated by the bottom scattergrams in Figure 1. In particular, cervix cancer mortality in excess of 5 deaths per 100,000 person-years is observed only for counties with a 25 year cumulated population smaller than 10 5 . The use of administrative units to report the results (i.e. counties in this case) can also bias the interpretation: had the two counties with the highest rates been much smaller in size, these high values likely would have been perceived as less problematic. The higher frequency of lung cancer mortality, coupled with the narrower range of county sizes in Indiana, makes the interpretation of Region 1 cancer mortality map less hazardous. The highest rate is in fact reported for the most populated county (Marion) whose seat is Indianapolis.
To highlight the limitations of collapsing administrative units into their geographic centroids, the spatial distribution of population within each county is mapped at the top of Figure 2. These maps were produced by allocating the county-level population estimates of Figure 1 to a grid of 25 km 2 cells discretizing each study area, leading to 3,751 and 48,474 cells in Regions 1 and 2, respectively. The relative proportion of the county-level population within each 25 km 2 cell was retrieved from the readily available 2000 census block level data. These population maps illustrate the non-uniform repartition of population, in particular for the vast counties in Region 2. To account for the shape of geographical units and their heterogeneous population density, the distance between any two counties is here estimated as a population-weighted average of Euclidian distances between points discretizing the pair of counties: where P α and P β are the number of points u s and u s' used to discretize the two units or blocks v α and v β , respectively.
The quantity n(u s ) represents the population size within the 25 km 2 cell centred on the discretizing point u s . For the examples of Figure 1, the discretizing points are identified with the nodes of the 5 km grid, yielding a total of 9 to 69 points per county in Indiana, and 11 to 2,082 discretizing points for the West Coast counties. The set of block-to-block distances (Equation (1)) are plotted against Euclidian distances between polygon centroids depicted by red dots in Figure 2 (middle maps). The scatterplots at the bottom of Figure 2 indicate a high correlation between the two sets of distances; discrepancies essentially occur for neighboring counties (i.e. for small distances). The correlation is slightly smaller for Region 2 where county shapes and sizes vary greatly. Block-to-block distances are numerically very close to the distances computed between population-weighted centroids depicted by blue triangles in Figure 2 (middle maps).

Poisson kriging: the centroid-based approach
The geostatistical methodology for the estimation of risk values from empirical frequencies, and its performance relative to common smoothers, is described in details in Goovaerts [15]. This section provides a brief recall of the centroid-based implementation of Poisson kriging (PK) for prediction of aggregated risk values. The approach is then generalized to account for spatial supports and population density in both the case of area-to-area and areato-point predictions.
For a given number N of geographical units v α (e.g. counties), denote the observed mortality rates (areal data) as is the number of recorded mortality cases and n(v α ) is the size of the population at risk. Let us assume for now that all units v α have similar shapes and sizes, with a uniform population density. A straightforward approach to implement geostatistics is to assume that each unit or measurement support is a single point. This simplification amounts at assigning each measured rate to the geographic centroid of the unit over which it has been recorded. For example, let u α = (x α , y α ) be the vector of spatial coordinates for the centroid of the unit v α . The disease count d(u α ) is interpreted as a realization of a random variable D(u α ) that follows a Poisson distribution with one parameter (expected number of counts) that is the product of the population size n(u α ) by the local risk R(u α ).
The risk over a given unit v α is estimated as a linear combination of the rate observed for that unit, z(u α ), and in (K-1) neighboring units:

Equation
Geographic distribution of age-adjusted lung and cervix cancer mortality rates and the populations at risk Figure 1 Geographic distribution of age-adjusted lung and cervix cancer mortality rates and the populations at risk. For the two top maps, the fill color in each county represents the age-adjusted mortality rates per 100,000 person-years recorded over the period 1970-1994 for white females (class boundaries correspond to deciles of the histogram of rates). The middle maps represent the population at risk which was back-calculated from the rate and count data (lognormal scale). The bottom scatterplots illustrate the larger variance of rates computed from sparsely populated counties, in particular for the least frequent cervix cancer.

Population at risk Population at risk
High-resolution population maps and their use to compute population-weighted distances between counties Figure 2 High-resolution population maps and their use to compute population-weighted distances between counties.
The county-level population estimates of Figure 1 were allocated to a grid of 25 km 2 cells according to the 2000 census block level data. The scattergrams plot Euclidian distances between county geographic centroids, depicted by red dots in the middle maps, versus a "block distance" that accounts for the shape of counties and the distribution of the population (Equation 1). Population-weighted centroids are depicted by blue triangles in the middle maps.
where λ i (u α ) is the weight assigned to the rate z(u i ) when estimating the risk at u α . The K weights are the solution of the following system of linear equations: where δ ij = 1 if u i = u j and 0 otherwise, and m* is the population-weighted mean of the N rates. The "error variance" term, m*/n(u i ), leads to smaller weights for less reliable data (i.e. rates measured over smaller populations). The prediction variance associated with the estimate (2) is computed as: To solve the kriging system (Equation 3), one needs a model for the covariance of the risk, C R (h), or equiva- Monestiez et al. [21,22] the semivariogram of the risk is estimated as: The different spatial increments [z(u α )-z(u α +h)] 2 are weighted by a function of their respective population sizes, n(u α )n(u α + h)/(n(u α ) + n(u α + h)), a term which is inversely proportional to their standard deviation. More importance is thus given to the more reliable data pairs (i.e. smaller standard deviations).

Area-to-Area (ATA) Poisson kriging
In presence of geographical units with very different shapes and sizes, it is overly simplistic to assimilate each unit v α to its geographic centroid u α . The spatial support of each unit needs to be accounted for in both the semivariogram inference and kriging. Following the terminology in Kyriakidis [14], ATA kriging refers to the case where both the prediction and measurement supports are areas instead of points. The PK estimate (2) for the areal risk value r(v α ) thus becomes: The weights λ i (v α ) are computed by solving the following kriging system: The main difference between systems (3) and (7) is that point-to-point covariance terms C R (u i -u j ) are replaced by area-to-area covariances Those covariances are numerically approximated by averaging the point-support covariance C(h) computed between any two locations discretizing the areas v i and v j : where P i and P j are the number of points used to discretize the two areas v i and v j , respectively. The weights w ss' are computed as the product of population sizes within the square cells centred on the discretizing point u s and u s' : The kriging variance for the ATA kriging estimator is computed as: is the within-area covariance that is computed according to Equation (8) with

Area-to-Point (ATP) Poisson kriging
A particular case of ATA kriging is when the prediction support is so small that it can be assimilated to a point u s , leading to the following area-to-point kriging estimator and kriging variance: The kriging weights and the Lagrange parameter μ(u s ) are computed by solving the following system of linear equations: The ATP kriging system is similar to the ATA kriging system (Equation 7), except for the right-hand-side term where the area-to-area covariances by area-to-point covariances R (v i , u s ) that are approximated as: where P i is the number of points used to discretize the area v i and the weights w s's are computed according to expression (9).
ATP kriging can be conducted at each node of a grid covering the study area, resulting in a continuous (isopleth) map of mortality risk and reducing the visual bias that is typically associated with the interpretation of choropleth maps. Another interesting property of the ATP kriging estimator is its coherence: the population-weighted average of the risk values estimated at the P α points u s discretizing a given entity v α yields the ATA risk estimate for this entity: Constraint (15) is satisfied if the same K areal data are used for the ATP kriging of the P α risk values. Indeed, in this case the population-weighted average of the righthand-side covariance terms of the K ATP kriging systems is equal to the right-hand-side covariance of the single ATA kriging system: per relations (8), (9) and (14). Therefore, the following relationship exists between the two sets of ATA and ATP kriging weights:

Deconvolution of the semivariogram of the risk
To solve both ATA and ATP kriging systems, one needs to know the point-support covariance of the risk C(h), or equivalently the point-support semivariogram γ(h). This function cannot be estimated directly from the observed rates, since only aggregated data are available. Derivation of a point-support semivariogram from the "regularized" experimental semivariogram computed from areal data is called "deconvolution" [14,28]. Unlike the mining applications that have led to the initial development of geostatistical deconvolution [8], the size and shape of geographical units in health science applications can vary greatly, which makes the deconvolution much more challenging. Goovaerts [10] developed an innovative approach to conduct the deconvolution in presence of irregular units and heterogeneous population distributions. Like most inverse problems, the deconvolution is best tackled using an iterative procedure. The basic idea is to seek the point-support model that, once regularized, is the closest to the model fitted to areal data. Only the most salient features of the deconvolution procedure are described hereafter, and the reader is referred to Goovaerts [10] for a detailed presentation of the algorithm and demonstration of its performances in simulation studies.
The deconvolution is based on the following relationship between the regularized semivariogram model γ v (h) fitted to areal data and area-to-area semivariogram values that are a function of the unknown point-support model γ(h): The area-to-area semivariogram value (v, v h ) represents the mean value of the point-support semivariogram between an arbitrary point in the support v and another in the translated support v h . Because each area (e.g. county) has potentially a different size and shape, one needs to consider any two areas that can be encountered across the study area and average the semivariogram values over all N(h) pairs of areas separated by the distance h. More precisely, the area-to-area semivariogram value is estimated as: The semivariogram value between any two areas, v α and v α+h , separated by the population-based distance h (Equation 1) is computed as: where P α and P α+h are the number of points used to discretize the two areas v α and v α+h , respectively.
The second term in the regularization expression (Equation 18) is the within-area semivariogram value h (v, v).
It fluctuates as a function of the separation distance h since the size of the areas used in semivariogram computation varies as a function of the distance between them: smaller areas tend to be paired at shorter distances, while larger areas can only be paired for distances that exceed half their minimum dimension. This quantity is estimated as the arithmetical average of within-area semivariogram values for any pair of areas separated by a given distance h:  (2) (h) for the next iteration. The procedure stops when the maximum number of allowed iterations has been tried (e.g. 35 in this paper) or the decrease in the D statistic becomes negligible from one iteration to the next. The use of lag-specific rescaling coefficients provides enough flexibility to modify the initial shape of the point-support semivariogram and makes the deconvolution insensitive to the initial solution adopted.

Simple approaches to perform area-to-point kriging
A couple of geostatistical approaches have been proposed in the health science literature to derive continuous maps of health outcomes from areal data. The most straightforward method is to interpolate the raw rates to the nodes of a regular grid using point ordinary kriging [2]: The corresponding kriging variance is computed as: The weights λ i (u s ) are the solution of the following ordinary kriging system: The covariance C(h) is derived from the model fitted to the experimental semivariogram of raw rates that is calculated as: This approach overlooks two critical facts: 1) rates can be very unreliable if recorded over sparsely populated geographical units, and 2) the spatial support of the data (i.e. county) can not be simply equalled to the prediction support (i.e. grid node). In other words, it is unrealistic to collapse vastly different administrative units into their geographic centroids. Another limitation of this approach is its lack of coherence: there is no guarantee that the average of kriging estimates for all P α points discretizing the county v α yields the areal data z(v α ): To correct for the "small number problem", Berke [3] proposed to replace the raw rates z(u i ) in Equations (22) and (25) by rates GBS (u i ) obtained using global empirical Bayes smoothers. Each smoothed rate is easily computed as a linear combination of the raw rate z(u i ) and the global mean m*: The Bayes shrinkage factor λ(u i ) is computed as: where m* and s 2 are the population-weighted sample mean and variance of rates, and is the average population size across the study area. Whenever the rate z(u i ) is based on small population sizes n(u i ) relative to the average size , the factor λ(u i ) is small and the Bayesian estimate (Equation 27) is close to the global mean m*.
The kriging of empirical Bayesian smoothed rates suffers from the same shortcomings as the kriging of raw rates:  27), in this paper kriging was also applied to local Bayes estimates. Local Bayes smoothers are calculated similarly except that all the statistics (i.e. the population-weighted sample mean and variance, population size) are computed within local search windows [29]; for example using the K neighboring observed rates. The estimator thus becomes: where m*(u i ) is the population-weighted average of the rates within the search window. The Bayes shrinkage factor λ(u i ) is now computed as: As for global Bayes smoothers, the relative weight λ(u i ) assigned to the observed rate z(u i ) is smaller for less densely populated counties. For counties with similar population sizes, the factor λ(u i ) is also smaller in regions of greater homogeneity, as measured by a lower local variance s 2 (u i ).

Analysis of lung and cervix cancer data
County-level estimates of mortality risk Figure 3 (top graphs, red curve) shows the omnidirectional semivariograms of lung and cervix cancer mortality risk computed from county-level rates using estimator (5) and the distance measure (1). The experimental semivariogram was fitted using a spherical model with a range of 113 km for lung cancer in Indiana and 437 km for cervix cancer in Region 2. Each model was deconvoluted using the iterative procedure and the high-resolution population maps displayed in Figure 2. For both regions, the procedure stopped once a small (i.e. <1%) decrease in the D statistic occurred three times, after 28 iterations for Region 1 and 10 iterations for Region 2. As expected, the pointsupport semivariogram model (green curve) has a higher sill since the punctual process has a larger variance than its aggregated form. The difference between the sills is particularly large for Region 1, which confirms the stronger impact of regularization for processes with smaller range of autocorrelation [ [30], p. 465]. Applying expression (18) to the point-support model yields a theoretically regularized semivariogram model (blue curve) that is close to the one fitted to experimental values, which validates the consistency of the deconvolution.
The deconvoluted semivariogram models were used to estimate aggregated risk values at the county level in both regions (ATA kriging); see Figure 4 (bottom maps). Figure  4 also shows the maps produced by global and local empirical Bayes smoothers. In all cases, the estimation was based on K = 32 closest observations which were selected according to the population-weighted distance between counties for ATA kriging and the Euclidian distance between the county population-weighted centroids for the two other methods. All maps are smoother than the map of raw rates since the noise due to small population sizes is filtered. The smoothing is particularly strong for the less frequent cervix cancer with a one order of magnitude drop in the variance; see Table 1 (top rows). The variance of smoothed mortality rates is half the variance of raw rates for lung cancer. Table 1 indicates that for both cancers the smoothing is slightly smaller for ATA kriging versus local Bayes smoothers. The global EBS estimates are the least variable, in particular for cervix cancer.
The three risk maps of Region 1 are relatively similar; see Figure 4 (left column). All methods smoothed low lung cancer rates recorded in a few North-western and Northeastern counties characterized by smaller population sizes. For example, after application of Poisson kriging the minimum rate increased from 9.071 to 13.22 deaths/ 100,000 habitants; see Table 1. Since the highest lung cancer mortality rates are observed in the most populated counties in Indiana, the maximum rates remain practically the same after smoothing.
Differences between methods are much more pronounced for cervix cancer mortality in Region 2; see Figure  4 (right column). Unlike the case of lung cancer, both high and low cancer rates are smoothed. Except on the local EBS map, the high risk area formed by two central counties in Figure 1 faded, which illustrates the potential pitfalls associated with the interpretation of the map of observed rates.   more smoothing occurred in sparsely populated Eastern Nevada, leading to higher risk on the ATA kriged map versus local EBS. In summary, ATA kriging seems to weigh more heavily the population size in the filtering of the noisy rate maps.

Point-level estimates of mortality risk
To eliminate the visual bias associated with the interpretation of the choropleth maps of Figure 4, continuous risk surfaces were created for lung and cervix cancer mortality. The most straightforward mapping techniques amount at conducting a point kriging of the county-level raw or smoothed rates, which are simply assigned to the population-weighted centroids of those counties. In other words, one implicitly assumes that all the habitants of a county live at the same location and the measured rate thus refers to this specific location. Note that to ensure that both ATP and point kriging incorporate information on the spatial distribution of county population, the populationweighted centroids displayed in Figure 2 (middle maps) were used for semivariogram estimation and point kriging throughout the analysis. These centroids will be referred to as county centroids, as opposed to geographic centroids, hereafter. A prerequisite for kriging is a semivariogram model that is fitted to experimental values computed from centroid-based rate data using estimator (25). Figure  3 (bottom graphs) shows that the largest semivariogram values are observed for raw rates (blue curve), which is expected because of the additional random variability caused by the small number problem. Since cervix cancer is less frequent than lung cancer, its mortality rates are more likely to be impacted by the small number problem and display higher levels of noise, leading to a sill that is one order of magnitude larger than for the other curves ( Figure 3, right bottom graph). The semivariogram of empirical Bayes estimates has the smallest values since the smoothing reduces the variance of raw rates, in particular for cervix cancer. The use of local versus global empirical smoothers has little impact on the semivariogram for the frequent lung cancer while, as expected, the global smoothers lead to a much lower sill for the semivariogram of cervix cancer. For that cancer, the semivariogram of local EBS risk estimates (green curve) has a concave shape which reflects the presence of a trend that is apparent on the smoothed risk map of Figure 4 (i.e. low rate in Utah and high rates in Southern California). This semivariogram was modelled using a cubic model [6,31] with a range of 2,041 km, while the semivariogram model for local EBS estimates consists of two exponential models and a large nugget effect; see Figure 3 (left bottom graph).   In all four cases, the estimation was based on K = 32 closest county-level rates (local search window). To enforce the coherence constraint, within a given county each grid node was estimated using the same 32 county-level rates; that is the proximity is measured in terms of distance between the centroids of both the prediction and the data counties instead of the distance between the grid node in the prediction county and the centroid of the data county. For lung cancer, the maps created by point kriging of raw and EBS rates show only minor differences; higher EBS risk estimates are found close to the Northeast and Northwest borders where low rates in sparsely populated counties were smoothed by the empirical Bayes method. The ATP kriged map displays much more details, with the presence of clearly delineated areas of lower and higher rates. These visual impressions are confirmed by the statistics listed in Table 1: the variance of the set of ATP kriging estimates (9.510) is twice the variance calculated for the other methods (3.703-4.970). Similar conclusions can be drawn from the comparison of the ATP and EBS kriging maps for Region 2: the variance of ATP kriging estimates (0.266) is twice the variance obtained for the kriging of local EBS (0.154) and more than one order of magnitude larger than for the global empirical Bayes smoother (0.010). The isopleth map of GBS risk estimates is very smooth because of the large nugget effect of the corresponding semivariogram model (recall Figure 3, left bottom graph). Conversely, the kriged map of raw rates is the most variable since it is based on the interpolation of unstable cervix cancer mortality rates, and one can distinguish very clearly the location of several county centroids.
Although the interpretation of spatial patterns is beyond the scope of this methodology paper, comparison of ATP and local EBS kriging maps for Region 2 offers interesting insights about the spatial distribution of cervix cancer mortality risk: • The impact of low or high rates recorded in the vicinity of urban areas (i.e. low rates around Reno and Phoenix, or high rates around Eureka) is much more apparent on the ATP kriged map. This short-scale variability in the risk map results from the greater weight assigned to population density and areal data through the coherence constraint in area-to-point kriging.
• Lower cervix cancer mortality is clearly confined to Utah on the ATP kriging map, while these low rates expand to Eastern Nevada on the maps of kriged local EBS.
• ATP kriging highlights a cluster of high mortality risk in Southern California. A similar cluster was identified with Kern County on the choropleth map of ATA kriged risk in Figure 4. The isopleth risk map shows that high risks are not confined to this sole county but potentially spread over four counties. This information, which is important for designing prevention strategies, is blurred by the use of county-level estimates. Figures 7 and 8 show the maps of the kriging variance associated with the risk estimates of Figures 5 and 6. Differences among interpolation methods are much more pronounced in terms of prediction variance than estimated risk. The point kriging variance is mainly influenced by the location of county centroids (i.e. data geometry): prediction variances are small in the vicinity of county centroids and increase in extrapolation situation, e.g. close to the state borders for the maps of Region 1 (Figure 7). The location of county centroids is particularly obvious on the global EBS map in Region 2 because of the large nugget effect of the corresponding semivariogram. Furthermore, the variance map for local EBS is very smooth as a result of the highly continuous behaviour of the cubic semivariogram model at the origin; recall Figure  3 (right middle graph, green curve). The pattern of these point kriging variance maps essentially reflects the presence of two clusters of small counties (i.e. nearby county centroids) in Northern California and in Utah. The point kriging variance indirectly accounts for the spatial distribution of population within each county through the use of population-weighted centroids, yet differences in population sizes among counties is ignored. This information is incorporated directly into the computation of the ATP kriging variance leading to increased uncertainty in sparsely populated areas. This effect is particularly apparent in Region 2; compare the bottom right map in Figure  8 with the top population map. One can easily associate the location of pockets of low variance with urban centers, such as Los Angeles, San Francisco, Salt Lake City, Las Vegas or Tucson. In both regions, the variance of EBS point risk estimates is the smallest since the smoothing leads to semivariograms with the lowest sills; recall Figure  3.

Coherence of estimation techniques
The isopleth risk maps displayed in Figures 5 and 6 can be viewed as the product of the disaggregation of the choropleth maps of Figure 4. One should thus expect that the aggregation of point kriging risk estimates within each county returns the areal data for that county. This "coherence constraint" is not implicit to the point kriging approach, and Map of lung cancer mortality rates in Region 1, and the risk estimated by the different forms of kriging the variance of aggregated estimates is 50 to 80% smaller than the variance of areal data. This is not the case for ATP kriging where the coherence constraint (Equation 15) ensures that the population-weighted variance, as well as the population-weighted mean, of ATP kriging estimates equals the variance and mean of ATA kriging estimates; see Table 1. This coherence constraint explains the greater level of details noticed in the ATP kriging map versus the risk maps created by the simple point kriging techniques.

Point Poisson kriging versus ATA Poisson kriging
In earlier papers on the use of Poisson kriging for filtering noise in cancer mortality maps [15], a point kriging approach was implemented whereby only county geographic centroids were considered for semivariogram estimation and kriging. To investigate the impact of these simplifying assumptions on the computation of the kriging estimate and variance the results of ATA and point Poisson kriging are compared in Figure 9. Estimation of the mortality risk is fairly robust with respect to the use of centroid geography, in particular for Region 1 where all counties have similar size and shape. Discrepancies between both approaches are much more pronounced for the kriging variance: ignoring the spatial support of county-level data leads almost systematically to larger kriging variance; see Figure 9 (bottom graphs). This result is consistent with the "change-of-support" theory that predicts smaller error variances for block versus point estimation. The magnitude of the overestimation by point kriging is the largest for Region 2 that includes very large counties and has a wide range of county sizes and shapes. Figures 3 through 8 illustrated the major differences between alternative approaches for creating isopleth risk maps from aggregated rates. An objective assessment of the prediction performances of the various techniques requires, however, the availability of the underlying risk maps, which are unknown in practice. Simulation provides a way to generate multiple realizations of the spatial distribution of cancer mortality rates that can be analyzed using alternative approaches. Predicted risks can then be compared to the risk maps used in the simulation.

Simulation studies
For both lung and cervix cancers, L = 100 maps of countylevel mortality rates {z (l) (v α ), α = 1,..., N; l = 1,..., L} were simulated using the following procedure: 1. Continuous maps of mortality risk values, {r (l) (u s ), s = 1,..., S}, are first simulated using non-conditional sequential Gaussian simulation (see [16,9], p. 380 for a description of the algorithm). The simulation grid, which has a 5 km spacing, consists of S = 3,751 and S = 48,474 nodes for Regions 1 and 2, respectively. Five different risk maps (i.e. L = 5) were generated for each cancer, and two realizations for each region are displayed at the top of Figures 10 and  11. Each realization reproduces the histogram of raw cancer mortality rates, while the deconvoluted semivariogram models of Figure 3 were used in the simulation algorithm: exponential model with a range of 75 km for Region 1 and a spherical model with range of 425 km for Region 2. Both models have a zero nugget effect and a unit sill.
2. Each simulated risk map was combined with the population maps of Figure 2 to compute a number of deaths for all grid nodes. Both simulated death counts and population data were then aggregated within each county, and their ratio defines the simulated county-level mortality risk: The 3. For each cancer and each of the five risk maps, 20 realizations of the number of deaths recorded over each county v α were generated by random drawing of a Poisson distribution whose mean parameter is r (l) (v α ) × n(v α ). The division of the simulated death counts by the county population leads to 5 × 20 = 100 sets of simulated mortality rates for each cancer; see bottom of Figures 10 and 11 for the first realization generated for the risk maps displayed in the same figures. The noise caused by the small number problem is particularly apparent for cervix cancer. For example, a couple of counties in the North central part of realization #1 display high mortality rates, while the underlying risk value is low.

Comparison of prediction performances
Each map of simulated rates {z (l) (v α ), α = 1,..., N} underwent a (geo)statistical analysis in order to estimate point risk values using four alternate approaches: point kriging of raw rates, point kriging of global and local empirical Bayesian smoothed rates (EBS), and area-to-point Poisson kriging (ATP). In all four cases, the estimation at each grid node was based on K = 32 closest areal data (local search window). For each realization, the semivariogram needed for each type of kriging was estimated before being automatically modelled and deconvoluted in the case of ATP kriging. Figure 12 shows the different semivariogram models for the two simulated maps of lung and cervix cancer mortality displayed at the bottom of Figures 10 and 11. The reference point (black curve) and areal (yellow curve) support models are the semivariograms fitted directly to the original simulated risk maps before and after aggrega- tion to the county level; see Figure 12 (left column). Models resulting from the deconvolution procedure are plotted as green curves. Comparison of the green and black solid curves indicates that the deconvolution yields point-support models that are reasonably close to the underlying ones in terms of range values. Each deconvo-luted model was regularized according to Equation (18), and the resulting model (blue curve) agrees fairly well with the risk semivariogram model fitted to areal data (red curve), which demonstrates the convergence of the iterative deconvolution procedure. Note also how this risk semivariogram model is close to the reference areal sup-Impact of ignoring the spatial support and population distribution on Poisson kriging results Figure 9 Impact of ignoring the spatial support and population distribution on Poisson kriging results. Top scatterplots illustrate the similarity of the cancer mortality risk estimated using punctual and area-to-area (ATA) Poisson kriging. Ignoring the shape and size of counties (i.e. collapsing county rates to their population-weighted centroids) leads to an overestimation of the kriging variance, in particular for the set of vastly different counties in Region 2.
County maps of lung cancer mortality rates simulated under two scenarios for the underlying continuous risk map Figure 10 County maps of lung cancer mortality rates simulated under two scenarios for the underlying continuous risk map. The number of cases for each county was simulated by random sampling of a Poisson distribution that is defined by the white female population map of Figure 1 and the county-level aggregation of a continuous risk map generated using sequential Gaussian simulation. The units are age-adjusted mortality rates per 100,000 person-years. The color legend applies to all the maps; the class boundaries correspond to the deciles of the histogram of original rates. County maps of cervix cancer mortality rates simulated under two scenarios for the underlying continuous risk map Figure 11 County maps of cervix cancer mortality rates simulated under two scenarios for the underlying continuous risk map. The number of cases for each county was simulated by random sampling of a Poisson distribution that is defined by the white female population map of Figure 1 and the county-level aggregation of a continuous risk map generated using sequential Gaussian simulation. The units are age-adjusted mortality rates per 100,000 person-years. The color legend applies to all the maps; the class boundaries correspond to the deciles of the histogram of original rates.
port model (yellow curve). Because of the good agreement between theoretically regularized and the risk semivariogram models inferred from the county-level rates, discrepancies at the point-support level are essentially caused by the use of the regularization formula (Equation 18). An important factor that influences the deconvolution results is the behavior at the origin of the regularized and pointsupport semivariogram models; for example the fitting of a nugget effect or the use of a spherical (linear behavior) versus cubic (parabolic behavior) model. Unfortunately, in absence of point data this part of the semivariogram model can not be validated. In this paper, no nugget effect was systematically fitted to the point-support model to account for the characteristics of the model used in the simulation.
The semivariogram models used by the point kriging approaches are plotted in the right column of Figure 12.
In particular for cervix cancer, the noise caused by the small number problem leads to semivariograms for raw rates that have a high sill and a large relative nugget effect (blue curves). Rate smoothing using global or local empirical Bayes approaches substantially reduces the sample variability, resulting in smaller sills and nuggets effects, as well as parabolic behaviour for most of the semivariograms at the origin (green and black curves). The risk semivariogram estimator (Equation 5), which accounts for population size in the estimation, yields models with intermediate sills (red curves). Figures 13 and 14 show, for the simulated rate maps #1 of Figures 10 and 11, the risk values estimated at the nodes of a 5 km grid using the four alternate methods. The analysis of simulated maps confirms the conclusions drawn from the analysis of real cancer mortality rates in Figures  5 and 6. The location of county centroids is the most apparent on the kriged map of raw rates which is also the most variable because it is based on the interpolation of unstable mortality rates, in particular for cervix cancer.
Kriging of global and local EBS estimates leads to very smooth maps free of the influence of unreliable rates. Poisson kriging, through its coherence constraint, generates maps of risk estimates that are locally more detailed (i.e. less smoothing) and reveal features of the true risk maps that were blurred by empirical Bayes smoothing, such as the low risk area extending up to the eastern border of Region 2. techniques [15], various criteria were computed and averaged over all realizations to attenuate the impact of statistical fluctuations.
Bias and accuracy of prediction The first two criteria are the mean error (ME) and mean absolute error (MAE) of prediction computed as: The prediction error at each grid node u s is weighted according to the population size at that location, ω s = n(u s ), in order to penalize more the errors that affect a larger population. Table 2 (top rows) indicates that, on average over 100 realizations, all prediction methods are relatively unbiased. However, in particular for cervix, the smallest bias is the most frequently (i.e. 51% of realizations) observed for ATP kriging estimates. Methods differ much more in terms of the mean absolute error of prediction; see Table 2 (bottom rows). On average, ATP kriging reduces the absolute error of simple prediction methods by 20% for cervix and 5-10% for lung. The benefit of Poisson kriging over kriging of raw or empirical Bayesian smoothed rates is almost systematic since it leads to smaller MAE values 99% of the time for cervix and 89% for lung cancer. Although the deconvoluted semivariogram models depart somewhat from the true point-support models (recall Figure 12, left column), the use of the true model γ R (h) (black curve in Figure 12, left column) in ATP kriging does not cause a substantial decline in prediction errors; compare the last two rows of Table 2 for each performance criterion. For both cancers, the smallest kriging variance is obtained for point kriging of smoothed (EBS) rates (see Table 3), which is expected because of the lower sill of the corresponding semivariogram models; recall examples of Figure 12 (right column). Conversely, the noise caused by the small number problem leads to higher sills for the semivariograms of raw rates, hence larger kriging variances, in particular for cervix cancer.

Variance of the prediction errors and smoothing effect
The semivariogram model not only influences the magnitude of the kriging variance at each grid node u s , but it also controls the smoothness of the maps of risk estimates. The variance of the risk estimates was computed for each realization and the average over 100 realizations is listed in Table 3. The variance of the reference risk values, averaged over the five scenarios for the risk map, is 18.137 and 1.828 for lung and cervix cancers, respectively. In agreement with the results displayed for one realization in Figures 13 and 14, point kriging of empirical Bayes estimates generates the largest smoothing effect: the variance represents 15-25% of the reference risk variance. The smoothing effect of ATP kriging is much smaller and similar for both cancers: the ratio of variances is 47-50%. Results for point kriging of raw rates are strongly affected by the reliability of the original data: the smoothing is much smaller when interpolating the unreliable cervix mortality rates (ratio = 94%) than for lung cancer (ratio = 31%).

Quality of the uncertainty model
According to the results of Table 3, point kriging of EBS rates yields the most accurate prediction since the corresponding kriging variance is the smallest; a conclusion that conflicts with results of with δ l = 1 if MSSR (l) >1, and zero otherwise. This averaging allows one to penalize equally the over-and underestimation of the prediction errors by the kriging variance. where G(·) is the cumulative distribution function of the standard normal distribution. The ccdf allows the computation of symmetric p-probability intervals (PI) centred on the median; for example, the 0.5-PI is bounded by the lower and upper quartiles [ (u s ;0.25|(Info)), (u s ;0.75|(Info))]. A correct modeling of local uncertainty would entail that there is a 0.5 probability that the actual risk value at u s falls into that interval or, equivalently, that over the study area 50% of the 0.5-PI include the true value. Since the true risk maps are known for our simulation studies, the fraction of true values falling into any given p-PI, denoted (l) (p), can be computed easily and compared to the expected fraction p. Following Deutsch [32], the agreement between estimated and theoretical fractions is quantified using the following "goodness" statistic: where w(p k ) = 1 if (l) (p k ) >p k , and 2 otherwise. The weights w penalize the situation where the fraction of true values falling into the p-PI is smaller than expected. K' represents the discretization level of the computation; for example, the ccdf percentiles are used as PI boundaries when K' = 50. The goodness statistic confirms the ranking of methods obtained for MSSR; see ) ζ the less frequent cervix cancer. Point kriging of raw rates performs better for cancer with high incidence since the corresponding mortality rates are less unstable.

Conclusion
Capitalizing on recent work in the area of change of support and semivariogram deconvolution, this paper generalized the Poisson kriging approach in order to incorporate not only the shape and size of geographical units, but also the spatial repartition of population within those units in the filtering of cancer mortality maps. The new formulation can accommodate any measurement or prediction support, enabling the estimation of mortality risk both at the level of administrative units (area-to-area kriging) or at the nodes of a fine grid discretizing the area of interest (area-to-point kriging). Unlike the empirical geostatistical methods recently proposed for the creation of isopleth maps from areal cancer rates, the present methodology: 1) conducts the filtering and mapping in a single step, 2) ensures the coherence of the prediction, that is the population-weighted average of kriged risks within each geographical unit equals the risk for this unit, and 3) provides a measure of uncertainty (i.e. kriging variance) that accounts for the spatial pattern of mortality risk, the geometry of administrative units and the spatial repartition of the population at risk.
The analysis of age-adjusted lung and cervix cancer mortality rates illustrated some key features of area-to-point (ATP) Poisson kriging relatively to point kriging of raw rates or empirical Bayesian smoothed rates. For both cancers, the coherence constraint implicit to ATP kriging attenuates the smoothness of the kriged maps while the incorporation of the high-resolution population map enhances the impact of low or high rates recorded in the vicinity of urban areas. Because point kriging of areal data arbitrarily assigns the entire county population to the location of its centroid, information about the size and shape of the geographical unit is lost. Point kriging risk maps can either display unrealistically large variability if based on unreliable raw mortality rates or can be oversmoothed if rates are first filtered using an empirical Bayes approach. Differences between prediction methods are even more pronounced with respect to the prediction variance. The point kriging variance is mainly influenced by the location of county centroids: low values simply indicate the presence of clusters of small counties, while large values are observed close to the edges of the study area. A unique feature of the ATP kriging variance is its incorporation of population sizes, leading to lower prediction variance around urban centres. Although the use of population-weighted versus geographic centroids improved the point kriging predictions (results not shown), ATP kriging yields the most accurate predictions for both cancers. Simulation studies also showed that the ATP kriging variance provides a better quantification of the magnitude of prediction errors. The benefit of the proposed approach is the largest for cervix cancer measured in Region 2 because of the combined effect of the low reliability of mortality rates and the wide range of county shapes and sizes. In earlier papers on the use of Poisson kriging for filtering noise in cancer mortality maps [15], a point kriging approach was implemented whereby only county geographic centroids were considered for semivariogram estimation and kriging. Besides prohibiting the disaggregation of areal data, point Poisson kriging overestimates the prediction variance associated with the areal estimates. The trade-off costs for the implementation of the new form of Poisson kriging are: 1) the inference of the point-support semivariogram from the semivariogram of areal data, 2) the CPU time associated with the computation of area-to-area or area-to-point covariance terms through the fine discretization of geographical units. This paper introduced an innovative deconvolution procedure that allows such an inference in presence of irregular area and heterogeneous population density. Simulation studies demonstrated the convergence of the iterative approach that yields theoretically regularized semivariogram models that are very close to the model inferred from areal data. Although the deconvoluted semivariogram models depart somewhat from the true point-support models, the estimation of risk values appeared to be fairly robust with respect of misspecifications of the semivariogram model. Better prediction of the semivariogram sill would however benefit the quality of the models of uncertainty provided by Poisson kriging. Computational time can be substantially reduced by the use of coarser discretizing grids (e.g. 10 km spacing instead of 5 km). Sensitivity analysis [10] indicates that the minor changes in the deconvoluted model caused by the use of those coarser grids bear little consequences on the kriging predictions.
In their common implementation, full Bayes models fail to account for the size and shape of administrative units, as well as the distribution of the population at risk within those units [17,18]. Simplistic spatial models, such as the popular conditional auto-regressive model, prohibit any change of support and the isopleth mapping of risk values. Therefore, ATP Poisson kriging could only be compared to point kriging of raw rates or empirical Bayesian smoothed rates. Although the latter was introduced as a method for "exploratory" disease mapping [3], it is based on kriging and semivariogram modelling and represents a valid yardstick for the comparison study. When conducting spatial interpolation, one should never forget that every measurement relates to a non-zero finite sample volume. Whereas it is customary in earth science applications to assimilate that measurement support to a point, such a practice becomes a crude simplification for lattice data. Furthermore, treating areal data as point data, then using them to conduct kriging within their own areal support (i.e. disaggregation), is meaningless. This paper presents a coherent spatial methodology that allows both the filtering of the noise caused by the small number problem and the creation of isopleth maps of mortality risk, thereby avoiding the visual bias associated with the interpretation of choropleth cancer mortality maps. By enabling the estimation of mortality risk over any given support, the approach should facilitate the analysis of relationships between health data and putative covariates (i.e. environmental, socio-economic, behavioral or demographic factors) that are typically measured over different spatial supports. These covariates could also be used directly as secondary information in area-to-point kriging, leading to more detailed risk maps at finer scale [12].