U.S. census unit population exposures to ambient air pollutants
© Hao et al; licensee BioMed Central Ltd. 2012
Received: 27 October 2011
Accepted: 12 January 2012
Published: 12 January 2012
Progress has been made recently in estimating ambient PM2.5 (particulate matter with aerodynamic diameter < 2.5 μm) and ozone concentrations using various data sources and advanced modeling techniques, which resulted in gridded surfaces. However, epidemiologic and health impact studies often require population exposures to ambient air pollutants to be presented at an appropriate census geographic unit (CGU), where health data are usually available to maintain confidentiality of individual health data. We aim to generate estimates of population exposures to ambient PM2.5 and ozone for U.S. CGUs.
We converted 2001-2006 gridded data, generated by the U.S. Environmental Protection Agency (EPA) for CDC's (Centers for Disease Control and Prevention) Environmental Public Health Tracking Network (EPHTN), to census block group (BG) based on spatial proximities between BG and its four nearest grids. We used a bottom-up (fine to coarse) strategy to generate population exposure estimates for larger CGUs by aggregating BG estimates weighted by population distribution.
The BG daily estimates were comparable to monitoring data. On average, the estimates deviated by 2 μg/m3 (for PM2.5) and 3 ppb (for ozone) from their corresponding observed values. Population exposures to ambient PM2.5 and ozone varied greatly across the U.S. In 2006, estimates for daily potential population exposure to ambient PM2.5 in west coast states, the northwest and a few areas in the east and estimates for daily potential population exposure to ambient ozone in most of California and a few areas in the east/southeast exceeded the National Ambient Air Quality Standards (NAAQS) for at least 7 days.
These estimates may be useful in assessing health impacts through linkage studies and in communicating with the public and policy makers for potential intervention.
Keywords: census geographic unit; concentration; population exposure; ambient air pollutants; PM2.5; ozone
- BG: census block group
- CDC: Centers for Disease Control and Prevention
- CGU: census geographic unit
- CMAQ: Community Multiscale Air Quality
- EPA: U.S. Environmental Protection Agency
- EPHTN: Environmental Public Health Tracking Network
- MAD: mean absolute deviation
- NAAQS: National Ambient Air Quality Standards
- PM2.5: particulate matter with aerodynamic diameter < 2.5 μm
- R: correlation coefficient
Air pollution monitoring data have customarily been compiled and maintained by the EPA and/or state and local agencies. These data have been used in several studies that found ambient air pollutants associated with mortality [1–4] and morbidity [5–9]. However, air monitoring sites are sparsely located and cover very limited geographic areas (only 20% of U.S. counties have at least one monitoring station for PM2.5), and the temporal resolution and type of pollutants measured vary by station (e.g., PM2.5 data are only available about every 3-6 days). Thus, studies based on monitoring data were usually limited to high population density areas such as cities or urban/suburban centers, where most monitoring stations are located.
To expand geographic coverage and increase temporal resolution of air pollution data, several studies have recently estimated ambient air pollution concentrations using various data sources and advanced modeling techniques [10–13]. Thus, areas with very sparse or no monitoring data now have gridded data with a variety of spatial (e.g., 4 km, 36 km) and temporal (e.g., hourly, daily) resolutions. However, these data have not been widely accepted by health researchers, partly because studies of possible effects of ambient air pollutants on human health often require population exposures to be presented at certain census geographic levels (e.g., census tract, county), where health data are usually available to maintain confidentiality of individual health data [14, 15]. Other socioeconomic and demographic data are also routinely collected at such geographic resolutions.
Ideally, concentration should be presented at the finest CGU possible, where air pollution concentration may approximate the potential population exposure to an ambient air pollutant. Actual population exposure, by contrast, may be close to zero in places where few people live (e.g., mountains), no matter how high the pollutant concentration. From a public health perspective, it is the exposure that makes people sick. The goal of this study is, therefore, to estimate CGU population exposures to ambient PM2.5 and ozone. Two major steps are taken to achieve this goal: 1) estimate BG daily ambient PM2.5 and ozone concentrations from the gridded data and compare them against ground-based monitoring values; and 2) aggregate BG concentrations to generate population exposure estimates for larger CGUs using BG population as a weighting factor. We choose BG (instead of census block) as the basic unit because BG is the lowest CGU where population data are available on an annual basis. BGs generally contain between 600 and 3,000 people, with an optimum population size of 1,500.
Materials and methods
Data source: gridded PM2.5 and ozone concentrations
Gridded PM2.5 (μg/m3) and ozone (ppb) concentrations were obtained using a hierarchical Bayesian model developed by the EPA for CDC's EPHTN, which provides 24-hour maximum PM2.5 and 8-hour maximum ozone concentrations on a daily basis (2001-2006). The model uses source-based Community Multiscale Air Quality (CMAQ) model outputs and monitoring data. It accounts for spatial and temporal dependencies of air pollutants through a hierarchical Bayesian approach. The spatial resolution of the data was inherited from the CMAQ modeling outputs. CMAQ considered information about emission inventories, meteorological information, and land use. Detailed information about CMAQ and monitoring data can be obtained from http://epa.gov/asmdnerl/CMAQ and http://airnow.gov, respectively. The model resulted in two sets of gridded data: 36 km grid-cells for the contiguous U.S. and 12 km grid-cells for an eastern portion of the U.S. The 12 km domain includes the Northeast census region; the South Atlantic and East South Central divisions of the South census region (excluding part of south Florida), plus part of Arkansas and Louisiana; and portions of the Midwest census region, namely the entire East North Central division and part of Minnesota, Iowa and Missouri (http://www.census.gov/geo/www/us_regdiv.pdf). The gridded data fill "holes" in both time (when data are missing on certain days) and space (locations where data are not available). Information on CDC's ongoing EPHTN has been described elsewhere [21, 22] and is also available from http://www.cdc.gov/ephtracking.
Estimating BG PM2.5 and ozone concentrations
We used a distance-weighting method to estimate BG daily PM2.5 and ozone concentrations for all U.S. BGs based on 36 km-gridded data (12 km-gridded data for an eastern portion of the U.S.). Empirical studies that compared different methods of areal interpolation suggested that distance-weighting was an appropriate method for calculating population exposure estimates [10, 24]. Distance-weighting relaxes the homogeneity assumption associated with the area-weighting method and overcomes the bias introduced by the equal contribution assumption associated with the internal or nearest neighboring method.
where δ_i is the standard error associated with BG estimate μ_i, and σ_i is the standard error associated with the original grid's concentration measure P_i.
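The distance-weighting step can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes inverse-distance weights with power 1, a haversine great-circle distance between the BG centroid and grid-cell centers, and independence among the four nearest grids when combining their standard errors. The function names `haversine_km` and `idw_estimate` and the input layout are hypothetical.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km (assumed stand-in for an ellipsoidal formula)."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def idw_estimate(centroid, grids, k=4, power=1.0):
    """Estimate a BG concentration mu and its standard error delta.

    centroid: (lat, lon) of the BG centroid.
    grids: iterable of (lat, lon, P_i, sigma_i) for grid-cell centers.
    The weighting power and the independence of grid values are assumptions.
    """
    lat0, lon0 = centroid
    # Keep the k grid cells closest to the centroid.
    nearest = sorted(
        (haversine_km(lat0, lon0, lat, lon), p, s) for lat, lon, p, s in grids
    )[:k]
    # A centroid that coincides with a grid center takes that grid's value.
    if nearest[0][0] == 0.0:
        return nearest[0][1], nearest[0][2]
    raw = [1.0 / d ** power for d, _, _ in nearest]
    total = sum(raw)
    weights = [w / total for w in raw]
    mu = sum(w * p for w, (_, p, _) in zip(weights, nearest))
    # Variance of a weighted sum of independent terms: sum of (w_i * sigma_i)^2.
    delta = math.sqrt(sum((w * s) ** 2 for w, (_, _, s) in zip(weights, nearest)))
    return mu, delta
```

With four equidistant grids the weights are equal, so the estimate reduces to the plain average of the four grid values; any dependence among neighboring grids would make the reported standard error too small.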
Third, the derived 2001-2006 BG daily estimates for PM2.5 and ozone were compared with monitoring data observed at ground stations within each BG boundary. The comparison was restricted to the area covered by the 12 km grid-cells (i.e., an eastern portion of the U.S.), so that estimates from both 36 km- and 12 km-gridded data were available. The number of monitoring sites in each BG ranges from 0 to 2. The comparison was conducted for those BGs containing 1 or 2 monitoring sites. The majority of BGs contained only one monitoring site (e.g., 1055 BGs contain one PM2.5 monitoring site versus 26 BGs that contain two). We calculated two statistics for data comparison: mean absolute deviation (MAD), an intuitive measure of absolute fit; and correlation coefficient (R), a measure of relative fit. MAD measures the average absolute deviation of the estimates from their corresponding observed data. In time series analysis, MAD measures the average absolute deviation of observations from their forecasts.
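As an illustration of the two comparison statistics, the following sketch computes MAD and R for paired estimates and observations. It assumes R is the Pearson correlation coefficient, which the text does not state explicitly; the function names are hypothetical.

```python
import math

def mean_absolute_deviation(estimates, observed):
    """Average of |estimate - observation| over paired values."""
    pairs = list(zip(estimates, observed))
    return sum(abs(e - o) for e, o in pairs) / len(pairs)

def pearson_r(estimates, observed):
    """Pearson correlation coefficient between paired values (assumed form of R)."""
    n = len(estimates)
    mean_e = sum(estimates) / n
    mean_o = sum(observed) / n
    cov = sum((e - mean_e) * (o - mean_o) for e, o in zip(estimates, observed))
    var_e = sum((e - mean_e) ** 2 for e in estimates)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    return cov / math.sqrt(var_e * var_o)

# Example: BG estimates versus monitor readings on three days (made-up values).
est = [10.0, 12.0, 14.0]
obs = [11.0, 12.0, 16.0]
print(mean_absolute_deviation(est, obs))  # 1.0
```

MAD is in the pollutant's own units (μg/m3 or ppb), so it answers "how far off on average", while R only reflects how well estimates track the day-to-day ups and downs of the observations.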
Estimating daily population exposures to ambient PM2.5 and ozone for larger CGUs
where δ_i is the standard error associated with BG estimate μ_i. In this study, we generated daily population exposure estimates (PEs and their standard errors ψs) (2001-2006) for census tract, county, state and the U.S. by aggregating BG daily estimates weighted by BG population distribution across the corresponding larger CGUs.
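The bottom-up aggregation can be sketched as a population-weighted mean of BG estimates. This is a hedged illustration rather than the study's code: the standard error ψ is computed here under the assumption that BG estimates are independent, and the function name and input layout are hypothetical.

```python
import math

def population_exposure(bgs):
    """Aggregate BG estimates to one larger CGU for a single day.

    bgs: iterable of (population, mu, delta) for the BGs inside the CGU.
    Returns (PE, psi); (None, None) if the CGU has no population.
    """
    bgs = list(bgs)
    total_pop = sum(pop for pop, _, _ in bgs)
    if total_pop == 0:
        return None, None
    # Population-weighted mean of BG concentrations.
    pe = sum(pop * mu for pop, mu, _ in bgs) / total_pop
    # Standard error of the weighted mean, assuming independent BG estimates.
    psi = math.sqrt(sum((pop / total_pop * delta) ** 2 for pop, _, delta in bgs))
    return pe, psi

# Example: two BGs in one tract (made-up values).
print(population_exposure([(1000, 10.0, 1.0), (3000, 20.0, 2.0)])[0])  # 17.5
```

Because the same BG building blocks feed every level, running this once per tract, county, and state keeps the estimates consistent across geographic levels.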
We mapped the 98th percentiles of 2006 daily population exposures to ambient PM2.5 and ozone for census tract and county to demonstrate geographic variation in population exposures and to highlight where severe population exposures to these two ambient pollutants could potentially occur. The 98th percentile of the 2006 daily population exposure estimates corresponds to the seventh-highest daily population exposure (i.e., 2% of 365 days equals ~7 days) that the population in a CGU experienced in that year. We grouped population exposure to ambient air pollutants into five categories. The cut point for the second highest category was adjusted to match the NAAQS (daily 24-hour standard of 35 μg/m3 for PM2.5 and daily 8-hour standard of 75 ppb for ozone), and the highest category and the lowest three categories were set at equal lengths of 10 units above or below the standards (e.g., cut points for the highest categories were set at 45 for PM2.5 and 85 for ozone). ArcGIS software was used in mapping. Similarly, we mapped the 90th percentiles of 2006 daily population exposures to ambient PM2.5 and ozone for census tract and county with five manually grouped categories (the highest category was set to match the NAAQS) to allow a spatial pattern to emerge. The 90th percentile of the 2006 daily population exposure estimates corresponds to the thirty-fifth-highest daily population exposure (i.e., 10% of 365 days equals ~35 days) that the population in a CGU experienced in that year.
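The percentile and category rules can be made concrete with a short sketch. It reads the 98th percentile as the seventh-highest daily value, as the text describes, and places cut points at NAAQS − 20, NAAQS − 10, NAAQS, and NAAQS + 10. How values that fall exactly on a cut point are assigned is an assumption here, and the function names are hypothetical.

```python
from bisect import bisect_left

def kth_highest(daily_values, k):
    """Return the k-th highest daily value (k=7 for ~98th, k=35 for ~90th percentile)."""
    return sorted(daily_values, reverse=True)[k - 1]

def exposure_category(value, naaqs, step=10.0):
    """Map a value to one of five categories, 0 (lowest) to 4 (highest).

    Cut points sit at naaqs - 2*step, naaqs - step, naaqs, naaqs + step.
    Assumption: a value equal to a cut point belongs to the lower category,
    so only values strictly above the NAAQS land in categories 3 and 4.
    """
    cuts = [naaqs - 2 * step, naaqs - step, naaqs, naaqs + step]
    return bisect_left(cuts, value)

# Example: a year of made-up daily values 1..365.
year = list(range(1, 366))
print(kth_highest(year, 7))           # 359
print(exposure_category(38.0, 35.0))  # 3 (above the PM2.5 standard, at most 45)
```

For ozone the same function applies with `naaqs=75.0`, giving cut points of 55, 65, 75, and 85 ppb.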
We further calculated the number of days when PM2.5 or ozone concentration exceeded the NAAQS for each BG. The populations at risk for larger CGUs were calculated by aggregating BG populations with exposure to ambient PM2.5 or ozone exceeding the NAAQS.
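A minimal sketch of the exceedance count and the population-at-risk aggregation follows. The minimum number of exceedance days that qualifies a BG as "at risk" is not specified in this passage, so `min_days` is an assumed parameter; the function names and input layout are likewise hypothetical.

```python
def exceedance_days(daily_values, naaqs):
    """Count days on which the concentration is strictly above the standard."""
    return sum(1 for v in daily_values if v > naaqs)

def population_at_risk(bgs, naaqs, min_days=1):
    """Aggregate BG populations whose exposure exceeded the NAAQS.

    bgs: iterable of (population, daily_values) for BGs inside one larger CGU.
    min_days: assumed threshold on the number of exceedance days.
    Returns (population at risk, percentage of the CGU population).
    """
    at_risk = 0
    total = 0
    for pop, daily in bgs:
        total += pop
        if exceedance_days(daily, naaqs) >= min_days:
            at_risk += pop
    pct = 100.0 * at_risk / total if total else 0.0
    return at_risk, pct

# Example: two BGs with three made-up daily PM2.5 values each.
bgs = [(1000, [30.0, 36.0, 40.0]), (3000, [10.0, 20.0, 35.0])]
print(population_at_risk(bgs, naaqs=35.0))  # (1000, 25.0)
```

Setting `min_days=7` would align the count with the seven-day framing used for the 98th percentile maps.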
Data comparison: BG estimates against ground-based monitoring data
Table 1. Comparison statistics between BG daily estimates and ground-based monitoring data in an eastern portion of the U.S., by year
[Table body not recovered. Surviving labels: column groups for the 12 km grid and the 36 km grid; rows for PM2.5 (μg/m3).]
Estimating CGU population exposures to ambient PM2.5 and ozone
Population at risk
The estimated population (percentage) at risk by state in 2006
The study has three important results. First, daily BG estimates for ambient PM2.5 and ozone concentration were comparable to data observed at monitoring sites, which suggests that inverse-distance weighting was an appropriate method to generate estimates for BGs from gridded data. Second, we generated daily potential population exposure estimates, for both PM2.5 and ozone, for various CGUs from BG to state and the U.S. Such population exposure estimates for small areas such as census tracts and counties are very valuable for conducting health impact studies. Moreover, this result highlights the need for investigation and intervention in places with higher estimated daily potential population exposures (not concentrations) and/or longer duration.
The geographical patterns of PM2.5 and ozone found (especially at the census tract level) were generally consistent with the rankings of most polluted cities (by year-round particle pollution and ozone, respectively) provided by the American Lung Association, available at http://www.stateoftheair.org/. The highest daily potential population exposure to ambient PM2.5 on the west coast and in the northwest U.S. may be largely attributable to organic carbon from biomass burning such as wildfires, waste burning, and woodstoves [34, 35], though nitrate, sulfate, or crustal material may also represent substantial components of PM2.5 in the western U.S. The higher daily potential population exposure to PM2.5 in other areas, and to ozone in general, may occur mainly in megacities or large metropolitan areas, where sources of ozone precursors such as volatile organic compounds and oxides of nitrogen are concentrated. These sources include heavy traffic, which also contributes to organic carbon and nitrate in PM2.5, and electric utilities and industrial boilers, which also contribute to sulfate and nitrate in PM2.5 [36, 37].
The third important result was that we generated population at risk for each CGU from BG to state and the U.S. based on the NAAQS for PM2.5 and ozone. This result provides a hierarchical structure that links hazardous pollution to population affected at different geographic levels. For example, population at risk presented at the state level could be easily traced back to specific CGUs, where information on potential population exposures to ambient air pollutants and population size is needed at smaller CGUs. Such detailed information on potential population exposure level and size of population affected could be used to facilitate communications among public health professionals and/or policy makers across different levels of jurisdiction and help them prioritize resources based on size of population affected and duration of exposures to ambient air pollutants.
There are several limitations. First, we assumed independence among the nearest four grids. This could potentially underestimate the standard errors associated with BG estimates. Second, we used the BG centroid to represent the entire BG area, which on average contains about 39 census blocks. In reality, however, ambient PM2.5 or ozone may vary within a BG. Although we considered converting gridded concentration data to census blocks (the smallest CGU in the U.S.), we were limited to BGs because population data were not available on an annual basis at the block level, and such data are needed to generate potential population exposure estimates for larger CGUs. Third, like other studies, we could not account for net population gain or loss for a BG on a daily basis due to population movement across BGs.
An additional limitation was associated with the uncertainty of 36 km- versus 12 km-gridded data. For example, BG estimates from 36 km-grids were slightly closer to ground monitoring data than those estimated from 12 km-grids. This may be explained by the different sets of input variables included in the 36 km- versus 12 km-CMAQ modeling systems. We compared 12 km- and 36 km-gridded data against values observed at the nearest monitoring site within specific grids (in an eastern portion of the U.S.), and the comparison statistics (e.g., MAD and R) showed the same pattern as in Table 1 (data not shown): 36 km-gridded data were closer to the observed values than 12 km-gridded data. Thus, the results must be interpreted in the context of the limitations of this study.
We presented a method to allocate gridded data to BGs based on spatial proximities between BGs and their four nearest grids. We used a bottom-up (fine to coarse) strategy to generate CGU population exposures to ambient air pollutants based on BG estimates. Given that BG concentration derived from inverse-distance weighting was comparable to the ground-based monitoring data, using BG as a building block not only provided comparable population exposure estimates across CGUs, but also guaranteed that patterns shown at different geographic levels were consistent, with finer geographic resolution showing more detailed location for potential population exposures to ambient air pollutants. These estimates may be useful in communicating to the general public about the amount and duration of potential population exposures to ambient air pollutants and size of population affected for various geographic levels.
- Woodruff TJ, Darrow LA, Parker JD: Air pollution and postneonatal infant mortality in the United States, 1999–2002. Environmental Health Perspectives 2008, 116(1):110–115.
- Jerrett M, Burnett RT, Ma RJ, Pope CA, Krewski D, Newbold KB, Thurston G, Shi YL, Finkelstein N, Calle EE, et al.: Spatial analysis of air pollution and mortality in Los Angeles. Epidemiology 2005, 16(6):727–736.
- Franklin M, Schwartz J: The impact of secondary particles on the association between ambient ozone and mortality. Environmental Health Perspectives 2008, 116(4):453–458.
- Bell ML, McDermott A, Zeger SL, Samet JM, Dominici F: Ozone and short-term mortality in 95 US urban communities, 1987–2000. JAMA 2004, 292(19):2372–2378.
- Darrow LA, Klein M, Flanders WD, Waller LA, Correa A, Marcus M, Mulholland JA, Russell AG, Tolbert PE: Ambient Air Pollution and Preterm Birth: A Time-series Analysis. Epidemiology 2009, 20(5):689–698.
- Silverman RA, Ito K: Age-related association of fine particles and ozone with severe acute asthma in New York City. Journal of Allergy and Clinical Immunology 2010, 125(2):367–373.
- Parker JD, Akinbami LJ, Woodruff TJ: Air Pollution and Childhood Respiratory Allergies in the United States. Environmental Health Perspectives 2009, 117(1):140–147.
- Morello-Frosch R, Jesdale BM, Sadd JL, Pastor M: Ambient air pollution exposure and full-term birth weight in California. Environmental Health 2010, 9:44. doi:10.1186/1476-069X-9-44
- Dominici F, Peng RD, Bell ML, Pham L, McDermott A, Zeger SL, Samet JM: Fine particulate air pollution and hospital admission for cardiovascular and respiratory diseases. JAMA 2006, 295(10):1127–1134.
- Bell ML: The use of ambient air quality modeling to estimate individual and population exposure for human health research: A case study of ozone in the Northern Georgia Region of the United States. Environment International 2006, 32(5):586–593.
- Liu Y, Paciorek CJ, Koutrakis P: Estimating Regional Spatial and Temporal Variability of PM2.5 Concentrations Using Satellite Data, Meteorology, and Land Use Information. Environmental Health Perspectives 2009, 117(6):886–892.
- Sahu SK, Yip S, Holland DM: Improved space-time forecasting of next day ozone concentrations in the eastern US. Atmospheric Environment 2009, 43(3):494–501.
- van Donkelaar A, Martin RV, Brauer M, Kahn R, Levy R, Verduzco C, Villeneuve PJ: Global Estimates of Ambient Fine Particulate Matter Concentrations from Satellite-Based Aerosol Optical Depth: Development and Application. Environmental Health Perspectives 2010, 118(10):847–855.
- Armstrong MP, Rushton G, Zimmerman DL: Geographically masking health data to preserve confidentiality. Stat Med 1999, 18(5):497–525.
- National Research Council: Putting People on the Map: Protecting Confidentiality with Linked Social-Spatial Data. Washington, DC: The National Academies Press; 2007.
- Briggs D, Fecht D, de Hoogh K: Census data issues for epidemiology and health risk assessment: experiences from the Small Area Health Statistics Unit. Journal of the Royal Statistical Society Series A (Statistics in Society) 2007, 170:355–378.
- Cartographic Boundary Files [http://www.census.gov/geo/www/cob/bg_metadata.html]
- Community Multiscale Air Quality (CMAQ). U.S. Environmental Protection Agency [http://epa.gov/asmdnerl/CMAQ/]
- AirData. U.S. Environmental Protection Agency [http://www.epa.gov/aqspubl1/]
- Census Regions and Divisions of the United States [http://www.eia.gov/emeu/reps/maps/us_census.html]
- McGeehin MA, Qualters JR, Niskar AS: National Environmental Public Health Tracking Program: Bridging the information gap. Environmental Health Perspectives 2004, 112(14):1409–1413.
- McGeehin MA: National Environmental Public Health Tracking Program: Providing Data for Sound Public Health Decisions. Journal of Public Health Management and Practice 2008, 14(6):505–506.
- National Environmental Public Health Tracking Network. Centers for Disease Control and Prevention [http://www.cdc.gov/ephtracking]
- Hanigan I, Hall G, Dear KBG: A comparison of methods for calculating population exposure estimates of daily weather for health research. Int J Health Geogr 2006, 5:38. Available from: http://www.ij-healthgeographics.com/content/5/1/38.
- SAS Institute Inc: SAS Version 9.2. Cary, North Carolina: SAS Institute Inc; 2010.
- Vincenty T: Direct and Inverse Solutions of Geodesics on the Ellipsoid with Application of Nested Equations. Survey Review 1975, 22(176):88–93.
- Zdeb M: Driving Distances and Times Using SAS ® and Google Maps. In SAS Global Forum 2010. Seattle, Washington, USA; 2010.
- Gorard S: Revisiting a 90-year-old debate: the advantages of the mean deviation. 2004.
- Brindley P, Wise ST, Maheswaran R, Haining RP: The effect of alternative representations of population location on the areal interpolation of air pollution exposure. Computers, Environment and Urban Systems 2005, 29(4):455–469.
- Haining R, Law J, Maheswaran R, Pearson T, Brindley P: Bayesian modelling of environmental risk: example using a small area ecological study of coronary heart disease mortality in relation to modelled outdoor nitrogen oxide levels. Stochastic Environmental Research and Risk Assessment 2007, 21(5):501–509.
- National Ambient Air Quality Standards (NAAQS). U.S. Environmental Protection Agency [http://www.epa.gov/air/criteria.html]
- ESRI Inc: ArcGIS Version 9.3. Redlands, California: ESRI (Environmental Systems Research Institute) Inc; 2010.
- Most Polluted Cities [http://www.stateoftheair.org]
- Zhang Y, Wen XY, Wang K, Vijayaraghavan K, Jacobson MZ: Probing into regional O-3 and particulate matter pollution in the United States: 2. An examination of formation mechanisms through a process analysis technique and sensitivity study. Journal of Geophysical Research-Atmospheres 2009, 114. Article Number: D22305, 31 pp. doi:10.1029/2009JD011900
- Jaffe D, Hafner W, Chand D, Westerling A, Spracklen D: Interannual variations in PM2.5 due to wildfires in the Western United States. Environmental Science & Technology 2008, 42(8):2812–2818.
- Our Nation's Air - Status and Trends through 2008 [http://www.epa.gov/airtrends/2010/]
- Gurjar BR, Jain A, Sharma A, Agarwal A, Gupta P, Nagpure AS, Lelieveld J: Human health risks in megacities due to air pollution. Atmospheric Environment 2010, 44(36):4606–4613.
- LandView ® 6 Census 2000 Population Estimator [http://www.census.gov/geo/landview/lv6help/pop_estimate.html]
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.