Spatio-temporal gaps in monitoring create uncertainty in the ascertainment of population-level exposures, especially when PM2.5 concentrations fluctuate at frequencies that existing monitor sampling schedules cannot detect. Monitors that operate for regulatory purposes are not usually sited very close to sources, where high concentrations of PM2.5 can be observed. Rather, they are sited to measure PM2.5 levels that represent average concentrations over large areas. The lack of PM2.5 concentration measurements over continuous spatial and temporal scales limits our ability to link air quality levels with health effects data. Thus, modeled predictions may provide a suitable alternative for use in public health surveillance.
CMAQ model output, as well as output from any model that relies on CMAQ output, has a spatial unit, the grid cell, that differs from the spatial unit of health and demographic data, which are often available at geo-political resolutions such as county or census tract. Geo-imputation procedures are necessary to convert grid-level data to county level estimates, which are needed to generate metrics for environmental public health surveillance through the CDC Tracking Network and for linkage to spatially comparable health data. The three geo-imputation methods mentioned in this paper are routinely used in public health and allied fields. In our study, a population-weighted county centroid containment approach performed best among the methods considered for translating grid-level HB predictions to county level estimates. A population-weighted county centroid denotes the spatial mean of the underlying population distribution within each county, and estimates of PM2.5 generated using this method are intended to represent the ambient concentration levels to which most of the population is potentially exposed.
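The centroid containment idea can be illustrated with a minimal sketch. It is not the procedure used in the study: the square-cell representation, the projected coordinates in km, the block-level population points, and all names (`GridCell`, `population_weighted_centroid`, `county_estimate`) are assumptions made for illustration, and the numbers are synthetic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GridCell:
    """Hypothetical square model grid cell in projected coordinates (km)."""
    x_min: float
    y_min: float
    size: float   # cell edge length, e.g. 36.0 for a 36-km grid
    pm25: float   # modeled PM2.5 prediction for this cell and day

    def contains(self, x: float, y: float) -> bool:
        # Half-open intervals so a point on a shared edge falls in one cell only.
        return (self.x_min <= x < self.x_min + self.size
                and self.y_min <= y < self.y_min + self.size)


def population_weighted_centroid(blocks):
    """Spatial mean of population points; blocks is a list of (x, y, population)."""
    total = sum(pop for _, _, pop in blocks)
    if total <= 0:
        raise ValueError("total population must be positive")
    cx = sum(x * pop for x, _, pop in blocks) / total
    cy = sum(y * pop for _, y, pop in blocks) / total
    return cx, cy


def county_estimate(blocks, cells):
    """Return the prediction of the cell containing the county's centroid."""
    cx, cy = population_weighted_centroid(blocks)
    for cell in cells:
        if cell.contains(cx, cy):
            return cell.pm25
    return None  # centroid lies outside the modeled domain


# Synthetic example: most of the county's population sits in the second cell.
cells = [GridCell(0.0, 0.0, 36.0, 10.0), GridCell(36.0, 0.0, 36.0, 14.0)]
blocks = [(10.0, 10.0, 1000), (50.0, 10.0, 9000)]
estimate = county_estimate(blocks, cells)
```

In this toy case the centroid falls at x = 46 km, so the county takes the value of the second cell even though part of the county overlaps the first one.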
In the context of linking with health data, the spatial scale of modeled predictions was a very important consideration in developing county level PM2.5 estimates. For 2005, HB predictions of PM2.5 were available at a 12-km resolution for the eastern U.S., whereas HB predictions of PM2.5 were available at a 36-km resolution for the entire conterminous U.S. County level estimates of PM2.5 derived from 36-km predictions correlate more strongly with AQS-based PM2.5 concentrations than do estimates derived from 12-km predictions. The predominant reason for the difference in performance of the 12- and 36-km HB predictions was the underlying CMAQ estimates. The inputs needed to generate the 12-km and 36-km CMAQ estimates were processed with different assumptions and, for certain inputs, resolving to a finer spatial scale could add uncertainty to the final model output. We developed county level estimates of PM2.5 from 36-km HB predictions because our primary goals were to have the HB-based metrics approach the values of the AQS-based metrics and to generate metrics for the entire conterminous U.S.
The strength and consistency of the relationship between daily HB- and AQS-based PM2.5 county level estimates are acceptable at concentrations below the daily NAAQS and, at these concentrations, differences between HB- and AQS-based PM2.5 estimates are reasonable for most census regions. For 2005, less than 2% of the measurements available from AQS monitoring reflected concentrations greater than 35 μg/m3. At these higher concentrations, HB-based county level estimates are more likely to underpredict AQS concentrations, with the largest differences observed for the western region of the U.S. Some of these differences can be explained by model features. The HB model fuses monitor data, when available, with CMAQ estimates and, for most locations, days with only CMAQ estimates outnumber days with both CMAQ and AQS data. CMAQ estimates are primarily used for predicting background concentrations and do not adequately capture spikes in air quality levels that result from exceptional events. While the bias in the CMAQ estimates is adjusted using a global (national-level) regression approach, and the AQS measurement error is accounted for in the HB model, the daily HB predictions can differ from the coincident AQS measurements when CMAQ estimates differ greatly from the AQS data. Additionally, fewer monitor-based observations are available for the western U.S., and CMAQ estimates underpredict AQS concentrations there, especially where the terrain is mountainous. Hence, HB estimates rely heavily on CMAQ in the western U.S., and we see larger absolute and relative differences between county level HB and AQS estimates with increasing PM2.5 concentrations (Table 2). Users of HB-based PM2.5 estimates should be aware of the limitations of these data as well as the benefits of having data over continuous spatio-temporal scales.
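One way to tabulate such differences is sketched below. The definitions are assumptions, not taken from the paper: the difference is taken as HB minus AQS (so negative values mean underprediction), the relative difference is expressed as a percentage of the AQS value, and county-days are split at the 35 μg/m3 daily threshold. The function name and the input pairs are hypothetical.

```python
DAILY_NAAQS = 35.0  # ug/m3, the daily threshold cited in the text


def summarize_differences(pairs):
    """Summarize HB-minus-AQS differences for coincident county-days.

    pairs is a list of (hb, aqs) values with aqs > 0. Results are split by
    whether the AQS value exceeds the daily threshold.
    """
    strata = {"at_or_below": [], "above": []}
    for hb, aqs in pairs:
        key = "above" if aqs > DAILY_NAAQS else "at_or_below"
        diff = hb - aqs                          # absolute difference, ug/m3
        strata[key].append((diff, 100.0 * diff / aqs))  # relative, percent

    summary = {}
    for key, rows in strata.items():
        if rows:  # skip empty strata to avoid dividing by zero
            n = len(rows)
            summary[key] = {
                "n": n,
                "mean_diff": sum(d for d, _ in rows) / n,
                "mean_rel_diff_pct": sum(r for _, r in rows) / n,
            }
    return summary


# Synthetic pairs: two county-days below the threshold, one above it.
result = summarize_differences([(10.0, 10.0), (12.0, 16.0), (30.0, 40.0)])
```

Running the same summary separately for each census region would give a layout similar in spirit to a region-by-concentration comparison, though the strata used in the study may differ.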
Annual county level metrics of PM2.5, such as annual averages, provide a useful summary of prevailing concentration levels. However, averages created from AQS-based PM2.5 concentration measurements are limited to counties with monitors and therefore do not provide a complete picture of prevailing air quality throughout the conterminous U.S. Moreover, PM2.5 annual metrics derived from AQS data based on a sampling frequency that is predominantly every third day can be taken to accurately characterize general conditions only under the assumption that days included in the sample fairly represent the full calendar. Given that the HB-based estimates are available for every day of the year, annual averages incorporating these estimates can be interpreted without any assumptions concerning days without data.
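The representativeness assumption can be made concrete with a small synthetic example. The series below is invented and deliberately extreme; the helper names are hypothetical. It only shows that an average built from every third day depends on which days happen to be sampled, whereas an average over all days does not.

```python
def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def one_in_three_mean(daily, first_sampled_day=0):
    """Average over every third day, starting at first_sampled_day (0, 1, or 2)."""
    return mean(daily[first_sampled_day::3])


# Synthetic daily PM2.5 values (ug/m3) with a spike on every third day.
daily = [10.0, 10.0, 40.0, 10.0, 10.0, 40.0]

full_mean = mean(daily)                      # uses every day
misses_spikes = one_in_three_mean(daily, 0)  # sampled days never hit a spike
only_spikes = one_in_three_mean(daily, 2)    # sampled days are all spikes
```

Here the all-days mean is 20.0, while the two sampling schedules give 10.0 and 40.0. Real concentration series are far less regular, so the discrepancy would normally be much smaller.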
The benefits of employing HB predictions should be considered in light of the added uncertainty that they introduce. As noted, county level HB-based annual averages can understate or overstate the air quality problem in specific areas compared to averages based on AQS concentrations. At higher concentrations, especially near the annual NAAQS of 15 μg/m3, and in the western U.S., HB-based annual averages are more likely to fall below monitor-based measurements. Notably, combining HB-based estimates with AQS-based concentrations results in annual averages that comport well with annual averages created using AQS data exclusively; however, a few counties have lower combined annual averages than those obtained exclusively from AQS-based concentrations.
In summary, we characterized the difference between HB-based estimates and AQS-based concentrations with the intent that the results can guide public health professionals and researchers on the appropriate use of the county level estimates of PM2.5. Our analysis of daily differences between AQS-based concentrations and HB-based estimates of PM2.5 indicates that the differences can vary across census regions and divisions, and that HB-based county level estimates generally tend to underpredict at higher concentrations. This needs to be considered when using daily county level HB-based estimates to identify exceedances of the daily NAAQS in different parts of the country or to conduct studies that assess health outcomes related to short-term PM2.5 exposures. The annual averages developed by combining HB- and AQS-based PM2.5 data differ less from AQS-based annual averages. Given the need to correctly characterize air quality levels and minimize the discrepancy with the commonly used county level annual averages created from AQS data, we suggest that the county level annual averages of PM2.5 for the Tracking Network be calculated using AQS data when and where they are available and using HB-based estimates for days and locations without such data.
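The suggested combination rule can be sketched for a single county as follows. This is an illustration under stated assumptions rather than the Tracking Network's implementation: the function name, the data layout (a list of daily HB estimates plus a dictionary of AQS values keyed by day index), and the numbers are all hypothetical.

```python
def combined_annual_average(hb_daily, aqs_by_day):
    """Average daily values, preferring AQS where a measurement exists.

    hb_daily   -- HB-based estimate for every day of the year, in day order
    aqs_by_day -- AQS-based concentration keyed by zero-based day index,
                  present only for days on which a monitor reported
    """
    if not hb_daily:
        raise ValueError("hb_daily must contain at least one day")
    # Use the AQS value when the day has one; otherwise fall back to HB.
    combined = [aqs_by_day.get(day, hb) for day, hb in enumerate(hb_daily)]
    return sum(combined) / len(combined)


# Synthetic four-day "year": the monitor reported on days 0 and 3 only.
hb_daily = [10.0, 12.0, 11.0, 9.0]
aqs_by_day = {0: 14.0, 3: 13.0}
average = combined_annual_average(hb_daily, aqs_by_day)
```

For a county without a monitor, `aqs_by_day` would be empty and the result reduces to the HB-only annual average.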