A case-referent study: light at night and breast cancer risk in Georgia

Background Literature has identified detrimental health effects from the indiscriminate use of artificial nighttime light. We examined the co-distribution of light at night (LAN) and breast cancer (BC) incidence in Georgia, with the goal to contribute to the accumulating evidence that exposure to LAN increases risk of BC. Methods Using Georgia Comprehensive Cancer Registry data (2000–2007), we conducted a case-referent study among 34,053 BC cases and 14,458 lung cancer referents. Individuals with lung cancer were used as referents to control for other cancer risk factors that may be associated with elevated LAN, such as air pollution, and since this cancer type was not previously associated with LAN or circadian rhythm disruption. DMSP-OLS Nighttime Light Time Series satellite images (1992–2007) were used to estimate LAN levels; low (0–20 watts per sterradian cm2), medium (21–41 watts per sterradian cm2), high (>41 watts per sterradian cm2). LAN levels were extracted for each year of exposure prior to case/referent diagnosis in ArcGIS. Results Odds ratios (OR) and 95% confidence intervals (CI) were estimated using logistic regression models controlling for individual-level year of diagnosis, race, age at diagnosis, tumor grade, stage; and population-level determinants including metropolitan statistical area (MSA) status, births per 1,000 women aged 15–50, percentage of female smokers, MSA population mobility, and percentage of population over 16 in the labor force. We found that overall BC incidence was associated with high LAN exposure (OR = 1.12, 95% CI [1.04, 1.20]). When stratified by race, LAN exposure was associated with increased BC risk among whites (OR = 1.13, 95% CI [1.05, 1.22]), but not among blacks (OR = 1.02, 95% CI [0.82, 1.28]). Conclusions Our results suggest positive associations between LAN and BC incidence, especially among whites. The consistency of our findings with previous studies suggests that there could be fundamental biological links between exposure to artificial LAN and increased BC incidence, although additional research using exposure metrics at the individual level is required to confirm or refute these findings.


Background
Increasing urban development and the subsequent need for artificial lighting of roadways, shopping centers, and homes, has diminished the daily dark period [1]. Artificial light sources have the power to light the evening sky up to 200 thousand times brighter than the natural new moon [2]. The First Atlas of Artificial Night Sky Brightness, reports that 99% of American and European populations, and nearly one-fifth of world terrain, is under light-polluted skies [3]. Rising concern has begun to mount about the detrimental health effects of our newly created electrified environment.
Although the origins of elevated global breast cancer (BC) rates are not fully understood, the highest incidence is observed in industrialized nations [4]. The hypothesis that light at night (LAN) and corresponding decreases in nocturnal melatonin production may act as a BC risk factor [5] has received increasing attention over the past decade. This hypothesis proposes that exposure to LAN disrupts endogenous melatonin production. Melatonin (MLT) is considered an oncostatic or anti-estrogenic agent that is suppressed by ambient light exposure via the retinohypohalamic pineal tract. MLT may affect estrogen activity via several processes and elevated estrogen may be a risk factor for BC [6]. Therefore, reductions in MLT resulting from LAN exposure may promote BC development [7][8][9] due to the facilitation of increased estrogen production [10][11][12][13][14][15][16]. Several prospective studies have reported increased BC risks among women with elevated circulating estrogen concentrations or reduced MLT levels. Shift work, which is associated with both LAN exposure and reductions in MLT production, has been linked with BC and has been designated by the International Agency for Research on Cancer (IARC) as a Group 2A probable human carcinogen [17].
The purpose of this case-referent study was to test the hypothesis that an increased incidence of BC is associated with residence in highly illuminated areas, as defined by time series satellite imagery, in Georgia (GA). We hypothesized that higher LAN levels would be associated with BC incidence, and not, or to a lesser extent, with lung cancer. An elevated incidence of lung cancer related to LAN is not expected because lung cancer is not estrogen dependent and thus, was included in the present analysis as referent cases [14]. Cases and referents were analyzed in relation to residential location and corresponding average LAN exposure for years prior to cancer diagnosis, dating back to 1992, the earliest year of exposure data. Infrared light detected by a nighttime satellite was used to assess exposure, assuming that the more brightly lit areas contained residents who, on average, have had greater exposure to circadian-disruptive light than residents in the lesser lit areas. Additionally, the variation between geographic trends of LAN and BC incidence were evaluated statewide and over time. We also examined whether the relationship between elevated LAN exposure and BC incidence was modified by race. To the best of our knowledge, this is one of the first studies to explore LAN and BC risk among racial subgroups. Previous studies suggest that African Americans may be more susceptible to circadian misalignment. African Americans have a shorter circadian period, larger phase advances, and smaller phase delays relative to Caucasian subjects [18].

Results
In total, 47,817 incident cancers, 33,503 breast and 14,314 lung were identified for inclusion in the study. Analyses   (Table 1). There were no other notable differences among cases and referents. A comparative evaluation of spatial variation between county-level BC incidence and LAN exposure revealed strikingly similar geographic trends across the state of GA ( Figure 1). Regionally, elevated BC incidence rates were observed in northwestern and north central GA.
Elevated incidence rates were also observed along the southeast border and in more isolated pockets in the southwest corner of the state. Similar to the county-level BC incidence map, elevated LAN was also documented in the north central region, as well as in isolated pockets along the southeast border and southwest corner of the state. Areas of elevated LAN exposure and BC incidence correspond to urban centers of which, the Atlanta metropolitan area was the most noteworthy.
Geographic trends in LAN exposure between the most recent (2007) and the oldest (1992) years of data were also evaluated ( Figure 2). The most noteworthy area of increasing LAN occurred in north central GA, in areas surrounding central Atlanta. Red shaded regions indicate areas of increasing LAN over time. The most prominent areas of increasing LAN surround central Atlanta and represent increasing urbanization. Figure 2 obviates the sprawl of the Atlanta metropolitan area, and the successive increases in light pollution.
Adjusted odds ratios for BC indicated that cases were 1.12 times more likely to have been diagnosed if exposed to elevated LAN levels, as opposed to low LAN levels (OR = 1.12, 95% CI [1.04, 1.20]; Table 2). When the analyses were stratified by race, LAN was not associated with  Table 3).

Discussion
The present analyses were based on a case-referent study of 48,511 female cancer cases (34,053 breast and 14,588 lung) and LAN levels averaged prior to cancer diagnosis. Study findings suggest that elevated LAN was associated with an increased odds of BC risk among women after controlling for year of diagnosis, race, tumor grade and stage, age at diagnosis, MSA status, births per 1,000 women aged 15-50, MSA population mobility, female smoking rate, and population over 16 in the labor force. The results are consistent with a previous study that observed a spatial trend of elevated BC in urban areas, which includes the majority of GA's residential population (71%) [19]. It has been hypothesized that disruption of melatonin by LAN can promote BC development [5]. Melatonin suppression [20,21] may promote tumor growth [22] via several possible mechanisms including an increase in estrogen secretion. Although the exact mechanism remains to be described, light exposure may act on tumor formation and growth via a direct oncostatic action, through interference with estrogen receptor function, by affecting thermoregulatory and immune function, and by altering free radical biology (reviewed in [23]). It is also possible that light alters circadian rhythm generation in the suprachiasmatic nuclei (SCN), which has the potential for disruption of clock gene communication with cell cycle regulation in the mammary tissue [24,25].
This study is the first to examine the effects of LAN across racial subgroups. Contrary to our hypothesis, a relationship between LAN and BC incidence was only observed among white women. These results were unexpected given the experimental evidence suggesting that blacks may be more susceptible to circadian misalignment, due to a shorter circadian period, larger phase advances, and smaller phase delays relative to white subjects [18]. The possible set of mechanisms responsible for the observed racial disparity is likely complex, and to date, poorly understood.
Although the level of melatonin production among blacks relative to whites is unclear, racial differences in  naturally occurring melatonin production and secretion have been proposed [26]. Sensitivity of melatonin to light suppression may be influenced by eye pigmentation, which can vary due to race or ethnicity [27]. The percentage of melatonin suppression secretion after light exposure was significantly larger in light-eyed compared to individuals with darker eye pigmentation [27]. The common occurrence of lighter iris color is found almost exclusively among Caucasians [28]. Based on these observations, white women may be at increased risk of melatonin suppression from exposure to LAN, although additional research is needed to elucidate the possible consequences of racial differences in the biological processing of LAN.
Our study findings are especially important given GA's increasing trajectory of urban development. GA was the fourth fastest-growing state between 2000 and 2010 with an increase of 1.5 million residents. Much of that growth was centrally focused in Atlanta and its surrounding metropolitan area, which accounted for over two-thirds (68%) of the state's population growth during the last decade ( Figure 2) [29]. The expansion of the Atlanta metropolitan region will only increase LAN, and could exacerbate the already increased BC rates observed in this growing urban region. Figure 2 confirms Atlanta to be the most noteworthy area of increasing LAN, shown by the circular red area surrounding the core of Atlanta. The core of Atlanta is blue, which translates to no LAN increase. Yet, this is likely due to measurement error because the core urban area registered the maximum LAN value measurable by DMSP-OLS satellites in 1992 (63 watts per cm 2 per sterradian), therefore it does not appear that there has been an increase in light. However, this area has continued to expand, subsequently increasing LAN levels.
When using a satellite image to proxy the light intensity on the ground level, there is understandable concern about the accuracy by which the spatial area depicted on the images represents the size of the lit area on the ground. Imagery from the DMSP-OLS satellite has a tendency to overestimate ground illumination due to coarse spatial resolution, large overlap between pixels, errors in geolocation, or atmospheric water vapor content [30]. Pixel misclassification is exacerbated in the stable light images by counting all occasions of a lighted pixel and by registration errors around persistently lit regions [31]. The combined effect of these factors ultimately results in a general overestimation of the illumination in the area of study. This can be especially problematic when assigning exposure variables to cases and referents, and may result in non-differential misclassification. However, non-differential exposure misclassification, if present, would attenuate rather than exaggerate our results.
The DMSP output from the satellite may not strictly correlate with the restricted portion of the spectrum that is circadian disruptive. Novel ocular studies have identified that MLT suppression is wavelength dependent [32][33][34][35][36][37] and have defined the visible short-wavelength sensitivity of the human melatonin suppression action spectrum [33,[37][38][39]. We conducted an ancillary light validation study to quantify the relationship between presence of circadian-effective ambient outdoor light levels and our exposure variable. This light validation study was conducted in Athens, Georgia using the Daysimeter. The Daysimeter is a device that records ambient light levels that stimulate both the visual and circadian systems [40]. It has previously been used to characterize circadian specific wavelengths, or circadian light [41], and may also help estimate the magnitude of effect on melatonin suppression. This study aimed to describe the relationship between circadian light (CL) at ground level and satellite photometry in the local area (unpublished observations from: Perry G, Bauer S, Wagner S, Vena J). The findings suggested that ground level CL and satellite photometry are significantly correlated (p-value = 0.0003). Our primary interest was to isolate the role the external environment may play on BC risk, although a closer look at the indoor sleeping habitat or nocturnal behavior patterns may prove to be more influential on BC risk than the external light environment. Yet, it is a challenge to estimate indoor exposure ranges or get reliable, consistent reports of individuals' nocturnal behavior. Bias may have been introduced by the removal of cases and referents lacking residential addresses that could be matched with geospatial coordinates. Geocoding match rates are far lower for rural areas than for urban areas [42][43][44][45][46]. In general, rural addresses tend to be less specific. Rural delivery routes and post office boxes are often used instead of street addresses, there is more frequent use of unofficial or colloquial place names in rural areas, and roadway reference data for rural areas are less accurate than they are for urban areas [47]. In general, rural-dwelling individuals have lower LAN exposure values, and are more likely to be white (82% as opposed to 66% in urban locations) [19]. Although the unmatched rural locations omitted from analysis may have introduced bias since geospatial location is related to our exposure of interest, the use of only address-level matches ensured an increased accuracy of our participant's spatial location.
When designing this study, there was no epidemiologic evidence suggesting that exposure to light at night influences lung cancer development. An elevated incidence of lung cancer is not expected with elevated levels of LAN because it is not a hormone dependent form of cancer. Two previous studies investigated the link between levels of LAN and cancer incidence, but no association with lung cancer was reported [9,12]. Recently, Parent et al. (2012) reported novel associations between males ever working at night and cancer risk at several sites, including lung (OR=1.76, 95% CI [1.25, 2.47]) [48]. Although possibly related to LAN, night work could also be associated with exposures to lung carcinogens. If circadian disruption does influence the development of lung cancer, the use of the lung cancer for referents may have attenuated our results. However, little evidence has accrued regarding circadian disruption and lung cancer.

Conclusions
Our analysis chose to focus on LAN as a potential risk factor for BC, and although results were suggestive of such an association, causation cannot be established. We were unable to measure residential stability, and had no information about residence prior to diagnosis. Due to the latency period of BC, this could potentially impact our results. Furthermore, we had no detailed information pertaining to estrogen receptor status, genetic mutation, and lifestyle factors including obesity, physical activity, alcohol consumption or reproductive history of cases. We attempted to control for parity through the use of county-level birth rates per 1,000 women aged 15-50. More detailed covariate data including residential stability and individual-level reproductive history would further strengthen our study design.
Researchers are yet to fully understand disparities in BC risk, however our analyses provide a plausible environmental explanation for the disparities seen in Georgia. Perhaps mechanisms initiated by LAN exposure may exacerbate pathways by which disease virulence and recurrence are linked. Future work should target the influence of individual-level characteristics (e.g., age, gender, diurnal preference, race, adaptability) and factors related to urbanicity on BC outcomes. Also, it is likely that more accurate proxies of individual artificial light at night exposure will become available as technology advances. The combined use of more refined exposure metrics and more detailed individual-level characteristics should be taken together to explore potential mechanisms behind racial disparities in LAN exposure and cancer outcomes. Similar findings in other geographic regions and populations would further validate our study findings.

Cancer data
Cancer incidence data were obtained from the Georgia Comprehensive Cancer Registry (GCCR) for 2000 to 2007. The GCCR is a statewide population-based cancer registry that collects information on all cancer cases diagnosed among Georgia residents. The GCCR is a participant in the National Program for Cancer Registries and the North American Association of Central Cancer Registries. The GCCR meets national standards for cancer registration and is gold certified with high ratings for data quality and representativeness. For this investigation, the GCCR data were particularly valuable because individual-level location data were available for each subject. A latitude and longitude coordinate for residence location at the time of cancer diagnosis (address-matching) was identified for each cancer diagnosis. The coordinate location assigned to each case was of varying accuracy. Location Codes for Address-Matching (LCODES) are 3-4 text codes that indicated the accuracy level of address-matching. There were 29 LCODES output by the address-matching software, as provided by the GCCR. The first 1-2 characters of the LCODE determine the general level of accuracy and include address-, census block group-, census tract-, and county-level of accuracy. Only cases registering address-level accuracy (AI0-AX3) were included for analysis to minimize misclassification bias. Incidence data were selected for breast (C500:509) and lung cancer (C340:349) sites using the SEER Site Groups Primary Site variable based on International Classification of Diseases for Oncology, 3rd edition (ICD-O-3) coding [49]. Analyses were restricted to female breast and lung cancer cases. Only malignant tumors were included.

Exposure data
Georgia is positioned in both the northern and western hemispheres in the southeast region of the United States. Georgia averages 110 sunny days a year and the major metropolitan cities include Atlanta, Augusta, Macon, Savannah, and Columbus [50]. Topography begins at sea level and climbs to nearly 5,000 feet above sea level [51]. Data on nighttime stable lights were obtained from the Unites States Defense Meteorological Satellite Program (DMSP) (Image and data processing by NOAA's National Geophysical Data Center, DMSP data collected by US Air Force Weather Agency). Version 4 Defense Meteorological Satellite Program-Operational Linescan System (DMSP-OLS) Nighttime Lights Time Series archive data set consists of high-resolution regional imagery which collects broadband visible-near infrared image data with a nominal spatial resolution of 2.7 km [31]. When collecting nighttime data, telescope pixel values are replaced by Photo Multiplier Tube (PMT) values which are sensitive to radiation from 0.47 -0.95 um at 10-5 -10-9 Watts per cm 2 per sterradian [52].
The satellite imagery for 1992-2007, used in our analysis, was constructed by the DMSP by averaging daily readings of the satellite sensors and removing cloud cover. Nighttime DMSP-OLS images make use of a time series of images to distinguish stable lights produced by cities, towns, and industrial facilities from ephemeral lights caused by fires and lightning [31]. Time series data is composited by using a 1-km grid (finer spatial resolution than that of the input imagery), providing a uniform grid cell size at all latitudes and contiguous land surfaces [31]. A geographic information system (GIS), ArcGIS (ArcMAP software, version 10.0; ESRI, Redlands, Calif ) was used to map breast and lung cancer cases by spatial location. ArcGIS technology was used to match individuallevel cancer case locations with the LAN levels obtained from satellite images. The task was performed using the spatial analyst tool, "extract values to points," which joins attributes from one layer to another based on the relative location of features. The "extract values to points" between two data sources was performed as follows: First, a Georgia satellite image of nighttime illumination was imported to the ArcGIS software. The nighttime illumination was characterized by various LAN intensities and displayed by raster units (with a minimum LAN value of 0 (no illumination) and the maximum value of 63 watts per cm 2 per sterradian (maximum illumination)). Each 1-km raster pixel represented the corresponding ground-level LAN intensity in that geographical region. Three exposure categories were defined for each LAN level in ArcGIS using Jenk's Natural Break method: low (0-20 Watts per cm 2 per sterradian), medium (21-41 Watts per cm 2 per sterradian), and high (>41 Watts per cm 2 per sterradian). Jenk's Natural Break method is a data classification method designed to reduce the variance within classes and maximize the variance between classes [53,54]. This method was selected to determine the best arrangement of LAN values into different classes, and create "break points" which minimize each class's average deviation from the class mean, while maximizing each class's deviation from the means of the other LAN groups [53,54]. Jenk's Natural Break method was also used by Kloog

Covariate data
Individual-level covariates were obtained from the GCCR for 2000 to 2007 and included race, tumor grade and stage, year of diagnosis, and age at cancer diagnosis. County-level (N = 159) covariates were obtained from the American FactFinder data portal, which is an online table generator for U.S. Census data [55]. Prior to model selection via backwards elimination, the full statistical model contained the following county-level variables; average family size, median household income, Metropolitan Statistical Area (MSA) status, average annual particulate matter (PM) 2.5 concentration, births per 1,000 women aged 15-50 in the last year, percent of MSA population living in different residence in 1995, percent of population below the poverty line, percent of African American females, percent of population that are high school graduates or higher, percent of population over 16 in the labor force, and percent of total population in different residence in 1995. Prevalence of cigarette smoking among Georgia women was measured at the Public Health District level (N = 18), from 2000-2004 and obtained from the Behavioral Risk Factor Surveillance System [56]. County metropolitan statistical area classification was obtained from the U.S. Bureau of the Census 2003 Metropolitan Statistical Area (MSA) standards.

Model building and variable definitions
Manual hierarchical backwards elimination was used to determine the optimal set of covariates. Beginning with the full model, potential confounders were retained in the final model if their removal caused the LAN exposure coefficients' estimates to change by more than 10%. Based on this procedure, the final model included race, tumor grade and stage, year of diagnosis, age at cancer diagnosis, Metropolitan Statistical Area (MSA) status, births per 1,000 women aged 15-50, MSA population mobility, population over 16 in the labor force, and prevalence of cigarette smoking.
Continuous variables were tested for normality using the Anderson-Darling test for normality and inspection of data distributions and variable plots. Variables that were not normal based on this assessment were categorized for the remainder of the analysis. Categories were either defined as quartiles (tumor stage, grade, births per 1,000 women aged 15-50 in the last year and smoking) or were specified based on knowledge of the variable itself. Individual-level year of cancer diagnosis ranged from 2000-2007. Tumor stage was categorized as local, regional, distant and unstaged. Tumor grade was categorized as well-differentiated, moderately differentiated, poorly differentiated, and other. Racial subgroups were dichotomized as black and white. The sample size within other minority subgroups was too small to support reliable model estimation. MSA status was classified at MSA or non-MSA according to U.S. Bureau of the Census 2003 Metropolitan Statistical Area standards. Among study participants living in Metropolitan Statistical Area or Primary Metropolitan Statistical Area, the percentage living in different residence in 1995 was dichotomized as 0% change or > 0% change. Births per 1,000 women, aged 15-50 years were categorized as <40, ≥40-54, >54-66, and >66. Percent female smokers were categorized as <19.5%, ≥19.5%-21.4%, >21.4%-22.3%, and >22.3%.

Statistical analyses
Statistical Analysis Software, SAS Version 9.3 (SAS Institute, Cary, NC) was used for data analysis and the probability p < 0.05 was set as the accepted level of statistical significance. The distributions or frequencies of all variables were compared by case status (breast versus lung cancer cases) and Chi-square tests were used to assess statistical differences between groups. Univariable analyses were performed for each individual-and countylevel covariate. Final logistic regression models were adjusted for variables on the individual, county, and public health district level, as defined above.
Final unconditional logistic regression analyses were performed with case status as the dichotomous outcome (BC case versus lung cancer referents). Analyses were performed for both races combined and stratified by racial subgroup (whites versus blacks). Measures of association were calculated as odds ratios (OR) with 95% confidence intervals (CI).