In this ecologic study we observed statistically significant associations of agglomeration-specific EC and GC SIRs with SES and dietary patterns. We hypothesised that the strong geographical EC and GC risk patterns highlighted in previous studies [3, 5] could be explained by important geographical differences in the prevalence of two well-established and modifiable risk factors (SES and dietary pattern).
Two dietary patterns were identified, "restricted food choice" and "unrestricted food choice", which explained approximately 21% of the variance in responses to the FFQ. The unrestricted food choice pattern was positively correlated with total fruit, total vegetables, seafood, poultry and regular fibre, and negatively correlated with sweets. This dietary pattern was inversely associated with the risk of EC in males, females and both sexes combined. The restricted food choice pattern was negatively correlated with total fruit and regular fibre, positively correlated with salted and preserved foods, and had very small factor loadings on total vegetables, seafood and poultry. This dietary pattern was associated with a higher risk of EC in males, females and both sexes combined. Low intake of fruit and vegetables has been consistently associated with a higher risk of EC, with a meta-analysis suggesting that protective effects were more pronounced for fruit than for vegetables. Families in the regions of high EC incidence in our study reported very limited intake of fruit and vegetables relative to families in the low-incidence areas, consistent with a case-control study in the region showing that a higher intake of raw vegetables reduced the risk of EC by 40-50%.
The restricted food choice pattern was associated with increased GC risk in both sexes combined. We also found that a high intake of salted/preserved meat, canned fish and pickles was associated with increased GC risk in both sexes combined.
A link between certain demographic and economic features of regions and the risk for EC and GC has been shown in several studies [7, 8]. The socio-economic variables used in our study enabled three such indices to be studied: income, urbanisation and literacy. We found higher incidences of EC and GC in men and/or women were related to lower annual income, lower annual expenditure on food, lower annual expenditure on fruit and vegetables, higher percentage of unemployment, and higher percentage of employment in agriculture and construction sectors. Both cancer sites analysed in this study had higher SIR in the rural setting. This association may be related to lower SES, higher unemployment and high levels of farming in rural agglomerations.
In our study, expenditure on food in general and expenditure on fruit and vegetables had large positive factor loadings on the income and urbanisation indices. In addition, the income and urbanisation indices were positively correlated with the unrestricted food pattern and negatively correlated with the restricted food pattern. This correlation was stronger in the eastern region, especially in the Turkmen plain. Therefore, lower SES was linked to a diet deficient in fruit and vegetables in rural agglomerations, which is an important risk factor for EC and GC. An increased risk of gastric cancer associated with agricultural occupations has been consistently reported, and exposure to pesticides, organic and inorganic dusts, fertilizers, and nitrates has been suggested as the major contributing risk factor [40–42]. There is no Pesticide Register in Iran to compile information on the use of these products. As a result, specific ecological indicators cannot be used to measure the population's exposure to pesticides. Consequently, the percentage employed in agricultural occupations, where pesticide exposure could be assumed to be higher, and the urbanisation score were used as indirect indicators of pesticide use in agglomerations. We found a significant negative association between EC and GC risk and urbanisation score.
Some details of our study methods require discussion. First, the exact timing of SES and diet-related exposures relative to cancer occurrence is important for our study. The lag time between risk-factor exposure and the development of EC and GC was ascertained in three large prospective cohort studies involving more than half a million men and women [43–45]. In these studies a lag time of 6 to 12 years was long enough for EC and GC to develop in healthy participants and, more importantly, for a significant association to be found between SES and dietary exposures and the occurrence of EC and GC. Our study had an average lag time of 10 years, with a range of 6-12 years, between exposure measurements (1993-1996) and outcomes (2001-2005), which is consistent with these findings.
Second, could human migration in the study region have caused enough selection bias to influence the results? It is known that external migrants to the study region have a lower incidence of EC and a GC incidence similar to the national rate. Between the 1995 and 2005 censuses, 556,455 people (on average 1.4% of the study population per annum) migrated to the study region. Most immigrants (83%) were healthy labour force participants and their younger relatives, which explains the lower cancer rates among migrants. However, external migration from other provinces, occurring mainly to the major cities of the study region, accounted for only 29% of total migration, with internal migration accounting for the remainder. It seems unlikely that these modest migration figures would strongly influence the observed associations.
Third, controls from a local case-control study were used to identify dietary patterns. The number of controls per ward ranged from 26 in the sparsely populated ward of Bandar Gaz to >250 in wards with major cities such as Babol. To detect any selection bias arising from differences in coverage between wards or between urban and rural areas, we compared the age, place of residence (urban/rural), sex and ward distributions of the controls with those of the EC and GC cases registered in the 2003 to 2006 period. There was no significant difference in these demographic characteristics between controls from the case-control study and cases on the registry. About one third of the controls were selected from among the neighbours of cases in the case-control study. This mechanism of control selection may have produced a non-random representation of dietary habits in wards, which may dilute the association of EC and GC with dietary patterns.
Fourth, in this study SES and dietary pattern scores were used as markers of the heterogeneous distribution of lifestyle and dietary factors influencing EC and GC risk. Selection of these variables was limited by the availability of information at agglomeration or ward level, so they only partially reflect the distribution of related risk factors. However, their inclusion served to smooth SIR, taking into account both the spatial relation among agglomerations and the variability associated with these indices.
Fifth, justification of sample size is necessary. For factor analysis, five subjects per item, with a minimum of 100 subjects regardless of the number of items, is recommended as a sufficient sample size. There were 17 food items and 2322 subjects in the dietary pattern analysis, and 12 socio-economic items and 152 units (agglomerations) in the SES factor analysis, so both analyses met the minimum sample size criteria. To the best of our knowledge, no study has focused on sample size and robustness issues in multilevel Poisson regression in a comprehensive manner. However, results from a simulation study suggest that for generalised linear mixed models with low-prevalence events at least 100 groups and 30 to 50 individuals per group are necessary. Our study contained 152 groups (agglomerations), with means of 11 and 16 cases per agglomeration for EC and GC, respectively. While the number of groups was large enough for accurate estimation of the regression parameters, the small sample size within agglomerations suggests possible bias in the second-level standard errors.
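The sample size criteria discussed above can be checked with simple arithmetic. The following is a minimal illustrative sketch, not part of the original analysis: the function names are hypothetical, and the thresholds (five subjects per item with a floor of 100 for factor analysis; at least 100 groups and at least 30 individuals per group for the mixed model) are taken directly from the rules of thumb quoted in the text.

```python
def min_factor_analysis_n(n_items, per_item=5, floor=100):
    """Minimum subjects under the quoted rule of thumb:
    five subjects per item, but never fewer than 100."""
    return max(per_item * n_items, floor)


def meets_factor_rule(n_subjects, n_items):
    """True if the sample satisfies the factor analysis rule of thumb."""
    return n_subjects >= min_factor_analysis_n(n_items)


def glmm_check(n_groups, mean_per_group, min_groups=100, min_per_group=30):
    """Compare a two-level design against the quoted simulation-based
    guidance (lower bound of the 30 to 50 per-group range is used)."""
    return {
        "groups_ok": n_groups >= min_groups,
        "within_group_ok": mean_per_group >= min_per_group,
    }


if __name__ == "__main__":
    # Dietary pattern analysis: 17 food items, 2322 subjects.
    print(meets_factor_rule(2322, 17))
    # SES factor analysis: 12 items, 152 agglomerations.
    print(meets_factor_rule(152, 12))
    # Multilevel model: 152 agglomerations, mean of 11 (EC) and 16 (GC) cases.
    print(glmm_check(152, 11))
    print(glmm_check(152, 16))
```

Under these assumptions both factor analyses clear the 100-subject floor, and the multilevel design satisfies the group-count criterion but not the within-group criterion, which matches the caveat about second-level standard errors.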
Ecologic studies are perhaps best considered hypothesis generating, although small-area analysis tends to reduce the ecological fallacy, since the populations defined by agglomeration boundaries are more homogeneous. While this might well be true of villages and towns of average size, it may not hold in large cities. The results reported here correspond to an overall mean, and differences in socio-economic and dietary patterns within cities have been disregarded. It would be interesting to extend our work by assessing whether such differences exist in major cities such as Sari, Ghaemshahr and Gorgan.