Do marginalized neighbourhoods have less healthy retail food environments? An analysis using Bayesian spatial latent factor and hurdle models

Background Findings of whether marginalized neighbourhoods have less healthy retail food environments (RFE) are mixed across countries, in part because inconsistent approaches have been used to characterize RFE ‘healthfulness’ and marginalization, and researchers have used non-spatial statistical methods to respond to this ultimately spatial issue. Methods This study uses in-store features to categorize healthy and less healthy food outlets. Bayesian spatial hierarchical models are applied to explore the association between marginalization dimensions and RFE healthfulness (i.e., relative healthy food access that modelled via a probability distribution) at various geographical scales. Marginalization dimensions are derived from a spatial latent factor model. Zero-inflation occurring at the walkable-distance scale is accounted for with a spatial hurdle model. Results Neighbourhoods with higher residential instability, material deprivation, and population density are more likely to have access to healthy food outlets within a walkable distance from a binary ‘have’ or ‘not have’ access perspective. At the walkable distance scale however, materially deprived neighbourhoods are found to have less healthy RFE (lower relative healthy food access). Conclusion Food intervention programs should be developed for striking the balance between healthy and less healthy food access in the study region as well as improving opportunities for residents to buy and consume foods consistent with dietary recommendations.


Background
A growing body of literature has shown that neighbourhood retail food environment (RFE) has a role in shaping residents' food shopping and consumption behaviours [1][2][3][4][5]. Identifying and modifying characteristics of neighbourhood RFE could therefore be an important step in promoting population-wide healthy eating and reducing diet-related chronic diseases. An extensively explored research question is whether the RFE is less healthy in marginalized 1 neighbourhoods, wherein residents are more vulnerable to adverse health outcomes. The exploration is largely motivated by the deprivation amplification hypothesis, which postulates that residents living in deprived neighbourhoods tend to have fewer health-promoting resources such as healthy foods [6]. In light of Lytle's conceptual model of eating behaviours [7], the more people are constrained by individual (e.g., disability) and social (e.g., income) factors, the more their eating behaviours are explained by the food environment. In other words, Lytle's model posits that marginalized residents are particularly at risk of poor diet and subsequent nutrition-related chronic disease if they live in a less healthy RFE.
Nevertheless, findings in terms of the association between marginalization and RFE are mixed across countries. Studies from the US consistently indicate that neighbourhoods with lower income and higher proportions of minority residents have reduced healthy food access, but the evidence is weak in other developed countries including Canada [8][9][10]. These inconsistent findings in past studies do not conclusively answer the question of whether marginalized neighbourhoods have a less healthy RFE, in part because of limitations in the approaches used to characterize the 'healthfulness' of neighbourhood RFE and neighbourhood marginalization as well as deficiencies in the statistical methods used.

Characterizing neighbourhood RFE healthfulness
The 'healthfulness' of the neighbourhood RFE has been characterized using numerous methods. For example, focusing on absolute densities or numbers of so-called healthy food outlets such as supermarkets represents a focus on a single dimension of the complex RFE and thus could be biased. As reported, densities of healthy and less healthy food outlets are positively correlated, indicating that a neighbourhood could simultaneously have high densities of healthy and less healthy food outlets [11]. Recent studies have attempted to characterize the RFE healthfulness using relative healthy food access metrics, such as the proportion of healthy food outlets of all accessible food outlets, see for example the modified RFE index [12].
These relative measures however, ignore in-store characteristics (i.e., the quality and price of available foods as well as in-store marketing) which could vary within outlet types. For instance, the literature has suggested variations in shelf-space devoted to fruits and vegetables or healthy eating options, which have been proven relevant to healthy eating, within the same outlet types across neighbourhoods [13,14]. Moreover, using outlet types to categorize healthy and less healthy food outlets has the potential to misclassify outlets and exclude outlets (e.g., specialty food stores) whose category is undetermined with a dichotomous classification scheme [15]. Another limitation associated with crude proportions for estimating neighbourhood RFE healthfulness is its uncertainty. Two areas with the same crude proportions, say 0.5, but different total number of accessible food outlets, say 2 and 20, respectively, are regarded to have a RFE with the same level of healthfulness.

Characterizing neighbourhood marginalization
Much of the extant research is also limited by inadequate characterizations of neighbourhood marginalization. Most studies, in particular those in the US, have explored the association between individual socio-demographic and/or socio-economic indicators (i.e., proportions of low-income and minority residents) and the neighbourhood RFE. These individual indicators represent but a small fraction of marginalization which is a multi-faceted construct. Hence, many previous conclusions regarding these associations are actually based on associations between an oversimplified metric of the neighbourhood RFE and specific indicators of marginalization rather than multi-dimensional marginalization. The literature suggests that representing an overall construct such as marginalization by selecting a particular facet of the construct could "reduce strength of the intended signal and thus underestimate its association with the outcome of interest" [16]. In addition, selecting individual socio-economic or socio-demographic indicators is problematic since they may correlate with another indicator belonging to the same marginalization dimension, such that it could act as a proxy of its related indicator in the regression analysis and consequently the association. While the regression analysis could include all marginalization indicators, the multicollinearity problem is likely to occur.
Alternatively, marginalization can be measured with a composite index [17]. For instance, Larsen and Gilliland [18] calculated deprivation for London, Ontario by adding the standardized scores of percentage of lone-parent families, prevalence of low income, percentage of low educational attainment, and percentage of unemployment. Such a composite index may also be subject to arbitrary inclusion of marginalization indicators. Compared with the London case, a study conducted in Montreal, QC, Canada [19] included an additional indicator, the percentage of recent immigrants in the past 5 years, to operationalize deprivation.
Another limitation of current composite marginalization indices is that the included indicators are unweighted, an approach that assumes each indicator contributes evenly to marginalization. This assumption is problematic given that population structures vary across neighbourhoods [20]. To weight each indicator, statistical approaches implemented in the frequentist framework such as principal component analysis and factor analysis have been applied to construct the composite indices [17,[21][22][23]. These approaches are flawed in presuming that indicators (and the associated constructs they purport to measure) in adjacent areas are independent, an assumption usually violated in spatial studies at a smallarea level.

Statistical methods in neighbourhood RFE studies
Methodologically, with few exceptions, most studies use non-spatial statistical approaches to analyse the association between neighbourhood RFE and marginalization. For example, non-spatial versions of ordinary least square (OLS) and poisson/negative binomial regression approaches have been applied to model the continuous (e.g., distance to the nearest food outlet) [24,25] and discrete (e.g., count of accessible food outlets) [22,24,26] measures of neighbourhood RFE, respectively. Residuals from regression analyses could be spatially auto-correlated given that spatial dependence is likely to exist between RFE measures at small-area levels with adjacent areas having similar RFE, a phenomenon rooted in the understanding that socioeconomic processes occur systematically and spatially across metropolitan areas [27]. Ignoring spatial autocorrelation renders conclusions regarding the association potentially invalid. The mixed findings in the literature could also be partly attributed to this methodological limitation. A recent meta-analysis of 54 papers revealed that although the spatial nature is widely acknowledged in RFE studies, very few adopted appropriate spatial statistical approaches [28].
Of the few studies that did use spatial approaches, Baker et al. [29] applied a spatial scan method to model the counts of fast-food restaurants and supermarkets in urban areas of St. Louis, Missouri. Their research found that mixed-race or white high-poverty communities and all-black communities regardless of poverty are less likely to have access to healthy foods compared to their predominantly white high-income counterparts. McKenzie [27] assessed neighbourhood disparities in supermarket access for Portland, Oregon region with a spatial error model. Findings revealed that in comparison to their counterparts in urban areas, neighbourhoods in suburban areas, either poor or non-poor, have longer travel distance and time to the nearest supermarket. Within suburban neighbourhoods however, the study found that deprivation was associated with shorter travel distance but longer travel time. Applying and comparing both spatial and non-spatial regression techniques, Wang et al. [30] analysed the relationship between spatial proximity to fresh food retailers and socioeconomic status in Saskatoon and Regina, Saskatchewan, Canada at the dissemination area level. In addition to identifying significant associations between healthy food access and socio-economic variables, their research reported that in comparison with spatial regression approaches, OLS overestimated the magnitude of the associations. Lamichhane et al. [31] analysed the relationship between access to supermarkets as well as fast-food outlets and neighbourhood characteristics with a Bayesian spatial Bernoulli model for the State of South Carolina at the census block group level. Several characteristics including income, housing value, and educational attainment were found to have a positive association with access to both supermarkets and fastfood outlets, whereas a negative association was identified for characteristics such as percentage of minority and population living under poverty after accounting for geographic location (e.g., urban, rural, etc.) and population density. Finally, Lamichhane et al. [32] applied a Bayesian spatio-temporal Poisson model to analyse the relationship between sociodemographic characteristics and densities of supermarkets and convenience stores for four US cities at the Census Tract level. Results indicated that poorer neighbourhoods have better access to both supermarkets and convenience stores after controlling for covariates including population density.

Study objectives
To address the limitations in past studies, this research uses measures of the consumer nutrition environment (a Canadian adaptation of the widely-used NEMS-S [33] and the NEMS-R [34]) to classify "healthy" versus "less healthy" food outlets rather than assuming invariance in the consumer nutrition environment within outlet types.
Second, this study constructs four composite indices representing the four different marginalization dimensions for the study region, namely residential instability, material deprivation, dependency, and ethnic concentration, using a spatial latent factor model. A recent study reported that compared to its non-spatial counterpart, the spatial latent factor model provides more precise estimation for composite dimension scores, which thus enables more accurate assessment of the association between dimensions of neighbourhood environment and health outcomes [35]. Specifically, each marginalization dimension is derived from a number of relevant indicators which are theoretically informed and have been empirically validated [17]. These dimensions have been proven to be strongly and significantly associated with several public health outcomes derived from the nationally-generalizable Canadian Community Health Survey.
Finally, using Bayesian spatial hierarchical models, this research investigates whether marginalized neighbourhoods experience less healthy RFE. Healthfulness of neighbourhood RFE is represented as relative healthy food access and modelled via probability distributions rather than crude proportions of healthy food outlets. Various buffering sizes are used to characterize neighbourhood RFE, accounting for potential transportation modes. More details regarding the datasets and methodologies are given in the following sections.

Study region
Our study was conducted in the Regional Municipality of Waterloo ( Fig. 1), Ontario, specifically the cities of Waterloo, Kitchener, and Cambridge, which include 625 dissemination areas (DA). For reference, a DA is the smallest census area in Canada that covers the entire territory and follows roads and physical boundaries [36]. DAs are delineated such that the population size is generally between 400 and 700 [36]. The average population density in the study region was 3273.37/km 2 , ranging from 1.26 to 16,754.11/km 2 .

Marginalization indicators
Guided by contemporary theories regarding marginalization in Canadian societies [37,38] and the selection of characteristics for constructing areal deprivation indices in previous studies [39][40][41], we followed Matheson et al. 's approach [17] to include 18 indicators from 2006 Canadian census that belong to four marginalization dimensions: residential instability, material deprivation, dependency, and ethnic concentration ( Table 1). The inclusion of these indicators enables a comprehensive depiction of neighbourhood marginalisation which involves diversified social problems relevant to health. The hypothesized loading sign of the indicator and its corresponding marginalization dimension, which indicates the direction of correlation, is also presented. For example, percentage of living alone (R1) is assumed to be positively associated with residential instability, whereas percentage of dwellings that are owned (R6) is presumed to have a negative loading.

Measures of neighbourhood RFE
Food stores and restaurants were categorized as healthy if their nutrition environment measures survey (NEMS-S or NEMS-R) score fell within the highest two quartiles. The NEMS-S [33] and NEMS-R [34] are inventory-type measures of food stores and restaurants, respectively, that score outlets according to the quality, relative affordability, availability, and marketing of foods and beverages that comprise a large proportion of caloric intake at the population level. Data collection methods employed in the current study have been reported in detail elsewhere [42,43]. Briefly, the Region of Waterloo's public health inspection database was used to identify food outlets, and systematic direct observation was used to identify additional outlets and remove non-existent food outlets within the three cities from the sampling frame. One of each chain convenience store, pharmacy and superstore, and each grocery store and independently-owned convenience store, pharmacy, and specialty store in the three cities were assessed using the NEMS-S adapted for Canada (n = 422 stores). One of each chain restaurant and each independently-owned restaurant was assessed using the NEMS-R (n = 912). NEMS food outlet scores ranged from 0 to 43 for food stores and from −11 to 37 for restaurants. Data were collected in 2010.
The number of accessible healthy and total food outlets within 1, 4, and 8 km network buffering zones were calculated from each DA's centroid. The first cut-off represents a walkable distance (10-15 min) which has been widely used in Canadian studies [18,19,24,44], while the second, which has been used in past research for the same study region [45], represents a 5-min driving distance and also represents accessibility for people who use alternative transportation modes such as bicycling and public transit. A third buffering size which approximately represents a 10-min driving distance, 8 km, is used for testing the sensitivity in terms of how the relationships change under the assumption that residents own cars. Descriptive statistics of accessible healthy and total food outlets are shown in Table 2.
The count and crude proportion of accessible healthy food outlets are mapped in Fig. 2. Areas without access to food outlets within a walkable distance are highlighted using hatch lines in Fig. 2b. Within 1 km, the central parts

Methods
We use Moran's I statistic to test spatial autocorrelation within each marginalization indicator and RFE measures including counts and crude proportions of healthy food outlets. A Moran's I value approaching 1/−1 indicates strong positive/negative spatial autocorrelation, indicating that adjacent neighbourhoods have similar/dissimilar values of marginalization indicators and RFE. In contrast, a value equal to or close to zero suggests spatial randomness. In other words, values of marginalization indicators and RFE are randomly distributed over space. Spearman's rank correlation analysis is applied to examine the correlations between indicators belonging to the same marginalization dimension. Below we detail the spatial latent factor model for constructing marginalization dimensions and spatial regression models for exploring the association between RFE healthfulness and marginalization dimensions as well as population density. All models are implemented in the Bayesian framework. For reference, Bayesian approaches combine prior knowledge and observed data to estimate posterior distributions of unknown parameters.

Spatial latent factor model
Given that each marginalization indicator is theoretically linked to a specific dimension [17], the confirmatory rather than the exploratory factor model is used. Except for the dimension an indicator belongs to, the factor loadings of this indicator on other dimensions are set to zero. Similar approaches have been applied in Congdon [46][47][48]. Specifically, the normalized marginalization indicator j at area i (denoted as V ij ) is assumed to follow a normal distribution with mean (α j + δ j × X ni ) and variance σ 2 j [Model (1)], where X ni is the nth marginalization dimension at area i (that V ij belongs to), which is spatially structured (see Appendix 2 for more details); α j is the intercept representing the average of indicator j over the study region; and δ j is the factor loading of V ij on X ni . For reference, the constructed factors X 1i , X 2i , X 3i , and X 4i represent residential instability, material deprivation, dependency, and ethnic concentration, respectively.

Model for the 1 km dataset: spatial hurdle model
Considering that ~30 % of DAs (178 out of 625) had no access to healthy food outlets within a walkable distance and adjacent areas have similar healthy food access, we used a spatial hurdle model to analyse the 1 km dataset, accounting for the potential zero-inflation and spatial autocorrelation. Similar spatial hurdle models have been applied to model emergency department visits [49,50] and adult mortality [51] with excess zeros. An alternative to the hurdle model for accounting for zeroinflation is the zero-inflated model [52], which assumes zeros arise from two sources-the "structural" zeros and "chance" zeros. The hurdle model is appropriate for this study because cases of zero accessibility are fully observed rather than latent-a DA either can or cannot access healthy food outlets within a walkable distance, and this access is not dependent on chance. Using a binomial hurdle model (more details given in Appendix 1), zero counts and positive counts are modelled via a Bernoulli distribution with probability parameter π i and a truncated binomial distribution with probability parameter p i , respectively. Specifically, π i represents the likelihood of a binary indicator-whether or not a DA has access to healthy food outlets, while p i is the probability of a food outlet being healthy in DA i , which represents the prevalence of healthy food outlets (thus the healthfulness of neighbourhood RFE). Notably, p i is equivalent to a modelled version of the relative healthy food access [45]. Compared with calculated or crude proportion of healthy food outlets, p i is a more robust metric to reflect RFE healthfulness. Using a sampling distribution (i.e., binomial) to model empirical counts (e.g., the number of accessible healthy food outlets) that occur as proportions (i.e., the proportion of healthy food outlets), the uncertainty associated with crude proportions of healthy food outlets as shown in Fig. 2 can be accounted for by incorporating the sample size (i.e., the total number of accessible food outlets). Logistic regression was further performed for π i and p i [Models (2), (3)], where α 1 and α 2 are intercepts for the Bernoulli and truncated binomial components and represent the average (logit) probability to access healthy food outlets and the (logit) average RFE healthfulness (or relative healthy food access) over the region, respectively. X T is a 1 × 5 vector of covariates (with corresponding regression coefficient vectors β 1 and β 2 for Bernoulli and truncated binomial components, respectively). In particular, these coefficients represent the four marginalization dimensions (X 1i , X 2i , X 3i , and X 4i ) estimated from Model (1) and population densitya major driving factor of food outlet distributions [53,54]. The parameter vectors u (u 1i and u 2i ) and s (s 1i and s 2i ) are unstructured and spatial random effects (a.k.a., heterogeneity), respectively. These random effects are included to account for unmeasured covariates (spatial or non-spatial), overdispersion, and spatial autocorrelation [55].

Model for the 4 and 8 km datasets: spatial binomial model
A regular spatial binomial model is used for the 4 and 8 km datasets because all DAs have access to healthy food outlets within the 4 and 8 km buffers. Specifically, the number of accessible healthy food outlet is assumed to follow a binomial distribution with probability parameter p i . Similarly, a logistic regression model is fitted for p i [Model (4)]. Symbols in Model (4) refer to the same variables in Models (2) and (3).

Model fit and implementation
Prior specifications are provided in Appendix 2. Models were implemented with the WinBUGS software [56]. The spatial latent factor model [Model (1)] was jointly implemented with spatial hurdle model [Models (2), (3)] and spatial binomial model [Model (4)], respectively, accounting for uncertainties associated with the constructed marginalization dimensions. Two parallel chains were fitted for the models, starting with diverging initial values. We checked model convergence by visually examining trace plots, history plots, autocorrelation plots, and Gelman-Rubin statistic plots. Model selection was based on the deviance information criterion (DIC) [57]. The best model is the one with lowest DIC. We ran each chain for 600,000 iterations, discarded the first 200,000 as burn-ins, and kept every 40th sample, resulting in a total of 20,000 samples for posterior estimates. Sensitivity analysis for prior specification was performed with alternative vague priors for parameters in the models. Similar results were obtained and DIC difference is smaller than 5, indicating that modelling results are insensitive to prior selections.

Moran's I analysis of marginalization indicators and RFE measures
Results of Moran's I analysis for marginalization indicators are presented in Table 3. Most indicators are found significantly and spatially correlated with the exception of M4 (% of unemployment), D2 (dependency ratio), and E1 (% of 5-year recent immigrants), indicating the necessity to use spatial statistical approaches to construct the composite marginalization dimensions. Table 4 shows results of Moran's I test of count and crude proportions of healthy food outlets. All RFE measures at the three scales are significantly auto-correlated with high autocorrelation except the crude proportion at the 1 km scale, which has a moderate autocorrelation. This finding indicates that adjacent areas have similar absolute and relative healthy food access thus again demonstrates the necessity to apply spatial statistical approaches.

Bivariate correlation analysis of marginalization indicators
Results of bivariate analysis of marginalization indicators are shown in Table 5. As expected and consistent with previous findings [17], indicators belonging to the same marginalization dimension are significantly and highly or moderately correlated. Exceptions are R2 and R7, M1 and M4, and M4 and M6, which have significant but weak correlations.

Spatial latent factor modelling
Factor loadings from the spatial latent factor model [Model (1)] are presented in Table 6. All indicators significantly load on their corresponding marginalization dimensions, with the expected positive or negative sign shown in Table 1. The posterior mean as well as the 95 % credible interval (CrI) 2 of factor loadings ascertain indicators that most central to defining corresponding marginalization dimensions. For example, the level of material deprivation, dependency, and ethnic concentration seem to be mainly driven by the percentage of government transfer payment, percentage of seniors (65+), and percentage of visible minority, respectively, whereas all indicators of residential instability similarly relate to the constructed factor, with the exception of the percentage of residential mobility (same house as 5 years ago) which has a relatively low impact.
We map the four marginalization dimensions constructed from the spatial latent factor model (Fig. 3). Clear spatial patterns of the four marginalization dimensions can be identified from the map: areas with high residential instability locate along the main road-King Street-in the region, mainly concentrating in the central parts of Waterloo, Kitchener, and Cambridge. Highly materially deprived areas locate in central Waterloo, central and northeast Kitchener, and south Cambridge. Five distinct clusters of areas with high levels of dependency are found at south Waterloo, central Kitchener, and west Cambridge. As for areas with high ethnic concentration, they cluster at west Waterloo and east Cambridge, and scatter across the region.

Spatial regression
Results regarding the associations between RFE healthfulness and marginalization dimensions as well as population density are presented in Table 7

Modelling results interpretations
In the region of Waterloo's cities, neighbourhoods with higher residential instability and material deprivation are more likely to have access to healthy food outlets (i.e., better absolute healthy food access) within a walkable distance. This makes sense since healthy food outlets ( Fig. 1) as well as residentially instable and materially deprived areas (Fig. 3) concentrate along the arterial streets of the region. This finding aligns with previous Canadian findings that socio-economically deprived residents have better access to absolute densities of healthy food outlets [10, 19, 22, 24-26, 30, 44, 58]. A probable explanation is that residents who are socio-economically deprived might be more likely to find affordable housing in highly populated areas [22,24] where healthy food outlets are located, given that population density is a driving force of food outlet distribution as noted above.
In contrast, modelling the relative healthy food access (via binomial component from the spatial hurdle model), which represents RFE healthfulness in this study, reveals that areas with higher material deprivation have a relatively less healthy RFE at the walkable distance scale, despite higher probability to access to healthy food outlets. The finding is contrary to past Canadian studies that explored the relationship between material deprivation and relative healthy food access which is measured with crude proportions. For instance, most materially deprived neighbourhoods in Toronto were found to have healthier RFE (i.e., lower crude proportion of less healthy food outlets) [22]. Mercille et al. [58] reported that the poorest areas in Montreal have lower crude proportions Table 3 Moran's I test of marginalization indicators p value: **** <0.001; ** <0.01; * <0.05 The smaller the p value, the less likely that the correlation occurs by chance of fast-food outlets over all accessible restaurants and higher crude proportions of fruit and vegetable stores over all accessible food stores in comparison to their wealthier counterparts. This inconsistency could be attributed to the differences between our research and previous studies in terms of the methods used for differentiating 'healthy' and 'less healthy' food outlets, the completeness of RFE datasets, and the appropriateness of statistical modelling approaches. Compared with the two Canadian studies noted above, our study is strengthened by differentiating 'healthy' and 'less healthy' based on in-store characteristics instead of food outlet types. This approach for defining healthy food outlets enabled all retail food outlets to be included in our dataset, which was a major strength of the current study. Also, in contrast to previous studies, we explicitly accounted for spatial autocorrelation occurring within RFE measures and marginalization indicators as demonstrated above, which increases the reliability of our results. Finally, we modelled the count of healthy food outlets with a binomial distribution rather than the crude proportion of healthy food outlets, which is associated with uncertainty thus is not a stable estimation of RFE 'healthfulness' . As mentioned, our modelling approach is more robust to analyse relative healthy food access because it accounts for the underlying total number of accessible food outlets (thus the number of accessible less healthy food outlets) which cannot be reflected by crude proportions. Not surprisingly, while increasing the buffering size to 4 km and 8 km (thus increased mobility) based on alternative transportation modes such as bicycling, public transit, and driving, population density and marginalization dimensions are not significantly associated with RFE healthfulness since discrepancies in relative healthy food access between areas decrease with larger travel distances (Fig. 2b, d, f ). An exception is the negative association between dependency and RFE healthfulness at the 4 km scale, indicating that neighbourhoods with higher proportions of seniors and children have a less healthy RFE; however, this might not be problematic given that these dependent populations may be more likely to walk than to take public transit or bicycle.

Policy implications
Findings from our study are important and informative for food environment planning and interventions for combating adverse diet-related health outcomes. Specifically, rather than improving absolute densities of healthy food outlets, a more pressing mission may be to strike a better balance between healthy and less healthy food access, especially given that increasing evidence shows that residents with higher relative healthy food access have healthier food purchasing [11,59] and consumption [60,61] behaviours, and lower body weight [62][63][64][65]. Traditional approaches such as building new supermarkets [66] have been proposed in the US for improving healthy food access thus the balance, but were found ineffective for promoting healthy eating [66] possibly due to residents' hesitation of relying on a new food store [67]. Policy and program interventions to improve the food environment in Canada are nascent [68]. One potentially effective intervention for the Region of Waterloo could be modifying the in-store characteristics of existing food outlets in materially deprived areas, for example, providing fruits and vegetables in less healthy food outlets through intervention programs such as healthy corner stores, which have been implemented in municipalities of Toronto and Vancouver [69,70]. This approach, if undertaken, should prioritize food outlets within a walkable distance to areas that fall inside the highest material deprivation quantile (Fig. 2b). An alternative intervention could be restricting the construction of less healthy food outlets within or around these neighbourhoods via zoning bylaws. While Canadian planning laws do not permit discrimination against specific types of food outlets [71,72], the Regional Municipality of Waterloo can apply several urban planning tools to limit the establishment of less healthy food outlets, for example, prohibiting fastfood restaurant establishments and regulating the densities or quotas of less healthy food outlets in materially deprived neighbourhoods [71,73].
Modelling results of 4-and 8 km-datasets suggest that increasing mobility might be effective for alleviating the disparities of RFE healthfulness. Yet travelling further to access healthier RFEs could economically burden materially deprived residents. As discussed in LeClair and Aksan [74], the high travelling costs might outweigh the cost savings from food shopping, thus deterring residents from taking public transit to procure healthy foods. In this sense, improving public transportation to healthy food retailers via interventions such as providing healthy food outlets (e.g., supermarkets) sponsored shuttle services could be potentially effective for encouraging materially deprived residents to travel beyond the walkable zones for food purchasing, complementary to aforementioned interventions.

Methodology implications
Methodologically, this study contributes to the RFE literature by introducing a flexible modelling approach to study the association between neighbourhood RFE and a b c d   marginalization. While the spatial lag [30] and spatial error [27,30] models are inappropriate to model count data (e.g., number of supermarkets accessible to a DA), the applied Bayesian hierarchical approach can model the count of food outlets by following a discrete distribution, for example the binomial distribution as demonstrated in this study, while simultaneously account for spatial autocorrelation by including spatial random effects. Moreover, this Bayesian approach applied is superior to the spatial scan statistical method [29], which is also capable of modelling count data, in terms of its feasibility to incorporate covariates. Another noticeable advantage of the applied Bayesian approach is its capability to model spatio-temporal RFE datasets, as demonstrated by Lamichhane et al. [32]. Future research could examine how neighbourhood RFE might change over time in tandem with varying levels of marginalization. Furthermore, the spatial hurdle model used for analysing the 1 km dataset accounts for zero-inflation, an issue rarely reported by past RFE studies but could occur in the case that a large portion of neighbourhoods in the study region have no access to (healthy) food outlets within a walkable distance. Not appropriately taking into account zero-inflation may result in biased or imprecise inferences. Although the negative binomial model implemented via conventional frequentist approaches can deal with zero-inflation in some cases, it cannot easily address the spatial autocorrelation issue.

Study limitations
Findings of this study are subject to several limitations. First, to create the buffering zones, the geographic centroid rather than the population centroid was used to represent each DA. We consider this approach acceptable considering that most DAs are relatively small so geographic centroids approximate population centroids. Second, we used 1, 4, and 8 km to represent potential transportation modes; however, a unified travelling distance might not be suitable for all DAs. In reality, residents in different neighbourhoods could take different times to travel 4 km by bus due to varying public transit availability and routes. More nuanced methods for characterizing transportation-based RFE (see for example Farber et al. [75]) should be applied in future research. Lastly, 'healthy' and 'less healthy' were differentiated based on a binary category. Although we observed similar results by conducting sensitivity analysis with a more rigorous definition of healthy food outlets (i.e., outlets falling into the highest tercile instead of the highest two quartiles), this categorization approach should be refined in future studies.

Conclusion
This paper contributes empirically and methodologically to the RFE literature that explores the association between neighbourhood marginalization and RFE healthfulness. Using Bayesian spatial hierarchical models, this research found that residents in neighbourhoods with higher residential instability, material deprivation, and population density are more likely to have absolute access to healthy food outlets within a walkable distance. Materially deprived neighbourhoods however, are also more likely to have a relatively less healthy RFE at the walkable distance scale. These findings indicate that a simple 'yes' or 'no' answer for the deprivation amplification hypothesis in the context of RFE is inappropriate. To infer a relatively unbiased conclusion, incorporating the complete RFE dataset, considering various assessment strategies (i.e., absolute and relative access) of RFE, and applying sound spatial statistical approaches are warranted.
For the Region of Waterloo in particular, striking the balance between healthy and less healthy food outlets in these neighbourhoods via interventions such as modifying in-store characteristics, restricting the opening of less healthy food outlets, and improving public transit to healthy food outlets may be warranted. The Bayesian spatial hierarchical modelling approach, including spatial latent factor and spatial hurdle models, as shown in this study can be further explored in other Canadian settings or different countries. Future research could tailor the buffering cut-offs for different types of food outlets, which are potentially linked to behaviours underlying travel patterns to visit specific types of food outlets and subtypes among them.