Demarcation of local neighborhoods to study relations between contextual factors and health
International Journal of Health Geographics, volume 9, Article number: 34 (2010)
Several studies have highlighted the importance of collective social factors for population health. One of the major challenges is an adequate definition of the spatial units of analysis which present properties potentially related to the target outcomes. Political and administrative divisions of urban areas are the most commonly used definition, although they suffer limitations in their ability to fully express the neighborhoods as social and spatial units.
This study presents a proposal for defining the boundaries of local neighborhoods in Rio de Janeiro city. Local neighborhoods are constructed by means of aggregation of contiguous census tracts which are homogeneous regarding socioeconomic indicators.
Local neighborhoods were created using the SKATER method (TerraView software). Criteria used for socioeconomic homogeneity were based on four census tract indicators (income, education, persons per household, and percentage of population in the 0-4-year age bracket) considering a minimum population of 5,000 people living in each local neighborhood. The process took into account the geographic boundaries between administrative neighborhoods (a political-administrative division larger than a local neighborhood, but smaller than a borough) and natural geographic barriers.
The original 8,145 census tracts were collapsed into 794 local neighborhoods, distributed across 158 administrative neighborhoods. Local neighborhoods contained a mean of 10 census tracts, and there was an average of five local neighborhoods per administrative neighborhood.
The local neighborhood units demarcated in this study are less socioeconomically heterogeneous than the administrative neighborhoods and provide a means for decreasing the well-known statistical variability of indicators based on census tracts. The local neighborhoods were able to distinguish between different areas within administrative neighborhoods, particularly in relation to squatter settlements.
Although the literature on neighborhood and health is increasing, little attention has been paid to criteria for demarcating neighborhoods. The proposed method is well-structured, available in open-access software, and easily reproducible, so we expect that new experiments will be conducted to evaluate its potential use in other settings. The method is thus a potentially important contribution to research on intra-urban differentials, particularly concerning contextual factors and their implications for different health outcomes.
In the area of epidemiological studies, the 1990s witnessed increasingly widespread use of ecological methods for the study of contextual factors, a field known as ecoepidemiology [1–4]. Since then, various researchers have focused on improving methods that allow a better grasp of the importance of collective social factors in processes related to population health [5–8]. The central concept of this research is that although health outcomes occur in individuals, a large share of the determinants of these processes take place at other levels, referred to generically as collective or contextual [8, 9]. The development of multilevel statistical models that allow analysis of contextual levels simultaneously with the individual level has helped expand our understanding of the role played by multiple social factors in health outcomes [10–13].
Although spatial approaches have still not been fully used in epidemiological studies in Brazil, their integration into research on contextual factors shows huge potential for application in health studies. One of the most widely used ways of demarcating population groups on collective scales is by spatial partitioning of the territory [14–17]. Area of residence, for example, has been used to grasp the social and environmental conditions to which these groups are exposed.
In addition to improvement in measurement and data sources for intrinsic group-level properties, another major challenge for researchers is the definition of adequate spatial units of analysis for studying properties potentially associated with the target outcomes. Particularly in countries with capitalist economies, and especially in developing countries like Brazil, the way urban territory is occupied both reflects and is conditioned by the political and economic macrostructure. Thus, the incorporation of spatial units in the study of social inequalities in health is essential for capturing these conditioning and determinant factors.
Spatial units of analysis at the contextual level vary according to the scales of investigation (global, regional, local) and criteria (social, political-administrative, ecological) adopted by the study. The most commonly used divisions of municipal urban territory in health studies at the local level are political-administrative, like districts or boroughs, administrative neighborhoods, ZIP areas, and census tracts. Linked to these political-administrative spatial sections, various types of information are available in databases, including Health and Environmental Information Systems.
In a recent series of North American and British studies, various political-administrative units of analysis, referred to generically as "neighborhoods", have been used to detect relevant contextual effects in the occurrence of health outcomes, for example in self-rated health, children's health, infectious diseases, adult health [23, 24], lifestyle [25, 26], and mortality, among others. Based on the results of these and other studies, contextual socioeconomic factors exert a specific influence on the prediction of health outcomes, even after considering individual socioeconomic conditions.
The main advantage of using political-administrative units is the ease in georeferencing various data in geographic information systems (GIS). Having been organized in hierarchically nested subsets, information at both the individual level (whose address should be geocoded) and other levels can be referred to the respective political-administrative unit for study at the collective level. Although such divisions are useful for a general approach, they present problems in relation to their availability and limitation for health research and public policy proposals.
The demarcation of administrative districts and administrative neighborhoods (subdivisions of a municipality) is not legally regulated in all Brazilian municipalities, and only some large State capitals have such geographically demarcated units. Even where they exist, these units include populations of widely varying sizes, with highly diverse residential patterns and very heterogeneous socioeconomic levels. Meanwhile, census tracts are the minimum spatial units for census data collection and spatial reference, but they are too small to represent collective social processes that occur at the local level [29, 30]. In addition, the small number of inhabitants in census tracts produces problems of excessive statistical variability in the epidemiological and social indicators.
From the point of view of the social unit, few studies have focused on guaranteeing the representation of social processes at the collective level. The concept of neighborhoods as "distinctive areas into which larger spatial units may be subdivided... The distinctiveness of these areas stems from ... geographical boundaries, ethnic or cultural characteristics of the inhabitants, psychological unity among people who feel that belong together, or concentrated use of an area's facilities for shopping, leisure, and learning", integrates the sociological approach and provides the basis for the demarcation of representative spatial units for social processes in order to study potentially important contextual factors for health outcomes. In this sense, a proposal that has been explored is the demarcation of local neighborhoods as units of analysis, consisting of sets of relatively homogeneous census tracts according to socioeconomic and spatial contiguity criteria, such as that designed in the Project on Human Development in Chicago Neighborhoods.
As a spatial construct, the neighborhood denotes a geographic unit whose residents share proximity and the circumstances that derive from it, such as social unity, which involves recognition of identity among the inhabitants and the development of interpersonal networks between neighbors. These social properties are important for (1) supporting collective actions in given circumstances and (2) providing the basis and motivation for collective actions. To allow the study of these properties and their influence on health, an operational definition of neighborhood is essential, which can be facilitated by means of GIS tools and the availability of georeferenced social and population data.
In Rio de Janeiro, in particular, where the model of socio-spatial segregation differs from the downtown-versus-suburb pattern, administrative neighborhoods are real mosaics that harbor areas of great socioeconomic prosperity, permeated by impoverished areas. Some cases involve islands of prosperity or poverty. To account for this complexity, the demarcation of local neighborhoods as units of spatial and social analysis based on the clustering of census tracts is a plausible alternative, since it allows capturing diverse socio-spatial processes that occur among residents of these areas.
The objective of this work is to propose the demarcation of local neighborhoods as geographic units, through spatial analysis that combines contiguous and socio-demographically homogeneous census tracts. The expectation is to discriminate between distinct population groups in the city of Rio de Janeiro.
In 2000, the city of Rio de Janeiro had a total population of some 7,000,000 and consisted of 8,145 census tracts distributed across 158 administrative neighborhoods.
The procedure for the creation of local neighborhoods was based on clustering of census tracts with permanent private households (homes occupied throughout the year, regardless of season), contiguous and internally homogeneous in relation to the selected socioeconomic indicators.
Procedures that cluster a large set of smaller areas into groups (regions of analysis), with the objective of maximizing internal (within-group) homogeneity and external (between-group) heterogeneity, are referred to as regionalization. According to Duque et al., various methods can be used for regionalization; they fall into two major groups, differentiated by whether or not they explicitly consider spatial contiguity between areas.
In the current study, among the methods that consider spatial contiguity, we classified the census tracts for spatial clustering into local neighborhoods using the SKATER method (Spatial 'K'luster Analysis by Tree Edge Removal), with algorithms adapted by Assunção et al., initially proposed for use by the Brazilian Institute of Geography and Statistics (IBGE), or National Census Bureau, and subsequently compiled in the Skater software and made available through TerraView. This method is a heuristic based on graph theory, in which partitioning is performed by "spanning tree edge removal". SKATER was designed to define homogeneous areas by clustering smaller areas (spatial objects) according to control variables (indicators), using the distance between their values as the combination pattern and aiming for the areas to have a minimum, previously stipulated population size. Connectivity graphs are created to capture the local neighborhood relationship between spatial objects, and this relationship is summarized in a minimum spanning tree whose edges (links) with the highest degree of dissimilarity are pruned successively. The result is the classification of the spatial objects into regions with maximum internal homogeneity. The geographic boundaries of the administrative neighborhoods were respected such that the demarcated local neighborhoods are hierarchically nested subsets within them; that is, no local neighborhood contains census tracts that belong to different administrative neighborhoods. In addition, the boundaries imposed by large geographic barriers like highways, railways, lagoons, and islands were also maintained (that is, the local neighborhoods conform to these 'natural' boundaries).
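The tree-pruning idea described above can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions, not the TerraView implementation: the function names, the toy tracts, and the rule of cutting the single most dissimilar edge first are all illustrative, and the published SKATER algorithm evaluates candidate cuts by their effect on within-region homogeneity rather than by raw edge cost alone.

```python
from collections import defaultdict

def dissimilarity(a, b):
    """Euclidean distance between two indicator vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def minimum_spanning_tree(values, adjacency):
    """Kruskal's algorithm over the contiguity graph; returns [(cost, u, v), ...]."""
    parent = {u: u for u in values}

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path compression
            u = parent[u]
        return u

    edges = sorted((dissimilarity(values[u], values[v]), u, v) for u, v in adjacency)
    tree = []
    for cost, u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v
            tree.append((cost, u, v))
    return tree

def components(nodes, edges):
    """Connected components of the forest defined by `edges`."""
    neighbors = defaultdict(set)
    for _, u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    seen, result = set(), []
    for start in nodes:
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            u = stack.pop()
            if u not in group:
                group.add(u)
                stack.extend(neighbors[u] - group)
        seen |= group
        result.append(group)
    return result

def prune_tree(values, population, adjacency, min_population):
    """Cut the most dissimilar tree edges, keeping every region at or above min_population."""
    tree = minimum_spanning_tree(values, adjacency)
    for edge in sorted(tree, reverse=True):  # most dissimilar edge first
        trial = [e for e in tree if e != edge]
        regions = components(values, trial)
        if all(sum(population[u] for u in r) >= min_population for r in regions):
            tree = trial
    return components(values, tree)

# Hypothetical toy data: six contiguous tracts in a row, one standardized indicator each.
values = {"A": (0.0,), "B": (0.1,), "C": (0.2,), "D": (2.0,), "E": (2.1,), "F": (2.3,)}
adjacency = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "F")]
population = {tract: 2000 for tract in values}

regions = prune_tree(values, population, adjacency, min_population=5000)
```

In this toy case only the C-D edge can be removed: every other cut would leave a region with fewer than 5,000 residents, so the six tracts form two regions of three tracts each.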
After creation of the local neighborhoods, we identified areas that are not necessarily geographically connected but which display similar socio-demographic characteristics, although they are located in distant administrative neighborhoods, and which we refer to as "super-groups". For this purpose, we conducted an analysis of the clustering of local neighborhoods with homogeneous socio-demographic patterns, using the non-hierarchical K-means method.
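As a rough illustration of non-hierarchical K-means, the sketch below implements the basic assign-and-update loop (Lloyd's algorithm) in plain Python. The profiles, the starting centroids, and the use of two groups instead of five are hypothetical choices made for brevity; the software and settings actually used for the super-groups are not specified here.

```python
def kmeans(points, centroids, iterations=100):
    """Assign each point to its nearest centroid, then recompute centroids, until stable."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        labels = [
            min(range(len(centroids)),
                key=lambda k: sum((x - c) ** 2 for x, c in zip(p, centroids[k])))
            for p in points
        ]
        updated = []
        for k, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(old)  # an empty cluster keeps its previous centroid
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids

# Hypothetical standardized profiles (income, schooling) of four local neighborhoods.
profiles = [(-1.0, -1.2), (-0.9, -1.0), (1.1, 0.9), (1.0, 1.2)]
labels, centroids = kmeans(profiles, centroids=[profiles[0], profiles[2]])
```

Because K-means ignores contiguity, local neighborhoods in distant parts of the city can receive the same label, which is the property the super-groups rely on.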
Brazilian population census, 2000;
Map databases of census tracts and administrative neighborhoods in the city of Rio de Janeiro for the year 2000, from the Health Information Laboratory of the Institute for Scientific and Technological Communication and Information, Oswaldo Cruz Foundation (LIS/ICICT/FIOCRUZ);
Satellite images available for viewing on Google Earth™.
Stages performed for demarcation of local neighborhoods
Creation of socio-demographic indicators based on data from the 2000 population census for the census tracts comprising the city of Rio de Janeiro;
Exclusion of census tracts that were non-residential, had no population, or had fewer than five permanent private households;
Cartographic revision of the map database of census tracts and administrative neighborhoods in the city of Rio de Janeiro (for example, the boundary lines of some census tract polygons were not well connected; lagoon polygons were excluded);
Linkage of the indicators to the map database;
Construction of local neighborhoods of census tracts, considering the criteria of contiguity (shared boundaries) and contingency (administrative neighborhood);
Cluster analysis of homogeneous census tracts (Cluster SKATER).
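The exclusion stage in the list above amounts to a simple filter. The sketch below assumes a hypothetical record layout (the field names `residential`, `population`, and `households` are illustrative and do not come from the census files).

```python
def eligible(tract, min_households=5):
    """True if the tract is residential, populated, and has enough permanent private households."""
    return (tract["residential"]
            and tract["population"] > 0
            and tract["households"] >= min_households)

# Hypothetical tract records.
tracts = [
    {"id": "T1", "residential": True, "population": 820, "households": 240},
    {"id": "T2", "residential": False, "population": 0, "households": 0},
    {"id": "T3", "residential": True, "population": 9, "households": 3},
]
kept = [t["id"] for t in tracts if eligible(t)]
```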
Criteria for definition of local neighborhoods
• Choice of indicators
The choice of socio-demographic indicators considered their relevance and variation across space, in order to allow discriminating between distinct areas according to each variable, and was based on previous studies in which the available variables were selected by means of principal components analysis [41, 44, 45]. We also chose less heavily correlated indicators, since they are more suitable for cluster analysis and avoid redundancy of information. For example, two heavily correlated indicators could be capturing the same phenomenon or process and would thus contribute redundant information to the dataset. In this case, it is recommended to submit only one of them to cluster analysis. In short, the objective was to define the minimum number of variables capable of discriminating between different population profiles.
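One way to screen for redundant indicators is to compute pairwise Pearson correlations and keep only one indicator from each heavily correlated pair. The sketch below is illustrative only: the 0.9 threshold, the indicator names, and the values are assumptions, and the selection in this study also relied on earlier principal components analyses and visual assessment.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

def drop_redundant(indicators, threshold=0.9):
    """Keep indicators in order, skipping any that correlates strongly with one already kept."""
    kept = []
    for name, column in indicators.items():
        if all(abs(pearson(column, indicators[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical values for five census tracts.
indicators = {
    "income": [1, 2, 3, 4, 5],
    "income_over_20_minimum_wages": [2, 4, 6, 8, 10],  # moves in step with income
    "children_0_4": [5, 1, 4, 2, 3],
}
selected = drop_redundant(indicators)
```

Here the second indicator is dropped because it is perfectly correlated with income, while the weakly correlated age indicator is retained.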
Indicators were selected from the following three domains.
First, demographic characteristics:
- total population, a key variable for setting the minimum population in the regionalization process;
- permanent private households, an indicator of demographic concentration;
- concentration of children aged 0 through 4 years, a proxy for birth rate that allows the identification of deprived areas (where birth rates were higher than in prosperous areas);
- economic dependency ratio, an index of people outside the workforce (the dependent population);
- male/female ratio, which is important for describing the demographic composition of areas.
Second, housing conditions:
- inadequate sanitation conditions, which allow distinguishing the urban services available in different areas;
- concentration of rented homes, a typical pattern of the Brazilian middle class, distinguishing these areas from those where home ownership is more common (more frequent in slums and prosperous areas);
- concentration of houses (not apartments) as opposed to vertical expansion that indicates the dominant settlement characteristic of urban sets, usually present in areas with high demographic concentration;
- inhabitants per household, to identify crowded households (the number of inhabitants per room is no longer available in Brazil because of changes in the methodology of the last census).
Third, household conditions:
- mean heads-of-households schooling, a traditional indicator of population socioeconomic status;
- mean heads-of-households income, a traditional indicator of population socioeconomic status that discriminates nuances of different areas;
- mean heads-of-households income greater than 20 times Brazilian minimum wage, a variant of mean income particularly used to identify prosperous areas in Brazil.
Starting with a set of 10 indicators (Figure 1), we tested various combinations, thereby reducing the number of indicators to the minimum set that allowed adequate demarcation of local neighborhoods. Most of these indicators have been used by the Brazilian Census Bureau and researchers to describe socio-demographic characteristics of the Brazilian urban population [41, 44, 45]. The adequacy of the boundaries of local neighborhoods obtained with each combination of indicators was analyzed mainly by visual assessment (as detailed below).
All the variables were normalized before classification, not only because some of them did not display a normal distribution but also to avoid the influence of the nature of each variable (for example, some variables were percentages and others were ratios); normalization ensures that they "have the same weight" in the cluster analysis classification.
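The exact normalization is not detailed above; one common choice, assumed here purely for illustration, is z-score standardization, which centers each indicator on zero and scales it to unit standard deviation so that percentages and ratios become comparable. The values below are hypothetical.

```python
from statistics import mean, pstdev

def standardize(values):
    """Z-scores: subtract the mean and divide by the population standard deviation."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

# Hypothetical tract-level indicators on very different scales.
mean_income = [1.5, 3.0, 12.0, 4.5]         # in multiples of the minimum wage
share_children = [0.12, 0.09, 0.04, 0.07]   # proportion aged 0 to 4 years

z_income = standardize(mean_income)
z_children = standardize(share_children)
```

After this step both indicators have mean zero and unit spread, so neither dominates the distance computation because of its original units.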
• Population size
After analyzing the mean and maximum population size for spatial units of local neighborhoods obtained from the initial minimum population sizes (of 10,000, 7,500, and 5,000 individuals) and their respective boundaries, we established the minimum population of 5,000 residents to form each local neighborhood.
We sought to avoid isolated areas with fewer than the minimum required number of inhabitants. Such areas nevertheless arose with all minimum population sizes, because they could not be aggregated into a larger cluster group, either because they were not similar to a contiguous cluster group on the socioeconomic indicators or because the generated cluster group had to have boundaries falling inside an administrative neighborhood (the contingency geographic unit).
• Contingency geographic unit
The boundaries of the administrative neighborhoods were maintained as a contingency geographic unit for regionalization, to allow the use of one more hierarchical level in future multilevel studies, given the widespread availability of data in health information systems linked to this unit.
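The two constraints described above, a minimum population per local neighborhood and nesting within a single administrative neighborhood, can be checked on a finished partition. The following sketch assumes a hypothetical tract table with `local`, `admin`, and `population` fields; it only flags violations and does not repair them.

```python
from collections import defaultdict

def check_partition(tracts, min_population=5000):
    """Return (undersized local neighborhoods, local neighborhoods spanning several admin units)."""
    population = defaultdict(int)
    admin_units = defaultdict(set)
    for tract in tracts:
        population[tract["local"]] += tract["population"]
        admin_units[tract["local"]].add(tract["admin"])
    undersized = sorted(name for name, total in population.items() if total < min_population)
    not_nested = sorted(name for name, units in admin_units.items() if len(units) > 1)
    return undersized, not_nested

# Hypothetical assignment of five tracts to three local neighborhoods.
tracts = [
    {"local": "LN-1", "admin": "Admin-A", "population": 3000},
    {"local": "LN-1", "admin": "Admin-A", "population": 2500},
    {"local": "LN-2", "admin": "Admin-A", "population": 4100},
    {"local": "LN-3", "admin": "Admin-A", "population": 3000},
    {"local": "LN-3", "admin": "Admin-B", "population": 2600},
]
undersized, not_nested = check_partition(tracts)
```

In this example LN-2 falls below the 5,000 threshold and LN-3 straddles two administrative neighborhoods, so both would need review.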
• Visual assessment by overlapping layers
The resulting local neighborhood partitions were critically evaluated by means of overlapping layers in a GIS environment and visual observation of the boundaries imposed by major geographic barriers: highways, railways, and natural geographic features like massifs, lagoons, and islands.
The polygons of the local neighborhood spatial units were compared visually with the geographic barriers existing in the territory, identified by means of satellite images, so as to ensure that the local neighborhoods did not display such barriers internally but only on their edges (for example, a major avenue should not cross a local neighborhood, since it would pose a geographic barrier that "isolates" the resident population along one of its sides from those living on the opposite side). We also visually analyzed the presence of major contrasts displayed in the form of urban occupation with different social patterns inside each administrative neighborhood; together with the production of thematic maps of socioeconomic indicators, this allowed verifying the adequacy of the boundaries for the local neighborhoods created by the process. The choice of a minimum population of 5,000 proved to be the most adequate, since, for example, other alternatives did not distinguish well between slums (favelas or shantytown areas) and the areas surrounding them versus the areas with distinct patterns comprising some administrative neighborhoods.
Demarcated local neighborhoods and their socioeconomic profile
The administrative neighborhoods comprising the city of Rio de Janeiro contain populations varying from 136 to 297,459 inhabitants, while the census tracts vary from 136 to 4,529 inhabitants. As a result, the mean number of local neighborhoods per administrative neighborhood was five (median 3.5), ranging from 1 (the minimum, in four cases described below) to 39 (the maximum subdivision), found in the most populated administrative neighborhood. Larger administrative neighborhoods (with more people) tend to present higher socioeconomic variability and were partitioned into more local neighborhoods.
Of the 158 administrative neighborhoods in the city, four had a total population below the minimum of 5,000 inhabitants and constituted geographically isolated areas, with a socioeconomic pattern extremely different from their surroundings. Nonetheless, these administrative neighborhoods were classified as four local neighborhood units: 1 - Grumari (a settlement in a coastal natural preservation area); 2 - Joá (a settlement on a rocky coastal mountain); 3 - Cidade Universitária (an irregular area inside a university campus); and 4 - Ilha de Paquetá (an island).
The mean number of census tracts allocated to each local neighborhood was 10, ranging from a minimum of one to a maximum of 36. Besides the exceptions mentioned above, five local neighborhoods with fewer than 5,000 inhabitants were demarcated as a result of the classification method. These clusters were composed of census tracts socioeconomically different from their surroundings, but the total population of each of these local neighborhoods reached slightly more than 4,000 inhabitants.
The mean number of permanent private households per local neighborhood was 2,269 (SD 758.75), ranging from 25 (Joá exception) to 6,072.
The local neighborhoods constructed on the basis of the 10 indicators totaled 800 geographic units, while those demarcated on the basis of four indicators totaled 794, with no important differences in the internal partitioning of the administrative neighborhoods. Thus, we chose the demarcation achieved with the smallest number of indicators for the final model, based on four socioeconomic indicators. These indicators were population 0 to 4 years of age, inhabitants per household, mean schooling and mean income (Figure 1). The Google Earth™ tools for approximating and distancing images, as well as for rotating the point of view and three-dimensional effects, combined with the thematic maps and road maps of the areas comprising the city, allowed evaluating the local neighborhoods' geographic boundaries.
Figure 2 illustrates the demarcation of local neighborhoods in a selected area of the city's South Side (Zona Sul), highlighting the distribution of favelas (hatched areas) in some of the local neighborhoods. For thematic visualization, we used the distribution of standardized mean income categories (values with the mean centered on zero so that negative values are below the mean and positive values above it). We observed two important results that support considering the local neighborhood boundaries adequate. First, the regular census tracts located around irregular tracts (favelas) and with a similar income pattern to them were included in the same local neighborhood, as shown in Figure 2 (situations A, B, and C, for example). Second, irregular tracts (favelas) adjacent to regular tracts with a very different economic pattern were allocated to different local neighborhoods, as shown in Figure 2 (situations D and E, for example).
As shown in Figure 3, the proposed model allowed discriminating between different socioeconomic profiles, even in a set with irregular occupation, as featured in the example by the Rocinha administrative neighborhood, considered homogeneous by the municipal government but not homogeneous by our modeling strategy. Therefore it was divided into seven distinct local neighborhoods (delimited by yellow lines). Figure 4 shows the profile of indicators characterizing the seven local neighborhoods demarcated within the Rocinha administrative neighborhood.
Figures 5, 6, 7 and 8 highlight the Ilha do Governador area and present the contribution of each of the four socioeconomic indicators to the classification and demarcation of local neighborhoods. Within each administrative neighborhood (polygons with thicker lines), demarcation of the local neighborhoods appears (polygons with thinner lines) with the thematic visualization of the distribution (in categories) of the indicators used in the final model: mean monthly income in number of times the minimum wage (Figure 5); mean years of schooling (Figure 6); mean number of persons per household (Figure 7); and proportion of inhabitants from zero to four years of age (Figure 8).
Clustering of local neighborhood sets with similar profiles in terms of socioeconomic status (SES), even though geographically distant, allowed a synthesis of the profile of indicators in five super-groups: 1 - low SES with low population density (rural); 2 - low SES with high population density (favela); 3 - lower-middle SES; 4 - middle SES; and 5 - high SES (Figure 9).
Figure 10 allows visualizing the spatial distribution of local neighborhoods resulting from the proposed method (polygons with thinner lines) and their inclusion in the socioeconomic super-groups (visualization theme). It is thereby possible to characterize the sets of local neighborhoods comprising the city of Rio de Janeiro.
Given the relatively large size of the administrative neighborhoods, they tend to show sharp socio-demographic heterogeneity. The local neighborhood units demarcated by this study allowed decreasing this residential and socioeconomic heterogeneity, adequately separating the distinct areas that comprise each administrative neighborhood. The definition of a minimum population size allowed less variability between the local neighborhoods in terms of their population contingent.
Although we did not define an upper limit to the population size of the local neighborhoods, we avoided the strategy of partitioning the administrative neighborhoods in a way that would create units with higher population size. We consider that this limit should vary according to the objectives, strategies and actions related to the phenomenon or health event of interest. Social characteristics such as social cohesion, cultural habits and collective values, for example, may or may not be identified depending on the scale, the spatial units and, consequently, the population size delimited. In a recent study that compares different ways of delimiting neighborhoods, the authors concluded that the size and composition of the neighborhoods may be different in different parts of a study area.
A census tract is classified as showing irregular occupation when the occupied areas were originally invaded (by squatting), with no prior order in the mode of occupation, and where the residents do not own their homes (although they later may obtain adverse possession or property deeds). Census tracts contiguous to the favelas have suffered a process of steady real estate devaluation, and in many cases, from the socioeconomic point of view, they are very similar to the irregular tracts or favelas themselves [48–50].
As shown in Figure 2 and Figure 3, the composition of local neighborhoods in relation to areas with irregular tracts (favelas) proved quite satisfactory. Some areas of the city's South Side show situations of major contrasts (situations D and E, for example), with abrupt changes in the residential and socioeconomic profile between different census tracts (Figure 3). In these cases, the proposed method achieved good discrimination between these different patterns. Other areas (situations A and B, for example) around the irregular census tracts (favelas) showed census tracts with regular occupation, but with a population having a similar socioeconomic profile, with no change in the population pattern. In these situations, in which the groups display the same socioeconomic pattern, it was possible to identify this similarity (continuity), since they constituted the same local neighborhood (Figure 2). As one shifts to the city's West Side (Zona Oeste), this situation becomes common, i.e., with fewer heavily contrasting areas.
Another example of the capacity to discriminate between different socioeconomic patterns in local neighborhoods was the partitioning of the Rocinha favela, an administrative neighborhood with approximately 60,000 inhabitants, into seven distinct local neighborhoods. Figure 3 shows the local neighborhoods' boundaries and the differences in the pattern of residential occupation that reflect the different socioeconomic conditions captured by the indicators used in defining the local neighborhoods. Figure 4 shows the profile of indicators characterizing each of the seven local neighborhoods demarcated in Rocinha. Although the entire area of the administrative neighborhood consists of irregular tracts, Figure 3 and Figure 4 illustrate how it was possible to differentiate between areas with older occupation, which have a better street layout allowing easier access, and those with more recent occupation, with worse sanitation and more precarious and less vertical housing, that is, distinct areas whose residents show different income and schooling patterns and a different concentration of children and total occupants in the households.
Although Figures 5 to 8 show a simplified distribution of the indicators in only four categories, they illustrate the importance of each of the four socioeconomic indicators in demarcating the local neighborhoods. Each single indicator's contribution is limited in terms of discriminating the different internal compositions of the administrative neighborhoods. However, combined use of the four indicators optimizes this capacity, with the joint configuration producing the partitioning that allowed distinguishing between the 794 local neighborhoods.
The income and schooling indicators allowed capturing different socioeconomic dimensions that exert distinct impacts on health conditions, so both should be considered in studies on social inequalities in health. The mean number of persons per household, an indicator of household crowding, is essential to capture the population's living conditions. Due to changes in the variables studied in the population census, there is no longer information on the mean number of household residents per room, an indicator traditionally used to characterize urban occupation [46, 47], since the households comprising the areas with the best social conditions have low resident-per-room density, while those with the worst conditions show high density. Thus, the mean number of inhabitants per household, although presenting a narrower range, proved adequate for differentiating between household density patterns in the various areas. Finally, the population's age composition, captured by the proportion of inhabitants from zero to four years of age, is an important demographic indicator with great capacity to characterize different social profiles in relation to the population turnover and growth in urban areas, especially in developing countries like Brazil.
There is general consensus in the literature that neighborhood refers to geographic units of a limited size, with relative internal demographic and residential homogeneity, as well as some level of social interaction and symbolic meaning for residents. Despite the growth in the literature exploring neighborhood effects on health, little attention has been paid to criteria and methods for demarcation of neighborhoods. The capacity to differentiate socio-spatial inequalities, demonstrated by the partitioning into local neighborhoods obtained with the proposed method, will allow new studies on the effects of the population's living conditions on health.
The process of identifying the boundaries of a neighborhood depends in part on the definition of neighborhood that is appropriate to a particular planning initiative or study, i.e., social, physical, or political. Consequently, there is no one ideal way of defining a neighborhood and its spatial boundaries.
The effect of neighborhood conditions should be examined using several different ways of defining neighborhood boundaries: administrative (e.g., census tracts, ZIP codes), political (e.g., defined by associations, organizations), recognized (e.g., resident perception maps, cognitive mapping) and created (e.g., school neighborhoods) [33, 51–53]. In a recent series of North American and British studies, various political-administrative units of analysis were referred to generically as "neighborhoods". Such units have been the main way of studying the effect of neighborhood conditions on health. There is little in the public health literature suggesting that alternative methods for delineating neighborhood boundaries have been attempted.
Weiss and colleagues (IMPACT study) used a multi-step neighborhood definition process that included development of census block group maps, review of land use and census tract data, field visits and street-level observations. The 36 defined neighborhoods (3 in each of 12 NYC communities) range from 1 to 8 census block groups, with populations ranging from 2,252 to 11,503 (mean = 5,320). The authors report that using observation as part of the boundary definition process facilitates the identification and grouping of census block groups that have attributes consistent with the concept of "neighborhood" and with the study objectives. However, from the perspective of time and funding, they concluded that, although subjectivity cannot be eliminated, neighborhoods defined this way can be compared to block group combinations identified by cluster analyses of census data.
The researchers of the Project on Human Development in Chicago Neighborhoods (PHDCN) defined neighborhoods by direct aggregation of census tracts, considering contiguity and some sociodemographic census indicators (e.g., racial-ethnic composition). Sampson and colleagues collapsed 847 census tracts in the city of Chicago to form 343 neighborhood clusters, ecological units of about 8,000 people each, large enough to approximate local neighborhoods while respecting geographic boundaries and knowledge of Chicago's neighborhoods.
Although we did not evaluate their symbolic significance to residents, the local neighborhoods demarcated here were conceptualized with a goal similar to that of the two studies described above [54, 55]: geographic units of limited size, with relative homogeneity in housing and population, as well as some level of social interaction. This study's methodology has some expected advantages, since it uses available databases, the free TerraView software, and simple tools for visual overlay analysis. These characteristics allow researchers to conduct studies with less funding and to deal with the complexities of large urban settlements as diverse as Rio de Janeiro city.
In Brazil, no published study has used neighborhood demarcations other than political-administrative boundaries. Only one proposal for spatial partitioning using cluster analysis has been published, in 1996 by Carvalho et al., applied to an island in the municipality of Rio de Janeiro.
It is hoped that the use of local neighborhood spatial units of analysis in studies on the properties of contextual characteristics, like those that have been developed in the PHDCN, can be implemented in many Brazilian cities. Socioeconomic characteristics of neighborhoods, like income, schooling, age composition, racial/ethnic composition, and indices of inequality, poverty, and affluence, are associated with various health outcomes, including self-rated health and lifestyle. Signs of physical disorder reflect the deterioration of urban space and are associated with worse health conditions [57, 58].
We particularly hope to further the study of psychosocial characteristics in the local neighborhood context and their role in the determination of health outcomes. Characteristics of the social setting like cohesion and social control, establishment of networks, organizations, and prevalent lifestyle can promote or jeopardize health. The differential capacity of neighborhoods to reinforce residents' common values and maintain effective social control explains the variations in violence rates in Chicago that are not attributable only to aggregate individual demographic characteristics. Collective efficacy (the combination of mutual trust and intention to intervene for the common good) acts as a mediator of the effects of socioeconomic stratification on violence. Informal social control and collective efficacy can also be generalized to a series of important objectives for the well-being of neighborhood populations.
In parallel with this study, another important step for the development of neighborhood and health approaches in Brazil, especially in Rio de Janeiro, is the enhancement of data georeferencing capacity for health events through precise localization of addresses in census tract and local neighborhood spatial units. Currently, most of the information published on health events only reaches the administrative neighborhood or administrative district level. Isolated initiatives require great effort for georeferencing at smaller levels, which limits the availability of health data to specific studies. This situation should change in the coming years, since the National Census Bureau (IBGE) is consolidating a street registry comprising all the census tracts of municipalities with more than 100 thousand inhabitants and has announced that it will make the registry available shortly for use in a system to locate addresses by census tract.
The influence of social processes on health is increasingly clear, and it does not suffice to merely shape a population cluster if its spatial unit of analysis fails to capture the social processes taking place between a population and its place of residence.
When studying the properties of local neighborhood spatial units, one should not lose sight of their place on macro-determinant scales. As shown in Figure 10, according to the currently proposed method, population groups were clustered in local neighborhood spatial units that are nested in administrative neighborhoods. Administrative neighborhoods, in turn, are nested in a continuum of hierarchical levels up to global levels. The multiplicity of different levels can be relevant for some research questions. Super-groups represent division of the municipal territory in major groups of local neighborhoods that express other possibilities for aggregation in which spatial contiguity is not important. The specification of relevant levels for given studies is one of the theoretical definitions that precede data collection and statistical analyses. The local neighborhood is thus only one of the scales, the one closest to the local level, but it is not always the most appropriate, and it is especially not the only one to contribute to the contextual effects on health.
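The nesting described above (census tracts within local neighborhoods within administrative neighborhoods) is a property that can be checked mechanically before any multilevel analysis. The following is a minimal sketch of such a check; the unit identifiers are hypothetical and the function name is introduced here only for illustration.

```python
from collections import defaultdict

# Hypothetical assignments: census tract -> (local neighborhood, administrative neighborhood).
tract_units = {
    "t1": ("LN-01", "Admin-A"),
    "t2": ("LN-01", "Admin-A"),
    "t3": ("LN-02", "Admin-A"),
    "t4": ("LN-03", "Admin-B"),
}

def nesting_violations(assignments):
    """Return local neighborhoods that span more than one administrative neighborhood."""
    parents = defaultdict(set)
    for local, admin in assignments.values():
        parents[local].add(admin)
    return {
        local: sorted(admins)
        for local, admins in parents.items()
        if len(admins) > 1
    }

# An empty result means every local neighborhood is strictly nested.
violations = nesting_violations(tract_units)

# A deliberately broken assignment: LN-03 now straddles two administrative neighborhoods.
broken = dict(tract_units, t5=("LN-03", "Admin-A"))
broken_violations = nesting_violations(broken)
```

The same check can be repeated for each higher level of the hierarchy, for example administrative neighborhoods within boroughs.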
We hope that it will be possible to evaluate the adequacy of local neighborhood spatial units proposed for health investigations in order to study different health events, such as violence, communicable diseases, and mortality.
We agree with Diez-Roux et al. that epidemiology is very sophisticated at measuring characteristics at the individual level, but not as sophisticated at measuring patterns in ecological sets. This seriously affects our capacity to examine contextual effects. In the current study, we present an approach that minimizes the problems related to residential heterogeneity between areas and maximizes the possibility to identify contextual characteristics permeating social processes within local neighborhoods.
The proposed method for demarcating local neighborhoods is structured, relies on available data and open source computer programs, and can be easily reproduced in other cities, both in Brazil and abroad. We therefore hope that it will allow progress in studies of intra-urban social differentials in the residential context and their implications for various health outcomes.
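One simple way to quantify how much heterogeneity a partition leaves inside its units is the mean within-unit variance of an indicator. The sketch below compares a hypothetical administrative neighborhood with a hypothetical split into two local neighborhoods; the income values and unit names are invented for illustration, and weighting by number of tracts (rather than by population) is a simplifying assumption.

```python
from statistics import pvariance

# Hypothetical mean income per census tract (arbitrary units).
income = {"t1": 1.0, "t2": 1.2, "t3": 5.0, "t4": 5.3}

# The same four tracts grouped at two different scales.
admin_partition = {"Admin-A": ["t1", "t2", "t3", "t4"]}
local_partition = {"LN-01": ["t1", "t2"], "LN-02": ["t3", "t4"]}

def mean_within_variance(partition, values):
    """Average of the variance inside each unit, weighted by number of tracts."""
    total = sum(len(members) for members in partition.values())
    weighted = sum(
        len(members) * pvariance([values[t] for t in members])
        for members in partition.values()
    )
    return weighted / total

admin_var = mean_within_variance(admin_partition, income)
local_var = mean_within_variance(local_partition, income)
```

A lower value for the local partition indicates that its units are internally more homogeneous on that indicator than the administrative unit they subdivide.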
We emphasize that there is not just one way of demarcating neighborhoods. The proposed local neighborhood method is one of the possible ways of differentiating intra-urban space. Using this method, it was possible to construct spatial units that integrate populations with similar profiles and that are geographically proximate. This approach can be used and adapted to different constructs, depending on the study problem and underlying theoretical model. In this case, various parameters can be altered, like the minimum population size and the target indicators.
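To make the adjustable parameters concrete, the sketch below shows a simplified, greedy form of contiguity-constrained regionalization in the spirit of minimum-spanning-tree methods such as SKATER. It is not the TerraView implementation and does not use the SKATER objective function: it merely cuts the most dissimilar tree edges first and keeps a cut only if every resulting region still reaches the minimum population. All tract names, indicator values, and populations are hypothetical.

```python
from math import dist  # Python 3.8+

def minimum_spanning_tree(nodes, edges, values):
    """Kruskal's algorithm over contiguity edges weighted by indicator dissimilarity."""
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    tree = []
    for a, b in sorted(edges, key=lambda e: dist(values[e[0]], values[e[1]])):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b
            tree.append((a, b))
    return tree

def components(nodes, tree):
    """Connected components of the forest formed by the remaining tree edges."""
    adjacency = {n: set() for n in nodes}
    for a, b in tree:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, groups = set(), []
    for start in nodes:
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            current = stack.pop()
            if current not in group:
                group.add(current)
                stack.extend(adjacency[current] - group)
        seen |= group
        groups.append(group)
    return groups

def regionalize(nodes, edges, values, population, min_population):
    """Cut the most dissimilar tree edges, keeping each region above min_population."""
    tree = minimum_spanning_tree(nodes, edges, values)
    by_dissimilarity = sorted(
        tree, key=lambda e: dist(values[e[0]], values[e[1]]), reverse=True
    )
    for edge in by_dissimilarity:
        trial = [e for e in tree if e != edge]
        sizes = [sum(population[n] for n in g) for g in components(nodes, trial)]
        if all(size >= min_population for size in sizes):
            tree = trial
    return components(nodes, tree)

# Hypothetical chain of six contiguous tracts with one standardized indicator each.
nodes = ["t1", "t2", "t3", "t4", "t5", "t6"]
edges = [("t1", "t2"), ("t2", "t3"), ("t3", "t4"), ("t4", "t5"), ("t5", "t6")]
values = {"t1": (1.0,), "t2": (1.1,), "t3": (0.9,),
          "t4": (5.0,), "t5": (5.2,), "t6": (4.9,)}
population = {n: 3000 for n in nodes}

regions = regionalize(nodes, edges, values, population, min_population=5000)
```

With a minimum of 5,000 people the six tracts split into two regions of three tracts each; raising `min_population` to 10,000 leaves them as a single region. Running the procedure separately within each administrative neighborhood would be one way to respect the administrative boundaries described in the study.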
Schwartz S: The fallacy of the ecological fallacy: the potential misuse of a concept and its consequences. Am J Public Health. 1994, 84: 819-824. 10.2105/AJPH.84.5.819.
Susser M: The logic in ecological: I. the logic of analysis. Am J Public Health. 1994, 84: 825-835. 10.2105/AJPH.84.5.825.
Pearce N: Traditional epidemiology, modern epidemiology, and public health. Am J Public Health. 1996, 86 (5): 678-683. 10.2105/AJPH.86.5.678.
Castellanos PL: Epidemiologia, Saúde pública, Situação de Saúde e Condições de Vida. Considerações Conceituais. Condições de vida e situação em saúde. 1997, (Org.) Barata RB. Rio de Janeiro, ABRASCO, 31-76.
Evans RG, Barer ML, Marmor TR: Why Are Some People Healthy & Others Not?: Determinants of Health of Populations. 1994, Aldine de Gruyter.
Krieger N: Epidemiology and the web of causation: has anyone seen the spider?. Social Science and Medicine. 1994, 39 (7): 889-903.
Diez-Roux A: Bringing context back into epidemiology: variables and fallacies in multilevel analysis. Am J Public Health. 1998, 88 (2): 216-222. 10.2105/AJPH.88.2.216.
Rose G: Individuos enfermos y poblaciones enfermas. Bol Epidemiol OPS. 1985, 6 (3): 1-5.
Barcellos C, Sabroza PC, Peiter P, Rojas LI: Organização espacial, saúde e qualidade de vida: A análise espacial e o uso de indicadores na avaliação de situações de saúde. Informe Epidemiológico do SUS. 2002, 11 (3): 129-138.
Diez-Roux A: Multilevel analysis in public health research. Annu Rev Public Health. 2000, 21: 171-192. 10.1146/annurev.publhealth.21.1.171.
Sampson RJ: The neighborhood context of well-being. Perspectives in biology and Medicine. 2003, 46 (3): S53-S64. 10.1353/pbm.2003.0059.
Kaplan GA: What's Wrong with Social Epidemiology, and How Can We Make It Better?. Epidemiologic Reviews. 2004, 26: 124-135. 10.1093/epirev/mxh010.
Cummins S, Macintyre S, Davidson S, Ellaway A: Measuring neighborhood social and material context: generation and interpretation of ecological data from routine and non-routine data sources. Health and Place. 2005, 11: 249-260. 10.1016/j.healthplace.2004.05.003.
Carvalho M, Souza-Santos R: Analysis of spatial data in public health: methods, problems, and perspectives. (in Portuguese). Cad Saúde Pública. 2005, 21 (2): 361-378. 10.1590/S0102-311X2005000200003.
Szwarcwald CL, Bastos FI, Esteves MAP, Andrade CLT, Paez MS, Medici EV, Derrico M: Income inequality and health: the case of Rio de Janeiro. (in Portuguese). Cad Saúde Pública. 1999, 15 (1): 15-28. 10.1590/S0102-311X1999000100003.
Drumond M, Barros MBA: Social inequalities in adult mortality in the City of S. Paulo. (in Portuguese). Rev Bras Epidemiol. 1999, 2 (1/2): 34-39. 10.1590/S1415-790X1999000100004.
Santos SM, Barcellos C, Carvalho MS: Ecological analysis of the distribution and socio-spatial context of homicides in Porto Alegre, Brazil. Health and Place. 2006, 12: 38-47. 10.1016/j.healthplace.2004.08.009.
Briggs DJ, Elliot P: The use of geographical information system on environment and health. World Health Stat Q. 1995, 48 (2): 85-94.
Barcellos C, Ramalho WM, Gracie R, Magalhães MAFM, Fontes MP, Skaba D: Geocoding health data in sub-municipal scale: some Brazilian experiences. (in Portuguese). Epidemiol Serv Saúde. 2008, 17 (1): 59-70.
Barcellos C, Santos SM: Colocando os dados no mapa: a escolha da unidade espacial de agregação e integração de bases de dados em saúde e ambiente através do geoprocessamento. Informe Epidemiológico do SUS. 1997, 6 (1): 1-29.
Santos SM, Chor D, Werneck GL, Coutinho ES: Association between contextual factors and self-rated health: a systematic review of multilevel studies. (in Portuguese). Cad Saúde Pública. 2007, 23 (11): 2533-2554. 10.1590/S0102-311X2007001100002.
Buka SL, Brennan RT, Rich-Edwards JW, Raudenbush SW, Earls F: Neighborhood Support and the Birth Weight of Urban Infants. Am J Epidemiol. 2003, 157 (1): 1-8. 10.1093/aje/kwf170.
Acevedo-Garcia D: Zip Code-Level Risk Factors for Tuberculosis: Neighborhood Environment and Residential Segregation in New Jersey 1985-1992. Am J Public Health. 2001, 91 (5): 734-741. 10.2105/AJPH.91.5.734.
Zierler S, Krieger N, Tang Y, Coady W, Siegfried E, DeMaria A, Auerbach J: Economic Deprivation and AIDS Incidence in Massachusetts. Am J Public Health. 2000, 90 (7): 1064-73. 10.2105/AJPH.90.7.1064.
Ellaway A, Anderson A, Macintyre S: Does Area of Residence Affect Body Size and Shape?. Int J Obesity and Related Metabolic Disorders. 1997, 21 (4): 304-81. 10.1038/sj.ijo.0800405.
Lynch J, Kaplan G: Socioeconomic Position. Social Epidemiology. Edited by: Berkman L, Kawachi I. 2000, New York: Oxford University Press, 13-35. 1
Diez-Roux A: Investigating neighborhood and area effects on health. Am J Public Health. 2001, 91 (11): 1783-1789. 10.2105/AJPH.91.11.1783.
Pickett KE, Pearl M: Multilevel analyses of neighborhood socioeconomic context and health outcomes: a critical review. J Epidemiol Community Health. 2001, 55: 111-122. 10.1136/jech.55.2.111.
Raudenbush SW, Sampson RJ: Ecometrics: Toward a Science of Assessing Ecological Settings, with Application to the Systematic Social Observation of Neighborhoods. 1997, Working paper, presented at the annual meeting of the American Society of Criminology, San Diego
Tassinari WS, Leon AP, Werneck G, Faerstein E, Lopes CS, Chór D, Nadanovsky P: Socioeconomic context and perceived oral health in an adult population in Rio de Janeiro Brazil a multilevel analysis. (in Portuguese). Cad Saúde pública. 2007, 23 (1): 127-136. 10.1590/S0102-311X2007000100014.
Keller (1968): apud Chaskin, 1997.
Sampson RJ, Morenoff JD, Gannon-Rowley T: Assessing "Neighborhood Effects": Social Processes and New Directions in Research. Annu Rev Sociol. 2002, 28: 443-78. 10.1146/annurev.soc.28.110601.141114.
Chaskin RJ: Perspectives on neighborhood and community: a review of the literature. Social Service Review. 1997, 71 (4): 521-47. 10.1086/604277.
Coulton C, Cook T, Irwin M: Aggregation issues in neighborhood research: A comparison of several levels of census geography and resident defined neighborhoods. Making Connections initiative Working Paper. Cleveland. 2004, 24-
Abreu MA: Evolução urbana no Rio de Janeiro. 1988, Rio de Janeiro: Jorge Zahar, 2
IBGE: Censo demográfico do Município do Rio de Janeiro. 2000, Rio de Janeiro: Instituto Brasileiro de Geografia e Estatística
Duque JC, Ramos R, Suriñach J: Supervised regionalization Methods: a survey. International Regional Science Review. 2007, 30: 195-220. 10.1177/0160017607301605.
Assunção RM, Neves MC, Câmara G, Costa Freitas C: Efficient regionalization techniques for socio-economic geographical units using minimum spanning trees. Int J Geographical Information Science. 2006, 20 (7): 797-811. 10.1080/13658810600665111.
TerraView 3.1.4. Aplicativo da biblioteca de geoprocessamento TerraLib. Divisão de processamento de Imagens do Instituto Nacional de Pesquisas Espaciais - DPI/INPE. 2007, http://www.dpi.inpe.br/terraview_eng/index.php
Openshaw S: A geographical solution to scale and aggregation problems in region-building, partitioning and spatial modeling. Transactions of the Institute of British Geographers (New Series). 1977, 2: 459-472.
Silva NA, Matzenbacher LA, Cortez BF: Processamento de Áreas de Expansão e Disseminação da Amostra do Censo Demográfico 2000. Textos para discussão. Diretoria de Pesquisas, n.17. IBGE. Ministério do Planejamento, Orçamento e Gestão. Brasil. 2004
SPSS®: Statistical Package for Social Science. version 10.0.
Google Earth™: Version 4.2.0205.5730. compiled in November 2007
Carvalho MS, Cruz GO, Nobre FF: Spatial Partitioning Using Multivariate Cluster Analysis and a Contiguity Algorithm. Statistics in Medicine. 1996, 15: 1885-1894.
Santos SM, Noronha CP: Mortality spatial patterns and socioeconomic differences in the city of Rio de Janeiro. (in Portuguese). Cad Saúde Pública. 2001, 17 (5): 1099-110. 10.1590/S0102-311X2001000500012.
Vickers D, Rees P: Introducing the Area Classification of Output Areas. Population Trends. 2006, 125: 15-29.
Flowerdew R, Manley DJ, Sabel CE: Neighbourhood effects on health: Does it matter where you draw the boundaries?. Soc Sci Med. 2008, 66: 1241-1255. 10.1016/j.socscimed.2007.11.042.
Santos M, Silveira LM: O Brasil: território e sociedade no início do século XXI. 2001, Rio de Janeiro/São Paulo: Record
Zaluar AM: A desordem urbana e os antagonismos e acomodações entre sobrados e mucambos. Teoria & Sociedade. 2001, 8: 148-159.
Koga D: Medidas de Cidades: Entre Territórios de Vida e Territórios Vividos. 2003, São Paulo: Ed Cortez
Haynes R, Daras K, Reading R, Jones A: Modifiable neighborhood units zone design and residents' perceptions. Health & Place. 2007, 13 (4): 812-825.
Coulton C, Cook T, Irwin M: Aggregation issues in neighborhood research: A comparison of census geography and resident defined neighborhoods. http://digitalcase.case.edu:9000/fedora/get/ksl:2006052511/Cook-Agression-2004.pdf
Zhang X, Christoffel KK, Mason M, Lin L: Identification of contrastive and comparable school neighborhoods for childhood obesity and physical activity research. Int J Health Geographics. 2006, 5: 14. 10.1186/1476-072X-5-14.
Weiss L, Ompad D, Galea S, Vlahov D: Defining Neighborhood Boundaries for Urban Health Research. Am J Prev Med. 2007, 32 (6): S154-S159. 10.1016/j.amepre.2007.02.034.
Sampson RJ, Raudenbush SW, Earls F: Neighborhoods and Violent Crime: A Multilevel Study of Collective Efficacy. Science. 1997, 277 (5328): 918-924. 10.1126/science.277.5328.918.
Wen M, Browning CR, Cagney KA: Poverty, affluence, and income inequality: neighborhood economic structure and its implications for health. Soc Sci Med. 2003, 57 (5): 843-60. 10.1016/S0277-9536(02)00457-4.
Sampson RJ, Raudenbush SW: Systematic social observation of public spaces: A new look at disorder in urban neighborhoods. Am J Sociology. 1999, 105 (3): 603-651. 10.1086/210356.
Cohen DA, Mason K, Bedimo A, Scribner R, Basolo V, Farley TA: Neighborhood physical conditions and health. Am J Public Health. 2003, 93 (3): 467-471. 10.2105/AJPH.93.3.467.
Skaba DA, Carvalho MS, Barcellos C, Martins PC, Terron SL: Geoprocessing of health data: treatment of information on addresses. (in Portuguese). Cad Saúde Pública. 2004, 20 (6): 1753-1756. 10.1590/S0102-311X2004000600037.
Cummins S: Commentary: investigating neighbourhood effects on health - avoiding the "Local trap". Int J Epidemiol. 2007, 21: 1-2. 10.1093/ije/dym033.
Diez-Roux A, Mujahid MS, Morenoff JD, Raghunathan T: Response to Invited Commentary: Mujahid et al. Respond to "Beyond the metrics for measuring Neighborhood Effects". Am J Epidemiol. 2007, 165 (8): 872-873. 10.1093/aje/kwm039.
The authors acknowledge researcher Renato Assunção for his kind explanations of the SKATER method, the inestimable contribution of the INPE development team, represented by Karine R. Ferreira and Antônio Miguel V. Monteiro, who accepted the challenge of developing regionalization demarcated by predefined larger areas, generating a beta version of TerraView that allowed the analyses presented in this article, and the exchange of ideas with colleagues from the group on Health Geography in Rio de Janeiro, represented by researcher Christovam Barcellos.
This study is part of research projects that receive funding from CAPES (0204056), CNPq (301689/2007-5, 150750/2009-9 and 500087/2010-5) and FAPERJ (E-26/100479/2007).
The authors declare that they have no competing interests.
SM Santos participated in the article's conceptualization and conducted the literature review, structured the database, analyzed and interpreted the compiled data, and wrote the article. D Chor and GL Werneck participated in the article's conceptualization and contributed to the analysis and interpretation of the results and helped write the article. All authors read and approved the final manuscript.