This study sought to evaluate the performance of an SSO instrument in the context of six Toronto neighbourhoods. This instrument was based on scales and constructs from prior studies largely generated in US settings. The factor analysis based on the quantitative data, taken together with the qualitative findings, raises important questions and concerns with respect to the 'transferability' of such constructs to the Toronto context.
Interpreting the Data
The qualitative component of this study contributes to the literature on SSO by acknowledging the inherently subjective nature of neighbourhood observations (and recognizing the positive contributions of such subjective data sources) - a consideration absent from the literature to date. The inclusion of qualitative methods in our study demonstrates the depth and richness of analysis that can be obtained by using a mixed-methods approach for SSOs. To our knowledge, this is one of the first studies to incorporate qualitative observational findings and researcher reflections into an investigation employing quantitative systematic observational tools. As such, it represents an important contribution to our understanding of neighbourhood evaluation. One British study (Morrow, 2000) has used qualitative methods to examine youth residents' self-reported perceptions of neighbourhood physical context and its impact on youth well-being; however, we use these techniques to tap raters' perceptions, not those of residents. Furthermore, Morrow does not attempt to tie her findings to observational data collected in the neighbourhoods under study.
Analysis of the qualitative data revealed three broad themes related to use of the SSO instrument in the field: (1) features of the BF being observed, (2) features of the BF relative to the neighbourhood, and (3) features of the rater's experience while observing the BF. Each of these themes poses further questions with regard to the utility of the SSO instrument in the Toronto context and challenges some of the assumptions upon which SSO research has been based to date. In particular, our qualitative findings urge us to question the meaning of the quantitative results with respect to the underlying social processes relating to neighbourhood disorder.
The first theme (features of the BF being observed) problematizes the 'objective' nature of this form of data collection. There were numerous instances where raters found it challenging to appropriately characterize BFs, and they feared misrepresenting BFs - and the implications of such misrepresentations when drawing comparisons between neighbourhoods. This may be a concern when examining urban settings which have low levels of severe disorder and substantial intra-neighbourhood variation, as was noted in our study. The quantitative findings are more readily interpretable in light of the complex nature of the data collection process. For example, the factor 'physical decay and disorder' was the only one which resembled the constructs generated in the US studies. The high prevalence of cigarette butts (noted by raters in all neighbourhoods) accounted for a significant proportion of this factor. The physical decay/disorder construct was further influenced by the high prevalence of ratings of 'poor/fair/deteriorated condition of public spaces' (with this rating assigned to 63.6% of BFs). The standardized instructions may have skewed these results in favour of a preponderance of 'fair' ratings: they indicated that any street, sidewalk, public transit stop, public parks or grounds, public schools or any non-private land should be marked in 'fair' condition if it showed irregular maintenance (including those with even small amounts of cracked concrete or paint or moderately overgrown vegetation) and overall the space was "in decent condition, but (rater) would recommend additional upkeep." Such instructions logically resulted in most raters ranking public spaces as being in fair condition. However, it is questionable whether this degree of disorder on its own would result in a negative experience for persons using the BF. It was discomfort with this type of rating that raters' comments reflected.
The second theme, addressing features of the BF relative to the surrounding neighbourhood, challenges commonly held assumptions regarding neighbourhood homogeneity as these relate to Toronto neighbourhoods. The qualitative findings underline the importance of heterogeneity within the neighbourhood and how this sense of heterogeneity may impact the overall impression of a specific area within a neighbourhood or the entire neighbourhood itself. The impact of such heterogeneity challenges the classic understanding of a neighbourhood, which presupposes certain levels of homogeneity within a specified bounded area. It may also provoke questions concerning the significance of the BF or smaller bounded communities within neighbourhoods in the presence of considerable neighbourhood variability. Certainly the quantitative findings reflected neighbourhood heterogeneity as well. The creation of the 'physical decay and disorder' scale revealed that low-income neighbourhoods were more likely to be characterized by greater levels of disorder/decay than middle/high-income neighbourhoods. However, the box-and-whisker plots indicate that considerable heterogeneity exists within each neighbourhood, regardless of income. Taken together, these mixed-method findings suggest that the impact of concentrated disorder (evident in smaller pockets within neighbourhoods) may be diluted when describing neighbourhoods more broadly, implying a notion of "covert disorder" in the Toronto setting.
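The dilution effect described above can be illustrated with a minimal sketch. The block-face scores, the 0-10 scale, the `pocket_threshold` cut-off and the `summarize` helper are all hypothetical and chosen purely for illustration; they are not data, scales or procedures from this study.

```python
from statistics import mean, median

# Hypothetical disorder scores (0 = none, 10 = severe) for ten block faces (BFs)
# in one illustrative neighbourhood. These are NOT data from the study.
bf_scores = [1, 1, 2, 1, 9, 8, 1, 2, 1, 1]


def summarize(scores, pocket_threshold=7):
    """Return neighbourhood-level summaries alongside BF-level 'pockets'.

    pocket_threshold is an assumed cut-off marking a BF as a high-disorder pocket.
    """
    pockets = [s for s in scores if s >= pocket_threshold]
    return {
        "mean": mean(scores),        # neighbourhood-level average
        "median": median(scores),    # typical BF
        "max": max(scores),          # most disordered BF
        "pocket_share": len(pockets) / len(scores),  # fraction of BFs that are pockets
    }


summary = summarize(bf_scores)
print(summary)
```

In this illustrative case the neighbourhood-level mean and median remain low even though two BFs score near the top of the assumed scale, which is the sense in which aggregate descriptions may conceal concentrated disorder.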
The third theme, examining features of the rater's experience while observing the BF, speaks to the access to information that would not otherwise have been obtained using a purely quantitative approach or other data collection methods, such as observations performed by driving through neighbourhoods [7, 21]. The fact that raters performed observations on foot, walking up and down a BF numerous times, offered residents the opportunity to interact with them, in turn yielding detailed narrative accounts. Even when raters did not engage with residents, their very presence on the BF provided them access to observations that might have escaped notice using other methods of data collection such as drive-by observation. This is because, as observers on foot, raters could observe in 360° over a longer period of time, since most observations required at least 30 minutes for completion.
Moreover, the narrative accounts of raters' experiences often revealed unanticipated information concerning the neighbourhoods. For example, raters suggested in group discussion that evidence of extreme social disorder was often fleeting to the outsider--erupting to the surface at intervals, but not always obvious at first glance. As well, information obtained through interaction and observation frequently challenged raters' preconceived notions of the BF or neighbourhood. The qualitative findings reinforce the importance of heterogeneity and covert disorder in explaining features of the neighbourhoods under study. By capturing the raters' experiences in a systematic way, the qualitative portion of our study was able to access yet another level of rich contextual information that supported and helped in interpreting our quantitative results.
The quantitative findings also provided new insights regarding SSO. The generation of the two distinct constructs from the resources meta-category is an original contribution to our understanding of neighbourhoods. The 'neighbourhood social accessibility' factor speaks to neighbourhoods as dynamic entities rather than static ones. It reflects the ease with which one can enter and leave a neighbourhood and - coupled with signs advertising social and cultural events as well as the presence of drinking establishments - suggests features of neighbourhoods that make them desirable places for residents and non-residents alike. The 'recreational opportunities' factor represents a related construct in that places to play or meet in public spaces without being overwhelmed by traffic (and its attendant congestion, parking difficulties, noise and pollution) might also prove appealing to residents. In a forthcoming study employing concept mapping (Sheppard et al.: "Are Canadians influenced by their urban neighbourhoods? Neighbourhood characteristics and their perceived impact on self-rated mental well-being," submitted), which asked residents for their perspectives on neighbourhoods and mental health, residents reported that pedestrian-friendly neighbourhoods, accessible by public transit or other means, with plenty of public services, places to meet and occasions to celebrate, contributed to their mental well-being. The quantitative findings in the present study suggest that such neighbourhood factors are 'observable' (physically quantifiable) and are important to residents.
Within the quantitative analysis, we were able to identify some differences between low- and high-income neighbourhoods on the factor 'physical decay and disorder' and on some individual items. Several items did not demonstrate significant differences between low- and high-income neighbourhoods, but achieved statistical significance upon stratification by individual neighbourhood. These variables (including resident reaction to raters, presence of public courtesies, graffiti, signs advertising cultural or social events, the presence of children, teenagers and adults, features of the built environment) may be features of neighbourhoods that are not necessarily linked to income, but are rather descriptors of the unique character of neighbourhoods. Conversely, several items were only statistically significant upon stratification by income (poor/fair condition of commercial buildings, buildings for sale or rent, residences protected by dogs). These items may be more strongly linked to income than neighbourhood, or there may be insufficient power to resolve them.
It was not possible to extract a model with an interpretable factor structure for the Social meta-category. Possible explanations include lack of power due to rare items or small sample size, or an artifact due to how the items were dichotomized. However, this may also reflect fundamental differences between the neighbourhoods included in the present investigation and the neighbourhoods that were used in the Chicago and Baltimore studies - meaning that the constructs of Territoriality and Social Disorder may not be applicable to Toronto.
Combining the quantitative and qualitative analyses raises a number of interesting points for discussion. With respect to the transportability of previously employed SSO tools into a Canadian context, it is fair to ask whether the notion of 'social disorder' is the most appropriate to this setting. In particular, the choice of variables included in the tool and how these were operationalized were of concern to raters. For example, raters were concerned by the limited ability of the SSO instrument to capture certain characteristics of the BF that they felt were important for the Toronto setting, such as the aesthetic appeal of a BF (e.g. the degree of order and diversity of land use), the rater's sense of personal safety, and their experiences during data collection. In contrast, as elicited during the group discussion, many raters felt that items within the SSO tool relating to extreme physical or social disorder were not as relevant to the study of Toronto neighbourhoods, but that these items accounted for a considerable proportion of the overall observation. Oreopoulos (2005) also posited differences in neighbourhood disorder as experienced by Canadian and US residents. Instead of focusing on disorder, the mixed-method findings in our study suggest that perhaps the notion of 'order' may be more pertinent in the context of Toronto neighbourhoods (but not 'order' conceived of as a mere corollary to that of 'disorder' premised in most of the SSO literature). We caution that the raters' reflections are those of a very select group of participants - academic researchers, not laymen. Nevertheless, the positive emphasis that all raters gave to diversity (as an appealing attribute at both the BF and neighbourhood level) challenges prevailing planning notions that stress uniformity. The qualitative data therefore help to highlight and reinforce specific challenges regarding the transfer and subsequent application of an SSO tool from one urban context to another.
In the case of Toronto, additional variables corresponding to more specific concepts of safety, aesthetic appeal and order, and heterogeneity might be considered in any future revision of the SSO instrument.
Anecdotally, it is not unusual for visitors to Toronto (or even agency representatives providing funding to low-income inner-city neighbourhoods) to ask "when are we going to get to the 'bad' neighbourhood?" As such, relative disorder (both physical and social) and decay are not always obvious to the casual observer. This is not to suggest that Toronto does not have its share of both physical and social disorder. For example, while few homeless people were observed during the study, we know that Toronto has many homeless residents. Rather, the tool failed to capture this facet of city life. If we are to use the concept of disorder (rather than order), perhaps a recognition of the concealed nature of disorder (particularly social disorder) is more reflective of the Toronto experience.
Taken together, our qualitative and quantitative findings compel us to interrogate the theoretical assumptions underlying social and physical disorder in a Toronto context. As such, we propose that alternative theoretical concepts might be more relevant to this setting given the complexity of the phenomena under investigation. For example, the finding of considerable heterogeneity and the discussions amongst raters concerning 'covert disorder' might suggest that residents perceive their local BF or immediate surroundings as more representative of their functional neighbourhood. From this perspective, the concept of smaller functional geospatial communities bounded within traditionally defined neighbourhoods may very well have a substantial impact on any conceptual framework attempting to describe how neighbourhoods affect health - particularly as they relate to health in Toronto or similar urban settings.
This observational study (like others before it) chose to focus on the facades of BFs as units for observation. Given the questions raised here regarding the nature of disorder, it is fair to ask whether we are looking for disorder in the 'right' places. Perhaps we should be sampling the considerable network of 'back alleys' that are a staple of Toronto's inner-city neighbourhoods (a suggestion offered by raters during the group discussion). The data from our forthcoming concept mapping study (Sheppard et al.: "Are Canadians influenced by their urban neighbourhoods? Neighbourhood characteristics and their perceived impact on self-rated mental well-being," submitted) suggest that residents of apartment buildings (particularly high-rise buildings) regard the internal spaces between apartments (lobbies, elevators, common areas) as important features of their 'neighbourhood'. Perhaps we need to adapt a tool that will capture both 'internal' and 'external' neighbourhood characteristics (and the relative order or disorder therein). What is the 'appropriate' level of observation when evaluating neighbourhoods?
As noted already, this study has a number of important limitations. We were unable to generate factors for the social meta-category, with potential explanations including: a lack of power (secondary to either low prevalence or small sample size), artifact from dichotomizing the variables, or a fundamental problem with the construct of social disorder underlying the adapted scales (adapted from US and UK contexts).
Another important question that we cannot answer by this investigation relates to linkages between the constructs generated by SSO and health. This study rather represents an initial step in understanding the utility and applicability of these tools in different contexts. As such, our findings will assist health researchers in interpreting the findings they acquire when using these measures. It is important to note that the nature of the relationship between neighbourhoods and health has been difficult to delineate and such linkages are no doubt complex. Given the increasing interest in this area of research and the use of SSO tools, improved understanding of the instruments themselves (evaluating both their strengths and limitations) is a valuable contribution. For example, if we take the prevalence of homelessness and relate it to health, what are the attributes at the individual level that contribute to vulnerability, and what aspects of place make individuals vulnerable? These are questions which will only be answered by studies employing a variety of methodologies (observation, concept mapping, surveys, interviews, document analysis, policy analysis, etc.), such as Klinenberg's approach of 'social autopsy'.
Our employment of mixed quantitative and qualitative methods represents a unique contribution to the field of neighbourhoods-and-health research. The qualitative findings enhance our understanding of the quantitative data and analyses, but also add new and important information regarding raters' experiences in conducting such research. Both forms of data (and their interpretation) contribute to our understanding of neighbourhood-level characteristics and both pose important questions regarding the best ways to characterize neighbourhoods.