Unpacking analyses relying on area-based data: are the assumptions supportable?
© Glover et al; licensee BioMed Central Ltd. 2004
Received: 07 October 2004
Accepted: 09 December 2004
Published: 09 December 2004
Because the major Australian administrative health record collections lack a direct measure of the socioeconomic status of the individual about whom an event is recorded, analysis of the association between health status, use of health services and socioeconomic status of the population relies on an area-based measure of socioeconomic status.
This paper explores the reliability of the area of address (at the levels typically available in administrative data collections) as a proxy measure for socioeconomic disadvantage. The Western Australian Data Linkage System was used to show the extent to which hospital inpatient separation rates for residents of Perth vary by socioeconomic status of area of residence, when calculated at various levels of aggregation of area, from smallest (Census Collection District) to largest (postcode areas and Statistical Local Areas). Results are also provided of the reliability, over time, of the address as a measure of socioeconomic status.
There is a strong association between the socioeconomic status of the usual address of hospital inpatients at the smallest level in Perth, and weaker associations when the data are aggregated to larger areas. The analysis also shows that a higher proportion of people from the most disadvantaged areas are admitted to hospital than from the most well-off areas (13% more), and that these areas have more separations overall (47% more), as a result of larger numbers of multiple admissions.
Of people admitted to hospital more than once in a five year period, four out of five had not moved address by the time of their second episode. Of those who moved, the most movement was within, or between, areas of similar socioeconomic status, with people from the most well off areas being the least likely to have moved.
Postcode level and SLA level data provide a reliable, although understated, indication of socioeconomic disadvantage of area. The majority of Perth residents admitted to hospital in Western Australia had the same address when admitted again within five years. Of those who moved address, the majority had moved within, or between, areas of similar socioeconomic status.
Access to data about individuals from the Western Australian Data Linkage System shows that more people from disadvantaged areas are admitted to a hospital, and that they have more episodes of hospitalisation. Were data to be available across Australia on a similar basis, it would be possible to undertake research of greater policy-relevance than is currently possible with the existing separations-based national database.
The majority of work in Australia describing the association between the health status, use of health services and socioeconomic status of the population uses an area-based measure of socioeconomic status. It is necessary to use a proxy measure (the socioeconomic status of the population in the area) because there is no direct measure in the major administrative health record collections of the socioeconomic status of the individual about whom the event is recorded.
However, the application of an area-based measure requires a number of assumptions, including that people who move do so between, or within, geographic areas of similar socioeconomic status; and that the (often large) areas used in these analyses provide a reliable indication of the socioeconomic status and health service utilisation of the individuals in the area about whom the event is recorded. Area level socioeconomic status can also be considered as an independent predictor. For example, an individual with low socioeconomic status in an area of higher socioeconomic status is more likely to have better health outcomes than their counterpart in an area of lower socioeconomic status [1, 2]. This aspect is not addressed in this paper.
In relation to this latter point, Hyndman et al [3] found that "Misclassification of individuals to SES groups based on the basis of postcode caused an underestimation of the true relationship between SES and health-related measures. A reduction of this misclassification by using smaller spatial areas, such as CD or census enumeration districts, will provide improved validity in estimating the true relationship." A reduction in strength of correlation with increasing size of area is consistent with the results of this paper. In a study of hospitalisations in Michigan, USA, Hofer et al [4] found that 'the impact of socioeconomic characteristics on hospitalization rates is consistent when measured by individual or community-level measures'. This is an encouraging finding for those limited to using area-based data.
Another limitation of the majority of Australian health-related datasets is that they record events (e.g. hospital inpatient separations, services by general medical practitioners), rather than individuals.
The analysis in this report uses the Western Australian Data Linkage System to explore the reliability of area data as a proxy for socioeconomic disadvantage when analysed for the relatively large geographic units often used in health-related research: it also addresses the limitations of using data about events, rather than individuals. It does this by examining the extent to which hospital inpatient separation rates vary, both overall and by socioeconomic status of area of residence, when calculated at various levels of aggregation, from Census Collection District (CD) – the smallest area level for which a measure of socioeconomic status is available – to the larger units of postcode and Statistical Local Area (SLA). Methods applied include the calculation of correlation coefficients and examination of hospital separation rates by quintile of socioeconomic disadvantage of area, separately for events and individuals.
The report also examines the reliability of the socioeconomic status of the address over time, by examining the extent of change in socioeconomic status of area of residence for individuals with repeat hospital episodes over a five year period.
The analysis shows that aggregating data to larger areas reduces the gap between the index scores for the most disadvantaged and least disadvantaged areas, with the greatest impact on the scores for the most disadvantaged areas. This results in an understatement of the extent of disadvantage in the most disadvantaged areas, as well as an understatement of inequality between the most well off and the poorest areas.
Over the five years from 1994 to 1998, a total of 358 948 residents of Perth were admitted to a hospital in Western Australia on one or more occasion, an average of 71 750 individuals admitted per annum. Just over half (53.6%) the individuals admitted were females; 46.4% were males.
Perth residents admitted to hospital, by age and sex, at first admission, 1994–98 (rate per 1000)
Perth residents admitted to hospital, by number of admissions (including two or more admissions) and year of separation, 1994–98
Residents of Perth admitted to hospital, 1994–1998, by number of admissions per person
Females accounted for just over half (53.6%) of those admitted once, compared with 59.7% of those admitted more than once. For males, the proportions were 46.4% and 40.3%, respectively.
There were 1 665 308 separations of Perth residents from Western Australian hospitals, an average of 2.53 separations per person admitted over the five years from 1994 to 1998. Over half (55.1%) of the separations were of females and 44.9% were of males.
The main differences in the profiles of male and female separations are evident at the youngest ages (higher proportions of males), from ages 20 to 44 years (higher proportions of females) and from 50 to 79 years (higher proportions of males). The ages at which the highest rates of admissions of individuals and of multiple admissions (the gap between the separations and admitted profiles) occur are clearly visible in the chart.
Separations of Perth residents, by age and sex, 1994–98 (rate per 1000)
Effect of aggregation of areas on disadvantage scores
As noted, the majority of the analysis by socioeconomic status undertaken in the health sector in Australia is area based, and uses the postcode or SLA as the unit of analysis. This raises the question of the extent to which area based analyses at the postcode or SLA level provide a reliable indication of the socioeconomic status and health service utilisation of the individuals admitted. This report explores the reliability of postcode or SLA level data by examining the extent to which rates of individuals admitted and separations vary when calculated at various levels of aggregation (CD, postcode and SLA). Ideally, the comparison would be between the socioeconomic status of individuals and of areas; however, the smallest area level for which a measure of socioeconomic status is available is the CD.
Range of IRSD scores for area of address of individuals and separations, by area level from Collection District (1) to Statistical Local Area (3), with the median for individuals and the ratio of IRSD scores in area (3) to area (1)
Thus, the use of larger area aggregates reduces the gap between the index scores for the most disadvantaged and least disadvantaged areas, lessening the apparent extent of inequality between these areas. The impact is greatest on the scores for the most disadvantaged areas, so the extent of disadvantage in these areas is understated. Notably, the postcode and SLA differ much less from each other, both in the range of scores (maximum minus minimum) and in their absolute level.
Spearman correlation coefficients between IRSD of address for individuals (at first discharge) and area level of first discharge, for individuals with more than one separation and for those with more than one separation who moved address
Effect of aggregation of areas on separation rates
Residents of Perth admitted to hospital, 1994–1998, by socioeconomic disadvantage of area (Q1: least disadvantaged to Q5: most disadvantaged) for selected area levels: numbers (1 665 308 separations in total at each area level), rates per 100 000 population, and rate ratios (ratio of the rate in Q5 to the rate in Q1)
When data are aggregated to postcode area or SLA, the differentials in separation rates between Quintile 5 and Quintile 1 areas are smaller (rate ratios of 1.23 and 1.20, respectively) than at the CD level (a rate ratio of 1.47) (Table 7). In the case of postcodes, this is largely because of the lower separation rate in Quintile 5 areas (likely to be a result of the process of aggregating CDs). For SLAs, it is a combination of a lower separation rate in Quintile 5 areas and a higher rate in Quintile 1 areas (likely to be a result of the aggregation process, exacerbated by the variable size of SLAs; see the section titled 'Methods, Area' under 'Methods'). The differential in rates of individuals admitted is the same for data at the SLA and CD level, but higher for postcode areas. These results again reflect the difficulty inherent in producing groups of approximately equal populations.
Number of separations per individual, by socioeconomic disadvantage of area, Perth residents, 1994–1998 (separations per person)
Separations per individual, by socioeconomic disadvantage of area, Perth residents, 1994–1998 (separations per person: rate per 100 000 population, ratio of rates in Q5/Q1, and average admissions per person with two or more admissions)
The average number of admissions per person for people admitted to hospital on more than one occasion over the five years to 1998 was 4.4; this varied from 4.2 separations per person admitted in the least disadvantaged areas to 4.7 in the most disadvantaged areas.
Reliability over time of address as a proxy for socioeconomic status
Studies using the address of usual residence as a proxy for socioeconomic status require two important assumptions. They are that:
• people who move do so within, or between, areas of similar socioeconomic status; and that
• the areas used in an area based analysis (which can vary in size and are quite often large) provide a reliable indication as to the socioeconomic status and use of health services of the individuals in the area.
Data from the 1996 Census show that 53.5% of Perth's population reported a different address from that at the previous Census, five years earlier. Data were not available to compare the IRSD of the first and last SLA of address of the Perth population who moved. However, almost one quarter (24.0%) of Perth residents who moved between the 1991 and 1996 Censuses moved to an address within the same SLA. That is, some 59.3% of the population were in the same SLA after five years (the 46.5% who did not move plus the 12.8% who moved within their SLA). This is an encouraging statistic for area based analyses.
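The 59.3% figure follows directly from the two Census shares quoted in this paragraph. A minimal sketch of the arithmetic, with illustrative variable names that are not drawn from the source data:

```python
# Shares reported from the 1996 Census (per cent of Perth's population).
moved_pct = 53.5              # different address from five years earlier
moved_within_sla_pct = 24.0   # of those who moved, share staying in the same SLA

did_not_move_pct = 100.0 - moved_pct
movers_same_sla_pct = moved_pct * moved_within_sla_pct / 100.0
same_sla_pct = did_not_move_pct + movers_same_sla_pct

print(f"did not move:              {did_not_move_pct:.1f}%")
print(f"moved within the same SLA: {movers_same_sla_pct:.1f}%")
print(f"in the same SLA overall:   {same_sla_pct:.1f}%")
```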
Similarly, almost four out of five people admitted to hospital more than once in a five year period had not moved (out of the CD of their address at the first separation) by the time of their second separation. Of the 298 809 people admitted to a Perth hospital more than once over the five year period 1994 to 1998, over three quarters (78.6%, or 234 734 people) had the same address at the time of the second separation; the remaining 64 075 people (21.4%) had moved. People were recorded as having 'moved' if the CD of their address changed between the first and last separation over the period from 1994 to 1998. Movement to a different address within a CD was not included.
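The 'moved' rule described here (a change of CD between a person's first and last separation) can be sketched as follows. The record layout, identifiers and CD codes are hypothetical and are not the structure of the Hospital Morbidity Database; the sketch only illustrates the rule.

```python
from datetime import date

# Hypothetical linked separation records: (person_id, separation_date, cd_code).
separations = [
    ("p1", date(1994, 3, 1), "CD001"),
    ("p1", date(1996, 7, 9), "CD001"),
    ("p2", date(1995, 1, 20), "CD014"),
    ("p2", date(1997, 5, 2), "CD230"),
    ("p3", date(1998, 2, 11), "CD014"),
]

def moved_flags(records):
    """Return {person_id: True/False} for people with more than one separation.

    A person counts as having moved when the CD of the last separation
    differs from the CD of the first; moves within a CD are not detected.
    """
    by_person = {}
    for person_id, sep_date, cd in records:
        by_person.setdefault(person_id, []).append((sep_date, cd))
    flags = {}
    for person_id, episodes in by_person.items():
        if len(episodes) < 2:
            continue  # single admissions fall outside this part of the analysis
        episodes.sort()  # order episodes by separation date
        flags[person_id] = episodes[0][1] != episodes[-1][1]
    return flags

flags = moved_flags(separations)
print(flags)
```

In this toy data, p1 stays in the same CD, p2 changes CD, and p3 is excluded because only one separation is recorded.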
The following table illustrates, for people with multiple admissions, the extent of movement by socioeconomic status. For this part of the analysis, the CD of first and last separation have been allocated to quintiles of socioeconomic disadvantage of area, to provide a comparison of the extent of movement between different levels of socioeconomic status. The construction of the quintiles is described in the section titled 'Methods, Measurement of socioeconomic status' under 'Methods.'
Residents of Perth admitted to hospital more than once, 1994–1998, who changed address, by socioeconomic disadvantage of area (CD of first separation by CD of last separation, %)
• people from the most well off areas are less likely to have moved to areas of greatly different socioeconomic status (ie, changed quintiles) than are those from the most disadvantaged areas – 40.2% of people in the most advantaged areas (Quintile 1) remained there, despite moving from the CD of their first separation. The proportion in the most disadvantaged (Quintile 5) areas was a lower 30.5%;
• while there is movement right across the socioeconomic profile, most movement is between adjacent quintiles. For example, of the 18 875 people who lived in the most disadvantaged areas at their first separation (and moved before a subsequent admission), 71.2% had moved to a CD in the same or next ranked quintile (Quintiles 5 or 4), with just 4.6% moving to the most advantaged areas. Similarly, of the 9 537 people in the most well off areas at their first separation, 63.0% had moved to a CD in the same or next ranked quintile (Quintiles 1 or 2), with a similarly low proportion (4.7%) moving to the most disadvantaged areas;
• the most substantial movement between quintiles was of people moving from an address rated as Quintile 5 to one rated as Quintile 4 (40.7%); this was marginally higher than the proportions moving within Quintiles 4 or 1 (40.3% and 40.2%, respectively).
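Percentages of this kind come from a cross-tabulation of quintile at first separation against quintile at last separation. The sketch below shows one way such row percentages and 'same or adjacent quintile' shares could be computed; the counts are invented for illustration and are not the values in the paper's table.

```python
# Invented counts of movers: rows are the quintile of the CD at first
# separation (Q1..Q5), columns the quintile of the CD at last separation.
movers = [
    [40, 23, 17, 15, 5],   # from Q1 (least disadvantaged)
    [22, 30, 24, 16, 8],   # from Q2
    [12, 22, 30, 24, 12],  # from Q3
    [7, 14, 24, 40, 15],   # from Q4
    [5, 8, 17, 40, 30],    # from Q5 (most disadvantaged)
]

def row_percentages(counts):
    """Convert a matrix of counts into row percentages."""
    result = []
    for row in counts:
        total = sum(row)
        result.append([100.0 * n / total if total else 0.0 for n in row])
    return result

def same_or_adjacent_share(counts, quintile):
    """Per cent of movers from `quintile` (1-5) ending in it or a neighbour."""
    row = counts[quintile - 1]
    low = max(quintile - 2, 0)            # index of the quintile below, if any
    high = min(quintile, len(row) - 1)    # index of the quintile above, if any
    return 100.0 * sum(row[low:high + 1]) / sum(row)

percentages = row_percentages(movers)
print(percentages[4])                     # destinations of movers from Q5
print(same_or_adjacent_share(movers, 5))  # Q5 movers ending in Q4 or Q5
print(same_or_adjacent_share(movers, 1))  # Q1 movers ending in Q1 or Q2
```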
Correlation coefficients between quintile of socioeconomic disadvantage of area of address of first and last separation, 1994–98 (CD of first separation with CD of last separation; SLA of first separation with SLA of last separation)
The analysis shows that, for Perth residents admitted to hospital, the use of larger area aggregates reduces the gap between the index scores for the most disadvantaged and least disadvantaged areas, thus understating the extent of inequality between these areas. The greatest impact of aggregation of areas is on the scores for the most disadvantaged areas. This results in an understatement of the extent of disadvantage in the most disadvantaged areas, as well as an understatement in the extent of inequality between the most well off and the poorest areas.
Further, the analysis shows that more people from the most disadvantaged areas are admitted to hospital than from the most well-off areas (13% more), and that these people have more separations overall (47% more), as a result of larger numbers of multiple admissions.
As regards the extent of movement, four out of five people admitted to hospital more than once in a five year period had not moved (out of the CD of their address at the first separation) by the time of their second separation. In addition:
• people from the most well off areas are less likely to have moved to areas of greatly different socioeconomic status than are those from the most disadvantaged areas;
• while there is movement right across the socioeconomic profile, most movement out of a quintile is to areas in adjacent quintiles; and
• the most substantial movement between quintiles was of people moving from an address rated as Quintile 5 to an address rated as Quintile 4, although this was only marginally higher than the proportions moving within Quintiles 4 or 1.
In summary, postcode level and SLA level data provide a reliable indication of socioeconomic disadvantage of area, when compared with CD-level data. That is, the association between rates of total separations and individuals admitted and socioeconomic disadvantage of area evident at the smallest area level (CD) is also evident in the higher level area aggregates of postcode and SLA.
It is reasonable to assume that similar relationships exist in other Australian cities, as well as in other health-related activity (eg. visits to general medical practitioners).
Given the widespread use in Australia of area based analyses at the postcode and SLA level, and the limitations of CDs as an area level for the analysis of most health datasets, it is important to know that such analyses provide a reliable indication of the direction and underlying strength of the influence of socioeconomic factors on hospital admission rates.
Number of areas and average population for CDs, postcodes and SLAs in Perth, 1996
It is also clear that data as to socioeconomic position at the smallest area level possible or, more importantly, of individuals, would also be of value. Were data to be available across Australia on a similar basis to that from the Western Australian Data Linkage System, it would be possible to undertake research of greater policy-relevance than is currently possible with the existing separations-based national database. Such moves are under consideration in several Australian States.
Further, linking data (e.g. using probabilistic linkage) for individuals in the Western Australian Data Linkage System to the Australian Bureau of Statistics Population Census has the potential to add considerable value to such analyses. For example, it would be possible to examine an individual's characteristics of education, occupation, labour force status, housing tenure etc., and to examine more directly the relationships between these important socioeconomic variables and both the number of individuals admitted and total separations. Linkage to death registration data would also be valuable in understanding more about outcomes related to socioeconomic status. This latter example is a possibility under recently announced plans for the ABS to test the linking of 2006 Census of Population and Housing data to other datasets, such as deaths registrations, held under their Act. This is similar to the approach elsewhere, including New Zealand [7]. It is to be hoped that such arrangements can be put in place in Australia in the near future.
The report addresses differences in the number of individuals admitted and the number of separations they incurred. These are described as 'individuals' or 'individuals admitted', and 'separations' (the total number of separations, where an individual may have had one or more episodes of hospitalisation over the period of the analysis). 'Separation' is the term describing a completed hospital episode: it is defined in the section titled 'Glossary, Separation' under 'Glossary'.
Details of all separations to public and private hospitals in Western Australia for the five years from 1994 to 1998 were extracted from the Western Australian Hospital Morbidity Database (HMDS). Any separation records thought to belong to the same person had previously been linked together within the Data Linkage System, permitting analyses to be performed for both separations and individual persons. The population used in calculating rates is the 1996 Census population.
The analysis has been limited to separations of residents of Perth, but includes separations occurring at any public acute or private hospitals in Western Australia.
Areas used in the analysis are the Census Collection District (CD), postcode and Statistical Local Area (SLA). See Glossary for definitions of CD, postal area and SLA.
The HMDS includes address details for each separation from a hospital in Western Australia since 1993. These addresses have been linked to a Western Australian street address database to assign northing and easting points (geo-codes). These points were then assigned to the appropriate 1991 or 1996 CD using the ABS CData96 mapping tool. The postcode and SLA of the address were then determined by allocation of CDs to postcode or SLA. The boundaries for CDs and SLAs are consistent. However, boundaries for CDs and postcodes are not, so CDs were allocated to postcodes on a 'best fit' basis (see Glossary).
Consequently, comparisons can be made between results for CDs and postcode areas, CDs and SLAs, and postcode areas and SLAs. This is particularly important, as much of the area analysis undertaken in the health sector in Australia uses the postcode or the SLA, because the majority of data are only available at these area levels, and it is widely accepted that the larger the area, the less homogeneous the population is likely to be.
There were 2 297 CDs in Perth at the 1996 Census, with 105 postcodes and 37 SLAs. The average population size at each of these area levels is shown in Table 12; these data emphasise the variation in size of the areas at each area level.
Measurement of socioeconomic status
In the absence of any direct measure of socioeconomic status in the hospital inpatient data, the socioeconomic status of the area of the address of the individual admitted is used as a proxy measure. The Index of Relative Socio-Economic Disadvantage (IRSD) is the measure used to provide the socioeconomic status of the area of the address. The IRSD is one of five Socio-Economic Indexes for Areas (SEIFA) produced by the Australian Bureau of Statistics (ABS) from data collected at the 1996 Population Census. It is calculated at the CD level and can be produced for other area levels. The postcode and SLA level index scores in this report are the population weighted average of the IRSD scores for the CDs in the postcode or SLA. This calculation is undertaken for all CDs in the postcode or SLA, not just those for which hospital episodes were recorded.
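The population weighted average described above can be sketched as follows. The CD scores and populations are invented; only the weighting rule is taken from the text.

```python
def weighted_irsd(cds):
    """Population weighted mean IRSD for a postcode or SLA.

    `cds` is a list of (irsd_score, population) pairs, one per CD in the
    larger area, including CDs with no hospital episodes recorded.
    """
    total_population = sum(population for _, population in cds)
    if total_population == 0:
        raise ValueError("area has no population")
    weighted_sum = sum(score * population for score, population in cds)
    return weighted_sum / total_population

# Invented example: three CDs making up one postcode.
postcode_cds = [(850, 400), (1000, 600), (1100, 1000)]
postcode_score = weighted_irsd(postcode_cds)
print(postcode_score)
```

In this invented example the postcode score is 1020, although one of its CDs scores 850. This illustrates how aggregation can mask the most disadvantaged small areas.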
Population of quintiles at various area levels, 1996 (totals: 1 228 036; 1 274 297; 1 228 036)
Three (different) IRSD scores were added to each hospital separation record, based on the CD, postcode or SLA that had been previously assigned to the address on that record. It should be noted that these IRSD scores were actually the average score for the particular CD, postcode or SLA as calculated from 1996 Census data. Quintile ranks for each aggregation level were also applied using population weighting as described above.
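The exact procedure used to form the population weighted quintiles is not spelled out here, so the following is only one plausible sketch. It assumes areas are ranked from highest IRSD (least disadvantaged) to lowest, and each area is assigned to the fifth of the cumulative population in which its midpoint falls. The function name and data are hypothetical.

```python
def assign_quintiles(areas):
    """Assign each area to a population weighted quintile.

    `areas` is a list of (area_id, irsd_score, population) tuples.
    Quintile 1 is least disadvantaged (highest IRSD); Quintile 5 is most
    disadvantaged. This is an assumed rule, not the documented ABS method.
    """
    ranked = sorted(areas, key=lambda area: area[1], reverse=True)
    total = sum(area[2] for area in ranked)
    quintiles = {}
    cumulative = 0
    for area_id, _, population in ranked:
        midpoint = cumulative + population / 2  # centre of this area's share
        quintiles[area_id] = min(int(5 * midpoint / total) + 1, 5)
        cumulative += population
    return quintiles

# Invented areas with unequal populations.
areas = [
    ("A", 1100, 300),
    ("B", 1050, 100),
    ("C", 1000, 200),
    ("D", 950, 200),
    ("E", 900, 200),
]
quintiles = assign_quintiles(areas)
print(quintiles)
```

Because whole areas cannot be split, the quintiles need not hold equal populations: here Quintile 1 holds 300 people and Quintile 2 only 100. The larger and more variable the areas, the harder it is to form equal groups.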
For analyses involving multiple admissions, the IRSD value used was that for the first separation in the five-year period. These 'first' separations were isolated using the internal links between separation records for the same person and the separation date. Of course, many of these 'first' separations could have been preceded by separations occurring before 1994.
Rates are crude rates, per 100 000 population. Ideally the data would have been standardised (by the indirect method). However, access to the source data was limited to requested tables, so standardisation was not an option.
As the data were from a complete enumeration (all admissions to hospital), confidence intervals were only calculated for measures of difference (in this case, rate ratios).
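A crude rate and a rate ratio with a confidence interval can be sketched as below. The method used for the intervals is not stated here, so the sketch assumes a common large-sample approximation (a normal interval on the log of the ratio, treating the counts as Poisson). The counts and populations are invented.

```python
import math

def crude_rate(events, population, per=100_000):
    """Crude rate of events per `per` population."""
    return events * per / population

def rate_ratio_ci(events_a, pop_a, events_b, pop_b, z=1.96):
    """Rate ratio (a relative to b) with an approximate 95% interval.

    Assumes Poisson counts, so the standard error of the log rate ratio
    is sqrt(1/events_a + 1/events_b). This is an assumed method.
    """
    ratio = crude_rate(events_a, pop_a) / crude_rate(events_b, pop_b)
    se_log = math.sqrt(1 / events_a + 1 / events_b)
    lower = ratio * math.exp(-z * se_log)
    upper = ratio * math.exp(z * se_log)
    return ratio, lower, upper

# Invented counts for the most (Q5) and least (Q1) disadvantaged quintiles.
q5_rate = crude_rate(14_700, 100_000)
q1_rate = crude_rate(10_000, 100_000)
ratio, lower, upper = rate_ratio_ci(14_700, 100_000, 10_000, 100_000)
print(q5_rate, q1_rate)
print(f"rate ratio {ratio:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```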
The Spearman Rank Correlation has been used in the analysis to indicate the degree of correlation between pairs of variables.
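For readers unfamiliar with the measure, Spearman's coefficient is the Pearson correlation of the ranks of the two variables, with tied values given their average rank. A self-contained sketch using only the standard library follows; the scores are invented.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1  # extend the run of tied values
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return cov / math.sqrt(var_x * var_y)

# Invented IRSD scores for five individuals at CD level and at SLA level.
cd_scores = [820, 910, 990, 1040, 1120]
sla_scores = [930, 930, 1000, 1010, 1060]
rho = spearman(cd_scores, sla_scores)
print(round(rho, 3))
```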
Collection District
The Collection District (CD) is the smallest area level in the Australian Bureau of Statistics' statistical geography and is primarily an area used in the five yearly population census.
Index of Relative Socio-Economic Disadvantage
The Index of Relative Socio-Economic Disadvantage (IRSD) is one of five Socio-Economic Indexes for Areas produced by the Australian Bureau of Statistics at recent population censuses. Produced using Principal Components Analysis, it summarises information available from variables related to education, occupation, income, family structure, race (the proportion of Indigenous people), ethnicity (poor proficiency in use of the English language) and housing. The variables are expressed as percentages of the relevant population. The IRSD is available at the Census Collection District level and was then calculated for postcodes and SLAs by weighting the CD level scores by their population. The IRSD is calculated to show the relativity of areas to the Australian average for the particular set of variables which comprise it; this average score is set at 1000. Scores below 1000 indicate areas with relatively disadvantaged populations under this measure, and scores above 1000 indicate areas with relatively advantaged populations. The IRSD scores at the Census Collection District (CD) level have been grouped to postal area, an area developed by ABS for the presentation of population counts and other Census data from the five-yearly population censuses to approximate postcode areas, as the ABS does not collect the postcode at the Census.
Separation
The term describing a completed hospital episode is a 'separation'. At the time of admission to hospital, the age, sex, address of usual residence and other personal details of the patient are recorded. At the end of the episode, at the time of separation from hospital, details of the episode itself are recorded, including the date, time and method of separation (discharge, death or transfer of a patient to another care setting, e.g. hospital, nursing home). Consequently, hospital inpatient data collections are based on separations.
Postal area
The postal area is an area developed by ABS for the presentation of population counts and other Census data from the five-yearly population censuses. It approximates postcode areas, as the ABS does not collect the Australia Post postcode at the Census. Postal areas comprise Census Collection Districts (CDs) grouped to approximate postcode areas. Where a CD does not fit entirely within a postcode area, it is allocated to the postcode area into which the population largely falls. Where a CD covers more than one postcode area, the total CD population is allocated to one postcode.
The IRSD scores at the Census Collection District (CD) level have been allocated to postal areas as described in the section titled 'Methods, Index of Relative Socio-Economic Disadvantage' under 'Methods'. Similarly, the postal area of each separation was approximated from the CD of the address.
The term postcode, rather than postal area, is used in the text, for ease of reading.
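The 'best fit' rule for postal areas (each CD is allocated whole to the postcode area containing most of its population) can be sketched as follows. The overlap figures and area labels are invented, and the tie-breaking behaviour (first listed postcode wins) is an assumption of this sketch.

```python
def best_fit(cd_overlaps):
    """Allocate each CD to the postcode area holding most of its population.

    `cd_overlaps` maps a CD code to {postcode: population of the CD that
    falls inside that postcode}. Ties go to the first postcode listed.
    """
    return {
        cd: max(overlaps, key=overlaps.get)
        for cd, overlaps in cd_overlaps.items()
    }

def postal_area_populations(cd_overlaps):
    """Total population per postal area, assigning each CD's whole population."""
    totals = {}
    for cd, postcode in best_fit(cd_overlaps).items():
        totals[postcode] = totals.get(postcode, 0) + sum(cd_overlaps[cd].values())
    return totals

# Invented overlaps: CD001 and CD003 straddle two postcode areas.
cd_overlaps = {
    "CD001": {"A": 420, "B": 80},
    "CD002": {"B": 300},
    "CD003": {"B": 150, "C": 250},
}
allocation = best_fit(cd_overlaps)
populations = postal_area_populations(cd_overlaps)
print(allocation)
print(populations)
```

In this invented case postal area B is credited with 300 people, although 530 live within the true postcode boundary. This is why a postal area only approximates a postcode.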
Postcode
See postal area, above.
Quintile of socioeconomic disadvantage of area
See section titled 'Methods, Measurement of socioeconomic status' under 'Methods.'
Statistical Local Area
An SLA in Perth is generally equivalent to a local government area, with additional codes allocated to local government areas split for statistical purposes (mainly local government areas with large populations, split to form SLAs with smaller populations).
The hospital inpatient data on which this analysis was based were drawn from the Western Australian Data Linkage System. The Western Australian Data Linkage System was established in 1995 with three year funding from the Western Australian Lotteries Commission. The system is now jointly funded, managed and staffed by the Department of Health (WA), the University of WA, the Institute for Child Health Research and Curtin University of Technology.
Its aim is to link unit records from core Department of Health data collections and other relevant data collections, for the purpose of providing linked data to support health planning, purchasing, evaluation and research.
The data extraction and analysis for this report were undertaken by Diana Rosman, Manager, Data Linkage Unit, Health Information Centre, Department of Health (Western Australia).
1. Joshi H, Wiggins R, Bartley M, Mitchell R, Gleave S, Lynch K: Putting health inequalities on the map: does where you live matter, and why? In Understanding health inequalities. Edited by: Graham H. 2001, Buckingham: Open University Press, 143-155.
2. Cesaroni G, Farchi S, Davoli M, Perucci CA: Individual and area-based indicators of socioeconomic status and childhood asthma. European Respiratory Journal. 2003, 22 (4): 619-24.
3. Hyndman JC, Holman CD, Hockey RL, Donovan RJ, Corti B, Rivera J: Misclassification of social disadvantage based on geographic areas: comparison of postcode and collector's district analyses. Int J Epidemiol. 1995, 24: 165-176.
4. Hofer TP, Wolfe RA, Tedeschi PJ, McMahon LF, Griffith JR: Use of community versus individual socioeconomic data in predicting variation in hospital use. Health Serv Res. 1998, 33 (2 Pt 1): 243-59.
5. Australian Bureau of Statistics: 1996 Census Basic Community Profile. Table B01 Selected Characteristics. 1998, ABS Canberra.
6. HealthWIZ. [http://www.prometheus.com.au/html_control/index_frame.htm]
7. New Zealand Census-Mortality Study. [http://www.wnmeds.ac.nz/academic/dph/research/nzcms/]
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.