Table 5 How ESTREAA hierarchy category might be assigned, and AAEP might be estimated for selected circumstances

From: Use of attribute association error probability estimates to evaluate quality of medical record geocodes

Example

Attribute association

Circumstance

How AAEP was estimated

1

1. Patient identifying fields

Patient lacks government-issued ID and address, and patient names and date of birth match to three individuals in external data

The patient could be any one of three different persons matched in the external sources. AAEP = 1 − (1/3), or 0.666

2

2. Patient-date of diagnosis

Patient diagnosis year known, but month and day unknown

One day out of 365 is chosen, thus the probability of choosing the correct day is 1/365. AAEP = 1 − (1/365) = 0.997

3

3. Patient: date of diagnosis-address

Patient address is missing house number

Patient address matches to the address features (footnote a) of 20 residences on one street. AAEP = 1 − (1/20), or 0.95

4

3. Patient: date of diagnosis-address

Patient address is missing prefix direction. To confirm that the address is valid, it is matched to the USPS ZIP + 4 database

Patient address matches to 2 addresses in USPS ZIP + 4 database. AAEP = 1 − (1/2) or 0.5

5

3. Patient: date of diagnosis-address

Error suspected in more than one component of patient address (‘multimatch’ address). Patient address can be matched to 12 different address features in geographic reference data depending on which address component(s) are edited

AAEP = 1 − (1/12) or 0.916

6

3. Patient: date of diagnosis-address

Address at diagnosis cannot be geocoded. Patient address history unknown or incomplete. Patient address identified via linkage to external source on patient name and date of birth, and used to match to geographic reference data with a one-to-one match. Date of diagnosis was not spanned by the duration of address validity in the external data source

AAEP estimated at 0.25 based on best available information about error rate of external data source

7

3. Patient: date of diagnosis-address

Patient has PO Box address. Patient address history unknown or incomplete. Patient names and PO Box address match to owner names and mailing addresses of 4 parcels, whose sale dates precede the date of diagnosis

There are 4 possible addresses and only one is chosen. AAEP = 1 − (1/4) = 0.75

8

3. Patient: date of diagnosis-address

Patient year of diagnosis known. Patient day and month of diagnosis are unknown. Patient address history for year of diagnosis is known. During that time patient lived at 3 addresses in sequence for 0.4, 0.1, and 0.5 of the year; the first address is chosen

AAEP = 1 − 0.4, or 0.6

9

4. Patient: date of diagnosis-address-geocode

Patient address matches to a street in geographic reference data with 21 address features that are missing house numbers

Patient address matches to address features of 21 residences on one street. AAEP = 1 − (1/21), or 0.952

10

4. Patient-date of diagnosis-address-geocode

Patient street address could not be matched to street level geographic reference data. Patient postal code matched to postal code area centroid

Postal code encompasses 13,500 address features. AAEP = 1 − (1/13,500) = 0.999

11

5. Patient-date of diagnosis-address-geocode-enumeration area

Patient address lacks a house number. Street to which patient address is geocoded is contained within 1 enumeration area

Because all potential matches are contained within the chosen enumeration area, AAEP = 0

12

5. Patient-date of diagnosis-address-geocode-enumeration area

Patient address lacks a house number; there are 70 address feature matching candidates. The area of uncertainty that contains the potential matches spans 2 enumeration areas. These contain 20 and 50 candidate address features; the latter enumeration area is chosen

AAEP = 1 − (50/70) = 0.285

13

5. Patient-date of diagnosis-address-geocode-enumeration area

Patient street address could not be matched to street level geographic reference data. Patient postal code matched to postal code area centroid. Postal code area spans 4 enumeration areas, which contain 2160, 1620, 1620 and 6750 address features respectively; the last enumeration area is chosen

Postal code encompasses 12,150 address features. AAEP is 1 − (6750/12,150) = 0.444

14

5. Patient-date of diagnosis-address-geocode-enumeration area

Both patient address and postal code are unmatched in geographic reference data. County centroid is assigned as geocode

County contains 395,909 address features. AAEP = 1 − (1/395,909), or 0.999

a. Emergency dispatch address features as published by county or city data authors
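
The arithmetic in the table follows two patterns, which the following Python sketch illustrates. It is a minimal illustration only, not the authors' implementation: the function names `aaep_uniform` and `aaep_weighted` are hypothetical, and the sketch assumes that every candidate within a group is equally likely to be the true match. Exact fractions are used so the results can be compared with the table, whose printed values appear to be cut to three decimal places.

```python
from fractions import Fraction


def aaep_uniform(n_candidates: int) -> Fraction:
    """AAEP when one of n equally likely candidates is chosen: 1 - 1/n.

    Hypothetical helper; covers rows such as Examples 1-5, 7, 9, 10 and 14.
    """
    if n_candidates < 1:
        raise ValueError("need at least one candidate")
    return 1 - Fraction(1, n_candidates)


def aaep_weighted(weights: list[int], chosen_index: int) -> Fraction:
    """AAEP when the chosen option holds a known share of all candidates.

    `weights` are integer counts (or integer percentages) per option, for
    example address features per enumeration area. Hypothetical helper;
    covers rows such as Examples 8 and 11-13.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return 1 - Fraction(weights[chosen_index], total)


# Example 3: 20 residences on one street, one chosen.
example_3 = aaep_uniform(20)                 # Fraction(19, 20), i.e. 0.95

# Example 12: two enumeration areas with 20 and 50 candidates; the second is chosen.
example_12 = aaep_weighted([20, 50], 1)      # Fraction(2, 7), about 0.2857

# Example 8: shares of the year expressed as whole percentages; first address chosen.
example_8 = aaep_weighted([40, 10, 50], 0)   # Fraction(3, 5), i.e. 0.6
```

Example 6 does not fit either pattern, because its AAEP of 0.25 is taken from an external estimate of the data source's error rate rather than computed from candidate counts.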