Mapping HIV clustering: a strategy for identifying populations at high risk of HIV infection in sub-Saharan Africa
© Cuadros et al.; licensee BioMed Central Ltd. 2013
Received: 2 March 2013
Accepted: 15 May 2013
Published: 22 May 2013
The geographical structure of an epidemic is ultimately a consequence of the drivers of the epidemic and the population susceptible to the infection. The ‘know your epidemic’ concept recognizes this geographical feature as a key element for identifying populations at higher risk of HIV infection where prevention interventions should be targeted. In an effort to clarify specific drivers of HIV transmission and identify priority populations for HIV prevention interventions, we conducted a comprehensive mapping of the spatial distribution of HIV infection across sub-Saharan Africa (SSA).
The main source of data for our study was the Demographic and Health Survey conducted in 20 countries from SSA. We identified and compared spatial clusters with high and low numbers of HIV infections in each country using the Kulldorff spatial scan test. The test locates areas with higher and lower numbers of HIV infections than expected under spatial randomness. For each identified cluster, a likelihood ratio test was computed. A P-value was determined through Monte Carlo simulations to evaluate the statistical significance of each cluster.
Our results suggest stark geographic variations in HIV transmission patterns within and across countries of SSA. About 14% of the population in SSA is located in areas of intense HIV epidemics. Meanwhile, another 16% of the population is located in areas of low HIV prevalence, where some behavioral or biological protective factors appear to have slowed HIV transmission.
Our study provides direct evidence for strong geographic clustering of HIV infection across SSA. This striking pattern of heterogeneity at the micro-geographical scale might reflect the fact that most HIV epidemics in the general population in SSA are not far from their epidemic threshold. Our findings identify priority geographic areas for HIV programming, and support the need for spatially targeted interventions in order to maximize the impact on the epidemic in SSA.
Keywords: HIV, Spatial epidemiology, Disease mapping, Sub-Saharan Africa, Mathematical modeling
Sub-Saharan Africa (SSA) has by far the largest HIV epidemic in the world, with 68% of the global HIV disease burden and 1.9 million new infections in 2010. The complex epidemiological context of this infection has so far prevented a clear identification of the specific drivers that have led to such large general population HIV epidemics in SSA, and nowhere else.
‘Know your epidemic, know your response’ is a Joint United Nations Programme on HIV/AIDS (UNAIDS) approach focused on HIV prevention, which has become one of the first calls to modify the current strategy by recognizing that there is not a single global HIV epidemic. This conceptual approach emphasizes the necessity to identify populations at higher risk of HIV infection, and to implement an effective prevention strategy by acknowledging the substantial variation in how HIV spreads through populations and communities [4, 5].
This strategy also highlights the significant role that geographical space plays in the identification of populations at higher risk. This fundamental characteristic of an epidemic, however, has been poorly explored in the context of HIV. Measures of disease occurrence are frequently available only by large geographical administrative units. These large scales (national or regional) could hide the natural scale of the HIV transmission process.
To overcome the current gap in knowledge about the spatial structure of the HIV epidemic, particularly in SSA, we explored and described the geographical heterogeneity of the HIV epidemic in SSA. We aimed to identify geographical settings where the risk of HIV infection is higher or lower. Studying the epidemics at their “microscopic” scale, that is, through spatial mapping of the clustering of HIV infection at the local level, may provide a fresh look into the dynamics of these epidemics [6, 7], and informed insights about their drivers in this part of the globe.
The main source of data for our study was the Demographic and Health Survey (DHS) databases. Countries were included for analysis based upon the availability of a DHS HIV serological biomarker survey and the geographical coordinates of each survey data point. For each country, we only considered the most recent DHS where HIV data were collected. As a result, a total of 20 countries in SSA were included: Burkina Faso (2010), Burundi (2010), Cameroon (2004), Congo Democratic Republic (2007), Ethiopia (2011), Ghana (2003), Guinea (2005), Kenya (2008–2009), Lesotho (2009), Liberia (2007), Malawi (2010), Mali (2006), Mozambique (2009), Rwanda (2010), Senegal (2010–2011), Swaziland (2006–2007), Sierra Leone (2008), Tanzania (2007–2008), Zambia (2007), and Zimbabwe (2010–2011).
Spatial clustering detection
We identified the spatial clusters with high and low numbers of HIV infections in each country through a Kulldorff spatial scan statistic analysis. This methodology has become the most widely used test for clustering detection in epidemiology [10–12], and its efficiency and accuracy have been well documented [13, 14]. A spatial scan statistic is a cluster detection test able to find the location of areas with higher or lower numbers of cases (for instance HIV infections) than expected under spatial randomness, and then evaluate their statistical significance by gradually scanning a circular window that spans the study region. The radius of the circle is changed continuously so that it can take any value from 0 up to a pre-specified maximum value. A maximum circular window of 100 km radius was used for scanning potential clusters with high or low numbers of HIV infections.
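The scanning step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the software used in the study: the function names `haversine_km` and `candidate_windows` are hypothetical, survey points are assumed to be (latitude, longitude) pairs in degrees, and windows are centered on the survey points themselves, with the radius growing to each successive neighbor up to the 100 km maximum.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def candidate_windows(points, max_radius_km=100.0):
    """Enumerate circular windows centered on each survey point.

    Returns a list of (center_index, radius_km, member_indices) tuples.
    The radius grows from 0 to each neighbor's distance, capped at
    max_radius_km, so every distinct set of enclosed points appears once
    per center.
    """
    windows = []
    for i, (lat_i, lon_i) in enumerate(points):
        dists = sorted(
            (haversine_km(lat_i, lon_i, lat_j, lon_j), j)
            for j, (lat_j, lon_j) in enumerate(points)
        )
        members = []
        for d, j in dists:
            if d > max_radius_km:
                break
            members.append(j)
            windows.append((i, d, tuple(members)))
    return windows


# Hypothetical coordinates, for illustration only
toy_points = [(0.0, 0.0), (0.0, 0.5), (0.0, 2.0)]
toy_windows = candidate_windows(toy_points)
```

Each candidate window would then be scored with the likelihood ratio test, and the highest-scoring windows retained as potential clusters.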
For each potential cluster, a likelihood ratio test was computed assuming that the number of HIV infections in each circular window is an independent Bernoulli random variable. The numbers of observed and expected HIV infections within and outside the circular window were then compared with the likelihood L0 under the null hypothesis of spatial randomness. The circular windows with the highest likelihood ratio values were identified as potential clusters. An associated P-value of the statistic was then determined through Monte Carlo simulations and used to evaluate whether HIV infections are randomly distributed in space or not.
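As a rough illustration of this test, the sketch below computes a Bernoulli log-likelihood ratio for one window and a rank-based Monte Carlo P-value. It is a minimal sketch assuming the standard Bernoulli form of the scan statistic; the function names are hypothetical, and the simulated maxima are assumed to come from replicates generated elsewhere under spatial randomness.

```python
import math


def _xlogy(x, y):
    """Return x * log(y), using the convention 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)


def bernoulli_log_likelihood(cases, n):
    """Maximized Bernoulli log-likelihood for `cases` positives among `n` tested."""
    p = cases / n
    return _xlogy(cases, p) + _xlogy(n - cases, 1.0 - p)


def log_likelihood_ratio(c_in, n_in, c_total, n_total):
    """Log-likelihood ratio of a window against the null of one common risk.

    c_in, n_in: infections and individuals inside the window.
    c_total, n_total: infections and individuals in the whole study region.
    The value is 0 when the risks inside and outside are equal, and grows as
    they diverge in either direction (high or low cluster).
    """
    c_out, n_out = c_total - c_in, n_total - n_in
    log_l_window = bernoulli_log_likelihood(c_in, n_in) + bernoulli_log_likelihood(c_out, n_out)
    log_l0 = bernoulli_log_likelihood(c_total, n_total)
    return log_l_window - log_l0


def monte_carlo_p_value(observed_max_llr, simulated_max_llrs):
    """Rank of the observed statistic among the simulated maxima."""
    rank = 1 + sum(1 for s in simulated_max_llrs if s >= observed_max_llr)
    return rank / (len(simulated_max_llrs) + 1)
```

With 999 simulated replicates, for example, an observed statistic larger than every simulated maximum would receive a P-value of 1/1000.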
After a cluster was identified, the strength of the clustering was estimated using the relative risk (RR) of HIV infection within the cluster versus outside the cluster. The fraction of the population and the HIV prevalence were also estimated for each cluster. Furthermore, the general RR of HIV infection for all individuals belonging to (high or low) clusters was also estimated by combining all (high or low) clusters identified in a particular country. All geographic information system (GIS) analysis and cartographic displays were performed with the software ArcGIS version 9.2.
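The RR described above can be written as the ratio of the infection risk inside a cluster to the risk outside it. The short sketch below is illustrative only; the function name and the counts in the example are hypothetical and are not taken from the DHS data.

```python
def relative_risk(cases_in, n_in, cases_out, n_out):
    """Risk of infection inside the cluster divided by the risk outside it."""
    risk_in = cases_in / n_in
    risk_out = cases_out / n_out
    return risk_in / risk_out


# Hypothetical counts: 30 infections among 100 people tested inside a cluster,
# 70 infections among 900 people tested in the rest of the country.
example_rr = relative_risk(30, 100, 70, 900)
```

An RR above 1 corresponds to a cluster with high HIV prevalence and an RR below 1 to a cluster with low prevalence. Pooling the counts of all high (or all low) clusters in a country before calling the function gives the general RR for that country.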
A deterministic compartmental mathematical model was constructed based on extension of earlier models [16–18] to describe the heterosexual transmission of HIV in a given population. The model consists of a system of coupled nonlinear differential equations, and stratifies the population according to HIV status, stage of infection and sexual risk group. Our model incorporates 10 sexual risk groups in the population, starting from lower to higher levels of sexual risk behavior. The level of sexual risk behavior was parameterized by the effective partnership change rate in each risk group, and in essence it is a measure of the risk of exposure to the HIV infection. Further details about the model structure can be found in the Additional file 1.
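To give a feel for this class of model, the following is a heavily simplified sketch and not the model of Additional file 1. It assumes a single infected stage, proportionate mixing between risk groups, and arbitrary illustrative parameter values (`beta`, `mu`, `alpha` and the partnership change rates are all assumptions), integrated with a plain Euler scheme.

```python
def simulate_prevalence(partner_rates, group_fracs, beta=0.1, mu=1 / 35,
                        alpha=1 / 10, years=200.0, dt=0.05, seed=1e-3):
    """Risk-structured susceptible/infected model; returns final HIV prevalence.

    partner_rates: effective partnership change rate of each risk group (per year).
    group_fracs: fraction of new entrants joining each risk group (sums to 1).
    beta: transmission probability per partnership (illustrative value).
    mu: rate of leaving the sexually active population; alpha: HIV mortality rate.
    """
    S = [f * (1.0 - seed) for f in group_fracs]
    I = [f * seed for f in group_fracs]
    for _ in range(int(years / dt)):
        # Proportionate mixing: chance that a partner is infected, weighting
        # each group by the partnerships it contributes.
        offered = sum(c * (s + i) for c, s, i in zip(partner_rates, S, I))
        infected = sum(c * i for c, i in zip(partner_rates, I))
        p_infected = infected / offered if offered > 0 else 0.0
        for k, c in enumerate(partner_rates):
            new_infections = beta * c * p_infected * S[k]
            dS = mu * group_fracs[k] - new_infections - mu * S[k]
            dI = new_infections - (mu + alpha) * I[k]
            S[k] += dt * dS
            I[k] += dt * dI
    return sum(I) / (sum(S) + sum(I))


# Ten risk groups of equal size, ordered from lower to higher risk
# (hypothetical rates, chosen only for illustration).
fracs = [0.1] * 10
low_risk = simulate_prevalence([0.1 * (k + 1) for k in range(10)], fracs)
high_risk = simulate_prevalence([0.5 * (k + 1) for k in range(10)], fracs)
```

In this toy setting the infection dies out when partnership change rates are low and settles at a substantial prevalence when they are high.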
General description of the clusters with high and low HIV prevalence identified in the 20 countries included in the study
Columns: Country; National HIV prevalence; Clusters with high HIV prevalence (Number of clusters, Fraction of the population (%), HIV prevalence range (%), Strength of the clustering* range); Clusters with low HIV prevalence (Number of clusters, Fraction of the population (%), HIV prevalence range (%), Strength of the clustering* range)
1.94 – 4.35
3.33 – 6.69
Congo D. R.
0 – 0.08
0 – 0.004
3.61 – 7.73
2.59 – 4.99
4.86 – 8.23
3.19 – 4.85
0 – 0.50
0 – 0.27
1.04 – 1.92
0.31 – 0.53
8.33 – 17.70
2.18 – 4.41
0 – 0.77
0 – 0.14
12.13 – 18.18
2.3 – 3.44
21.61 – 29.73
4.26 – 4.28
0 – 2.6
0 – 0.52
15.05 – 22.01
2.07 – 2.21
0 – 2.69
0 – 0.31
0 – 4.95
0 – 0.47
22.20 – 25.21
1.70 – 1.74
0.99 – 8.17
20.76 – 30.75
1.29 – 1.91
1.90 – 9.94
0.11 – 0.59
14.19 – 16.37
0.31 – 0.71
HIV prevalence within the clusters with high prevalence ranged from 1.9% in a cluster in Senegal (RR = 3.33, P < 0.001) to 30.8% in a cluster in Zimbabwe (RR = 1.91, P < 0.001), with a median of 11.5%. Likewise, HIV prevalence within the clusters with low HIV prevalence ranged from 16.4% in a cluster in Lesotho (RR = 0.71, P = 0.034) to clusters with 0% HIV prevalence in Liberia, Ethiopia, Ghana, Tanzania, Kenya, Mozambique, and Malawi, with a median of 1.7%.
The fraction of the population within clusters with high HIV prevalence had a median of 14.4%, and this fraction increased with the national HIV prevalence (P = 0.005). This association, however, did not explain most of the variation (adjusted R-squared = 0.41). The fraction of the population within clusters with low HIV prevalence had a median of 15.5%. However, we found no statistically significant trend in the association between the fraction of the population within clusters with low HIV prevalence and the national HIV prevalence (P = 0.73).
The results of our analysis indicate stark geographical variation in HIV prevalence in most of the countries. The observed spatial variation in HIV prevalence highlights a clustered HIV transmission across SSA within micro-epidemics of different scales. The map of HIV clustering reflects a landscape with ‘valleys’ (areas with high HIV prevalence), ‘dams’ (areas where HIV found barriers to propagate efficiently), and ‘islands’ (small isolated areas with characteristically either very high or very low HIV prevalence).
Our results indicate that only ~14% of the population across the countries resides within clusters of high HIV prevalence. The strength of the clustering tended to be higher in countries with low national HIV prevalence. For instance, the strongest clustering (highest RR) is found in a cluster in Senegal (RR = 6.69, HIV prevalence = 4.3%), the country with the lowest national HIV prevalence (0.7%). Our study revealed similar settings with localized epidemics at high HIV prevalence hidden in a map of low national HIV prevalence, such as in Burkina Faso, Congo, Sierra Leone and Ethiopia.
The strength of the clustering was smaller in countries with high national HIV prevalence, indicative of more diffusive epidemics. For instance, in Zambia and Lesotho, the strength of the clustering was fairly small (RR = 1.74, and RR = 1.28, respectively). In Swaziland, no clusters with high HIV prevalence were identified. This result underlines how the HIV epidemics in these high prevalence countries had percolated throughout much of the demography and geography of these countries.
We also identified clusters with low HIV prevalence in most of the countries included in our study. These clusters appear to reflect ‘dams’ where some behavioral or biological protective factors have slowed HIV transmission in such populations, in contrast to their neighboring populations. In fact, we identified settings with very low HIV prevalence even in countries with substantial HIV epidemics such as in Tanzania, Kenya and Malawi.
The topography of this infection poses a question about the drivers of such stark heterogeneities even at micro-geographic scales within countries. Male circumcision, the presence of other sexually transmitted infections (STIs), tropical co-infections increasing HIV viral load, hormonal contraception, viral factors, and host genetics and immunology vary across SSA, and are believed to influence HIV transmission risk. Behavioral factors such as concurrency, number of sexual partners, commercial sex, and coital frequency also appear to vary across SSA, and may contribute to explaining the heterogeneities in prevalence. Preliminary statistical analyses of the DHS databases (not shown) indicated that it is challenging, if not a formidable task, to disentangle the contribution of the different factors in the clustering of the infection. This is a consequence of the complex array of independent variables to consider, and also because the clusters’ sub-samples are not large enough to power meaningful multivariate regression analyses. Nevertheless, these preliminary analyses suggest that the scale and distribution of the differences in the biological and behavioral factors, within versus outside the clusters, may not be sufficient to explain the observed sharp contours in the topography of HIV infection at the local level. We suspect that there is an additional dynamical factor that has strongly influenced the local ecology of this infection even when the differences in the biological and behavioral factors may not have been markedly large.
We hypothesize that the HIV epidemic among the general population in much of SSA is not far from its epidemic (or sustainability) threshold. A generic feature of an infection epidemic is that near the epidemic threshold, the prevalence depends non-linearly on the determinants of infection transmission, and that small changes in the epidemiological context can drive much larger changes in the prevalence of the infection. Figure 4 illustrates this dynamical effect for the case of HIV infection. As can be seen in the figure, modest changes in sexual risk behavior near the epidemic threshold could generate a substantial increase in HIV prevalence. Conversely, beyond the region of epidemic threshold, the same increase in sexual risk behavior could generate only a modest increase in HIV prevalence.
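The threshold effect can be illustrated with the simplest textbook case. The sketch below is not the HIV model used for Figure 4: it assumes a generic susceptible-infected-susceptible (SIS) infection, for which the endemic prevalence is 1 - 1/R0 once the basic reproduction number R0 exceeds 1, and it treats R0 as a stand-in for the level of sexual risk behavior. The function name is hypothetical.

```python
def endemic_prevalence(r0):
    """Endemic prevalence of a simple SIS infection for a given R0."""
    return 0.0 if r0 <= 1.0 else 1.0 - 1.0 / r0


# The same absolute increase in R0 (0.2), applied near and far from the threshold
gain_near_threshold = endemic_prevalence(1.2) - endemic_prevalence(1.0)
gain_far_from_threshold = endemic_prevalence(3.2) - endemic_prevalence(3.0)
```

In this toy case, raising R0 from 1.0 to 1.2 lifts prevalence by roughly 17 percentage points, whereas raising it from 3.0 to 3.2 adds only about 2 points, which is the kind of non-linearity the hypothesis relies on.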
Accordingly, we hypothesize that an essential driver of the stark variability in HIV infection transmission in SSA is that the epidemiology of this infection is not far from its epidemic threshold in the general population outside of conventional high-risk groups. The conspicuously large clustering of HIV infection may not strictly reflect conspicuously large variations in sexual risk behavior or the presence (or absence) of specific biological co-factors in HIV transmission. The variability in sexual risk behavior or biological co-factors within the population has driven a much larger variability in HIV prevalence, owing to the non-linear epidemic dynamics near the infection sustainability threshold. This hypothesis may also contribute to explaining the global variability in the spread of HIV, where massive general-population HIV epidemics have occurred only in SSA. In SSA, but nowhere else, the epidemiology of HIV infection has crossed, though not by far, the epidemic threshold of sustainability in the general population (Figure 4). That was enough to spark localized epidemics in the general population; and these epidemics, not far above the sustainability threshold, consequently exhibited high diversity in size at the micro-geographic scale (Figures 1 and 2).
Several study limitations could have affected our results. First, the selection of the DHS round for the different countries was constrained by the availability of HIV biomarker information and geographical coordinates of each survey data point at any particular DHS round. This limited our ability to consider more countries in SSA for analysis with more recent DHS rounds. Small clusters of HIV infection could have been missed if there was not enough sampling within their geographic setting. Given the multiple logistical difficulties in conducting the DHS, some of our measures could have been influenced by inherent biases in the data such as the variability in response rates to HIV testing [28, 29].
Mobile individuals and high-risk subpopulations such as female sex workers, injecting drug users, and men who have sex with men, may have been undersampled by the DHS. Clusters of HIV infection among such subpopulations may have been missed in our analysis. It is not clear, though, whether undersampling of such populations would necessarily affect our findings. Epidemics among high-risk subpopulations should lead to some onward transmission of the infection, such as among spouses and clients of female sex workers, who are less likely to be undersampled in the DHS. Lastly, due to the cross-sectional nature of the data used in this study, some of the clusters identified here could reflect epidemics at different stages, rather than genuine differences in epidemic sizes. The HIV epidemic in SSA, however, is in a mature stage [25, 30], and therefore this potential limitation is probably not influencing our results. Moreover, we analyzed the clustering of the infection in four countries that had more than one DHS serological biomarker survey at different years, and no consequential changes in the distribution of the clusters were observed (not shown).
In sum, our study provides evidence for a striking geographic clustering of HIV infection across SSA. The exact drivers of such rich and complex infection topography are not well understood. However, the clustering possibly reflects differences in specific behavioral and biological factors between sub-populations that have been amplified as larger differences in HIV prevalence, as a consequence of the infection epidemiology being not far from its epidemic threshold. Our findings support the need for spatially-targeted prevention strategies in SSA, and our results delineate the map of the high disease-burden areas. If indeed HIV epidemiology in SSA is not far from its epidemic threshold in the general population, this would indicate that even modest intervention-driven changes in risk behavior, or risk of HIV acquisition, through behavioral or biomedical interventions, may have considerable impact in reducing the sizes of the African epidemics. The recently observed rapid declines in HIV prevalence in SSA [1, 31, 32] may suggest that this transition is already taking place.
UNAIDS: Joint United Nations Programme on HIV/AIDS
DHS: Demographic and Health Survey
STI: Sexually transmitted infection.
The authors are thankful to Measure DHS for putting these data in the service of science, and to the U.S. Agency for International Development (USAID) and other donors supporting these initiatives. This work was supported by the Qatar National Research Fund, a Qatar Foundation funded program (NPRP 5-752-3-177). Additional support was provided by the Biostatistics, Epidemiology, and Biomathematics Research Core at the Weill Cornell Medical College in Qatar.
- UNAIDS/WHO: AIDS epidemic update. UNAIDS fact sheet. 2010, available at http://www.unaids.org/documents/20101123_FS_SSA_em_en.pdf, accessed 23 July 2012.
- Abu-Raddad LJ, Barnabas RV, Janes H, Weiss HA, Kublin JG, Longini IMJ, Wasserheit JN, Group tHVLW: Have the explosive HIV epidemics in sub-Saharan Africa been driven by higher community viral load?. AIDS. 2012, 10.1097/QAD.0b013e32835cb927.
- UNAIDS: Practical guidelines for intensifying HIV prevention: towards universal access. 2007.
- Wilson D, Halperin DT: Know your epidemic, know your response: a useful approach, if we get it right. Lancet. 2008, 372: 423-426. 10.1016/S0140-6736(08)60883-1.
- Buse K, Dickinson C, Sidibé M: HIV: know your epidemic, act on its politics. JRSM. 2008, 101: 572-573. 10.1258/jrsm.2008.08k036.
- Fichtenberg CM, Ellen JM: Moving from core groups to risk spaces. Sex Transm Dis. 2003, 30: 825-826. 10.1097/01.OLQ.0000097141.29899.7F.
- Real LA, Biek R: Spatial dynamics and genetics of infectious diseases on heterogeneous landscapes. J R Soc Interface. 2007, 4: 935-948. 10.1098/rsif.2007.1041.
- Demographic and health surveys. http://www.measuredhs.com/
- Kulldorff M: A spatial scan statistic. Comm Stat Theor Meth. 1997, 26: 1481-1496. 10.1080/03610929708831995.
- Wand H, Ramjee G: Targeting the hotspots: investigating spatial and demographic variations in HIV infection in small communities in South Africa. J Int AIDS Soc. 2010, 13: 41. 10.1186/1758-2652-13-41.
- Ryan J, Mbui J, Rashid J, Wasunna M, Kirigi G, Magiri C, Kinoti D, Ngumbi P, Martin S, Odera S: Spatial clustering and epidemiological aspects of visceral Leishmaniasis in two endemic villages, Baringo District, Kenya. Am J Trop Med Hyg. 2006, 74: 308-317.
- Kulldorff M, Song C, Gregoria D, Samociuk H, DeChello L: Cancer map patterns: are they random or not?. Am J Prev Med. 2006, 30: s37-s49. 10.1016/j.amepre.2005.09.009.
- Aamodt G, Samuelsen S, Skrondal A: A simulation study of three methods for detecting disease clusters. Int J Health Geogr. 2006, 5: 15. 10.1186/1476-072X-5-15.
- Kulldorff M, Zhang Z, Hartman J, Heffernan R, Huang L, Mostashari F: Benchmark data and power calculations for evaluating disease outbreak detection methods. MMWR. 2004, 53: 144-151.
- ESRI: ArcGIS 9.x. 2004, Redlands, CA, USA: ESRI
- Abu-Raddad LJ, Longini IM: No HIV stage is dominant in driving the HIV epidemic in sub-Saharan Africa. AIDS. 2008, 22: 1055-1061. 10.1097/QAD.0b013e3282f8af84.
- Abu-Raddad LJ, Patnaik P, Kublin JG: Dual infection with HIV and malaria fuels the spread of both diseases in sub-Saharan Africa. Science. 2006, 314: 1603-1606. 10.1126/science.1132338.
- Awad S, Cuadros D, Abu-Raddad L: Characterizing HIV prevalence distribution across sub-populations at variable levels of sexual behavior. Int J Infect Dis. 2012, 16 (Supplement 1): e180-e181.
- Weiss HA, Halperin D, Bailey RC, Hayes RJ, Schmid G, Hankins CA: Male circumcision for HIV prevention: from evidence to action?. AIDS. 2008, 22: 567-574. 10.1097/QAD.0b013e3282f3f406.
- Abu-Raddad LJ, Magaret AS, Celum C, Wald A, Longini IM, Self SG, Corey L: Genital herpes has played a more important role than any other sexually transmitted infection in driving HIV prevalence in Africa. PLoS One. 2008, 3: e2230. 10.1371/journal.pone.0002230.
- Heffron R, Donnell D, Rees H, Celum C, Mugo N, Were E, de Bruyn G, Nakku-Joloba E, Ngure K, Kiarie J: Use of hormonal contraceptives and risk of HIV-1 transmission: a prospective cohort study. Lancet Infect Dis. 2012, 12: 19-26. 10.1016/S1473-3099(11)70247-X.
- Kaul R, Cohen CR, Chege D, Yi TJ, Tharao W, McKinnon LR, Remis R, Anzala O, Kimani J: Biological factors that may contribute to regional and racial disparities in HIV prevalence. Am J Reprod Immunol. 2011, 65: 317-324. 10.1111/j.1600-0897.2010.00962.x.
- Morris M, Kretzschmar M: Concurrent partnerships and the spread of HIV. AIDS. 1997, 11: 641-648. 10.1097/00002030-199705000-00012.
- Auvert B, Buvé A, Ferry B, Caraël M, Morison L, Lagarde E, Robinson NJ, Kahindo M, Chege J, Rutenberg N: Ecological and individual level analysis of risk factors for HIV infection in four urban populations in sub-Saharan Africa with different levels of HIV infection. AIDS. 2001, 15: S15-S30.
- Leclerc PM, Garenne M: Commercial sex and HIV transmission in mature epidemics: a study of five African countries. Int J STD AIDS. 2008, 19: 660-664. 10.1258/ijsa.2008.008099.
- Brown MS: Coitus, the proximate determinant of conception: inter-country variance in sub-Saharan Africa. J Biosoc Sci. 2000, 32: 145-159. 10.1017/S0021932000001450.
- Anderson R, May R: Infectious Diseases of Humans: Dynamics And Control. 1991, Oxford: Oxford University Press
- Marston M, Harriss K, Slaymaker E: Non-response bias in estimates of HIV prevalence due to the mobility of absentees in national population-based surveys: a study of nine national surveys. Sex Transm Infect. 2008, 84 (Suppl 1): i71-i77.
- Mishra V, Barrere B, Hong R, Khan S: Evaluation of bias in HIV seroprevalence estimates from national household surveys. Sex Transm Infect. 2008, 84 (Suppl 1): i63-i70.
- Chen L, Jha P, Stirling B, Sgaier SK, Daid T, Kaul R, Nagelkerke N, for the International Studies of HIVAI: Sexual Risk Factors for HIV Infection in Early and Advanced HIV Epidemics in Sub-Saharan Africa: Systematic Overview of 68 Epidemiological Studies. PLoS One. 2007, 2: e1001. 10.1371/journal.pone.0001001.
- UNAIDS: UNAIDS Report on the Global AIDS Epidemic 2010. 2010.
- UNAIDS/WHO: Global HIV/AIDS response: Epidemic update and health sector progress towards Universal Access.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.