A Bayesian approach to study the space time variation of leprosy in an endemic area of Tamil Nadu, South India

Background In leprosy endemic areas, patients are usually spatially clustered and not randomly distributed. Classical statistical techniques fail to address the problem of spatial clustering in the regression model. Bayesian method is one which allows itself to incorporate spatial dependence in the model. However little is explored in the field of leprosy. The Bayesian approach may improve our understanding about the variation of the disease prevalence of leprosy over space and time. Methods Data from an endemic area of leprosy, covering 148 panchayats from two taluks in South India for four time points between January 1991 and March 2003 was used. Four Bayesian models, namely, space-cohort and space-period models with and without interactions were compared using the Deviance Information Criterion. Cohort effect, period effect over four time points and spatial effect (smoothed) were obtained using WinBUGS. The spatial or panchayat effect thus estimated was compared with the raw standardized morbidity (leprosy prevalence) rate (SMR) using a choropleth map. The possible factors that might have influenced the variations of prevalence of leprosy were explored. Results Bayesian models with the interaction term were found to be the best fitted model. Leprosy prevalence was higher than average in the older cohorts. The last two cohorts 1987–1996 and 1992–2001 showed a notable decline in leprosy prevalence. Period effect over 4 time points varied from a high of 3.2% to a low of 1.8%. Spatial effect varied between 0.59 and 2. Twenty-six panchayats showed significantly higher prevalence of leprosy than the average when Bayesian method was used and it was 40 panchayats with the raw SMR. Conclusion Reduction of prevalence of leprosy was 92% for persons born after 1996, which could be attributed to various intervention and treatment programmes like vaccine trial and MDT. The estimated period effects showed a gradual decline in the risk of leprosy which could be due to better nutrition, hygiene and increased awareness about the disease. Comparison of the maps of the relative risk using the Bayesian smoothing and the raw SMR showed the variation of the geographical distribution of the leprosy prevalence in the study area. Panchayat or spatial effects using Bayesian showed clustersing of leprosy cases towards the northeastern end of the study area which was overcrowded and population belonging to poor economic status.


Background
Leprosy is a chronic infectious disease caused by the bacterium, Mycobacterium leprae, which can affect all ages and both sexes. Over the last two decades, prevalence of leprosy has come down substantially at the global level. Introduction and expansion of Multi drug therapy (MDT) in leprosy control programmes have dramatically lowered the prevalence level in almost all the endemic countries. For instance, in India leprosy prevalence has come down from 51 per 10,000 in 1981 to around 2.4 per 10,000 in March 2004 [1] and further below 1 per 10,000 by December 2005 [2].
In disease epidemiology, many of the infectious disease events do not occur randomly in geographical context but occur in clusters. In fact, leprosy epidemiology shows such uneven distribution between different geographic areas of the country, e.g. in China [3] and in Indonesia where in the leprosy endemic was high, the cases were extensively clustered and not equally distributed [4]. Hence, data analyses and interpretation should not ignore spatial dependence [5]. In India, we have observed that distribution of leprosy is uneven [6] even within the smallest community groups such as villages, right up to the family level [7].
Geographical or spatial analysis comes into play due to the existence of spatial dependence in data. Bayesian method lends itself for representing the spatial dependence during the estimation of model parameters. Application of Bayesian analysis in the field of leprosy is very limited [8]. A large controlled, double blind, randomized, prophylactic leprosy vaccine trial was conducted in South India to assess the prophylactic efficacies of four different candidate vaccines [9]. Within the trial area, we observed that leprosy was not randomly distributed and showed significant spatial dependence confirmed by using various measures of spatial autocorrelation like Moran's I, Geary's C and Kulldorff's SATSCAN statistics [10]. Hence, our main objective was to examine the variation in the prevalence of leprosy using four Bayesian models described by Arbyn etal [11] which was earlier proposed by Lagazio et al [12] and to explore possible factors that might have influenced these variations in the study area.

Materials and Methods
Leprosy prevalence data from an endemic area, covering 148 panchayats (rural administrative units) comprising of 264 contiguous villages from Chingleput district, Tamil Nadu, South India was studied. This area was specifically identified for a leprosy vaccine trial [9] because of the high endemicity of leprosy. The entire population of about 300,000 people was screened by house to house examinations for leprosy and cases were identified at four time schedules between January 1991 and March 2003. Few socio-economic factors like population density and economic status were also collected.

Definition of leprosy
A case of leprosy was defined as a person, having one or more of the following manifestations and who needed antileprosy treatment: hypopigmented or reddish skin lesion(s) with definite loss of sensation; damage to the peripheral nerves, as demonstrated by loss of sensation and or weakness of the muscles in parts supplied by these nerves; skin smear positive for acid fast bacilli.
The entire population in this area was screened for leprosy before vaccination. First, paramedical workers, trained in leprosy detection, screened the population. A proportion (5%) of this population was randomly allotted to "blinded" senior persons -either medical officers or senior paramedical officers for quality control. All cases and suspects detected by the junior paramedical workers were examined for diagnosis by two senior persons and by a third independent examiner in case of disagreement. Skin smear examination for detecting acid fast bacilli was done for all suspects and definite cases by the senior workers. A team of independent clinicians visited the field at frequent intervals to monitor the procedures for diagnosis of leprosy [9].
The data collected was also validated in many ways with the earlier surveys [7]. Hence the quality of data collected was remarkable and comparable to world standard as certified by the independent assessment committee consisting of national and international experts. Models with and without interaction terms were compared using the Deviance Information Criterion (DIC) [14], being a generalization of the Akaike's Information Criterion in the Bayesian framework, i.e., lower the DIC values better the model. Posterior distributions (using the priors and the available data) of the parameters of interest were obtained using Gibbs sampling in WinBUGS [15]. The data for the SC model (observed and expected cases) were obtained by aggregating the data over the four time periods. The data for the SP model (observed and expected cases) were obtained by aggregating the data over the age groups. For the SC model the average of all the cohorts and for SP model the average of all periods was used as the reference.
The spatial effects were measured using each of the models. These smoothed effects were compared using the raw standardized morbidity rate (SMR) [unsmoothed].
We have dealt with subgroups like panchayats hence we used prevalence instead of incidence to get sufficiently larger number of cases to draw meaningful conclusions. Moreover if we use incidence rather than prevalence, we will have only three survey data leaving the baseline survey. Generally the second survey incidence (a mixed bag of old prevalence-missed and new cases) is not considered for vaccine efficacy and also for trend analysis. Hence we would be left with the third and fourth survey and ultimately this exercise of Bayesian model would not be possible to examine the trend over years. Hence as a pragmatic measure, we considered prevalence cases for this study.
In the study area majority of leprosy cases belonged to paucibacillary variety and the multibacillary cases constituted a smaller proportion (17.4%, 3.3%, 5.4% and 1.4% in the four surveys respectively). Fixed duration of MDT for six months was practiced all through the study period. For the purpose of analysis in this paper we restricted it only for paucibacillary variety. Hence, prevalence trends also indicate trends in leprosy incidence and both go hand-in-hand. Any change in beneficial effect that happens during the survey period decreases incidence as well as prevalence. Since we had limited data collected specifically by panchayat on socio-economic factors and there could be other important factors apart from the two mentioned above, we did not include any of these factors directly in the model. Though the period of study was very short, in the region, major remarkable changes have taken place both in the leprosy control programme and improvement in the socio-economic status. According to the World Development Indicators 2005 [16] the rural poverty in India declined from 53% to 27% between 1977-78 and 1999-2000 and the Indian social structure was transformed between 1991 and 2001 like increased literacy rate, urbanization, industrialization, new economic liberalization etc.
The formulae used are described in Appendix-1

Results
There were 6,601 cases, 4,731 cases 3,342 cases and 2,098 cases of leprosy respectively at the four time periods. The DIC value for the space cohort model with and without interaction were compared. Since the models with the interaction term out-performed with smaller DIC values ( Table 1 and Table 2). Further analyses were carried out with interaction term included into the SC and SP model. The Markov Chain Monte Carlo Simulation (MCMC) scalar parameters and their 95% credible intervals for the SC and SP models with interactions are shown in Table 3 and Table 4 respectively.

Cohort effect using the SC model with interaction
The median cohort effects declined from 1.86 to 0.08 over the successive cohorts. The cohort effect using SC model (Table 5) measured in terms of relative risk, was significantly higher than the average in the older cohorts C ≤ 11, i.e. persons born before 1957. The cohort effect steadily decreased up to 1922-1931 (C ≤ 6) and increased substantially during 1927-1936,1932-1941 (C = 7 & C = 8) and again significantly declined to a higher risk of 36% at C = 11 and finally reached to a lower risk of 92% (C = 20) than the average for persons born after 1996. There were four major jumps (difference between a cohort and the preceding one) observed in the risk pattern of the cohorts i.e. 1927-1936 (higher risk of 45%), 1942-51(reduced risk of 31%), 1987-1996 (reduced risk of 33%) and 1992-2001(reduced risk of 40%).

Period effect using the SP model with interaction
The period effect over four time points using SP model showed a significantly higher risk of 1.032 than the average, i.e. 3.2% to a lower risk 1.8% (Table 6).

Space effect using SC and SP model with interaction
The spatial effect values (smoothed Bayesian) using SC and SP models using interaction terms were similar as observed in the Belgium study [11]. The spatial effects of different panchayats are listed in Table 7 and it can be further visualized in the choropleth map ( Figure 1). The spatial effects varied between 0.59 and 2. Bayesian model identified 26 panchayats that had a significantly higher risk of leprosy. There was a higher risk of leprosy (50% or more) found in 10 panchayats and most of them lay close to each other towards the North-Eastern end of the study area. The lowest risk (relative risk 0.59) was observed in two panchayats. Raw SMR identified 43 panchayats at a higher risk, which was statistically significant ( Figure 2); whereas it was 26 panchayats by the Bayesian models.

Discussion
We examined variation in prevalence of leprosy using Bayesian methods over 20 rolling cohorts and four time periods from a meticulously collected dataset of a vaccine trial conducted in South India. Observing the cohort effects it neither showed a steady decreasing nor an increasing effect. It showed an intricate effect, whereas leprosy prevalence trends in period effects has decreased continuously over four time points.
The steady decreasing period effect reflects that the control programme had reached all the target population whereas the intricate cohort effects showed that the effects captured were not similar in all the cohorts.
The higher difference in the risk in successive cohorts could be attributable to persons coming forward for seeking treatment, gradual awareness in the community that leprosy is curable, slowly combating the social stigma, early screening programmes, better case detection methods, availability of therapies and elimination programmes. This is particularly true in the vaccine trial setting where health systems operations and leprosy programme have been implemented in total [6].
The turning point in the risk pattern (from high risk to low risk) occurred during the cohort 1952-1961 (C = 12). This change could be due to the impact of Dapsone based National Leprosy Control Programme introduced during 1955. Gradual reduction of risk in the latter cohorts (C = 13, C = 14) may be due to continuing effect of programs and possibly improvement in socio-economic conditions the transition that has been taking place in India. Further reduction in risk (C = 19) could be due to more effective and intensified treatment programmes like Multi-Drug Therapy in 1991, that has changed the face of leprosy [17] and introduction of effective prophylactic vaccines through the leprosy vaccine trial [9].
Our data indicate 92% reduction in the leprosy prevalence for persons born after 1996 (C = 20). The case detection and treatment activity in the community has brought down the new infection rate in the younger age group as a secondary effect of MDT in addition to the primary effect of prophylactic vaccines. The older age group might have been infected in the past i.e. before the introduction of Dapsone and MDT as well as before the introduction of the vaccination programme. They might already be harbouring the infection and the break down could result in fresh disease. This is similar to the endogenous reactivation observed in tuberculosis [18].
We observed slow decline in the estimated period effects of leprosy in the study area. It agrees with the fact that the prevalence of leprosy has come down over years globally   as well as in India [19]. Leprosy being a chronic disease, temporal changes within endemic regions are slow [7]. This phenomenon is observed in the study area. Moreover, the decline in the risk of leprosy over the time periods in the study area could be due to better understanding on nutrition, hygiene, and increased public understanding of the disease (socio-economic factors) which limit the spread of leprosy or increased resistance to leprosy.
The spatial effect using Bayesian model was compared with the raw SMR gives the variation in the geographical distribution of the leprosy prevalence. The use of Bayesian smoothing approach accounts for the variability in the population at risk and clustering effect. Observing the spatial Bayesian effect, there was a strong pattern of clustering towards the North-Eastern region of the study area. Though the effects obtained through the models showed the reduction in the risk of prevalence of leprosy still a few pockets of high prevalence exist. The situation was similar in Tuscany [11] where the epidemic of lung cancer when analyzed using birth cohorts showed a decline but the spatial pattern was evident and strong towards the northwest/south-east gradient.
In the state of Ceara North-East Brazil [20], the spatial pattern of the leprosy disease was heterogeneous and municipalities with very high prevalence were clustered towards the North-South axis. Surprisingly the region with the highest incidences were most urbanized and economically developed. According to the authors the reasons for spatial clustering of disease rates might be related to a heterogeneous distribution of the factors such as crowding, social inequality and environmental characteristics which by themselves determine the transmission of Mycobacterium leprae. It could also be due to more efficient health system present in these regions to detect new cases of leprosy more efficiently. We tried to explore the authors' above perception and observed that the pockets identified by the Bayesian model had a population density of about 1,427 per sq km which is two times higher than the district and three times higher than the state population density [21]. Hence, possibly as result of this, environmental factors, such as urbanization and overcrowding due to inadequate housing, could have led to more frequent close contact with the source of infection and favoured the spread of leprosy. We also observed that nearly 37% of the people from this pocket belong to the economically poorer strata. Generally people from economically poorer strata are more prone to infectious diseases like leprosy as they live in close proximity to one another resulting in higher risk of contracting the disease [22]. Dharmendra [23] emphasizes that one cannot con-   Panchayats are arranged in descending order of median RR values for sake of comparison. Significant panchayats and RR greater than one are highlighted. The median relative risk of leprosy prevalence across 148 panchayats estimated using the Bayesian model in two taluks of Tamil Nadu, South India (1991-2003) Figure 1 The median relative risk of leprosy prevalence across 148 panchayats estimated using the Bayesian model in two taluks of Tamil Nadu, South India (1991-2003). trol, eliminate or eradicate leprosy without improving the socio-economic status or changing the in sanitary habits of the common people. Hence these few pockets or strata need greater care to bring down the leprosy prevalence to much greater extent and to make the study area free from leprosy.

Appendix-1
The step by step algorithms and winbug codes can be downloaded from annexes of Arbyn etal [11].
Let  where ξ ij is a linear predictor, rr ij is the relative risk of the i th panchayat and the j th cohort.
is the estimated cohort effects and similar to the standardized cohort morbidity ratios described by Beral [24].
ξ ij , the linear predictor can be specified in two ways.
Model without interactions: ξ ij = a + β i str + β i unstr + β j coh , Model with interactions: ξ ij = a + β i str + β i unstr + β j coh + SS ij ac , Where SS ij ac = S ij ac -S 1j ac -S i1 ac + S 11 ac and The first term S ij ac represents the interaction term of panchayat and cohort and SS ij ac is the centering used to improve a convergence of the Markov Chain Monte Carlo Simulation (MCMC) same as discussed in detail by Arbyn et al [11].
Where β i str represents structured spatial variability; β i unstr represents unstructured spatial variability; β j coh represents the effect of the j th cohort; Prior distribution for the model The structured spatial term β str and cohort effect β coh are assigned the Gaussian Conditional Autoregression (CAR) prior distribution. They followed the closer specification of matrices σ str , σ coh and σ ac as mentioned by Lagazio [12].
They are implemented using WinBUGS' function car.normal().
The above function constraints, the random effects to add up to zero, so that the following constraints are satisfied in the model.
The prior for the interaction vector S ac is a Markov random field.
The intercept term 'a' was given a flat prior through Win-BUGS function dflat().
The precision terms μ str and μ coh and μ unstr were given Gamma priors.
The interaction precision parameter μ ac discussed in detail [11] was also given the gamma prior closely to Lagazio [12].
The space period model is similar as above instead of cohorts, periods used (t = 4). The algorithmic steps for the four models using WINBUGS with and without interactions similar to the space cohort model are discussed in detail elsewhere [11].
Deviance Information Criterion (DIC) is defined as where -the posterior expectation of the deviance and summarizes the fit of the model, -the deviance evaluated at the posterior expectations of parameters.
-the effective number of parameters.
In the model the spatial effect (autocorrelation) depends on (ii) the number of shared neighbours (panchayats).
For fitting the models with and without interactions, two chains, with 1:10 thinning was used to obtain a sample of 10 000 values.
In case of the model without interactions a burn-in of 50,000 iterations and an additional 50,000 iterations were used. For fitting the models with interactions, a burn-in of 100 000 iterations and an additional 50,000 were used.
Convergence was checked using procedures mentioned by Gelman and Geweke [25,26] test and partial correlation plots to check for achieved convergence, of relative risks and hyper-parameters.