International Journal of Health Geographics Spatiotemporal Modelling and Mapping of the Bubonic Plague Epidemic in India

Background: This work studies the spatiotemporal evolution of bubonic plague in India during 1896–1906 using stochastic concepts and geographical information science techniques. In the past, most investigations focused on selected cities to conduct different kinds of studies, such as the ecology of rats. No detailed maps existed incorporating the space-time dependence structure and uncertainty sources of the epidemic system and providing a composite space-time picture of the disease propagation characteristics.


Background
Epidemics can have profound effects upon human populations, states and governments [1]. The bubonic plague started in southern China in 1855, but the situation was rather ignored by the rest of the world (in part because of the 1851-1864 Taiping rebellion in the area; [2]). But, when the plague reached Hong Kong and Canton in 1894, western authorities decided to intervene to avoid a worldwide epidemic [3]. Unfortunately, they were only partly successful, as the epidemic spread primarily in India with serious consequences [4,5]. It took more than 10 years to conclude that the bubonic plague was a disease transmit-ted by fleas. About 10% of the infected fleas develop esophagus and gizzard blockage due to the highly abnormal reproduction of bacteria after sucking blood from infected rats. The blockage makes the fleas hungry, since the blood cannot reach the intestines for digestion, which implies that the fleas start biting at a higher rate. The bubonic plague mechanism requires at least one species of rodents that is resistant to the bacteria and another one that is not. The resistant rodents keep the disease endemic by hosting the infested fleas. The infection is passed from the fleas to their victims by means of the blocked fleas that regurgitate infected blood at biting. As the sensitive rodents are infected, they die. Then the fleas, facing a shortage of their preferred host, move onto people, thus triggering the epidemic. The disease makes its appearance in humans within a week following the biting by infected fleas. The causative agent of the bubonic plague is the Yersinia pestis (Y. pestis) bacillus. Multiplication of the bacteria produces the characteristic "bubo" (swollen, painful lymph node). Bacteremia follows, causing death in about 75% of those affected.
The Indian bubonic plague characteristics (seasonal effects, the role of rats and fleas etc.) have been well documented [6]. However, most investigations focused on selected cities to conduct different kinds of studies, such as the ecology of rats. As a result, the monthly or yearly mortality information contained in the bubonic plague database refers only to these cities; also, most of the plague data were organized at the provincial scale (the data distribution is shown in Figure 1). To our knowledge, no detailed maps exist providing a composite space-time picture of the disease propagation characteristics across space-time. In order to produce systematic space-time maps of the epidemic, it is necessary to use a rigorous methodology that makes optimal use of the provincial monthly mortality information to support a monthly database construction. At the same time, the various uncertainty sources characterizing the propagation of the epidemic must be adequately accounted by the space-time methodology.

Spatiotemporal maps of bubonic plague mortality
Monthly mortality rates for the period September 1896-October 1906 were collected or generated using the proposed space-time method, leading to a complete set of 122 mortality maps in space-time. The maps allow displaying and studying in detail the geographical distribution, temporal evolution and seasonal effects of the deadly epidemic. Due to space limitations, a small subset of these maps is plotted in Figure 2 (although a complete set is available). The space-time method generates the complete mortality pdf, f K , at each space-time point over India; from these pdf, characteristic values (mode, mean, median) can be derived. The maps in Figure 2 represent the mean values, , of the corresponding pdf.
The territory shown in these maps is India in the 1900s, excluding a part of Burma. For historic reasons, the provincial information in the database is slightly different from the information shown in the maps. E.g., the Bengal presidency was divided into 3 provinces: Bihar/Orissa, Bengal, and Assam [7]. The reason for this database division is that in 1900 these provinces were still parts of the Bengal presidency. However, in 1905 the Bengal presidency was partitioned into Bengal and Assam, whereas in 1912 Bihar and Orissa formed a new province. The monthly provincial mortality provided in [8] was in accordance with the later partition of provinces. The generated bubonic plague maps describe the varying mortality spread throughout India and during different time periods. On the basis of these maps, several observations are made below concerning the space-time distribution and propagation of the bubonic plague epidemic throughout India.
The bubonic plague started at a port: Bombay City (1896). Then, it gradually moved inland, like a cloud of varying size, depending on the geographical area (the cloud analogy aims to describe an epidemic propagation advancing but at the same time disappearing and reappearing in infected areas). Although most of India was visited by the plague (also, Figure 3 below), during the first decade of the epidemic a few regions were spared: (i) Assam (it is believed that life style and the local population condi-, m k mean The bubonic plague data distribution Figure 1 The bubonic plague data distribution. Circles denote yearly city data, triangles denote yearly or monthly provincial data information and stars denote monthly city data. Noticeably, during the epidemic years the Nagpur city displayed a different behavior than the rest of the region. Nagpur city was the capital of the Central Provinces and the trade centre of middle India; it also was one of the few cities connected with the Great India Peninsula Railway (GIP) system, [9], at the time. The convenience provided by the developed transportation means may have been an influential factor in the space-time spread of bubonic plague. The railway map of India in 1893 offers some hints concerning the close association between the railway system and the epidemic distribution. In the early 1900s, India was divided into British India (Bombay Presidency, Punjab, United Provinces, Bengal Presidency etc.), as well as other native India states (Hyderabad, Mysore, Central India etc.). GIP offered good connections for most of the British India region; however, the native India parts were more difficult to reach than the British India region. Therefore, the transportation infrastructure may provide some insight on why British India generally suffered more by bubonic plague than the native India. The railway also increased the spreading speed of the bubonic plague during 1902-1903, as is noticed in the space-time mortality maps of these years. Cawnpore (a railway junction in the United Provinces) and Bhagalpur (the administrative and marketing center in Bihar with railroad connection) were both infected by more severe bubonic plague attacks than the remaining areas in the region.

Spatiotemporal maps of geographical evolution
The selection of maps in Figure 3 shows the geographical evolution of the epidemic during different times (a complete set of more than 120 maps is available but is not plotted here, due to space limitations According to the geographical evolution maps the dispersion of the bubonic plague in India would either slow down or come to a complete halt from April to August in each epidemic year. Another feature of the disease evolution made apparent from the mortality maps is that the epidemic tended to stop or slow down during certain periods of the year, even if at the time the epidemic would tend to restart or would still be going through a severe stage in a region.

Spatiotemporal epidemic pattern
The temporal variation of the total infected area indicator, T(t), for the Indian epidemic is shown in Figure 4. During its entire duration, the bubonic plague invaded new geographical areas at an average rate of about 4 × 10 4 km 2 / month. Considered in the context of local time intervals, the T-curve of the bubonic plague spread consists of two parts ( Figure 4): The slope of the one part is close to zero, which signifies a halt in the plague spread; the second part, however, exhibits a steep slope indicating that the bubonic plague would spread very fast at a time of high activity. In fact, the propagation rate during the months of December 1901-May 1902 is the fastest observed for the epidemic during any time period.
An interesting phenomenon is observed in Figure 5 that According to the geographic evolution maps of the epidemic, it is evident that the bubonic plague ravaged across almost its entire space -the subcontinent of India. From the plot of A(t) vs. t (Figure 7) it is clear that bubonic plague grew larger and larger during the years, and at certain times it occupied most regions of India. The reason for this is that the plague followed a re-emergence pattern from its prior infection in the geographical locations. The plot of N(t) vs. t in Figure 8 gives some insight about the relocation mechanism for the bubonic plague epidemic.
The ragged bubonic plague curve shows that the plague would change its location twice a year, i.e., almost every spring and fall. The area-averaged monthly mortality variation plot for bubonic plague is shown in Figure 9. The plot is in agreement with the results of [11], which had concluded that the plague mortality during the years 1900-1905 did not exceed 0.4% per year.
Moreover, the epidemic indicators allow some interesting comparisons of the Indian epidemic vs. the medieval Black Death epidemic in Europe (for a detailed discussion of the spatiotemporal characteristics of Black Death, see [10]). E.g., during the bubonic plague's first 40 months d dt d dt there were only 3 times when the mortality rate was raised to levels comparable to those of the medieval Black Death; during the rest of the time, the plague either remained static geographically or it completely disappeared. As was discussed in previous sections, bubonic plague was stale in the summer and active in the spring and fall, whereas the geographical Black Death evolution tended to be slower during winter. Bubonic plague fatalities in India were minimal to none during the dry and hot months of April-August [7]. On the contrary, the Black Death casualties merely slowed down during the cold winter months; moreover, after the first summer, Black Death reached a global maximum that was followed by maxima of decreasing levels, whereas bubonic plague kept reaching new heights for several years.

Discussion -conclusion
We introduced rigorous stochastic modelling in the spatiotemporal analysis of the bubonic plague epidemic in India during the years 1896-1906. We discussed geographical information science concepts and techniques underlying the study of the bubonic plague patterns of growth.
A variety of informative space-time maps of mortality rates and geographical disease evolution were generated that accounted for multi-sourced databases and uncertainty conditions. A series of epidemic indicators were plotted that offered meaningful characterizations of the spatiotemporal disease distribution.
The bubonic plague in India exhibited strong seasonal and geographical features. During its entire duration, the plague continued to invade new geographical areas, while it followed a re-emergence pattern at many localities; its rate changed significantly during each year and the mortality distribution exhibited space-time heterogeneous patterns; prevalence usually occurred in the autumn and spring, whereas the plague stopped moving towards new locations during the summers.
Among other things, the epidemic maps and indicators make possible the comparison of the spatiotemporal characteristics of different diseases. Such comparisons could detect discrepancies in the spatiotemporal patterns of the epidemics; it is then left to specialized investigations to seek substantive and biologically plausible explanations of these discrepancies.

Stochastic spatiotemporal modelling
Methodologically, the bubonic plague distribution across space-time is conceptualized as a stochastic epidemic system and the bubonic mortality rates are represented as spatiotemporal random fields (S/TRF). The geographical information technology is subsequently implemented to map these rates in the composite space-time domain of the epidemic. This is a synthetic methodology ( [10]) that provides the means to understand the nature of spacetime disease processes among populations of individuals.
Our understanding of past epidemics is uncertain because the information available is inconclusive and to some degree inaccurate (on occasion the available evidence appears contradictory). Therefore, in modern epidemic studies the S/TRF theory is the main tool of stochastic modelling. The S/TRF theory adequately contextualizes and evaluates the uncertainty sources concerning the space-time distribution of an epidemic. The values of a Temporal variation of the dT/dt-indicator for bubonic plague Figure 5 Temporal variation of the dT/dt-indicator for bubonic plague.
Temporal variation of the T-indicator for bubonic plague Figure 4 Temporal variation of the T-indicator for bubonic plague.
disease distribution are attributed to spatiotemporal points under conditions of uncertainty, and multiple realizations are considered that account for composite spacetime disease variations together with the associated probability models. A mathematical presentation of the S/TRF theory can be found in [12,13]. Before proceeding with our discussion of the theory, let us note that mortality distribution was selected as the space-time variable characterizing the bubonic plague epidemic in India, since the available information is related (directly or indirectly) to mortality. In particular, based on data availability, the monthly mortality rate is defined as M monthly = D monthly /TP = RD monthly , where D monthly , D yearly are the numbers of plague deaths per month and per year, respectively, TP is the total population and R is the ratio of yearly plague deaths over the total population, and can be represented as R = D yearly /TP. where the subscript KB denotes the knowledge base utilized to construct the pdf (the pdf depends explicitly on the points p 1 , p 2 , ..., but we do not explicitly show this for convenience). Stochastic modelling of disease distributions with complicated space-time patterns entails determining the S/TRF from a single realization. Trend-free (spatially homogeneous and temporally stationary) S/TRF have been assumed widely in health geographics, because they are efficient for explicit and numerical calculations. However, the homogeneous-stationary model is not the best option for phenomena with large and complicated space-time variabilities.
In light of these issues, a class of heterogeneous S/TRF has been proposed by [12] that is considerably more general than the restricted class of homogeneous-stationary S/ TRF. This new class is capable of handling complicated space-time variabilities of any size based on the following intuitive idea: The variability of an S/TRF can be characterized by means of its degree of departure from homogeneity and stationarity. This departure can be determined by a mathematical operation, in the following sense. Let Q ν/ µ be a space-time operator that transforms the S/TRF M p into a homogeneous-stationary field Y p by annihilating heterogeneities of degrees ν in space and µ in time, i.e., In this case, the mortality field M P is said to be an S/TRF with spatial and temporal heterogeneity orders ν and µ, respectively (S/TRF-ν /µ ; [12,14]). The S/TRF-ν /µ offers a  Figure 7 Temporal variation of the A-indicator for bubonic plague.

Temporal variation of the A-indicator for bubonic plague
Temporal variation of the dT 1/2 /dt-indicator for bubonic plague Figure 6 Temporal variation of the dT 1/2 /dt-indicator for bubonic plague.
general theoretical model of the mortality distribution that expresses the way causal influence is propagated in space-time and gives information about the bubonic plague dynamics at the scale of interest. For epidemic systems that evolve within domains containing complicated boundaries and trends, the departure of the random field from homogeneity-stationarity is expected to vary geographically. It is, thus, meaningful to construct local Q ν/µ -operators that generate homogeneous-stationary random fields Y p within local neighbourhoods Λ, instead of seeking global representations.
In practice, various choices of Q ν/µ -operators are possible, thus allowing considerable flexibility in the definition of the S/TRF-ν/µ. Indeed, the spatiotemporal trend of the mortality distribution M p inside each local neighborhood Λ of the epidemic system may involve various functions with space-time coordinates (polynomial, exponential, trigonometric etc. functions). For the purposes of the current bubonic plague study, an efficient Q ν/µ -operator is one that annihilates local heterogeneities involving composite space-time polynomials, P ν/µ , of degrees ν in space and µ in time. The parameters ν and µ provide a quantitative assessment of the rate of change of the functions modelling mortality patterns. E.g., the smaller the P ν/µ , degrees, the smaller the values of ν and µ The ν, µ also offer information about the stochastic model underlying the epidemic system. E.g., these parameters may determine how "far away" in space and "deep" in time the operator searches for information about the mortality field.

Knowledge bases
In epidemic studies one often needs to integrate multiple sources of information, depending on the situation. In this work, the following knowledge bases (KB) are considered: the general KB, (core knowledge about the disease), the specificatory KB, (site-specific hard and soft, instrument-and/or survey-based data), and the integration KB, (the union of the -and -KB).
In principle, the -KB accounts for existing scientific knowledge (theoretical models, epidemic laws etc.), but also other kinds of useful knowledge that are relevant to the epidemic system under consideration. In the present study, the scientific part of the -KB includes the S/TRFν/µ, theoretical model representing dependence of bubonic plague mortality across space and time (in terms of the heterogeneity orders and generalized covariance).
The areal-averaged monthly mortality rates of bubonic plague vs time Figure 9 The areal-averaged monthly mortality rates of bubonic plague vs time.
Temporal variation of the N-indicator for bubonic plague Figure 8 Temporal variation of the N-indicator for bubonic plague. Region-wide S/TRF parameters ν, and µ, offer insight into space-time epidemic variations across India. The calculation of the parameters in practice is made in terms of a computational procedure discussed in detail in previous publications (e.g., [13,15]). Maps showing the regionalized νµ, distribution over India (e.g., during the year 1903) are shown in Figure 10, thus providing information about the relative trends of mortality in space-time (ν -µ >0 implies higher degree spatial trends; νµ <0 implies higher degree temporal trends). Not only the spatial order ν changes in space, but the temporal order µ varies in space, as well. Spatial and temporal trends are interrelated, since the geographical propagation of the plague is also affected by temporal mechanisms. Theoretical κ X models assumed to express the space-time dependence of bubonic plague mortality are shown in Eq. (4) of Table 1.
The coefficients c, a ζ , b ρ and a ρζ , of the models (4) must satisfy the permissibility conditions of Table 2. The values   http://www.ij-healthgeographics.com/content/5/1/12 expresses the uncertainty due to extrapolation from monthly provincial to monthly city ratios (respectively, mortality) at a given year (respectively, during 1899-1906); is due to differences between the available yearly mortality city data and the mortality values approximated by means of the charts; and is the uncertainty due to differences between monthly mortality city values and the mean monthly mortality provincial values. The BME-ν/µ technique The geographical information technology involves a variety of advanced space-time mapping techniques (for details, see [17]). In this study, the BME-ν/µ, technique (Bayesian Maximum Entropy of order ν/µ,) has been employed, which has a number of distinguished features including the following [15,18]: -it accounts for the uncertainty features of the epidemic system in a composite space-time domain (unlike, e.g., spatial prediction techniques that focus on purely spatial domains); -it imposes no restriction on the shape of the probability laws and the predictor forms (non-Gaussian laws and non-linear predictors are automatically incorporated); -it assimilates a variety of core knowledge bases and site specific, multi-sourced datasets; and -it derives several previous techniques of epidemic modelling (statistical regression, spatial prediction, kriging etc.) as its limited cases.
In light of these important features, the mapping of the mortality distribution can be more informative and realistic than the maps generated by conventional methods that do not possess the theoretical and applied qualities of BME-ν/µ, (most of the conventional methods can use limited datasets, the linearity and normality restrictions apply etc.).
Within each neighborhood Λ of the geographical domain of the epidemic, the BME-ν/µ, mapping of the bubonic plague follows three knowledge synthesis stages: a. At the so-called structural stage, a pdf is con-   b. At the specificatory stage, the -KB considers hard and/or soft data (in the form of intervals and local probability distributions). Operationally, transformations of the following type are considered , where and D denote, respectively, a specificatory operator and information range determined by -KB. For illustration, Table 4 presents some examples of -KB together with the , D associated with the displayed soft data; is the cumulative distribution function derived from at the soft data points, I is a vector of mortality intervals at these points, and I k is a vector of interval values at the estimation points themselves.   where A is a normalization constant. From the integration pdf (6), various kinds of bubonic mortality estimates can be readily derived across space-time. E.g., the mean estimate, ; and the mode estimate, . The uncertainty of the mortality estimates may be assessed in terms of the mean squared estimation error, σ k , at each space-time mapping point.