We chose first to characterise the environment around each household and then to average these characterisations for each hamlet. Characterising the landscape directly around hamlets might appear more straightforward, but it has two main drawbacks. First, it requires the hamlet to be explicitly represented as a geographic object, which is not easy and would involve arbitrary choices (a point corresponding to the barycentre of the hamlet dwellings, a surface or line corresponding to the convex hull of the dwellings, etc.). Second, the environment surrounding the hamlet can differ from that surrounding the individual dwellings and, in particular, the dwellings inhabited by children included in the cohort.
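The household-then-hamlet averaging described above can be illustrated with a minimal sketch. It assumes, for illustration only, that the land-cover map is available as a list of classified cell centres `(x, y, class)` in metric coordinates and that each household is a point labelled with its hamlet; the function names and data layout are hypothetical and are not taken from the study.

```python
from collections import defaultdict
from math import hypot


def class_proportion(house, cells, radius, land_class):
    """Share of cells of `land_class` among the cells whose centre lies
    within `radius` (same units as the coordinates) of the household."""
    hx, hy = house
    inside = [c for (x, y, c) in cells if hypot(x - hx, y - hy) <= radius]
    if not inside:
        return float("nan")  # no classified cell falls inside the buffer
    return sum(1 for c in inside if c == land_class) / len(inside)


def hamlet_means(households, cells, radius, land_class):
    """Average the household-level proportions within each hamlet.

    `households` is an iterable of (hamlet_name, (x, y)) pairs
    (hypothetical layout, chosen for this sketch)."""
    per_hamlet = defaultdict(list)
    for hamlet, house in households:
        per_hamlet[hamlet].append(
            class_proportion(house, cells, radius, land_class)
        )
    return {h: sum(v) / len(v) for h, v in per_hamlet.items()}
```

With this layout, no geometric definition of the hamlet itself is needed: a hamlet is simply the set of households that carry its label.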
From a methodological standpoint, we propose here a simple and general framework for objective and informative multivariate landscape characterisation when very little background knowledge is available on the processes involved. This is a frequent problem in landscape ecology, and solutions are often chosen arbitrarily, particularly when no ethological knowledge (e.g. dispersion capacity) is available to guide an objective choice. In this context, we initially proposed tools that consider the features of the environmental variables only: the multivariate variogram, which evaluates the capacity of the landscape characterisation to discriminate sites in geographic space, and the mean absolute Pearson correlation coefficient for pairs of environmental variables, which provides an indication of redundancy within the environmental variable space. However, the most original part of the data analysis methodology is the objective selection of the best landscape characterisation by means of a data-driven model selection procedure based on multiple linear regression and the Akaike information criterion.
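As an illustration of the redundancy indicator mentioned above, the following sketch computes the mean absolute Pearson correlation coefficient over all pairs of environmental variables. The data layout (a dictionary mapping a variable name to its values across sites) and the function names are assumptions made for this example.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson linear correlation coefficient between two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mean_abs_correlation(variables):
    """Mean of |r| over all pairs of variables.

    `variables` maps a variable name to its values across sites
    (hypothetical layout). Values close to 1 indicate strong redundancy."""
    pairs = list(combinations(variables.values(), 2))
    return sum(abs(pearson(x, y)) for x, y in pairs) / len(pairs)
```

Computing this indicator once per buffer radius gives a simple way to compare the redundancy of the candidate landscape characterisations.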
The proposed method is not restricted to the case of discoid buffers or to studies of malaria incidence. For instance, it could be applied to the parameterisation of IFM-like (incidence function model-like) measurements, as described by Moilanen and Nieminen. A comparable application of the proposed methodology was described by Roux et al. for selection of the most appropriate spatial weighted structure for modelling the presence and abundance of the insect vector of Chagas disease.
The results provided by the methodology may be sensitive to many factors other than the outcome variable (in this case, malaria incidence). The most important of these factors is the set of environmental variables used for landscape characterisation. The preprocessing of the environmental data may also affect the results. In our case, a logarithmic transformation was applied, and the results were compared with those obtained with a square-root transformation. No significant difference was found in model structures, but model accuracy and Pearson correlation coefficients were poorer with the square-root transformation.
In the context of our application, the first steps of the proposed methodology tended to eliminate large buffers (radius > 400 m), which gave poor spatial discrimination of hamlets and displayed high levels of information redundancy between environmental variables. The data-driven model selection procedure was then used to identify the optimal observation horizons: 100 m and 400 m buffers were found to be the most appropriate for characterising the environment when considering P. vivax and P. falciparum malaria incidences, respectively.
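The principle of selecting an observation horizon by the Akaike information criterion can be sketched as follows. This is a simplified illustration and not the exact procedure of the study: it assumes one candidate predictor matrix per buffer radius (for example, principal component scores), fits a single ordinary least-squares model per radius with NumPy, and keeps the radius with the lowest AIC. The function names and the parameter count `k` (regression coefficients plus the error variance) are choices made for this sketch.

```python
import numpy as np


def ols_aic(X, y):
    """AIC of an ordinary least-squares fit with intercept, assuming
    Gaussian errors. Additive constants shared by all models fitted to
    the same response are dropped."""
    n = len(y)
    design = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ beta) ** 2))
    k = design.shape[1] + 1  # coefficients (incl. intercept) + error variance
    return n * np.log(rss / n) + 2 * k


def select_radius(predictors_by_radius, y):
    """Return the buffer radius whose model has the lowest AIC, together
    with the AIC of every candidate.

    `predictors_by_radius` maps a radius to an (n_sites, n_predictors)
    array (hypothetical layout)."""
    aics = {r: ols_aic(X, y) for r, X in predictors_by_radius.items()}
    return min(aics, key=aics.get), aics
```

Because every candidate model is fitted to the same response vector, the AIC values are directly comparable across radii. A fuller procedure would also compare subsets of predictors within each radius.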
The first four principal components (PCs) of the PCA of environmental variables were included in the selected P. falciparum regression model. The hamlets were thus similarly structured (or ordered) both environmentally and epidemiologically. There was therefore a strong link between environmental features and the incidence attributed to this malaria species. By contrast, the association between P. vivax malaria incidence and environmental characteristics seemed to be weaker.
In Pearson's linear correlation analysis, the proportion of bare soil within the 400 m buffer zone was found to be associated with protection against P. falciparum malaria. This land-cover feature is not favourable for the resting of adult mosquitoes or for the maintenance of breeding sites. It was closely linked to the density of dwellings, which was also found to be predictive of the incidence of P. falciparum malaria. Children living in isolated houses therefore had an increased risk of P. falciparum malaria. The proportions of primary forest and high vegetation were correlated with a higher incidence of P. falciparum malaria. This finding is consistent with previously published results [18, 19, 53]. According to Tadei and coworkers, An. darlingi returns to the forest after feeding when houses are located close to forest [54–57]. The composition of the high vegetation class requires confirmation in the field, but it includes plants, shrubs and relatively small trees, in contrast to primary forest, and lies at the interface of crop areas and secondary or primary forest. It may correspond to the vegetation present at least five years after deforestation described by Olson et al. Moreover, the length of creeks was positively correlated with P. falciparum incidence, whereas the length of river banks was negatively correlated with this incidence. Thus, vector breeding sites are probably located mostly along small streams (creeks) rather than along the banks of the main river. In addition, deep water appeared to be a protective factor against malaria, probably because it provides neither suitable breeding sites nor resting sites for adult mosquito vectors. This counterbalances the contribution of short distance to the main river as a risk factor for transmission and justifies further investigation of the role of river banks in the development of breeding sites for Anopheles.
The percentage of burnt land was negatively correlated with malaria incidence. However, this land use is very transitory in space and time: over a period of a few months, it replaces primary forest, secondary forest or high vegetation and precedes soils with poor vegetation cover and low vegetation. Traditionally, Amerindians in Camopi burn their crop lands from the middle to the end of the dry season (i.e. from the end of August to the end of November). At the time the image was taken, the burnt lands were linked to villages with a low malaria incidence. We therefore suspect that there may be confounding factors linked to spatial distribution, as burning activity did not occur at the same time in all the hamlets. It is therefore not possible to determine the real effect of burning.
Landscape division within 100 and 400 m buffers was associated with higher incidences of P. vivax and P. falciparum malaria, respectively. Greater fragmentation of the landscape was therefore associated with a higher incidence of malaria, suggesting that anthropogenic presence and activity, which tend to increase landscape fragmentation and ecological changes, probably increase malaria incidence by favouring the presence and development of malaria [57, 59–61].
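For readers unfamiliar with this type of fragmentation metric, the sketch below assumes that "landscape division" refers to the landscape division index in the sense of Jaeger, as implemented in common landscape metric software: one minus the sum of the squared patch-area shares within the buffer. This assumption, the function name and the input layout are illustrative and are not confirmed by the text.

```python
def landscape_division(patch_areas):
    """Landscape division index D = 1 - sum((a_i / A)^2), where a_i are
    the patch areas inside the buffer and A is their total area.

    D is 0 for a single patch and approaches 1 as the buffer is divided
    into many small patches (assumed definition, see lead-in)."""
    total = sum(patch_areas)
    if total <= 0:
        raise ValueError("total patch area must be positive")
    return 1.0 - sum((a / total) ** 2 for a in patch_areas)
```

Under this definition, a buffer covered by one continuous patch has a division of 0, whereas a buffer split into two equal patches has a division of 0.5, so higher values correspond to a more fragmented landscape.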
The classification was derived from an image taken in the dry season. Owing to the topography of the study area, and particularly of the river banks, the water level of the main rivers (Camopi and Oyapock) does not influence river bank positions to an extent that could be detected on optical images with a spatial resolution of 10 metres (except during extreme and unrepresentative events). However, some rocks appear in the rivers during low water level periods. They could increase the proportion of bare soil in some buffers, but only to a negligible extent. On the other hand, Camopi is located in humid tropical forest, and the dense and almost permanent cloud cover in the rainy season prevents us from obtaining exploitable optical images during this period. In such a context, high-resolution radar images could provide useful information.
The links between clinical bouts of malaria and the periods and sites of contamination are simpler and more direct in young children. Indeed, this population has little specific immunity (especially the youngest children) and its exposure is limited to the dwelling or to the village, depending on age. Furthermore, in this study, malaria data were collected by following up an exhaustive cohort in a "captive" general population (i.e. all the children are followed from birth, with diagnosis occurring at only one place and unlimited access to diagnosis). However, although the environment directly accounts for the abundance of the vector and, thus, the sites and extent of transmission, it cannot entirely account for the clinical data registered at the health centre, even for the children in Camopi. Several biases must be taken into account, such as i) individual genetic susceptibility to malaria and its clinical expression; ii) the protective measures used (nets, repellents, etc.) and iii) whether consultation at the health centre was systematic for the diagnosis of all episodes of fever (self-medication, traditional treatment, etc.). A difference in genetic susceptibility between the two ethnic groups has been reported. More than 75% of the children of the cohort spend all their nights under mosquito nets and more than 70% of the families use insecticides or topical repellents (personal results). Finally, we assumed that all bouts of malaria were recorded at the local health centre, owing to the isolation of the population and its limited mobility. Moreover, with the rule chosen for identifying P. vivax relapses, some false-negative and false-positive new P. vivax infections may remain in the database. A bias in the exclusion of relapses and, thus, in the quality of P. vivax data might account for the weak link between environmental data and malaria incidence for this species.
The behavioural habits of the families, such as the use of bed nets or insecticides, could have introduced a bias into the analysis. Nevertheless, in a multivariate Cox modelling approach, these variables were not risk factors for malaria attacks in children. Consequently, in the present study, we decided not to take these parameters into account.
Anopheles darlingi has not been implicated with certainty in the bouts of malaria occurring in Camopi, but this study nonetheless focused on young children, based on a hypothesis of night-time transmission at home, owing to the characteristics of An. darlingi. However, other studies have reported An. darlingi to be active 24 hours per day and to be found outside during the day, suggesting that some transmission may occur in places frequented by children during the day. Furthermore, other anopheline species may be involved in malaria transmission, including during the morning [4, 62]. The age composition of the An. darlingi population may depend on season and environment [63, 64]. Thus, the involvement of another anopheline species, or of different populations of An. darlingi, in P. vivax transmission and in P. falciparum transmission may account for the weak relationship between the environment and P. vivax incidence at the peridomestic scale of observation.
There are limitations to the usefulness of RS for epidemiology, but this tool has several advantages: it objectively characterises the landscape features associated with malaria incidence and makes it possible to assess the sensitivity of the results to buffer size. It also provides access to past information and can be used for mapping and spatial analysis, which are useful for control measures. Furthermore, RS may provide additional information not collected by field surveys, in which observations are limited to short distances.
Our results suggest that the use of buffers of 100 and 400 m around houses is the most appropriate, in this specific case, for demonstrating a characteristic land-cover pattern accounting for differences in incidence rates as a function of the species concerned. This is greater than the radius of observation that can be covered by the human eye in the field. On the basis of this modelling, it is possible to establish a predictive map of P. falciparum malaria risk in Camopi. However, for this to be achieved correctly, we must consider not only buffer-based landscape features, but also factors such as distance to each land-cover class, and non-environmental data, such as the socio-economic and behavioural characteristics of the local populations.