Data Analysis
The names of the districts have
changed in Karnataka over the time period I studied.
By using maps, I could follow the changes in the names of districts in the
malaria data and find the corresponding districts in the climate data.
The names of the districts referred to in this study are the names given
to these districts in a website published by the Government of India (1996).
I could not find Upper Krishna Project on any of the district maps, so I
could not get any climate data to compare with the malaria data that the NMEP
published for this area. Also, the
ABER in this district was greater than 100%, which means that more blood slides
were examined than there are people in the district.
For these reasons, I decided to exclude this district from my
study despite the high malaria rates there.
All statistical analyses for this study were done using the statistical
software package, Stata (1984-1997: Computing
Resource Center, 1640 Fifth Street, Santa Monica, California, USA). I calculated the average yearly mean temperature (average of
January through December mean temperatures for each year for each district) and
the total yearly precipitation (sum of the January through December total
precipitation for each year and each district).
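
The yearly aggregation described above can be sketched in Python as follows. This is only an illustration, not the Stata procedure used in the study: the function name, the tuple layout, and the sample values are hypothetical, and skipping years with fewer than twelve months of data is an assumption made for the sketch.

```python
from statistics import mean


def yearly_climate_summary(monthly_records):
    """Collapse monthly records into one entry per (district, year).

    monthly_records: iterable of (district, year, month, mean_temp, precip).
    Returns {(district, year): (average_yearly_mean_temp, total_yearly_precip)}.
    """
    grouped = {}
    for district, year, _month, temp, precip in monthly_records:
        grouped.setdefault((district, year), []).append((temp, precip))

    summary = {}
    for key, values in grouped.items():
        if len(values) != 12:
            # Assumption: skip incomplete years instead of averaging over gaps.
            continue
        temps = [t for t, _ in values]
        precips = [p for _, p in values]
        # Average of the twelve monthly means, sum of the twelve monthly totals.
        summary[key] = (mean(temps), sum(precips))
    return summary


# Made-up data: one district, one year, 25.0 degrees and 10.0 mm every month.
records = [("District A", 1990, month, 25.0, 10.0) for month in range(1, 13)]
summary = yearly_climate_summary(records)
```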
In order to assess the degree to which
the malaria data and the climate data correlated, I performed linear regression
analyses. A linear regression
calculates the degree to which the independent variable (a climatic variable)
predicts the variability in the dependent variable (API or SPR).
The R2 value can be interpreted as the fraction of the
variability in the dependent variable (the specific malaria rate) that can be
predicted by the independent variable (the specific climatic variable).
The p-value expresses the probability of observing a correlation at least this
strong by chance alone if no true relationship exists.
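
To make the quantities above concrete, here is a minimal Python sketch of a simple linear regression that returns the slope, the intercept, the R2 value, and the two-sided p-value for the slope. It is an illustration only and not the Stata output used in the study: the function names and the sample numbers are hypothetical, and the p-value is obtained by numerically integrating the Student's t density, a convenience chosen here to avoid external libraries.

```python
import math


def t_two_sided_p(t_stat, df, steps=2000):
    """Two-sided p-value for Student's t via Simpson's rule on the density."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return coef * (1 + x * x / df) ** (-(df + 1) / 2)

    upper = abs(t_stat)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    area = pdf(0) + pdf(upper)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * pdf(i * h)
    area *= h / 3  # probability mass between 0 and |t|
    return max(0.0, 1 - 2 * area)


def simple_regression(x, y):
    """Least-squares fit of y on x: (slope, intercept, R2, p-value of slope)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    r2 = 1 - ss_res / ss_tot
    if ss_res == 0:
        return slope, intercept, r2, 0.0  # perfect fit
    std_err = math.sqrt(ss_res / (n - 2) / sxx)
    p_value = t_two_sided_p(slope / std_err, n - 2)
    return slope, intercept, r2, p_value


# Made-up numbers for illustration only (not data from the study).
temperature = [24.1, 24.8, 25.3, 25.9, 26.4]
api = [2.0, 2.6, 3.1, 3.3, 4.1]
slope, intercept, r2, p_value = simple_regression(temperature, api)
```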
I calculated a simple linear regression between average yearly mean
temperature and API and average yearly mean temperature and SPR, for each
district. I did the same for total yearly precipitation and both
malaria rates by district. I then
performed simple linear regression analyses for SPR and API with each monthly
mean temperature and monthly precipitation for each district, in order to
determine if the mean temperature or precipitation of certain months predicted a
large amount of the variability in API or SPR.
I applied linear regression models for November and December mean
temperature and precipitation with a one year time lag between the climate
variable and the malaria prevalence rate because changes in these variables are
more likely to influence malaria rates in the next calendar year than in the
current year.
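
The one-year lag amounts to pairing the climate value of year t with the malaria rate of year t + 1 before running the regression. A minimal Python sketch, with a hypothetical function name and made-up values:

```python
def pair_with_lag(climate_by_year, malaria_by_year, lag=1):
    """Pair the climate value of year t with the malaria rate of year t + lag."""
    pairs = []
    for year in sorted(climate_by_year):
        target_year = year + lag
        if target_year in malaria_by_year:
            pairs.append((climate_by_year[year], malaria_by_year[target_year]))
    return pairs


# Made-up values: December mean temperature and API for one district.
december_temp = {1990: 22.1, 1991: 21.8, 1992: 22.5}
api = {1990: 3.0, 1991: 3.4, 1992: 2.9, 1993: 3.8}
pairs = pair_with_lag(december_temp, api)
```

The resulting pairs can then be passed to any regression routine in place of the same-year pairs.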
I assumed linearity in applying these regressions because the malaria data I used
are not very sophisticated and are highly variable, and also because not enough
is known about the relationship between malaria and climate to justify assuming
non-linearity (Bradley, 2000, personal communication).
I tested this assumption, however, by applying logarithmic, exponential,
polynomial and power curves to some of the scatter plots of climate versus
malaria, but none of these models increased the R2 value over that
found with a linear regression model.
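
One way to run such a comparison is sketched below in Python. This is an assumption about how the check could be done, not a record of how it was done in the study: the logarithmic, exponential, and power curves are fitted by least squares on log-transformed data, R2 is computed on the original scale for every curve so the values are comparable, and the polynomial fit is omitted for brevity. All names and numbers are hypothetical, and the inputs must be strictly positive.

```python
import math


def fit_line(x, y):
    """Least-squares slope and intercept of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    return slope, mean_y - slope * mean_x


def r_squared(y, predicted):
    """1 - SS_res / SS_tot on the original scale of y."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, predicted))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1 - ss_res / ss_tot


def compare_curves(x, y):
    """R2 for four curve shapes; x and y must be strictly positive."""
    log_x = [math.log(v) for v in x]
    log_y = [math.log(v) for v in y]
    results = {}
    b, a = fit_line(x, y)          # y = a + b * x
    results["linear"] = r_squared(y, [a + b * v for v in x])
    b, a = fit_line(log_x, y)      # y = a + b * ln(x)
    results["logarithmic"] = r_squared(y, [a + b * v for v in log_x])
    b, a = fit_line(x, log_y)      # ln(y) = a + b * x
    results["exponential"] = r_squared(y, [math.exp(a + b * v) for v in x])
    b, a = fit_line(log_x, log_y)  # ln(y) = a + b * ln(x)
    results["power"] = r_squared(y, [math.exp(a + b * v) for v in log_x])
    return results


# Made-up, exactly linear data: the linear model should score highest.
results = compare_curves([1, 2, 3, 4, 5], [3, 5, 7, 9, 11])
best_model = max(results, key=results.get)
```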
I selected the climate variables
that had a statistically significant correlation (p-value less than 0.10) in
the largest number of districts and applied multi-variable linear regression
models with groups of these climate variables as the independent (predictor)
variables and the malaria rates as the dependent (outcome) variables.
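
The selection step can be illustrated with a short Python sketch that counts, for each climate variable, the number of districts in which its p-value falls below 0.10. The function name, the district and variable labels, and the p-values are all made up for illustration.

```python
from collections import Counter


def rank_variables(p_values, alpha=0.10):
    """Count, per climate variable, the districts where p < alpha."""
    counts = Counter()
    for (_district, variable), p in p_values.items():
        if p < alpha:
            counts[variable] += 1
    # Most frequently significant variables first.
    return counts.most_common()


# Made-up p-values keyed by (district, climate variable).
p_values = {
    ("District A", "June precipitation"): 0.03,
    ("District B", "June precipitation"): 0.08,
    ("District A", "March mean temperature"): 0.04,
    ("District B", "March mean temperature"): 0.40,
}
ranking = rank_variables(p_values)
```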
I found the combination of climatic variables that together gave the
highest R2 value for SPR and for API in each district.
When doing a multiple linear regression, both an R2 and an adjusted R2
value are calculated. The adjusted
R2 value penalizes R2 for each added climate variable, so it increases only
when a new variable improves the fit more than would be expected by chance.
By looking at the individual p-value of the t-test for each climate
variable, I could assess which variables were more detrimental than beneficial to
the regression model. I could then eliminate such variables one by one until
the adjusted R2 reached a maximum, that is, until removing any further variable
would decrease it.
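
The elimination idea can be sketched in Python as below. Two assumptions differ from the procedure described above and should be read as such: the sketch uses the adjusted R2 itself as the criterion for dropping a variable, rather than the t-test p-values, and it uses the standard formula 1 - (1 - R2)(n - 1)/(n - k - 1) for n observations and k predictors. The plain R2 cannot serve as the criterion because it never rises when a predictor is removed. All names and the toy data are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def r_squared(columns, y):
    """R2 of an ordinary least-squares fit of y on the predictor columns."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]  # intercept
    k = len(rows[0])
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(k)] for a in range(k)]
    xty = [sum(r[a] * yi for r, yi in zip(rows, y)) for a in range(k)]
    beta = solve(xtx, xty)  # normal equations
    fitted = [sum(bj * rj for bj, rj in zip(beta, r)) for r in rows]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


def adjusted_r_squared(r2, n, k):
    """Penalize R2 for using k predictors on n observations."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


def backward_eliminate(predictors, y):
    """Drop one predictor at a time for as long as adjusted R2 increases."""
    current = dict(predictors)
    n = len(y)

    def score(subset):
        return adjusted_r_squared(r_squared(list(subset.values()), y), n, len(subset))

    best = score(current)
    while len(current) > 1:
        trials = {
            name: score({k: v for k, v in current.items() if k != name})
            for name in current
        }
        candidate = max(trials, key=trials.get)
        if trials[candidate] <= best:
            break  # removing any further variable would not help
        best = trials[candidate]
        del current[candidate]
    return list(current), best


# Toy data: y depends on x1 only; x2 carries no information about y.
predictors = {"x1": [1, 1, -1, -1], "x2": [1, -1, -1, 1]}
y = [2.5, 1.5, -1.5, -2.5]
kept, best_adjusted_r2 = backward_eliminate(predictors, y)
```

In this toy case both models have the same R2, so the adjusted R2 is higher for the model without x2 and that variable is dropped.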
Based on my assumption that malaria control
efforts decouple the relationship between malaria and climate, I also performed
linear regression analyses of the correlation between API, SPR, and different
climatic variables using only the years when malaria rates appeared to be less
controlled by insecticides and other malaria control programs.
I also grouped some of the districts together by their API level over
time and performed regression analyses with the climate variables that had the
highest correlation to assess whether grouping districts by malaria trend
produced a more significant correlation. I
also looked at the yearly change in API and SPR by subtracting the
previous year’s rate from that of the current year.
I then performed linear regression analyses between the yearly change in
API or SPR and monthly mean temperature and monthly precipitation to assess
whether climate predicts the amount of change in malaria rates from year to year
better than it predicts actual rates.
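
The yearly change is a first difference of the malaria rate. A minimal Python sketch follows, with a hypothetical function name and made-up values. Dropping any year whose previous year is missing is an assumption of the sketch.

```python
def yearly_change(rate_by_year):
    """Current year's rate minus the previous year's rate.

    Years without data for the immediately preceding year are skipped.
    """
    return {
        year: rate_by_year[year] - rate_by_year[year - 1]
        for year in sorted(rate_by_year)
        if year - 1 in rate_by_year
    }


# Made-up API values; 1993 is missing, so no change is computed for 1994.
api = {1990: 3.0, 1991: 4.5, 1992: 2.5, 1994: 5.0}
changes = yearly_change(api)
```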
I calculated Pearson’s correlation coefficient, which measures the
extent to which two variables vary together, between SPR and API and
between selected precipitation and mean temperature variables, the latter to
assess the correlation between different climate variables.
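
For reference, Pearson's correlation coefficient can be computed as in the following Python sketch. The function name and the sample values are hypothetical, and the code is an illustration rather than the Stata routine used in the study.

```python
import math


def pearson_r(x, y):
    """Pearson's r: covariance of x and y scaled by both standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    return covariation / math.sqrt(spread_x * spread_y)


# Made-up values: two series that rise together, and two that move oppositely.
r_together = pearson_r([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
r_opposite = pearson_r([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
```

A value near 1 or -1 between two climate variables indicates that they carry largely the same information, which matters when both are entered into one multi-variable model.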