Data Analysis

            The names of the districts in Karnataka changed over the time period I studied.  By using maps, I could follow the changes in the names of districts in the malaria data and find the corresponding districts in the climate data.  The names of the districts referred to in this study are the names given to these districts on a website published by the Government of India (1996).  I could not find Upper Krishna Project on any of the district maps, so I could not obtain any climate data to compare with the malaria data that the NMEP published for this area.  In addition, the ABER in this district was greater than 100%, which means that more blood slides were examined than there are people in the district.  For these reasons, I eliminated this district from my study despite the high malaria rates there.

            All statistical analyses for this study were done using the statistical software package Stata (1984-1997:  Computing Resource Center, 1640 Fifth Street, Santa Monica, California, USA).  For each district and each year, I calculated the average yearly mean temperature (the average of the January through December mean temperatures) and the total yearly precipitation (the sum of the January through December total precipitation).
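
            The two yearly summaries described above can be sketched in a few lines.  This is only an illustration in Python, not the Stata procedure used in the study; the function name and the monthly values are hypothetical.

```python
from statistics import mean


def yearly_climate_summary(monthly_temps, monthly_precip):
    """Return (average yearly mean temperature, total yearly precipitation).

    Each argument holds 12 values, January through December,
    for one district in one year.
    """
    if len(monthly_temps) != 12 or len(monthly_precip) != 12:
        raise ValueError("expected 12 monthly values")
    return mean(monthly_temps), sum(monthly_precip)


# Hypothetical monthly values for illustration only (degrees C, mm).
temps = [22.0, 24.0, 27.0, 29.0, 30.0, 27.0, 25.0, 25.0, 26.0, 26.0, 24.0, 22.0]
rain = [0.0, 0.0, 5.0, 20.0, 60.0, 110.0, 150.0, 120.0, 100.0, 90.0, 30.0, 5.0]
avg_temp, total_rain = yearly_climate_summary(temps, rain)
```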

            In order to assess the degree to which the malaria data and the climate data correlated, I performed linear regression analyses.  A linear regression estimates the degree to which the independent variable (a climatic variable) predicts the variability in the dependent variable (API or SPR).  The R² value can be interpreted as the fraction of the variability in the dependent variable (the specific malaria rate) that can be predicted by the independent variable (the specific climatic variable).  The p-value expresses the probability of observing a correlation at least as strong as the one found if chance alone were at work, that is, if there were no true relationship between the two variables.
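
            As a rough illustration of what a simple linear regression reports, the sketch below computes the least-squares slope, the intercept and R² for one predictor.  It is written in Python rather than Stata, the function name and data are hypothetical, and it leaves out the p-value, which requires the t distribution.

```python
def simple_linear_regression(x, y):
    """Least-squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared), where r_squared is the
    fraction of the variability in y predicted by x.
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least 3 paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical yearly values for one district (not data from this study).
mean_temp = [24.1, 24.6, 25.0, 25.3, 25.9]  # independent variable
api = [1.8, 2.1, 2.6, 2.4, 3.2]             # dependent variable
slope, intercept, r2 = simple_linear_regression(mean_temp, api)
```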

            For each district, I calculated a simple linear regression between average yearly mean temperature and API, and between average yearly mean temperature and SPR.  I did the same for total yearly precipitation and both malaria rates by district.  I then performed simple linear regression analyses for SPR and API with each monthly mean temperature and monthly precipitation for each district, in order to determine whether the mean temperature or precipitation of certain months predicted a large amount of the variability in API or SPR.  For November and December mean temperature and precipitation, I applied the linear regression models with a one-year time lag between the climate variable and the malaria prevalence rate, because changes in these variables are more likely to influence malaria rates in the next calendar year than in the current year.
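
            The one-year lag amounts to pairing each year's November or December climate value with the malaria rate of the following year before fitting the regression.  A minimal sketch of that pairing, with a hypothetical function name and hypothetical values:

```python
def pair_with_lag(climate_by_year, malaria_by_year, lag_years=1):
    """Pair the climate value of year t with the malaria rate of year t + lag.

    Both arguments are dicts keyed by calendar year.  Years without a
    matching malaria observation are skipped.
    """
    pairs = []
    for year in sorted(climate_by_year):
        target_year = year + lag_years
        if target_year in malaria_by_year:
            pairs.append((climate_by_year[year], malaria_by_year[target_year]))
    return pairs


# Hypothetical December mean temperatures and API values for illustration.
december_temp = {1990: 23.1, 1991: 22.4, 1992: 23.0}
api = {1990: 1.2, 1991: 1.5, 1992: 0.9}
lagged_pairs = pair_with_lag(december_temp, api)
```

            Here the 1990 temperature is paired with the 1991 API and the 1991 temperature with the 1992 API; the 1992 temperature is dropped because no 1993 API is available.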

            I assumed linearity in applying these regressions because the malaria data I used are not very sophisticated and are highly variable, and also because not enough is known about the relationship between malaria and climate to justify assuming non-linearity (Bradley, 2000, personal communication).  I tested this assumption, however, by fitting logarithmic, exponential, polynomial and power curves to some of the scatter plots of climate versus malaria, but none of these models increased the R² value above that found with the linear regression model.
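
            One common way to make such a comparison is to fit a straight line after a linearizing transformation of one or both variables and compare the resulting R² values.  The sketch below assumes that approach, which may differ from how the curves were fitted in this study; it omits the polynomial case, requires strictly positive values, and uses hypothetical names and data.  R² values computed on a log-transformed outcome are not strictly comparable with the untransformed one, so the comparison is only indicative.

```python
import math


def r_squared(x, y):
    """R-squared of the least-squares straight line through (x, y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy ** 2 / (sxx * syy)


def compare_curve_forms(x, y):
    """R-squared for four curve forms, each fitted as a straight line
    after the usual transformation.  All x and y must be positive."""
    log_x = [math.log(v) for v in x]
    log_y = [math.log(v) for v in y]
    return {
        "linear": r_squared(x, y),           # y = a + b*x
        "logarithmic": r_squared(log_x, y),  # y = a + b*ln(x)
        "exponential": r_squared(x, log_y),  # ln(y) = a + b*x
        "power": r_squared(log_x, log_y),    # ln(y) = a + b*ln(x)
    }


# Hypothetical climate and malaria values for illustration only.
precip = [410.0, 520.0, 610.0, 700.0, 830.0]
spr = [1.1, 1.9, 1.6, 2.8, 2.5]
fits = compare_curve_forms(precip, spr)
best_form = max(fits, key=fits.get)
```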

            I selected the climate variables that had statistically significant correlations (p-values of less than 0.10) in the largest number of districts and applied multiple linear regression models, with groups of these climate variables as the independent (predictor) variables and the malaria rates as the dependent (outcome) variables.  For each district, I found the combination of climatic variables that together gave the highest R² value for SPR and for API.  A multiple linear regression yields both an R² and an adjusted R² value.  The adjusted R² value corrects R² for the number of predictors, deducting the amount by which the predictive power of the regression is weakened by adding extra climate variables.  By looking at the individual p-value of the t-test for each climate variable, I could assess which variables were more detrimental than beneficial to the regression model.  I could then eliminate such variables one by one until the adjusted R² reached a maximum, that is, until removing any more variables would decrease it.
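
            For reference, the standard adjustment for n observations and k predictor variables is 1 - (1 - R²)(n - 1)/(n - k - 1).  The short sketch below applies that textbook formula; the function name and the numbers are hypothetical, and Stata's own output was what the study relied on.

```python
def adjusted_r_squared(r_squared, n_observations, n_predictors):
    """Standard adjusted R-squared for a model with an intercept."""
    residual_df = n_observations - n_predictors - 1
    if residual_df <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1 - (1 - r_squared) * (n_observations - 1) / residual_df


# Hypothetical example: 11 years of data for one district.
two_vars = adjusted_r_squared(0.50, n_observations=11, n_predictors=2)
three_vars = adjusted_r_squared(0.51, n_observations=11, n_predictors=3)
```

            In this made-up case the third variable raises R² from 0.50 to 0.51 but lowers the adjusted value, so it would be a candidate for removal.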

            Based on my assumption that malaria control efforts decouple the relationship between malaria and climate, I also performed linear regression analyses between API, SPR, and different climatic variables using only the years when malaria rates did not appear to be as strongly controlled by insecticides and other malaria control programs.  I also grouped some of the districts together by their API level over time and performed regression analyses with the climate variables that had the highest correlation, to assess whether grouping districts by malaria trend produced more significant correlations.  In addition, I looked at the yearly change in API and SPR by subtracting the previous year’s rate from that of the current year.  I then performed linear regression analyses between the yearly change in API or SPR and monthly mean temperature and monthly precipitation, to assess whether climate predicts the amount of change in malaria rates from year to year better than it predicts the actual rates.
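
            The yearly change described above is a first difference of the rate series.  A minimal sketch, with a hypothetical function name and hypothetical API values:

```python
def yearly_change(rate_by_year):
    """Current year's rate minus the previous year's rate.

    rate_by_year is a dict keyed by calendar year.  A year is included
    only if the preceding year is also present.
    """
    changes = {}
    for year in sorted(rate_by_year):
        if year - 1 in rate_by_year:
            changes[year] = rate_by_year[year] - rate_by_year[year - 1]
    return changes


# Hypothetical API values for one district.
api = {1990: 2.0, 1991: 3.5, 1992: 3.0, 1994: 4.0}
api_change = yearly_change(api)
```

            No change is computed for 1990, which has no preceding year, or for 1994, because 1993 is missing from this example.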

            I calculated Pearson’s correlation coefficient, which measures the extent to which two variables vary together, between SPR and API, and also between selected precipitation and mean temperature variables in order to assess the correlation among the different climate variables.
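
            Pearson's coefficient is the covariance of the two variables divided by the product of their standard deviations, and it ranges from -1 to 1.  The sketch below computes it directly; the function name and the SPR and API values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient between two paired series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need at least 2 paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("each series must vary")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical yearly SPR and API values for one district.
spr = [1.2, 1.8, 2.5, 2.1, 3.0]
api = [0.9, 1.6, 2.4, 1.7, 2.9]
r_spr_api = pearson_r(spr, api)
```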


 

Last Updated May 17, 2000