Zoonotic diseases are infectious diseases that can be transmitted from or through animals to humans, and arthropods often act as vectors for transmission. Emerging infectious diseases have increased in both prevalence and geographic range at alarming rates over the last 30 years, and the majority of these diseases are zoonotic in nature. Many zoonotic diseases are considered notifiable by the Centers for Disease Control and Prevention (CDC). Although state regulations or contractual obligations may require the reporting of certain diseases, significant underreporting is known to exist. Because of the rich volume of information captured in health insurance plan databases, administrative medical claims data could supplement the current reporting systems and allow for more comprehensive spatio-temporal analyses of zoonotic infections.
The purpose of this dissertation is to introduce electronic administrative medical claims data as a potential new data source that could be leveraged in ecological field studies for the surveillance of arthropod-borne zoonotic diseases. If using medical claims data to study zoonoses is a viable approach, it could improve both the temporal and spatial scale of study through long-term longitudinal data that cover large geographic expanses at the more geographically refined ZIP code scale. Additionally, claims data could supplement the current reporting of notifiable diseases to the CDC. This effort may help bridge the disease incidence gap created by health care providers' underreporting and thus allow for more effective tracking and monitoring of infectious zoonotic diseases across time and space.
I specifically examined five tick-borne diseases (Lyme disease [LD], babesiosis, ehrlichiosis, Rocky Mountain spotted fever [RMSF], and tularemia) and two mosquito-borne diseases (West Nile virus, La Crosse viral encephalitis) known to occur in the southeastern US. I first compared disease incidence rates from cases reported to the Tennessee Department of Health (TDH) state registry system with medically diagnosed cases captured in a southeastern managed care organization (MCO) claims data warehouse. I determined that LD and RMSF are significantly underreported in Tennessee. Three cases of babesiosis were discovered in the claims data, a significant finding because this disease has never been reported in Tennessee. Next, I used a cluster scan statistic to identify when (temporally) and where (spatially) these data sources differ. The findings show that the two data sources do not overlap in their significant clusters, supporting the need to integrate administrative and state registry data sources to provide a more comprehensive set of case information. Once the usefulness of administrative data was demonstrated, I focused on how these data could improve spatio-temporal macro-scale analyses by examining information at the ZIP code level as opposed to the traditional county level. I expanded on the current literature on spatially explicit modeling by employing more advanced data mining techniques. Four modeling techniques were compared (stepwise logistic regression, classification and regression tree, gradient boosted tree, and neural network) to describe the occurrence of tick-borne diseases in relation to socio-demographic, geographic, and habitat characteristics. The covariates most useful in explaining LD and RMSF were similar and included co-occurrences of RMSF and LD, respectively; the amount of forested and non-forested wetlands, pasture/grasslands, and urbanized/developed lands; population counts; and median income levels.
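The registry-versus-claims comparison described above can be illustrated with a minimal sketch. It is not the procedure used in this work: the class name, field names, and the choice of denominators (state population for the registry, enrolled plan members for the claims data) are assumptions made for illustration, and any numbers passed in are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class YearlyCounts:
    """Counts for one disease in one year (illustrative structure only)."""
    year: int
    registry_cases: int    # cases reported to the state registry
    claims_cases: int      # diagnosed cases found in the claims data
    state_population: int  # assumed denominator for the registry rate
    plan_members: int      # assumed denominator for the claims rate


def incidence_per_100k(cases: int, population: int) -> float:
    """Incidence rate expressed per 100,000 persons."""
    if population <= 0:
        raise ValueError("population must be positive")
    return cases * 100_000 / population


def rate_ratio(row: YearlyCounts) -> float:
    """Claims incidence divided by registry incidence.

    A ratio above 1 means the claims data show a higher rate than the
    registry, which is consistent with (but does not prove) underreporting.
    """
    registry = incidence_per_100k(row.registry_cases, row.state_population)
    claims = incidence_per_100k(row.claims_cases, row.plan_members)
    if registry == 0:
        # No registry cases: the ratio is undefined or unbounded.
        return float("inf") if claims > 0 else float("nan")
    return claims / registry
```

Because the two sources cover different populations, each rate is computed against its own denominator before the two are compared. A formal comparison would also need a significance test, which this sketch omits.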
Finally, I conclude with a ZIP code level spatio-temporal modeling exercise to identify areas and time periods in Tennessee where significant clusters of the studied diseases occurred. These ZIP code level clusters were compared with the previously defined county-level clusters to discuss the importance of spatial scale. The findings suggest that focused disease/vector prevention efforts in non-endemic areas are warranted.
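The text does not specify which cluster scan statistic was used. As a hedged illustration of the general idea, the sketch below assumes a Kulldorff-style Poisson scan statistic, in which each candidate window (a set of ZIP codes or counties over a time span) is scored by a log-likelihood ratio comparing observed and expected case counts. The function names and the window tuple layout are hypothetical.

```python
import math


def poisson_llr(cases_in: float, expected_in: float, total_cases: float) -> float:
    """Poisson log-likelihood ratio for one candidate window.

    Returns 0.0 when the window holds no more cases than expected,
    so only windows with elevated risk receive a positive score.
    """
    if not 0 < expected_in < total_cases:
        raise ValueError("expected_in must lie strictly between 0 and total_cases")
    if not 0 <= cases_in <= total_cases:
        raise ValueError("cases_in must lie between 0 and total_cases")
    if cases_in <= expected_in:
        return 0.0
    cases_out = total_cases - cases_in
    expected_out = total_cases - expected_in
    inside = cases_in * math.log(cases_in / expected_in)
    # The outside term vanishes when every case falls inside the window.
    outside = cases_out * math.log(cases_out / expected_out) if cases_out > 0 else 0.0
    return inside + outside


def most_likely_cluster(windows, total_cases, total_population):
    """Return (label, llr) for the highest-scoring window.

    `windows` is an iterable of (label, cases, population) tuples; the
    expected count assumes cases are spread in proportion to population.
    """
    best_label, best_llr = None, 0.0
    for label, cases, population in windows:
        expected = total_cases * population / total_population
        llr = poisson_llr(cases, expected, total_cases)
        if llr > best_llr:
            best_label, best_llr = label, llr
    return best_label, best_llr
```

The sketch only ranks candidate windows. Judging whether the top-ranked window is statistically significant typically requires Monte Carlo replication under the null hypothesis, which is not shown here. Running the same scoring over ZIP code windows and over county windows is one way to see how spatial scale changes which clusters emerge.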
Very little work exists using administrative claims data in the study of zoonotic diseases. This body of work thus contributes to an area in which little is known. Administrative medical claims data are relatively easy to access given the appropriate permissions, incur little or no cost once access is granted, and provide the researcher with a rich, high-volume dataset to study. This data source should be properly considered in the wildlife and biological sciences fields of research.