Flu season in the United States takes between 3,000 and 49,000 lives annually. Because reliable forecasting methods are lacking, policymakers and public health officials cannot optimally prepare for these deadly epidemics.
With the advent of the internet, and consequently of global internet traffic datasets, it is now becoming feasible to build models capable of epidemiological forecasting.
This project examines the feasibility of building a model that predicts epidemiological trends, specifically the US influenza season, using a widely available free data source: Wikipedia pageview access logs.