Changing conditions + static modeling = less accuracy
Lessons to learn & a research code of ethics
Google Flu Trends: three questions to ask
1. Was the original Google Flu Trends model flawed?

The original Google Flu Trends methodology and validation were sound. The method began by examining nine different locations in the US and comparing all Google searches from those areas with the proportion of physician visits from patients displaying influenza-like illness, as reported by the CDC with a lag of some weeks. By examining the weekly time series of search data and physician data from 2003-2007, the researchers were able to learn which of 50 million candidate search terms correlated strongly with the flu. The findings could fairly be judged robust, since they were based on a long time period and on nine different locations; the more widely applicable a finding is, the better.

The selection of search terms for the model illustrates one of the first stumbling blocks in Big Data research: when confronted with a large number of variables, it is possible to find seemingly significant correlations that occur purely by chance (a point made very well by Nassim Taleb). The authors were careful to compile a list of terms that not only predicted the data well but were also logically connected with a flu outbreak. Crucially, however, the exact terms were never published, making the findings unreproducible.
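The term-selection step can be sketched as ranking candidate queries by how well their weekly time series correlates with the CDC's ILI series. Everything below is illustrative: the term names, the synthetic seasonal signal, and the tiny candidate list stand in for the real study's ~50 million terms and nine regions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins: a seasonal ILI rate and weekly query
# fractions for a handful of candidate search terms.
weeks = 200
ili = np.sin(np.linspace(0, 8 * np.pi, weeks)) + 1.5  # seasonal flu signal

terms = {
    "flu symptoms": ili + rng.normal(0, 0.2, weeks),    # genuinely related
    "cough medicine": ili + rng.normal(0, 0.4, weeks),  # related, noisier
    "basketball scores": rng.normal(1.5, 0.5, weeks),   # spurious candidate
}

# Rank candidates by Pearson correlation with the ILI series; the real
# pipeline additionally required a plausible logical link to the flu,
# precisely to filter out chance correlations like the third term.
scores = {term: np.corrcoef(series, ili)[0, 1] for term, series in terms.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

With a large enough pool of unrelated terms, some would score highly by luck alone, which is why the correlation ranking alone is not enough.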
2. Could the problems with Google Flu Trends have been avoided with better testing of the model?

The model testing was robust. Once the Google researchers had found the search terms that best fit the data, they tested their model on new, unseen data from 2007-08; this kind of 'blind test' is known as a validation set in machine learning. It guards against another pitfall of models that learn from data: fitting too closely to the known data (overfitting), so that the model performs poorly when applied to new data that looks slightly different.
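The validation idea is simple to sketch: fit only on the earlier period, then measure error on a held-out later period the model never saw. The data here is synthetic and the linear model is a stand-in for the real fitted relationship between query volume and ILI rates.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical weekly data: a query-based signal x and the ILI rate y.
n_train = 260  # five years of weeks, standing in for 2003-2007
x = rng.uniform(0, 1, n_train)
y = 2.0 * x + 0.5 + rng.normal(0, 0.05, n_train)

# Fit on the training period only.
slope, intercept = np.polyfit(x, y, 1)

# 'Blind test' on unseen weeks drawn from the same process,
# standing in for the 2007-08 validation season.
x_val = rng.uniform(0, 1, 52)
y_val = 2.0 * x_val + 0.5 + rng.normal(0, 0.05, 52)
rmse = np.sqrt(np.mean((slope * x_val + intercept - y_val) ** 2))
print(round(rmse, 3))
```

A small held-out error suggests the model generalises; a model overfit to the training weeks would show a much larger gap between training and validation error.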
3. How could Google Flu Trends have been deployed differently to continue making accurate predictions?

One mistake was to leave the Google Flu Trends algorithm unchecked for several years. The online landscape is dynamic: the way online services are used can change considerably in the space of a few years, and Google constantly makes changes to its platform. Such findings should therefore not be treated as fundamental and unchanging, like those of physics or mathematics. Instead, the model should have been systematically re-calibrated to ensure it was still making accurate predictions (tools such as the Kalman filter allow continuous updating of the parameters). Perhaps gradual underlying changes over time, such as growth in overall search volume, could have been incorporated into the model with an extra linear term. One-off changes to the underlying search engine algorithm or platform, however, would likely require a complete recalibration.
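The re-calibration idea can be sketched with a scalar Kalman filter that tracks a slowly drifting regression coefficient week by week, rather than freezing it after the initial fit. The drifting slope, noise levels, and data below are all illustrative assumptions, not figures from the actual system.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical setup: the true relationship between search volume x
# and flu incidence y drifts slowly as user behaviour changes.
n = 300
true_beta = 2.0 + np.cumsum(rng.normal(0, 0.02, n))  # slowly drifting slope
x = rng.uniform(0.5, 1.5, n)
y = true_beta * x + rng.normal(0, 0.1, n)

# Scalar Kalman filter updating the slope estimate each week.
beta_hat, P = 0.0, 1.0      # initial slope estimate and its variance
Q, R = 0.02 ** 2, 0.1 ** 2  # assumed process / observation noise variances
for t in range(n):
    P += Q                                     # predict: slope may have drifted
    K = P * x[t] / (x[t] ** 2 * P + R)         # Kalman gain for this week
    beta_hat += K * (y[t] - beta_hat * x[t])   # correct with the new observation
    P *= 1 - K * x[t]                          # updated estimate variance

print(round(beta_hat, 2), round(true_beta[-1], 2))
```

Because the filter keeps adapting, it follows gradual drift in the relationship; a sudden one-off platform change would still violate its assumptions and call for a full re-fit, as noted above.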