Knowing who has the flu is hard. Doctors assess the small minority of sick people who make it into a clinic and send the test results to the CDC every few weeks. The CDC models this information and retroactively corrects its national influenza-like illness (ILI) report every two weeks.
Knowing who’s going to have the flu in the future is much, much harder. An outbreak in one region can spread to surrounding areas. Factors such as vaccination and hand washing dramatically affect how quickly illness spreads within a region. Sick people can travel from one region and infect another. Modeling the spread of disease is so difficult that the CDC holds an annual contest to improve its models. Many teams have attempted this, but an effective solution must apply modern analytical techniques to real-time illness signals.
Expero built a long-term forecast that models the spread of influenza-like illness across the United States using a cutting-edge deep learning model. To test the accuracy of this and other models, we built a recursive temporal validation suite, which looks back in time to measure model outputs against known historical disease spread.
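One common way to look back in time like this is a rolling-origin backtest: pick a cutoff date, fit on everything before it, forecast past it, and compare against what actually happened, then repeat for later cutoffs. The sketch below is a minimal illustration of that idea and not necessarily how Expero's suite is built. The function names, the mean-absolute-error metric, and the naive baseline forecaster are all assumptions made for the example.

```python
from statistics import mean
from typing import Callable, Sequence

# Hypothetical forecaster signature: (history, horizon) -> predicted values.
Forecaster = Callable[[Sequence[float], int], list]


def naive_last_value(history: Sequence[float], horizon: int) -> list:
    """Illustrative baseline: repeat the last observed value."""
    return [history[-1]] * horizon


def rolling_origin_backtest(
    series: Sequence[float],
    forecaster: Forecaster,
    horizon: int,
    min_train: int,
) -> float:
    """Average forecast error over every historical cutoff.

    For each cutoff, the forecaster only sees data before the cutoff,
    and its output is scored against the values that followed.
    """
    errors = []
    for cutoff in range(min_train, len(series) - horizon + 1):
        train = series[:cutoff]                      # known at the cutoff
        actual = series[cutoff:cutoff + horizon]     # what really happened
        predicted = forecaster(train, horizon)
        # Mean absolute error for this cutoff (assumed metric).
        errors.append(mean(abs(p - a) for p, a in zip(predicted, actual)))
    return mean(errors)


# Toy weekly illness series, invented for illustration only.
weekly_ili = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
score = rolling_origin_backtest(weekly_ili, naive_last_value, horizon=2, min_train=3)
```

Because every candidate model is scored the same way on the same cutoffs, the resulting numbers can be compared directly across architectures.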
Expero then built the ML Ops system that deploys this model on a cloud architecture that scales to the needs of Kinsa’s business. The system automatically trains and deploys new models every time it detects a noteworthy change in the incoming data streams, running the full pipeline of data aggregation, processing, model training, and model deployment.
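As a rough sketch of what such a retraining trigger might look like, the example below compares a recent window of incoming readings against a baseline window and runs the pipeline steps in order only when the shift is large. The z-score test, the threshold of 3.0, and every function and step name here are assumptions for illustration; the article does not describe how the production system defines a "noteworthy change".

```python
from statistics import mean, stdev
from typing import Callable, Sequence


def has_noteworthy_change(
    baseline: Sequence[float],
    recent: Sequence[float],
    z_threshold: float = 3.0,  # assumed threshold, not from the article
) -> bool:
    """Flag a change when the recent mean drifts far from the baseline mean."""
    base_sd = stdev(baseline)
    if base_sd == 0:
        # Flat baseline: any movement at all counts as a change.
        return mean(recent) != mean(baseline)
    z = abs(mean(recent) - mean(baseline)) / base_sd
    return z > z_threshold


def maybe_retrain(
    baseline: Sequence[float],
    recent: Sequence[float],
    pipeline_steps: Sequence[tuple],
) -> list:
    """Run every (name, step) pair in order if a change is detected.

    Returns the names of the steps that ran, or an empty list.
    """
    if not has_noteworthy_change(baseline, recent):
        return []
    executed = []
    for name, step in pipeline_steps:
        step()
        executed.append(name)
    return executed


# Hypothetical no-op steps standing in for the real pipeline stages.
steps = [
    ("aggregate", lambda: None),
    ("process", lambda: None),
    ("train", lambda: None),
    ("deploy", lambda: None),
]
baseline_readings = [10.0, 11.0, 9.0, 10.0, 10.0]
quiet = maybe_retrain(baseline_readings, [10.0, 10.5], steps)
shifted = maybe_retrain(baseline_readings, [20.0, 21.0], steps)
```

Keeping the detection check separate from the pipeline runner means the definition of "noteworthy" can change without touching the training and deployment steps.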
Health care providers run out of flu vaccine and flu treatment medicines all the time. These supplies are perishable and expensive. This means people, especially those in economically depressed regions, can’t get reliable treatment on demand, have to suffer through flu symptoms, and have a higher chance of contracting illness (read: reduced herd immunity). Kinsa’s mission is to enable all people to have the treatment they deserve. Expero’s technology helps Kinsa deliver on that mission.
As a side benefit, visibility into the future state of disease in local regions means that health clinics don’t wildly overstock this inventory, saving them significant operating capital.
The development of an accurate disease-spread forecasting model is rooted in an understanding of how disease has spread in the past. Kinsa’s data collection happens at the ZIP code level through the use of smart thermometers, so there is an exceptional level of resolution for monitoring disease spread. Additionally, the data is collected in real time. With these two aspects combined, it’s no wonder Kinsa is more accurate (and weeks faster) than the CDC at understanding where disease has spread in the US.
Using this data, Expero built up a list of external factors that are statistically significant in modeling the spread of disease. We then built a suite of models to predict the future state of disease in a spatiotemporal setting (meaning disease spread in both space and time). Through recursive temporal validation, we were able to home in on the most accurate model architecture and build a forecast that is accurate up to a year in the future!
For a more detailed description see our blog post here.