Without reliable data, many people struggle to make sense of the COVID-19 figures that are emerging.
“There are common pitfalls in understanding the COVID-19 data in the news about how many people are infected, how fast the virus will spread and when the outbreak might end,” says Ejaz Ahmed, Dean of Brock University’s Faculty of Mathematics and Science and Professor in the Department of Statistics.
Ahmed, a Fellow of the American Statistical Association, is an expert in predictive modelling.
“A key issue in COVID-19 data is the difference between projections and predictions,” he says.
Projections are made using available data but do not require a high degree of confidence in any one specific outcome. Weather forecasts, for example, are based on assumptions about future atmospheric conditions, such as temperature or rainfall.
“They are a series of guesses and conjectures with varying levels of confidence in their accuracy and shouldn’t be relied upon heavily,” says Ahmed.
Often, even an accurate projection will result in a range of outcomes where no single result is likely to occur. There may be a little rain, or a mix of clouds and sunshine.
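To see why a projection produces a range rather than a single answer, consider a minimal sketch in Python. It is only an illustration: the starting case count, the time span and the three growth rates are hypothetical values chosen for the example, not real COVID-19 data, and simple compound growth stands in for the far more involved models statisticians actually use.

```python
def project_cases(initial_cases, daily_growth_rate, days):
    """Cases after `days` of compound growth under one assumed growth rate."""
    return initial_cases * (1 + daily_growth_rate) ** days


# Hypothetical inputs for illustration only; none of these are real figures.
initial_cases = 100
days = 14
assumed_growth_rates = {"low": 0.02, "medium": 0.05, "high": 0.10}

# One outcome per assumption: together they form the projection.
projection = {
    scenario: round(project_cases(initial_cases, rate, days))
    for scenario, rate in assumed_growth_rates.items()
}

low, high = min(projection.values()), max(projection.values())
print(projection)
print(f"Projected range after {days} days: {low} to {high} cases")
```

Each assumed growth rate yields a different total, so the honest summary is the whole range. Changing any assumption shifts the range, which is why projections are revised so often.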
In the case of COVID-19, there are many projections surrounding the virus, and care should be taken in judging the accuracy of the claims, says Ahmed.
If a figure is labelled a projection, it’s very likely to change.
Predictions are a subset of projections with a higher degree of confidence in one result. If one has to choose between the two, predictions should be prioritized, but even then, they aren’t a guarantee.
“Predictive models are mathematically involved and rigorous,” says Ahmed. “However, like projections, predictive models are built on certain assumptions. If the assumptions are stringent, then conclusions based on predictive models can be misleading, albeit mathematically correct.”
When studying COVID-19, a data scientist may eventually be able to make predictions, such as when it’s safe to return to normal community behaviour, including returning to work and discontinuing physical distancing.
Ahmed says data from American health sources has become available to implement statistical models for prediction rather than projection, and the hope is to use those models to start formulating predictions for Canada.
A savvy data enthusiast should look for the key terms “projection” and “prediction” when assessing a model’s accuracy, he says.
Understanding the difference helps people judge how much weight to give a figure and make informed decisions as the COVID-19 pandemic unfolds.