Curse of Small Sample Size in Forecasting of the Active Cases in COVID-19 Outbreak

Mert Nakıp
Onur Çopur
Cüneyt Güzeliş

Keywords

COVID-19, forecasting, machine learning, feature selection, generalization

Abstract

The COVID-19 pandemic affected almost all countries between 2020 and 2022. During this period,
numerous attempts were made to predict the number of cases and other future trends of the pandemic.
However, these attempts fail to reliably predict the medium- and long-term evolution of the key features
of the COVID-19 pandemic. This paper explains the possible reasons for the insufficiency of machine
learning models in this particular prediction/forecasting problem. The experimental results in this
paper show that simple linear regression models reliably provide high prediction accuracy, but only for a period
of two weeks. On the other hand, relatively complex machine learning models, which have
the potential to learn long-term predictions with low errors, cannot achieve good prediction
results while retaining high generalization ability. This paper argues that the insufficient sample
size is the source of the poor performance of the forecasting models. In our experimental study, we
measure the generalization ability of the models through the cross-validation errors. To this end, we
first select the features most relevant to forecasting active cases using Pairwise
Correlation, Recursive Feature Selection, and Lasso regression-based feature selection. We
then compare the performance of Linear Regression, Multi-Layer Perceptron, and Long Short-Term
Memory models, each combined with the feature selection methods, in predicting the number of active
cases. Our results show that, because of the small sample size of the COVID-19 data, accurate
forecasting of active cases with high generalization ability is possible only up to 3 days ahead. We observe that
the linear regression model has much better prediction performance, with high generalization ability,
than the complex models, but its performance decays sharply for prediction horizons longer
than 14 days, as expected.
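
The evaluation idea described in the abstract, measuring generalization through cross-validation errors at different prediction horizons, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the synthetic series, the 7-day lag window, the forward-chaining fold scheme, the mean absolute percentage error metric, and the helper names `make_lagged` and `cv_error`.

```python
import numpy as np

def make_lagged(series, n_lags, horizon):
    """Build (X, y): each row of X holds n_lags consecutive values and
    y is the value `horizon` steps after the last value in that row."""
    X, y = [], []
    for t in range(n_lags, len(series) - horizon + 1):
        X.append(series[t - n_lags:t])
        y.append(series[t + horizon - 1])
    return np.array(X), np.array(y)

def cv_error(X, y, n_folds=5):
    """Mean absolute percentage error of least-squares linear regression
    under forward-chaining cross-validation (train on the past only)."""
    fold = len(y) // (n_folds + 1)
    errors = []
    for k in range(1, n_folds + 1):
        train_end, test_end = k * fold, (k + 1) * fold
        # Append a column of ones so the model has an intercept term.
        A_train = np.column_stack([X[:train_end], np.ones(train_end)])
        coef, *_ = np.linalg.lstsq(A_train, y[:train_end], rcond=None)
        A_test = np.column_stack([X[train_end:test_end], np.ones(fold)])
        pred = A_test @ coef
        y_test = y[train_end:test_end]
        errors.append(np.mean(np.abs(pred - y_test) / np.abs(y_test)))
    return float(np.mean(errors))

# Hypothetical stand-in for an active-case curve: one noisy wave, 120 days.
rng = np.random.default_rng(0)
days = np.arange(120)
series = 500 + 5000 * np.exp(-((days - 60) / 25) ** 2) + rng.normal(0, 50, 120)

# Cross-validation error for three illustrative horizons (in days).
errors_by_horizon = {}
for horizon in (1, 3, 14):
    X, y = make_lagged(series, n_lags=7, horizon=horizon)
    errors_by_horizon[horizon] = cv_error(X, y)
    print(f"horizon={horizon:2d} days  CV MAPE={errors_by_horizon[horizon]:.3f}")
```

With only about a hundred usable samples, each early fold fits eight coefficients on a handful of rows, which is the small-sample regime the abstract describes. The numbers this sketch prints come from synthetic data and say nothing about the paper's reported results.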