Author
Milton Soto‐Ferrari
Abstract
Reliable forecasts of infectious disease trajectories are indispensable for timely public health action and the allocation of medical resources. However, most time‐series forecasting frameworks still rely solely on historical case counts and thus struggle to capture sudden shifts in population behavior. To quantify the value of external behavioral signals during the COVID‐19 pandemic, this research assembled a 124‐week panel (May 31, 2020, to October 9, 2022) for 20 countries spanning six continents. The panel fuses Google Community‐Mobility indices with standard surveillance indicators (new cases, deaths, tests, and vaccinations), population density, and the Oxford policy‐stringency score. Two families of forecasting methods are assessed for predicting new cases over an 8‐week hold‐out window. The target‐variable‐only family comprises a 4‐week rolling average, autoregressive integrated moving average (ARIMA), Prophet, and long short‐term memory (LSTM) models. The data‐integration family employs two light gradient boosting machine (LightGBM) variants: LightGBM‐Direct, which learns a single multi‐output mapping for all periods in the horizon, and LightGBM‐Recursive, which updates a one‐step model and rolls its predictions forward. Performance is evaluated using root mean square error (RMSE) and two optimized weight indices (OWIs), which benchmark improvements over the rolling‐average baseline and ARIMA, respectively. The results show that a mobility‐enhanced LightGBM achieves the lowest RMSE in every country, reducing the overall median error by 83% relative to the baseline and by 87% relative to ARIMA. LightGBM‐Direct excels in twelve countries characterized by smoother trends, whereas LightGBM‐Recursive dominates in the remaining eight, which exhibit rapid fluctuations in incidence. Notably, SHapley Additive exPlanations (TreeSHAP) identifies workplace and transit‐station mobility, testing intensity, vaccinations, and policy stringency as the most influential predictors, underscoring the importance of external behavioral signals for improving pandemic forecast accuracy.
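The distinction between the direct and recursive multi-step strategies can be illustrated with a minimal sketch. This is not the paper's implementation: an ordinary least-squares model on lagged values stands in for LightGBM, the series is synthetic, and the function names, the lag count of 4, and the flat rolling-mean forecast are all assumptions made for illustration. Only the 124-week length, the 8-week hold-out, and the 4-week averaging window come from the abstract.

```python
import numpy as np

def make_lagged(series, n_lags, horizon):
    """Each X row holds n_lags past values; each Y row holds the next `horizon` values."""
    X, Y = [], []
    for t in range(n_lags, len(series) - horizon + 1):
        X.append(series[t - n_lags:t])
        Y.append(series[t:t + horizon])
    return np.array(X), np.array(Y)

def fit_linear(X, Y):
    """Least-squares fit with intercept (a stand-in for a gradient-boosted learner)."""
    A = np.hstack([X, np.ones((len(X), 1))])
    coef, *_ = np.linalg.lstsq(A, Y, rcond=None)
    return coef  # shape: (n_lags + 1, number of outputs)

def predict_linear(coef, x):
    return np.append(x, 1.0) @ coef

def forecast_direct(train, n_lags, horizon):
    """Direct strategy: one multi-output mapping predicts every step of the horizon at once."""
    X, Y = make_lagged(train, n_lags, horizon)
    coef = fit_linear(X, Y)
    return predict_linear(coef, train[-n_lags:])

def forecast_recursive(train, n_lags, horizon):
    """Recursive strategy: a one-step model whose own predictions are fed back as inputs."""
    X, Y = make_lagged(train, n_lags, 1)
    coef = fit_linear(X, Y)
    window = list(train[-n_lags:])
    preds = []
    for _ in range(horizon):
        nxt = float(predict_linear(coef, np.array(window))[0])
        preds.append(nxt)
        window = window[1:] + [nxt]  # slide the lag window forward
    return np.array(preds)

def forecast_rolling_mean(train, window, horizon):
    """Baseline (assumed form): repeat the mean of the last `window` observations."""
    return np.full(horizon, np.mean(train[-window:]))

def rmse(actual, predicted):
    diff = np.asarray(actual, dtype=float) - np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

# Synthetic weekly series: 124 weeks with an 8-week hold-out window.
rng = np.random.default_rng(0)
weeks = np.arange(124)
cases = 1000 + 10 * weeks + 200 * np.sin(weeks / 6) + rng.normal(0, 20, size=124)
train, test = cases[:-8], cases[-8:]

for name, preds in [
    ("rolling mean", forecast_rolling_mean(train, 4, 8)),
    ("direct", forecast_direct(train, 4, 8)),
    ("recursive", forecast_recursive(train, 4, 8)),
]:
    print(f"{name:>12}: RMSE = {rmse(test, preds):.1f}")
```

The recursive variant needs only a one-step model but compounds its own errors over the horizon, while the direct variant avoids feedback at the cost of fitting all horizon steps jointly. Which one performs better depends on the series, which is consistent with the country-level split the abstract reports.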
Suggested Citation
Milton Soto‐Ferrari, 2025. "Integrating Google Mobility Indices for Forecasting Infectious Diseases Incidence: A Multi‐Country Study on COVID‐19 With LightGBM," Journal of Forecasting, John Wiley & Sons, Ltd., vol. 44(8), pages 2405-2424, December.
Handle: RePEc:wly:jforec:v:44:y:2025:i:8:p:2405-2424
DOI: 10.1002/for.70006