Monitoring Machine Learning Forecasts for Platform Data Streams

Jeroen Rombouts, Ines Wilms*

*Corresponding author for this work

Research output: Working paper / Preprint

Abstract

Data stream forecasts are essential inputs for decision making at digital platforms. Machine learning (ML) algorithms are appealing candidates to produce such forecasts. Yet digital platforms require a large-scale forecast framework that can flexibly respond to sudden performance drops. Re-training ML algorithms as fast as new data batches arrive is usually too costly computationally. On the other hand, infrequent re-training requires specifying the re-training frequency and typically comes at a severe cost in forecast deterioration. To ensure accurate and stable forecasts, we propose a simple data-driven monitoring procedure to answer the question of when the ML algorithm should be re-trained. Instead of investigating instability of the data streams, we test whether the incoming streaming forecast loss batch differs from a well-defined reference batch. Using a novel dataset consisting of 15-min frequency data streams from an on-demand logistics platform operating in London, we apply the monitoring procedure to popular ML algorithms including random forest, XGBoost and lasso. We show that monitor-based re-training produces accurate forecasts compared to viable benchmarks while preserving computational feasibility. Moreover, the choice of monitoring procedure is more important than the choice of ML algorithm, thereby permitting practitioners to combine the proposed monitoring procedure with their preferred forecasting algorithm.
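
As a rough illustration of the monitoring idea described in the abstract, the Python sketch below compares an incoming batch of forecast losses with a reference batch and flags re-training when the incoming losses are significantly larger. The one-sided permutation test, the function names `loss_batch_differs` and `monitor`, the significance level, and the rule for resetting the reference batch are all assumptions made for illustration; they are not the test or the procedure used in the paper.

```python
import random
from statistics import mean


def loss_batch_differs(reference, incoming, alpha=0.05, n_perm=2000, seed=0):
    """One-sided permutation test (an illustrative choice, not the paper's test).

    Returns True when the mean loss of `incoming` is significantly larger
    than the mean loss of `reference` at level `alpha`.
    """
    rng = random.Random(seed)
    observed = mean(incoming) - mean(reference)
    pooled = list(reference) + list(incoming)
    n_in = len(incoming)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        # Difference in means under a random relabelling of the pooled losses.
        diff = mean(pooled[:n_in]) - mean(pooled[n_in:])
        if diff >= observed:
            exceed += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    p_value = (exceed + 1) / (n_perm + 1)
    return p_value < alpha


def monitor(loss_batches, alpha=0.05):
    """Return the indices of the batches at which re-training would be triggered.

    The first batch serves as the initial reference batch.
    """
    reference = loss_batches[0]
    retrain_at = []
    for i, batch in enumerate(loss_batches[1:], start=1):
        if loss_batch_differs(reference, batch, alpha=alpha):
            retrain_at.append(i)
            # Assumption: the flagged batch becomes the new reference. In
            # practice the reference would hold losses of the re-trained model.
            reference = batch
    return retrain_at
```

Calling `monitor` on a list of loss batches, with the first batch as the reference, returns the positions at which the sketch would request re-training. The forecasting algorithm itself never appears in the code, which reflects the abstract's point that the monitor operates on forecast losses and can be paired with any algorithm.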
Original language: English
Publisher: Cornell University - arXiv
Number of pages: 38
Publication status: Published - 2024

Publication series

Series: arXiv.org
Number: 2401.09144
ISSN: 2331-8422

Keywords

  • e-commerce
  • platform econometrics
  • machine learning
  • streaming data
  • monitoring forecasts
