Dealing with incomplete data: A structured tensor-completion approach for recurrence plots

Research output: Contribution to conference › Abstract › Academic

Abstract

Incomplete data, i.e., data with missing and/or unknown values, is a ubiquitous problem, exacerbated by the rise of big data, in a wide range of applications within signal processing, machine learning, and scientific computing. The incompleteness can be unintentional, as in the case of faulty sensors, but it can also be deliberate, e.g., when measurements are expensive or difficult to obtain. In either case, practical algorithms must be able to handle data with gaps and/or irregular sampling. Two approaches generally exist: imputation or expectation-maximization strategies, and direct algorithms that use only the known data. In this work, we propose a tensor-completion-based approach for recurrence plots constructed from incomplete time series data, an important emerging problem in recurrence analysis, as noted in the review paper by Marwan and Kraemer (2023).

Handling incomplete data through a tensor-completion strategy involves the assumption that the data admits a low-rank representation in the exact case, or a low-rank approximation in the non-exact/noisy case. This assumption often holds in a variety of applications, and especially in large-scale problems, thanks to latent structures in the data, such as sparsity and low-rankness, given an adequate representation. For example, signals that can be approximated well by (exponential) polynomials admit a low-rank factorization when embedded in a Hankel matrix. Imputation through tensor completion then involves two steps: 1) compute a low-rank factorization using only the known values, and 2) fill in the unknown values by means of the low-rank factorization.
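The Hankel example and the two steps above can be illustrated with a small numerical sketch. It is not the algorithm of this work: it is a simplified matrix (not tensor) version that assumes a noise-free signal, a known rank, and a single gap short enough that some Hankel columns remain fully known. The function names `hankel_embed` and `complete_hankel`, the test signal, the gap location, and all sizes are hypothetical choices made for illustration.

```python
import numpy as np

def hankel_embed(x, n_rows):
    """Hankel embedding of a 1-D signal: H[i, j] = x[i + j]."""
    n_cols = len(x) - n_rows + 1
    idx = np.arange(n_rows)[:, None] + np.arange(n_cols)[None, :]
    return x[idx]

def complete_hankel(x_obs, n_rows, rank):
    """Fill NaN entries of the Hankel matrix of x_obs (illustrative only)."""
    H = hankel_embed(x_obs, n_rows)
    known = ~np.isnan(H)
    full_cols = known.all(axis=0)  # columns without any missing entry

    # Step 1: low-rank factor (column-space basis) from known values only.
    U, _, _ = np.linalg.svd(H[:, full_cols], full_matrices=False)
    U = U[:, :rank]

    # Step 2: fill in unknown values by means of the low-rank factor.
    H_filled = H.copy()
    for j in np.flatnonzero(~full_cols):
        k = known[:, j]  # rows of column j that are known
        coeff, *_ = np.linalg.lstsq(U[k], H[k, j], rcond=None)
        H_filled[~k, j] = U[~k] @ coeff
    return H_filled

# Assumed test signal: a damped cosine, i.e. a sum of two complex
# exponentials, so its Hankel matrix has rank 2.
n = np.arange(40)
x_true = 0.98 ** n * np.cos(0.3 * n)

# Assumed gap: five consecutive samples are missing.
x_obs = x_true.copy()
x_obs[18:23] = np.nan

H_true = hankel_embed(x_true, 10)
H_filled = complete_hankel(x_obs, 10, rank=2)

numerical_rank = np.linalg.matrix_rank(H_true, tol=1e-8)
max_err = np.max(np.abs(H_filled - H_true))
print("numerical rank:", numerical_rank)
print("max completion error:", max_err)
```

In this noise-free setting the filled-in entries should agree with the true ones up to rounding error. With noisy data, an unknown rank, or scattered missing samples, a proper optimization-based completion method would be needed instead of this direct construction.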

In this work, we cast the computation of the values of an unthresholded recurrence plot, with distances measured in the Euclidean norm, as a structured tensor-completion problem. This is possible if the embedding matrix, constructed from the incomplete time series, admits a low-rank representation or approximation, which depends on the embedding parameters. In that case, our proposed approach tolerates severe under-sampling and wide data gaps, whether these arise unintentionally or by design in a particular application. We compare our approach with direct imputation of the incomplete time series, which does not exploit the latent structure of the embedding.
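For readers unfamiliar with the objects involved, the following minimal sketch shows how an unthresholded recurrence plot is commonly computed from a complete time series: the series is time-delay embedded, and the plot holds the pairwise Euclidean distances between the embedded vectors. It does not implement the completion approach of this work. The function names `delay_embed` and `unthresholded_rp`, the test signal, and the embedding dimension and delay are assumptions chosen for illustration.

```python
import numpy as np

def delay_embed(x, dim, delay):
    """Embedding matrix: row i is (x[i], x[i + delay], ..., x[i + (dim - 1) * delay])."""
    n_vec = len(x) - (dim - 1) * delay
    idx = np.arange(n_vec)[:, None] + delay * np.arange(dim)[None, :]
    return x[idx]

def unthresholded_rp(x, dim, delay):
    """Pairwise Euclidean distances between all embedded vectors."""
    V = delay_embed(x, dim, delay)
    diff = V[:, None, :] - V[None, :, :]
    return np.linalg.norm(diff, axis=-1)

# Assumed example: a sampled sine wave, embedding dimension 3, delay 2.
t = np.arange(50)
x = np.sin(0.2 * t)
D = unthresholded_rp(x, dim=3, delay=2)
print(D.shape)
```

The embedding matrix built by `delay_embed` has Hankel-like structure (for a delay of 1 it is exactly a Hankel matrix), which is why the embedding parameters influence whether a low-rank representation exists.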
Original language: English
Publication status: Published - 28 Aug 2023
Event: 10th International Symposium on Recurrence Plots - University of Tsukuba, Faculty of Engineering, Information and Systems, Tsukuba, Japan
Duration: 28 Aug 2023 to 30 Aug 2023
http://symposium.recurrence-plot.tk/?a=workshop

Symposium

Symposium: 10th International Symposium on Recurrence Plots
Abbreviated title: ISRP
Country/Territory: Japan
City: Tsukuba
Period: 28/08/23 to 30/08/23

Keywords

  • recurrence plots
  • incomplete data
  • Tensor factorization
  • TENSOR
  • completion
