Dealing with incomplete data: A structured tensor-completion approach for recurrence plots

Martijn Boussé; Philippe Dreesen; Pietro Bonizzi; Joël Karel; Ralf Peeters

Research output: Contribution to conference › Abstract › Academic

Abstract

Incomplete data, i.e., data with missing and/or unknown values, is a ubiquitous problem, exacerbated by the rise of big data, in a wide range of applications within signal processing, machine learning, and scientific computing. The incompleteness can be unintentional, as with faulty sensors, or deliberate, e.g., when measurements are expensive or difficult to obtain. In either case, practical algorithms must be able to handle data with gaps and/or irregular sampling. There are generally two approaches: imputation or expectation-maximization strategies, and direct algorithms that take only the known data into account. In this work, we propose a tensor-completion-based approach for recurrence plots constructed from incomplete time series data, an important and emerging problem in recurrence analysis, as noted in the review paper of Marwan and Kraemer (2023).

Handling incomplete data through a tensor-completion strategy assumes that the data admits a low-rank representation (in the exact case) or a low-rank approximation (in the inexact/noisy case). This assumption often holds in practice, especially in large-scale problems, thanks to latent structure in the data, such as sparsity and low-rankness, given an adequate representation. For example, signals that are well approximated by (exponential) polynomials admit a low-rank factorization when embedded in a Hankel matrix. Imputation through tensor completion then involves two steps: 1) compute a low-rank factorization using only the known values, and 2) fill in the unknown values by means of that factorization.
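The Hankel example and the two-step imputation idea can be illustrated with a small NumPy sketch. It is not the authors' structured tensor-completion algorithm: as a stand-in it uses a simple iterative heuristic (truncate the SVD of the Hankel matrix, average the anti-diagonals, restore the known samples), and it re-estimates the low-rank factors from the current filled-in signal rather than strictly from the known values alone. The function names, the test signal, the window length `L`, and the gap position are all assumptions made for illustration.

```python
import numpy as np

def hankel_embed(x, L):
    """Return the L x (N - L + 1) Hankel matrix with H[i, j] = x[i + j]."""
    K = len(x) - L + 1
    idx = np.arange(L)[:, None] + np.arange(K)[None, :]
    return x[idx]

def dehankel(H):
    """Average the anti-diagonals of H back into a signal of length L + K - 1."""
    L, K = H.shape
    idx = (np.arange(L)[:, None] + np.arange(K)[None, :]).ravel()
    sums = np.bincount(idx, weights=H.ravel(), minlength=L + K - 1)
    counts = np.bincount(idx, minlength=L + K - 1)
    return sums / counts

def complete_signal(x_obs, known, rank, L, n_iter=200):
    """Fill missing samples with a rank-constrained Hankel heuristic (illustrative only)."""
    x = np.where(known, x_obs, 0.0)            # start from a zero-filled signal
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(hankel_embed(x, L), full_matrices=False)
        H_low = (U[:, :rank] * s[:rank]) @ Vt[:rank]   # step 1: low-rank factorization
        x_low = dehankel(H_low)                        # back to a signal
        x = np.where(known, x_obs, x_low)              # step 2: fill only the unknowns
    return x

# Toy signal: one cosine is a sum of two complex exponentials, so its Hankel
# matrix has rank 2 (hypothetical example data, not from the abstract).
n = np.arange(64)
x_true = np.cos(0.4 * n)
known = np.ones(64, dtype=bool)
known[30:36] = False                           # a gap of six consecutive samples
x_obs = np.where(known, x_true, np.nan)

singular_values = np.linalg.svd(hankel_embed(x_true, 32), compute_uv=False)
x_hat = complete_signal(x_obs, known, rank=2, L=32)
print("two largest and third singular value:", singular_values[:3])
print("max error on the gap:", np.max(np.abs(x_hat - x_true)))
```

In this noise-free toy setting the third singular value is numerically negligible and the gap is filled in closely. With noise, a wrongly chosen rank, or much wider gaps, such a heuristic may converge slowly or not at all, which is one reason to prefer a dedicated completion algorithm.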

In this work, we cast the computation of the entries of an unthresholded recurrence plot as a structured tensor-completion problem inside the Euclidean norm. This is possible if the embedding matrix, constructed from the incomplete time series, admits a low-rank representation or approximation, which depends on the embedding parameters. In that case, the proposed approach can handle severe undersampling and wide data gaps, whether these arise unintentionally or by design. We compare our approach, which approximates the latent structure of the embedding, with direct imputation of the incomplete time series.
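For readers unfamiliar with the objects involved, the following NumPy sketch shows how an unthresholded recurrence plot (the matrix of pairwise Euclidean distances between delay-embedding vectors) is obtained from a complete time series. It does not reproduce the proposed completion method. The function names and the embedding parameters `dim` and `delay` are assumptions chosen for illustration. The sketch also checks a standard linear-algebra fact: if the embedding matrix has rank R, the matrix of squared distances has rank at most R + 2, because it equals g 1^T + 1 g^T - 2 X X^T with g the vector of squared row norms.

```python
import numpy as np

def delay_embed(x, dim, delay=1):
    """Stack delay vectors (x[i], x[i + delay], ..., x[i + (dim - 1) * delay]) as rows."""
    n_vec = len(x) - (dim - 1) * delay
    idx = np.arange(n_vec)[:, None] + delay * np.arange(dim)[None, :]
    return x[idx]

def unthresholded_rp(X):
    """Pairwise Euclidean distances between the rows of X, computed via the Gram matrix."""
    G = X @ X.T
    g = np.diag(G)                               # squared norms of the embedding vectors
    D2 = g[:, None] + g[None, :] - 2.0 * G       # squared distances
    return np.sqrt(np.maximum(D2, 0.0))          # clip tiny negative rounding errors

# Hypothetical example: a single cosine embedded with dim=3 and delay=2.
n = np.arange(64)
x = np.cos(0.4 * n)
X = delay_embed(x, dim=3, delay=2)               # embedding matrix of rank 2
D = unthresholded_rp(X)

print("embedding shape:", X.shape, "recurrence plot shape:", D.shape)
print("leading singular values of D**2:", np.linalg.svd(D**2, compute_uv=False)[:6])
```

Thresholding `D` at some radius would give the usual binary recurrence plot. Here the matrix is left unthresholded, as in the abstract.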

Original language	English
Publication status	Published - 28 Aug 2023
Event	10th International Symposium on Recurrence Plots, University of Tsukuba, Faculty of Engineering, Information and Systems, Tsukuba, Japan. Duration: 28 Aug 2023 → 30 Aug 2023. http://symposium.recurrence-plot.tk/?a=workshop

Symposium

Symposium	10th International Symposium on Recurrence Plots
Abbreviated title	ISRP
Country/Territory	Japan
City	Tsukuba
Period	28/08/23 → 30/08/23
Internet address	http://symposium.recurrence-plot.tk/?a=workshop

Keywords

recurrence plots
incomplete data
Tensor factorization
TENSOR
completion

Cite this

@conference{c527d2fc239a4d1fbff1e3c133fdae53,

title = "Dealing with incomplete data: A structured tensor-completion approach for recurrence plots",

keywords = "recurrence plots, incomplete data, Tensor factorization, TENSOR, completion",

author = "Martijn Bouss{\'e} and Philippe Dreesen and Pietro Bonizzi and Jo{\"e}l Karel and Ralf Peeters",

year = "2023",

month = aug,

day = "28",

language = "English",

note = "10th International Symposium on Recurrence Plots, ISRP ; Conference date: 28-08-2023 Through 30-08-2023",

url = "http://symposium.recurrence-plot.tk/?a=workshop",

}

TY - CONF

T1 - Dealing with incomplete data: A structured tensor-completion approach for recurrence plots

AU - Boussé, Martijn

AU - Dreesen, Philippe

AU - Bonizzi, Pietro

AU - Karel, Joël

AU - Peeters, Ralf

PY - 2023/8/28

Y1 - 2023/8/28

KW - recurrence plots

KW - incomplete data

KW - Tensor factorization

KW - TENSOR

KW - completion

M3 - Abstract

T2 - 10th International Symposium on Recurrence Plots

Y2 - 28 August 2023 through 30 August 2023

ER -