TSLiNGAM: DirectLiNGAM under heavy tails

Sarah Leyder; Jakob Raymaekers; Tim Verdonck

Research output: Working paper / Preprint › Preprint

Abstract

One of the established approaches to causal discovery combines directed acyclic graphs (DAGs) with structural causal models (SCMs) to describe the functional dependencies of effects on their causes. Whether an SCM is identifiable from data depends on the assumptions made about the noise variables and the functional classes in the SCM. For instance, in the LiNGAM model, the functional class is restricted to linear functions and the disturbances have to be non-Gaussian. In this work, we propose TSLiNGAM, a new method for identifying the DAG of a causal model based on observational data. TSLiNGAM builds on DirectLiNGAM, a popular algorithm which uses simple OLS regression for identifying causal directions between variables. TSLiNGAM leverages the non-Gaussianity assumption on the error terms in the LiNGAM model to obtain more efficient and robust estimation of the causal structure. TSLiNGAM is justified theoretically and studied empirically in an extensive simulation study. It performs significantly better on heavy-tailed and skewed data and demonstrates high small-sample efficiency. In addition, TSLiNGAM shows better robustness properties, as it is more resilient to contamination.
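The abstract does not spell out which estimator replaces OLS. As a rough illustration only, the sketch below assumes a Theil-Sen-style estimator (the median of pairwise slopes) and compares it with the OLS slope on a two-variable linear model with heavy-tailed noise. The function names, the Student t noise with 2 degrees of freedom, and the coefficient 0.8 are hypothetical choices for this example, not taken from the paper, and the sketch covers only the slope-estimation step, not the full causal ordering procedure.

```python
import numpy as np


def ols_slope(x, y):
    """Ordinary least squares slope of y regressed on x (with intercept)."""
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))


def median_pairwise_slope(x, y):
    """Median of the slopes through all pairs of points (Theil-Sen-style).

    Assumption: used here as a stand-in for a robust slope estimator;
    the abstract itself does not name the estimator.
    """
    i, j = np.triu_indices(len(x), k=1)   # all index pairs with i < j
    dx = x[j] - x[i]
    keep = dx != 0                        # skip pairs with identical x
    return float(np.median((y[j][keep] - y[i][keep]) / dx[keep]))


def simulate_pair(n, b, rng):
    """Linear model x -> y with heavy-tailed, non-Gaussian noise (hypothetical setup)."""
    x = rng.standard_t(df=2, size=n)
    y = b * x + rng.standard_t(df=2, size=n)
    return x, y


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x, y = simulate_pair(n=200, b=0.8, rng=rng)
    print("OLS slope:             ", ols_slope(x, y))
    print("median pairwise slope: ", median_pairwise_slope(x, y))
```

The median of pairwise slopes is unaffected by a small fraction of extreme points, which is one plausible reading of the robustness to contamination mentioned above. It costs O(n^2) pairs, so this naive version is only suitable for small samples.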

Original language	English
Publisher	Cornell University - arXiv
Number of pages	35
Publication status	Published - 10 Aug 2023

Publication series

Series	arXiv.org
Number	2308.05422
ISSN	2331-8422

Keywords

causal discovery
efficiency
LiNGAM
structural causal models

Access to Document

2308.05422v1: Final published version, 533 KB

Cite this

@techreport{801e33cc396e45a4982fbf611957d820,

title = "TSLiNGAM: DirectLiNGAM under heavy tails",

keywords = "causal discovery, efficiency, LiNGAM, structural causal models",

author = "Sarah Leyder and Jakob Raymaekers and Tim Verdonck",

note = "35 pages, 10 figures",

year = "2023",

month = aug,

day = "10",

language = "English",

series = "arXiv.org",

number = "2308.05422",

publisher = "Cornell University - arXiv",

address = "United States",

type = "WorkingPaper",

institution = "Cornell University - arXiv",

}

TY - UNPB

T1 - TSLiNGAM

T2 - DirectLiNGAM under heavy tails

AU - Leyder, Sarah

AU - Raymaekers, Jakob

AU - Verdonck, Tim

N1 - 35 pages, 10 figures

PY - 2023/8/10

Y1 - 2023/8/10

AB - One of the established approaches to causal discovery consists of combining directed acyclic graphs (DAGs) with structural causal models (SCMs) to describe the functional dependencies of effects on their causes. Possible identifiability of SCMs given data depends on assumptions made on the noise variables and the functional classes in the SCM. For instance, in the LiNGAM model, the functional class is restricted to linear functions and the disturbances have to be non-Gaussian. In this work, we propose TSLiNGAM, a new method for identifying the DAG of a causal model based on observational data. TSLiNGAM builds on DirectLiNGAM, a popular algorithm which uses simple OLS regression for identifying causal directions between variables. TSLiNGAM leverages the non-Gaussianity assumption of the error terms in the LiNGAM model to obtain more efficient and robust estimation of the causal structure. TSLiNGAM is justified theoretically and is studied empirically in an extensive simulation study. It performs significantly better on heavy-tailed and skewed data and demonstrates a high small-sample efficiency. In addition, TSLiNGAM also shows better robustness properties as it is more resilient to contamination.

KW - causal discovery

KW - efficiency

KW - LiNGAM

KW - structural causal models

M3 - Preprint

T3 - arXiv.org

BT - TSLiNGAM

PB - Cornell University - arXiv

ER -