Analysis of Graphical Causal Models with Discretized Data

Ofir Hanoch*, Nalan Bastürk, Rui Jorge Almeida, Tesfa Dejenie Habtewold

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceeding › Academic › peer-review


In several fields, sample data are observed at discrete instead of continuous levels. For example, in psychology an individual’s disease level is typically observed as ‘mild’, ‘moderate’ or ‘strong’, while the underlying mental disorder intensity is potentially a continuous variable. The implications of such discretization in linear regression are well known: uncertainty increases and estimated causal relations become biased and inconsistent. For more complex models, the implications of discretization have not been studied theoretically. This paper presents an empirical study of complex models in which causal relationships are unknown, some variables are discretized, and graphical causal models are used to estimate causal relationships and effects. We study the implications of discretization for the obtained results using simulations. We show that discretization affects both the correct estimation of causal relations and the uncertainty of the causal relations obtained between discretized and non-discretized variables. In addition, we show that discretization influences estimated causal effects, and we relate this influence to the properties of the discretized data and to the sample size.
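The linear-regression effect mentioned in the abstract can be illustrated with a small simulation. The sketch below is not the paper's simulation design: the data-generating process, the coefficients (0.8, 1.0, 1.5), the cut points and the function name `simulate_bias` are all assumptions chosen for illustration. It regresses an outcome on a cause `x` while controlling for a common cause `z`, once with `z` observed continuously and once with `z` observed only at three ordered levels.

```python
import numpy as np

def simulate_bias(n=20000, seed=0):
    """Compare OLS estimates of the effect of x on y when a common cause z
    is observed continuously versus only at three ordered levels.
    All coefficients and cut points are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)                       # continuous latent variable
    x = 0.8 * z + rng.normal(size=n)             # z influences x
    y = 1.0 * x + 1.5 * z + rng.normal(size=n)   # assumed true effect of x on y is 1.0

    # Observe z only as 'mild' / 'moderate' / 'strong', coded 0, 1, 2
    z_disc = np.digitize(z, bins=[-0.5, 0.5]).astype(float)

    def slope_on_x(control):
        # OLS of y on an intercept, x and the control variable
        design = np.column_stack([np.ones(n), x, control])
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        return coef[1]

    return slope_on_x(z), slope_on_x(z_disc)

beta_cont, beta_disc = simulate_bias()
print(f"controlling for continuous z:  {beta_cont:.3f}")
print(f"controlling for discretized z: {beta_disc:.3f}")
```

Under these assumptions, the estimate that controls for the continuous `z` should land close to the assumed effect of 1.0, while the estimate that controls for the discretized `z` stays away from it even at this sample size, because the three-level coding removes only part of the variation in `z`.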
Original language: English
Title of host publication: Information Processing and Management of Uncertainty in Knowledge-Based Systems. IPMU 2022
Subtitle of host publication: 19th International Conference, IPMU 2022, Milan, Italy, July 11-15, 2022, Proceedings, Part II
Editors: Davide Ciucci, Inés Couso, Dominik Slezak, Davide Petturiti, Bernadette Bouchon-Meunier, Ronald R. Yager
Publisher: Springer, Cham
ISBN (Electronic): 978-3-031-08974-9
ISBN (Print): 978-3-031-08973-2
Publication status: Published - 2022

Publication series

Series: Communications in Computer and Information Science


  • causal discovery
  • discretized data
  • graphical causal models
  • mixed data
