Graphical Causal Models and Imputing Missing Data: A Preliminary Study

Rui Jorge Almeida*, Greetje Adriaans, Yuliya Shapovalova

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding; Conference article in proceeding; Academic; peer-review

Abstract

Real-world datasets often contain many missing values, for a variety of reasons. This is usually a problem, since many learning algorithms require complete datasets. In certain cases, constraints in the real-world problem make it difficult to observe all data continuously. In this paper, we investigate whether graphical causal models can be used to impute missing values and to derive additional information on the uncertainty of the imputed values. Our goal is to use the information from a complete dataset, in the form of graphical causal models, to impute missing values in an incomplete dataset. This assumes that both datasets share the same data-generating process. Furthermore, we calculate the probability of each missing data value belonging to a specified percentile. We present a preliminary study of the proposed method using synthetic data, where we can control the causal relations and missing values.
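The idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes the simplest possible causal graph, X -> Y, with a linear Gaussian relation, and every name in it (`fit_linear_gaussian`, `impute_with_uncertainty`, the 90th-percentile threshold, the synthetic coefficients) is a hypothetical choice made for illustration. The relation is learned on a complete synthetic dataset, then used to impute a missing Y from an observed X and to estimate the probability that the missing value lies above a chosen percentile.

```python
import math
import numpy as np


def fit_linear_gaussian(x, y):
    """Fit y = a + b*x + noise by least squares; return (a, b, sigma)."""
    b, a = np.polyfit(x, y, 1)        # coefficients come highest degree first
    residuals = y - (a + b * x)
    sigma = residuals.std(ddof=2)     # two parameters were estimated
    return a, b, sigma


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def impute_with_uncertainty(x_obs, a, b, sigma, threshold):
    """Return the imputed value and P(missing y > threshold | x_obs)."""
    mean = a + b * x_obs              # conditional mean under the assumed model
    p_above = 1.0 - normal_cdf((threshold - mean) / sigma)
    return mean, p_above


# Complete synthetic dataset with a known causal relation X -> Y (assumed).
rng = np.random.default_rng(0)
x_complete = rng.normal(size=500)
y_complete = 2.0 * x_complete + rng.normal(scale=0.5, size=500)

a, b, sigma = fit_linear_gaussian(x_complete, y_complete)

# Percentile of interest, taken from the complete data (assumed: 90th).
threshold = np.percentile(y_complete, 90)

# Incomplete records: X is observed, Y is missing.
y_lo, p_lo = impute_with_uncertainty(0.0, a, b, sigma, threshold)
y_hi, p_hi = impute_with_uncertainty(1.5, a, b, sigma, threshold)
print(f"x=0.0: imputed y={y_lo:.2f}, P(y above 90th percentile)={p_lo:.3f}")
print(f"x=1.5: imputed y={y_hi:.2f}, P(y above 90th percentile)={p_hi:.3f}")
```

The sketch relies on the same assumption stated in the abstract, namely that the complete and incomplete datasets share one data-generating process. A graph with more variables would need a structure-learning step and a conditional distribution per node, which this example leaves out.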
Original language: English
Title of host publication: Information Processing and Management of Uncertainty in Knowledge-Based Systems - 18th International Conference, IPMU 2020, Proceedings
Editors: Marie-Jeanne Lesot, Susana Vieira, Marek Z. Reformat, João Paulo Carvalho, Anna Wilbik, Bernadette Bouchon-Meunier, Ronald R. Yager
Publisher: Springer
Pages: 485-496
Number of pages: 12
Volume: 1237 CCIS
ISBN (Print): 9783030501457
Publication status: Published - 1 Jan 2020
Event: 18th International Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems - Lisbon, Portugal
Duration: 15 Jun 2020 to 19 Jun 2020
Conference number: 18

Publication series

Series: Communications in Computer and Information Science
Volume: 1237 CCIS
ISSN: 1865-0929

Conference

Conference: 18th International Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems
Abbreviated title: IPMU 2020
Country/Territory: Portugal
City: Lisbon
Period: 15/06/20 to 19/06/20

Keywords

  • Graphical causal models
  • Missing data
  • Uncertainty in missing values
