Introducing an ontology-driven pipeline for the identification of common data elements

Anas Elghafari*, Joseph Finkelstein

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceeding › Academic › peer-review

Abstract

Common Data Elements (CDEs) are necessary for ensuring data sharing across studies, providing comparability, and enabling aggregation and meta-analyses. The process of developing a set of CDEs for a given clinical research area has typically been arduous and time-consuming. In this work we introduce an automated pipeline that can greatly aid this process by identifying, aggregating, and ranking relevant CDEs from the outcomes of studies registered on clinicaltrials.gov (CTG). The pipeline uses the Medical Subject Headings (MeSH) ontology to group and rank candidate CDEs by specific disease. The initial pipeline was tested on an emerging research domain, and the resulting CDE output was aligned with the current recommendations in the corresponding subject area. Further development of automated means for CDE generation based on structured information from CTG and MeSH is warranted.
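
The abstract does not spell out how the pipeline is implemented, so the following is only a minimal sketch of the general idea: read outcome measures from trial records, group them by MeSH condition term, and rank them by how many studies report them. Everything here is an assumption made for illustration: the element names (`clinical_study`, `primary_outcome`, `secondary_outcome`, `measure`, `condition_browse`, `mesh_term`) are modeled on a CTG-style XML record, the sample records are hand-written rather than real trial data, and frequency counting stands in for whatever ranking the authors actually use. The sketch also treats MeSH terms as flat labels and does not walk the MeSH tree.

```python
import xml.etree.ElementTree as ET
from collections import Counter, defaultdict

# Hypothetical, hand-written records for illustration only (not real trial data).
SAMPLE_RECORDS = [
    """<clinical_study>
         <primary_outcome><measure>Pain intensity</measure></primary_outcome>
         <secondary_outcome><measure>Quality of life</measure></secondary_outcome>
         <condition_browse><mesh_term>Fibromyalgia</mesh_term></condition_browse>
       </clinical_study>""",
    """<clinical_study>
         <primary_outcome><measure>pain  intensity</measure></primary_outcome>
         <secondary_outcome><measure>Fatigue</measure></secondary_outcome>
         <condition_browse><mesh_term>Fibromyalgia</mesh_term></condition_browse>
       </clinical_study>""",
    """<clinical_study>
         <primary_outcome><measure>Headache days</measure></primary_outcome>
         <condition_browse><mesh_term>Migraine Disorders</mesh_term></condition_browse>
       </clinical_study>""",
]


def normalize(measure):
    """Lowercase and collapse whitespace so trivially different spellings match."""
    return " ".join(measure.lower().split())


def extract(record_xml):
    """Return (MeSH condition terms, normalized outcome measures) for one record."""
    root = ET.fromstring(record_xml)
    terms = {
        t.text.strip()
        for t in root.findall("./condition_browse/mesh_term")
        if t.text and t.text.strip()
    }
    measures = set()
    for tag in ("primary_outcome", "secondary_outcome"):
        for outcome in root.findall(tag):
            text = outcome.findtext("measure")
            if text and text.strip():
                measures.add(normalize(text))
    return terms, measures


def rank_candidates(records):
    """Group outcome measures by MeSH term and rank them by study count."""
    counts = defaultdict(Counter)
    for record in records:
        terms, measures = extract(record)
        for term in terms:
            # Each study contributes at most one count per measure.
            counts[term].update(measures)
    return {
        term: sorted(counter.items(), key=lambda kv: (-kv[1], kv[0]))
        for term, counter in counts.items()
    }


if __name__ == "__main__":
    for term, ranked in rank_candidates(SAMPLE_RECORDS).items():
        print(term, ranked)
```

Counting each measure once per study keeps a single trial with many repeated outcomes from dominating the ranking. A fuller version would likely need stronger normalization of free-text outcome names than the lowercase-and-whitespace step used here.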

Original language: English
Title of host publication: THE IMPORTANCE OF HEALTH INFORMATICS IN PUBLIC HEALTH DURING A PANDEMIC
Editors: John Mantas, Arie Hasman, Mowafa S. Househ, Parisis Gallos, Emmanouil Zoulias
Publisher: IOS Press
Pages: 379-382
Number of pages: 4
ISBN (Electronic): 9781643680927
Publication status: Published - 2020
Externally published: Yes

Publication series

Series: Studies in Health Technology and Informatics
Volume: 272
ISSN: 0926-9630

Keywords

  • automated data extraction
  • clinical trials
  • clinicaltrials.gov
  • Common Data Elements
  • MeSH tree
  • outcomes
  • xml
