The center for expanded data annotation and retrieval

Mark A. Musen; Carol A. Bean; Kei-Hoi Cheung; Michel Dumontier; Kim A. Durante; Olivier Gevaert; Alejandra Gonzalez-Beltran; Purvesh Khatri; Steven H. Kleinstein; Martin J. O'Connor; Yannick Pouliot; Philippe Rocca-Serra; Susanna-Assunta Sansone; Jeffrey A. Wiser

doi:10.1093/jamia/ocv048

Mark A. Musen*, Carol A. Bean, Kei-Hoi Cheung, Michel Dumontier, Kim A. Durante, Olivier Gevaert, Alejandra Gonzalez-Beltran, Purvesh Khatri, Steven H. Kleinstein, Martin J. O'Connor, Yannick Pouliot, Philippe Rocca-Serra, Susanna-Assunta Sansone, Jeffrey A. Wiser

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

The Center for Expanded Data Annotation and Retrieval is studying the creation of comprehensive and expressive metadata for biomedical datasets to facilitate data discovery, data interpretation, and data reuse. We take advantage of emerging community-based standard templates for describing different kinds of biomedical datasets, and we investigate the use of computational techniques to help investigators to assemble templates and to fill in their values. We are creating a repository of metadata from which we plan to identify metadata patterns that will drive predictive data entry when filling in metadata templates. The metadata repository not only will capture annotations specified when experimental datasets are initially created, but also will incorporate links to the published literature, including secondary analyses and possible refinements or retractions of experimental interpretations. By working initially with the Human Immunology Project Consortium and the developers of the ImmPort data repository, we are developing and evaluating an end-to-end solution to the problems of metadata authoring and management that will generalize to other data-management environments. © The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

Original language	English
Pages (from-to)	1148-1152
Journal	Journal of the American Medical Informatics Association
Volume	22
Issue number	6
DOIs	https://doi.org/10.1093/jamia/ocv048
Publication status	Published - Nov 2015
Externally published	Yes

Keywords

datasets as topic
data curation
data collection
standards
biological ontologies

Cite this

Musen, M. A., Bean, C. A., Cheung, K.-H., Dumontier, M., Durante, K. A., Gevaert, O., Gonzalez-Beltran, A., Khatri, P., Kleinstein, S. H., O'Connor, M. J., Pouliot, Y., Rocca-Serra, P., Sansone, S.-A., & Wiser, J. A. (2015). The center for expanded data annotation and retrieval. Journal of the American Medical Informatics Association, 22(6), 1148-1152. https://doi.org/10.1093/jamia/ocv048

@article{abf7ad9cdbb74c49aeb0888f598a4301,

title = "The center for expanded data annotation and retrieval",

abstract = "The Center for Expanded Data Annotation and Retrieval is studying the creation of comprehensive and expressive metadata for biomedical datasets to facilitate data discovery, data interpretation, and data reuse. We take advantage of emerging community-based standard templates for describing different kinds of biomedical datasets, and we investigate the use of computational techniques to help investigators to assemble templates and to fill in their values. We are creating a repository of metadata from which we plan to identify metadata patterns that will drive predictive data entry when filling in metadata templates. The metadata repository not only will capture annotations specified when experimental datasets are initially created, but also will incorporate links to the published literature, including secondary analyses and possible refinements or retractions of experimental interpretations. By working initially with the Human Immunology Project Consortium and the developers of the ImmPort data repository, we are developing and evaluating an end-to-end solution to the problems of metadata authoring and management that will generalize to other data-management environments. {\textcopyright} The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com.",

keywords = "datasets as topic, data curation, data collection, standards, biological ontologies",

author = "Musen, {Mark A.} and Bean, {Carol A.} and Kei-Hoi Cheung and Michel Dumontier and Durante, {Kim A.} and Olivier Gevaert and Alejandra Gonzalez-Beltran and Purvesh Khatri and Kleinstein, {Steven H.} and O'Connor, {Martin J.} and Yannick Pouliot and Philippe Rocca-Serra and Susanna-Assunta Sansone and Wiser, {Jeffrey A.}",

year = "2015",

month = nov,

doi = "10.1093/jamia/ocv048",

language = "English",

volume = "22",

pages = "1148--1152",

journal = "Journal of the American Medical Informatics Association",

issn = "1067-5027",

publisher = "Oxford University Press",

number = "6",

}

Musen, MA, Bean, CA, Cheung, K-H, Dumontier, M, Durante, KA, Gevaert, O, Gonzalez-Beltran, A, Khatri, P, Kleinstein, SH, O'Connor, MJ, Pouliot, Y, Rocca-Serra, P, Sansone, S-A & Wiser, JA 2015, 'The center for expanded data annotation and retrieval', Journal of the American Medical Informatics Association, vol. 22, no. 6, pp. 1148-1152. https://doi.org/10.1093/jamia/ocv048

TY - JOUR

T1 - The center for expanded data annotation and retrieval

AU - Musen, Mark A.

AU - Bean, Carol A.

AU - Cheung, Kei-Hoi

AU - Dumontier, Michel

AU - Durante, Kim A.

AU - Gevaert, Olivier

AU - Gonzalez-Beltran, Alejandra

AU - Khatri, Purvesh

AU - Kleinstein, Steven H.

AU - O'Connor, Martin J.

AU - Pouliot, Yannick

AU - Rocca-Serra, Philippe

AU - Sansone, Susanna-Assunta

AU - Wiser, Jeffrey A.

PY - 2015/11

Y1 - 2015/11

N2 - The Center for Expanded Data Annotation and Retrieval is studying the creation of comprehensive and expressive metadata for biomedical datasets to facilitate data discovery, data interpretation, and data reuse. We take advantage of emerging community-based standard templates for describing different kinds of biomedical datasets, and we investigate the use of computational techniques to help investigators to assemble templates and to fill in their values. We are creating a repository of metadata from which we plan to identify metadata patterns that will drive predictive data entry when filling in metadata templates. The metadata repository not only will capture annotations specified when experimental datasets are initially created, but also will incorporate links to the published literature, including secondary analyses and possible refinements or retractions of experimental interpretations. By working initially with the Human Immunology Project Consortium and the developers of the ImmPort data repository, we are developing and evaluating an end-to-end solution to the problems of metadata authoring and management that will generalize to other data-management environments. © The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

AB - The Center for Expanded Data Annotation and Retrieval is studying the creation of comprehensive and expressive metadata for biomedical datasets to facilitate data discovery, data interpretation, and data reuse. We take advantage of emerging community-based standard templates for describing different kinds of biomedical datasets, and we investigate the use of computational techniques to help investigators to assemble templates and to fill in their values. We are creating a repository of metadata from which we plan to identify metadata patterns that will drive predictive data entry when filling in metadata templates. The metadata repository not only will capture annotations specified when experimental datasets are initially created, but also will incorporate links to the published literature, including secondary analyses and possible refinements or retractions of experimental interpretations. By working initially with the Human Immunology Project Consortium and the developers of the ImmPort data repository, we are developing and evaluating an end-to-end solution to the problems of metadata authoring and management that will generalize to other data-management environments. © The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

KW - datasets as topic

KW - data curation

KW - data collection

KW - standards

KW - biological ontologies

U2 - 10.1093/jamia/ocv048

DO - 10.1093/jamia/ocv048

M3 - Article

C2 - 26112029

SN - 1067-5027

VL - 22

SP - 1148

EP - 1152

JO - Journal of the American Medical Informatics Association

JF - Journal of the American Medical Informatics Association

IS - 6

ER -