TY - JOUR
T1 - Biolink Model: A universal schema for knowledge graphs in clinical, biomedical, and translational science
AU - Unni, D.R.
AU - Moxon, S.A.T.
AU - Bada, M.
AU - Brush, M.
AU - Bruskiewich, R.
AU - Caufield, J.H.
AU - Clemons, P.A.
AU - Dancik, V.
AU - Dumontier, M.
AU - Fecho, K.
AU - Glusman, G.
AU - Hadlock, J.J.
AU - Harris, N.L.
AU - Joshi, A.
AU - Putman, T.
AU - Qin, G.R.
AU - Ramsey, S.A.
AU - Shefchek, K.A.
AU - Solbrig, H.
AU - Soman, K.
AU - Thessen, A.E.
AU - Haendel, M.A.
AU - Bizon, C.
AU - Mungall, C.J.
AU - The Biomedical Data Translator Consortium
AU - Celebi, Remzi
N1 - Funding Information:
The authors are grateful to members of the Publications Committees at the National Center for Advancing Translational Sciences, the National Institute of Environmental Health Sciences, and the National Institute on Aging for their review and approval of the manuscript for publication. Moreover, the authors are appreciative of the unwavering leadership and support provided by the Extramural Leadership Team and the Intramural Research Program at the National Center for Advancing Translational Sciences (NCATS).
Funding Information:
This work was supported in part by the NCATS Biomedical Data Translator program (Other Transaction Awards OT2TR003434, OT2TR003436, OT2TR003428, OT2TR003448, OT2TR003427, OT2TR003430, OT2TR003433, OT2TR003450, OT2TR003437, OT2TR003443, OT2TR003441, OT2TR003449, OT2TR003445, OT2TR003422, OT2TR003435, OT3TR002026, OT3TR002020, OT3TR002025, OT3TR002019, OT3TR002027, OT2TR002517, OT2TR002514, OT2TR002515, OT2TR002584, and OT2TR002520; Contract number 75N95021P00636). Additional funding was provided by the Office of the Director, National Institutes of Health (grant award R24‐OD011883), the National Human Genome Research Institute (grant award 7RM1HG010860‐02), and the Director, Office of Science, Office of Basic Energy Sciences, of the U.S. Department of Energy under Contract No. DE‐AC02‐05CH11231.
Publisher Copyright:
© 2022 The Authors. Clinical and Translational Science published by Wiley Periodicals LLC on behalf of American Society for Clinical Pharmacology and Therapeutics.
PY - 2022/8
Y1 - 2022/8
N2 - Within clinical, biomedical, and translational science, an increasing number of projects are adopting graphs for knowledge representation. Graph-based data models elucidate the interconnectedness among core biomedical concepts, enable data structures to be easily updated, and support intuitive queries, visualizations, and inference algorithms. However, knowledge discovery across these "knowledge graphs" (KGs) has remained difficult. Data set heterogeneity and complexity; the proliferation of ad hoc data formats; poor compliance with guidelines on findability, accessibility, interoperability, and reusability; and, in particular, the lack of a universally accepted, open-access model for standardization across biomedical KGs have left the task of reconciling data sources to downstream consumers. Biolink Model is an open-source data model that can be used to formalize the relationships between data structures in translational science. It incorporates object-oriented classification and graph-oriented features. The core of the model is a set of hierarchical, interconnected classes (or categories) and relationships between them (or predicates) representing biomedical entities such as gene, disease, chemical, anatomic structure, and phenotype. The model provides class and edge attributes and associations that guide how entities should relate to one another. Here, we highlight the need for a standardized data model for KGs, describe Biolink Model, and compare it with other models. We demonstrate the utility of Biolink Model in various initiatives, including the Biomedical Data Translator Consortium and the Monarch Initiative, and show how it has supported easier integration and interoperability of biomedical KGs, bringing together knowledge from multiple sources and helping to realize the goals of translational science.
U2 - 10.1111/cts.13302
DO - 10.1111/cts.13302
M3 - Article
C2 - 36125173
SN - 1752-8054
VL - 15
SP - 1848
EP - 1855
JO - Clinical and Translational Science
JF - Clinical and Translational Science
IS - 8
ER -