Drug prioritization using the semantic properties of a knowledge graph

Tareq B. Malas; Wytze J. Vlietstra; Roman Kudrin; Sergey Starikov; Mohammed Charrout; Marco Roos; Dorien J. M. Peters; Jan A. Kors; Rein Vos; Peter A. C. 't Hoen; Erik M. van Mulligen; Kristina M. Hettne

doi:10.1038/s41598-019-42806-6

Corresponding author: Kristina M. Hettne

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

Compounds that are candidates for drug repurposing can be ranked by leveraging knowledge available in the biomedical literature and databases. This knowledge, spread across a variety of sources, can be integrated within a knowledge graph, which thereby comprehensively describes known relationships between biomedical concepts, such as drugs, diseases, genes, etc. Our work uses the semantic information between drug and disease concepts as features, which are extracted from an existing knowledge graph that integrates 200 different biological knowledge sources. RepoDB, a standard drug repurposing database which describes drug-disease combinations that were approved or that failed in clinical trials, is used to train a random forest classifier. The 10-times repeated 10-fold cross-validation performance of the classifier achieves a mean area under the receiver operating characteristic curve (AUC) of 92.2%. We apply the classifier to prioritize 21 preclinical drug repurposing candidates that have been suggested for Autosomal Dominant Polycystic Kidney Disease (ADPKD). Mozavaptan, a vasopressin V2 receptor antagonist, is predicted to be the drug most likely to be approved after a clinical trial, and belongs to the same drug class as tolvaptan, the only treatment for ADPKD that is currently approved. We conclude that semantic properties of concepts in a knowledge graph can be exploited to prioritize drug repurposing candidates for testing in clinical trials.
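
The evaluation protocol named in the abstract, 10-times repeated 10-fold cross-validation summarized as a mean AUC, can be illustrated with a minimal sketch. This is not the authors' pipeline: the data are synthetic, the function names (`roc_auc`, `stratified_folds`, `centroid_scores`, `repeated_cv_auc`) are hypothetical, and a nearest-centroid scorer stands in for the random forest so that the example needs only the Python standard library.

```python
import random
from statistics import mean


def roc_auc(labels, scores):
    """AUC as the chance that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_folds(labels, k, rng):
    """Shuffle indices per class and deal them round-robin into k folds."""
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds


def centroid_scores(X, y, train, test):
    """Stand-in model (NOT a random forest): higher score means closer to the class-1 centroid."""
    def centroid(cls):
        rows = [X[i] for i in train if y[i] == cls]
        return [mean(col) for col in zip(*rows)]

    def dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b)) ** 0.5

    c1, c0 = centroid(1), centroid(0)
    return [dist(X[i], c0) - dist(X[i], c1) for i in test]


def repeated_cv_auc(X, y, k=10, repeats=10, seed=0):
    """Run k-fold cross-validation `repeats` times; return the mean AUC and all fold AUCs."""
    rng = random.Random(seed)
    aucs = []
    for _ in range(repeats):
        folds = stratified_folds(y, k, rng)  # fresh shuffle for every repeat
        for f in range(k):
            test = folds[f]
            train = [i for g in range(k) if g != f for i in folds[g]]
            scores = centroid_scores(X, y, train, test)
            aucs.append(roc_auc([y[i] for i in test], scores))
    return mean(aucs), aucs


# Synthetic stand-in data: 1 = "approved", 0 = "failed"; five made-up features per pair.
data_rng = random.Random(42)
y = [1] * 100 + [0] * 100
X = [[data_rng.gauss(1.0 if label else 0.0, 1.0) for _ in range(5)] for label in y]

mean_auc, fold_aucs = repeated_cv_auc(X, y)
print(f"mean AUC over {len(fold_aucs)} folds: {mean_auc:.3f}")
```

Averaging over repeated, reshuffled splits reduces the dependence of the reported AUC on any single partition of the labelled drug-disease pairs. The number printed here reflects only the synthetic data and says nothing about the 92.2% reported above.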

Original language	English
Article number	6281
Number of pages	10
Journal	Scientific Reports
Volume	9
DOIs	https://doi.org/10.1038/s41598-019-42806-6
Publication status	Published - 18 Apr 2019

Keywords

POLYCYSTIC KIDNEY-DISEASE
PROLIFERATION
GROWTH

Access to Document

10.1038/s41598-019-42806-6
Licence: CC BY

Cite this

@article{b56a08d8ea014720a16099e2e9504a14,

title = "Drug prioritization using the semantic properties of a knowledge graph",

abstract = "Compounds that are candidates for drug repurposing can be ranked by leveraging knowledge available in the biomedical literature and databases. This knowledge, spread across a variety of sources, can be integrated within a knowledge graph, which thereby comprehensively describes known relationships between biomedical concepts, such as drugs, diseases, genes, etc. Our work uses the semantic information between drug and disease concepts as features, which are extracted from an existing knowledge graph that integrates 200 different biological knowledge sources. RepoDB, a standard drug repurposing database which describes drug-disease combinations that were approved or that failed in clinical trials, is used to train a random forest classifier. The 10-times repeated 10-fold cross-validation performance of the classifier achieves a mean area under the receiver operating characteristic curve (AUC) of 92.2%. We apply the classifier to prioritize 21 preclinical drug repurposing candidates that have been suggested for Autosomal Dominant Polycystic Kidney Disease (ADPKD). Mozavaptan, a vasopressin V2 receptor antagonist, is predicted to be the drug most likely to be approved after a clinical trial, and belongs to the same drug class as tolvaptan, the only treatment for ADPKD that is currently approved. We conclude that semantic properties of concepts in a knowledge graph can be exploited to prioritize drug repurposing candidates for testing in clinical trials.",

keywords = "POLYCYSTIC KIDNEY-DISEASE, PROLIFERATION, GROWTH",

author = "Malas, {Tareq B.} and Vlietstra, {Wytze J.} and Roman Kudrin and Sergey Starikov and Mohammed Charrout and Marco Roos and Peters, {Dorien J. M.} and Kors, {Jan A.} and Rein Vos and {'t Hoen}, {Peter A. C.} and {van Mulligen}, {Erik M.} and Hettne, {Kristina M.}",

note = "Funding Information: We acknowledge the support by the European Community{\textquoteright}s Seventh Framework Programme (FP7/2007–2013) under grant agreement 305444 {\textquoteleft}RD‐Connect{\textquoteright} and People Program (Marie Curie Actions) under Research Executive Agency Grant Agreement 317246 {\textquoteleft}TranCYST{\textquoteright}. Publisher Copyright: {\textcopyright} 2019, The Author(s).",

year = "2019",

month = apr,

day = "18",

doi = "10.1038/s41598-019-42806-6",

language = "English",

volume = "9",

journal = "Scientific Reports",

issn = "2045-2322",

publisher = "Nature Publishing Group",

}

TY - JOUR

T1 - Drug prioritization using the semantic properties of a knowledge graph

AU - Malas, Tareq B.

AU - Vlietstra, Wytze J.

AU - Kudrin, Roman

AU - Starikov, Sergey

AU - Charrout, Mohammed

AU - Roos, Marco

AU - Peters, Dorien J. M.

AU - Kors, Jan A.

AU - Vos, Rein

AU - 't Hoen, Peter A. C.

AU - van Mulligen, Erik M.

AU - Hettne, Kristina M.

N1 - Funding Information: We acknowledge the support by the European Community’s Seventh Framework Programme (FP7/2007–2013) under grant agreement 305444 ‘RD‐Connect’ and People Program (Marie Curie Actions) under Research Executive Agency Grant Agreement 317246 ‘TranCYST’. Publisher Copyright: © 2019, The Author(s).

PY - 2019/4/18

Y1 - 2019/4/18

N2 - Compounds that are candidates for drug repurposing can be ranked by leveraging knowledge available in the biomedical literature and databases. This knowledge, spread across a variety of sources, can be integrated within a knowledge graph, which thereby comprehensively describes known relationships between biomedical concepts, such as drugs, diseases, genes, etc. Our work uses the semantic information between drug and disease concepts as features, which are extracted from an existing knowledge graph that integrates 200 different biological knowledge sources. RepoDB, a standard drug repurposing database which describes drug-disease combinations that were approved or that failed in clinical trials, is used to train a random forest classifier. The 10-times repeated 10-fold cross-validation performance of the classifier achieves a mean area under the receiver operating characteristic curve (AUC) of 92.2%. We apply the classifier to prioritize 21 preclinical drug repurposing candidates that have been suggested for Autosomal Dominant Polycystic Kidney Disease (ADPKD). Mozavaptan, a vasopressin V2 receptor antagonist, is predicted to be the drug most likely to be approved after a clinical trial, and belongs to the same drug class as tolvaptan, the only treatment for ADPKD that is currently approved. We conclude that semantic properties of concepts in a knowledge graph can be exploited to prioritize drug repurposing candidates for testing in clinical trials.

AB - Compounds that are candidates for drug repurposing can be ranked by leveraging knowledge available in the biomedical literature and databases. This knowledge, spread across a variety of sources, can be integrated within a knowledge graph, which thereby comprehensively describes known relationships between biomedical concepts, such as drugs, diseases, genes, etc. Our work uses the semantic information between drug and disease concepts as features, which are extracted from an existing knowledge graph that integrates 200 different biological knowledge sources. RepoDB, a standard drug repurposing database which describes drug-disease combinations that were approved or that failed in clinical trials, is used to train a random forest classifier. The 10-times repeated 10-fold cross-validation performance of the classifier achieves a mean area under the receiver operating characteristic curve (AUC) of 92.2%. We apply the classifier to prioritize 21 preclinical drug repurposing candidates that have been suggested for Autosomal Dominant Polycystic Kidney Disease (ADPKD). Mozavaptan, a vasopressin V2 receptor antagonist, is predicted to be the drug most likely to be approved after a clinical trial, and belongs to the same drug class as tolvaptan, the only treatment for ADPKD that is currently approved. We conclude that semantic properties of concepts in a knowledge graph can be exploited to prioritize drug repurposing candidates for testing in clinical trials.

KW - POLYCYSTIC KIDNEY-DISEASE

KW - PROLIFERATION

KW - GROWTH

U2 - 10.1038/s41598-019-42806-6

DO - 10.1038/s41598-019-42806-6

M3 - Article

SN - 2045-2322

VL - 9

JO - Scientific Reports

JF - Scientific Reports

M1 - 6281

ER -