WeGET: predicting new genes for molecular systems by weighted co-expression

Radek Szklarczyk*, Wout Megchelenbrink, Pavel Cizek, Marie Ledent, Gonny Velemans, Damian Szklarczyk, Martijn A. Huynen*

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review

17 Citations (Web of Science)

Abstract

We have developed the Weighted Gene Expression Tool and database (WeGET, http://weget.cmbi.umcn.nl) for the prediction of new genes of a molecular system by correlated gene expression. WeGET utilizes a compendium of 465 human and 560 murine gene expression datasets that have been collected from multiple tissues under a wide range of experimental conditions. It exploits this abundance of expression data by assigning a high weight to datasets in which the known genes of a molecular system are harmoniously up- and down-regulated. WeGET ranks new candidate genes by calculating their weighted co-expression with that system. A weighted rank is calculated for human genes and their mouse orthologs. Then, an integrated gene rank and p-value are computed using a rank-order statistic. We applied our method to predict novel genes that have a high degree of co-expression with Gene Ontology terms and pathways from KEGG and Reactome. For each query set we provide a list of predicted novel genes, the computed weights for the transcription datasets used, and the cell and tissue types that contributed to the final predictions. The performance for each query set is assessed by 10-fold cross-validation. Finally, users can use WeGET to predict novel genes that co-express with a custom query set.
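The weighting idea in the abstract can be illustrated with a minimal sketch. It is not the WeGET implementation: the abstract does not give the exact weighting formula or the rank-order statistic, so this example assumes that a dataset's weight is the mean pairwise Pearson correlation among the query genes (clipped at zero) and that a candidate's score is the weighted mean of its correlation with the query genes. The function names `dataset_weight`, `weighted_coexpression` and `rank_candidates` are hypothetical, and the integration of human and mouse ranks is left out.

```python
import numpy as np


def dataset_weight(expr, query_idx):
    """Weight of one dataset (genes x samples matrix).

    Assumption: the weight is the mean pairwise Pearson correlation among
    the query genes, clipped at zero, so datasets in which the known genes
    move together count more.
    """
    corr = np.corrcoef(expr[query_idx])
    upper = np.triu_indices(len(query_idx), k=1)  # each gene pair once
    return max(float(corr[upper].mean()), 0.0)


def weighted_coexpression(datasets, query_idx):
    """Weighted mean correlation of every gene with the query set."""
    n_genes = datasets[0].shape[0]
    scores = np.zeros(n_genes)
    total_weight = 0.0
    for expr in datasets:
        w = dataset_weight(expr, query_idx)
        corr = np.corrcoef(expr)  # genes x genes correlation matrix
        scores += w * corr[:, query_idx].mean(axis=1)
        total_weight += w
    return scores / total_weight if total_weight > 0 else scores


def rank_candidates(scores, query_idx):
    """Gene indices sorted by decreasing score, query genes excluded."""
    known = set(query_idx)
    return [int(g) for g in np.argsort(-scores) if int(g) not in known]
```

Given a list of expression matrices that share the same gene order, `rank_candidates(weighted_coexpression(datasets, query), query)` returns the candidate genes ordered by their weighted co-expression with the query set. Datasets in which the query genes are uncorrelated receive a weight near zero and contribute little to the ranking.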
Original language: English
Pages (from-to): D567-D573
Journal: Nucleic Acids Research
Volume: 44
Issue number: D1
Publication status: Published - 4 Jan 2016
