Importance of collection in gene set enrichment analysis of drug response in cancer cell lines

Alain R. Bateman, Nehme El-Hachem, Andrew H. Beck, Hugo J. W. L. Aerts, Benjamin Haibe-Kains*

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review


Gene set enrichment analysis (GSEA) associates gene sets with phenotypes; its use is predicated on the choice of a pre-defined collection of sets. The de facto standard implementation of GSEA provides seven collections, yet there are no guidelines for choosing among them, and the impact of that choice, if any, is unknown. Here we compare each of the standard gene set collections in the context of a large dataset of drug response in human cancer cell lines. We also define and test a new collection based on gene co-expression in cancer cell lines, so that the standard collections can be compared with an externally derived, cell line-based collection. The results show that GSEA findings vary significantly depending on the collection chosen for analysis. Henceforth, collections should be carefully selected and reported in studies that leverage GSEA.
Original language: English
Article number: 4092
Journal: Scientific Reports
Publication status: Published - 13 Feb 2014
