Comparison of supervised clustering methods to discriminate genotoxic from non-genotoxic carcinogens by gene expression profiling

J.H. van Delft*, E. van Agen, S.G.J. van Breda, M.H.M. van Herwijnen, Y. Staal, J.C. Kleinjans

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review


Prediction of the toxic properties of chemicals based on modulation of gene expression profiles in exposed cells or animals is one of the major applications of toxicogenomics. Previously, we demonstrated that Pearson correlation analysis of gene expression profiles from treated HepG2 cells makes it possible to correctly discriminate genotoxic from non-genotoxic carcinogens and to predict to which of the two classes a carcinogen belongs. Because many different supervised clustering methods for discrimination and prediction are now available, we investigated whether applying the methods provided by the Whitehead Institute and Stanford University improved our initial prediction. Four supervised clustering methods were compared, namely Pearson correlation analysis (Pearson), nearest shrunken centroids analysis (NSC), K-nearest neighbour analysis (KNN) and weighted voting (WV). For each method, three approaches were followed: (1) using all the data points for all treatments, (2) excluding the samples with marginally affected gene expression profiles and (3) filtering out the gene expression signals that were hardly altered. On the complete data set, NSC, KNN and WV outperformed the Pearson test, but on the reduced data sets no clear difference was observed. Exclusion of samples with marginally affected profiles improved the prediction by all methods. For the various prediction models, gene sets of different compositions were selected; across these sets, 27 genes appeared three times or more. These 27 genes are involved in many different biological processes and molecular functions, such as apoptosis, cell cycle control, regulation of transcription, and transporter activity, many of them related to the carcinogenic process. One gene, BAX, was selected in all 10 models, while ZFP36 was selected in 9, and AHR, MT1E and TTR in 8.
Summarising, this study demonstrates that several supervised clustering methods can be used to discriminate certain genotoxic from non-genotoxic carcinogens by gene expression profiling in vitro in HepG2 cells. None of the methods clearly outperforms the others.
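As a rough illustration of the kind of correlation-based class prediction compared above, the following minimal Python sketch assigns a sample to the class whose mean expression profile it correlates with most strongly. It is not the authors' implementation: the abstract does not describe the exact procedure, so the centroid construction, the leave-one-out check, the function names and the toy log-ratio values for four unnamed genes are all assumptions made for illustration only.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a flat profile carries no correlation information
    return cov / (sx * sy)


def class_centroids(profiles, labels):
    """Mean expression profile per class (hypothetical helper)."""
    groups = {}
    for profile, label in zip(profiles, labels):
        groups.setdefault(label, []).append(profile)
    return {
        label: [sum(col) / len(col) for col in zip(*members)]
        for label, members in groups.items()
    }


def predict(profile, centroids):
    """Assign the class whose centroid correlates best with the profile."""
    return max(centroids, key=lambda label: pearson(profile, centroids[label]))


def leave_one_out_accuracy(profiles, labels):
    """Fraction of samples predicted correctly when each is held out in turn."""
    hits = 0
    for i in range(len(profiles)):
        train_p = profiles[:i] + profiles[i + 1:]
        train_l = labels[:i] + labels[i + 1:]
        centroids = class_centroids(train_p, train_l)
        hits += predict(profiles[i], centroids) == labels[i]
    return hits / len(profiles)


# Made-up log-ratio values for four unnamed genes; NOT data from the study.
profiles = [
    [1.2, 0.9, -0.8, -1.1],   # genotoxic (GTX), invented
    [1.0, 1.1, -1.0, -0.7],
    [0.8, 1.3, -0.9, -1.2],
    [-1.0, -0.8, 1.1, 0.9],   # non-genotoxic (NGTX), invented
    [-1.2, -1.1, 0.7, 1.0],
    [-0.9, -1.0, 1.2, 0.8],
]
labels = ["GTX"] * 3 + ["NGTX"] * 3

accuracy = leave_one_out_accuracy(profiles, labels)
```

Under these assumptions, excluding marginally affected samples or filtering hardly altered genes, as in approaches (2) and (3), would amount to dropping rows or columns from `profiles` before calling `leave_one_out_accuracy`.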
Original language: English
Pages (from-to): 17-33
Journal: Mutation Research-Fundamental and Molecular Mechanisms of Mutagenesis
Issue number: 1-2
Publication status: Published - 1 Jan 2005

