Whole Genome Prediction of Bladder Cancer Risk With the Bayesian LASSO

Evangelina Lopez de Maturana, Stephen J. Chanock, Antoni C. Picornell, Nathaniel Rothman, Jesus Herranz, M. Luz Calle, Montserrat Garcia-Closas, Gaelle Marenne, Angela Brand, Adonina Tardon, Alfredo Carrato, Debra T. Silverman, Manolis Kogevinas, Daniel Gianola, Francisco X. Real, Nuria Malats*

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

To build a predictive model for urothelial carcinoma of the bladder (UCB) risk combining both genomic and nongenomic data, 1,127 cases and 1,090 controls from the Spanish Bladder Cancer/EPICURO study were genotyped using the HumanHap 1M SNP array. After quality control filters, genotypes from 475,290 variants were available. Nongenomic information comprised age, gender, region, and smoking status. Three Bayesian threshold models were implemented including: (1) only genomic information, (2) only nongenomic data, and (3) both sources of information. The three models were applied to the whole population, to only nonsmokers, to male smokers, and to extreme phenotypes to potentiate the UCB genetic component. The area under the ROC curve allowed evaluating the predictive ability of each model in a 10-fold cross-validation scenario. Smoking status showed the highest predictive ability of UCB risk (AUC(test) = 0.62). On the other hand, the AUC of all genetic variants was poorer (0.53). When the extreme phenotype approach was applied, the predictive ability of the genomic model improved 15%. This study represents a first attempt to build a predictive model for UCB risk combining both genomic and nongenomic data and applying state-of-the-art statistical approaches. However, the lack of genetic relatedness among individuals, the complexity of UCB etiology, as well as a relatively small statistical power, may explain the low predictive ability for UCB risk. The study confirms the difficulty of predicting complex diseases using genetic data, and suggests the limited translational potential of findings from this type of data into public health interventions. Genet Epidemiol 38: 467-476, 2014.
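
The evaluation scheme described in the abstract (area under the ROC curve estimated in 10-fold cross-validation) can be illustrated with a minimal sketch. Everything here is an assumption for illustration only: the data are simulated, the single binary covariate merely stands in for a predictor such as smoking status, and the "model" is a plain case-fraction lookup rather than the Bayesian threshold models used in the study. The function names (`auc`, `kfold_indices`, `cross_validated_auc`, `simulate`) are hypothetical.

```python
import random


def auc(scores, labels):
    """AUC as the probability that a random case outscores a random control.

    Ties count as one half (the Mann-Whitney formulation).
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("AUC needs at least one case and one control")
    wins = 0.0
    for c in cases:
        for d in controls:
            if c > d:
                wins += 1.0
            elif c == d:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def kfold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validated_auc(x, y, k=10, seed=0):
    """Mean test-fold AUC for a toy model on one binary covariate x.

    The toy model scores each test individual with the fraction of cases
    observed in the training folds at the same level of x.
    """
    fold_aucs = []
    for test in kfold_indices(len(y), k, seed):
        held_out = set(test)
        train = [i for i in range(len(y)) if i not in held_out]
        rate = {}
        for level in (0, 1):
            outcomes = [y[i] for i in train if x[i] == level]
            # Fall back to 0.5 if a level is absent from the training folds.
            rate[level] = sum(outcomes) / len(outcomes) if outcomes else 0.5
        scores = [rate[x[i]] for i in test]
        fold_aucs.append(auc(scores, [y[i] for i in test]))
    return sum(fold_aucs) / len(fold_aucs)


def simulate(n, seed=0):
    """Simulated (not real) data: binary exposure x and case status y."""
    rng = random.Random(seed)
    x = [rng.randint(0, 1) for _ in range(n)]
    y = [1 if rng.random() < (0.8 if xi == 1 else 0.2) else 0 for xi in x]
    return x, y


if __name__ == "__main__":
    x, y = simulate(400, seed=1)
    print("mean 10-fold test AUC:", round(cross_validated_auc(x, y, k=10, seed=1), 3))
```

Because each fold's AUC is computed only on individuals held out from fitting, the averaged value corresponds to the test-set AUC that the abstract reports, as opposed to an optimistic in-sample figure.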
Original language: English
Pages (from-to): 467-476
Journal: Genetic Epidemiology
Volume: 38
Issue number: 5
Publication status: Published - Jul 2014

Keywords

  • Bayesian shrinkage method
  • area under the ROC curve
  • urothelial carcinoma of the bladder
  • genomic predictive model
