Integration of genome-wide association studies with biological knowledge identifies six novel genes related to kidney function

Daniel I Chasman*, Christian Fuchsberger, Cristian Pattaro, Alexander Teumer, Carsten A Böger, Karlhans Endlich, Matthias Olden, Ming-Huei Chen, Adrienne Tin, Daniel Taliun, Man Li, Xiaoyi Gao, Mathias Gorski, Qiong Yang, Claudia Hundertmark, Meredith C Foster, Conall M O'Seaghdha, Nicole Glazer, Aaron Isaacs, Ching-Ti Liu, Albert V Smith, Jeffrey R O'Connell, Maksim Struchalin, Toshiko Tanaka, Guo Li, Andrew D Johnson, Hinco J Gierman, Mary F Feitosa, Shih-Jen Hwang, Elizabeth J Atkinson, Kurt Lohman, Marilyn C Cornelis, Asa Johansson, Anke Tönjes, Abbas Dehghan, Jean-Charles Lambert, Elizabeth G Holliday, Rossella Sorice, Zoltan Kutalik, Terho Lehtimäki, Tõnu Esko, Harshal Deshmukh, Sheila Ulivi, Audrey Y Chu, Federico Murgia, Stella Trompet, Medea Imboden, Stefan Coassin, Giorgio Pistis, Jie Jin Wang, CARDIoGRAM Consortium

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review


In genome-wide association studies (GWAS), analytical approaches that leverage biological information may further understanding of the pathophysiology of clinical traits. To discover novel associations with estimated glomerular filtration rate (eGFR), a measure of kidney function, we developed a strategy for integrating prior biological knowledge into the existing GWAS data for eGFR from the CKDGen Consortium. Our strategy focuses on single nucleotide polymorphisms (SNPs) in genes that are connected by functional evidence, determined by literature mining and gene ontology (GO) hierarchies, to genes near previously validated eGFR associations. It then requires association thresholds consistent with multiple testing, and finally evaluates novel candidates by independent replication. Among the samples of European ancestry, we identified a genome-wide significant SNP in FBXL20 (P = 5.6 × 10⁻⁹) in meta-analysis of all available data, and additional SNPs at the INHBC, LRP2, PLEKHA1, SLC3A2 and SLC7A6 genes meeting multiple-testing corrected significance for replication, with overall P-values ranging from 4.5 × 10⁻⁴ to 2.2 × 10⁻⁷. Neither the novel PLEKHA1 association nor the novel FBXL20 association, both further supported by association with eGFR among African Americans and with transcript abundance, would have been implicated by eGFR candidate gene approaches. LRP2, encoding the megalin receptor, was identified through its connection with the previously known eGFR gene DAB2 and extends understanding of the megalin system in kidney function. These findings highlight the integration of existing genome-wide association data with independent biological knowledge as a way to uncover novel candidate eGFR associations, including candidates lacking known connections to kidney-specific pathways. The strategy may also be applicable to other clinical phenotypes, although more testing will be needed to assess its potential for discovery in general.
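As an illustration only, the first two steps of the strategy (restricting attention to genes functionally connected to known eGFR genes, then applying a multiple-testing corrected threshold) might be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' pipeline: the abstract does not name the correction method, so a Bonferroni correction is assumed here, and the function names, the link table, and the SNP labels are all hypothetical. The only connection taken from the abstract is DAB2 to LRP2.

```python
from typing import Dict, List, Set


def candidate_genes(seeds: Set[str], links: Dict[str, Set[str]]) -> Set[str]:
    """Return genes linked to any seed gene, excluding the seeds themselves.

    `links` is a hypothetical table mapping a known eGFR gene to the genes
    connected to it by functional evidence (e.g. literature mining or GO).
    """
    found: Set[str] = set()
    for seed in seeds:
        found |= links.get(seed, set())
    return found - seeds


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test threshold under a Bonferroni correction (an assumption here)."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def passing_snps(pvalues: Dict[str, float], alpha: float = 0.05) -> List[str]:
    """Return SNP labels whose P-value meets the corrected threshold."""
    threshold = bonferroni_threshold(alpha, len(pvalues))
    return sorted(snp for snp, p in pvalues.items() if p <= threshold)


# Toy inputs: the DAB2 -> LRP2 link is mentioned in the abstract;
# the P-values and SNP labels below are invented for illustration.
links = {"DAB2": {"LRP2"}}
genes = candidate_genes({"DAB2"}, links)
toy_pvalues = {"snp_a": 0.001, "snp_b": 0.02, "snp_c": 0.3, "snp_d": 0.01}
hits = passing_snps(toy_pvalues)
```

With four tests the corrected threshold is 0.05 / 4 = 0.0125, so only the two smallest toy P-values pass. Candidates surviving this step would still need independent replication, as the abstract describes.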

Original language: English
Pages (from-to): 5329-43
Number of pages: 15
Journal: Human Molecular Genetics
Issue number: 24
Publication status: Published - 15 Dec 2012
Externally published: Yes


  • Amino Acid Transport Systems, Basic/genetics
  • Fusion Regulatory Protein 1, Heavy Chain/genetics
  • Genetic Predisposition to Disease/genetics
  • Genome-Wide Association Study/methods
  • Glomerular Filtration Rate/genetics
  • Humans
  • Inhibin-beta Subunits/genetics
  • Intracellular Signaling Peptides and Proteins/genetics
  • Low Density Lipoprotein Receptor-Related Protein-2/genetics
  • Membrane Proteins/genetics
  • Polymorphism, Single Nucleotide/genetics
