TY - JOUR
T1 - Big data and other challenges in the quest for orthologs
AU - Sonnhammer, Erik L L
AU - Gabaldón, Toni
AU - Sousa da Silva, Alan W
AU - Martin, Maria
AU - Robinson-Rechavi, Marc
AU - Boeckmann, Brigitte
AU - Thomas, Paul D
AU - Dessimoz, Christophe
AU - Quest for Orthologs consortium
N1 - © The Author 2014. Published by Oxford University Press.
PY - 2014/11/1
Y1 - 2014/11/1
N2 - Given the rapid increase of species with a sequenced genome, the need to identify orthologous genes between them has emerged as a central bioinformatics task. Many different methods exist for orthology detection, which makes it difficult to decide which one to choose for a particular application. Here, we review the latest developments and issues in the orthology field, and summarize the most recent results reported at the third 'Quest for Orthologs' meeting. We focus on community efforts such as the adoption of reference proteomes, standard file formats and benchmarking. Progress in these areas is good, and they are already beneficial to both orthology consumers and providers. However, a major current issue is that the massive increase in complete proteomes poses computational challenges to many of the ortholog database providers, as most orthology inference algorithms scale at least quadratically with the number of proteomes. The Quest for Orthologs consortium is an open community with a number of working groups that join efforts to enhance various aspects of orthology analysis, such as defining standard formats and datasets, documenting community resources and benchmarking. AVAILABILITY AND IMPLEMENTATION: All such materials are available at http://questfororthologs.org.
AB - Given the rapid increase of species with a sequenced genome, the need to identify orthologous genes between them has emerged as a central bioinformatics task. Many different methods exist for orthology detection, which makes it difficult to decide which one to choose for a particular application. Here, we review the latest developments and issues in the orthology field, and summarize the most recent results reported at the third 'Quest for Orthologs' meeting. We focus on community efforts such as the adoption of reference proteomes, standard file formats and benchmarking. Progress in these areas is good, and they are already beneficial to both orthology consumers and providers. However, a major current issue is that the massive increase in complete proteomes poses computational challenges to many of the ortholog database providers, as most orthology inference algorithms scale at least quadratically with the number of proteomes. The Quest for Orthologs consortium is an open community with a number of working groups that join efforts to enhance various aspects of orthology analysis, such as defining standard formats and datasets, documenting community resources and benchmarking. AVAILABILITY AND IMPLEMENTATION: All such materials are available at http://questfororthologs.org.
KW - Algorithms
KW - Genomics
KW - Protein Structure, Tertiary
KW - Proteome
KW - Sequence Analysis, DNA
KW - Sequence Analysis, Protein
KW - Sequence Homology
U2 - 10.1093/bioinformatics/btu492
DO - 10.1093/bioinformatics/btu492
M3 - Article
C2 - 25064571
SN - 1367-4803
VL - 30
SP - 2993
EP - 2998
JO - Bioinformatics
JF - Bioinformatics
IS - 21
ER -