Big data and other challenges in the quest for orthologs

Erik L L Sonnhammer; Toni Gabaldón; Alan W Sousa da Silva; Maria Martin; Marc Robinson-Rechavi; Brigitte Boeckmann; Paul D Thomas; Christophe Dessimoz; Quest for Orthologs consortium

doi:10.1093/bioinformatics/btu492

Corresponding author: Erik L L Sonnhammer

GROW - Reproductive and Perinatal Medicine

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

Given the rapid increase of species with a sequenced genome, the need to identify orthologous genes between them has emerged as a central bioinformatics task. Many different methods exist for orthology detection, which makes it difficult to decide which one to choose for a particular application. Here, we review the latest developments and issues in the orthology field, and summarize the most recent results reported at the third 'Quest for Orthologs' meeting. We focus on community efforts such as the adoption of reference proteomes, standard file formats and benchmarking. Progress in these areas is good, and they are already beneficial to both orthology consumers and providers. However, a major current issue is that the massive increase in complete proteomes poses computational challenges to many of the ortholog database providers, as most orthology inference algorithms scale at least quadratically with the number of proteomes. The Quest for Orthologs consortium is an open community with a number of working groups that join efforts to enhance various aspects of orthology analysis, such as defining standard formats and datasets, documenting community resources and benchmarking.

AVAILABILITY AND IMPLEMENTATION: All such materials are available at http://questfororthologs.org.
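
As a rough illustration of the scaling issue raised in the abstract, the sketch below counts how many proteome pairs an all-against-all comparison would need. It assumes each unordered pair of proteomes is compared exactly once; the abstract does not describe how any particular method works, and the function name and proteome counts here are hypothetical, chosen only for illustration.

```python
from itertools import combinations


def pairwise_comparisons(n_proteomes: int) -> int:
    """Number of unordered proteome pairs: n * (n - 1) / 2."""
    return n_proteomes * (n_proteomes - 1) // 2


# Hypothetical proteome counts, chosen only for illustration.
for n in (10, 100, 1000):
    print(n, pairwise_comparisons(n))

# Cross-check the closed form against explicit enumeration for a small case.
assert pairwise_comparisons(5) == len(list(combinations(range(5), 2)))
```

Under this assumption, doubling the number of proteomes roughly quadruples the number of pairs to compare, which is what quadratic scaling means in practice.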

Original language	English
Pages (from-to)	2993-2998
Number of pages	6
Journal	Bioinformatics
Volume	30
Issue number	21
DOIs	https://doi.org/10.1093/bioinformatics/btu492
Publication status	Published - 1 Nov 2014

Keywords

Algorithms
Genomics
Protein Structure, Tertiary
Proteome
Sequence Analysis, DNA
Sequence Analysis, Protein
Sequence Homology

Access to Document

10.1093/bioinformatics/btu492
Licence: CC BY

Cite this

@article{9744b7a033994b6da61c6529357bf0cf,

title = "Big data and other challenges in the quest for orthologs",

abstract = "Given the rapid increase of species with a sequenced genome, the need to identify orthologous genes between them has emerged as a central bioinformatics task. Many different methods exist for orthology detection, which makes it difficult to decide which one to choose for a particular application. Here, we review the latest developments and issues in the orthology field, and summarize the most recent results reported at the third 'Quest for Orthologs' meeting. We focus on community efforts such as the adoption of reference proteomes, standard file formats and benchmarking. Progress in these areas is good, and they are already beneficial to both orthology consumers and providers. However, a major current issue is that the massive increase in complete proteomes poses computational challenges to many of the ortholog database providers, as most orthology inference algorithms scale at least quadratically with the number of proteomes. The Quest for Orthologs consortium is an open community with a number of working groups that join efforts to enhance various aspects of orthology analysis, such as defining standard formats and datasets, documenting community resources and benchmarking. AVAILABILITY AND IMPLEMENTATION: All such materials are available at http://questfororthologs.org.",

keywords = "Algorithms; Genomics; Protein Structure, Tertiary; Proteome; Sequence Analysis, DNA; Sequence Analysis, Protein; Sequence Homology",

author = "Sonnhammer, {Erik L L} and Toni Gabald{\'o}n and {Sousa da Silva}, {Alan W} and Maria Martin and Marc Robinson-Rechavi and Brigitte Boeckmann and Thomas, {Paul D} and Christophe Dessimoz and {Quest for Orthologs consortium}",

note = "{\textcopyright} The Author 2014. Published by Oxford University Press.",

year = "2014",

month = nov,

day = "1",

doi = "10.1093/bioinformatics/btu492",

language = "English",

volume = "30",

pages = "2993--2998",

journal = "Bioinformatics",

issn = "1367-4803",

publisher = "Oxford University Press",

number = "21",

}

TY - JOUR

T1 - Big data and other challenges in the quest for orthologs

AU - Sonnhammer, Erik L L

AU - Gabaldón, Toni

AU - Sousa da Silva, Alan W

AU - Martin, Maria

AU - Robinson-Rechavi, Marc

AU - Boeckmann, Brigitte

AU - Thomas, Paul D

AU - Dessimoz, Christophe

AU - Quest for Orthologs consortium

N1 - © The Author 2014. Published by Oxford University Press.

PY - 2014/11/1

Y1 - 2014/11/1

N2 - Given the rapid increase of species with a sequenced genome, the need to identify orthologous genes between them has emerged as a central bioinformatics task. Many different methods exist for orthology detection, which makes it difficult to decide which one to choose for a particular application. Here, we review the latest developments and issues in the orthology field, and summarize the most recent results reported at the third 'Quest for Orthologs' meeting. We focus on community efforts such as the adoption of reference proteomes, standard file formats and benchmarking. Progress in these areas is good, and they are already beneficial to both orthology consumers and providers. However, a major current issue is that the massive increase in complete proteomes poses computational challenges to many of the ortholog database providers, as most orthology inference algorithms scale at least quadratically with the number of proteomes. The Quest for Orthologs consortium is an open community with a number of working groups that join efforts to enhance various aspects of orthology analysis, such as defining standard formats and datasets, documenting community resources and benchmarking. AVAILABILITY AND IMPLEMENTATION: All such materials are available at http://questfororthologs.org.

AB - Given the rapid increase of species with a sequenced genome, the need to identify orthologous genes between them has emerged as a central bioinformatics task. Many different methods exist for orthology detection, which makes it difficult to decide which one to choose for a particular application. Here, we review the latest developments and issues in the orthology field, and summarize the most recent results reported at the third 'Quest for Orthologs' meeting. We focus on community efforts such as the adoption of reference proteomes, standard file formats and benchmarking. Progress in these areas is good, and they are already beneficial to both orthology consumers and providers. However, a major current issue is that the massive increase in complete proteomes poses computational challenges to many of the ortholog database providers, as most orthology inference algorithms scale at least quadratically with the number of proteomes. The Quest for Orthologs consortium is an open community with a number of working groups that join efforts to enhance various aspects of orthology analysis, such as defining standard formats and datasets, documenting community resources and benchmarking. AVAILABILITY AND IMPLEMENTATION: All such materials are available at http://questfororthologs.org.

KW - Algorithms

KW - Genomics

KW - Protein Structure, Tertiary

KW - Proteome

KW - Sequence Analysis, DNA

KW - Sequence Analysis, Protein

KW - Sequence Homology

U2 - 10.1093/bioinformatics/btu492

DO - 10.1093/bioinformatics/btu492

M3 - Article

C2 - 25064571

SN - 1367-4803

VL - 30

SP - 2993

EP - 2998

JO - Bioinformatics

JF - Bioinformatics

IS - 21

ER -