TY - JOUR
T1 - An interconnected data infrastructure to support large-scale rare disease research
AU - Johansson, Lennart F.
AU - Laurie, Steve
AU - Spalding, Dylan
AU - Gibson, Spencer
AU - Ruvolo, David
AU - Thomas, Coline
AU - Piscia, Davide
AU - de Andrade, Fernanda
AU - Been, Gerieke
AU - Bijlsma, Marieke
AU - Brunner, Han
AU - Cimerman, Sandi
AU - Dizjikan, Farid Yavari
AU - Ellwanger, Kornelia
AU - Fernandez, Marcos
AU - Freeberg, Mallory
AU - van de Geijn, Gert Jan
AU - Kanninga, Roan
AU - Maddi, Vatsalya
AU - Mehtarizadeh, Mehdi
AU - Neerincx, Pieter
AU - Ossowski, Stephan
AU - Rath, Ana
AU - Roelofs-Prins, Dieuwke
AU - Stok-Benjamins, Marloes
AU - van der Velde, K. Joeri
AU - Veal, Colin
AU - van der Vries, Gerben
AU - Wadsley, Marc
AU - Warren, Gregory
AU - Zurek, Birte
AU - Keane, Thomas
AU - Graessner, Holm
AU - Beltran, Sergi
AU - Swertz, Morris A.
AU - Brookes, Anthony J.
AU - SOLVE-RD Consortium
N1 - Publisher Copyright: © The Author(s) 2024. Published by Oxford University Press GigaScience.
PY - 2024/1/2
Y1 - 2024/1/2
AB - The Solve-RD project brings together clinicians, scientists, and patient representatives from 51 institutes spanning 15 countries to collaborate on genetically diagnosing ("solving") rare diseases (RDs). The project aims to significantly increase the diagnostic success rate by co-analyzing data from thousands of RD cases, including phenotypes, pedigrees, exome/genome sequencing, and multiomics data. Here we report on the data infrastructure devised and created to support this co-analysis. This infrastructure enables users to store, find, connect, and analyze data and metadata in a collaborative manner. Pseudonymized phenotypic and raw experimental data are submitted to the RD-Connect Genome-Phenome Analysis Platform and processed through standardized pipelines. Resulting files and novel produced omics data are sent to the European Genome-Phenome Archive, which adds unique file identifiers and provides long-term storage and controlled access services. MOLGENIS "RD3" and Café Variome "Discovery Nexus" connect data and metadata and offer discovery services, and secure cloud-based "Sandboxes" support multiparty data analysis. This successfully deployed and useful infrastructure design provides a blueprint for other projects that need to analyze large amounts of heterogeneous data.
KW - bioinformatics
KW - computational biology
KW - fair data
KW - genetics
KW - infrastructure
KW - rare disease
U2 - 10.1093/gigascience/giae058
DO - 10.1093/gigascience/giae058
M3 - Article
VL - 13
JO - GigaScience
JF - GigaScience
M1 - giae058
ER -