Robustness and reproducibility for AI learning in biomedical sciences: RENOIR

Alessandro Barberis*, Hugo J.W.L. Aerts, Francesca M. Buffa*

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

Artificial intelligence (AI) techniques are increasingly applied across various domains, favoured by the growing acquisition and public availability of large, complex datasets. Despite this trend, AI publications often suffer from a lack of reproducibility and poor generalisation of findings, undermining their scientific value and contributing to global research waste. To address these issues, focusing on the learning aspect of the AI field, we present RENOIR (REpeated random sampliNg fOr machIne leaRning), a modular open-source platform for robust and reproducible machine learning (ML) analysis. RENOIR adopts standardised pipelines for model training and testing, introducing novel elements such as the evaluation of algorithm performance as a function of sample size. Additionally, RENOIR offers automated generation of transparent and usable reports, aiming to enhance the quality and reproducibility of AI studies. To demonstrate the versatility of our tool, we applied it to benchmark datasets from the health, computer science, and STEM (Science, Technology, Engineering, and Mathematics) domains. Furthermore, we showcase RENOIR's successful application in recently published studies, where it identified classifiers for SETD2 and TP53 mutation status in cancer. Finally, we present a use case where RENOIR was employed to address a significant pharmacological challenge: predicting drug efficacy. RENOIR is freely available at https://github.com/alebarberis/renoir.
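
The sketch below does not use RENOIR's own API; it is a minimal, hypothetical Python/scikit-learn illustration of the core idea named in the abstract, namely repeated random sampling to estimate how a model's held-out performance depends on training-set size. The function name `performance_vs_sample_size`, the chosen estimator, metric, and sample sizes are all illustrative assumptions, not part of RENOIR.

```python
# Illustrative sketch only: repeated random train/test splits at several
# training-set sizes, recording held-out AUC at each size. This is NOT
# RENOIR's implementation or API.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def performance_vs_sample_size(X, y, train_sizes, n_repeats=20, seed=0):
    """For each training size, repeatedly draw random stratified splits,
    fit a classifier, and return the mean and standard deviation of the
    held-out AUC across repeats."""
    rng = np.random.RandomState(seed)
    results = {}
    for n_train in train_sizes:
        aucs = []
        for _ in range(n_repeats):
            X_tr, X_te, y_tr, y_te = train_test_split(
                X, y, train_size=n_train, stratify=y,
                random_state=rng.randint(0, 2**31 - 1))
            model = make_pipeline(
                StandardScaler(),
                LogisticRegression(max_iter=1000)).fit(X_tr, y_tr)
            aucs.append(roc_auc_score(y_te, model.predict_proba(X_te)[:, 1]))
        results[n_train] = (np.mean(aucs), np.std(aucs))
    return results


if __name__ == "__main__":
    # Public benchmark dataset used purely for demonstration.
    X, y = load_breast_cancer(return_X_y=True)
    for n, (mean_auc, sd_auc) in performance_vs_sample_size(
            X, y, train_sizes=[50, 100, 200, 400]).items():
        print(f"n_train={n}: AUC = {mean_auc:.3f} +/- {sd_auc:.3f}")
```

Summarising the spread of performance across repeated random samples at each size is one simple way to expose how stable (and how sample-hungry) a learning algorithm is, which is the kind of robustness information the abstract describes RENOIR as reporting.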
Original language: English
Article number: 1933
Number of pages: 13
Journal: Scientific Reports
Volume: 14
Issue number: 1
DOIs
Publication status: Published - 1 Dec 2024
