Abstract
Biomarker discovery, i.e., finding disease- or condition-specific biological markers, is a crucial aspect of biomedical research. Volatile organic compounds (VOCs) are excreted by various biofluids, cells, tissues, and bacteria, and have been investigated extensively for their potential as markers of dysfunction in humans. The VOCs excreted by these media, typically detected using sophisticated analytical instrumentation, are numerous and biologically complex. Data preprocessing and analysis are therefore crucial for identifying valid VOC markers that can be applied in clinical practice. This chapter provides an overview of preprocessing approaches suitable for volatilome data of diverse nature. The importance of normalization and scaling, often neglected in the field, is discussed. The most common and promising machine learning techniques, both unsupervised and supervised, are presented and discussed, followed by data fusion, a strategy rarely used in the volatilomics field. The chapter aims to equip the reader with a basic overview of suitable techniques for treating and successfully exploiting volatilome data.
Original language | English |
---|---|
Title of host publication | Breathborne Biomarkers and the Human Volatilome |
Editors | Jonathan Beauchamp, Cristina Davis, Joachim Pleil |
Publisher | Elsevier |
Chapter | 38 |
Pages | 633-647 |
Number of pages | 15 |
ISBN (Electronic) | 9780128199671 |
ISBN (Print) | 9780128223970 |
Publication status | Published - 1 Jan 2020 |
Keywords
- Data fusion
- Machine learning
- Multivariate
- Supervised
- Unsupervised
- Volatile organic compounds (VOCs)