Abstract
Recent advances in technology have led to the growth and availability of different types of data. This has created opportunities for the fields of machine learning, data mining, chemometrics, and data science. These fields have consequently been developing new approaches and algorithms for a wide range of applications, from biomedical, medical, and -omics research to daily life and national security. Ensemble techniques have become a backbone of the machine learning field. The term refers to an approach in which multiple independent (i.e., uncorrelated) predictive models are combined, for instance by simple averaging or voting. The advantage of ensemble techniques is their ability to yield models with very high performance. The same idea is present in our daily lives: we tend to seek the opinions of several specialists before making a final decision, for instance before purchasing an item, and before hiring a new employee we seek the judgment of several referees. In this chapter, three ensemble techniques (adaptive boosting, random forest, and gradient boosting) are demonstrated in theory and in practice. Each technique is discussed from a theoretical perspective, followed by a presentation of its pros and cons. The last part of the chapter compares the techniques using two simulated data sets.
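As a minimal illustration of the two combination rules mentioned in the abstract, the sketch below applies majority voting to class labels and simple averaging to numeric predictions. It is not taken from the chapter: the function names `majority_vote` and `simple_average` and the example predictions are hypothetical and chosen only for illustration.

```python
from collections import Counter
from statistics import mean


def majority_vote(labels):
    """Return the class label predicted by the largest number of models."""
    # most_common(1) yields [(label, count)] for the most frequent label;
    # ties are resolved in favor of the label encountered first.
    return Counter(labels).most_common(1)[0][0]


def simple_average(values):
    """Return the arithmetic mean of the models' numeric predictions."""
    return mean(values)


# Hypothetical predictions of three models for a single sample.
class_votes = ["A", "B", "A"]          # classification: one label per model
regression_outputs = [2.0, 3.0, 4.0]   # regression: one value per model

voted = majority_vote(class_votes)
averaged = simple_average(regression_outputs)
print(voted, averaged)
```

Voting suits classification, where each model outputs a label, while averaging suits regression or predicted class probabilities.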
Original language | English |
---|---|
Title of host publication | Comprehensive Chemometrics |
Subtitle of host publication | Chemical and Biochemical Data Analysis |
Editors | Steven Brown, Roma Tauler, Beata Walczak |
Publisher | Elsevier BV |
Chapter | 3.32 |
Pages | 661-672 |
Number of pages | 12 |
Volume | 3 |
Edition | 2 |
ISBN (Electronic) | 978-0-444-64165-6 |
ISBN (Print) | 978-0-444-64166-3 |
Publication status | Published - 1 Jan 2020 |