Abstract
In this paper, we address several aspects of applying classical machine learning algorithms to a regression problem. We validate our approach by comparing predictive power on revenue data from a large Russian restaurant chain. We pay special attention to two problems: data heterogeneity and a large number of correlated features. We describe two methods for handling heterogeneity: weighting observations and estimating models on subsamples. We define a weighting function via the Mahalanobis distance in feature space and show its predictive properties for the following methods: ordinary least squares regression, elastic net, support vector regression, and random forest.
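The observation-weighting idea can be illustrated with a minimal sketch. The abstract does not give the form of the weighting function, so everything specific here is an assumption made for illustration: the Gaussian kernel, the `bandwidth` parameter, the helper names `mahalanobis_weights` and `weighted_ols`, and the synthetic data. It is not the authors' implementation.

```python
import numpy as np

def mahalanobis_weights(X, x0, bandwidth=1.0):
    """Weights that decay with Mahalanobis distance from the target point x0.

    The Gaussian kernel and `bandwidth` are assumptions, not taken from the paper.
    """
    cov = np.cov(X, rowvar=False)
    cov_inv = np.linalg.pinv(cov)  # pseudo-inverse tolerates highly correlated features
    diff = X - x0
    # Squared Mahalanobis distance of every row of X to x0.
    d2 = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
    return np.exp(-d2 / (2.0 * bandwidth ** 2))

def weighted_ols(X, y, w):
    """Weighted least squares with an intercept; returns [b0, b1, ..., bp]."""
    A = np.column_stack([np.ones(len(X)), X])
    sw = np.sqrt(w)  # scaling rows by sqrt(w) turns WLS into ordinary least squares
    coef, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    return coef

# Synthetic data standing in for the revenue data set (purely illustrative).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 2.0 + X @ np.array([1.0, -0.5, 0.25]) + rng.normal(scale=0.1, size=200)

x0 = X[0]                        # observation whose outcome we want to predict
w = mahalanobis_weights(X, x0)   # nearby observations get weights close to 1
coef = weighted_ols(X, y, w)
prediction = coef[0] + x0 @ coef[1:]
```

The same weights could be passed to other estimators that accept per-observation weights; in scikit-learn, for example, `ElasticNet`, `SVR`, and `RandomForestRegressor` take a `sample_weight` argument in `fit`.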
Original language | English |
---|---|
Title of host publication | Analysis of Images, Social Networks and Texts - 8th International Conference, AIST 2019, Revised Selected Papers |
Editors | Wil M.P. van der Aalst, Vladimir Batagelj, Dmitry I. Ignatov, Valentina Kuskova, Sergei O. Kuznetsov, Irina A. Lomazova, Michael Khachay, Andrey Kutuzov, Natalia Loukachevitch, Amedeo Napoli, Panos M. Pardalos, Marcello Pelillo, Andrey V. Savchenko, Elena Tutubalina |
Publisher | Springer, Cham |
Pages | 27-36 |
Number of pages | 10 |
ISBN (Print) | 9783030395742 |
Publication status | Published - 2020 |
Externally published | Yes |
Event | 8th International Conference on Analysis of Images, Social Networks and Texts - Kazan Federal University, Kazan, Russian Federation. Duration: 17 Jul 2019 → 19 Jul 2019. Conference number: 8. https://2019.aistconf.org/ |
Publication series
Series | Communications in Computer and Information Science |
---|---|
Volume | 1086 CCIS |
ISSN | 1865-0929 |
Conference
Conference | 8th International Conference on Analysis of Images, Social Networks and Texts |
---|---|
Abbreviated title | AIST 2019 |
Country/Territory | Russian Federation |
City | Kazan |
Period | 17/07/19 → 19/07/19 |
Internet address | https://2019.aistconf.org/ |
Keywords
- Machine learning
- Revenue prediction
- Weighted regression