Abstract
Focusing on the online-reviews domain, this study aims to provide a complete solution to the sentiment-analysis task, which consists of three components: the opinion holder, the polarity of the underlying sentiment, and the target. Several challenges inherent to the problem are addressed, such as class imbalance and the need for meaningful linguistic data-augmentation techniques that increase the size of the training set and make the use of Long Short-Term Memory models (LSTMs) feasible. For both challenges, new effective approaches are proposed and evaluated. As a means of quantifying class imbalance, the Minority-to-Majority Ratio (M2MR) is introduced. The two subtasks of target and polarity detection are tackled with machine-learning methods. To support the training process, a new data set combining sentences from two different review-based corpora was constructed. The best-performing LSTM-based models use context-sensitive BERT embeddings and yield F1-scores of 0.9263 and 0.8911 over all possible classes for the polarity and target components, respectively.
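The abstract names the Minority-to-Majority Ratio (M2MR) without defining it. As a rough illustration only, the sketch below assumes that M2MR is the number of examples in the smallest class divided by the number in the largest class. This reading is inferred from the name alone; the function name `m2mr` and the sample labels are hypothetical and not taken from the paper.

```python
from collections import Counter


def m2mr(labels):
    """Assumed definition (not confirmed by the source):
    size of the smallest class divided by size of the largest class."""
    counts = Counter(labels)
    if not counts:
        raise ValueError("labels must not be empty")
    return min(counts.values()) / max(counts.values())


# Hypothetical toy labels: 8 positive and 2 negative examples.
example = ["positive"] * 8 + ["negative"] * 2
ratio = m2mr(example)
print(ratio)
```

Under this assumed definition, a value of 1.0 corresponds to perfectly balanced classes, and values closer to 0 indicate stronger imbalance.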
Original language | English |
---|---|
Number of pages | 15 |
Publication status | Published - 1 Nov 2019 |
Event | BNAIC 2019 - VU, Brussels, Belgium; Duration: 7 Nov 2019 → 8 Nov 2019 |
Conference
Conference | BNAIC 2019 |
---|---|
Country/Territory | Belgium |
City | Brussels |
Period | 7/11/19 → 8/11/19 |