Publication Details

FIT BUT at SemEval-2023 Task 12: Sentiment Without Borders - Multilingual Domain Adaptation for Low-Resource Sentiment Classification

APAROVICH Maksim, KESIRAJU Santosh, DUFKOVÁ Aneta and SMRŽ Pavel. FIT BUT at SemEval-2023 Task 12: Sentiment Without Borders - Multilingual Domain Adaptation for Low-Resource Sentiment Classification. In: Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023). Toronto (online): Association for Computational Linguistics, 2023, pp. 1518-1524. ISBN 978-1-959429-99-9. Available from: https://aclanthology.org/2023.semeval-1.209/
Czech title
FIT BUT at SemEval-2023 Task 12: Sentiment bez hranic - Multilinguální adaptace domény pro klasifikaci sentimentu s nízkými zdroji
Type
conference paper
Language
English
Authors
Aparovich Maksim (DCGM FIT BUT)
Kesiraju Santosh (DCGM FIT BUT)
Dufková Aneta, Ing. (FIT BUT)
Smrž Pavel, doc. RNDr., Ph.D. (DCGM FIT BUT)
Keywords

sentiment analysis, cross-lingual sentiment analysis, domain adaptation, adversarial training, low-resource languages, African languages, transformer, feed-forward neural network

Abstract

This paper presents our method for SemEval-2023 Task 12, which focuses on sentiment analysis for low-resource African languages. The method uses a language-centric domain adaptation approach based on adversarial training, in which a small version of Afro-XLM-Roberta serves as the generator model and a feed-forward network as the discriminator. We participated in all three subtasks: monolingual (12 tracks), multilingual (1 track), and zero-shot (2 tracks). Our results show an improvement in weighted F1 on 13 of the 15 tracks, with a maximum increase of 4.3 points over the baseline for Moroccan Arabic. We observed that using language-family-based labels together with sequence-level input representations for the discriminator improves the quality of cross-lingual sentiment analysis for languages unseen during training. Additionally, our experimental results suggest that training the system on languages that are close to each other in the language family tree improves sentiment analysis for low-resource languages. Lastly, the computational complexity of the prediction step was kept at the same level, which makes the approach interesting from a practical perspective. The code of the approach can be found in our repository.
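
The abstract reports results as weighted F1. As a reading aid, the following is a minimal sketch of how that metric is commonly computed: per-class F1 scores averaged with weights proportional to each class's share of the gold labels. The function name `weighted_f1` and the toy labels are illustrative assumptions, not taken from the paper or its repository.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Average per-class F1, weighting each class by its share of the gold labels.

    Hypothetical helper for illustration; not the authors' evaluation code.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one example is required")

    support = Counter(y_true)  # number of gold examples per class
    total = len(y_true)
    score = 0.0
    for label, n_true in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n_true - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += (n_true / total) * f1
    return score


# Toy example with made-up labels (assumed three-way sentiment classes).
gold = ["positive", "positive", "negative", "neutral"]
pred = ["positive", "negative", "negative", "neutral"]
print(round(weighted_f1(gold, pred), 4))
```

Because the weights follow the gold label distribution, frequent classes dominate the score, which matters when sentiment classes are imbalanced across languages.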

Published
2023
Pages
1518-1524
Proceedings
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)
Conference
The 61st Annual Meeting of the Association for Computational Linguistics, Toronto, CA
ISBN
978-1-959429-99-9
Publisher
Association for Computational Linguistics
Place
Toronto (online), CA
DOI
10.18653/v1/2023.semeval-1.209
BibTeX
@INPROCEEDINGS{FITPUB13043,
   author = "Maksim Aparovich and Santosh Kesiraju and Aneta Dufkov\'{a} and Pavel Smr\v{z}",
   title = "FIT BUT at SemEval-2023 Task 12: Sentiment Without Borders - Multilingual Domain Adaptation for Low-Resource Sentiment Classification",
   pages = "1518--1524",
   booktitle = "Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)",
   year = 2023,
   location = "Toronto (online), CA",
   publisher = "Association for Computational Linguistics",
   ISBN = "978-1-959429-99-9",
   doi = "10.18653/v1/2023.semeval-1.209",
   language = "english",
   url = "https://www.fit.vut.cz/research/publication/13043"
}