Sentiment analysis of user texts based on the tuning of the training parameters of a distilled model of the BERT family
DOI: https://doi.org/10.17308/sait/1995-5499/2022/3/139-151

Keywords: sentiment analysis, text sentiment classification, distillation, learning model, data preprocessing, data normalization, BERT, ruBert, Python

Abstract
The increasing complexity of neural network architectures and the growing volume of data processed in machine learning raise the question of more productive approaches that would optimize the development of text classification models for sentiment analysis tasks. The aim of this work is to train and optimize approaches to data classification as part of solving the sentiment analysis problem for Russian-language text. This research proposes applying pre-trained BERT bidirectional encoder models, as well as the ruBERT-tiny knowledge-distilled model, to perform multiclass text classification for sentiment analysis of user texts. Applying a data compaction step to knowledge distillation models makes it possible to optimize the training phase of the text classification models. A program was developed in Python using machine learning libraries. The technical solution makes it possible to test pre-trained data classification models and, on their basis, to create optimized classification models for sentiment analysis of user texts that take into account the specifics of the subject area.
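The abstract refers to knowledge distillation, by which the compact ruBERT-tiny model is derived from a larger BERT teacher. The core of that technique is training the student to match the teacher's temperature-softened output distribution. The exact objective used in the paper is not given in the abstract, so the following is a minimal stdlib-only sketch of the standard distillation loss (KL divergence between softened teacher and student distributions), with the temperature value chosen purely for illustration:

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits,
    softened by the given temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence between the temperature-softened teacher and
    student distributions, scaled by T^2 as in standard knowledge
    distillation. Illustrative only; the paper's exact loss is not
    stated in the abstract."""
    p = softmax(teacher_logits, temperature)  # teacher (target) distribution
    q = softmax(student_logits, temperature)  # student distribution
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q)) * temperature ** 2

# A student that already matches the teacher incurs (near-)zero loss;
# a mismatched student incurs a positive loss.
print(distillation_loss([1.0, 2.0, 0.5], [1.0, 2.0, 0.5]))  # ~0.0
print(distillation_loss([0.1, 0.1, 0.1], [1.0, 2.0, 0.5]))  # > 0
```

In practice this loss is usually combined with the ordinary cross-entropy on the hard sentiment labels (negative/neutral/positive for the multiclass setting described above), weighted by a mixing coefficient.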