Analysis of the approaches to classifying emotions in non-verbal communications based on machine learning

Authors

DOI:

https://doi.org/10.17308/sait.2020.4/3206

Keywords:

computation of emotions, emotion recognition, visualisation of multidimensional data, support vector machine, k-nearest neighbours

Abstract

Due to the active development of human-machine interaction systems and digital communication channels, emotion recognition has become an important problem. Researchers working on automated emotion recognition usually focus on the behavioural (expressive) component of emotions, since it can be analysed remotely, without involving the subject. This component can be represented by various modalities: facial expressions, posture and body movements, verbal and non-verbal behaviour. Non-verbal behaviour, alongside other modalities, can be used for the indirect recognition of emotions. Analysis of this modality becomes particularly relevant when there is little or no data from the other modalities, as well as in multimodal recognition models. The article considers an approach to emotion recognition during communication based on processing feature representations of speech recordings in the eGeMAPS feature set, which makes it possible to extract the most relevant information about non-verbal emotion expression in an audio signal. Emotion recognition was performed using the following datasets: CREMA-D, IEMOCAP, Emo-DB, RAVDESS, SAVEE, and TESS, as well as their combinations. To assess in advance whether a given dataset is applicable in the feature space considered, the data were visualised with the t-SNE algorithm. The classification methods were selected based on a metric assessment of the mutual distribution of the data: the k-nearest neighbours method and the support vector machine method. The article presents the classification results of the analysed algorithms based on the following metrics: accuracy (percentage of correct answers), precision, and recall. The experiments demonstrated that the support vector machine method performs better for multiclass classification, whereas the k-nearest neighbours method is better for binary classification. When recognising individual classes, both methods yield the maximum precision (0.55 or higher) for “anger”; the minimum precision was observed for “happiness” and “disgust”.
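The k-nearest neighbours classification and the three metrics named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two-dimensional Gaussian points stand in for eGeMAPS feature vectors, the class centres and the value of k are arbitrary assumptions, and the helper names (`knn_predict`, `precision_recall`, `make_toy_data`) are hypothetical. Only the Python standard library is used.

```python
import math
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Label x by majority vote among its k nearest training points (Euclidean distance)."""
    nearest = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(y_true, y_pred):
    """Share of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def precision_recall(y_true, y_pred, label):
    """Per-class precision and recall, treating `label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def make_toy_data(n_per_class, seed):
    """Synthetic 2-D points per class; a placeholder for real eGeMAPS features."""
    rng = random.Random(seed)
    # Arbitrary class centres chosen for illustration only.
    centres = {"anger": (0.0, 0.0), "happiness": (3.0, 3.0), "disgust": (0.0, 4.0)}
    X, y = [], []
    for label, (cx, cy) in centres.items():
        for _ in range(n_per_class):
            X.append((rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)))
            y.append(label)
    return X, y


train_X, train_y = make_toy_data(40, seed=0)
test_X, test_y = make_toy_data(15, seed=1)
pred_y = [knn_predict(train_X, train_y, x, k=5) for x in test_X]

print("accuracy:", round(accuracy(test_y, pred_y), 3))
for emotion in ("anger", "happiness", "disgust"):
    p, r = precision_recall(test_y, pred_y, emotion)
    print(emotion, "precision:", round(p, 3), "recall:", round(r, 3))
```

The printed numbers describe the synthetic points only and say nothing about the datasets or results reported in the article. Reporting precision and recall per class, rather than accuracy alone, is what makes it possible to compare how well individual emotions are recognised.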

Author Biographies

  • Mikhail Yu. Uzdiaev, St. Petersburg Federal Research Center of the Russian Academy of Sciences, St. Petersburg Institute for Informatics and Automation of the Russian Academy of Sciences

    research assistant, Laboratory of Big Data and Socio-Cyberphysical Systems, St. Petersburg Federal Research Center of the Russian Academy of Sciences (SPC RAS), St. Petersburg Institute for Informatics and Automation of the Russian Academy of Sciences

  • Artem V. Ryabinov, St. Petersburg Federal Research Center of the Russian Academy of Sciences, St. Petersburg Institute for Informatics and Automation of the Russian Academy of Sciences

    software engineer, Laboratory of Autonomous Robotic Systems, St. Petersburg Federal Research Center of the Russian Academy of Sciences (SPC RAS), St. Petersburg Institute for Informatics and Automation of the Russian Academy of Sciences

Published

2021-02-02

Section

Intelligent Information Systems, Data Analysis and Machine Learning

How to Cite

Analysis of the approaches to classifying emotions in non-verbal communications based on machine learning. (2021). Proceedings of Voronezh State University. Series: Systems Analysis and Information Technologies, 4, 81-97. https://doi.org/10.17308/sait.2020.4/3206