Identification of metaphors with the help of machine learning

DOI:

https://doi.org/10.17308/lic/1680-5755/2022/4/128-143

Keywords:

machine learning, Text Mining, Natural Language Processing, metaphor identification, cryptotype analysis, CNN, supervised learning

Abstract

The article examines the feasibility of building a classifier for automatic metaphor identification using machine learning. The model was trained on a representative dataset of 389,857 examples that we annotated manually. We describe a series of experiments, the difficulties encountered, and the ways we addressed them. Three machine learning methods were used: naive Bayes, logistic regression, and artificial neural networks. The following preprocessing parameters were varied: stop words, lemmatization, stemming, and the number of N-grams; for the neural networks we additionally tuned the number of epochs, the batch size, the number of examples used for training and validation, etc. The best results (Accuracy = 0.88, F1-score = 0.87) were achieved with a convolutional neural network with the following parameters: epochs = 10, layers = 6 (including 2 dropout layers), batch_size = 500, training on 70 % of the data, validation on 30 % of the data, vectorization = 2 and 3 characters, activation functions = relu and sigmoid, optimizer = Adamax, loss_func = binary_crossentropy. As a result, we developed tools that automate the classification of corpus examples of metaphorical compatibility. In the future these tools should help intensify and popularize research in this area by reducing the labor and time researchers spend on processing and classifying corpus query results.
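To make some of the abstract's terms concrete, the following is a minimal illustrative sketch in plain Python. It is not the authors' code. It assumes that "2 and 3 characters" refers to overlapping character sequences of length 2 and 3, that the 70/30 split is a simple partition of the example list, and that the reported metrics are the standard binary accuracy and F1-score with label 1 meaning "metaphor". All function names are hypothetical.

```python
from collections import Counter


def char_ngrams(text, sizes=(2, 3)):
    """Count overlapping character n-grams of the given sizes (assumed: 2 and 3)."""
    counts = Counter()
    for n in sizes:
        for i in range(len(text) - n + 1):
            counts[text[i:i + n]] += 1
    return counts


def split_70_30(examples):
    """Partition a list into 70 % training and 30 % validation examples.

    No shuffling is done here; a real experiment would normally shuffle first.
    """
    cut = len(examples) * 7 // 10
    return examples[:cut], examples[cut:]


def accuracy_and_f1(y_true, y_pred):
    """Return (accuracy, F1-score) for binary labels, where 1 = metaphor."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    accuracy = correct / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return accuracy, f1
```

In such a setup, the n-gram counts would be turned into fixed-length vectors and fed to the classifier, and the two metrics would be computed on the 30 % validation portion.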

Author Biography

  • O. V. Donina, Voronezh State University

    Candidate of Philology, Associate Professor of the Theoretical and Applied Linguistics Department

Published

2023-01-14

Section

Computational Linguistics

How to Cite

Identification of metaphors with the help of machine learning. (2023). Proceedings of Voronezh State University. Series: Linguistics and Intercultural Communication, 4, 128-143. https://doi.org/10.17308/lic/1680-5755/2022/4/128-143
