Semantic analysis and synthesis of text data
DOI:
https://doi.org/10.17308/sait/1995-5499/2023/4/182-208
Keywords:
semantic analysis, data synthesis, automatic text processing, analysis of heterogeneous data
Abstract
This article is a review. Studying the ideas of domestic and foreign researchers is important given modern requirements for research on data processing systems. The goal is to try to determine what machine understanding of text or speech might be. The emergence of LLMs such as ChatGPT underscores the importance and timeliness of such a review. At the same time, despite the daily growth in the global volume of data, the data usually cannot be used in raw form; solving a number of applied problems requires processing them to some extent. Applied natural language processing problems cannot be solved without methods of semantic analysis and data synthesis. The growing volume of user-generated information and the digitalization of society require these methods to be improved, which makes a review of this topic relevant. The aim of the work is to consider the main trends in natural language processing and in the use of semantic analysis, ontologies, and data synthesis. The essence of semantic analysis, its applications, and existing approaches to its implementation, both traditional and based on artificial intelligence methods, are described. The main advantages of using semantic analysis when working with data are identified. The work is based on the method of data analysis and processing; accordingly, approaches to text classification in information systems are reviewed. The issues of providing access to generalized information from various databases by means of a semantic approach and data ontologies are considered. Variants of data synthesis, both from structured data sets and using metadata, are described. The study identified the main problems in natural language processing, such as access to data, the openness of research data, and the detection of sentiment, irony, and sarcasm.
The information presented can be used in planning solutions to natural language processing problems, in developing software products that automate this process, and in developing relational databases, decision support systems, and information-analytical systems.