N-Shot Learning Models and Methods and Their Application to Semantic Image Segmentation: A Systematic Review, Part I: Zero-Shot Learning

Rostislav Ruslanovich Otyrba; Alexander Anatolyevich Sirota

doi:10.17308/sait/1995-5499/2024/1/149-165

Authors

Rostislav R. Otyrba Voronezh State University https://orcid.org/0000-0002-0412-2465
Alexander A. Sirota Voronezh State University https://orcid.org/0000-0002-5785-8513

DOI:

https://doi.org/10.17308/sait/1995-5499/2024/1/149-165

Keywords:

N-Shot Learning, Zero-Shot Learning, One-Shot/Few-Shot Learning, semantic segmentation, deep neural networks

Abstract

The paper provides a systematic review of N-Shot Learning models and methods for semantic segmentation with deep neural networks. N-Shot Learning encompasses a set of deep learning methods and algorithms, applied primarily in image processing tasks, that enable a neural network to adapt quickly and efficiently to a new task when no training examples are available (Zero-Shot Learning) or when only very few are available (One-Shot/Few-Shot Learning). Domestic scientific publications lack a sufficiently comprehensive and systematic analysis of the results obtained in this area. This first part of the review is devoted to Zero-Shot Learning, the branch of the N-Shot methodology that segments objects of new classes based solely on the target image and its textual description. The paper outlines the problem formulation of Zero-Shot Learning and analyzes in detail the best-known approaches and implementations, from the initial concepts to the most recent research. The deep neural network models shown in the figures retain only the most significant components, those that reflect the principles behind each proposed approach; readers who need an exact reproduction of an architecture should refer to the original source. To clarify the advantages and disadvantages of the analyzed models, the test results reported by their authors on the common datasets Pascal VOC 2012 and COCO-Stuff are compared. This comparative analysis identifies the most promising and effective models, which can be recommended for practical semantic segmentation applications. The second part of the review will cover One-Shot and Few-Shot Learning methods for semantic segmentation, focusing on methods that can segment images containing new object classes from only a few training examples.

Author Biographies

Rostislav R. Otyrba, Voronezh State University

PhD student, Department of Information Security and Processing Technologies, Faculty of Computer Sciences, Voronezh State University

Alexander A. Sirota, Voronezh State University

DSc in Technical Sciences, Head of the Department of Information Security and Processing Technologies, Faculty of Computer Sciences, Voronezh State University