Models and Methods of N-Shot Learning and Their Application to Semantic Image Segmentation: A Systematic Review, Part II, One-Shot and Few-Shot Learning

Rostislav Ruslanovich Otyrba; Alexander Anatolyevich Sirota

doi:10.17308/sait/1995-5499/2024/2/152-172

Authors

Rostislav R. Otyrba Voronezh State University https://orcid.org/0000-0002-0412-2465
Alexander A. Sirota Voronezh State University https://orcid.org/0000-0002-5785-8513

DOI:

https://doi.org/10.17308/sait/1995-5499/2024/2/152-172

Keywords:

N-Shot Learning, Zero-Shot Learning, One-Shot/Few-Shot Learning, semantic segmentation, deep neural networks

Abstract

The paper provides a systematic review of N-Shot Learning models and methods in the context of the semantic segmentation task using deep neural networks. N-Shot Learning encompasses a set of deep learning methods and algorithms, primarily applied in image processing tasks, that aim to enable a neural network to adapt quickly and efficiently to a new task in the absence of training examples (Zero-Shot Learning) or with very few examples (One-Shot/Few-Shot Learning). Domestic scientific publications lack a sufficiently comprehensive and systematic analysis of the results obtained in this direction. The first part of the review, previously published in this journal, focused exclusively on methods and algorithms of Zero-Shot Learning, i.e. learning in the absence of training examples. This paper is the second part of the review and is dedicated to One-Shot/Few-Shot Learning methods. On the one hand, it is closely related to the first part; on the other hand, it reveals the essence of a fundamentally different approach. Unlike Zero-Shot Learning, where the model has no training samples for new classes, this approach involves training on either one training sample per class (One-Shot Learning) or a small number of training examples (Few-Shot Learning). Currently, this direction is evolving even more actively than Zero-Shot Learning and demonstrates impressive results. The paper outlines the problem formulation of One-Shot and Few-Shot Learning and analyzes in detail the most well-known approaches and implementations, ranging from initial concepts to the latest research. The deep neural network models presented in the figures retain the most significant components, those that reflect how the proposed approach is implemented in each case. If an accurate reproduction of the architecture is required, the reader should refer to the original source. To better understand the advantages and disadvantages of the analyzed models, the test results reported by their authors on the common datasets PASCAL-5i and COCO-20i are compared. This comparative analysis identified the most promising and effective models, which can be recommended for practical semantic segmentation tasks with limited training examples.

Author Biographies

Rostislav R. Otyrba, Voronezh State University

PhD student, Department of Information Security and Processing Technologies, Faculty of Computer Sciences, Voronezh State University

Alexander A. Sirota, Voronezh State University

DSc in Technical Sciences, Professor, Head of the Department of Information Security and Processing Technologies, Faculty of Computer Sciences, Voronezh State University