Audio data compression based on psychoacoustic principles of human sound perception

Ilya Igorevich Chizhov; Tatiana Nikolaevna Balabanova

doi:10.17308/sait/1995-5499/2024/3/127-137

Authors

Ilya I. Chizhov, Sound Line LLC, https://orcid.org/0009-0009-1127-8618
Tatiana N. Balabanova, Belgorod State Research University, https://orcid.org/0000-0003-3547-3433

Keywords:

audio data, audio data compression, psychoacoustic model, spectrum, level quantization

Abstract

This article presents a new lossy audio compression method based on the psychoacoustic principles of human sound perception. Taking these principles into account yields a method that can compress audio data of various kinds: musical compositions, speech signals, and sounds of other origins. Each kind has its own characteristics. Speech signals, for example, contain pauses and occupy a narrower frequency range than music, which has motivated compression methods specific to speech. The goal of the presented theory of lossy audio data compression is to make the original and reconstructed signals equal in a perceptual sense. This approach makes it possible to significantly reduce the number of bits used to represent the audio signal while keeping it aurally very close to the original. In developing the method, particular attention was paid to level quantization: the spectral components of the signal are quantized using the theory of barely noticeable changes in sound (just-noticeable differences). This theory is significant in audio signal processing, but it has not yet been used in the development of audio data compression methods. The level quantization procedure proposed in the article combines the advantages of adaptive and uniform quantization. The main advantage of adaptive quantization is that it needs significantly fewer quantization levels to reach a quantization noise level comparable to that of uniform quantization. The presented quantization method, although essentially non-uniform (adaptive), does not require transmitting the value of each quantization level (or the quantization step). In addition, the quantization error in the developed method does not exceed 1 dB, which is the threshold for barely audible changes in sound.
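
The abstract does not give the quantization procedure itself, so the following is only an illustrative sketch of how the two properties it states could hold at once. It assumes, as a stand-in technique, rounding the level of a spectral magnitude to a fixed 2 dB grid in the logarithmic domain. Such a quantizer is non-uniform in linear amplitude, needs no per-level side information because the grid is fixed, and keeps the error within 1 dB. The names `quantize_db`, `dequantize_db`, `error_db`, `STEP_DB` and `FLOOR` are hypothetical and do not come from the article.

```python
import math

STEP_DB = 2.0   # assumed grid step: rounding to a 2 dB grid bounds the error at 1 dB
FLOOR = 1e-12   # assumed floor that keeps log10 defined for zero magnitudes


def quantize_db(magnitude, step_db=STEP_DB):
    """Map a spectral magnitude to an integer index on a fixed dB grid."""
    level_db = 20.0 * math.log10(max(magnitude, FLOOR))
    return round(level_db / step_db)


def dequantize_db(index, step_db=STEP_DB):
    """Reconstruct a magnitude from its grid index; no level table is needed."""
    return 10.0 ** (index * step_db / 20.0)


def error_db(original, reconstructed):
    """Absolute level difference between two magnitudes, in dB."""
    return abs(20.0 * math.log10(reconstructed / original))


if __name__ == "__main__":
    for m in (0.001, 0.5, 1.0, 3.7, 1234.5):
        restored = dequantize_db(quantize_db(m))
        print(f"{m:>10} -> {restored:.6g} (error {error_db(m, restored):.3f} dB)")
```

Only the integer indices would need to be stored or transmitted in this sketch, since the decoder can rebuild every level from the shared step. Whether the article's method uses a fixed logarithmic grid of this kind is not stated in the abstract.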

Author Biographies

Ilya I. Chizhov, Sound Line LLC

PhD in Technical Sciences, Technical Advisor, Sound Line LLC

Tatiana N. Balabanova, Belgorod State Research University

PhD in Technical Sciences, Associate Professor, Department of Information and Telecommunication Systems and Technologies, Belgorod State Research University