Audio data compression based on psychoacoustic principles of human sound perception
DOI: https://doi.org/10.17308/sait/1995-5499/2024/3/127-137
Keywords: audio data, audio data compression, psychoacoustic model, spectrum, level quantization
Abstract
This article presents a new lossy audio compression method based on the psychoacoustic principles of human sound perception. Taking these principles into account yields a method suited to audio data of various kinds: musical compositions, speech signals, and sounds of other origins. Each kind has its own characteristics. Speech signals, for example, contain pauses and have a less varied frequency range than music, which has led to the development of compression methods specific to speech. The purpose of the presented theory of lossy audio data compression is to make the original and reconstructed signals equal in a perceptual sense. This approach significantly reduces the bit representation of the audio signal while leaving it aurally very close to the original. In developing the method, much attention was paid to level quantization: the spectral components of the signal are quantized using the theory of barely noticeable changes in sound. Taking this theory into account seems appropriate because it is significant in audio signal processing, yet it has not previously been used in the development of audio data compression methods. The level quantization procedure proposed in the article combines the advantages of adaptive and uniform quantization. The main advantage of adaptive quantization is that it needs significantly fewer quantization levels to reach a quantization noise level comparable to that of uniform quantization. The presented quantization method, although essentially non-uniform (adaptive), does not require transmitting the value of each quantization level (or the quantization step). In addition, the quantization error in the developed method does not exceed 1 dB, which is the threshold for barely audible changes in sound.
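The abstract does not spell out the quantization procedure, so the following is only a minimal sketch of one way the two stated properties could be obtained together: a reconstruction error of at most 1 dB, and levels that are non-uniform in amplitude but need not be transmitted. It assumes spectral magnitudes are rounded to a fixed grid on the decibel scale. The 2 dB step, the magnitude floor, and the function names are illustrative assumptions, not the article's method.

```python
import math

STEP_DB = 2.0         # assumed grid step: rounding to a 2 dB grid bounds the error by 1 dB
MIN_MAGNITUDE = 1e-6  # assumed floor that avoids log10(0); not taken from the article


def quantize_db(magnitude, step_db=STEP_DB):
    """Map a positive spectral magnitude to an integer index on a dB grid."""
    level_db = 20.0 * math.log10(max(magnitude, MIN_MAGNITUDE))
    return round(level_db / step_db)


def dequantize_db(index, step_db=STEP_DB):
    """Rebuild the magnitude from its index; only step_db must be known in advance."""
    return 10.0 ** (index * step_db / 20.0)


def error_db(original, restored):
    """Absolute level difference between two positive magnitudes, in dB."""
    return abs(20.0 * math.log10(restored / original))


if __name__ == "__main__":
    for m in (0.003, 0.2, 1.0, 45.0):
        restored = dequantize_db(quantize_db(m))
        print(m, restored, error_db(m, restored))
```

Because the grid is uniform in decibels, the levels are non-uniform in linear amplitude, yet the decoder can rebuild every level from the step alone. The sketch handles magnitudes only; the sign or phase of each spectral component would have to be coded separately.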













