Application of the algorithm for approximation of the graphic of energy shares for determining pauses in a speech signal
DOI:
https://doi.org/10.17308/sait.2021.3/3740
Keywords:
energy fractions, mixture of radial-basis functions, mixture of Gaussian functions, decision function
Abstract
This paper treats a speech signal as a sequence of fragments that contain speech components and fragments that contain noise, the latter corresponding to pauses between words. The task is to formulate a decision function that accepts or rejects the hypothesis that speech is absent in a given segment of the signal. Using the subband method, the distribution of the segment's energy over frequency is computed. This distribution is then approximated by a mixture of radial basis functions (Gaussian functions): a weighted sum of radial basis functions and a uniformly distributed component. A decision rule is constructed from the ratio of the maximum values of the mixture components. For the computational experiment, a "dead zone" nonlinearity is introduced; this choice is motivated by the characteristics of the electrical activity of the pathways and centers of the auditory system. The paper presents the results of applying the algorithm to detect pauses in a speech signal. The labeled speech fragments of the TIMIT Acoustic-Phonetic Continuous Speech Corpus of the U.S. Defense Advanced Research Projects Agency (DARPA) served as test material. In total, 100 recordings were processed, with an analysis segment of 9 milliseconds and a sampling rate of 16,000 Hz. To assess the algorithm, two kinds of error were evaluated: errors of the first kind ("missed target"), where the algorithm did not begin marking a pause that is present in the manual labeling, and errors of the second kind ("false alarm"), where the algorithm marked a pause erroneously. The results of the computational experiments indicate that the proposed approach is sufficiently effective for detecting pauses in a speech signal.
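To make the first processing step concrete, the following is a minimal Python sketch of how energy fractions for one analysis segment might be computed, together with a generic dead-zone nonlinearity. It is not the authors' implementation. The 9 ms segment and the 16,000 Hz sampling rate come from the abstract; everything else is an assumption made for illustration: a plain DFT power spectrum stands in for the subband method, and the names `energy_fractions`, `dead_zone`, `n_bands`, and `threshold` are hypothetical. The mixture fit and the decision rule are not shown, because the abstract does not specify them in enough detail.

```python
import cmath
import math

SAMPLE_RATE_HZ = 16000  # sampling rate stated in the abstract
SEGMENT_MS = 9          # analysis segment length stated in the abstract
SEGMENT_LEN = SAMPLE_RATE_HZ * SEGMENT_MS // 1000  # 144 samples per segment


def energy_fractions(segment, n_bands=8):
    """Return the share of the segment's energy in each of n_bands equal-width bands.

    Assumption: a DFT power spectrum replaces the paper's subband method,
    and n_bands=8 is an arbitrary illustrative choice.
    """
    n = len(segment)
    half = n // 2
    # Power at DFT bins 0 .. half-1 (frequencies from 0 up to just below Nyquist).
    power = []
    for k in range(half):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(segment)
        )
        power.append(abs(coeff) ** 2)
    total = sum(power)
    if total == 0.0:
        return [0.0] * n_bands  # silent segment: no energy to distribute
    fractions = [0.0] * n_bands
    for k, p in enumerate(power):
        band = min(k * n_bands // half, n_bands - 1)
        fractions[band] += p / total
    return fractions


def dead_zone(value, threshold):
    """Generic dead-zone nonlinearity: zero inside [-threshold, threshold],
    linear and shifted toward zero by threshold outside it.

    Assumption: where the paper applies this nonlinearity is not stated here.
    """
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


# Usage: a pure 1000 Hz tone over one 9 ms segment.
tone = [math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE_HZ) for t in range(SEGMENT_LEN)]
tone_fractions = energy_fractions(tone)
```

With eight bands, each band spans 1000 Hz, so the tone's energy concentrates in a single band, while a noise-like pause would be expected to spread its energy more evenly. That contrast is what a mixture with a uniform component is suited to capture.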













