H. Tolba, S.-A. Selouani, and D. O'Shaughnessy (Canada)
Speech recognition, distinctive cues, robustness, hidden Markov models, multi-stream paradigm.
In this paper, we investigate adding the time derivatives of auditory-based acoustic distinctive features and spectral cues to their static values for automatic speech recognition (ASR). Comparative experiments indicate that combining the classical Mel-frequency cepstral coefficients (MFCCs) with the instantaneous and dynamic features of the speech signal, represented by both the auditory-based acoustic distinctive cues and the spectral cues, within a multi-stream paradigm improves recognition performance. To test the new multi-stream feature vector, a series of speaker-independent continuous speech recognition experiments was carried out on the TIMIT database. Under this paradigm, the proposed static and dynamic features combined with MFCCs outperform the conventional recognition process that uses only the MFCCs and their first and second derivatives.
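The abstract does not state how the time derivatives are computed. As an illustration only, the sketch below assumes the regression formula commonly used for delta coefficients in ASR front ends, with edge frames replicated at the utterance boundaries. The function names `delta` and `append_dynamics`, the window size of 2, and the toy input are hypothetical and are not taken from the paper.

```python
def delta(frames, window=2):
    """Regression-based time derivative of a feature sequence.

    frames: list of frames, each a list of floats (one value per coefficient).
    window: number of frames on each side used in the regression (assumed value).
    """
    num_frames = len(frames)
    if num_frames == 0:
        return []
    dim = len(frames[0])
    # Normalisation term of the regression formula: 2 * sum(n^2).
    denom = 2.0 * sum(n * n for n in range(1, window + 1))

    def frame_at(t):
        # Replicate the first/last frame beyond the utterance boundaries.
        return frames[min(max(t, 0), num_frames - 1)]

    out = []
    for t in range(num_frames):
        row = []
        for k in range(dim):
            num = sum(
                n * (frame_at(t + n)[k] - frame_at(t - n)[k])
                for n in range(1, window + 1)
            )
            row.append(num / denom)
        out.append(row)
    return out


def append_dynamics(static):
    """Concatenate static features with their first and second derivatives."""
    d1 = delta(static)
    d2 = delta(d1)
    return [s + a + b for s, a, b in zip(static, d1, d2)]


# Toy input: one coefficient rising linearly, one constant.
ramp = [[float(t), 2.0] for t in range(10)]
full = append_dynamics(ramp)
```

In a multi-stream setup such as the one described, a vector of this kind would be built separately for each stream (MFCCs, distinctive cues, spectral cues); how the streams are weighted and combined in the recognizer is not specified in the abstract and is not shown here.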