R. Sudirman, S.-H. Salleh, and S. Salleh (Malaysia)
Linear Predictive Coding, neural network, convergence test, speech pattern, conjugate gradient method
This paper presents the use of coefficients derived from linear predictive coding (LPC) based on dynamic time warping (DTW). The derived coefficients, called DTW frame-fixing (DTW-FF) coefficients, are used as input to a back-propagation neural network (NN) for speech pattern recognition. The paper also studies pitch as an additional input feature alongside the DTW-FF coefficients. The results showed good performance, with improvement as high as 100% when pitch is used along with the DTW-FF feature. Back-propagation NNs are known to be capable of handling large learning problems and are promising because of their ability to learn from training data and classify it. The standard back-propagation method uses steepest gradient descent, which is prone to becoming trapped in poor local minima. In this study, the network is designed to process multiple samples/words in parallel, so it must compute a large number of connection weights and error updates at a time and therefore takes longer to converge to its global minimum. Since the conjugate gradient method has been shown to accelerate network convergence, it is applied in the back-propagation mechanism in place of the steepest gradient descent algorithm. The outcome showed that the convergence rate improved when the conjugate gradient method was used in the back-propagation algorithm.
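The difference in convergence behaviour between steepest descent and the conjugate gradient method can be illustrated on a small problem. The sketch below is not the network described in the abstract: it assumes a two-dimensional quadratic error surface f(w) = 0.5 wᵀAw − bᵀw with a hand-picked, ill-conditioned matrix A, and all function and variable names are hypothetical. It uses exact line searches and the Fletcher-Reeves update for the conjugate direction, which may differ from the variant used by the authors.

```python
# Minimal sketch (assumed toy problem, not the paper's network):
# minimise f(w) = 0.5 * w^T A w - b^T w for a symmetric positive definite A.

def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def steepest_descent(A, b, w, tol=1e-8, max_iter=10000):
    """Follow the negative gradient with an exact line search."""
    for k in range(max_iter):
        r = [bi - ai for bi, ai in zip(b, matvec(A, w))]  # r = -gradient
        rr = dot(r, r)
        if rr ** 0.5 < tol:
            return w, k
        alpha = rr / dot(r, matvec(A, r))  # exact step length on a quadratic
        w = [wi + alpha * ri for wi, ri in zip(w, r)]
    return w, max_iter

def conjugate_gradient(A, b, w, tol=1e-8, max_iter=10000):
    """Linear conjugate gradient with the Fletcher-Reeves beta."""
    r = [bi - ai for bi, ai in zip(b, matvec(A, w))]
    d = list(r)  # the first search direction is the negative gradient
    rr = dot(r, r)
    for k in range(max_iter):
        if rr ** 0.5 < tol:
            return w, k
        Ad = matvec(A, d)
        alpha = rr / dot(d, Ad)
        w = [wi + alpha * di for wi, di in zip(w, d)]
        r = [ri - alpha * adi for ri, adi in zip(r, Ad)]
        rr_new = dot(r, r)
        beta = rr_new / rr  # Fletcher-Reeves ratio
        d = [ri + beta * di for ri, di in zip(r, d)]
        rr = rr_new
    return w, max_iter

# Assumed ill-conditioned example (condition number 100); minimiser is [0.01, 1.0].
A = [[100.0, 0.0], [0.0, 1.0]]
b = [1.0, 1.0]
w_sd, it_sd = steepest_descent(A, b, [0.0, 0.0])
w_cg, it_cg = conjugate_gradient(A, b, [0.0, 0.0])
print("steepest descent iterations:", it_sd)
print("conjugate gradient iterations:", it_cg)
```

On a quadratic in n dimensions, conjugate gradient reaches the minimiser in at most n steps in exact arithmetic, whereas steepest descent zigzags across the narrow valley and needs many more iterations as the conditioning worsens. A neural network error surface is not quadratic, so this only indicates why a speed-up is plausible, not how large it would be in practice.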