A Study On Speech Recognition Technology And Speaker Identification Using Data Mining
Keywords:
MFCC, KNN, DCP, DWT, Cepstrum, Feature Extraction, LPC, Feature Matching, HMM, SVM
Abstract
The purpose of this paper is to develop a speaker identification system that recognizes speakers by the acoustic characteristics of their speech. The proposed system is text-independent, meaning the user is free to speak any word or sentence. Speaker recognition is the process of automatically recognizing who is speaking on the basis of individual information included in speech waves. This technique makes it possible to use a speaker's voice to verify their identity and to control access to services such as voice dialing, telephone banking, telephone shopping, database access services, information services, voice mail, forensic speaker recognition, security control for confidential information areas, and remote access to computers. The wavelet transform, in particular the Discrete Wavelet Transform (DWT), is used to extract the vocal characteristics of the speakers from the speech signal, whereas the K-Nearest Neighbor (KNN) algorithm is used for feature matching, which shows a considerable improvement in the identification rate. Feature extraction is performed by a six-level wavelet decomposition; the features are computed from the wavelet coefficients as the mean, the standard deviation, and the ratios between them.
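The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several details are assumptions because the abstract does not specify them: the Haar wavelet stands in for the unspecified wavelet family, the statistics are taken as the mean absolute value and standard deviation of each sub-band, the "ratios" are read as ratios of mean absolute values between adjacent sub-bands, and Euclidean distance with majority voting is used for KNN. The function names (`haar_dwt`, `wavelet_features`, `knn_identify`) are hypothetical.

```python
import math
from collections import Counter


def haar_dwt(signal):
    """One level of the Haar DWT (assumed wavelet): returns (approximation, detail)."""
    signal = list(signal)
    if len(signal) % 2:
        signal.append(signal[-1])  # pad odd-length input by repeating the last sample
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def mean_std(values):
    """Mean and population standard deviation of a list of numbers."""
    m = sum(values) / len(values)
    return m, math.sqrt(sum((v - m) ** 2 for v in values) / len(values))


def wavelet_features(signal, levels=6):
    """Statistics of the sub-bands of a `levels`-level decomposition.

    Assumed layout: [mean_abs, std] for each of the levels + 1 sub-bands,
    followed by the ratios of mean_abs between adjacent sub-bands.
    """
    bands, approx = [], list(signal)
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        bands.append(detail)
    bands.append(approx)  # final approximation band

    mean_abs = [mean_std([abs(c) for c in band])[0] for band in bands]
    stds = [mean_std(band)[1] for band in bands]
    features = []
    for m, sd in zip(mean_abs, stds):
        features.extend([m, sd])
    for i in range(len(bands) - 1):
        # Guard against division by zero for silent or constant bands.
        features.append(mean_abs[i] / mean_abs[i + 1] if mean_abs[i + 1] > 0 else 0.0)
    return features


def knn_identify(query, train_feats, train_labels, k=3):
    """Label of the majority among the k nearest training vectors (Euclidean)."""
    ranked = sorted(
        (math.dist(query, feats), label)
        for feats, label in zip(train_feats, train_labels)
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy usage with synthetic "speakers" (sinusoids at different pitches), not real speech.
def tone(freq, n=512, rate=8000):
    return [math.sin(2 * math.pi * freq * t / rate) for t in range(n)]


train = [wavelet_features(tone(f)) for f in (110, 120, 220, 230)]
labels = ["speaker_a", "speaker_a", "speaker_b", "speaker_b"]
predicted = knn_identify(wavelet_features(tone(115)), train, labels, k=3)
```

With six levels, each utterance yields 7 sub-bands and therefore 20 features under the assumed layout. In practice the features would likely need normalization before distance-based matching, since the sub-band statistics can differ widely in scale.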