Accomplishments

Speaker Recognition


  • Details
Category
Conference
Authors
Conference Name
International Conference on Global Technological Initiatives (ICGTI)
Conference From
21-Mar-2012
Conference To
22-Mar-2012
Conference Venue
Rizvi College of Engineering
  • Abstract

Digital processing of speech signals and voice recognition algorithms are essential for fast and accurate automatic voice recognition. The voice signal carries a very large amount of information, which makes direct analysis and synthesis of the complex signal difficult. Digital signal processing steps such as Feature Extraction and Feature Matching are therefore introduced to represent the voice signal compactly. Extraction and matching are carried out after the signal has been pre-processed. In this paper, speaker recognition is based on speech samples uttered by 4 different speakers, each of whom was asked to utter different digits. Mel Frequency Cepstral Coefficients (MFCCs), a non-parametric method for modeling the human auditory perception system, are used as the feature extraction technique. Since MFCC imitates the human hearing system, it provides better recognition rates than Linear Predictive Coefficients (LPC). For feature matching, the non-linear sequence alignment technique known as Dynamic Time Warping (DTW) is used. Since the same utterance tends to be spoken at different temporal rates, this alignment is important for correct recognition of the speaker. This paper presents the viability of MFCC for extracting features and of DTW for comparing test patterns. It is found that the distance between the feature vectors of two different speakers uttering a word is significantly higher than the distance between the feature vectors of the same speaker.

Keywords: Mel filter banks, Hamming Window, Dynamic Time Warping (DTW).
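
The matching step described in the abstract can be illustrated with a minimal DTW sketch. This is not the authors' implementation: the function names `euclidean` and `dtw_distance` are hypothetical, the Euclidean local distance and the three-way step pattern are assumptions, and the short one-dimensional sequences below merely stand in for per-frame MFCC vectors.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(seq_a, seq_b):
    """Accumulated DTW cost between two sequences of feature vectors.

    Assumption: each step may advance in seq_a, in seq_b, or in both,
    and the local cost of pairing two frames is their Euclidean distance.
    """
    n, m = len(seq_a), len(seq_b)
    if n == 0 or m == 0:
        raise ValueError("both sequences must contain at least one frame")

    inf = float("inf")
    # cost[i][j] = best accumulated cost of aligning the first i frames
    # of seq_a with the first j frames of seq_b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = euclidean(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # seq_b frame j is reused (seq_a advances)
                cost[i][j - 1],      # seq_a frame i is reused (seq_b advances)
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


# Toy data (made up for illustration, not taken from the paper).
template = [[0.0], [1.0], [2.0], [1.0]]
slower_copy = [[0.0], [0.0], [1.0], [1.0], [2.0], [2.0], [1.0]]
other_pattern = [[3.0], [4.0], [5.0], [4.0]]

same = dtw_distance(template, slower_copy)
different = dtw_distance(template, other_pattern)
print(same, different)
```

In this toy case the slower copy of the template aligns at zero cost even though it has more frames, while the unrelated pattern accumulates a larger cost. That is the property the abstract relies on: time warping absorbs differences in speaking rate, so the remaining distance reflects differences between the feature vectors themselves.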
