MATLAB Code For Speech Recognition

This MATLAB code computes Mel-frequency cepstral coefficients (MFCCs), a common front-end feature for speech recognition, from an input audio signal. It takes an audio signal, a sampling rate, and an optional frame rate. It pre-emphasizes the signal and then, for each frame, applies a Hamming window, takes the Fourier transform, combines the magnitudes with a mel-scale filter bank, takes the log, and applies a discrete cosine transform to reduce dimensionality. The resulting MFCC features can be used for speech recognition. The function also optionally returns several intermediate results for analysis. The code was originally written in 1993 and has been modified over time.
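As a rough illustration of the mel-scale filter bank step, the sketch below reproduces only the band-edge computation, using the constants that appear in the listing (13 linearly spaced filters followed by 27 logarithmically spaced ones). The 16 kHz sampling rate is the listing's default; the `nyquist` variable and the printed summary are additions for illustration and are not part of the original function.

```matlab
% Sketch: band edges of the 40-filter bank, constants taken from the listing.
lowestFrequency = 133.3333;     % Hz, lower edge of the first filter
linearFilters   = 13;           % filters with linear spacing
linearSpacing   = 66.66666666;  % Hz between linear band edges
logFilters      = 27;           % filters with logarithmic spacing
logSpacing      = 1.0711703;    % ratio between successive log band edges
totalFilters    = linearFilters + logFilters;

% totalFilters+2 edge frequencies: each filter uses three consecutive ones.
freqs = lowestFrequency + (0:linearFilters-1)*linearSpacing;
freqs(linearFilters+1:totalFilters+2) = ...
    freqs(linearFilters) * logSpacing.^(1:logFilters+2);

lower  = freqs(1:totalFilters);
center = freqs(2:totalFilters+1);
upper  = freqs(3:totalFilters+2);

% Assumption for illustration: the default 16 kHz sampling rate.
nyquist = 16000/2;
fprintf('%d filters, centers %.1f to %.1f Hz, top edge %.1f Hz (Nyquist %.0f Hz)\n', ...
        totalFilters, center(1), center(end), upper(end), nyquist);
```

With these constants the edge frequencies increase monotonically and the top edge stays below the Nyquist frequency at 16 kHz. At noticeably lower sampling rates the upper filters would extend past Nyquist, so the constants would likely need adjusting.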

Uploaded by

Ravi Teja

12/10/2015

MATLAB code for speech recognition

Re: MATLAB code for speech recognition

% mfcc - Mel frequency cepstrum coefficient analysis.
%   [ceps,freqresp,fb,fbrecon,freqrecon] = ...
%           mfcc(input, samplingRate, [frameRate])
% Find the cepstral coefficients (ceps) corresponding to the
% input.  Four other quantities are optionally returned that
% represent:
%   the detailed fft magnitude (freqresp) used in MFCC calculation,
%   the mel-scale filter bank output (fb)
%   the filter bank output by inverting the cepstrals with a cosine
%       transform (fbrecon),
%   the smooth frequency response by interpolating the fb reconstruction
%       (freqrecon)
%  -- Malcolm Slaney, August 1993
% Modified a bit to make testing an algorithm easier... 4/15/94
% Fixed Cosine Transform (indices of cos() were swapped) - 5/26/95
% Added optional frameRate argument - 6/8/95
% Added proper filterbank reconstruction using inverse DCT - 10/27/95
% Added filterbank inversion to reconstruct spectrum - 11/1/95

% (c) 1998 Interval Research Corporation

function [ceps,freqresp,fb,fbrecon,freqrecon] = ...
        mfcc(input, samplingRate, frameRate)
global mfccDCTMatrix mfccFilterWeights

% Work on a row vector.
[r c] = size(input);
if (r > c)
    input=input';
end

% Filter bank parameters
lowestFrequency = 133.3333;
linearFilters = 13;
linearSpacing = 66.66666666;
logFilters = 27;
logSpacing = 1.0711703;
fftSize = 512;
cepstralCoefficients = 13;
windowSize = 400;
windowSize = 256;       % Standard says 400, but 256 makes more sense
                        % Really should be a function of the sample
                        % rate (and the lowestFrequency) and the
                        % frame rate.
if (nargin < 2) samplingRate = 16000; end;
if (nargin < 3) frameRate = 100; end;

% Keep this around for later....
totalFilters = linearFilters + logFilters;

% Now figure the band edges.  Interesting frequencies are spaced
% by linearSpacing for a while, then go logarithmic.  First figure
% all the interesting frequencies.  Lower, center, and upper band
% edges are all consecutive interesting frequencies.

freqs = lowestFrequency + (0:linearFilters-1)*linearSpacing;
freqs(linearFilters+1:totalFilters+2) = ...
        freqs(linearFilters) * logSpacing.^(1:logFilters+2);

lower = freqs(1:totalFilters);
center = freqs(2:totalFilters+1);
upper = freqs(3:totalFilters+2);

% We now want to combine FFT bins so that each filter has unit
% weight, assuming a triangular weighting function.  First figure
% out the height of the triangle, then we can figure out each
% frequency's contribution.
mfccFilterWeights = zeros(totalFilters,fftSize);
triangleHeight = 2./(upper-lower);
fftFreqs = (0:fftSize-1)/fftSize*samplingRate;

for chan=1:totalFilters
    mfccFilterWeights(chan,:) = ...
        (fftFreqs > lower(chan) & fftFreqs <= center(chan)).* ...
        triangleHeight(chan).*(fftFreqs-lower(chan))/(center(chan)-lower(chan)) + ...
        (fftFreqs > center(chan) & fftFreqs < upper(chan)).* ...
        triangleHeight(chan).*(upper(chan)-fftFreqs)/(upper(chan)-center(chan));
end
%semilogx(fftFreqs,mfccFilterWeights')
%axis([lower(1) upper(totalFilters) 0 max(max(mfccFilterWeights))])
hamWindow = 0.54 - 0.46*cos(2*pi*(0:windowSize-1)/windowSize);

if 0                    % Window it like ComplexSpectrum
    windowStep = samplingRate/frameRate;
    a = .54;
    b = -.46;
    wr = sqrt(windowStep/windowSize);
    phi = pi/windowSize;
    hamWindow = 2*wr/sqrt(4*a*a+2*b*b)* ...
        (a + b*cos(2*pi*(0:windowSize-1)/windowSize + phi));
end

% Figure out Discrete Cosine Transform.  We want a matrix
% dct(i,j) which is cepstralCoefficients x totalFilters in size.
% The i,j component is given by
%       cos( i * (j+0.5)/totalFilters pi )
% where we have assumed that i and j start at 0.
mfccDCTMatrix = 1/sqrt(totalFilters/2)*cos((0:(cepstralCoefficients-1))' * ...
                    (2*(0:(totalFilters-1))+1) * pi/2/totalFilters);
mfccDCTMatrix(1,:) = mfccDCTMatrix(1,:) * sqrt(2)/2;
%imagesc(mfccDCTMatrix);
% Filter the input with the preemphasis filter. Also figure how
% many columns of data we will end up with.
if 1
    preEmphasized = filter([1 -.97], 1, input);
else
    preEmphasized = input;
end
windowStep = samplingRate/frameRate;
cols = fix((length(input)-windowSize)/windowStep);
% Allocate all the space we need for the output arrays.
ceps = zeros(cepstralCoefficients, cols);
if (nargout > 1) freqresp = zeros(fftSize/2, cols); end;
if (nargout > 2) fb = zeros(totalFilters, cols); end;
% Invert the filter bank center frequencies. For each FFT bin
% we want to know the exact position in the filter bank to find
% the original frequency response. The next block of code finds the
% integer and fractional sampling positions.
if (nargout > 4)
    fr = (0:(fftSize/2-1))'/(fftSize/2)*samplingRate/2;
    j = 1;
    for i=1:(fftSize/2)
        if fr(i) > center(j+1)
            j = j + 1;
        end
        if j > totalFilters-1
            j = totalFilters-1;
        end
        fr(i) = min(totalFilters-.0001, ...
                    max(1,j + (fr(i)-center(j))/(center(j+1)-center(j))));
    end
    fri = fix(fr);
    frac = fr - fri;

    freqrecon = zeros(fftSize/2, cols);
end
% Ok, now let's do the processing. For each chunk of data:
% * Window the data with a hamming window,
% * Shift it into FFT order,
% * Find the magnitude of the fft,
% * Convert the fft data into filter bank outputs,
% * Find the log base 10,
% * Find the cosine transform to reduce dimensionality.
for start=0:cols-1
    first = floor(start*windowStep) + 1;
    last = first + windowSize-1;
    fftData = zeros(1,fftSize);
    fftData(1:windowSize) = preEmphasized(first:last).*hamWindow;
    fftMag = abs(fft(fftData));
    earMag = log10(mfccFilterWeights * fftMag');

    ceps(:,start+1) = mfccDCTMatrix * earMag;
    if (nargout > 1) freqresp(:,start+1) = fftMag(1:fftSize/2)'; end;
    if (nargout > 2) fb(:,start+1) = earMag; end
    if (nargout > 3)
        fbrecon(:,start+1) = ...
            mfccDCTMatrix(1:cepstralCoefficients,:)' * ...
            ceps(:,start+1);
    end
    if (nargout > 4)
        f10 = 10.^fbrecon(:,start+1);
        freqrecon(:,start+1) = samplingRate/fftSize * ...
            (f10(fri).*(1-frac) + f10(fri+1).*frac);
    end
end
% OK, just to check things, let's also reconstruct the original FB
% output. We do this by multiplying the cepstral data by the transpose
% of the original DCT matrix. This all works because we were careful to
% scale the DCT matrix so it was orthonormal.
if 1 && (nargout > 3)
    fbrecon = mfccDCTMatrix(1:cepstralCoefficients,:)' * ceps;
    % imagesc(mt(:,1:cepstralCoefficients)*mfccDCTMatrix );
end;
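
The closing comment in the listing relies on the DCT matrix having orthonormal rows. A minimal sketch of that property follows, rebuilding the same 13 x 40 matrix outside the function. The names `D`, `P`, `rowErr`, `x`, and `recon` are illustrative and do not appear in the original code, and the ramp test vector is an arbitrary choice.

```matlab
% Sketch: check that the truncated DCT matrix from the listing has
% orthonormal rows, so its transpose acts as a (lossy) inverse.
totalFilters = 40;
cepstralCoefficients = 13;

% Same construction as mfccDCTMatrix in the listing.
D = 1/sqrt(totalFilters/2) * cos((0:(cepstralCoefficients-1))' * ...
        (2*(0:(totalFilters-1))+1) * pi/2/totalFilters);
D(1,:) = D(1,:) * sqrt(2)/2;

% Rows are orthonormal: D*D' should be the 13 x 13 identity.
rowErr = max(max(abs(D*D' - eye(cepstralCoefficients))));

% D'*D is a projection onto the first 13 cosine basis vectors, not the
% identity, so a general 40-point vector is only approximated.
P = D' * D;
x = (1:totalFilters)';          % arbitrary test vector (a ramp)
recon = P * x;

fprintf('max |D*D'' - I| = %g, reconstruction error = %g\n', ...
        rowErr, norm(recon - x));
```

Because only 13 of the 40 basis vectors are kept, `fbrecon` is best read as a smoothed version of `fb` rather than an exact copy.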
