
CALL FOR YOUR SYMPHONY

PROJECT REPORT

Submitted for the course: Signal Analysis And Processing (ECE1018)

By
YOGESH KAUSHIK 16BIS0149
ANSHUL RANJAN MODI 16BIS0141
SINGAM MEGHANA 16BIS0139

Slot: A1
Name of faculty: Dr. S. KALAIVANI
Dr. CHRISTOPHER CLEMENT J

SCHOOL OF ELECTRONICS ENGINEERING

November, 2017

CERTIFICATE

This is to certify that the project work entitled "Call For Your Symphony" that is being
submitted by Yogesh Kaushik, Anshul Ranjan Modi, and Singam Meghana for Signal Analysis
And Processing (ECE1018) is a record of bonafide work done under my supervision. The
contents of this project work, in full or in parts, have neither been taken from any other source
nor been submitted for any other CAL course.

Place : Vellore

Date : 03/11/2017

Signature of Students:

YOGESH KAUSHIK

ANSHUL RANJAN MODI

SINGAM MEGHANA

Signature of Faculty:

Dr. S. KALAIVANI

Dr. CHRISTOPHER CLEMENT J

ACKNOWLEDGEMENTS

The members of the group would like to acknowledge all those who have helped with the
completion of this project.

First of all, we would like to express our deepest gratitude to Dr. Kalaivani S and
Dr. Christopher Clement J, our teachers, for their valuable advice, continual support, suggestions, and
patience during our study. We would also like to thank our Dean, Dr. Elizabeth Rufus, for
giving us an opportunity to carry out our studies at the University.

Our special thanks are extended to the lab assistants who helped us. Finally, our special thanks
are also due to all our friends for their academic and moral support, and furthermore for their
helpful assistance during the data collection throughout the study.

Signature:

Yogesh Kaushik

16BIS0149

Anshul Ranjan Modi

16BIS0141

Singam Meghana

16BIS0139

ABSTRACT

Technology serves mankind with the ripened fruits of its advancement; the objective of our
paper, likewise, is to aid mankind by easing the initiation of a task, using advancement in
technology as the main weapon to reduce the complexity of the tasks at hand. In this paper we
have developed an algorithm for playing a desired track by just calling out its name.
Cross-correlation* has aided us in making this algorithm; all of our work is done using
MATLAB. Basic transducers** are used to give physical form to our work.

* Measure of the similarity of two signals

** Microphone & speakers

1. Introduction:
The algorithm we created can be implemented to provide a smart environment that eases the
workload of the user. On the frontend, the user calls out the name of a song, and the tool we
used processes the user's request and plays the desired track. The backend, by contrast, is
much more complex.

We first generate the songs line by line using the basic musical notes, which are stored
in a library. Their names are then stored as the predefined voice models, which will later be
used during speech correlation. The main idea behind this project is speech recognition: the
test voice input should match one of the predefined voice models, which results in the
generation of the desired output. [6]

Fig (1) depicts the overview of the algorithm used (basically the interaction between the client
and the interface), which is implemented in MATLAB.

Fig (1): overview of algorithm

2. Methodology:
A: Generation of songs

Every musical note has a particular frequency. By using this frequency we can generate the
musical notes. Sinusoidal functions are incorporated in our algorithm to generate these basic
notes.

F(x) = sin(2πfx),  0 ≤ x ≤ range   (1)

{F(x) is the note; f is the particular frequency; range is the interval over which the
function is defined}

Then the song is generated line by line as is done while playing a piano, picking up right notes in
a particular order to generate the desired melody.

The generated tracks are stored under a library which can later be accessed during the speech
correlation. The above stated method is used for the generation of the desired number of songs.
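As an illustration, this note-and-concatenate scheme can be sketched in NumPy (a Python analogue of the MATLAB code given later in this report; variable names are illustrative):

```python
import numpy as np

FS = 8000  # samples per second (the 0.000125 s step used in our MATLAB code)

def note(freq_hz, duration_s=0.5):
    """One musical note: a sinusoid of the given frequency."""
    n = int(round(duration_s * FS)) + 1   # MATLAB's 0:0.000125:0.5 is inclusive
    t = np.arange(n) / FS
    return np.sin(2 * np.pi * freq_hz * t)

# A few of the note frequencies (in Hz) used in our implementation
a = note(440.0)    # A4
e = note(659.26)   # E5
f = note(739.99)   # F#5

# A song line is simply the chosen notes concatenated in order
line1 = np.concatenate([a, a, e, e, f, f, e, e])
```

Concatenating such lines in the right order yields the full track, exactly as a pianist strings notes into a melody.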

For storing the tracks in an accessible format we use the wavwrite* and audiowrite** MATLAB
functions. The link between the stored tracks and the input test voice sample is the set of
predefined voice models; to record these we use the audiorecorder*** function of MATLAB,
which gathers voice input from the user by means of the transducer (microphone). [7]

* Writes data to 8-, 16-, 24-, and 32-bit .wav files

** Writes a matrix of audio data

*** Records audio from an input device, such as a microphone connected to your system
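The role that wavwrite/audiowrite play here (storing a generated track in an accessible format) can be sketched with Python's standard wave module; the file name and scaling are illustrative:

```python
import wave
import numpy as np

FS = 8192  # playback sample rate used in our code

def save_track(path, samples):
    """Write a float array in [-1, 1] to a 16-bit mono .wav file."""
    pcm = (np.clip(samples, -1.0, 1.0) * 32767).astype(np.int16)
    with wave.open(path, "wb") as w:
        w.setnchannels(1)   # mono
        w.setsampwidth(2)   # 16-bit samples
        w.setframerate(FS)
        w.writeframes(pcm.tobytes())

tone = np.sin(2 * np.pi * 440 * np.arange(FS) / FS)  # one second of A4
save_track("one.wav", tone)
```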

B: Speech recognition using Correlation:

In the field of signals, the measure of the resemblance of two functions as a function of the
displacement of one relative to the other is called cross-correlation. This process has many
applications in fields such as neurophysiology, averaging, and pattern recognition.

The general mathematical expression of cross-correlation is given in (2):

(x1 ⋆ x2)(τ) = ∫ x1(t) x2(t + τ) dt   (2)

{x1 denotes the first function; x2 denotes the second function; t is time; τ is the displacement (lag)}
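In discrete time the integral becomes a sum, which NumPy's correlate computes directly (a stand-in for MATLAB's xcorr). A toy example with a short reference signal and a delayed copy of it:

```python
import numpy as np

# A short reference signal and a copy of it delayed by 3 samples
ref = np.array([0., 1., 2., 1., 0.])
delayed = np.concatenate([np.zeros(3), ref])

# Full cross-correlation: one value per possible displacement tau
xc = np.correlate(delayed, ref, mode="full")

# The peak of the cross-correlation sits at the displacement where the
# two signals line up best -- here, a lag of 3 samples
lag = int(np.argmax(xc)) - (len(ref) - 1)
```

The height of that peak is what we later use as a similarity score: the closer the two signals resemble each other, the larger the maximum of their cross-correlation.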

Implementation:

We use the cross-correlation method to determine the result of the project. One by one, we
load the predefined voice models stored in the library and correlate each with the input. The
MATLAB function that serves this need is wavread* (audioread in newer releases); we use it to
read the predefined voice models from the source library, for example:
{y1=wavread('one.wav');
one.wav is the name of a track in the library}

A correlation score is generated for each voice model by cross-correlating it with the test
voice input and taking the peak value with the max** function. The model with the largest
score identifies the desired output; the corresponding song is then read from the source
library with wavread and played, giving the user the desired output. [3][4][5]

The comparison is done using simple conditional statements. If the desired match is found, the
algorithm calls the desired track from the source library and plays it; otherwise, an error
sound is played. [1][2]
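This selection step can be sketched as follows (a NumPy analogue of the conditional chain in our MATLAB code; the tiny stand-in models and the threshold value are illustrative):

```python
import numpy as np

def best_match(test_voice, voice_models, threshold=80.0):
    """Score each stored voice model by its peak cross-correlation with
    the test input; return the index of the best-scoring model, or None
    when every score falls below the rejection threshold (the case in
    which the error sound is played)."""
    scores = [np.correlate(test_voice, model, mode="full").max()
              for model in voice_models]
    best = int(np.argmax(scores))
    return best if scores[best] > threshold else None

# Tiny stand-ins for the recorded voice models
models = [np.array([1., 0., 0.]), np.array([0., 0., 5.])]
```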
The figures depicted below show the working of the algorithm. Fig (2) is the test voice input; Fig
(3) is the cross-correlation of the test voice input with one of the predefined voice models; Fig (4)
is the song being played.

Fig (2): test voice input

Fig (3): cross correlation result

Fig (4): track being played

* Reads a Microsoft WAVE (.wav) sound file

** Returns the largest element

Fs=8192; % sampling frequency used for playback

a=sin(2*pi*440*(0:0.000125:0.5)); % each note is a 0.5 s sinusoid at its pitch frequency

b=sin(2*pi*493.88*(0:0.000125:0.5));

c=sin(2*pi*554.37*(0:0.000125:0.5));

d=sin(2*pi*587.33*(0:0.000125:0.5));

e=sin(2*pi*659.26*(0:0.000125:0.5));

f=sin(2*pi*739.99*(0:0.000125:0.5));

g = sin(2*pi*195.99*(0:0.000125:0.5));

line1=[a,a,e,e,f,f,e,e,];

line2=[d,d,c,c,b,b,a,a,];

line3=[e,e,d,d,c,c,b,b];

song1=[line1,line2,line3,line3,line1,line2];

ln1=[c,d,e,e,e,e,e,e,e,e,e,d,e,f];

ln2=[e,e,e,d,d,b,b,d,c];

ln3=[c,g,g,g,g,f,a,g];

ln4=[f,f,f,f,f,f,e,d,f];

song2=[ln1,ln2,ln3,ln4];

l1=[e,e,e,e,e,e,e,g,c,d];

l2=[e,f,f,f,f,f,e,e,e,e];

l3=[e,d,d,e,d,g,e,e,e,e,e,e];

l4=[e,g,c,d,e,f,f,f,f];

l5=[f,e,e,e,e,g,g,f,d,c];

song3=[l1,l2,l3,l4,l5];

lnn1=[e,d,c,d,e,e,e];

lnn2=[d,d,d,e,g,g];

lnn3=[e,d,d,e,d,c];

song4=[lnn1,lnn2,lnn1,lnn3];

lnnn1=[b,a,b,b,a,b,g,a,a];

lnnn2=[g,d,a,b,g,d,a,b];

lnnn3=[c,b,c,c,b,a,g,a,g,a];

lnnn4=[g,d,a,b,g,d,a,b];

lnnn5=[c,b,c,c,b,a,g,a,g,a];

song5=[lnnn1,lnnn2,lnnn3,lnnn4,lnnn5];

recObj = audiorecorder;

disp('Start');

recordblocking(recObj, 2);

disp('end');

Obj=getaudiodata(recObj);

%Speech Recognition Using Correlation Method

%Write Following Command On Command Window

%speechrecognition('test.wav')

voice=Obj;

x=voice;

x=x';

x=x(1,:);

x=x';

y1=audioread('one.wav');

y1=y1';

y1=y1(1,:);

y1=y1';

z1=xcorr(x,y1);

m1=max(z1);

l1=length(z1);

t1=-((l1-1)/2):1:((l1-1)/2);

t1=t1';

%subplot(3,2,1);

plot(t1,z1);

y2=audioread('two.wav');

y2=y2';

y2=y2(1,:);

y2=y2';

z2=xcorr(x,y2);

m2=max(z2);

l2=length(z2);

t2=-((l2-1)/2):1:((l2-1)/2);

t2=t2';

%subplot(3,2,2);

figure

plot(t2,z2);

y3=audioread('three.wav');

y3=y3';

y3=y3(1,:);

y3=y3';

z3=xcorr(x,y3);

m3=max(z3);

l3=length(z3);

t3=-((l3-1)/2):1:((l3-1)/2);

t3=t3';

%subplot(3,2,3);

figure

plot(t3,z3);

y4=audioread('four.wav');

y4=y4';

y4=y4(1,:);

y4=y4';

z4=xcorr(x,y4);

m4=max(z4);

l4=length(z4);

t4=-((l4-1)/2):1:((l4-1)/2);

t4=t4';

%subplot(3,2,4);

figure

plot(t4,z4);

y5=audioread('five.wav');

y5=y5';

y5=y5(1,:);

y5=y5';

z5=xcorr(x,y5);

m5=max(z5);

l5=length(z5);

t5=-((l5-1)/2):1:((l5-1)/2);

t5=t5';

%subplot(3,2,5);

figure

plot(t5,z5);

m6=80; % rejection threshold: if no correlation peak exceeds this, the error sound is played

a=[m1 m2 m3 m4 m5 m6];

m=max(a);

if m<=m1

sound(song1,Fs);

elseif m<=m2

sound(song2,Fs);

elseif m<=m3

sound(song3,Fs);

elseif m<=m4

sound(song4,Fs);

elseif m<=m5

sound(song5,Fs);

else

soundsc(audioread('denied.wav'),8192)

end

3. APPLICATIONS:
The sources of entertainment for differently abled people are very limited, and this paper
focuses on their betterment in a simple but effective way. Many surveys have been conducted
in which differently abled people have shared their problems. This project simplifies playing
music with the help of our automated music system: the user only has to use their voice to
play the music, instead of playing it manually.

Our project can also be applied to the audio systems used in cars; futuristic cars are expected
to be enabled with smart features such as voice-command controls, and our project aids in
exactly that. It not only serves as a futuristic wizard, but also helps society as an aid to the
differently abled.

CONCLUSION:
This paper gives a brief description of a simple and efficient voice-recognition method for
extracting the song stored in the library under a particular voice model. The main
area of concern was the development of the algorithm. Our algorithm is efficient and caters
to the promises made in the paper. We have successfully created and tested the algorithm and
hope that it will be used in the days to come as a tech-wizard.

Fig (5) gives the procedural methodology of our work plan incorporated in the paper.

REFERENCES
[1]. X. D. Huang and K. F. Lee, Phoneme classification using semi-continuous hidden Markov
models, IEEE Trans. on Signal Processing, 40(5):1962-1967, May 1992.

[2]. A. Acero, Acoustical and environmental robustness in automatic speech recognition, Kluwer
Academic Pubs., 1993.

[3]. L. R. Rabiner and R. W. Schafer, Digital processing of speech signals, Prentice Hall, 1978.

[4]. F. Jelinek, "Continuous speech recognition by statistical methods," IEEE Proceedings
64:4 (1976): 532-556.

[5]. S. Young, Review of large vocabulary continuous speech recognition, IEEE Signal
Processing Magazine, pp. 45-57, September 1996.

[6]. L. R. Rabiner and B. H. Juang, Fundamentals of speech recognition, Prentice Hall, 1993.

[7]. Samudravijaya K., "Speech and speaker recognition: A tutorial."

[8]. S. Young, The general use of tying in phoneme-based HMM speech recognizers,
Proceedings of ICASSP 1992.

[9]. http://www.wikipedia.org

[10]. http://www.google.co.in
