
Music Database Retrieval Based on Spectral Similarity

Cheng Yang
Department of Computer Science
Stanford University
yangc@cs.stanford.edu

(Supported by a Leonard J. Shustek Fellowship, part of the Stanford Graduate Fellowship program, and NSF Grant IIS-9811904.)

Abstract

We present an efficient algorithm to retrieve similar music pieces from an audio database. The algorithm tries to capture the intuitive notion of similarity perceived by humans: two pieces are similar if they are fully or partially based on the same score, even if they are performed by different people or at different speeds.

Each audio file is preprocessed to identify local peaks in signal power. A spectral vector is extracted near each peak, and the list of such spectral vectors forms our intermediate representation of a music piece. A database of these intermediate representations is constructed, and two pieces are matched against each other using a specially defined distance function. Matching results are then filtered according to a linearity criterion to select the best answers to a user query.

1 Introduction

With the explosive amount of music data available on the internet in recent years, there has been much interest in developing new ways to search and retrieve such data effectively. Most on-line music databases today, such as Napster and mp3.com, rely on file names or text labels for searching and indexing, using traditional text-searching techniques. Although this approach has proven useful and widely accepted, it would be nice to have more sophisticated search capabilities, namely, searching by content. Potential applications include "intelligent" music retrieval systems, music identification, plagiarism detection, etc. Traditional techniques used in text searching do not easily carry over to the music domain, and people have built a number of special-purpose systems for content-based music retrieval.

Music can be represented in computers in two different ways. One way is based on musical scores, with one entry per note, keeping track of the pitch, duration (start time / end time), strength, etc., of each note. Examples of this representation include MIDI and Humdrum, with MIDI being the most popular format. The other way is based on acoustic signals, recording the audio intensity as a function of time, sampled at a certain frequency and often compressed to save space. Examples of this representation include .wav, .au, and MP3.

A simple software or hardware synthesizer can convert MIDI-style data into audio signals to be played back for human listeners. However, there is no known algorithm for reliable conversion in the other direction. For decades people have tried to design automatic transcription systems that extract musical scores from raw audio recordings, but they have succeeded only in monophonic and very simple polyphonic cases [1, 3, 9], not in the general polyphonic case. (Polyphony refers to the scenario where multiple notes occur at the same time, possibly from different instruments or vocal sounds; most music pieces are polyphonic.) In Section 3.1 we explain briefly why automatic transcription of general polyphonic music is a difficult task.

Score-based representations such as MIDI and Humdrum are much more structured and easier to handle than raw audio data. On the other hand, they have limited expressive power and are not as rich as what people would like to hear in music recordings. Therefore, only a small fraction of the music data on the internet is represented in score-based formats; most music data is found in various raw audio formats.

Most content-based music retrieval systems operate on score-based databases, with input methods ranging from note sequences to melody contours to user-hummed tunes [2, 5, 6]. Relatively few systems target raw audio databases. A brief review of related work is given in Section 2. Our work focuses on raw audio databases; both the underlying database and the user query are given in .wav audio format. We develop algorithms to search for music pieces similar to the user query. Similarity is based on the intuitive notion of similarity perceived by humans: two pieces are similar if they are fully or partially based on the same score, even if they are performed by different people or at different tempos.
In the next section we discuss some previous work in this area. In Section 3 we start with some background information and then give a detailed presentation of our algorithm to detect music similarity. Section 4 gives experimental results, and future directions are discussed in Section 5.

2 Related Work

Examples of score-based database (MIDI or Humdrum) retrieval systems include the ThemeFinder project (http://www.themefinder.org) developed at Stanford University, where users can query its Humdrum database by entering pitch sequences, pitch intervals, scale degrees or contours (up, down, etc.). The "Query-By-Humming" system [5] at Cornell University takes a user-hummed tune as input, converts it to contour sequences, and matches it against its MIDI database. Human-hummed tunes are monophonic melodies and can be automatically transcribed into pitches with reasonable accuracy, and melody contour information is generally sufficient for retrieval purposes [2, 5, 6].

Among music retrieval research conducted on raw audio databases, Scheirer [7, 8] studied pitch and rhythmic analysis, segmentation, as well as music similarity estimation at a high level such as genre classification. Tzanetakis and Cook [10] built tools to distinguish speech from music, and to do segmentation and simple retrieval tasks. Wold et al. at Muscle Fish LLC [11] developed audio retrieval methods for a wider range of sounds besides music, based on analyses of sound signals' statistical properties such as loudness, pitch, brightness, bandwidth, etc. Recently, *CD (http://www.starcd.com) commercialized a music identification system that can identify songs played on radio stations by analyzing each recording's audio properties.

Foote [4] experimented with music similarity detection by matching power and spectrogram values over time using a dynamic programming method. He defined a cost model for matching two pieces point-by-point, with a penalty added for non-matching points; lower cost means a closer match in the retrieval result. Test results on a small test corpus indicated that the method is feasible for detecting similarity in orchestral music. Part of our algorithm makes use of a similar idea, but with two important differences: we focus on spectrogram values near power peaks only, rather than over the entire time period, thereby making the algorithm more tolerant of tempo changes; furthermore, we evaluate final matching results by a linearity criterion which is more intuitive and robust than the cost models used for dynamic programming.

3 Detecting Similarity

In this section we start with some background information on signal processing techniques and musical signal properties, then give a detailed discussion of our algorithm.

3.1 Background

After decompression and parsing, each raw audio file can be regarded as a list of signal intensity values, sampled at a specific frequency. CD-quality stereo recordings have two channels, each sampled at 44.1kHz, with each sample represented as a 16-bit integer. In our experiments we use single-channel recordings of a lower quality, sampled at 22.05kHz, with each sample represented as an 8-bit integer. Therefore, a 60-second uncompressed sound clip takes 22050 x 60, or about 1.3 million, bytes.

We use the Short-Time Fourier Transform (STFT) to convert each signal into a spectrogram: split each signal into 1024-sample segments with 50% overlap, window each segment with a Hanning window, and perform a 2048-point zero-padded FFT on each windowed segment. Taking absolute values (magnitudes) of the FFT result, we obtain a spectrogram giving localized spectral content as a function of time. Since the details of this process are covered in most signal processing textbooks, we will not discuss them here.
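To make the preprocessing step concrete, here is a minimal NumPy sketch of the spectrogram computation just described. The framing parameters come from the paper; the function name and the use of numpy.fft.rfft are our own choices, not part of the paper.

    import numpy as np

    def spectrogram(signal, frame_len=1024, fft_len=2048):
        # 1024-sample frames with 50% overlap, Hanning-windowed,
        # zero-padded to a 2048-point FFT; keep magnitudes only.
        hop = frame_len // 2
        window = np.hanning(frame_len)
        frames = []
        for start in range(0, len(signal) - frame_len + 1, hop):
            seg = signal[start:start + frame_len] * window
            frames.append(np.abs(np.fft.rfft(seg, n=fft_len)))
        return np.array(frames)   # shape: (num_frames, fft_len // 2 + 1)

At the 22.05kHz sampling rate used here, each frame spans about 46 ms and adjacent FFT bins are about 10.8 Hz apart.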
indicated that the method is feasible for detecting similarity Figure 1 shows a sample spectrogram on the note se-
in orchestral music. Part of our algorithm makes use of a quence of middle C, E and G played on a piano. The
similar idea, but with two important differences: we focus horizontal axis is time in seconds, and the vertical axis is
on spectrogram values near power peaks only, rather than frequency component in Hz. Lighter pixels correspond to

over the entire time period, therefore making tempo changes higher values. If we zoom in to time  and look at the
more transparent; furthermore, we evaluate final matching frequency components of note G closely, we notice that it
results by some linearity criteria which is more intuitive and has many peaks (Figure 2), one at 392 Hz (its fundamental
robust than the cost models used for dynamic programming. frequency) and several others at integer multiples of 392 Hz

2
Figure 2. Frequency components of note G played by a piano (horizontal axis: frequency in Hz; vertical axis: intensity).

When multiple notes occur at the same time ("polyphony"), their frequency components add. Figures 3(a)-(c) show the frequency components of C, E and G played individually, while Figure 3(d) shows those of all three notes played together. In this simple example it is still possible to design algorithms to extract the individual pitches from the chord signal C-E-G, but in actual music recordings many more notes co-exist, played by many different instruments whose patterns of harmonics we do not know. In addition, there are sounds produced by percussion instruments, human voice, and noise. The task of automatic transcription of music from arbitrary audio data (i.e., conversion from raw audio format into MIDI) therefore becomes extremely difficult, and remains unsolved today. Our algorithm, like most other music retrieval systems, does not attempt to do transcription.

Figure 3. Illustration of polyphony: frequency components of (a) C, (b) E, (c) G, and (d) all three notes played together.

3.2 The Algorithm

The algorithm consists of three components, which are discussed separately.

1. Intermediate Data Generation.

For each music piece, we generate its spectrogram as discussed in Section 3.1, and plot its instantaneous power as a function of time. Figure 4 shows such a power plot for a 40-second sound clip of Tchaikovsky's Piano Concerto No. 1. Next, we identify peaks in this power plot, where a peak is defined as a local maximum within a neighborhood of a fixed size. This definition helps remove bogus local "peaks" which are immediately followed or preceded by higher values; in Figure 5, the marked local maxima that sit next to higher values are bogus peaks, while the others are true peaks. Intuitively, these peaks roughly correspond to distinctive notes or rhythmic patterns. For the 60-second music clips used in our experiments, we typically find 100-200 peaks in each of them.

Figure 4. Power plot of Tchaikovsky's Piano Concerto No. 1 (horizontal axis: time in seconds; vertical axis: power).

Figure 5. True peak vs. bogus peak.
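As an illustration, the sketch below computes the instantaneous power from the spectrogram of Section 3.1 and applies the fixed-neighborhood peak rule. The neighborhood half-width is an assumed value; the paper fixes the neighborhood size but does not state the number.

    import numpy as np

    def find_power_peaks(spec, half_width=20):
        # Instantaneous power per frame, summed over frequency bins.
        power = (spec ** 2).sum(axis=1)
        peaks = []
        for t in range(len(power)):
            lo = max(0, t - half_width)
            hi = min(len(power), t + half_width + 1)
            # A true peak is the maximum of its whole neighborhood; this
            # discards bogus "peaks" adjacent to higher values.
            if power[t] == power[lo:hi].max():
                peaks.append(t)
        return peaks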
After the list of peaks is obtained, we extract the frequency components near each peak: we take 180 samples of the frequency components between 200Hz and 2000Hz. Average values over a short time period following the peak are used, in order to reduce sensitivity to noise and to avoid the "attack" portions produced by certain instruments (short, non-harmonic signal segments at the onset of each note).
In the end, we get n spectral vectors of 180 dimensions each, where n is the number of peaks obtained. We normalize each spectral vector so that it has mean 0 and variance 1. After normalization, these vectors form our intermediate representation of the corresponding music piece. Typically each new note in a piece corresponds to a new peak, and therefore to a vector in this representation. Notice that we do not expect to capture all new notes in this way, and will almost certainly have some false positives and false negatives. However, later stages of the algorithm will compensate for this inaccuracy.
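Continuing the sketches above, the extraction and normalization step might look as follows. The number of frames averaged after each peak is an assumption, since the paper specifies only "a short time period".

    import numpy as np

    def spectral_vectors(spec, peaks, fs=22050, fft_len=2048, avg_frames=5):
        bin_hz = fs / fft_len                        # ~10.8 Hz per FFT bin
        lo, hi = int(200 / bin_hz), int(2000 / bin_hz)
        bins = np.linspace(lo, hi, 180).astype(int)  # 180 samples of 200-2000 Hz
        vectors = []
        for t in peaks:
            # Average a few frames after the peak to reduce noise
            # sensitivity and de-emphasize the note's "attack".
            v = spec[t:t + avg_frames].mean(axis=0)[bins]
            vectors.append((v - v.mean()) / v.std())  # mean 0, variance 1
        return np.array(vectors)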

2. Matching.

This component matches two music pieces against each other and determines how close they are, based on the intermediate representation generated above. Matching comes in two stages: minimum-distance matching and linearity filtering.

(a) Minimum-distance matching.

Suppose we would like to compare two music pieces with spectral vectors x_1, ..., x_m and y_1, ..., y_n respectively. Define d_ij to be the root-mean-squared error between vectors x_i and y_j. It can be shown that d_ij is linearly related to the correlation coefficient of the original spectra near peak i of the first piece and peak j of the second one: a smaller d_ij value corresponds to a larger correlation coefficient (see [13] for a proof). Therefore, d_ij is a natural indicator of the similarity of the original spectra at corresponding peaks.

Let M = {(x_i1, y_j1), (x_i2, y_j2), ..., (x_ik, y_jk)} be a set of k matches, pairing x_i1 with y_j1, x_i2 with y_j2, etc., as shown in Figure 6, where i1 < i2 < ... < ik <= m and j1 < j2 < ... < jk <= n.

Figure 6. Set of matching pairs: peaks x1, ..., xk of piece s paired with peaks y1, ..., yk of piece r along the time axis.

Given the prefix subsets X_1p = {x_1, ..., x_p} and Y_1q = {y_1, ..., y_q} of the x and y vectors, and a particular match set M, define the distance of X_1p and Y_1q with respect to M as

    D(X_1p, Y_1q, M) = d_i1,j1 + ... + d_ik,jk + lambda * ((p - k) + (q - k))

and the minimum distance between X_1p and Y_1q as

    D(X_1p, Y_1q) = min over M of D(X_1p, Y_1q, M).

The distance definition is basically a sum of all matching errors plus a penalty term for the number of non-matching points, weighted by lambda. Experiments have shown that a suitable constant value of lambda works reasonably well.

The minimum distance D(X_1p, Y_1q) can be found by a dynamic programming approach, because

    D(X_1p, Y_10) = lambda * p,    D(X_10, Y_1q) = lambda * q,

and for any 0 < p <= m, 0 < q <= n,

    D(X_1p, Y_1q) = min { D(X_1,p-1, Y_1,q-1) + d_pq,
                          D(X_1,p-1, Y_1q) + lambda,
                          D(X_1p, Y_1,q-1) + lambda }.

The optimal matching set M* that leads to the minimum distance can also be traced back from the dynamic programming table.

Based on the definitions above, the minimum distance between two music pieces with spectral vectors x_1, ..., x_m and y_1, ..., y_n is D(X_1m, Y_1n), and it can be found with dynamic programming; a sketch follows.
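Below is a short sketch of this dynamic program, including the traceback that recovers the optimal matching set. The penalty weight lam = 1.0 is a placeholder, not the value used in the paper.

    import numpy as np

    def min_distance_match(X, Y, lam=1.0):
        m, n = len(X), len(Y)
        # d[i, j] = root-mean-squared error between vectors X[i] and Y[j].
        d = np.sqrt(((X[:, None, :] - Y[None, :, :]) ** 2).mean(axis=2))
        D = np.empty((m + 1, n + 1))
        D[:, 0] = lam * np.arange(m + 1)    # D(X_1p, Y_10) = lambda * p
        D[0, :] = lam * np.arange(n + 1)    # D(X_10, Y_1q) = lambda * q
        for p in range(1, m + 1):
            for q in range(1, n + 1):
                D[p, q] = min(D[p-1, q-1] + d[p-1, q-1],  # match x_p with y_q
                              D[p-1, q] + lam,            # x_p unmatched
                              D[p, q-1] + lam)            # y_q unmatched
        # Trace back to recover the optimal matching set M*.
        M, p, q = [], m, n
        while p > 0 and q > 0:
            if np.isclose(D[p, q], D[p-1, q-1] + d[p-1, q-1]):
                M.append((p - 1, q - 1)); p -= 1; q -= 1
            elif np.isclose(D[p, q], D[p-1, q] + lam):
                p -= 1
            else:
                q -= 1
        return D[m, n], M[::-1]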
(b) Linearity filtering.

Although the previous step gives the minimum distance and an optimal matching based on the distance function, it is not robust enough for music comparison. Experiments have shown that certain subjectively dissimilar pieces may also end up with a small distance score, therefore appearing similar to the system. To make the algorithm more robust, further filtering is needed.

Figure 7 shows two ways to match piece s against piece r, both with 10 matches. Both may yield a low matching score, but the top one is obviously better than the bottom one. In the top one, there is a slight tempo change between the two pieces, but the change is uniform in time. In the bottom one, however, there is no plausible explanation for the twisted matching. If we plot a 2-D graph of the matching points of s on the horizontal axis vs. the corresponding points of r on the vertical axis, the top match would give a straight line while the bottom one would not.
Figure 7. "Good" vs. "bad" matching: two sets of 10 matches between pieces s and r.

Formally, the matching set M = {(x_i1, y_j1), ..., (x_ik, y_jk)} can be plotted on a 2-D graph, with the original locations (time offsets) of peaks i1, ..., ik (of the first music piece) on the horizontal axis and those of peaks j1, ..., jk (of the second piece) on the vertical axis. If the two pieces were indeed mostly based on the same score, the plotted points should fall roughly on a straight line. Without tempo change, the line should be at a 45-degree angle. With a tempo change, the line may be at a different angle, but it should still be straight.

In this linearity filtering step, we examine the graph of the optimal matching set obtained from the dynamic programming above, fit a straight line through the points (using the least mean-square criterion), and check whether any points fall too far away from the line. If so, we remove the most outlying point and fit a new line through the remaining points. The process repeats until all remaining points lie within a small neighborhood of the fitted line. (In the worst case, only two points are left at the end; in practice we stop when fewer than 10 points remain.)
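The filtering loop just described, as a short sketch; the distance tolerance tol is an assumed parameter, since the paper speaks only of a "small neighborhood" of the fitted line.

    import numpy as np

    def linearity_filter(times_x, times_y, tol=1.0, min_points=10):
        tx = np.asarray(times_x, dtype=float)
        ty = np.asarray(times_y, dtype=float)
        while len(tx) >= min_points:
            a, b = np.polyfit(tx, ty, 1)      # least-squares line y = a*x + b
            resid = np.abs(ty - (a * tx + b))
            if resid.max() <= tol:            # every point lies near the line
                break
            worst = resid.argmax()            # drop the most outlying point
            tx = np.delete(tx, worst)
            ty = np.delete(ty, worst)
        return len(tx)                        # matching points after filtering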
The total number of matching points remaining after this filtering step is taken as an indicator of how well the two pieces match. As will be shown in Section 4, this criterion is remarkably effective in detecting similarity.
3. Query Processing.

All music files are preprocessed into the intermediate representation of spectral vectors discussed earlier. Given a query sound clip (also converted into the intermediate representation), the database is matched against the query using the minimum-distance matching and linearity filtering algorithms. The pieces that end up with the highest number of matching points (provided the number is above a certain threshold) are selected as answers to the user query.

Figure 8 summarizes the overall structure of the music retrieval algorithm.

Figure 8. Summary of algorithm structure: both the query music and the music database pass through Intermediate Data Generation; Minimum-Distance Matching compares the query vector with the vector database, and the candidate matches pass through Linearity Filtering to produce the final results.
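Putting the stages together, a sketch of the query step built on the functions above; the acceptance threshold is an assumed parameter.

    def answer_query(query_vecs, query_times, database, threshold=20):
        # database maps a piece name to its (spectral vectors, peak times).
        scores = []
        for name, (vecs, times) in database.items():
            _, M = min_distance_match(query_vecs, vecs)
            tx = [query_times[i] for i, j in M]   # matched peak times (query)
            ty = [times[j] for i, j in M]         # matched peak times (piece)
            scores.append((linearity_filter(tx, ty), name))
        scores.sort(reverse=True)
        # Rank by surviving matching points; keep those above the threshold.
        return [name for score, name in scores if score >= threshold]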
3.3 Complexity Analysis

The time complexity of the preprocessing step is O(N), where N is the size of the database. Because only "peak" information is recorded in the spectral vector representation, the space required is only a fraction of the original audio database.

Dynamic programming for minimum-distance matching takes O(k^2) time for each run, where k is the expected number of peaks in each piece. Because k is much less than N when the database is large, it can be regarded as a constant, and N is the dominant factor. Linearity filtering takes a negligible amount of time in practice, although its worst-case complexity is also up to O(k^2).

Overall, assuming k is a constant factor, the algorithm runs in O(N) time for each query. When the database gets large, O(N) running time may be too slow. We are experimenting with indexing schemes [12] which will give better performance.

4 Experiments

Our data collection is done by recording CDs or tapes into PCs through a low-quality PC microphone. No special effort is made to reduce noise. This setup is intentional, in order to test the algorithm's robustness and performance in a practical environment. Both classical and modern music are included, with classical music being the focus. Instead of taking entire pieces, we take only 30- to 60-second clips from each piece, because that much data is generally enough for similarity detection.

We identify five different types of "similar" music pairs, with increasing levels of difficulty:

Type I: Identical digital copy.
Type II: Same analog source, different digital copies, possibly with noise.
Type III: Same instrumental performance, different vocal components.
Type IV: Same score, different performances (possibly at different tempo).
Type V: Same underlying melody, different otherwise, with possible transposition.

Sound samples of each type can be found at http://www-db.stanford.edu/~yangc/musicir/ .

Figure 9 shows the power plots of two different performances of Tchaikovsky's Piano Concerto No. 1 (A and B) and two different performances of Chopin's "Military" Polonaise (C and D). Both pairs are of Type-IV similarity. Each pair was performed by different orchestras and published by different companies, with variations in tempo as well as in performance style. From the power plots it can be seen that notes are emphasized differently. Nevertheless, both pairs yield small distance scores after minimum-distance matching. On the other hand, a few dissimilar pairs also yield scores that are not large, such as Tchaikovsky's Piano Concerto No. 1 (A) vs. Brahms' Cradle Song (referred to as E from now on), and Chopin's "Military" Polonaise (D) vs. Mendelssohn's Spring Song (referred to as F from now on).

Figure 9. Power plots of pieces A, B, C and D (horizontal axis: time in seconds).

Figure 11 shows sample plots of optimal matching sets before linearity filtering (solid lines connecting the dots), where the horizontal axis is time (in seconds) of the first piece and the vertical axis is time of the second piece. A straight line is fitted through each set of matching points (dashed lines). As is clear from the plots, A and B are truly similar (almost all points are colinear), while A and E are not; C and D are truly similar, while D and F are not.

After certain matching points are removed by linearity filtering, Figure 11 becomes Figure 12. The pairs (A, B) and (C, D) have 49 and 54 matching points respectively, while the other two pairs have fewer than 15 remaining matching points.

Figure 10 shows the pairwise matching result for a set of 10 music pieces, of which two pairs ((A, B) and (C, D)) are different performances of the same scores (with Type-IV similarity). The result is shown as a 10 x 10 matrix where entry (i, j) gives the final number of matching points between pieces i and j after linearity filtering. Because of symmetry, only the upper triangle of the matrix is presented. Two peaks in the graph clearly indicate the discovery of the "correct" pairs.

Figure 10. Pairwise matching result for 10 music pieces (axes: item 1, item 2; height: number of matching points).
Figure 11. Matching plots before filtering: A vs. B, C vs. D, A vs. E, D vs. F.

Figure 12. Matching plots after filtering: A vs. B, C vs. D, A vs. E, D vs. F.
More queries were conducted on a larger dataset of 120 music pieces, each of size 1MB. For each query, items from the database are ranked according to the number of final matching points with the query music, and the top 2 matches are returned. Figure 13 shows the retrieval accuracy for each of the five types of similarity queries. As can be seen from the graph, the algorithm performs very well on the first four types. Type V is the most difficult, and better algorithms need to be developed to handle it.

Figure 13. Retrieval accuracy (%) for each similarity type I-V.

5 Conclusions and Future Work

We have presented an efficient algorithm to perform content-based music retrieval based on spectral similarity. Experiments have shown that the approach can detect similarity while tolerating tempo changes, some performance-style changes, and noise, as long as the different performances are based on the same score.

Future research may include studying the effects of the various threshold parameters used in the algorithm, and finding ways to automate the selection of certain parameters to optimize performance.

We are experimenting with indexing schemes [12] in order to get faster retrieval response. We are also planning to augment the algorithm to handle transpositions (pitch shifts). Although transpositions of entire pieces are not very common, it is common to have small segments transposed to a different key, and it would be important to detect such cases.

Another future direction is to design algorithms to extract high-level representations such as approximate melody contours. This task is certainly non-trivial, but it may be less difficult than transcription, and at the same time very powerful for similarity detection in complex cases.

Instead of using the peak-detection scheme during preprocessing, one can also incorporate existing rhythm-detection algorithms to improve performance. Also, different algorithms may be suited to different types of music, so it may be helpful to conduct some analysis of general statistical properties before deciding which algorithm to use.

Content-based retrieval of musical audio data is still a new area that is not well explored. There are many possible future directions, and this paper is intended only as a demonstration of the feasibility of certain prototype ideas, on which more extensive experiments and research remain to be done.

References

[1] J. P. Bello, G. Monti and M. Sandler, "Techniques for Automatic Music Transcription", in International Symposium on Music Information Retrieval, 2000.

[2] S. Blackburn and D. DeRoure, "A Tool for Content Based Navigation of Music", in Proc. ACM Multimedia, 1998.

[3] J. C. Brown and B. Zhang, "Musical Frequency Tracking using the Methods of Conventional and 'Narrowed' Autocorrelation", J. Acoust. Soc. Am. 89, pp. 2346-2354, 1991.

[4] J. Foote, "ARTHUR: Retrieving Orchestral Music by Long-Term Structure", in International Symposium on Music Information Retrieval, 2000.

[5] A. Ghias, J. Logan, D. Chamberlin and B. Smith, "Query By Humming - Musical Information Retrieval in an Audio Database", in Proc. ACM Multimedia, 1995.

[6] R. J. McNab, L. A. Smith, I. H. Witten, C. L. Henderson and S. J. Cunningham, "Towards the Digital Music Library: Tune Retrieval from Acoustic Input", in Proc. ACM Digital Libraries, 1996.

[7] E. D. Scheirer, "Pulse Tracking with a Pitch Tracker", in Proc. Workshop on Applications of Signal Processing to Audio and Acoustics, 1997.

[8] E. D. Scheirer, Music-Listening Systems, Ph.D. dissertation, Massachusetts Institute of Technology, 2000.

[9] A. S. Tanguiane, Artificial Perception and Music Recognition, Springer-Verlag, 1993.

[10] G. Tzanetakis and P. Cook, "Audio Information Retrieval (AIR) Tools", in International Symposium on Music Information Retrieval, 2000.

[11] E. Wold, T. Blum, D. Keislar and J. Wheaton, "Content-Based Classification, Search and Retrieval of Audio", IEEE Multimedia, 3(3), 1996.

[12] C. Yang, "MACS: Music Audio Characteristic Sequence Indexing for Similarity Retrieval", in IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2001.

[13] C. Yang and T. Lozano-Pérez, "Image Database Retrieval with Multiple-Instance Learning Techniques", in Proc. International Conference on Data Engineering, 2000, pp. 233-243.