
Rabin Karp Matching

The Rabin-Karp algorithm is a string matching algorithm that is usually more efficient than the naive algorithm, which takes O(n^2) time in the worst case. It works by hashing the pattern string and each same-length substring of the text into a numeric value. Equal hash values indicate a candidate match. Because hash collisions can produce false positives, each candidate must be confirmed by comparing the strings directly. The algorithm runs in O(n) expected time because it compares strings only where the hash values match rather than at every shift.

Uploaded by

Mouniga Ve
Copyright
© Attribution Non-Commercial (BY-NC)

The Rabin-Karp Algorithm

String Matching

Jonathan M. Elchison
19 November 2004
CS-3410 Algorithms
Dr. Shomper

Background
String matching
Naive method
n = size of input string
m = size of pattern to be matched
O( (n-m+1)m )
Θ( n^2 ) if m = floor( n/2 )
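
A minimal sketch of the naive method, assuming 0-based shifts and Python strings (the function name naive_match is illustrative, not from the slides). It tries all n-m+1 shifts and compares up to m characters at each one:

```python
def naive_match(text, pattern):
    """Return every 0-based shift at which pattern occurs in text."""
    n, m = len(text), len(pattern)
    shifts = []
    for s in range(n - m + 1):            # n - m + 1 candidate shifts
        if text[s:s + m] == pattern:      # up to m character comparisons
            shifts.append(s)
    return shifts


print(naive_match("BANANA BANDANA", "ANA"))  # [1, 3, 11]
```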

We can do better

How it works
Consider a hashing scheme
Each symbol in the alphabet Σ can be represented by an ordinal value in { 0, 1, 2, ..., d-1 }
|Σ| = d
Radix-d digits

How it works
Hash pattern P into a numeric value
Let a string be represented by the sum of these digits
Horner's rule (§30.1)
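
A short sketch of Horner's rule for reading a sequence of radix-d digits as one number. This is the positional evaluation that the update p ← (d*p + P[i]) in the pseudocode performs; the examples on these slides use a plain sum of digits instead. The name horner_value is an assumption made for illustration:

```python
def horner_value(digits, d):
    """Evaluate a sequence of radix-d digits as a single number."""
    value = 0
    for digit in digits:
        value = d * value + digit   # shift one radix-d place left, add next digit
    return value


print(horner_value([3, 1, 4, 1, 5], 10))  # 31415
```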

Example
{ A, B, C, ..., Z } → { 0, 1, 2, ..., 25 }
BAN → 1 + 0 + 13 = 14
CARD → 2 + 0 + 17 + 3 = 22
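
The example can be reproduced with a small sketch, assuming uppercase A-Z input mapped to 0-25 (the names ordinal and additive_hash are illustrative):

```python
def ordinal(ch):
    """Map 'A'..'Z' to 0..25."""
    return ord(ch) - ord("A")


def additive_hash(word):
    """Sum of the ordinal values of the letters, as in the slide example."""
    return sum(ordinal(ch) for ch in word)


print(additive_hash("BAN"))   # 1 + 0 + 13 = 14
print(additive_hash("CARD"))  # 2 + 0 + 17 + 3 = 22
```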

Upper limits
Problem
For long patterns, or for large alphabets, the number representing a given string may be too large to be practical

Solution
Use MOD operation
When reduced MOD q, values will be < q

Example
BAN = 1 + 0 + 13 = 14
14 mod 13 = 1
BAN → 1

CARD = 2 + 0 + 17 + 3 = 22
22 mod 13 = 9
CARD → 9
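
A sketch of the same hash with the MOD step folded in, assuming A-Z mapped to 0-25 and q = 13 as in the example (additive_hash_mod is an illustrative name). Reducing after every addition gives the same result as reducing once at the end, and keeps the running value small:

```python
def additive_hash_mod(word, q):
    """Sum of letter ordinals (A=0 .. Z=25), reduced mod q at every step."""
    h = 0
    for ch in word:
        h = (h + ord(ch) - ord("A")) % q   # h is always below q after this line
    return h


print(additive_hash_mod("BAN", 13))   # 14 mod 13 = 1
print(additive_hash_mod("CARD", 13))  # 22 mod 13 = 9
```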

Searching

Spurious Hits
Question
Does a hash value match mean that the patterns match?

Answer
No. These are called spurious hits

Possible cases
MOD operation interfered with uniqueness of hash values
14 mod 13 = 1
27 mod 13 = 1
MOD value q is usually chosen as a prime such that d*q (10q for decimal digits) just fits within 1 computer word

Information is lost in generalization (addition)
BAN → 1 + 0 + 13 = 14
CAM → 2 + 0 + 12 = 14
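
Both cases can be checked with a small sketch, assuming A-Z mapped to 0-25 (additive_hash and is_spurious_hit are illustrative names, not from the slides). A spurious hit is an equal hash on unequal strings:

```python
def additive_hash(word, q=None):
    """Sum of letter ordinals (A=0 .. Z=25), optionally reduced mod q."""
    total = sum(ord(ch) - ord("A") for ch in word)
    return total if q is None else total % q


def is_spurious_hit(pattern, window, q=None):
    """True when the hashes agree but the strings differ."""
    same_hash = additive_hash(pattern, q) == additive_hash(window, q)
    return same_hash and pattern != window


print(additive_hash("BAN"), additive_hash("CAM"))  # 14 14 (lost by addition)
print(14 % 13, 27 % 13)                            # 1 1   (lost by MOD)
print(is_spurious_hit("BAN", "CAM"))               # True
```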

Code
RABIN-KARP-MATCHER( T, P, d, q )
    n ← length[ T ]
    m ← length[ P ]
    h ← d^(m-1) mod q
    p ← 0
    t_0 ← 0
    for i ← 1 to m                                  (Preprocessing)
        do p ← ( d*p + P[ i ] ) mod q
           t_0 ← ( d*t_0 + T[ i ] ) mod q
    for s ← 0 to n - m                              (Matching)
        do if p = t_s
              then if P[ 1..m ] = T[ s+1 .. s+m ]
                      then print "Pattern occurs with shift" s
           if s < n - m
              then t_(s+1) ← ( d * ( t_s - T[ s + 1 ] * h ) + T[ s + m + 1 ] ) mod q
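
One possible runnable translation of the pseudocode, offered as a sketch rather than a reference implementation. It assumes 0-based indexing, uses ord() as the digit value of each character, and picks d = 256 and q = 101 as illustrative defaults. Returning an empty list for an empty pattern is also a choice made here, not something the pseudocode specifies:

```python
def rabin_karp_matcher(text, pattern, d=256, q=101):
    """Return every 0-based shift at which pattern occurs in text."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    h = pow(d, m - 1, q)      # weight of the leading digit: d^(m-1) mod q
    p = 0                     # hash of the pattern
    t = 0                     # hash of the current text window
    for i in range(m):        # preprocessing
        p = (d * p + ord(pattern[i])) % q
        t = (d * t + ord(text[i])) % q
    shifts = []
    for s in range(n - m + 1):            # matching
        # Verify character by character only when the hashes agree.
        if p == t and text[s:s + m] == pattern:
            shifts.append(s)
        if s < n - m:
            # Drop the leading character, shift, and append the next one.
            t = (d * (t - ord(text[s]) * h) + ord(text[s + m])) % q
    return shifts


print(rabin_karp_matcher("BANANA BANDANA", "ANA"))  # [1, 3, 11]
```

Python's % operator returns a non-negative result for a positive modulus, so the subtraction in the rolling update needs no extra correction here. In languages where the remainder can be negative, q would have to be added back before reducing.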

Performance
Preprocessing (determining each pattern hash)
Θ( m )

Worst case running time
Θ( (n-m+1)m )
No better than naive method

Expected case
If we assume the number of hits is constant compared to n, we expect O( n )
Only hash hits are pattern-matched character by character, not all shifts
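
The gap between the worst case and the expected case can be seen by counting hits. The sketch below makes the same assumptions as the pseudocode, with ord() as the digit value and illustrative defaults d = 256, q = 101 (count_hits is a hypothetical helper name). It returns how many hash hits were valid and how many were spurious. Each hit costs up to m character comparisons, so the worst case arises when every shift is a hit:

```python
def count_hits(text, pattern, d=256, q=101):
    """Return (valid, spurious) hash-hit counts over all shifts."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return 0, 0
    h = pow(d, m - 1, q)
    p = t = 0
    for i in range(m):
        p = (d * p + ord(pattern[i])) % q
        t = (d * t + ord(text[i])) % q
    valid = spurious = 0
    for s in range(n - m + 1):
        if p == t:                            # hash hit: verification needed
            if text[s:s + m] == pattern:
                valid += 1
            else:
                spurious += 1
        if s < n - m:
            t = (d * (t - ord(text[s]) * h) + ord(text[s + m])) % q
    return valid, spurious


print(count_hits("A" * 20, "A" * 5))          # (16, 0): every shift is a valid hit
print(count_hits("ABCDEFGHIJ", "XYZ", q=1))   # (0, 8): q = 1 makes every shift a spurious hit
```

The second call uses q = 1 only to force collisions. It is a degenerate modulus chosen for the demonstration, not a sensible choice of q.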

Demonstration
http://www-igm.univ-mlv.fr/~lecroq/string/node5.html

Sources:
Cormen, Thomas H., et al. Introduction to Algorithms. 2nd ed. Cambridge: MIT Press, 2001.
"Karp-Rabin algorithm." 15 Jan 1997. <http://www-igm.univ-mlv.fr/~lecroq/string/node5.html>.
Shomper, Keith. "Rabin-Karp Animation." E-mail to Jonathan Elchison. 12 Nov 2004.

