DSA String Matching - Part 3

The document outlines three string matching algorithms: Boyer-Moore, Brute-Force, and Aho-Corasick. The Boyer-Moore algorithm is efficient due to its use of heuristics to skip unnecessary comparisons, while the Brute-Force algorithm is a straightforward but less efficient approach. The Aho-Corasick algorithm is designed for searching multiple patterns simultaneously by building an automaton, significantly reducing search time.

Uploaded by

jaspinjose

CS4301-DSA I Department of CSE 2025-2026

CS4301- DATA STRUCTURES AND ALGORITHMS I

CONTENT BEYOND THE SYLLABUS
STRING MATCHING

String Matching Algorithms

1) Boyer Moore Algorithm for Pattern Matching
2) Brute-Force String Search Algorithm
3) Aho-Corasick Algorithm for Pattern Searching

1) Boyer Moore Algorithm for Pattern Matching

The Boyer Moore algorithm is used to determine whether a given pattern is present
within a specified text. It compares the pattern with the text from right to left, starting at
the last character of the pattern.
The task of searching for a particular pattern within a given string is known as the pattern
searching problem. For example, if the text is "THIS IS A SAMPLE TEXT" and the pattern
is "TEXT", then the output should be 17, which is the 0-based index of the first occurrence of
the pattern in the given text.
This algorithm was developed by Robert Boyer and J Strother Moore in 1977. It is
regarded as one of the most efficient and widely used algorithms for pattern matching.
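The expected output of the pattern searching problem can be checked with a short Python sketch. It uses the built-in `str.find` method rather than Boyer Moore itself, so it only illustrates the input and output of the problem; the variable names are chosen here for illustration.

```python
text = "THIS IS A SAMPLE TEXT"
pattern = "TEXT"

# str.find returns the 0-based index of the first occurrence, or -1 if absent.
first_index = text.find(pattern)
print(first_index)  # 17
```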

How does Boyer Moore Algorithm work?

In the previous chapters, we have seen the naive way to solve this problem, which
involves sliding the pattern over the text one position at a time and comparing the characters
at each position. However, this approach is slow, as it takes O(n*m) time in the worst case,
where 'n' is the length of the text and 'm' is the length of the pattern. The Boyer Moore
algorithm improves on this by preprocessing the pattern and using two heuristics to skip
alignments that cannot match.
The two heuristics are as follows −
• Bad character heuristic − This heuristic uses a table that stores the last occurrence of
each character in the pattern. When a mismatch occurs at some character (the bad
character) in the text, the algorithm checks whether this character appears in the pattern.
If it does, it shifts the pattern so that the last occurrence of this character in the
pattern aligns with the bad character in the text. If it does not, it shifts the pattern
past the bad character. If the last occurrence lies to the right of the mismatch position,
the pattern is simply moved forward by one position.
• Good suffix heuristic − This heuristic uses a second table, which applies when a suffix
of the pattern has already matched the text before the mismatch. The pattern is shifted
so that an earlier occurrence of that matched suffix in the pattern lines up with the
matched text. If there is none, the longest prefix of the pattern that is also a suffix of
the matched part is aligned instead, and otherwise the pattern is shifted past the
matched part.

The Boyer-Moore algorithm combines these two heuristics by choosing the maximum
shift suggested by them at each step. At each alignment, the pattern is compared with the
text starting from the last character of the pattern and moving left. If every character
matches, an occurrence is reported and the pattern is shifted to look for further occurrences.
If there is a mismatch, the algorithm applies the heuristics and shifts the pattern accordingly.
The algorithm stops when the pattern has been shifted past the end of the text, or at the first
complete match if only one occurrence is needed.
The Boyer-Moore algorithm has a worst-case time complexity of O(nm), but it usually
performs much better than that. In the best case it needs only about n/m character
comparisons, that is O(n/m), which is sublinear in n because it skips some characters in the
text without comparing them. This typically happens when the alphabet is large and the
characters of the text rarely occur in the pattern.
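The bad character table and the shift it suggests can be sketched in Python as follows. This is a minimal illustration, not a full Boyer-Moore implementation; the function names `build_last_occurrence` and `bad_character_shift` are hypothetical and chosen here for clarity.

```python
def build_last_occurrence(pattern: str) -> dict[str, int]:
    """Map each character to the index of its last occurrence in the pattern."""
    # Later occurrences overwrite earlier ones, so the last index is kept.
    return {ch: i for i, ch in enumerate(pattern)}


def bad_character_shift(last: dict[str, int], j: int, bad_char: str) -> int:
    """Shift suggested when pattern[j] mismatches bad_char in the text."""
    # If bad_char is absent, last.get returns -1 and the pattern moves past it.
    # max(1, ...) avoids a backward shift when the last occurrence of bad_char
    # lies to the right of the mismatch position j.
    return max(1, j - last.get(bad_char, -1))


last = build_last_occurrence("ABC")
print(last)                                # {'A': 0, 'B': 1, 'C': 2}
print(bad_character_shift(last, 2, "B"))   # 1
print(bad_character_shift(last, 2, "E"))   # 3
```

A character that does not occur in the pattern ("E" here) lets the whole pattern jump past it, which is where the large skips come from.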

To illustrate how the Boyer-Moore algorithm works, let's consider an example −

Input:
Main string: "AABAAABCEDBABCDDEBC" and pattern: "ABC"
Output:
Pattern found at position: 5
Pattern found at position: 11
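The example above can be reproduced with a simplified Python sketch. Note the assumption: this version applies only the bad character heuristic and omits the good suffix table, so it is a reduced variant of Boyer-Moore rather than the complete algorithm. The function name `boyer_moore_bad_char` is chosen here for illustration.

```python
def boyer_moore_bad_char(text: str, pattern: str) -> list[int]:
    """Return all 0-based positions where pattern occurs in text.

    Simplified sketch: uses only the bad character heuristic.
    """
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    # Last occurrence of each pattern character.
    last = {ch: i for i, ch in enumerate(pattern)}
    matches = []
    s = 0  # current alignment of the pattern against the text
    while s <= n - m:
        j = m - 1
        # Compare from the last character of the pattern towards the first.
        while j >= 0 and pattern[j] == text[s + j]:
            j -= 1
        if j < 0:
            matches.append(s)
            s += 1  # conservative shift after a complete match
        else:
            # Align the last occurrence of the bad character, moving at least 1.
            s += max(1, j - last.get(text[s + j], -1))
    return matches


print(boyer_moore_bad_char("AABAAABCEDBABCDDEBC", "ABC"))  # [5, 11]
```

Shifting by one after a complete match keeps the sketch short; a full implementation would use the good suffix table to shift further where possible.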

2) Brute-Force String Search Algorithm

The Brute-Force (or Naive) String Search algorithm searches for a string (also called the
pattern) within a larger string (the text).
It compares the pattern with the text, character by character, at each index of the text.
If all characters of the pattern match at an index, a match is reported and the search can stop.
If not, it shifts to the next index of the text and checks again.
It has a worst-case complexity of O(mn), where m is the length of the pattern and n is the
length of the text.

A brute force algorithm is a straightforward approach to solving a problem. The term also
refers to a programming style that does not include any shortcuts to improve performance.
• It is based on trial and error: the programmer relies on the computer's fast
processing power to solve a problem, rather than applying more advanced
algorithms and techniques.
• It may cost more time, and sometimes more space, than a specialised algorithm.
• A simple example of applying brute force is linearly searching for an element
in an array. Comparing each and every element of the array with the value being
searched for is a brute force approach, as it is the most direct and simple way
one could think of to search the array.
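The linear search mentioned above can be sketched in Python as follows. The function name `linear_search` and the sample data are chosen here for illustration.

```python
def linear_search(items: list[int], target: int) -> int:
    """Return the index of target in items, or -1 if it is not present."""
    # Compare every element with the target, one by one.
    for i, value in enumerate(items):
        if value == target:
            return i
    return -1


print(linear_search([7, 3, 9, 4], 9))  # 2
```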

Brute Force Pattern Matching Algorithm

1. Start with the pattern window aligned at the beginning of the text.
2. At the current position, compare the characters in the pattern with the corresponding
characters in the text, from left to right.
3. If all characters in the pattern match the corresponding characters in the text, record
the starting position of the match.
4. Whether or not a match was found, move the pattern window one position to the right
in the text.
5. Repeat steps 2-4 until the pattern window would extend past the end of the text.


Pseudo-code
function bruteForcePatternMatch(T, P):
    n = length(T)
    m = length(P)

    for i from 0 to n - m:
        j = 0
        while j < m and T[i + j] == P[j]:
            j = j + 1
        if j == m:
            return i // Pattern found at position i
    return -1 // Pattern not found
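The pseudo-code above returns only the first match. A minimal Python sketch that records every match, as the numbered steps describe, might look like this; the function name `brute_force_all` is chosen here for illustration.

```python
def brute_force_all(text: str, pattern: str) -> list[int]:
    """Return every 0-based position at which pattern occurs in text."""
    n, m = len(text), len(pattern)
    if m == 0:
        return []  # assumption: an empty pattern is treated as "no match"
    positions = []
    # Try every alignment of the pattern window against the text.
    for i in range(n - m + 1):
        j = 0
        while j < m and text[i + j] == pattern[j]:
            j += 1
        if j == m:
            positions.append(i)  # all m characters matched at position i
    return positions


print(brute_force_all("AABAAABCEDBABCDDEBC", "ABC"))  # [5, 11]
```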


3) Aho-Corasick Algorithm for Pattern Searching

• Proposed in 1975 by Alfred Aho and Margaret Corasick, the Aho-Corasick algorithm
is an efficient approach for searching for a number of strings in a given text at the
same time.
• If we take the straightforward approach of running a single-pattern algorithm such as
KMP once for each pattern, the text is scanned once per pattern, which takes much
longer when there are many patterns. And that is one of the things your interviewer
wouldn't like! So, instead of searching for each pattern one by one, we do slightly
more complex programming and build an automaton from all the given words.
• Because it matches a whole set (dictionary) of patterns in a single pass over the text,
it is also known as a dictionary-matching algorithm. The automaton is described by
three functions:
1. Go-To
2. Failure
3. Output

Example:
In this portion, we walk through how the Aho-Corasick algorithm works for a
given text and a given set of patterns.
1) Preprocessing: This step happens before the search itself and builds the data
structures that the search relies on.
First, build a trie of all the words that are to be found in the given string.


Second, extend the trie into an automaton so that the search time becomes linear in the length of the text plus the number of matches reported.

2) Go-To: After building the trie, we move on to the first phase of pattern-matching.
The edges of the trie form the go-to function. For any character that does not have an
outgoing edge at the root, we add an edge from the root back to itself.
3) Failure: For each state, using a breadth-first traversal, we find the longest proper
suffix of the string spelled by that state that is also a prefix of some pattern. The failure
link of the state points to the state for that suffix and is followed on a mismatch.

4) Output: For each state, the indices of all words that end at that state are stored (for
example in a bitmap), to ease the retrieval process.
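The three phases can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the go-to function is stored as one dictionary per state, and the output of each state is a plain list of pattern indices instead of a bitmap. The function names `build_automaton` and `search` are chosen here for illustration.

```python
from collections import deque


def build_automaton(patterns):
    """Build the go-to, failure and output tables for a list of patterns."""
    goto = [{}]   # goto[state][char] -> next state; state 0 is the root
    fail = [0]    # fail[state] -> state of the longest proper suffix
    out = [[]]    # out[state] -> indices of patterns that end at this state

    # Go-To: insert every pattern into a trie.
    for index, pattern in enumerate(patterns):
        state = 0
        for ch in pattern:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(index)

    # Failure: breadth-first traversal, starting from the root's children.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Output: also report patterns that end at the failure state.
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def search(text, patterns):
    """Return (start_index, pattern) pairs for every occurrence in text."""
    goto, fail, out = build_automaton(patterns)
    state = 0
    found = []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        # A character with no edge at the root keeps the automaton at the root.
        state = goto[state].get(ch, 0)
        for index in out[state]:
            found.append((i - len(patterns[index]) + 1, patterns[index]))
    return found


print(search("ahishers", ["he", "she", "his", "hers"]))
# [(1, 'his'), (3, 'she'), (4, 'he'), (4, 'hers')]
```

The text is scanned once, and all four patterns are found in that single pass, including the overlapping matches "she", "he" and "hers".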
