Content-Length: 375377 | pFad | http://github.com/cp-algorithms/cp-algorithms/commit/6c9567b646ef3fe709b8a7e41f7f28a4ffaf4283

A8 Start Prefix Function article · cp-algorithms/cp-algorithms@6c9567b · GitHub
Skip to content

Commit 6c9567b

Browse files
authored
Start Prefix Function article
1 parent 0a23a35 commit 6c9567b

File tree

1 file changed

+76
-0
lines changed

1 file changed

+76
-0
lines changed

src/string/prefix-function.md

Lines changed: 76 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,76 @@
1+
<!--?title Prefix function. Knuth–Morris–Pratt algorithm -->
2+
3+
# Prefix function. Knuth–Morris–Pratt algorithm
4+
5+
## Prefix function definition
6+
7+
You are given a string $s$ of length $n$ $s[0..n-1]$. **Prefix function** for this string is defined as an array $\pi$
8+
of length $n$ $\pi[0..n-1]$, where $\pi[i]$ is the length of the longest proper prefix of a substring $s[0...i]$
9+
which is also a suffix of this substring. A proper prefix of a string is a prefix that is not equal to the string itself.
10+
By definition, $\pi[0] = 0$.
11+
12+
Mathematically the definition of the prefix function can be written as follows:
13+
14+
$$\pi[i] = \max_ {k = 0..i} \\{k : s[0..k-1] = s[i-(k-1)..i] \\}$$
15+
16+
For example, prefix function of string "abcabcd" is $[0, 0, 0, 1, 2, 3, 0]$, and prefix function of string "aabaaab" is $[0, 1, 0, 1, 2, 2, 3]$.
17+
18+
## Trivial Algorithm
19+
20+
An algorithm which follows the definition of prefix function exactly is the following:
21+
22+
```cpp
23+
vector<int> prefix_function (string s) {
24+
int n = (int) s.length();
25+
vector<int> pi (n);
26+
for (int i=0; i<n; ++i)
27+
for (int k=0; k<=i; ++k)
28+
if (s.substr(0,k) == s.substr(i-k+1,k))
29+
pi[i] = k;
30+
return pi;
31+
}
32+
```
33+
34+
It is easy to see that its complexity is $O(n^3)$, which has room for improvement.
35+
36+
## Efficient Algorithm
37+
38+
This algorithm was proposed by Knuth and Pratt and independently from them by Morris in 1977.
39+
It was used as the main element of a substring search algorithm.
40+
41+
### First Optimization
42+
43+
### Second Optimization
44+
45+
### Final Algorithm
46+
47+
### Implementation
48+
49+
The implementation ends up being surprisingly short and expressive.
50+
51+
```cpp
52+
vector<int> prefix_function (string s) {
53+
int n = (int) s.length();
54+
vector<int> pi (n);
55+
for (int i=1; i<n; ++i) {
56+
int j = pi[i-1];
57+
while (j > 0 && s[i] != s[j])
58+
j = pi[j-1];
59+
if (s[i] == s[j]) ++j;
60+
pi[i] = j;
61+
}
62+
return pi;
63+
}
64+
```
65+
66+
This is an *online* algorithm, i.e. it processes the data as it arrives - for example, you can read the string
67+
characters one by one and process them immediately, finding the value of prefix function for each next character.
68+
The algorithm still requires storing the string itself and the previously calculated values of prefix function,
69+
but if we know beforehand the maximum value $M$ the prefix function can take on the string, we can store only $M+1$
70+
first characters of the string and the same number of values of the prefix function.
71+
72+
## Practice Problems
73+
74+
* [UVA # 455 "Periodic Strings"](http://uva.onlinejudge.org/index.php?option=onlinejudge&page=show_problem&problem=396)
75+
* [UVA # 11022 "String Factoring"](http://uva.onlinejudge.org/index.php?option=onlinejudge&page=show_problem&problem=1963)
76+
* [UVA # 11452 "Dancing the Cheeky-Cheeky"](http://uva.onlinejudge.org/index.php?option=onlinejudge&page=show_problem&problem=2447)

0 commit comments

Comments
 (0)








ApplySandwichStrip

pFad - (p)hone/(F)rame/(a)nonymizer/(d)eclutterfier!      Saves Data!


--- a PPN by Garber Painting Akron. With Image Size Reduction included!

Fetched URL: http://github.com/cp-algorithms/cp-algorithms/commit/6c9567b646ef3fe709b8a7e41f7f28a4ffaf4283

Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy