Skip to content

Commit acc62aa

Browse files
Mrityunjai SinghMrityunjai Singh
authored andcommitted
aho corasick text change
1 parent 4685762 commit acc62aa

File tree

1 file changed

+3
-1
lines changed

1 file changed

+3
-1
lines changed

src/string/aho_corasick.md

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -36,6 +36,7 @@ Accordingly, a trie for a set of strings is a trie such that each $\text{output}
3636
We now describe how to construct a trie for a given set of strings in linear time with respect to their total length.
3737

3838
We introduce a structure for the vertices of the tree:
39+
3940
```{.cpp file=aho_corasick_trie_definition}
4041
const int K = 26;
4142

@@ -209,7 +210,7 @@ Assume that at the moment we stand in a vertex $v$ and consider a character $c$.
209210
1. $go[v][c] = -1$. In this case, we may assign $go[v][c] = go[u][c]$, which is already known by the induction hypothesis;
210211
2. $go[v][c] = w \neq -1$. In this case, we may assign $link[w] = go[u][c]$.
211212
212-
In this way, we spend $O(1)$ time per each pair of a vertex and a character, making the running time $O(mk)$. The major overhead here is that we copy a lot of transitions from $u$ in the first case, while the transitions of the second case form the trie and sum up to $m$ over all vertices. To avoid the copying of $go[u][c]$, we may use a persistent array data structure, using which we initially copy $go[u]$ into $go[v]$ and then only update values for characters in which the transition would differ. This leads to the $O(m \log k)$ algorithm.
213+
In this way, we spend $O(1)$ time per each pair of a vertex and a character, making the running time $O(mk)$. The major overhead here is that we copy a lot of transitions from $u$ in the first case, while the transitions of the second case form the trie and sum up to $m$ over all vertices. To avoid the copying of $go[u][c]$, we may use a persistent array data structure, using which we initially copy $go[u]$ into $go[v]$ and then only update values for characters in which the transition would differ. This leads to the $O(n \log k)$ algorithm.
213214
214215
## Applications
215216
@@ -277,4 +278,5 @@ Thus we can find such a path using depth first search (and if the search looks
277278
- [CodeChef - TWOSTRS](https://www.codechef.com/MAY20A/problems/TWOSTRS)
278279
279280
## References
281+
280282
- [Stanford's CS166 - Aho-Corasick Automata](http://web.stanford.edu/class/archive/cs/cs166/cs166.1166/lectures/02/Slides02.pdf) ([Condensed](http://web.stanford.edu/class/archive/cs/cs166/cs166.1166/lectures/02/Small02.pdf))

0 commit comments

Comments
 (0)
pFad - Phonifier reborn

Pfad - The Proxy pFad of © 2024 Garber Painting. All rights reserved.

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.


Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy