Skip to content

Commit d8df2dd

Browse files
authored
complete issue #924
I've now resolved issue #924. I added new techniques to calculating the total length of different strings.
1 parent da025de commit d8df2dd

File tree

1 file changed

+21
-1
lines changed

1 file changed

+21
-1
lines changed

src/string/suffix-automaton.md

Lines changed: 21 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -510,7 +510,7 @@ long long get_diff_strings(){
510510
}
511511
```
512512

513-
While this is also $O(length(S))$, it requires no extra space besides (what's used for the suffix automaton construction) and no recursive calls, consequently running faster in practice.
513+
While this is also $O(length(S))$, it requires no extra space and no recursive calls, consequently running faster in practice.
514514

515515
### Total length of all different substrings
516516

@@ -529,6 +529,26 @@ We take the answer of each adjacent vertex $w$, and add to it $d[w]$ (since ever
529529

530530
Again this task can be computed in $O(length(S))$ time.
531531

532+
Alternatively, we can, again, take advantage of the fact that each state $v$ matches to substrings of length $[minlen(v),len(v)]$.
533+
Since $minlen(v) = 1 + len(link(v))$ and the arimetic series formula $S[n] = n * (a[1]+a[n]) / 2$ (where $S[n]$ denotes the sum of $n$ terms,$a[1]$ representing the first term, and $a[n]$ representing the last), we can compute the length of substrings at a state in constant time. We then sum up these totals for each state $v \neq t[0]$ in the automaton. This is shown by the code below:
534+
535+
```cpp
536+
long long get_tot_len_diff_substings() {
537+
long long tot{};
538+
for(int i=1;i<sz;i++) {
539+
long long shortest=st[st[i].link].len+1;
540+
long long longest=st[i].len;
541+
542+
long long num_strings=longest-shortest+1;
543+
544+
long long cur=num_strings*(longest+shortest)/2;
545+
tot += cur;
546+
}
547+
return tot;
548+
}
549+
```
550+
This approaches runs in $O(length(S))$ time, but experimentally runs 20x faster than the memoized dynamic programming version on randomized strings. It requires no extra space and not recursion.
551+
532552
### Lexicographically $k$-th substring {data-toc-label="Lexicographically k-th substring"}
533553

534554
Given a string $S$.

0 commit comments

Comments
 (0)
pFad - Phonifier reborn

Pfad - The Proxy pFad of © 2024 Garber Painting. All rights reserved.

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.


Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy