Commit fbc44e2

typo
1 parent ee1f5eb commit fbc44e2

2 files changed: +6 -7 lines changed

README.md

Lines changed: 6 additions & 6 deletions
@@ -4,7 +4,7 @@ Connectionist Temporal Classification (CTC) decoding algorithms are implemented
 
 ## Run demo
 Go to the `src/` directory and run the script ```python main.py```.
-Adding the command line parameter ```gpu``` additionally executes best path decoding on the GPU.
+Appending the command line parameter ```gpu``` additionally executes best path decoding on the GPU.
 
 Expected results:
 ```
@@ -42,28 +42,28 @@ BEST PATH GPU : "the fak friend of the fomly hae tC"
 * Prefix Search Decoding: best-first search through tree of labelings. File: `PrefixSearch.py` \[1\]
 * Beam Search Decoding: iteratively searches for best labeling in a tree of labelings, optionally uses a character-level LM. File: `BeamSearch.py` \[2\] \[5\]
 * Token Passing: searches for most probable word sequence. The words are constrained to those contained in a dictionary. Can be extended to use a word-level LM. File: `TokenPassing.py` \[1\]
-* Lexicon Search: use approximation from best path decoding to find similar words in dictionary and return the one with highest score. File: `LexiconSearch.py` \[3\]
+* Lexicon Search: computes approximation with best path decoding to find similar words in dictionary. Returns the one with highest score. File: `LexiconSearch.py` \[3\]
 * Loss: calculates probability and loss of a given text in the RNN output. File: `Loss.py` \[1\] \[6\]
 * Word Beam Search: TensorFlow implementation see repository [CTCWordBeamSearch](https://github.com/githubharald/CTCWordBeamSearch)
 
 
 ## Choosing the right algorithm
-[This paper](./doc/comparison.pdf) compares beam search decoding and tpassing.
+[This paper](./doc/comparison.pdf) compares beam search decoding and token passing.
 It gives suggestions when to use best path decoding, beam search decoding and token passing.
 
 
 ## Testcases
 
-Illustration of the **Mini example** testcase: the RNN output matrix contains 2 time-steps (t0 and t1) and 3 labels (a, b and - representing the CTC-blank).
+The RNN output matrix of the **Mini example** testcase contains 2 time-steps (t0 and t1) and 3 labels (a, b and - representing the CTC-blank).
 Best path decoding (see left figure) takes the most probable label per time-step which gives the path "--" and therefore the recognized text "" with probability 0.6\*0.6=0.36.
 Beam search, prefix search and token passing calculate the probability of labelings.
 For the labeling "a" these algorithms sum over the paths "-a", "a-" and "aa" (see right figure) with probability 0.6\*0.4+0.4\*0.6+0.4*0.4=0.64.
 The only path which gives "" still has probability 0.36, therefore "a" is the result returned by beam search, prefix search and token passing.
 
 ![mini](./doc/mini.png)
 
-The **Word example** testcase contains a single word.
-It is used to test the lexicon search \[3\].
+The **Word example** testcase contains a single word from the IAM Handwriting Database \[4\].
+It is used to test lexicon search \[3\].
 RNN output was generated with the [SimpleHTR](https://github.com/githubharald/SimpleHTR) model.
 Lexicon search first computes an approximation with best path decoding, then searches for similar words in a dictionary, and finally scores them by computing the loss and returning the most probable dictionary word.
 Best path decoding outputs "aircrapt", lexicon search is able to find similar words like "aircraft", "airplane", ... in the dictionary, calculates a score for each of them and finally returns "aircraft", which is the correct result.
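
The arithmetic of the Mini example in the README text above can be checked with a small brute-force sketch. It is not the repository's implementation: the names `MAT`, `collapse`, `best_path` and `labeling_probs` are hypothetical, and the probability 0.0 for label "b" is an assumption inferred from 0.4 + 0.6 = 1. Enumerating all paths only works for tiny matrices like this one.

```python
import itertools
from collections import defaultdict

import numpy as np

# Rows are the time-steps t0 and t1; columns are the labels "a", "b" and the CTC blank "-".
# Assumption: "b" has probability 0.0 (inferred, not stated in the README text).
CHARS = "ab"
BLANK = len(CHARS)
MAT = np.array([[0.4, 0.0, 0.6],
                [0.4, 0.0, 0.6]])


def collapse(path):
    """Map a path to its labeling: merge repeated labels, then drop blanks."""
    merged = [k for k, _ in itertools.groupby(path)]
    return "".join(CHARS[k] for k in merged if k != BLANK)


def best_path(mat):
    """Take the most probable label per time-step and collapse the resulting path."""
    path = np.argmax(mat, axis=1)
    prob = float(np.prod(mat[np.arange(len(path)), path]))
    return collapse(path.tolist()), prob


def labeling_probs(mat):
    """Sum the path probabilities per labeling by enumerating every possible path."""
    num_steps, num_labels = mat.shape
    totals = defaultdict(float)
    for path in itertools.product(range(num_labels), repeat=num_steps):
        totals[collapse(path)] += float(np.prod([mat[t, k] for t, k in enumerate(path)]))
    return dict(totals)


text, prob = best_path(MAT)
probs = labeling_probs(MAT)
best = max(probs, key=probs.get)
print(repr(text), round(prob, 2))         # '' 0.36
print(repr(best), round(probs[best], 2))  # 'a' 0.64
```

Best path decoding only sees the single path "--", while summing over "-a", "a-" and "aa" gives the labeling "a" the larger total, which is why the two approaches disagree on this matrix.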

src/main.py

Lines changed: 0 additions & 1 deletion
@@ -27,7 +27,6 @@ def softmax(mat):
         e = np.exp(y)
         s = np.sum(e)
         res[t, :] = e/s
-
     return res
 
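
For context, the hunk above only shows the tail of `softmax`. A minimal sketch of a row-wise softmax consistent with those visible lines might look as follows. How `y` is computed is not shown in the diff, so the max-shift used here is an assumption added for numerical stability, not necessarily what `src/main.py` does.

```python
import numpy as np


def softmax(mat):
    """Row-wise softmax: each time-step (row) becomes a probability distribution."""
    res = np.zeros(mat.shape)
    for t in range(mat.shape[0]):
        # Assumption: shift by the row maximum so np.exp cannot overflow.
        y = mat[t, :] - np.max(mat[t, :])
        e = np.exp(y)
        s = np.sum(e)
        res[t, :] = e / s
    return res


probs = softmax(np.array([[1.0, 2.0, 3.0],
                          [0.0, 0.0, 0.0]]))
```

Each row of `probs` sums to 1, and a row of equal inputs yields a uniform distribution over the labels.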
