Use log probabilities in beam search. #23
child_beam.pr_text = parent_beam.pr_text * bigram_prob  # probability of char sequence
bigram_prob = lm_factor * np.log(lm.get_char_bigram(c1, c2))
if parent_beam.is_empty():
    child_beam.pr_text = bigram_prob  # first char in beam
No special case is needed here (no if/else): an empty beam is initialized with pr_text = log(1) = 0, so computing parent_beam.pr_text + bigram_prob works for the first character as well.
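To illustrate the point above, here is a minimal sketch, not the repository's code. The helper `extend_pr_text` and the constant `EMPTY_BEAM_PR_TEXT` are hypothetical names, and the bigram probabilities are made-up values. It only shows that starting from log(1) = 0 lets the first character use the same addition as every later one.

```python
import numpy as np

# Hypothetical constant: log-score of an empty beam, log(1) = 0.
EMPTY_BEAM_PR_TEXT = 0.0


def extend_pr_text(parent_pr_text: float, bigram_prob: float, lm_factor: float = 1.0) -> float:
    """Add the weighted log bigram probability to the parent's log text score.

    bigram_prob is a plain probability in (0, 1]; the result is a log-score.
    """
    return parent_pr_text + lm_factor * np.log(bigram_prob)


# The first character needs no special branch: 0 + log(p) == log(p).
first = extend_pr_text(EMPTY_BEAM_PR_TEXT, 0.5)
# Later characters use the exact same update.
second = extend_pr_text(first, 0.25)
```

With these illustrative values, `second` equals log(0.5 * 0.25), the same quantity the original multiplicative update would have produced before taking the log.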
Thanks for picking this up.
@@ -75,8 +81,8 @@ def beam_search(mat: np.ndarray, labels: str, beam_width: int = 25, lm: Optional
    last = BeamState()
    labeling = ()
    last.entries[labeling] = BeamEntry()
    last.entries[labeling].pr_blank = 1
    last.entries[labeling].pr_total = 1
    last.entries[labeling].pr_blank = LOG_ZERO
This should be log(1) == 0 instead of LOG_ZERO, which is -inf.
That's true, and it also allows the if statements below to be cleaned up. Thanks for picking this up.
# in case of non-empty beam
if labeling:
    # probability of paths with repeated last char at the end
    pr_non_blank = last.entries[labeling].pr_non_blank * mat[t, labeling[-1]]
"cannot add to -inf" -> I think this is due to initializing with log(0) instead of log(1).
-inf + x == -inf is the expected behavior, because 0 * x = 0.
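The arithmetic behind this comment can be checked with a small sketch. `LOG_ONE` is a hypothetical name introduced here for illustration, and combining two path scores with `np.logaddexp` is an assumption about how pr_total could be formed in log space, not something stated in this thread.

```python
import numpy as np

LOG_ZERO = float("-inf")  # log(0): an impossible path
LOG_ONE = 0.0             # log(1): a certain path (hypothetical name)

log_p = np.log(0.3)

# Multiplying by probability 0 stays 0; in log space, -inf + x stays -inf.
stays_zero = LOG_ZERO + log_p

# Multiplying by probability 1 changes nothing; in log space, 0 + x == x.
unchanged = LOG_ONE + log_p

# Adding probabilities in log space (assumed here for pr_total):
# log(exp(0) + exp(-inf)) == log(1 + 0) == 0.
pr_total = np.logaddexp(LOG_ONE, LOG_ZERO)
```

So an empty beam initialized with pr_blank = 0 and pr_non_blank = -inf behaves like probabilities 1 and 0, and no extra branch for -inf is needed.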
# probability of paths ending with a blank
pr_blank = last.entries[labeling].pr_total * mat[t, blank_idx]
if last.entries[labeling].pr_total == LOG_ZERO:
same as above
# probability of paths ending with a blank
pr_blank = last.entries[labeling].pr_total * mat[t, blank_idx]
if last.entries[labeling].pr_total == LOG_ZERO:
    pr_blank = np.log(mat[t, blank_idx])  # cannot add to -inf
Calling np.log directly emits a warning for np.log(0).
Define a wrapper function that takes care of that:
def log(x):
    with np.errstate(divide='ignore'):
        return np.log(x)
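As a usage sketch for the wrapper above: the matrix values are made up, and converting the whole matrix once up front is an assumption about how the wrapper might be applied, not a description of the merged code. The warnings filter is there only to show that no RuntimeWarning escapes.

```python
import warnings

import numpy as np


def log(x):
    """np.log that silently maps 0 to -inf instead of emitting a warning."""
    with np.errstate(divide='ignore'):
        return np.log(x)


# Illustrative 2x2 output matrix containing an exact zero.
mat = np.array([[0.0, 1.0],
                [0.25, 0.75]])

with warnings.catch_warnings():
    warnings.simplefilter("error")  # any warning would raise here
    log_mat = log(mat)              # the zero entry becomes -inf, quietly
```

Entries that were 0 become -inf and entries that were 1 become 0, which matches the log(0) and log(1) conventions discussed earlier in this review.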
Thanks for your contribution. Using log-probs is a good idea 👍.
I'll be pushing some changes to the repo anyway, so I'll integrate your changes and make the small modifications myself.
Thanks @githubharald for accepting the pull request. I've taken a look at the changes you've implemented since, and they look good. Thanks for your improvements and for maintaining the repo.
Implemented log-space operations in beam search to avoid underflow when dealing with regular probabilities.
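The underflow mentioned in the summary can be reproduced with a short sketch. The sequence length of 200 and the per-step probability of 0.01 are arbitrary illustrative values, not numbers from this pull request.

```python
import numpy as np

# 200 time steps, each contributing a probability of 0.01 (illustrative values).
probs = np.full(200, 0.01)

# The true product is 1e-400, far below the smallest positive float64,
# so the plain product collapses to exactly 0.0.
product = np.prod(probs)

# Summing logs keeps the same information as a finite, moderately sized number.
log_sum = np.sum(np.log(probs))
```

Once the plain product hits 0.0, all beams with long labelings become indistinguishable, whereas their log-scores can still be compared.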