FST Explained

An FSM is a model of computation based on states where only one state can be active at a time, requiring transitions between states. An FST is similar to an FSM but also produces output when reading input, making it useful for parsing. FSTs consist of states linked by input/output pair transitions and can efficiently parse regular languages in linear time. The ARPA-MIT LM format stores n-gram language models with header describing contents and body containing probability values and word sequences for each n-gram section.

FSM (Finite State Machine)

A finite-state machine, or FSM for short, is a model of computation
based on a hypothetical machine made of one or more states. Only a
single state can be active at the same time, so the machine must
transition from one state to another in order to perform different
actions.
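
As a minimal sketch of this idea (the states, alphabet and accepting condition below are invented for illustration), an FSM can be written in Python as a transition table that maps a (state, input) pair to the next state:

# FSM sketch: recognizes binary strings containing an even number of 1s.
# The state names ("EVEN", "ODD") and the alphabet are illustrative only.
TRANSITIONS = {
    ("EVEN", "0"): "EVEN",
    ("EVEN", "1"): "ODD",
    ("ODD", "0"): "ODD",
    ("ODD", "1"): "EVEN",
}

def accepts(string, start="EVEN", accepting=("EVEN",)):
    state = start                                # only one state is active at a time
    for symbol in string:
        state = TRANSITIONS[(state, symbol)]     # transition on each input symbol
    return state in accepting

print(accepts("1001"))   # True  - two 1s
print(accepts("1101"))   # False - three 1s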
FST (Finite State Transducer)

A finite state transducer (FST) is a finite state automaton (FSA, FA) that produces output as well as
reading input, which makes it useful for parsing (a "bare" FSA can only be used for recognition,
i.e. pattern matching).

An FST consists of a finite number of states which are linked by transitions labeled with an input/output
pair. The FST starts out in a designated start state and jumps to different states depending on the input,
while producing output according to its transition table.
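
A rough sketch of these mechanics in Python (the states, symbols and outputs below are invented for illustration) labels every transition with an input/output pair and emits the output half while walking the table:

# FST sketch: transitions[state][input_symbol] = (next_state, output_symbol)
# States, symbols and outputs are illustrative only.
transitions = {
    "q0": {"a": ("q1", "A")},
    "q1": {"b": ("q0", "B")},
}

def transduce(symbols, start="q0"):
    state = start
    output = []
    for sym in symbols:
        state, out = transitions[state][sym]   # jump to the next state...
        output.append(out)                     # ...while producing output
    return "".join(output)

print(transduce("abab"))   # prints "ABAB"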

FSTs are useful in NLP and speech recognition because they have nice algebraic properties, most notably
that they can be freely combined (they form an algebra) under composition, which implements relational
composition on regular relations (think of this as non-deterministic function composition) while staying
very compact. FSTs can parse regular languages into output strings in linear time.

As an example, I once implemented morphological parsing as a bunch of FSTs. My main FST for verbs
would turn a regular verb, say "walked", into "walk+PAST". I also had an FST for the verb "to be", which
would turn "is" into "be+PRESENT+3rd" (3rd person), and similarly for other irregular verbs. All the FSTs
were combined into a single one using an FST compiler, which produced a single FST that was much
smaller than the sum of its parts and ran very fast. FSTs can be built by a variety of tools that accept an
extended regular expression syntax.
The ARPA-MIT LM format

An ARPA-style language model file comes in two parts - the header and
the n-gram definitions. The header contains a description of the contents
of the file.
<header> = { ngram <int>=<int> }
The first <int> gives the n-gram order and the second <int> gives
the number of n-gram entries stored.

For example, a trigram language model consists of three sections - the
unigram, bigram and trigram sections respectively. The corresponding
entry in the header indicates the number of entries for that section. This
can be used to aid the loading-in procedure. The body part contains all
sections of the language model and is defined as follows:
<body> = { <lmpart1> } <lmpart2>
<lmpart1> = \<int>-grams:
{ <ngramdef1> }
<lmpart2> = \<int>-grams:
{ <ngramdef2> }
<ngramdef1> = <float> { <word> } <float>
<ngramdef2> = <float> { <word> }
Each n-gram definition starts with a probability value stored
as log10, followed by a sequence of n words describing the actual n-gram.
In all sections except the last one this is followed by a back-off
weight, which is also stored as log10. The following example shows an
extract from a trigram language model stored in the ARPA-text format.
\data\
ngram 1=19979
ngram 2=4987955
ngram 3=6136155

\1-grams:
-1.6682 A -2.2371
-5.5975 A'S -0.2818
-2.8755 A. -1.1409
-4.3297 A.'S -0.5886
-5.1432 A.S -0.4862
...

\2-grams:
-3.4627 A BABY -0.2884
-4.8091 A BABY'S -0.1659
-5.4763 A BACH -0.4722
-3.6622 A BACK -0.8814
...

\3-grams:
-4.3813 !SENT_START A CAMBRIDGE
-4.4782 !SENT_START A CAMEL
-4.0196 !SENT_START A CAMERA
-4.9004 !SENT_START A CAMP
-3.4319 !SENT_START A CAMPAIGN
...
\end\
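
A minimal reader for this format might look as follows. This is only a sketch (it assumes a well-formed file and does no error handling), but it shows how the header counts and the per-section entries line up with the grammar above:

# Sketch of an ARPA-MIT LM reader (assumes a well-formed file).
def read_arpa(path):
    counts = {}    # n-gram order -> number of entries (from the header)
    ngrams = {}    # n-gram order -> {words: (log10 prob, log10 back-off or None)}
    order = None
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line == "\\data\\":
                continue
            if line == "\\end\\":
                break
            if line.startswith("ngram "):            # header: ngram <int>=<int>
                n, count = line[len("ngram "):].split("=")
                counts[int(n)] = int(count)
            elif line.endswith("-grams:"):           # start of a new section
                order = int(line[1:line.index("-")])
            elif order is not None:
                fields = line.split()
                prob = float(fields[0])              # log10 probability
                if order < max(counts) and len(fields) == order + 2:
                    words = tuple(fields[1:-1])
                    backoff = float(fields[-1])      # log10 back-off weight
                else:
                    words = tuple(fields[1:])
                    backoff = None                   # the last section has no back-off
                ngrams.setdefault(order, {})[words] = (prob, backoff)
    return counts, ngrams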
Discrete Cosine Transform

A discrete cosine transform (DCT) expresses a finite sequence
of data points in terms of a sum of cosine functions oscillating at
different frequencies. It is a widely used transformation technique in
signal processing and data compression (e.g. JPEG).
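
A direct, unoptimized sketch of the most common variant, the DCT-II, makes the "sum of cosines" idea explicit (no normalization is applied here):

import math

# DCT-II sketch: each output coefficient X[k] is a sum of the input samples
# weighted by cosines of increasing frequency k.
def dct2(x):
    N = len(x)
    return [sum(x[n] * math.cos(math.pi / N * (n + 0.5) * k) for n in range(N))
            for k in range(N)]

print(dct2([1.0, 2.0, 3.0, 4.0]))   # first coefficient is the sum of the inputs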
CMVN
CMVN (cepstral mean and variance normalization) minimizes distortion
caused by noise contamination and enables robust feature extraction by
linearly transforming the cepstral coefficients so that they have the
same segmental statistics.[2] Cepstral normalization has been effective
in CMU Sphinx for maintaining a high level of recognition accuracy over
a wide variety of acoustic environments.
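
One common way to realize this linear transform is to subtract the per-utterance mean and divide by the per-utterance standard deviation of each cepstral coefficient; the sketch below assumes the features are stored as a (frames x coefficients) NumPy array:

import numpy as np

# CMVN sketch: normalize each cepstral coefficient over the utterance so the
# resulting features have zero mean and unit variance.
def cmvn(features):
    mean = features.mean(axis=0)
    std = features.std(axis=0) + 1e-10   # avoid division by zero
    return (features - mean) / std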
