0% found this document useful (0 votes)

31 views5 pages

Unconstrained Offline Handwritten Word

ME Refrerence Paper ieee

Uploaded by

padmanath2001

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

31 views5 pages

Unconstrained Offline Handwritten Word

ME Refrerence Paper ieee

Uploaded by

padmanath2001

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 5

IEEE SIGNAL PROCESSING LETTERS, VOL. 26, NO.

4, APRIL 2019 597

Unconstrained Offline Handwritten Word

Recognition by Position Embedding
Integrated ResNets Model
Xiangping Wu , Qingcai Chen , Member, IEEE, Jinghan You, and Yulun Xiao

Abstract—The state-of-the-art methods usually integrate with [13]–[15]. In these tasks, the convolutional neural network
linguistic knowledge in the recognizer, which makes models more (CNN) is usually used to extract low/mid/high-level image
complicated and hard for resource-lacking languages. This let- features automatically. For example, Xie et al. [16] presented
ter proposes a new method for unconstrained offline handwritten
word recognition by combining position embeddings with resid- a multi-spatial-context fully convolutional recurrent network
ual networks (ResNets) and bidirectional long short-term memory (MC-FCRN). Jaderberg et al. [17] developed a character se-
(BiLSTM) networks. At first, ResNets are used to extract abun- quence model by using a CNN with multiple position-sensitive
dant features from the input image. Then, position embeddings character classifier. When an image contains a long sequence of
are used as indices of the character sequence corresponding to characters, they need to build a lot of classifiers. To improve the
a word. By combining the ResNets features with each position
capability of handling misalignment between inputs and target
embedding, the model generates different inputs for the BiLSTM
networks. Finally, the state sequence of the BiLSTM is used to labels, the long short-term memory (LSTM) [18] network com-
recognize corresponding characters. Without additional language bining with connectionist temporal classification (CTC) [19]
resource, the proposed model achieved the best result on two public is used for labeling sequence. Based on CTC model, Zhan
corpora, i.e., the 2017 ICDAR word-level information extraction in et al. [20] used ResNets [21] to extract features. The recurrent
historical handwritten records competition and the RIMES public neural network (RNN) was used to model the contextual infor-
dataset on character error rate.
mation and predict recognition sequences in Zhan’s model. Shi
Index Terms—Position embedding, residual networks, bidirec- et al. [22] proposed a novel end-to-end scene text recognition ar-
tional long short-term memory network, off-line handwritten word chitecture, which uses a convolutional recurrent neural network
recognition. (CRNN) with CTC. As one kind of important deep learning
mechanism, the attention model is successfully applied to text
I. INTRODUCTION recognition [23], [24]. Shi et al. [25] proposed a flexible rec-
tification mechanism based on the spatial transformer network
FF-LINE handwritten word or sentence recognition is
O still a very challenging problem, especially for those
languages that lack language resources. Traditionally, segmenta-
(STN) for irregular text recognition. And Wojna et al. [26] pre-
sented an end-to-end approach with a spatial attention mask for
scene text recognition.
tion is one of the key tasks for word recognition [1]–[4]. Models To deal with the diversity of writing styles and the similar-
based on hidden markov model (HMM) or neural network hid- ities between characters, the neural networks for handwriting
den markov model (NN-HMM) had been successfully applied recognition usually rely on constructing additional features and
on segmentation-free word recognition [5], [6]. The main is- lexicons. Chherawala et al. [27] achieved promising results with
sues of traditional methods include the overfitting and the long features such as histograms, direction distribution, and profiles.
distance dependency [7], [8]. Almazán et al. [28] proposed a word spotting and recognition
In recent years, deep learning has been introduced for recog- method by embedding both word images and text strings in a
nition tasks. Outstanding performance has been reached on common vectorial subspace. Based on the Almazán’s work, Poz-
handwriting recognition [9]–[12] and scene text recognition nanski et al. [29] presented a CNN-N-Gram method to estimate
its n-gram frequency profile by constructing a set of attributes.
Manuscript received October 13, 2018; revised January 3, 2019; accepted They utilized canonical correlation analysis (CCA) to match the
January 20, 2019. Date of publication January 29, 2019; date of current version predicted profiles to the true profiles of all words in a big lexicon.
March 12, 2019. This work was supported in part by the Natural Science Foun-
dation of China (Grants 61473101, 61872113, and 61573118); and in part by
Their system has been applied on several handwriting recogni-
Strategic Emerging Industry Development Special Funds of Shenzhen (Grants tion benchmarks and reached an obvious performance gain. The
JCYJ20170307150528934 and JCYJ20170811153836555). The associate edi- issue of this method is that it requires the construction of a large
tor coordinating the review of this manuscript and approving it for publication
was Dr. Yap-Peng Tan. (Corresponding author: Qingcai Chen.)
number of linguistic features, such as unigrams, bigrams, and
The authors are with the Shenzhen Chinese Calligraphy Digital Simu- trigrams. In the papers [28] and [29], the recognition tasks are
lation Engineering Laboratory, Harbin Institute of Technology (Shenzhen), performed to some extent like retrieval systems, which match
Shenzhen University Town, Shenzhen 518055, China (e-mail:, wxpleduole@ the word label from existing dictionaries and are named as the
gmail.com; qingcai.chen@hit.edu.cn; youjinghan2018@163.com; xiaoyulun@
stu.hit.edu.cn). lexicon-driven method. In order to avoid constructing a large
Digital Object Identifier 10.1109/LSP.2019.2895967 number of linguistic features and reduce the dependency on
1070-9908 © 2019 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See http://www.ieee.org/publications standards/publications/rights/index.html for more information.
598 IEEE SIGNAL PROCESSING LETTERS, VOL. 26, NO. 4, APRIL 2019

ResNets rather than deep convolution neural networks (DCNN)

to produce more sophisticated features. Then, to indicate the
position of each character contained in the word, a position
embedding is assigned and combined with the output of the
ResNets. Since the ResNets output is the same for all characters
in the word, the PE plays the role of attention mechanism. By
combining with PEs corresponding to different positions, the
different parts of the ResNets output are emphasized. Finally,
the ResNets output combined with PEs are input into the BiL-
STM sequentially. Each hidden state of the BiLSTM is used as
the recognition vector and is fed into the full-connected multiple
perceptrons, then the softmax layer conducts the classification
of each handwritten character.

B. ResNets
In this letter, 101-layer ResNets are employed to learn the
features of an image. The network architecture is based on the
earlier work of He et al. [21]. Considering an input sample x
(here x could be multiple color channels) and output vector y,
a building block of ResNets is defined as:

y = F(x, {Wi }) + x (1)

where the residual function F(x, {Wi }) represents the residual

mapping to be learned, which can have multiple convolutional
layers. When x and F have different dimensions, a linear pro-
jection Ws by the shortcut connections can be introduced to
Fig. 1. The architecture of PE-ResNets-BiLSTM model. match the dimensions:

dictionary, a lexicon-verification process is employed by Stuner y = F(x, {Wi }) + Ws x (2)

et al. [30]. They used a cohort of LSTM and a verification strat-
In this letter, we use a bottleneck building block that includes
egy. Its advantage is that the out of vocabulary (OOV) words
a stack of 3 layers. Its architecture is depicted in [21]. 1 × 1
could also be recognized [30]. Although it greatly reduces the
convolution of the head and end are used to reduce and restore
requirements on linguistic resources, the method of Stuner et al.
the dimensions, and 3 × 3 convolution in the middle layer is a
is integrated with a dictionary during modeling, which makes
bottleneck with smaller input/output dimensions.
the model complicated.
To further reduce the dependency on linguistic resources,
this letter proposes an unconstrained off-line handwritten word C. Position Embedding
recognition model. The main contributions of this letter include: To avoid the segmentation of handwritten word, a general way
1) Proposes a method to convert the position information of char- is to segment the feature representation of an image by column,
acters into position embeddings (PE), which helps the network then feed each slice into a LSTM network sequentially to gen-
automatically learn the character representation. 2) Proposes a erate character classification vectors [31], [32]. Even by using
novel handwritten word recognition model by integrating the CTC method, such hard segmentation may cause information
ResNets, position embeddings, and BiLSTM network. The pro- loss or abundant. It also cannot handle too short sequences or
posed model is greatly simplified while keeping the capability sequences that have more characters than the number of seg-
of segmentation-free and multilingual portable. 3) Experiments mented slices.
are conducted on public corpora for two languages, and the To address the above issues, this letter introduces the posi-
comparable results are achieved. By adding a simple postpro- tion embeddings [33]. For a word of length n, the characters
cessing model, the proposed method reached the state-of-the-art contained in it are indicated by their order in the sequence as
performance of CER on 2017 ICDAR IEHHR competition and i = 1, 2, . . . , n(n ≤ K). Here K denotes the maximum length
RIMES datasets respectively. of characters in an image. It is obvious that the features cor-
responding to a character are related to its index. But there is
II. PROPOSED METHOD no explicit boundary between two characters, especially after
the operations of feature extraction. To fully use the order in-
A. The Model Structure
formation of characters, while avoid making a hard decision of
The architecture of the position embedding integrated model character boundaries, we define the position embeddings (PE)
(denoted as PE-ResNets-BiLSTM model) is given in Fig. 1. At as a series of vectors that represent the positional index of the
first, the 101-layer ResNets [21] are used to generate a represen- character in the image. That is, representing each index i by a
tative vector for the image of a handwritten word. Here we use Q-dimensional position embedding Pi (i = 1, 2, . . . , K). The
WU et al.: UNCONSTRAINED OFFLINE HANDWRITTEN WORD RECOGNITION BY POSITION EMBEDDING INTEGRATED RESNETS MODEL 599

PE is then used to distinguish the part of features for a given written character sequence. Since in prediction stage, we will
character contained in a handwritten word. Though there are end the sequence while the first “null” label is encountered, we
many ways to combine a PE with the output of ResNets, the only count on the loss of the first “null” label in the training
simple and efficient way of concatenation is used. Let Fg de- stage.
notes the global feature vector output from the ResNets, which
is the same for all characters in the given handwritten word. D. The Model Training
Then the feature of each character can be expressed as
In order to overcome overfitting, we add L2 regularization
(i) with a weight decay of 0.0001 to the loss function expression.
Fc = Fg ⊕ Pi , i = 1, 2, . . . , K (3)
The network is trained by stochastic gradient descent (SGD)
where the ⊕ denotes the concatenating operation. There are usu- with a momentum set to 0.9. The learning rate is initially set
ally two ways to determine the value of each PE [33], [34], one to 0.1 and is divided by 10 per 50 iterations. Considering the
is to use randomly generated values, the other is to dynamically huge parameter space of the complex networks, data enhance-
learn from the model. This letter uses a mechanism of PE learn- ments are employed to expand training samples. For each in-
ing during recognizer training. As shown in Fig. 1, the character put image, we rotate, shear, and zoom, respectively. And the
feature vector Fc (i) is the ith input of the BiLSTM, and the corresponding parameter ranges are [−5◦ , +5◦ ], [−0.5, +0.5]
ith hidden state is corresponding to the ith classification vector and [0.8, 1.2]. By this method, each input image can gener-
used by multiple layer perceptrons and a softmax classifier. ate 12 additional images. Prediction-side data enhancement is
Given a handwritten word containing the character sequence performed in the same way. During the prediction stage, 13
C = {c1 , c2 , . . . , cn }, n ≤ K, we assume A is a set of all char- images were separately passed through the proposed network,
acter labels that the language needs to predict. The standard and then calculate the average of the softmax results of 13 im-
output label corresponding to the ith state is given as: ages. Finally, the sequence of characters with the highest con-
fidence was taken as the predicted result of the test image. In
char, char ∈ A, i ≤ n
ci = (4) the training process, neither segmentation techniques nor lan-
∅, n < i ≤ K guage resources are used. We only use an official lexicon in the
post-processing stage, which further improves the recognition
where ∅ represents the “null” label, which is just a placeholder
performance by not requiring any additional language resource
while the real length of the given word is shorter than the max-
and keeping the simplicity of the recognition system. In this
imum length K. Assuming that the K prediction results are
step, we first do lexicon-free recognition, and then select the
denoted as S = {s1 , s2 , . . . , sK }, then the conditional proba-
most closed word from the lexicon according to the edit distance
bility is defined as:
metric. The maximal edit distance is set to 7 to limit the search

n
K complexity.
p(C|I) = p(si = ci |I) p(sj = ∅|I) (5)
i=1 j =n +1 III. EXPERIMENTS
where the p(·) is the probability output of softmax classifiers. A. Dataset
We only extract the characters before the first ‘∅’ label as the
word prediction results. Here n is the number of the characters The experiments of this letter are constructed on two
contained in the ground truth label sequence. Given the train- benchmarks. For the 2017 ICDAR IEHHR competition [35],
ing dataset X = {I (d) , C (d) }, d = 0, 1, 2, . . . , |X|, where I (d) 125 pages of the Esposalles dataset [36], [37] are used for hand-
is the dth input image and C (d) is the ground truth label se- writing recognition and named entities recognition (NER). This
quence. Let L be the label set of the model, including character dataset consists of historical handwritten marriage records from
set A and ∅ label. yi is the one-hot encoding of the character the archives of the Cathedral of Barcelona in old Catalan. The
ci , which is a vector of the |L| dimension. Then the element of training set is composed of 968 marriage records with 31501
the vector yi can be expressed as isolated word images. The test set is composed of 253 marriage
records. Since the test set is not publicly available, we randomly
1, ci = L[j] divide the training word images into equal five parts and per-
yij = (6) form a 5-fold cross-validation. The performance of our previous
0, ci = L[j]
competition system running on the test set is also given as the
Here L[j] denotes the j th class of the label set L. The loss comparison. In this competition, we had won and improved the
function is defined as: performance from the baseline of 70.18% given by the compe-
⎛ tition organizer to 91.97% on the test set.
|X | n |L|
1 ⎝ (d)
L= − yij ln pij (d) The RIMES [38] dataset was used for ICDAR 2011 compe-
|X| i=1 j =1 tition as an isolated word recognition task [39]. A dictionary
d=1
⎞ (7) composed of more than 5000 words is also provided. In this
|L|
letter, we conduct experiments on the training and test sets.
+ yn +1,j (d) ln pn +1,j (d) ⎠ (n + 1)
j =1 B. Experimental Results
where the pij is the probability of ci = L[j]. In (7), the second In this letter, we use the same character error rate (CER)
term of the cross-entropy corresponds to the “null” of the hand- measure as in [29], which is based on the Levenshtein Distance.
600 IEEE SIGNAL PROCESSING LETTERS, VOL. 26, NO. 4, APRIL 2019

TABLE I
COMPARISON TO EXISTING METHODS IN CHARACTER ERROR RATE (%) ON
RIMES AND ESPOSALLES DATASET

∗
The result is obtained on unpublished test set, which is not directly
comparable to our result. † The results listed from the 2nd to 6th raws
are reported in [29].

TABLE II
COMPARISON TO DIFFERENT VARIANTS OF THE FULL SYSTEM
IN CHARACTER ERROR RATE (%)
Fig. 2. Examples of recognition. Left characters are ground truth and right
are predictions. Red characters are recognized wrongly.

consistently improve performance. It is remarkable that, with-

out the lexicon for post-processing, the character error rate of
the proposed method is 2.78%. It outperforms all compared
methods, except the method of [29]. Experiments show that the
proposed method is suitable for the language-independent hand-
written word recognition task. On Esposalles dataset, we give
the performance of another baseline model, i.e., the v6 model
The maximum character sequence length of each handwritten in Table II. This model won the first place in the 2017 ICDAR
word is set to 15 and 20 on Esposalles dataset and RIMES word-level IEHHR competition. The baseline model uses only
dataset respectively. ResNets with position embeddings, and the position embed-
The state-of-the-art systems on the handwriting recognition dings are only randomly initialized once and are not changed
benchmarks RIMES and Esposalles are compared in Table I. during the whole training process.
It shows that, in both benchmarks, the proposed method out-
performs the compared systems on the CER measure. For C. Error Analysis
the RIMES dataset, we make the 12 times data enhancement,
Fig. 2 shows some wrong recognized samples of the pro-
while Poznanski et al. [29] used 36 times data enhancement.
posed method. Our error analysis shows that most of the errors
Compared to [29], we get the absolute performance gain of
are caused by uppercase/lowercase character recognition errors.
0.11% on CER, i.e., 5.79% of error dropping on the RIMES
Fig. 2(a) and Fig. 2(b) give such examples on Esposalles and
dataset.
RIMES datasets, respectively. Most of such errors could be cor-
On the Esposalles dataset, to the best of our knowledge, only
rected by linguistic rules in specific language. On Esposalles
Toledo et al. [40] have published the performance of word-level
dataset, as shown by Fig. 2(c), there are some erased charac-
handwritten recognition. Compared with their performance, we
ters been recognized. Fig. 2(d) shows the wrong recognition of
have the performance gain of 0.34% on CER, which corre-
accents characters on RIMES dataset.
sponds to the 40.96% of error rate dropping. It should be noted
that we use the 5-fold cross-validation on published Espos-
IV. CONCLUSION
alles dataset rather than the unpublished test set used by Toledo
et al. [40]. This letter presents a novel unconstrained off-line handwritten
In order to show the effectiveness of each part of our full word recognition method named PE-ResNets-BiLSTM. Exper-
system on the recognition results, we conducted more experiments conducted on handwriting recognition datasets of two
iments on several variations of the proposed method. The re- languages show that, without any language resource, the model
sults are shown in Table II. For each of the variation version, gets nearly the same performance as existing language resource
we keep all other sets the same as the full version, and only enriched models. By adding a simple post-processing based only
the mentioned part is removed. On Esposalles dataset, we ran- on a lexicon of the given language, the model reached the state-
domly selected one of the 5-fold cross-validation data for the of-the-art CER performance on both languages. These results
experiment. Table II shows that there are the same trends of prove the effectiveness and latent capability of the proposed
performance gain for each variation version of the system on method on handwriting recognition of other resource-lacking
both datasets. Data enhancement and post-processing modules languages.
WU et al.: UNCONSTRAINED OFFLINE HANDWRITTEN WORD RECOGNITION BY POSITION EMBEDDING INTEGRATED RESNETS MODEL 601

REFERENCES [21] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image
recognition,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2016,
[1] M.-Y. Chen, A. Kundu, and J. Zhou, “Off-line handwritten word recogni- pp. 770–778.
tion using a hidden Markov model type stochastic network,” IEEE Trans. [22] B. Shi, X. Bai, and C. Yao, “An end-to-end trainable neural network
Pattern Anal. Mach. Intell., vol. 16, no. 5, pp. 481–496, May 1994. for image-based sequence recognition and its application to scene text
[2] C.-L. Liu, H. Sako, and H. Fujisawa, “Effects of classifier structures and recognition,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 39, no. 11,
training regimes on integrated segmentation and recognition of handwrit- pp. 2298–2304, Nov. 2017.
ten numeral strings,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 26, [23] F. Bai, Z. Cheng, Y. Niu, S. Pu, and S. Zhou, “Edit probability for scene
no. 11, pp. 1395–1407, Nov. 2004. text recognition,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit.,
[3] M. Kumar, M. Jindal, and R. Sharma, “Segmentation of isolated and 2018, pp. 1508–1516.
touching characters in offline handwritten Gurmukhi script recognition,” [24] Z. Cheng, F. Bai, Y. Xu, G. Zheng, S. Pu, and S. Zhou, “Focusing attention:
Int. J. Inf. Technol. Comput. Sci., vol. 6, no. 2, pp. 58–63, 2014. Towards accurate text recognition in natural images,” in Proc. IEEE Int.
[4] Y. Wang, X. Ding, and C. Liu, “Topic language model adaption for recogni- Conf. Comput. Vis., 2017, pp. 5086–5094.
tion of homologous offline handwritten Chinese text image,” IEEE Signal [25] B. Shi, M. Yang, X. Wang, P. Lyu, C. Yao, and X. Bai, “Aster: An at-
Process. Lett., vol. 21, no. 5, pp. 550–553, May 2014. tentional scene text recognizer with flexible rectification,” IEEE Trans.
[5] T.-H. Su, T.-W. Zhang, D.-J. Guan, and H.-J. Huang, “Off-line recognition Pattern Anal. Mach. Intell., to be published.
of realistic Chinese handwriting using segmentation-free strategy,” Pattern [26] Z. Wojna et al., “Attention-based extraction of structured information from
Recognit., vol. 42, no. 1, pp. 167–182, 2009. street view imagery,” in Proc. 14th IAPR Int. Conf. Doc. Anal. Recognit.,
[6] Z.-R. Wang, J. Du, W.-C. Wang, J.-F. Zhai, and J.-S. Hu, “A compre- 2017, pp. 844–850.
hensive study of hybrid neural network hidden Markov model for offline [27] Y. Chherawala, P. P. Roy, and M. Cheriet, “Combination of context-
handwritten chinese text recognition,” Int. J. Doc. Anal. Recognit., vol. 21, dependent bidirectional long short-term memory classifiers for robust of-
pp. 241–251, 2018. fline handwriting recognition,” Pattern Recognit. Lett., vol. 90, pp. 58–64,
[7] A. Graves and J. Schmidhuber, “Offline handwriting recognition with 2017.
multidimensional recurrent neural networks,” in Proc. Adv. Neural Inf. [28] J. Almazán, A. Gordo, A. Fornés, and E. Valveny, “Word spotting and
Process. Syst., 2009, pp. 545–552. recognition with embedded attributes,” IEEE Trans. Pattern Anal. Mach.
[8] T. Liu and J. Lemeire, “Efficient and effective learning of HMMS based Intell., vol. 36, no. 12, pp. 2552–2566, Dec. 2014.
on identification of hidden states,” Math. Probl. Eng., vol. 2017, 2017, [29] A. Poznanski and L. Wolf, “CNN-N-gram for handwriting word recogni-
Art. no. 7318940. tion,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2016, pp. 2305–
[9] J. Sueiras, V. Ruiz, A. Sanchez, and J. F. Velez, “Offline continuous 2314.
handwriting recognition using sequence to sequence neural networks,” [30] B. Stuner, C. Chatelain, and T. Paquet, “Cascading BLSTM networks for
Neurocomputing, vol. 289, pp. 119–128, 2018. handwritten word recognition,” in Proc. 23rd Int. Conf. Pattern Recognit.,
[10] X. Xiao, L. Jin, Y. Yang, W. Yang, J. Sun, and T. Chang, “Building fast 2016, pp. 3416–3421.
and compact convolutional neural networks for offline handwritten chinese [31] A. Ul-Hasan, S. B. Ahmed, F. Rashid, F. Shafait, and T. M. Breuel, “Of-
character recognition,” Pattern Recognit., vol. 72, pp. 72–81, 2017. fline printed Urdu Nastaleeq script recognition with bidirectional LSTM
[11] X.-Y. Zhang, F. Yin, Y.-M. Zhang, C.-L. Liu, and Y. Bengio, “Draw- networks,” in Proc. 12th Int. Conf. Doc. Anal. Recognit., 2013, pp. 1061–
ing and recognizing chinese characters with recurrent neural network,” 1065.
IEEE Trans. Pattern Anal. Mach. Intell., vol. 40, no. 4, pp. 849–862, [32] D. Ko, C. Lee, D. Han, H. Ohk, K. Kang, and S. Han, “Approach for
Apr. 2018. machine-printed Arabic character recognition: The-state-of-the-art deep-
[12] Q. Wang and Y. Lu, “A sequence labeling convolutional network and learning method,” Electron. Imag., vol. 2018, no. 2, pp. 1–8, 2018.
its application to handwritten string recognition,” in Proc. 26th Int. Joint [33] J. Gehring, M. Auli, D. Grangier, D. Yarats, and Y. N. Dauphin, “Con-
Conf. Artif. Intell., 2017, pp. 2950–2956. volutional sequence to sequence learning,” in Int. Conf. Machine Learn.,
[13] R. Wang, N. Sang, and C. Gao, “Scene text identification by leveraging 2017, pp. 1243–1252.
mid-level patches and context information,” IEEE Signal Process. Lett., [34] A. Vaswani et al., “Attention is all you need,” in Proc. Adv. Neural Inf.
vol. 22, no. 7, pp. 963–967, Jul. 2015. Process. Syst., 2017, pp. 5998–6008.
[14] X. Bai, C. Yao, and W. Liu, “Strokelets: A learned multi-scale mid-level [35] A. Fornés et al., “ICDAR2017 competition on information extraction in
representation for scene text recognition,” IEEE Trans. Image Process., historical handwritten records,” in Proc. 14th IAPR Int. Conf. Doc. Anal.
vol. 25, no. 6, pp. 2789–2802, Jun. 2016. Recognit., 2017, vol. 1, pp. 1389–1394.
[15] B. Su and S. Lu, “Accurate recognition of words in scenes without char- [36] D. Fernández-Mota, J. Almazán, N. Cirera, A. Fornés, and J. Lladós,
acter segmentation using recurrent neural network,” Pattern Recognit., “Bh2m: The barcelona historical, handwritten marriages database,” in
vol. 63, pp. 397–405, 2017. Proc. 22nd Int. Conf. Pattern Recognit., 2014, pp. 256–261.
[16] Z. Xie, Z. Sun, L. Jin, H. Ni, and T. Lyons, “Learning spatial-semantic [37] V. Romero et al., “The Esposalles database: An ancient marriage license
context with fully convolutional recurrent network for online handwritten corpus for off-line handwriting recognition,” Pattern Recognit., vol. 46,
chinese text recognition,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 40, no. 6, pp. 1658–1669, 2013.
no. 8, pp. 1903–1917, Aug. 2018. [38] E. Augustin, M. Carré, E. Grosicki, J.-M. Brodin, E. Geoffrois, and F.
[17] M. Jaderberg, K. Simonyan, A. Vedaldi, and A. Zisserman, “Synthetic Prêteux, “Rimes evaluation campaign for handwritten mail processing,”
data and artificial neural networks for natural scene text recognition,” in in Proc. Int. Workshop Frontiers Handwriting Recognit., 2006, pp. 231–
Workshop Deep Learn., NIPS, 2014. 235.
[18] S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural [39] E. Grosicki and H. El-Abed, “ICDAR 2011-french handwriting recogni-
Comput., vol. 9, no. 8, pp. 1735–1780, 1997. tion competition,” in Proc. Int. Conf. Doc. Anal. Recognit., 2011, pp. 1459–
[19] A. Graves, S. Fernández, F. Gomez, and J. Schmidhuber, “Connectionist 1463.
temporal classification: Labelling unsegmented sequence data with re- [40] J. I. Toledo, S. Dey, A. Fornés, and J. Lladós, “Handwriting recognition by
current neural networks,” in Proc. 23rd Int. Conf. Mach. Learn, 2006, attribute embedding and recurrent neural networks,” in Proc. Doc. Anal.
pp. 369–376. Recognit.14th IAPR Int. Conf., 2017, vol. 1, pp. 1038–1043.
[20] H. Zhan, Q. Wang, and Y. Lu, “Handwritten digit string recognition by [41] B. Stuner, C. Chatelain, and T. Paquet, “Self-training of BLSTM with
combination of residual network and RNN-CTC,” in Proc. Int. Conf. lexicon verification for handwriting recognition,” in Proc. 14th IAPR Int.
Neural Inf. Process., 2017, pp. 583–591. Conf. Doc. Anal. Recognit., 2017, pp. 633–638.

Arun KRS
No ratings yet
Arun KRS
7 pages
Offline Handwritten Text Recognition Using Hybrid CNNBLSTM Network
No ratings yet
Offline Handwritten Text Recognition Using Hybrid CNNBLSTM Network
6 pages
Handwritten English Word Recognition Using A Deep Learning Based
No ratings yet
Handwritten English Word Recognition Using A Deep Learning Based
29 pages
Handwriting Detection Presentation
No ratings yet
Handwriting Detection Presentation
10 pages
Improving Irregular Text Recognition With Adaptive Feature Compression
No ratings yet
Improving Irregular Text Recognition With Adaptive Feature Compression
5 pages
Hierarchical Recurrent Neural Network For Handwritten Strokes Classification
No ratings yet
Hierarchical Recurrent Neural Network For Handwritten Strokes Classification
5 pages
A Convolutional Recurrent Neural Network For The Handwritten Text Recognition of Historical Greek Manuscripts
No ratings yet
A Convolutional Recurrent Neural Network For The Handwritten Text Recognition of Historical Greek Manuscripts
14 pages
Sign Language Recognition With Convolutional Neural Networks
No ratings yet
Sign Language Recognition With Convolutional Neural Networks
10 pages
2020 PAL Wu
No ratings yet
2020 PAL Wu
16 pages
Deep Network With Pixel Level Rectification and Robust Training For Handwriting Recognition
No ratings yet
Deep Network With Pixel Level Rectification and Robust Training For Handwriting Recognition
13 pages
Deep Convolutional Neural Network Based Recognition of Air Writing Ijariie26420
No ratings yet
Deep Convolutional Neural Network Based Recognition of Air Writing Ijariie26420
5 pages
Feature Set Evaluation For Offline Handwriting
No ratings yet
Feature Set Evaluation For Offline Handwriting
12 pages
Hand Written Letter Recognition
No ratings yet
Hand Written Letter Recognition
14 pages
Transformer-Based Approach For Joint Handwriting A
No ratings yet
Transformer-Based Approach For Joint Handwriting A
29 pages
Bhunia Handwriting Recognition in Low-Resource Scripts Using Adversarial Learning CVPR 2019 Paper
No ratings yet
Bhunia Handwriting Recognition in Low-Resource Scripts Using Adversarial Learning CVPR 2019 Paper
10 pages
DAN: A Segmentation-Free Document Attention Network For Handwritten Document Recognition
No ratings yet
DAN: A Segmentation-Free Document Attention Network For Handwritten Document Recognition
17 pages
Deep LSTM Networks For Online Chinese Handwriting Recognition
No ratings yet
Deep LSTM Networks For Online Chinese Handwriting Recognition
6 pages
On The Benefits of Convolutional Neural Network Combinations in of Ine Handwriting Recognition
No ratings yet
On The Benefits of Convolutional Neural Network Combinations in of Ine Handwriting Recognition
6 pages
CNN-BiLSTM Model For English Handwriting Recognition, Comprehensiv Evalution On The IAM Dataset 2307.00664v1
No ratings yet
CNN-BiLSTM Model For English Handwriting Recognition, Comprehensiv Evalution On The IAM Dataset 2307.00664v1
20 pages
Convolutional Multi-Directional Recurrent Network For of Ine Handwritten Text Recognition
No ratings yet
Convolutional Multi-Directional Recurrent Network For of Ine Handwritten Text Recognition
6 pages
Deep Feature Embedding For Accurate Recognition and Retrieval of Handwritten Text
No ratings yet
Deep Feature Embedding For Accurate Recognition and Retrieval of Handwritten Text
6 pages
Applsci 12 10155
No ratings yet
Applsci 12 10155
23 pages
A Review of Neural Networks in Handwritten Character Recognition
No ratings yet
A Review of Neural Networks in Handwritten Character Recognition
6 pages
Journal Paper Publication
No ratings yet
Journal Paper Publication
20 pages
Ingle 2019
No ratings yet
Ingle 2019
8 pages
Handwritten Text Recognition Using Tensorflow 2.0: Computer Vision
No ratings yet
Handwritten Text Recognition Using Tensorflow 2.0: Computer Vision
37 pages
Paper Summary Advancements and Challenges in Handwritten Text Recognition A Comprehensive Survey
No ratings yet
Paper Summary Advancements and Challenges in Handwritten Text Recognition A Comprehensive Survey
7 pages
Handwritten Text Generation Via Disentangled Representations
No ratings yet
Handwritten Text Generation Via Disentangled Representations
5 pages
Handwrittendigitrecognitionppt1 221115162428 68e03722
No ratings yet
Handwrittendigitrecognitionppt1 221115162428 68e03722
11 pages
Handwritten Character Recognition Using Deep Learning (Convolutional Neural Network)
No ratings yet
Handwritten Character Recognition Using Deep Learning (Convolutional Neural Network)
22 pages
Le 2018
No ratings yet
Le 2018
5 pages
Handwritten Text Recognition Using Machine Learning: Journal of Engineering Sciences Vol 14 Issue 02,2023
No ratings yet
Handwritten Text Recognition Using Machine Learning: Journal of Engineering Sciences Vol 14 Issue 02,2023
11 pages
Handwritten Digit Recognition Using Quantum Convolution Neural Network
No ratings yet
Handwritten Digit Recognition Using Quantum Convolution Neural Network
9 pages
Final Seminar Presentation2
No ratings yet
Final Seminar Presentation2
14 pages
1 s2.0 S0031320318304370 Main
No ratings yet
1 s2.0 S0031320318304370 Main
10 pages
Icicct 2018 8473291
No ratings yet
Icicct 2018 8473291
4 pages
Article 3 - Ar
No ratings yet
Article 3 - Ar
13 pages
Plagiarism Checker X Originality Report: Similarity Found: 26%
No ratings yet
Plagiarism Checker X Originality Report: Similarity Found: 26%
29 pages
Layer 2
No ratings yet
Layer 2
8 pages
Livro 4 - Deep-Learning
No ratings yet
Livro 4 - Deep-Learning
271 pages
IJCRT2107479
No ratings yet
IJCRT2107479
6 pages
Full Page Handwriting Recognition Via Image To Sequence Extraction
No ratings yet
Full Page Handwriting Recognition Via Image To Sequence Extraction
16 pages
High-Performance OCR For Printed English and Fraktur Using LSTM Networks
No ratings yet
High-Performance OCR For Printed English and Fraktur Using LSTM Networks
5 pages
Final
No ratings yet
Final
28 pages
Nikitha 2020
No ratings yet
Nikitha 2020
5 pages
ANN Viva Prep
No ratings yet
ANN Viva Prep
66 pages
Analogy Between CNN and RNN Using MNIST Dataset: Prof. Rathi R Assistant Professor Sr. Grade 1
No ratings yet
Analogy Between CNN and RNN Using MNIST Dataset: Prof. Rathi R Assistant Professor Sr. Grade 1
21 pages
Handwritten Character Recognition From Images Using CNN-ECOC Handwritten Character Recognition From Images Using CNN-ECOC
No ratings yet
Handwritten Character Recognition From Images Using CNN-ECOC Handwritten Character Recognition From Images Using CNN-ECOC
7 pages
Analogic Preprocessing and Segmentation Algorithms For Off-Line Handwriting Recognition
No ratings yet
Analogic Preprocessing and Segmentation Algorithms For Off-Line Handwriting Recognition
20 pages
Evaluating Sequence-to-Sequence Models For Handwritten Text Recognition
No ratings yet
Evaluating Sequence-to-Sequence Models For Handwritten Text Recognition
8 pages
Have We Solved The Problem of Handwriting Recognition - by Rachel Wiles - Towards Data Science
No ratings yet
Have We Solved The Problem of Handwriting Recognition - by Rachel Wiles - Towards Data Science
8 pages
Machine Learning For Handwriting Recognition: Preetha S, Afrid I M, Karthik Hebbar P, Nishchay S K
No ratings yet
Machine Learning For Handwriting Recognition: Preetha S, Afrid I M, Karthik Hebbar P, Nishchay S K
9 pages
PRNN P S Sastry Lec 1
No ratings yet
PRNN P S Sastry Lec 1
177 pages
Extraction of Information From Handwriting Using Optical Character Recognition and Neural Networks
No ratings yet
Extraction of Information From Handwriting Using Optical Character Recognition and Neural Networks
6 pages
Handwritten Text Recognition System Based On Neural Network
No ratings yet
Handwritten Text Recognition System Based On Neural Network
6 pages
Handwritten Text Recognition Using Deep Learning
No ratings yet
Handwritten Text Recognition Using Deep Learning
13 pages
Handwritten Digit Recognition Using Machine Learning
No ratings yet
Handwritten Digit Recognition Using Machine Learning
5 pages
Components-Algorithms/: The Basic Architecture of Neural Networks: Single Computational Layer
No ratings yet
Components-Algorithms/: The Basic Architecture of Neural Networks: Single Computational Layer
65 pages
FP Growth Algorithm
No ratings yet
FP Growth Algorithm
10 pages
CNN-RNN Based Handwritten Text Recognition: G.R. Hemanth, M. Jayasree, S. Keerthi Venii, P. Akshaya, and R. Saranya
No ratings yet
CNN-RNN Based Handwritten Text Recognition: G.R. Hemanth, M. Jayasree, S. Keerthi Venii, P. Akshaya, and R. Saranya
7 pages
DL CS05
No ratings yet
DL CS05
22 pages
2.10 Partitioning Methods - K-Means and K-Medoids
No ratings yet
2.10 Partitioning Methods - K-Means and K-Medoids
38 pages
Aug - Overlaid - Data Augmentation For Recognition of Handwritten PDF
No ratings yet
Aug - Overlaid - Data Augmentation For Recognition of Handwritten PDF
7 pages
1 IntroductionDL
No ratings yet
1 IntroductionDL
69 pages
SVM & CNN
No ratings yet
SVM & CNN
62 pages
Databook PDF
No ratings yet
Databook PDF
64 pages
Improving Classification With AdaBoost
No ratings yet
Improving Classification With AdaBoost
20 pages
Deep Learning Notes
No ratings yet
Deep Learning Notes
3 pages
(IJCST-V10I3P35) :aisha Farhana, Aswani K.S, Aswathy A.C, Divya Jolly M, Elia Nibia
No ratings yet
(IJCST-V10I3P35) :aisha Farhana, Aswani K.S, Aswathy A.C, Divya Jolly M, Elia Nibia
7 pages
Research Article
No ratings yet
Research Article
10 pages
Charalambous 2000
No ratings yet
Charalambous 2000
23 pages
Unit IV Artificial Neural Networks
No ratings yet
Unit IV Artificial Neural Networks
25 pages
QB Ecc604 May 2022 Examination Te Extc Sem Vi 2021-22
No ratings yet
QB Ecc604 May 2022 Examination Te Extc Sem Vi 2021-22
25 pages
Machine Learning Toolkit User Manual
No ratings yet
Machine Learning Toolkit User Manual
7 pages
Oracle: Question & Answers
No ratings yet
Oracle: Question & Answers
5 pages
Alex Net
No ratings yet
Alex Net
11 pages
DL
No ratings yet
DL
2 pages
Asset-V1 MITx+6.86x+3T2020+typeasset+blockslides Lecture2 Compressed
No ratings yet
Asset-V1 MITx+6.86x+3T2020+typeasset+blockslides Lecture2 Compressed
21 pages
Experiment 1
No ratings yet
Experiment 1
2 pages
Syllabus INTRODUCTION TO DEEP LEARNING
No ratings yet
Syllabus INTRODUCTION TO DEEP LEARNING
11 pages
7 Types of Classification Algorithms
No ratings yet
7 Types of Classification Algorithms
9 pages
Bidirectional Associative Memories BAM
No ratings yet
Bidirectional Associative Memories BAM
2 pages
An To Neural Networks Ben Krose Patrick Van Der Smagt
No ratings yet
An To Neural Networks Ben Krose Patrick Van Der Smagt
9 pages
Data Mining Analysis To Determine Employee Salaries According To Needs Based On The K-Medoids Clustering Algorithm
No ratings yet
Data Mining Analysis To Determine Employee Salaries According To Needs Based On The K-Medoids Clustering Algorithm
8 pages
19eid331 - Artificial Neural Networks
No ratings yet
19eid331 - Artificial Neural Networks
3 pages
Airline Reservation
No ratings yet
Airline Reservation
2 pages
Handwritten Text Recognition and Digital Text Conversion
No ratings yet
Handwritten Text Recognition and Digital Text Conversion
2 pages
ANN Matlab
No ratings yet
ANN Matlab
13 pages
Programming with X10: Definitive Reference for Developers and Engineers
From Everand
Programming with X10: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.

Unconstrained Offline Handwritten Word

Uploaded by

Unconstrained Offline Handwritten Word

Uploaded by

IEEE SIGNAL PROCESSING LETTERS, VOL. 26, NO.

4, APRIL 2019 597

Unconstrained Offline Handwritten Word

ResNets rather than deep convolution neural networks (DCNN)

y = F(x, {Wi }) + x (1)

where the residual function F(x, {Wi }) represents the residual

dictionary, a lexicon-verification process is employed by Stuner y = F(x, {Wi }) + Ws x (2)

consistently improve performance. It is remarkable that, with-

You might also like

Pfad - The Proxy pFad of © 2024 Garber Painting. All rights reserved.