Abstract
With advances in digitization, document analysis and handwriting recognition have emerged as key research areas. The authors present a handwritten character recognition system for Gujrati, an Indian language spoken by 40 million people. The proposed system extracts four features. Two of them, a unique pattern descriptor and a Gabor phase XNOR pattern, are newly proposed for the isolated handwritten character set of Gujrati. In addition to these two features, the system uses the contour direction probability distribution function and autocorrelation features. The next contribution is a weighted k-NN classifier, and the final contribution is a novel mean χ² distance measure. The proposed classifier combines feature weights and the new distance measure with a triangular distance and the Euclidean distance to improve on the conventional k-NN classifier. An implementation on a comprehensive data set shows 86.33 % recognition efficiency, and the reported results show that the proposed approach outperforms conventional k-NN. The authors conclude that, despite the shape ambiguities in Indian scripts, the proposed classification algorithm could be a dominant technique in the field of handwritten character recognition.
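The classifier described above can be illustrated with a minimal Python sketch. The abstract does not define the mean χ² measure, the feature weights, or the rule for combining the three distances, so everything below is an assumption made for illustration: `mean_chi_square` is taken to be the average of the two directed Pearson χ² distances, the three distances are simply averaged per feature, and the feature names, weights, labels, and toy vectors are hypothetical.

```python
import math
from collections import defaultdict

EPS = 1e-12  # keeps the vote weight finite when a neighbour is at distance 0


def mean_chi_square(p, q):
    # ASSUMED form (not taken from the paper): average of the two directed
    # Pearson chi-square distances. Bins with a zero denominator are skipped.
    forward = sum((a - b) ** 2 / b for a, b in zip(p, q) if b > 0)
    backward = sum((a - b) ** 2 / a for a, b in zip(p, q) if a > 0)
    return 0.5 * (forward + backward)


def triangular(p, q):
    # Triangular discrimination: sum of (p_i - q_i)^2 / (p_i + q_i).
    return sum((a - b) ** 2 / (a + b) for a, b in zip(p, q) if a + b > 0)


def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def combined_distance(x, y, weights):
    # x, y: dict mapping feature name -> vector; weights: feature name -> float.
    # ASSUMPTION: the three distances are averaged, then scaled by the weight.
    total = 0.0
    for name, w in weights.items():
        p, q = x[name], y[name]
        total += w * (mean_chi_square(p, q) + triangular(p, q) + euclidean(p, q)) / 3.0
    return total


def weighted_knn(query, training, weights, k=3):
    # training: list of (features, label) pairs.
    neighbours = sorted(
        ((combined_distance(query, feats, weights), label) for feats, label in training),
        key=lambda pair: pair[0],
    )[:k]
    votes = defaultdict(float)
    for dist, label in neighbours:
        votes[label] += 1.0 / (dist + EPS)  # closer neighbours count more
    return max(votes, key=votes.get)


# Hypothetical toy data: two feature histograms per sample, two classes.
train = [
    ({"cdpdf": [0.7, 0.2, 0.1], "autocorr": [0.9, 0.1]}, "ka"),
    ({"cdpdf": [0.6, 0.3, 0.1], "autocorr": [0.8, 0.2]}, "ka"),
    ({"cdpdf": [0.1, 0.2, 0.7], "autocorr": [0.2, 0.8]}, "kha"),
    ({"cdpdf": [0.2, 0.2, 0.6], "autocorr": [0.1, 0.9]}, "kha"),
]
weights = {"cdpdf": 0.6, "autocorr": 0.4}
query = {"cdpdf": [0.65, 0.25, 0.10], "autocorr": [0.85, 0.15]}
print(weighted_knn(query, train, weights, k=3))  # prints "ka"
```

Setting every weight to 1 and using only `euclidean` would reduce this sketch to a conventional distance-weighted k-NN, which may be a useful baseline for comparison.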
Cite this article
Prasad, J.R., Kulkarni, U. Gujrati character recognition using weighted k-NN and Mean χ² distance measure. Int. J. Mach. Learn. & Cyber. 6, 69–82 (2015). https://doi.org/10.1007/s13042-013-0187-z
DOI: https://doi.org/10.1007/s13042-013-0187-z