Gujrati character recognition using weighted k-NN and Mean χ² distance measure

  • Original Article
International Journal of Machine Learning and Cybernetics

Abstract

With advances in digitization, document analysis and handwriting recognition have emerged as key research areas. We present a handwritten character recognition system for Gujrati, an Indian language spoken by 40 million people. The proposed system extracts four features. Two of them, a unique pattern descriptor and a Gabor phase XNOR pattern, are newly proposed for the isolated handwritten character set of Gujrati. The other two are the contour direction probability distribution function and autocorrelation features. The second contribution is a weighted k-NN classifier, and the third is a novel mean χ² distance measure. The proposed classifier combines feature weights and the new distance measure with a triangular distance and the Euclidean distance to improve on the conventional k-NN classifier. On a comprehensive data set, the implementation achieves 86.33% recognition efficiency, and the results show that the proposed approach outperforms conventional k-NN. We conclude that, despite the shape ambiguities in Indian scripts, the proposed classification algorithm could be a dominant technique in handwritten character recognition.
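
The general idea of a k-NN classifier that combines per-feature distances through feature weights can be sketched as follows. This is only an illustration under stated assumptions, not the classifier of the article: the abstract does not define the mean χ² measure or say how the distances are combined, so the sketch uses the average of the Pearson and Neyman χ² divergences as a hypothetical stand-in, assigns one distance measure per feature block, and uses a plain majority vote. All function names, weights, labels, and toy vectors are invented for the example.

```python
import numpy as np
from collections import Counter

EPS = 1e-12  # guards against division by zero in empty histogram bins


def euclidean(p, q):
    """Euclidean distance between two feature vectors."""
    return float(np.sqrt(np.sum((p - q) ** 2)))


def triangular(p, q):
    """Triangular discrimination: sum of (p - q)^2 / (p + q)."""
    return float(np.sum((p - q) ** 2 / (p + q + EPS)))


def mean_chi2_standin(p, q):
    """ASSUMPTION: average of the Pearson and Neyman chi-square divergences.
    A stand-in only; the article's own mean chi-square measure may differ."""
    d2 = (p - q) ** 2
    return float(0.5 * np.sum(d2 / (q + EPS) + d2 / (p + EPS)))


def combined_distance(x, y, feature_weights, measures):
    """Weighted sum of per-feature distances (hypothetical combination rule).
    x and y are lists holding one vector per feature block."""
    return sum(w * m(fx, fy)
               for fx, fy, w, m in zip(x, y, feature_weights, measures))


def weighted_knn_predict(query, train_samples, train_labels,
                         feature_weights, measures, k=3):
    """Majority vote among the k training samples nearest to the query."""
    dists = [combined_distance(query, s, feature_weights, measures)
             for s in train_samples]
    nearest = np.argsort(dists)[:k]
    votes = Counter(train_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Toy data: each sample has a normalised histogram and a small raw vector.
train = [
    [np.array([0.7, 0.2, 0.1]), np.array([0.0, 0.0])],
    [np.array([0.6, 0.3, 0.1]), np.array([0.1, 0.0])],
    [np.array([0.1, 0.2, 0.7]), np.array([1.0, 1.0])],
    [np.array([0.1, 0.3, 0.6]), np.array([0.9, 1.0])],
]
labels = ["ka", "ka", "kha", "kha"]  # hypothetical class names
weights = [0.6, 0.4]                 # hypothetical feature weights
measures = [mean_chi2_standin, euclidean]

query = [np.array([0.65, 0.25, 0.1]), np.array([0.05, 0.0])]
print(weighted_knn_predict(query, train, labels, weights, measures, k=3))
```

Swapping `triangular` into the `measures` list shows how a different distance can be assigned to a feature block without touching the voting logic.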

Author information

Corresponding author

Correspondence to Jayashree Rajesh Prasad.

About this article

Cite this article

Prasad, J.R., Kulkarni, U. Gujrati character recognition using weighted k-NN and Mean χ² distance measure. Int. J. Mach. Learn. & Cyber. 6, 69–82 (2015). https://doi.org/10.1007/s13042-013-0187-z

  • DOI: https://doi.org/10.1007/s13042-013-0187-z
