Tamil Textual Image Reader

The Tamil Textual Image Reader is a mobile application that uses optical character recognition (OCR) and text-to-speech (TTS) conversion to scan images of Tamil text and read them aloud, helping students study efficiently. The mobile interface is built with React Native, and a Flask/Connexion web API provides the image-to-text and text-to-speech services, using Tesseract OCR for recognition and Indic TTS for synthesis. Recurrent neural networks are employed for their ability to model the sequential dependencies important in language processing. Future work aims to support more languages and offline use.



Introduction
Ever since the invention of mobile phones, making them speak has been a standing challenge. So I set out to make them speak in my native language, Tamil, and, for the benefit of students, to offer an alternative to the eyestrain of reading lengthy texts and articles. The solution was a mobile application that reads lengthy texts aloud for students, and so the “Tamil Textual Image Reader” was built.
The “Tamil Textual Image Reader” is a mobile application that scans an image of Tamil text and reads the text aloud. It combines optical character recognition and text-to-speech conversion for the Tamil language. The application is useful for students who learn by listening, and it also supports uploading and converting existing image files.
Literature Review
For the last few decades, textual image processing and Optical Character Recognition (OCR) have been leading research topics in the field of machine learning. Recognition of machine-printed or handwritten documents is an essential part of applications such as intelligent scanning machines, text-to-speech converters and automatic language-to-language translators. “The purpose of document image analysis is to recognize the text and graphical components in the paper document and to extract the intended information, as human beings do. The two components of document image analysis are textual processing and graphical processing. Textual processing deals with the text component of the document image. Graphical processing deals with the non-textual line and symbol components that make up line diagrams, the delimiting straight lines between text sections, company logos, etc. In the current context, the work is limited to the textual processing part.”

This system also implements text-to-speech using machine learning. Research is being done throughout the world to improve the human-computer interface, and one of the promising areas is text-to-speech (TTS) conversion. “The term text-to-speech refers to the conversion of input text into a spoken utterance. The input text may consist of words, sentences, paragraphs, numbers and even abbreviations. The TTS conversion process should identify the text without any ambiguity and generate the corresponding sound output with acceptable clarity. This means that the quality of the output of the TTS engine should be as close to natural speech as possible.”

“In general, TTS conversion can be carried out in three ways: formant-based, parameter-based, and concatenation-based. In this work, the concatenation-based technique has been chosen to develop a TTS engine. Since preliminary studies have already been carried out by several researchers, this work focused on Tamil. The concatenation method has two phases, namely an “offline phase” and an “online phase”. The offline phase includes basic unit selection, identification of language rules (phonetic rules and prosodic rules) and creation of the sound database.” The online phase splits the input text into basic units and converts them into speech after applying the Tamil language rules. The system takes an arbitrary text file, processes the contents letter by letter, and passes them through the stages of text analysis and parsing (i.e. identification of basic units), application of language rules, and finally concatenation and synthesis to produce the speech output.
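The “identification of basic units” step of the online phase can be sketched as follows. This is a simplified illustration, not the project’s actual unit inventory: it groups each Tamil base letter with its dependent vowel signs and the virama (pulli) using Unicode character categories.

```python
import unicodedata

def split_basic_units(text):
    """Group each Tamil base character with its dependent signs.

    Dependent vowel signs and the virama (pulli) carry Unicode
    categories Mn/Mc, so they attach to the preceding base letter.
    """
    units = []
    for ch in text:
        if units and unicodedata.category(ch) in ("Mn", "Mc"):
            units[-1] += ch          # attach vowel sign / pulli to the base
        elif not ch.isspace():
            units.append(ch)         # start a new basic unit
    return units

print(split_basic_units("தமிழ்"))   # three units: த, மி, ழ்
```

In a full engine each resulting unit would then be looked up in the sound database built during the offline phase.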
Usage of the Application

This mobile application can be used by students who prefer studying by listening rather than by reading, and by students who are visually impaired. They can install the application on their phones and take a picture of a text, and the app will read the text aloud so that they can note the necessary points and study. The application includes an audio player with pause and resume controls. Users can also save the audio file and listen to it whenever they want.

The system can be divided into two parts: the mobile application and the web-hosted system. When the user submits a captured image, or an existing image from storage, the image is posted to a cloud-hosted web system, where the image processing begins.

First, the image is converted into a readable text document; the document is then processed to read the words and converted into an audio file. The audio file is returned to the mobile application, which plays it in the embedded audio player. A save feature lets users keep the audio file and listen to it later.
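The server-side round trip described above can be sketched as a single function. The `ocr` and `tts` callables here are hypothetical stand-ins for the real services (a pytesseract wrapper and the Tamil TTS engine), injected so the sketch stays self-contained:

```python
def process_image(image_bytes, ocr, tts):
    """Pipeline: image bytes -> recognized Tamil text -> audio bytes.

    `ocr` and `tts` are injected callables standing in for the real
    image-to-text and text-to-voice services.
    """
    text = ocr(image_bytes)      # image-to-text service
    if not text.strip():
        raise ValueError("no Tamil text recognized in image")
    return tts(text)             # text-to-voice service

# Stub services standing in for the real ones:
fake_ocr = lambda img: "வணக்கம்"
fake_tts = lambda txt: b"WAV:" + txt.encode("utf-8")
audio = process_image(b"<image bytes>", fake_ocr, fake_tts)
```

The real deployment would run this behind the web API, with the audio bytes sent back in the HTTP response to the mobile application.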

Used Technologies
Mobile application: I used the React Native JavaScript library to build the mobile application so that it can run on both Android and iOS devices. The library is easy to use and well suited to building user-friendly mobile applications.

Web system: I used the Flask and Connexion Python libraries to build the web API that runs the services converting an image to text and then the text to voice.
There are two services in the web API.
1. Image-to-Text Converter Service
2. Text-to-Voice Converter Service
The Image-to-Text Converter Service uses the pytesseract Python library to identify the textual content of the image and convert it to Tamil text. The pytesseract library uses the Tesseract OCR engine to recognize the characters.
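Connexion wires HTTP routes to Python handler functions through an OpenAPI specification. A hypothetical fragment for the two services might look like the following; the module and operation names are illustrative, not the project’s actual ones:

```yaml
openapi: "3.0.0"
info:
  title: Tamil Textual Image Reader API
  version: "1.0"
paths:
  /image-to-text:
    post:
      # Connexion calls the Python function named by operationId
      operationId: services.image_to_text.convert
      responses:
        "200":
          description: Recognized Tamil text
  /text-to-voice:
    post:
      operationId: services.text_to_voice.convert
      responses:
        "200":
          description: Synthesized Tamil audio file
```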
The Text-to-Voice Converter Service uses the Tamil TTS library to identify the sound for each word and concatenates the sounds into a human-understandable spoken sentence. The Tamil TTS library uses a large database of phonemes for Tamil words and characters. These audio segments are joined together to produce the Tamil vocal output for the input Tamil text.
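At its core, concatenative synthesis is the joining of stored audio segments. A minimal sketch with Python’s standard wave module, using synthetic tones as stand-ins for recorded phoneme segments (the real system would draw on the Tamil TTS phoneme database):

```python
import io, math, struct, wave

RATE = 16000  # samples per second

def tone(freq_hz, dur_s):
    """Raw 16-bit PCM sine tone, standing in for a recorded segment."""
    n = int(RATE * dur_s)
    return b"".join(
        struct.pack("<h", int(32767 * 0.3 * math.sin(2 * math.pi * freq_hz * i / RATE)))
        for i in range(n)
    )

# Hypothetical unit database: basic unit -> audio segment.
UNIT_DB = {"த": tone(220, 0.1), "மி": tone(330, 0.1), "ழ்": tone(440, 0.1)}

def synthesize(units):
    """Concatenate the stored segments for a unit sequence into WAV bytes."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)   # 16-bit samples
        w.setframerate(RATE)
        for u in units:
            w.writeframes(UNIT_DB[u])
    return buf.getvalue()

wav_bytes = synthesize(["த", "மி", "ழ்"])
```

A production engine additionally smooths the joins and applies prosodic rules, which simple concatenation like this does not attempt.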
Tesseract OCR
Tesseract was developed as proprietary software by Hewlett-Packard Labs. In 2005, it was open-sourced by HP in collaboration with the University of Nevada, Las Vegas. Since 2006 it has been actively developed by Google and many open-source contributors.
“Tesseract acquired maturity with version 3.x when it started supporting many image formats and
gradually added a large number of scripts (languages). Tesseract 3.x is based on traditional computer
vision algorithms. In the past few years, Deep Learning based methods have surpassed traditional
machine learning techniques by a huge margin in terms of accuracy in many areas of Computer
Vision. Handwriting recognition is one of the prominent examples. So, it was just a matter of time
before Tesseract too had a Deep Learning based recognition engine.”

In version 4, Tesseract has implemented a Long Short-Term Memory (LSTM) based recognition
engine. LSTM is a kind of Recurrent Neural Network (RNN).

Indic TTS

This is a project on developing text-to-speech (TTS) synthesis systems for Indian languages, improving the quality of synthesis, as well as small-footprint TTS integrated with disability aids and various other applications. “This is a consortium-based project funded by the Department of Electronics and Information Technology (DeitY), Ministry of Communication and Information Technology (MCIT), Government of India, involving 13 institutions, SMT, IITM being one of them. The project comprises Phase I and Phase II. Phase I of the project used Festival-based speech synthesis for Bengali, Hindi, Tamil, Telugu, Malayalam and Marathi. Phase II commenced in 2012, employing HTS-based statistical speech synthesis for 13 Indian languages. Neural networks are not used for speech segmentation in the TTS framework for Indian languages even though they are widely used in speech recognition. In this work, the GMMs in the HMM-GMM framework for phoneme segmentation in TTS systems are replaced by DNNs and CNNs for better phoneme segmentation. Acoustic models are built by training the neural networks with the GMM-HMM monophone alignment (also known as HMM-based phone alignment) as the initial alignment. The DNN-HMM/CNN-HMM models are then trained iteratively to obtain accurate final phone boundaries.”

AI Techniques Used
Recurrent Neural Networks
“This type of neural network is an essential kind of neural network, mostly used in natural language processing. In conventional neural networks an input is processed through a number of layers to produce an output, under the assumption that two successive inputs are independent of each other.”
“We cannot make this assumption in most real-life scenarios. Take, for example, predicting a stock price at a given time, or predicting the next word in a sentence: in both scenarios the result depends on previous values, and this dependency must be taken into account. The term “recurrent” means that the network performs the same operation for each element of a sequence, with the output depending on the previous computations. Put another way, RNNs have a “memory” that holds information about the calculations made so far. Theoretically, recurrent neural networks can use information from arbitrarily long sequences, but in practice they are limited to looking back only a few steps. In a recurrent neural network the computed results flow through a loop: at each step the network takes into consideration the current input and the information it has learnt from the inputs already received.”
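The “memory” described above is simply a hidden state updated at every step. A minimal one-unit sketch of the recurrence h_t = tanh(w_x * x_t + w_h * h_{t-1}), with made-up (untrained) weights:

```python
import math

def rnn_steps(inputs, w_x=0.5, w_h=0.8):
    """Run a one-unit RNN over a sequence.

    The hidden state h carries information from earlier inputs
    forward; that is the network's "memory". The weights are
    illustrative, not trained.
    """
    h = 0.0
    states = []
    for x in inputs:
        h = math.tanh(w_x * x + w_h * h)  # same operation at every step
        states.append(h)
    return states

# Feeding the same input three times yields different outputs,
# because the hidden state remembers the earlier inputs.
states = rnn_steps([1.0, 1.0, 1.0])
```

An LSTM such as Tesseract 4’s recognition engine refines this basic recurrence with gates that control what the memory keeps and forgets.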
Future Scope
The future scope of this application is to add more languages to the system and to support offline functionality.

Conclusions
The package “Tamil Text Image Reader” has been tested on different fonts. Attempts have been made to make the spoken output match the input document as closely as possible. The overall recognition rate is around 94%, even in the presence of some special characters and numerals. A hierarchical classification scheme has been followed.
