Blind Aid Report
BLIND - AID
BACHELOR OF TECHNOLOGY
IN
COMPUTER SCIENCE AND ENGINEERING
by
Aushapur (V), Ghatkesar (M), Medchal (Dist.)
2019-2020
CERTIFICATE
This is to certify that the technical seminar report titled “BLIND AID” is being
submitted by K. Sri Lakshmi Naga Sahithi (16P61A0575) of B. Tech IV-I semester,
Computer Science & Engineering, and is a record of bonafide work carried out by
her. The results embodied in this report have not been submitted to any other
university for the award of any degree.
ACKNOWLEDGEMENT
Self-confidence, hard work, commitment and planning are essential to carry out any task.
Possessing these qualities is a sheer waste if an opportunity does not exist. So, we
whole-heartedly thank Dr. G. Amarender Rao, Principal, and Dr. K. Sreenivasa Rao,
Head of the Department, Computer Science and Engineering, for their encouragement
and support.
We thank our seminar in-charge for guiding us in completing our seminar successfully.
We would also like to express our sincere thanks to all the staff of Computer Science and
Engineering, VBIT, for their kind cooperation and timely help during the course of our
seminar. Finally, we would like to thank our parents and friends who have always stood by
us whenever we were in need of them.
CONTENTS
ABSTRACT
CHAPTER 1
INTRODUCTION
1.1 Problem Statement
1.2 Existing System
1.3 Proposed System
CHAPTER 2
WORKFLOW PROCESS
CHAPTER 3
MODULES
CHAPTER 4
APPLICATIONS
CHAPTER 5
CONCLUSION
BIBLIOGRAPHY
ABSTRACT
“Boundary – an often imaginary line that marks the edge or limit of something.” Blindness is
one of the largest boundaries that can be “drawn” between people and the modern world. It is
sight that conveys more information than any other human sense. According to the
International Council of Ophthalmology, there are 45 million blind people in the world and 135
million more with significant loss of vision. Unfortunately, it is currently impossible to build a
system that will make a blind person see. However, it is possible to design one that will read
printed text for them. Therefore, we undertook the development of Blind Aid – a personal,
portable text-reading system. The system comprises a video camera (mounted inside
sunglasses), a processing device, a text-to-speech converter and an earphone. The process of
reading includes extracting text from the video stream and synthesizing it into human-like
speech, which can be heard through the earphone. Our system is designed to read printed texts
(documents, books, magazines, newspapers, posters, information signs, etc.). It is also able to
perform a secondary task: saving the extracted text to memory to play it back later.
CHAPTER I
INTRODUCTION
Accessing printed text in a mobile context is a major challenge for the blind. The scope of
this project is to provide a technical solution that assists visually impaired people in
accessing various text resources and enhancing their knowledge. The project deals with a
device that helps blind people read printed text in real time. A camera module captures a
real-time image of the product and passes it to the main module. The main module is a
Raspberry Pi, which is itself a mini-computer, and it processes the image captured by the
camera. The Raspberry Pi runs the image-processing code and applies an optical character
recognition (OCR) technique to the image. The image is processed internally on the
Raspberry Pi hardware to separate the text from the captured image using the OpenCV
(Open Source Computer Vision) library. The desired letters in the label are identified using
the Tesseract OCR engine. When the program is executed, the system captures the image
placed in front of the web camera, which is connected to the Raspberry Pi through USB. The
captured image then undergoes OCR. OCR is the identification of printed characters using
computer software. It converts images of typed, handwritten or printed text into
machine-encoded text, whether from a scanned document or from subtitle text superimposed
on an image. It also allows scanned images of printed text or symbols to be converted into
text or information that can be understood or edited by a computer program. In our system,
the Tesseract library provides the OCR capability. The camera acts as the main vision in
detecting the image of the paper; the image is processed internally, the text region is
separated from the image using the OpenCV library, and finally the text is identified and
pronounced through voice.
The Raspberry Pi is a small, barebones computer developed by the Raspberry Pi Foundation.
Its small size makes for an easy-to-hide computer that sips power and can be mounted behind
a display with an appropriate case. The Raspberry Pi is meant to be used as a final product
and can operate as a traditional desktop computer. It is designed around the idea of producing
a computer that is “capable enough” as cheaply as possible. The Raspberry Pi is a low-cost,
credit-card-sized computer that plugs into a computer monitor or TV, uses a standard
keyboard and mouse, and is commonly programmed in Python. The Raspberry Pi 3 is the
third-generation Raspberry Pi. It replaced the Raspberry Pi 2 Model B in February 2016.
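The capture-recognize-speak pipeline described above can be sketched in Python. This is a minimal illustration, not the project's actual code: it assumes the opencv-python and pytesseract packages and an espeak command-line synthesizer are installed on the Raspberry Pi, and the helper function names are our own.

```python
def extract_text(image_path):
    """Grayscale and binarise the captured frame, then run Tesseract OCR.
    Assumes the opencv-python and pytesseract packages are installed."""
    import cv2
    import pytesseract
    img = cv2.imread(image_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # Otsu binarisation separates dark text from a light background
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return pytesseract.image_to_string(binary)

def clean_text(raw):
    """Tidy raw OCR output before speech synthesis: join words that were
    hyphenated across line breaks and collapse runs of whitespace."""
    return " ".join(raw.replace("-\n", "").split())

def speak(text):
    """Hand the recognised text to a synthesizer (espeak assumed installed)."""
    import subprocess
    subprocess.run(["espeak", text])

# Intended use on the Pi: speak(clean_text(extract_text("capture.jpg")))
```

The heavy libraries are imported lazily inside the functions, so the text-cleaning step can be exercised on its own.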
Human communication today is mainly via speech and text. To access information in a text,
a person needs vision. However, those who are deprived of vision can gather information
using their hearing. Reading is very important in today's world.
Blind people are an integral part of our society, yet their disability has forced them to
depend on others for daily activities such as shopping and reading sign posts. It has also left
them with less access to computers and the internet than people with clear vision.
Consequently, they have not been able to improve their own knowledge or to have a
significant influence and impact on society. Today there are more than 30 crore visually
impaired people in the world, of whom more than 4 crore are blind. According to the
National Census of India, there are around 2.2 crore disabled people in India, of whom more
than 1.5 crore are blind. These numbers tell us that blind people outnumber people with
other disabilities in India, and the number keeps increasing rapidly. Blind people are unable
to perform visual tasks. For instance, text reading requires the use of a braille reading
system or a digital speech synthesizer. The majority of published printed works do not
include braille or audio versions, and digital versions are still a minority. Blind people are
also unable to read the simple warnings on walls or the signs that surround us. Thus, the
development of a portable device that can perform image-to-speech conversion has great
potential and utility. Some blind students use guide dogs that are specifically trained and
usually well disciplined. Most of the time the guide dog lies quietly under or beside the
table or desk. The greatest disruption a faculty member might expect may be an occasional
yawn, stretch, or low moan at the sound of a siren. As tempting as it might be to pet a guide
dog, it is important to remember that the dog is responsible for guiding its owner and should
not be distracted from that duty while in harness.
1.2 EXISTING SYSTEM
The existing systems for the blind are partially conventional but not wholesome. The most
commonly used system is the Braille system. Braille is a system of raised dots that can be
read with the fingers by people who are blind or who have low vision. People who are not
visually impaired ordinarily read braille with their eyes. Braille is not a language; rather, it
is a code by which many languages such as English, Spanish, Arabic, Chinese, and dozens of
others may be written and read. Braille is used by thousands of people all over the world in
their native languages, and provides a means of literacy for all. The specific code used in the
United States has been English Braille, American Edition, but as of 2016 the main code for
reading material is Unified English Braille, a code used in seven other English-speaking
countries. Braille symbols are formed within units of space known as braille cells. A full
braille cell consists of six raised dots arranged in two parallel rows each having three dots.
The dot positions are identified by the numbers one through six. Sixty-four combinations
are possible using one or more of these six dots. A single cell can be used to represent an
alphabet letter, number, punctuation mark, or even a whole word. A braille alphabet and
numbers chart illustrates what a cell looks like and how each dot is numbered. When every
letter of every word is expressed in braille, it is referred to as uncontracted braille. Some
books for young children are written in uncontracted braille, although it is less widely used
for reading material meant for adults. However, many newly blinded adults find
uncontracted braille useful for labelling personal or kitchen items when they are first
learning braille.
A variety of tools for both reading and writing are used by the blind. These might include
the following:
Perkins brailler (also referred to as a braille writer): Similar in appearance and function to
an old-fashioned manual typewriter, the braille writer has six keys used to emboss (press)
dots on the page to form braille.
Slate and stylus: A portable tool for writing braille, the slate and stylus is often used like a
notepad to write down short messages, such as a telephone number, telephone message, or
shopping list, or to produce labels for items such as DVDs or cereal boxes. It is typically
introduced to children.
Accessible PDA: Also known as a portable note taker or electronic note taker, a PDA is
similar to a laptop computer without a screen. Using this device, the visually impaired can
write with either a standard keyboard or a braille keyboard, and can read material on the
PDA either by listening to it spoken aloud via synthetic speech or by reading braille on a
refreshable braille display.
Audio books:
When there is a large volume of material to be read, blind people may find it beneficial
to listen to the material. Audio texts may be available on tape or CD or, increasingly, in
digital formats downloadable to a computer, PDA, or other device.
1.3 PROPOSED SYSTEM
The proposed system helps the visually impaired to read text. The existing system for the
visually impaired is the Braille method, which is traditionally written on embossed paper,
but it has limitations: for example, visually impaired people cannot read text printed on
normal paper. To overcome such difficulties, we propose a wearable device with a camera
that captures the text; the captured text is first detected and extracted using the MSER
algorithm. Then an OCR method is employed, which converts the images of typed or printed
text into machine-encoded text. The OCR output is checked for errors using a post-processing
algorithm. The corrected text is converted into a speech signal using a text-to-speech (TTS)
algorithm, and the speech is read out through the earphones of the visually impaired person.
The software employed in this system is Python, and the entire system is implemented on a
Raspberry Pi 3 board. This system enables a blind person to lead an independent day-to-day
life. A key constraint motivates the design: inserting assistive enhancements into a blind
person's shoes or cane adds more weight, influencing their handling and usage adversely.
The project aims at removing such constraints by helping the visually challenged to read
independently without restricting his or her movements. It also uses commercial
off-the-shelf (COTS) technologies, ensuring cost-effectiveness of the product. The technical
objective of this device, in the context of reading from a distance, is to allow a
cost-effective, independent reading experience for the blind. Our prototype incorporates the
following:
1. A camera.
2. OCR software.
3. A microcontroller board.
4. A Braille glove.
The detection and recognition of text from natural scene images constitute one of the main
tasks that must be fulfilled in order to proceed with the project. A global method like
Otsu's technique is not quite suitable for camera-captured images, since it often leads to loss
of textual information against the background. The camera used for this purpose was the
Microsoft LifeCam Studio webcam. For text detection from the image, the Optical Character
Recognition engine ABBYY FineReader was used. The pre-processed image was fed into the
OCR engine and the detected text was written to a .txt or .doc file. Optical Character
Recognition, or OCR, is a technology that enables one to convert different types of
documents or images captured by a digital camera into editable and searchable data.
ABBYY FineReader is optical character recognition (OCR) software that performs text
conversion, creating editable, searchable files and e-books from scans of paper documents,
PDFs and digital photographs.
The three basic principles that allow humans to recognize objects are:
1. Integrity
2. Purposefulness
3. Adaptability (IPA).
a. The program first analyzes the structure of the document image.
b. It divides the page into elements such as blocks of text, tables, images, etc.
c. The lines are divided into words and then into characters.
After processing a huge number of such probabilistic hypotheses, the program finally takes a
decision and presents the recognized text. Using ABBYY FineReader OCR is easy. The
process generally consists of three stages:
1. Open or scan the document.
2. Recognize it.
3. Save the result in a convenient format (DOC, TXT, PDF, etc.).
The entire process of data conversion from an original paper document, image or PDF takes
less than a minute, and the final recognized document looks just like the original. Advanced,
powerful OCR software allows one to save a lot of time and effort when creating, processing
and repurposing various documents.
1. Using a computer/laptop:
At first, the project was developed using a laptop as the processing medium. The components
exclusively needed for development on a laptop are:
a. Webcam or digital camera: a digital camera with 5-megapixel resolution or higher,
equipped with a flash-disable mode, optical zoom, an anti-shake feature and autofocus.
i) FineReader:
With ABBYY Screenshot Reader one can take image screenshots or text screenshots. Image
screenshots: easily create screenshots and save them as images; only a selected area of the
screen, a complete window (print screen) or the entire desktop can be captured.
To grab some text from an image file, website, presentation, or PDF, one can quickly turn
text areas into truly editable text that can be pasted directly into an open application, edited,
or saved as Microsoft Word or Excel documents. Screenshot Reader converts the image of
the screenshot into text.
c. Python Programming:
Using Smartphone:
The entire project was shifted to smartphone from laptop primarily due to portability issues.
The components used are:
This powerful software development kit (SDK) enables images and photographs to be
transformed into searchable and editable document formats, and supports all of the most
popular mobile platforms and devices.
Step 3: OCR – includes the options of business card processing or barcode recognition.
We designed an .apk Android file which sends the data from the Android device to the
Arduino using a USB cable. Effective user-interface support for the architecture is
important, as an embedded operating system provides structure and low-level functionality.
This is the reason the Arduino Uno microcontroller development board was chosen.
Arduino:
Arduino Uno is a microcontroller board based on the ATmega328. It has 14 digital
input/output pins, 6 analog inputs, a 16 MHz ceramic resonator, a USB connection, a power
jack, an ICSP header, and a reset button.
a. Memory:
b. Communication:
The ATmega328 provides UART TTL (5V) serial communication, which is available on
digital pins 0 (RX) and 1 (TX).
c. Programming: The Arduino Uno can be programmed with the Arduino software. The
ATmega328 on the Arduino Uno comes pre-burned with a bootloader that allows one to
upload new code to it without the use of an external hardware programmer. It communicates
using the original STK500 protocol.
Steps:
1. The analog serial pins are first assigned as six output pins.
3. For each character – alphabetic, numeric and a few alphanumeric – a specific set of output
pins is driven high, following the Braille convention.
4. The program is terminated.
CHAPTER 2
WORKFLOW PROCESS
BRAILLE GLOVE:
All over the world, persons with visual handicaps have used Braille as the primary means of
reading information. Standard Braille is an approach to creating documents which can be
read through touch. This is accomplished through the concept of a Braille cell consisting of
raised dots on a thick sheet of paper. A cell consists of six dots arranged in the form of a
rectangular grid of two dots horizontally and three dots vertically. With six dots arranged this
way, one can obtain sixty-three different patterns of dots. A visually handicapped person is
taught Braille by training him or her to discern the cells by touch, accomplished through the
fingertips.
A printed sheet of Braille normally contains upwards of twenty-five rows of text with forty
cells in each row. The physical dimensions of a standard Braille sheet are approximately 11
inches by 11 inches. The dimensions of the Braille cell are also standardized, though they
may vary slightly from country to country. The six dots forming the cell permit sixty-three
different patterns of dot arrangements. Strictly, it is sixty-four patterns, but the last one is a
cell without any dots and thus serves the purpose of a space. A Braille cell is thus the
equivalent of a six-bit character code, if we view it in the light of text representation in a
computer! However, it is not related to any character code in use with computers. In standard
English Braille, many of the sixty-three cells correspond to a letter of the Roman alphabet or
a punctuation mark. A few cells represent short words or syllables that are frequently
encountered in English. This is done so that the number of cells required to show a sentence
may be reduced, which helps minimize the space requirements when printing Braille.
The six dots forming the cell permit sixty-three different patterns of dot arrangements, which
are matched with the alphabets, numbers and special symbols of the English language. The
braille glove contains six vibration motors, fixed on five fingers and the centre of the palm.
The basic technique used in the hand glove is to retrieve the value of the English letter typed
by the user on the keyboard, convert it into a Braille value, and activate the corresponding
motors. Based on the position of the vibration, the blind person can understand the value of
the letter. For example, if the user types the letter “r”, it is converted into the Braille value
1,2,3,5 and this value activates the corresponding motors in the Braille hand glove. The
conversion program is written in the Hi-Tech C language and is recorded in the
microcontroller of the hand glove. Any blind person can wear this glove on the right hand
and understand the English letters through vibration instead of touching a Braille sheet.
Similarly, a whole word or sentence is converted into Braille vibrations and sent to the blind
person. Based on this method, a sighted person and a deaf-blind person can communicate
effectively.
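The letter-to-motor conversion described above can be sketched as follows. The dot numbers follow standard English Braille, but the motor placement table is a hypothetical wiring chosen for illustration; the actual glove firmware is written in C for the microcontroller.

```python
# Dot numbers (1-6) per letter, following standard English Braille;
# only a few letters are shown for brevity.
BRAILLE_DOTS = {"a": (1,), "b": (1, 2), "c": (1, 4), "r": (1, 2, 3, 5)}

# Hypothetical placement: dot number -> vibration motor on the glove.
MOTOR_AT = {1: "thumb", 2: "index", 3: "middle", 4: "ring", 5: "little", 6: "palm"}

def motors_for(letter):
    """Return the motors to activate for one typed letter."""
    return [MOTOR_AT[dot] for dot in BRAILLE_DOTS[letter.lower()]]
```

Typing “r”, for instance, maps to dots 1, 2, 3 and 5 and hence to the thumb, index, middle and little-finger motors, matching the example in the text.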
The Braille hand glove comprises the following key components:
1. Microcontroller (AT89C51)
2. RS-232C interface
3. Vibration motors
4. Power supplies
MICROCONTROLLER:
Microcontroller is a general purpose device, which integrates a number of the components of a
microprocessor system onto a single chip. It has inbuilt CPU, memory and peripherals to
make it as a mini computer. A microcontroller is integrated with
1. CPU Core
The vibration hand glove contains a microcontroller AT89C51. It is the 40 pins, 8 bit
Microcontroller manufactured by Atmel group. It is the flash type reprogrammable memory.
Advantage of this flash memory is we can erase the program within a few minutes. It has 4KB
on chip ROM and 128 bytes internal RAM and 32 I/O pin as arranged as port 0 to port 3 each
has 8 bit bin .Port 0 contains 8 data line (D0-D7) as well as low order address line(AO-A7).
The position identification and motor control are programmed in the Hi-Tech C language and
loaded into the microcontroller.
1) Crystal:
The heart of the microcontroller is the circuitry that generates the clock pulse. The
microcontroller provides two pins, XTAL1 and XTAL2, for connecting an external crystal
resonator along with capacitors. The crystal frequency is the basic clock frequency of the
microcontroller, and based on this frequency the microcontroller controls the run time of the
vibration motors inside the hand glove.
2) Reset:
The memory locations of the 89C51 run from 0000H to 0FFFH; whenever the supply is
switched on, execution starts from location 0000H. The 89C51 microcontroller provides its
9th pin for the reset function. Here the reset circuitry consists of a 10 µF capacitor in series
with a 10 kΩ resistor. When the supply is switched on, the capacitor charges and discharges,
giving a high-to-low pulse to the 9th pin through a 7414 inverter. We interface an LCD
display to the microcontroller via ports 0 and 2: the LCD control lines are connected to port
2 and the data lines to port 0. Whenever the motor speed struggles, the reset is used to restart
the program.
3) LCD:
A liquid crystal display has 16 pins, of which the first three and the 15th are used for the
power supply. The 4th pin is RS (register select): when it is low the command register is
selected, and when it is high the data register. The 5th pin is R/W: when it is low, a write
operation is performed. The 6th pin acts as enable, and the remaining pins are data lines.
In the vibration hand glove, RS-232 is used for communication. RS-232 is a standard for
serial binary data interconnection between a DTE (Data Terminal Equipment) and a DCE
(Data Circuit-terminating Equipment), commonly used in computer serial ports. Here,
ASCII values are converted into binary signals and sent to the vibration glove to activate the
vibration motors.
Details of the character format and transmission bit rate are controlled by the serial port
hardware, often a single integrated circuit called a UART that converts data from parallel to
serial form. A typical serial port includes specialized driver and receiver integrated circuits
to convert between internal logic levels and RS-232-compatible signal levels. A relay is an
electrically operated switch. Current flowing through the coil of the relay creates a magnetic
field which attracts a lever and changes the switch contacts. The coil current can be on or
off, so relays have two switch positions, and most are double-throw (changeover) switches.
The coil of a relay passes a relatively large current, typically 30 mA for a 12 V relay, but as
much as 100 mA for relays designed to operate from lower voltages. Most ICs (chips) cannot
provide this current, so a transistor is usually used to amplify the small IC current to the
larger value required for the relay coil. The maximum output current of the popular 555
timer IC is 200 mA, so these devices can supply relay coils directly without amplification.
CHAPTER 3
MODULES
EXTRACTION ALGORITHM
Step 1: Detect Candidate Text Regions Using MSER. The MSER feature detector works well
for finding text regions, because the consistent colour and high contrast of text lead to stable
intensity profiles. Use the detectMSERFeatures function to find all the regions within the
image and plot the results. Notice that many non-text regions are detected alongside the text.
MSER Regions
Step 2: Remove Non-Text Regions Based on Basic Geometric Properties. Although the
MSER algorithm picks out most of the text, it also detects many other stable regions in the
image that are not text. A rule-based approach can remove non-text regions: for example,
geometric properties of text can be used to filter out non-text regions using simple
thresholds. Alternatively, a machine-learning approach can train a text vs. non-text
classifier. Typically, a combination of the two approaches produces better results. This
example uses a simple rule-based approach to filter non-text regions based on geometric
properties. Several geometric properties are good for discriminating between text and
non-text regions, including aspect ratio, eccentricity, Euler number, extent, and solidity.
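A minimal sketch of such a rule-based geometric filter in Python. The threshold values and the region-property dictionaries are illustrative, not taken from the project code; in practice the properties would come from a region-analysis routine and the thresholds would be tuned on real data.

```python
def is_text_like(region, max_aspect=3.0, max_ecc=0.995,
                 min_extent=0.2, min_solidity=0.3):
    """Rule-based filter on a candidate region's geometric properties.
    All thresholds are illustrative."""
    return (1.0 / max_aspect < region["aspect_ratio"] < max_aspect
            and region["eccentricity"] < max_ecc     # reject thin slivers
            and region["extent"] > min_extent        # must fill part of its bbox
            and region["solidity"] > min_solidity)   # reject very concave shapes

candidates = [
    {"aspect_ratio": 0.6, "eccentricity": 0.90, "extent": 0.50, "solidity": 0.8},   # letter-like
    {"aspect_ratio": 12.0, "eccentricity": 0.999, "extent": 0.05, "solidity": 0.2},  # edge-like
]
text_regions = [r for r in candidates if is_text_like(r)]
```

Only the letter-like region survives the filter; the long, thin, hollow region is discarded.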
Step 3: Remove Non-Text Regions Based on Stroke Width Variation. Another common
metric used to discriminate between text and non-text is stroke width: a measure of the
width of the curves and lines that make up a character. Text regions tend to have little
stroke width variation, whereas non-text regions tend to have larger variations. To
understand how stroke width can be used to remove non-text regions, estimate the stroke
width of one of the detected MSER regions. This can be done using a distance transform and
a binary thinning operation. In a typical stroke width image of a character, there is very
little variation over most of the region. This indicates that the region is likely to be a text
region, because the lines and curves that make up the region all have similar widths, which
is a common characteristic of human-readable text.
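The stroke-width test can be expressed as a simple coefficient-of-variation check. The sketch below is illustrative: it assumes the stroke widths have already been sampled inside the region (e.g. from a distance transform of the thinned binary region), and the 0.35 threshold is an assumption, not a value from the project.

```python
def stroke_width_ok(widths, max_cv=0.35):
    """Accept a region whose stroke widths vary little.
    widths: stroke-width samples taken inside the region; max_cv is an
    illustrative limit on the coefficient of variation (std dev / mean)."""
    mean = sum(widths) / len(widths)
    variance = sum((w - mean) ** 2 for w in widths) / len(widths)
    return (variance ** 0.5) / mean <= max_cv

letter_stroke = [3, 3, 4, 3, 3]   # near-constant width: keep
foliage_blob = [1, 6, 2, 9, 3]    # widths all over the place: discard
```

The letter-like sample passes (its widths barely vary), while the irregular sample is rejected.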
Step 4: Merge Text Regions for Final Detection Result. At this point, all the detection results
are composed of individual text characters. To use these results for recognition tasks, such
as OCR, the individual text characters must be merged into words or text lines. This enables
recognition of the actual words in an image, which carry more meaningful information than
the individual characters. Compare, for example, recognizing the string 'EXIT' with the set
of individual characters {'X','E','T','I'}, where the meaning of the word is lost without the
correct ordering. One approach for merging individual text regions into words or text lines
is to first find neighbouring text regions and then form a bounding box around these regions.
To find neighbouring regions, expand the bounding boxes computed earlier with regionprops.
This makes the bounding boxes of neighbouring text regions overlap, such that text regions
that are part of the same word or text line form a chain of overlapping bounding boxes. The
overlapping bounding boxes can then be merged together to form a single bounding box
around individual words or text lines. To do this, compute the overlap ratio between all
bounding-box pairs. This quantifies the distance between all pairs of text regions, so that it
is possible to find groups of neighbouring text regions by looking for non-zero overlap
ratios. Once the pair-wise overlap ratios are computed, use a graph to find all the text
regions "connected" by a non-zero overlap ratio.
Step 5: Recognize Detected Text Using OCR. After detecting the text regions, use the OCR
function to recognize the text within each bounding box. Note that without first finding the
text regions, the output of the OCR function would be considerably noisier.
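The merging step above can be sketched in plain Python: compute pair-wise overlap ratios, connect boxes with non-zero overlap, and merge each connected group into one box. Boxes are (x1, y1, x2, y2) tuples and the graph traversal is a simple depth-first search; this is an illustration of the idea, not the project's code.

```python
def overlap_ratio(a, b):
    """Intersection area over union area of two boxes (x1, y1, x2, y2)."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    if inter == 0:
        return 0.0
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter)

def group_boxes(boxes):
    """Connect boxes with non-zero overlap and merge each connected
    group into a single enclosing bounding box."""
    n = len(boxes)
    adj = {i: [j for j in range(n)
               if j != i and overlap_ratio(boxes[i], boxes[j]) > 0]
           for i in range(n)}
    seen, merged = set(), []
    for i in range(n):
        if i in seen:
            continue
        stack, comp = [i], []          # depth-first search over the graph
        while stack:
            k = stack.pop()
            if k in seen:
                continue
            seen.add(k)
            comp.append(k)
            stack.extend(adj[k])
        xs1, ys1, xs2, ys2 = zip(*(boxes[k] for k in comp))
        merged.append((min(xs1), min(ys1), max(xs2), max(ys2)))
    return merged
```

Two overlapping character boxes collapse into one word box, while a distant box stays separate.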
The character recognition flow is: Document → Gray-scale conversion → Filtering → Stored
templates; Input character → Gray-scale conversion → Filtering → Feature extraction →
Pattern recognition.
The recognition rate of these algorithms depends on the choice of features. Most existing
algorithms involve extensive processing of the image before the features are extracted,
which increases computational time. Here we discuss a pattern-matching-based method for
character recognition that effectively reduces the image-processing time while maintaining
efficiency and versatility. The parallel computational capability of a neural network ensures
a high speed of recognition, which is critical in a commercial environment. The key factors
in the implementation are an optimal selection of features that categorically defines the
details of the characters, the number of features, and a low image-processing time.
POST PROCESSING:
One proposed approach corrects an OCR error word using the vocabulary and grammar
characteristics surrounding it. Another proposed system is a statistical method for
auto-correction of OCR errors; this approach uses a dictionary to generate a list of
correction candidates based on an n-gram model. All words in the OCR text are grouped into
a frequency matrix that records the existing sequences of characters and their counts. The
correction candidate with the highest count in the frequency matrix is then selected to
replace the error word.
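A toy version of this frequency-based correction is sketched below. The dictionary and its counts are illustrative stand-ins; a real system would build the frequency statistics from its OCR corpus and use a fuller edit-distance model.

```python
# Illustrative word counts; a real system derives these from its corpus.
FREQ = {"exit": 50, "exist": 12, "text": 40, "test": 30}

def within_one_edit(a, b):
    """True if a equals b, or differs by one substitution, insertion or deletion."""
    if a == b:
        return True
    if abs(len(a) - len(b)) > 1:
        return False
    if len(a) == len(b):
        # same length: exactly one substituted character
        return sum(x != y for x, y in zip(a, b)) == 1
    # lengths differ by one: deleting one char from the longer must match
    longer, shorter = (a, b) if len(a) > len(b) else (b, a)
    return any(longer[:i] + longer[i + 1:] == shorter for i in range(len(longer)))

def correct(word):
    """Replace an OCR error word with its most frequent near-match."""
    candidates = [w for w in FREQ if within_one_edit(word, w)]
    return max(candidates, key=FREQ.get) if candidates else word
```

An error such as “ex1t” is pulled back to “exit”; a word with no near-match is left unchanged.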
The structure of the text-to-speech synthesizer can be broken down into two major modules:
Natural Language Processing (NLP) module: It produces a phonetic transcription of the text
to be read, together with prosody.
Digital Signal Processing (DSP) module: It transforms the symbolic information it receives
from the NLP module into audible and intelligible speech.
A text-to-speech system is composed of two parts: a front-end and a back-end. The
front-end has two major tasks. First, it converts raw text containing symbols like numbers
and abbreviations into the equivalent written-out words. This process is often called text
normalization, pre-processing, or tokenization. The front-end then assigns phonetic
transcriptions to each word, and divides and marks the text into prosodic units such as
phrases, clauses, and sentences. The process of assigning phonetic transcriptions to words is
called text-to-phoneme or grapheme-to-phoneme conversion. Phonetic transcriptions and
prosody information together make up the symbolic linguistic representation that is output
by the front-end. The back-end, often referred to as the synthesizer, then converts the
symbolic linguistic representation into sound. In certain systems, this part includes the
computation of the target prosody (pitch contour, phoneme durations), which is then
imposed on the output speech.
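The front-end's text-normalization task can be illustrated with a toy Python function. The abbreviation and digit tables below are deliberately tiny stand-ins for the full lexicons a real synthesizer uses.

```python
# Tiny illustrative lookup tables for the normalization step.
ABBREVIATIONS = {"Dr.": "Doctor", "St.": "Street"}
DIGITS = {"2": "two", "3": "three"}

def normalize(text):
    """Front-end text normalization: write out abbreviations and digits
    so the phonetic stage only ever sees ordinary words (toy version)."""
    out = []
    for word in text.split():
        out.append(ABBREVIATIONS.get(word, DIGITS.get(word, word)))
    return " ".join(out)
```

After this pass, a phrase like “Dr. Smith has 2 dogs” reads as ordinary words, ready for grapheme-to-phoneme conversion.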
CAMERA
A digital camera is a camera that encodes images and videos digitally and stores them. It
uses an optical system, typically a lens with a variable diaphragm, to focus light onto an
image pickup device. The diaphragm and shutter admit the correct amount of light to the
imager, just as with film, but the image pickup device is electronic rather than chemical. A
digital camera records and stores photographic images in digital form. Many current models
are also able to capture sound and video in addition to still images.
PRINCIPLE OF CAMERAS:
The camera captures light through a small lens at the front using a tiny grid of microscopic
light detectors built into an image-sensing microchip (either a charge-coupled device (CCD)
or, more likely these days, a CMOS image sensor). A simple webcam setup consists of a
digital camera attached to your computer, typically through the USB port. The camera part
of the webcam setup is just a digital camera – there is really nothing special going on there.
The webcam nature of the camera comes with the software.
USB PORT
Video data from DTV standards such as ATSC and DVB consists of encapsulated MPEG
data streams, which are passed to a decoder and output as uncompressed video data, which
can be high-definition. This video data is then encoded into TMDS for digital transmission
over HDMI. HDMI also includes support for 8-channel uncompressed digital audio.
Beginning with version 1.2, HDMI supports up to 8 channels of one-bit audio, the format
used on Super Audio CDs.
RASPBERRY PI
The Raspberry Pi is a single-board computer of credit-card size that can perform many of
the tasks of a desktop computer. The main purpose of designing the Raspberry Pi board was
to encourage learning, experimentation, and innovation among school-level students. The
Raspberry Pi board is portable and low cost. The ARM processors at the heart of Raspberry
Pi boards are also widely used in mobile phones. In the 21st century, the growth of mobile
computing technologies has been very high, with a huge segment of this growth driven by
the mobile industry; roughly 98% of mobile phones use ARM-based processors.
ULTRASONIC SENSOR
An ultrasonic sensor is a device that measures the distance to an object using sound waves.
It sends out a sound pulse at a specific frequency and listens for that pulse to bounce
back. By recording the elapsed time between the pulse being emitted and its echo returning,
the distance between the sensor and the object can be calculated as
distance = (speed of sound × elapsed time) / 2, the factor of two arising because the pulse
travels to the object and back.
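The time-of-flight calculation can be sketched as follows. The sensor model (the common HC-SR04 module) and the temperature assumed for the speed of sound are illustrative; on real hardware the echo time would come from the sensor's GPIO echo pin rather than a literal value.

```python
# Distance computation for an ultrasonic sensor (e.g. an HC-SR04,
# assumed here). The round-trip echo time is halved because the
# pulse travels out to the obstacle and back again.
SPEED_OF_SOUND = 343.0  # metres per second in air at ~20 degrees C

def distance_m(echo_seconds):
    """Distance to the obstacle in metres from a round-trip echo time."""
    return SPEED_OF_SOUND * echo_seconds / 2.0

# A 0.01 s round trip corresponds to an obstacle about 1.7 m away.
print(round(distance_m(0.01), 3))  # -> 1.715
```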
CHAPTER 4
APPLICATIONS
● People with learning disabilities: Some people have difficulty reading large amounts of
text due to dyslexia and other learning disabilities. Offering them an easier option for
experiencing website content is a great way to engage them.
● People who have literacy difficulties: Some people have basic literacy levels. They often
get frustrated trying to browse the internet because so much of it is in text form. By
offering them an option to hear the text instead of reading it, they can get valuable
information in a way that is more comfortable for them.
● People who speak the language but do not read it: Having a speech option for the
foreign-born will open up your audience to this under-served population. Many people who
come to a new country learn to speak and understand the native language effectively, but
may still have difficulty reading in a second language. Though they may be able to read
content with a basic understanding, text-to-speech technology allows them to take in the
information in a way they are more comfortable with, making your content easier to
comprehend and retain.
● People who multitask: A busy life often means that people do not have time to do all the
reading they would like to do online. Having a chance to listen to the content instead of
reading it allows them to do something else at the same time. With the prevalence of
smartphones and tablets, it also provides an option for content consumption on the go,
taking content away from the computer screen and into any environment that's
convenient for the consumer.
● People with visual impairment: Text to speech can be a very useful tool for the mildly
or moderately visually impaired. Even for people with the visual capability to read, the
process can often cause too much strain to be of any use or enjoyment. With text to
speech, people with visual impairment can take in all manner of content in comfort
instead of strain.
● People who access content on mobile devices: Reading a great deal of content on a small
screen is not always easy. Having text-to-speech software do the work is much easier. It
allows people to get the information they want without a great deal of scrolling and
aggravation.
● People with different learning styles: Some people are auditory learners, some are
visual learners, and some are kinaesthetic learners; most learn best through a combination
of the three. Universal Design for Learning is a plan for teaching which, through the use
of technology and adaptable lesson plans, aims to help the maximum number of learners
comprehend and retain information by appealing to all learning styles.
CHAPTER 5
CONCLUSION
The results obtained from the procedure described above are indicated in the figures below.
The preprocessed image is given to the Tesseract OCR engine to extract the text in the
image. However, due to the low resolution of the webcam, the output obtained is not 100%
accurate; the accuracy can be improved by using an HD camera or a mobile camera.

This project provides a novel concept for text reading for the blind, utilizing a
local-sequential scan. The system includes a text tracking algorithm that extracts words
from a close-up camera view. Text-to-speech synthesis is a rapidly growing aspect of
computer technology and is playing an increasingly important role in the way we interact
with systems and interfaces across a variety of platforms. The proposed system gives a
very simple method for text-to-speech conversion. Text inputs such as alphabets, words,
sentences, and numbers are given to the system, and text-to-speech conversion is achieved
with a result that is audible and clear. Such a system is widely used in web applications,
email reading, mobile applications, and other intelligent speaking systems. The suggested
system, as an independent program, is fairly cheap, and it can be installed onto a
smartphone held by a blind person, giving blind people easy access to the program. This
project is a standalone application developed in Python which can be installed on any
system free of cost. The motivation for the development of this algorithm was the simple
fact that English alphabets are fixed glyphs that will not change.

In this project, we have described a system to read printed text and hand-held objects for
assisting blind people. To extract text regions from complex backgrounds, a novel text
localization algorithm based on models of stroke orientation and edge distributions using
the Canny algorithm is proposed. Block patterns project the proposed feature maps of an
image patch into a feature vector. Adjacent character grouping is performed to calculate
candidate text patches prepared for text classification. OCR is used to perform word
recognition on the localized text regions and transform them into audio output for blind
users.

The camera acts as the input for the system. As the Raspberry Pi board is powered, the
camera starts streaming, and the streaming data is displayed on the screen. Speech
recognition technology is of particular interest due to its direct support of communication
between humans and computers. Using the Tesseract library, the image is converted into
text, and the text detected from the image is shown on the status bar. The obtained text
is then pronounced through the earphones. An image-to-speech conversion technique using
the Raspberry Pi is thus implemented. The simulation results have been successfully
verified and the hardware output has been tested using different samples. The algorithm
successfully processes the image and reads it out clearly, providing significant help for
people with disabilities.

This is an economical as well as efficient device for visually impaired people. We have
applied our algorithm to many images and found that it successfully performs the
conversion. The device is compact and helpful to society. The main advantages of this
project are that it requires less time to recognize and read text, has lower operational
costs, and can recognize text of different fonts. The project can also be used by
partially blind people and elderly people with various eyesight problems, and it plays a
significant role for visually impaired students in their education. Logically, if
listening gets a reader through text more quickly, then it must be considered more
efficient when time is a concern. Other advantages include greater flexibility, high
accuracy, suitability for different illumination conditions, and ease of execution.

There are a few limitations to this project. Font sizes below 20 cannot be recognized,
and the camera does not auto-focus. The major challenge is that it is hard to adjust the
distance between the camera and the book. Speech recognizers are not perfect listeners;
they make mistakes, and a big challenge in designing speech applications is working with
imperfect speech recognition technology.

The problem of adjusting the distance between the book and the camera can be solved by
designing a robotic table that flips the pages automatically. By providing a battery
backup to the Raspberry Pi, the main aim of portability can be achieved. Future work will
concentrate on developing an efficient portable product that can extract text from any
image, enabling blind people to read text present on products, banners, books, etc. This
project can effectively distinguish the object of interest from the background or other
objects in the camera view; in future, it can be implemented in hardware to detect and
recognize objects and vehicles on the road, so that it can warn the user not to cross the
road during vehicle movement. The algorithm can also be extended to handle non-horizontal
text strings. Future work will extend the localization algorithm to process text strings
with fewer than three characters and to design more robust block patterns for text feature
extraction. The alignment of the camera can be adjusted, and more OCR functions can be
used to enhance the application. With such enhancements, electronic labels and vehicle
numbers can be scanned and processed for traffic monitoring.
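The capture-recognize-speak loop described above can be sketched in Python. In the real system the OCR stage would be Tesseract (e.g. `pytesseract.image_to_string`) and the speech stage a TTS engine (e.g. `pyttsx3`); to keep this sketch self-contained and runnable without hardware, those stages are injected as plain functions, with stubs standing in for the camera frame and the engines.

```python
# Minimal sketch of the image-to-speech pipeline: OCR the captured
# frame, clean up the recognized text, and hand it to the speech stage.
# The ocr and speak callables stand in for Tesseract and a TTS engine.

def image_to_speech(image, ocr, speak):
    """Run OCR on a captured frame and speak the recognized text."""
    text = ocr(image)                 # e.g. pytesseract.image_to_string
    cleaned = " ".join(text.split())  # collapse stray whitespace/newlines
    if cleaned:
        speak(cleaned)                # e.g. pyttsx3: engine.say(...)
    return cleaned

# Stubs standing in for the camera frame, the OCR engine, and the TTS
# engine, purely to demonstrate the data flow:
spoken = []
result = image_to_speech(
    image="frame",                      # placeholder camera frame
    ocr=lambda img: "  HELLO\nWORLD ",  # pretend OCR output
    speak=spoken.append,
)
print(result)  # -> HELLO WORLD
print(spoken)  # -> ['HELLO WORLD']
```

Injecting the engines this way also makes the pipeline easy to test without a camera or speakers attached to the Raspberry Pi.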
BIBLIOGRAPHY
1. Apple, "VoiceOver for OS X", available online in Apple's accessibility documentation.
2. V. Ajantha Devi, Dr. Santhosh Baboo, "Embedded optical character recognition on Tamil
text image using Raspberry Pi", International Journal of Computer Science Trends and
Technology (IJCST), Volume 2, Issue 4, Jul-Aug 2014.
3. Bindu Philip and R. D. Sudhaker Samuel, "Human machine interface – a smart OCR for the
visually challenged", International Journal of Recent Trends in Engineering, Vol. 3,
November 2009.
4. N. Ezaki, M. Bulacu, and L. Schomaker, "Text detection from natural scene images:
towards a system for visually impaired persons", in ICPR, 2004.
5. Gopinath, Aravind, Pooja et al., "Text to speech conversion using MATLAB",
International Journal of Emerging Technology and Advanced Engineering, Volume 5, Issue 1,
January 2015.
6. Khushali Desai, Jaiprakash Verma, "Image to sound conversion", International Journal of
Advance Research.
7. J. Peters, S. Thillou, "Embedded reading device for blind people: a user-centered
design", in ISIT, 2004.
8. V. Bhope, Prachi Khilari, "Online speech to text engine", International Journal of
Innovative Research in Science, Engineering and Technology, Issue 7, July 2015.
WEB REFERENCES
➔ https://www.researchgate.net/publication/282270189_A_TEXT_READING_SYSTEM_FOR_THE_VISUALLY_DISABLED
➔ https://www.researchgate.net/publication/321883136_A_device_to_assist_blind_in_reading_text
➔ https://www.ijitee.org/wp-content/uploads/papers/v8i6s3/F10350486S319.pdf