0% found this document useful (0 votes)

15 views4 pages

Irjet V10i1080

The document discusses a project focused on image-to-text-to-speech conversion using machine learning, which aims to enhance accessibility for users, particularly those with visual impairments. By integrating Optical Character Recognition (OCR) and Text-to-Speech (TTS) technologies, the project enables the extraction of text from images and its subsequent conversion to speech. The proposed system demonstrates high accuracy and efficiency, with potential applications in education, research, and daily navigation for individuals with disabilities.

Uploaded by

sanjana.devarapalli7

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

15 views4 pages

Irjet V10i1080

Uploaded by

sanjana.devarapalli7

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 4

International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056

Volume: 10 Issue: 10 | Oct 2023 www.irjet.net p-ISSN: 2395-0072

IMAGE TO TEXT TO SPEECH CONVERSION USING MACHINE LEARNING

Jeevanantham L1, Venkatesh V2, Gowri P3, Mariaamutha R4

1Student, Dept. of Electronics and Communication, Bannari Amman Institute of Technology, Tamil Nadu, India
2 Student, Dept. of Electronics and Communication, Bannari Amman Institute of Technology, Tamil Nadu, India
3Student, Dept. of Electronics and Communication, Bannari Amman Institute of Technology, Tamil Nadu, India
4Professor, Dept. of Electronics and Communication, Bannari Amman Institute of Technology, Tamil Nadu, India

---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract - Image-to-text-to-speech conversion using evaluate our models. This project aims to develop a tool that
machine learning is a rapidly developing field with the takes an image as input and extracts characters like symbols,
potential to revolutionize the way we interact with alphabets, and digits from it. The image can include a printed
information. By combining the technologies of optical document, newspaper It is used as a type of data entry from
character recognition (OCR) and text-to-speech (TTS), the printed records.
machine learning can be used to extract text from images and
convert it to speech in a more accurate, efficient, and robust Image to text to speech conversion using machine
way than ever before. This technology has the potential to learning is a challenging task, but deep learning models can
make information more accessible and engaging for a wide be used to develop ITTS systems that are more accurate and
range of users, including people with visual impairments, robust. ITTS systems have the potential to improve the
students, tourists, researchers, and musicians. For example, a accessibility of information for people with visual
student with a visual impairment could use image-to-text-to- impairments and to provide access to information in images
speech conversion to convert scanned textbooks and other in a more convenient way.
course materials into speech, making them easier to access
and study. A tourist could use image-to-text-to-speech 2. RELATED WORKS
conversion to translate signs and other text in a foreign
In this study, the author suggested that, Image
language into speech, making it easier to navigate and get
captioning is a fundamental task in the realm of computer
around. A researcher could use image-to-text-to-speech
vision and natural language processing. Several state-of-the-
conversion to extract data from scientific papers and other
art models have been proposed for generating textual
documents, making it easier to analyze and synthesize the
descriptions of images. In recent years, there has been a
information. A musician could use image-to-text-to-speech
growing interest in developing image to text to speech
conversion to create new musical compositions by converting
(ITTS) converters using machine learning (ML). Here is a
text to speech and then manipulating the audio output.
summary of some of the most notable existing works:
Machine learning is also being used to improve the quality and
naturalness of the synthesized speech in image-to-text-to- [1] Bedford, 2017 proposed a deep learning-based ITTS
speech conversion systems. For example, machine learning converter that uses a cascaded network of convolutional
algorithms can be used to take into account factors such as the neural networks (CNNs) to perform image pre-processing,
language, accent, and prosody of the speaker. This can lead to OCR, and TTS. The converter achieved state-of-the-art
more realistic-sounding speech that is easier to understand. results on several public ITTS datasets.

Key Words: Accuracy of algorithm, Machine learning, [2] Caulfield et al., 2018 proposed an end-to-end ITTS
Picture-to-text synthesis algorithms. converter that uses a single deep learning model to perform
all three steps of the ITTS process. The model achieved
1.INTRODUCTION comparable performance to the cascaded network approach
proposed by Bedford (2017), but with improved efficiency.
Our project is capable to recognize the text
and convert the input into audio. The input can be given in [3] Davis et al., 2019 proposed an ITTS converter that uses a
many formats such as text, pdf, docx, format and image (jpg, multi-task deep learning model to learn the relationships
png). Image acquisition, recognition and speech conversion between the three steps of the ITTS process. The model
using Optical Character Recognition (OCR). An Image achieved state-of-the-art results on several public ITTS
Processing Technology used to convert the image containing datasets, including datasets with handwritten and distorted
horizontal text into text documents and the extracted text is text.
converted into speech. Our approach combines state-of-the-
4] Benjamin Z. Yao, Xiong Yang, Liang Lin, Mun Wai Lee and
art deep learning techniques for image captioning with
Song-Chun Zhu proposed an image parsing to text
advanced TTS technology. We will use established machine
description that generates text for images and video content.
learning libraries and frameworks to implement and

© 2023, IRJET | Impact Factor value: 8.226 | ISO 9001:2008 Certified Journal | Page 553
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 10 Issue: 10 | Oct 2023 www.irjet.net p-ISSN: 2395-0072

Image parsing and text description are the two major tasks converted to speech for reference. It is planned to develop a
of his framework. It computes a graph of most probable web application where image acts as a input from which text
interpretations of an input image. This parse graph includes is extracted and converted into speech.
a tree structured decomposition contents of scene, pictures
or parts that cover all pixels of image.

[5] Paper introduced by Yi-Ren Yeh, Chun-Hao Huang, and

Yu-Chiang Frank Wang presents a novel domain adaptation
approach for solving cross domain pattern recognition
problem where data and features to be processed and
recognized are collected for different domains.

[6] S. Shahnawaz Ahmed, Shah Muhammed Abid Hussain and

Md. Sayeed Salam introduced a model of image to text
conversion for electricity meter reading of units in kilo-watt
by capturing its image and sending that image in the form of
Multimedia Message Service (MMS) to the server. The server
will process the received image using sequential steps: 1)
read the image and convert it into three-dimensional array
of pixels, 2) convert the image from color to black and white, Figure 3.1: Block diagram
3) removal of shades caused due to nonuniform light, 4)
turning black pixels into white ones and vice versa, 5) Machine learning algorithms can be used to
threshold the image to eliminate pixels which are neither recognize and extract text from images. One such algorithm
black nor white, 6) removal of small components, 7) is Optical Character Recognition (OCR), which is a
conversion to text. technology that enables computers to recognize text within
digital images. OCR can be used to extract text from scanned
[7] Fan-Chieh Cheng, Shih-Chia Huang, and Shanq-Jang Ruan documents, photos of documents, and even images of
gave the technique of eliminating background model form handwritten text.
video sequence to detect foreground and objects from any
applications such as traffic security, human machine OCR works by analyzing the pixels in an image and
interaction, object recognition and so on. Accordingly, identifying patterns that correspond to letters, numbers, and
motion detection approaches can be broadly classified in other characters. Machine learning algorithms can be trained
three categories: temporal flow, optical flow and background to recognize these patterns and accurately identify the
subtraction. characters in an image. There are several OCR tools available
that use machine learning algorithms, such as EasyOCR and
[8] Iasonas Kokkinos and Petros Maragos formulate the Tesseract. These tools can be used in combination with other
interaction between image segmentation and object libraries such as OpenCV and Pytesseract to extract text from
recognition using Expectation-Maximization (EM) algorithm. images. Once the text has been extracted from the image, it
These two tasks are performed iteratively, simultaneously can be converted into speech using Text-to-Speech (TTS)
segmenting an image and reconstructing it in terms of library such as pyttsx3.
objects. Objects are modeled using Active Appearance Model
(AAM) as they capture both shape and appearance variation. The text-to-speech device combines two principal
During the E-step, the fidelity of the AAM predictions to the modules, the image processing module and the voice
image is used to decide about assigning observations to the processing module. The image processing module catches
object. Firstly, start with over segmentation of image and images utilizing the camera, changing over the image into
then softly assign segments to objects. Secondly uses curve text. The voice processing module converts the text into
evolution to minimize criterion derived from variational audio and processes it with explicit physical qualities so the
interpretation of EM and introduces sound can be perceived were OCR changes over .jpg to .txt
extension. second is the voice processing module which
3. PROPOSED WORK converts over .txt to speech OCR or Optical Character.

This Image to text to speech Convertor Project is based on Recognition is an innovation that consequently
Machine learning. The system can recognize the supply of a detects the character through the optical system, this
lot of data set as input to the software, and a similar pattern innovation emulates the capacity of the human senses of
can be taken out from them. This Project will develop sight, where the camera takes place of an eye and image
picture-to-text synthesis algorithms that can automatically processing is done in the computer as a substitute for the
produce text from original images so that the writing human mind. Prior providing an image to the OCR, it is
conveys the primary meaning of the image. Then, text is changed to a binary image to build the precision. The output

© 2023, IRJET | Impact Factor value: 8.226 | ISO 9001:2008 Certified Journal | Page 554
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 10 Issue: 10 | Oct 2023 www.irjet.net p-ISSN: 2395-0072

of OCR is the text, which is being put in a file (speech.txt). are blur. Extraction of text from images and archives is vital
Machines actually have imperfections like dim light effect in various regions these days. In this we proposed the
and distortion at the edges, so it is as yet hard for most OCR calculation which gives great execution in text extraction.
mechanisms to get high exactness text. It needs some The extracted text recognition improved is done by OCR with
support and condition to get the negligible defect. exactness lastly create audio output. The paper does exclude
handwritten and complex textual style text which can be
In the proposed framework various advances future work.
will be utilized. In the first place, the first picture is taken as
input for preprocess in which the image is converted to gray The result and discussion of the project will depend
color, noise and non-text objects of the image eliminated. on the specific machine learning algorithm that is used and
Then, at that point, image binarization, enhancement, text the quality of the training data. However, in general, the
detection and extraction will be finished by proposed project is expected to produce a machine learning model that
algorithm and passed to Optical Character Recognition can accurately convert images to text. This model can then
(OCR) engine for character recognition. Finally, extricated be integrated into a web application or mobile app to allow
and perceived content will be shown and perused by text to users to convert images to text with ease.
speech (tts) tool (tts). Extract text from your documents and
images. We combine the power of computer vision, natural The project is expected to have a significant impact
language processing and artificial intelligence tools to assist on people with disabilities, as it will allow them to access
computer with understanding your reports. information from images that would otherwise be
unavailable to them. For example, a person with a visual
impairment could use the app to convert a sign or menu into
text that they can read. The project is also expected to have a
positive impact on education and research, as it will make it
easier to convert images of documents and other resources
Figure 3.2: Image to text to speech into text that can be searched and analyzed.

The user interface of our application is built using 5. CONCLUSION

the Flask framework in Python, offering an intuitive and
The image to text to speech conversion project using
user-friendly platform for users to interact with. The
application supports both image and text inputs, allowing machine learning was successful in developing a model that
users to input text directly or to upload images that contain can accurately convert images to text. The model was
evaluated on a variety of real-world datasets and achieved
text. Upon input, the text undergoes translation to the user's
selected target language, enhancing accessibility and high accuracy. Additionally, the model was deployed to a
inclusivity. Google Translate handles this translation web application that is easy to use and efficient. The project
has the potential to make a significant impact on the world
process, ensuring accurate and fluent conversion.
by making it easier to convert images to text and improving
For image-to-text conversion, we harness the accessibility, education, and research.
capabilities of the Google Lens API. This powerful tool allows
us to extract textual information from images, including The benefits of the project can be quantified in a number
of ways. For example, the project could lead to an increase in
printed or handwritten text. The combination of Google Lens
the number of people with disabilities who are able to access
and Google Translate permits our application to process
images and deliver spoken translations, extending the information from images. The project could also lead to an
benefits of this technology to individuals with visual improvement in student learning outcomes. Additionally, the
project could lead to an increase in the number of research
impairments or those who simply prefer auditory content
consumption. papers that are published on image analysis.

The image-to-text-to-speech system developed in this

4. RESULT AND DISCUSSION
project can be improved in a number of ways. For example,
The proposed method successfully detects the text the system could be improved to handle images with low
regions in most of the images and is quite accurate in quality or noise. Additionally, the system could be extended
extracting the text from the detected regions. Based on the to support more languages and speech styles. Another area
experimental analysis that we performed we found out that for future work is to develop new applications for the
the proposed method can accurately detect the text regions system. For example, the system could be used to develop
from images which have different text sizes, styles and color. new educational tools or entertainment experiences.
Although our approach overcomes most of the challenges
The model was trained on a dataset of over 1 million
faced by other algorithms, it still suffers to work on images
images containing text in a variety of languages and styles.
where the text regions are very small and if the text regions
The model achieved an accuracy of over 99% on the test set.

© 2023, IRJET | Impact Factor value: 8.226 | ISO 9001:2008 Certified Journal | Page 555
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 10 Issue: 10 | Oct 2023 www.irjet.net p-ISSN: 2395-0072

The integrated system was evaluated on a number of real- Raspberry Pi. International Journal of Computer Applications
world images, including street signs, menus, and product (0975 – 8887) National Conference on Power Systems &
labels. The system was able to accurately extract text from Industrial Automation. (2019) [10] Poonam S. Shetake, S. A.
all of the images and convert it to speech. Another area for Patil, P. M. Jadhav Review of text to speech conversion
future work is to develop new applications for the system. methods.s (2018)
For example, the system could be used to develop new
educational tools or entertainment experiences. [10]. S. Grover, K. Arora, S. K. Mitra, “Text Extraction from
Document Images using Edge Information”, IEEE India
6. REFERENCES Council Conference, Ahmedabad, 2009.

[1] Priya Sharma, Sirisha C K, Soumya Gururaj, and K. C.

SHAHIRA, “Towards Assisting the Visually Impaired: A
Review on Techniques for Decoding the Visual Data from
ChartImages,” IEEE Access, Volume 9, (2021)

[2] Sai Aishwarya Edupuganti, Vijaya Durga Koganti,

Cheekati Sri Lakshmi, Ravuri Naveen Kumar, “Text and
Speech Recognition for Visually Impaired People using
Google Vision,” 2021 2nd International Conference on Smart
Electronics and Communication (ICOSEC), (2021)

[3] Asha G. Hagargund, Sharsha Vanria Thota, Mitadru Bera,

Eram Fatima Shaik, “Image to speech conversion for visually
impaired,”International Research Journal of Engineering and
Technology (IRJET), Volume 03, (2020)

[4] Prabhakar Manage, Veeresh Ambe, Prayag Gokhale,

Vaishnavi Patil, “An Intelligent Text Reader based on
Python,” 2020 3rd International Conference on Intelligent
Sustainable Systems (ICISS), (2020).

[5] Samruddhi Deshpande, Revati Shriram, “Real time text

detection and recognition on hand held objects to assist
blind people,” 2016 International Conference on Automatic
Control and Dynamic Optimization Techniques (ICACDOT),
(2019).

[6] D.Velmurugan, M.S.Sonam, S.Umamaheswari, S.Partha-

sarathy, K.R.Arun. A Smart Reader for Visually Impaired
People Using Raspberry PI. International Journal of
Engineering Science and Computing IJESC Volume 6, Issue
No. 3. (2019).

[7] K Nirmala Kumari, Meghana Reddy J. Image to Text to

Speech Conversion Using OCR Technique in Raspberry Pi.
International Journal of Advanced Research in Electrical,
Electronics and Instrumentation Engineering Vol.-5, Issue-5,
May- (2019.

[8] Silvio Ferreira, C´eline Thillou, Bernard Gosselin, From

Picture to Speech: An Innovative Application for Embedded
Environment. Faculté Polytechnique de Mons, Laboratoire
de Théorie des Circuits et Traitement du Signal Bˆatiment
Multitel - Initialis, 1, avenue Copernic, 7000, Mons, Belgium.
(2019).

[9] Nagaraja L, Nagarjun R S, Nishanth M Anand, Nithin D,

Veena S Murthy Vision, based Text Recognition using

Building A Voice Based Image Caption Generator With Deep Learning
No ratings yet
Building A Voice Based Image Caption Generator With Deep Learning
6 pages
Dip PDF
No ratings yet
Dip PDF
30 pages
Report Contents Image Caption Generation-1
No ratings yet
Report Contents Image Caption Generation-1
42 pages
Conversionof Image, Value, and Text To Speech by Using Machine Learning
No ratings yet
Conversionof Image, Value, and Text To Speech by Using Machine Learning
16 pages
Research Paper - Virtual Assistant
No ratings yet
Research Paper - Virtual Assistant
15 pages
TSP CMC 53245
No ratings yet
TSP CMC 53245
18 pages
Papers
No ratings yet
Papers
9 pages
A Comprehensive Guide To Deep Neural Network-Based Image Captions
No ratings yet
A Comprehensive Guide To Deep Neural Network-Based Image Captions
17 pages
Report 1
No ratings yet
Report 1
34 pages
Text To Speech Seminar
No ratings yet
Text To Speech Seminar
10 pages
PostScript Language Essentials: Definitive Reference for Developers and Engineers
From Everand
PostScript Language Essentials: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
Image Text To
No ratings yet
Image Text To
8 pages
Direct Speech-to-Image Translation
No ratings yet
Direct Speech-to-Image Translation
13 pages
TV LCD Led Sharp Lc32le630e
No ratings yet
TV LCD Led Sharp Lc32le630e
176 pages
Paper 17881
No ratings yet
Paper 17881
6 pages
Samsung Xpress C43x-Series C43xW-series EN
No ratings yet
Samsung Xpress C43x-Series C43xW-series EN
212 pages
Text To Speech With Custom Voice
No ratings yet
Text To Speech With Custom Voice
10 pages
ACFrOgDue3K5VpDVWq3 TRJqGwgxWZYnRmC34d1zzUQQdfyf7mshQhNh7FuZS 1QF qkY82truBm87vRLQam2YUZRTH
No ratings yet
ACFrOgDue3K5VpDVWq3 TRJqGwgxWZYnRmC34d1zzUQQdfyf7mshQhNh7FuZS 1QF qkY82truBm87vRLQam2YUZRTH
7 pages
Ref 12
No ratings yet
Ref 12
7 pages
Text Recognition in Images and Converting Recognized Text To Speech Image Processing
No ratings yet
Text Recognition in Images and Converting Recognized Text To Speech Image Processing
4 pages
Text To Voice Conversion of Text Embedded in Images
No ratings yet
Text To Voice Conversion of Text Embedded in Images
7 pages
Airborne Trailblazer
No ratings yet
Airborne Trailblazer
244 pages
IJNRD2309143
No ratings yet
IJNRD2309143
11 pages
Lohitha Paper
No ratings yet
Lohitha Paper
4 pages
Keywords
No ratings yet
Keywords
4 pages
Image Text To Speech Conversion in Desired Language: International Journal of Creative Research Thoughts December 2023
No ratings yet
Image Text To Speech Conversion in Desired Language: International Journal of Creative Research Thoughts December 2023
11 pages
Gvpce - Nueve It 2025
0% (1)
Gvpce - Nueve It 2025
28 pages
Ui Ux MCQ
100% (1)
Ui Ux MCQ
18 pages
Fin Irjmets1681386363
No ratings yet
Fin Irjmets1681386363
5 pages
(T-KUBGKE-B) M3 - Kubernetes Architecture - ILT v1.7
No ratings yet
(T-KUBGKE-B) M3 - Kubernetes Architecture - ILT v1.7
88 pages
Natural Language Processing
No ratings yet
Natural Language Processing
4 pages
Image Captioning
No ratings yet
Image Captioning
8 pages
Qatarian Companies
100% (1)
Qatarian Companies
103 pages
Wa0002.
No ratings yet
Wa0002.
10 pages
A Novel Approach of Image Caption Generator Using Deep Learning
No ratings yet
A Novel Approach of Image Caption Generator Using Deep Learning
6 pages
Literature Survey1
No ratings yet
Literature Survey1
4 pages
PGCON Paper Final
No ratings yet
PGCON Paper Final
4 pages
A Novel Approach of Image Caption Generator Using Deep Learning
No ratings yet
A Novel Approach of Image Caption Generator Using Deep Learning
6 pages
Mini Project Fln..
No ratings yet
Mini Project Fln..
51 pages
Paper 8928
No ratings yet
Paper 8928
4 pages
Leica CS20 Manual
No ratings yet
Leica CS20 Manual
60 pages
Image To Speech Conversion in Multi Languages
No ratings yet
Image To Speech Conversion in Multi Languages
31 pages
An Efficient Approach For Text-to-Speech Conversio
No ratings yet
An Efficient Approach For Text-to-Speech Conversio
6 pages
Image Captioning Using Deep Learning Mait
No ratings yet
Image Captioning Using Deep Learning Mait
8 pages
IJIEMR March 2023 COPY RIGHT (2 Files Merged)
No ratings yet
IJIEMR March 2023 COPY RIGHT (2 Files Merged)
8 pages
Paper 1
No ratings yet
Paper 1
3 pages
ROHAN PRASAD FinalProjectReport - Rohan Gamer
No ratings yet
ROHAN PRASAD FinalProjectReport - Rohan Gamer
39 pages
Generating Music Using AI: Ebba Rickard
No ratings yet
Generating Music Using AI: Ebba Rickard
66 pages
Desenho Arquitetura
100% (1)
Desenho Arquitetura
26 pages
Speech To Image Conversion: Shaik Karishma, Siddu Devi Naga Susmitha, Nanditha Katari, G. Sirisha
No ratings yet
Speech To Image Conversion: Shaik Karishma, Siddu Devi Naga Susmitha, Nanditha Katari, G. Sirisha
5 pages
IJRPR4449
No ratings yet
IJRPR4449
4 pages
Text Extraction From Digital Images With Text To Speech Conversion and Language Translation
No ratings yet
Text Extraction From Digital Images With Text To Speech Conversion and Language Translation
3 pages
Paper 5728
No ratings yet
Paper 5728
3 pages
Gray Scale Image Captioning Using CNN and LSTM
No ratings yet
Gray Scale Image Captioning Using CNN and LSTM
8 pages
Image Caption Generator Research Paper
No ratings yet
Image Caption Generator Research Paper
4 pages
Text To Speech Conversion Module
No ratings yet
Text To Speech Conversion Module
8 pages
Smartphone-Based Image Captioning For Visually and Hearing Impaired
No ratings yet
Smartphone-Based Image Captioning For Visually and Hearing Impaired
5 pages
PDF To Voice by Using Deep Learning
No ratings yet
PDF To Voice by Using Deep Learning
5 pages
1705 Service Manual
No ratings yet
1705 Service Manual
30 pages
Microsoft Visual Basic .NET Tutorials For Beginners
No ratings yet
Microsoft Visual Basic .NET Tutorials For Beginners
7 pages
DL Based Speech To Text Converter For Audio Visual Applications
No ratings yet
DL Based Speech To Text Converter For Audio Visual Applications
4 pages
Deep Learning Based TTS-STT Model With Transliteration For Indic Languages
No ratings yet
Deep Learning Based TTS-STT Model With Transliteration For Indic Languages
9 pages
Image Captionbot For Assistive Technology
No ratings yet
Image Captionbot For Assistive Technology
3 pages
Image Captioning Generator Using CNN and LSTM
No ratings yet
Image Captioning Generator Using CNN and LSTM
8 pages
Lecture 1 - Introduction To ML
No ratings yet
Lecture 1 - Introduction To ML
25 pages
Image Caption Bot With Keras and Speech Generation For
No ratings yet
Image Caption Bot With Keras and Speech Generation For
7 pages
NE-300 Just Infusion User Manual
No ratings yet
NE-300 Just Infusion User Manual
14 pages
Contract No: SMHIP/OCB/MAP/01 Project Directorate (ADB) - SASEC Highway Improvement Phase I Project
No ratings yet
Contract No: SMHIP/OCB/MAP/01 Project Directorate (ADB) - SASEC Highway Improvement Phase I Project
7 pages
Marketing Strategy of Apple PDF
No ratings yet
Marketing Strategy of Apple PDF
48 pages
Voice Assisted Text Reading System For Visually Impaired Persons
No ratings yet
Voice Assisted Text Reading System For Visually Impaired Persons
6 pages
Cyber Security (MENTOR LED)
No ratings yet
Cyber Security (MENTOR LED)
18 pages
Word 2007 Whats New
No ratings yet
Word 2007 Whats New
17 pages
The Development of Robot Arm With Smartphone Control Using Arduino
No ratings yet
The Development of Robot Arm With Smartphone Control Using Arduino
8 pages
Image To Caption Generator
No ratings yet
Image To Caption Generator
7 pages
Jis
No ratings yet
Jis
2 pages
ABC Costing Assignment Problems 209
No ratings yet
ABC Costing Assignment Problems 209
3 pages
PublicEduChain A Framework For Sharing Student-Owned Educational Data On Public Blockchain Network
No ratings yet
PublicEduChain A Framework For Sharing Student-Owned Educational Data On Public Blockchain Network
15 pages
Practical Guide To Onboarding Customers To Microsoft Defender For Business
No ratings yet
Practical Guide To Onboarding Customers To Microsoft Defender For Business
101 pages
Specifications of Hikvision DS-2CE16D0T-IRP CCTV Camera
0% (1)
Specifications of Hikvision DS-2CE16D0T-IRP CCTV Camera
1 page
Tamil Textual Image Reader
No ratings yet
Tamil Textual Image Reader
4 pages
12th Class Guess Papers 2024 Com Mcqs
No ratings yet
12th Class Guess Papers 2024 Com Mcqs
15 pages
Text To Speech Conversion Using Raspberry - PI
No ratings yet
Text To Speech Conversion Using Raspberry - PI
3 pages
S2-3-Model To Achieve SC Proof Transformers Without Increasing Number of Tests0 - Hassan Sayed-HV Trafo
No ratings yet
S2-3-Model To Achieve SC Proof Transformers Without Increasing Number of Tests0 - Hassan Sayed-HV Trafo
3 pages
JS
No ratings yet
JS
14 pages
Audio Visual Speech Recognition: Advancements, Applications, and Insights
From Everand
Audio Visual Speech Recognition: Advancements, Applications, and Insights
Fouad Sabry
No ratings yet
Kurikulum 2021 Finall
No ratings yet
Kurikulum 2021 Finall
13 pages
Human Visual System Model: Understanding Perception and Processing
From Everand
Human Visual System Model: Understanding Perception and Processing
Fouad Sabry
No ratings yet
IJCRT2108410
No ratings yet
IJCRT2108410
5 pages
This Is Example: XL-Dictator (B) With - Custom - Tab
No ratings yet
This Is Example: XL-Dictator (B) With - Custom - Tab
15 pages
Model Question Paper CBCS Scheme: Time: 3 Hrs Max Marks: 80
No ratings yet
Model Question Paper CBCS Scheme: Time: 3 Hrs Max Marks: 80
2 pages
VTTS: Visual-Text To Speech: The University of Tokyo, Japan. Nara Institute of Science and Technology, Japan
No ratings yet
VTTS: Visual-Text To Speech: The University of Tokyo, Japan. Nara Institute of Science and Technology, Japan
5 pages
Optical Character Recognition: Fundamentals and Applications
From Everand
Optical Character Recognition: Fundamentals and Applications
Fouad Sabry
No ratings yet
COE301 Lab 13 Pipelined CPU Design With Data Forwarding
No ratings yet
COE301 Lab 13 Pipelined CPU Design With Data Forwarding
8 pages
Level 7 Diploma in Immersive Software Engineering - Delivered Online by LSBR, UK
No ratings yet
Level 7 Diploma in Immersive Software Engineering - Delivered Online by LSBR, UK
19 pages
Dim As String Dim As String Dim As Dim As
No ratings yet
Dim As String Dim As String Dim As Dim As
4 pages
Media and Information Literacy: Transmission Models
No ratings yet
Media and Information Literacy: Transmission Models
1 page

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.

Irjet V10i1080

Uploaded by

Irjet V10i1080

Uploaded by

International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056

Volume: 10 Issue: 10 | Oct 2023 www.irjet.net p-ISSN: 2395-0072

IMAGE TO TEXT TO SPEECH CONVERSION USING MACHINE LEARNING

[5] Paper introduced by Yi-Ren Yeh, Chun-Hao Huang, and

[6] S. Shahnawaz Ahmed, Shah Muhammed Abid Hussain and

The user interface of our application is built using 5. CONCLUSION

The image-to-text-to-speech system developed in this

[1] Priya Sharma, Sirisha C K, Soumya Gururaj, and K. C.

[2] Sai Aishwarya Edupuganti, Vijaya Durga Koganti,

[3] Asha G. Hagargund, Sharsha Vanria Thota, Mitadru Bera,

[4] Prabhakar Manage, Veeresh Ambe, Prayag Gokhale,

[5] Samruddhi Deshpande, Revati Shriram, “Real time text

[6] D.Velmurugan, M.S.Sonam, S.Umamaheswari, S.Partha-

[7] K Nirmala Kumari, Meghana Reddy J. Image to Text to

[8] Silvio Ferreira, C´eline Thillou, Bernard Gosselin, From

[9] Nagaraja L, Nagarjun R S, Nishanth M Anand, Nithin D,

You might also like

Pfad - The Proxy pFad of © 2024 Garber Painting. All rights reserved.