© 2022 JETIR November 2022, Volume 9, Issue 11 www.jetir.org (ISSN-2349-5162)

A SURVEY ON FORMAL LANGUAGE TRANSLATION (TELUGU - ENGLISH)
Mrs. P. Swaroopa¹, Kalakonda Vishnu², Pulipati Akshay³, S. Siva Sai Sandip⁴, Vennam Vivek⁵
¹ Assistant Professor, Department of Computer Science and Engineering, ACE Engineering College, Hyderabad, Telangana, India
²,³,⁴,⁵ IV BTech Students, Department of Computer Science and Engineering, ACE Engineering College, Hyderabad, Telangana, India

ABSTRACT
Language, as the carrier of information, is essential for humans to communicate and share information. Language barriers arise when people with different language backgrounds communicate. To solve this problem, we need translators that can translate from one language to another quickly and effectively. Machine translation makes this possible. Machine Translation is not a new idea: computer scientists have been working on it for about 50 years, and it falls under Natural Language Processing (NLP), an area closely tied to Machine Learning [1]. Our paper deals with a machine translator that converts Telugu to English. It is a formal translator that aims to translate different shatakas, proverbs, and poems from Telugu to English. In our system, Telugu is the source language and English is the target language; compared to Telugu, English is a morphologically simple language. There are different approaches to machine translation, namely rule-based systems, statistical translation systems and neural machine translation. This work is a study of corpus pre-processing and of the different methodologies used to construct a Machine Translator system.
INTRODUCTION
India is a multilingual country with more than 18 languages. Different groups speak different languages, and most people in India can neither understand nor speak all of them. Most of the population speaks languages such as English and Hindi. A translator is therefore needed whenever people from different regions try to interact with each other, one that can translate from one language to another quickly and effectively. This can be achieved using a Machine Translator.
A Machine Translator is a program that converts text from one language to another. The language in which the input is given is known as the source language, and the language in which the output is obtained is known as the target language. Several machine translators in use today are effective and can solve this problem, but these existing translators often fail when they have to convert poems and proverbs from the source language to the target language.

JETIR2211403 Journal of Emerging Technologies and Innovative Research (JETIR) d762


Every language has its own proverbs and poems, created from the ideologies of its poets, people and culture. Special care must be taken when translating these language-specific poems and proverbs, and this is our problem statement: constructing a Machine Translator that can correctly translate different proverbs and shatakas from Telugu to English. In our case, the source language is Telugu and the target language is English. Compared to Telugu, English is a morphologically simple language, so our model translates from a morphologically rich language to a morphologically simple one.
Machine Translation systems are mainly of three types: rule-based systems, statistical translation systems and neural machine translation. A rule-based system uses the grammar rules of the languages together with manually created dictionaries of common words. These systems are robust and cost-effective. Statistical translation systems do not follow any grammar rules; they learn by themselves, but their accuracy is lower than that of the other systems. Neural Machine Translation (NMT) is a popular approach that uses neural networks, such as the Recurrent Neural Network (RNN), to translate. Its accuracy is high, and most developers and researchers follow NMT because it performs accurately on many language pairs compared with the other approaches. There are also other approaches such as the Encoder-Decoder architecture and the LSTM architecture. There are thus different ways to construct a Machine Translator, and some of these methods are discussed in this paper.
LITERATURE REVIEW
Much research has been done on machine translation with the aim of improving the accuracy of the system, from data gathering and pre-processing to the type of algorithm used to build a better machine translator. We have read some of these papers to understand the problem better and to learn about the different techniques used to build a Machine Translator and their drawbacks.

[1] The problem addressed by the sentence-wise Telugu to English translator is to translate the Vemana Sathakam from Telugu to English. The authors use neural machine translation (NMT) with Long Short-Term Memory (LSTM). Neural machine translation is a newer technique for machine translation that uses artificial neural networks (ANN) to increase accuracy and performance. An LSTM is very similar to an RNN; the main difference lies in the repeating unit: a standard RNN unit uses a single neural network layer, whereas an LSTM unit uses four interacting layers. They use a bidirectional LSTM to translate poems from Telugu to English. According to the paper, NMT with LSTM addresses the problems of accuracy and of needing a large data set for both training and evaluation. The drawback of this paper is that the idea is limited to the Vemana Sathakam; it could be extended to other shatakas, proverbs and holy books in Telugu.
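To make the "four layers" of an LSTM unit concrete, the following is a minimal sketch in plain Python, not the implementation used in [1]. The function names, the random initialisation and the tiny dimensions are all assumptions chosen for illustration. It shows the forget, input, candidate and output layers of one unit, and a bidirectional pass that reads the sequence in both directions and concatenates the two hidden states at each position.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def affine(weights, bias, vector):
    """Return weights @ vector + bias for a list-of-lists matrix."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


def init_params(input_size, hidden_size, seed):
    """Create small random weights for the four layers of one LSTM unit (illustrative)."""
    rng = random.Random(seed)

    def layer():
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(hidden_size + input_size)]
                   for _ in range(hidden_size)]
        return weights, [0.0] * hidden_size

    return {name: layer() for name in ("forget", "input", "candidate", "output")}


def lstm_step(x, h_prev, c_prev, params):
    """One time step: four layers act on the previous hidden state joined with the input."""
    z = h_prev + x  # list concatenation
    f = [sigmoid(a) for a in affine(*params["forget"], z)]       # what to keep from c_prev
    i = [sigmoid(a) for a in affine(*params["input"], z)]        # how much new content to admit
    g = [math.tanh(a) for a in affine(*params["candidate"], z)]  # proposed new content
    o = [sigmoid(a) for a in affine(*params["output"], z)]       # what to expose as h
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c


def run(sequence, params, hidden_size):
    """Run the unit over a sequence and return the hidden state at every position."""
    h, c = [0.0] * hidden_size, [0.0] * hidden_size
    states = []
    for x in sequence:
        h, c = lstm_step(x, h, c, params)
        states.append(h)
    return states


def bidirectional_encode(sequence, fwd_params, bwd_params, hidden_size):
    """Concatenate left-to-right and right-to-left hidden states per position."""
    forward = run(sequence, fwd_params, hidden_size)
    backward = run(sequence[::-1], bwd_params, hidden_size)[::-1]
    return [f + b for f, b in zip(forward, backward)]


# Three made-up 2-dimensional word vectors standing in for a short sentence.
sentence = [[0.1, -0.2], [0.4, 0.3], [-0.5, 0.9]]
encoded = bidirectional_encode(sentence, init_params(2, 3, seed=1),
                               init_params(2, 3, seed=2), hidden_size=3)
print(len(encoded), len(encoded[0]))
```

A plain RNN unit would keep only one such layer (a single `tanh` over `z`); the extra gates are what let the LSTM carry information across longer spans of a verse.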
[2] Pre-processing of English to Hindi Corpus for Statistical Machine Translation shows that paying attention to the pre-processing technique and improving it improves accuracy. The impact of the pre-processing experiments on translation quality is measured with BLEU (Bilingual Evaluation Understudy, an algorithm for evaluating the quality of translated output). The pre-processing methodology followed includes casing, punctuation handling, spelling normalization, etc. These experiments show improved accuracy in English-Hindi translation. The authors state that the best combination of pre-processing steps can be used to improve accuracy. A drawback is that linguistic features are not included, such as re-ordering source sentences to match the target word order using source-side phrase information.
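The interplay between pre-processing and BLEU can be illustrated with a small sketch. This is not the pipeline from [2]: the spelling-variant table, the tokenising regular expression (suited to the English side only) and the single-reference, bigram-limited BLEU are simplifying assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical spelling-variant table; a real system would derive one from its corpus.
SPELL_VARIANTS = {"colour": "color", "organise": "organize"}


def preprocess(sentence):
    """Lowercase, split punctuation into separate tokens, normalize spelling variants."""
    tokens = re.findall(r"\w+|[^\w\s]", sentence.lower())
    return [SPELL_VARIANTS.get(tok, tok) for tok in tokens]


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=2):
    """Sentence-level BLEU with one reference and uniform weights up to max_n."""
    if not candidate:
        return 0.0
    log_precision = 0.0
    for n in range(1, max_n + 1):
        cand, ref = ngrams(candidate, n), ngrams(reference, n)
        total = sum(cand.values())
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(count, ref[gram]) for gram, count in cand.items())
        if total == 0 or clipped == 0:
            return 0.0
        log_precision += math.log(clipped / total) / max_n
    # Brevity penalty: penalize candidates shorter than the reference.
    if len(candidate) > len(reference):
        penalty = 1.0
    else:
        penalty = math.exp(1.0 - len(reference) / len(candidate))
    return penalty * math.exp(log_precision)


reference = "The colour is red."
candidate = "The color is red."
print(bleu(candidate.split(), reference.split()))            # whitespace tokens only
print(bleu(preprocess(candidate), preprocess(reference)))    # after pre-processing
```

In this toy pair the two sentences differ only in a spelling variant, so normalising both sides before scoring raises the BLEU value; this is the kind of effect the pre-processing experiments measure.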
[3] Statistical Machine Translation (SMT) is a technique that can be used to solve the problem of Machine Translation. It is a Machine Learning based technique that examines many samples of human-produced translations, from which the SMT algorithm learns how to translate automatically. This paper raises an important point about morphologically rich languages. It states that translating from morphologically rich languages such as German or Arabic into a morphologically simple language such as English can be visualized as movement from a higher-dimensional space to a lower-dimensional space, in which some loss of meaning and nuance is harmless. Extra attention must be taken, however, when translating in the opposite direction, from a morphologically simple language to a morphologically rich one.
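How an SMT system can learn word translations from example sentence pairs alone is sketched below with IBM Model 1 trained by expectation-maximisation. This is one classic instance of the idea, chosen here for illustration rather than taken from [3]. The three-sentence corpus uses transliterated Telugu words and is an assumption made for the example; real systems train on far larger parallel corpora.

```python
def train_ibm_model1(pairs, iterations=10):
    """Estimate t(target_word | source_word) with EM over tokenised sentence pairs."""
    target_vocab = {e for _, tgt in pairs for e in tgt}
    # Start uniform over every word pair that co-occurs in some sentence pair.
    t = {(e, f): 1.0 / len(target_vocab)
         for src, tgt in pairs for f in src for e in tgt}
    for _ in range(iterations):
        count, total = {}, {}
        # E-step: distribute each target word over the source words of its sentence.
        for src, tgt in pairs:
            for e in tgt:
                norm = sum(t[(e, f)] for f in src)
                for f in src:
                    frac = t[(e, f)] / norm
                    count[(e, f)] = count.get((e, f), 0.0) + frac
                    total[f] = total.get(f, 0.0) + frac
        # M-step: re-normalise the expected counts per source word.
        for (e, f), c in count.items():
            t[(e, f)] = c / total[f]
    return t


def best_translation(t, source_word):
    """Return the target word with the highest learned probability for source_word."""
    candidates = {e: p for (e, f), p in t.items() if f == source_word}
    return max(candidates, key=candidates.get)


# Toy parallel corpus (illustrative transliterations): Telugu source, English target.
PAIRS = [
    (["aa", "illu"], ["that", "house"]),
    (["aa", "pustakam"], ["that", "book"]),
    (["oka", "pustakam"], ["a", "book"]),
]

table = train_ibm_model1(PAIRS)
for word in ("aa", "illu", "oka", "pustakam"):
    print(word, "->", best_translation(table, word))
```

No dictionary or grammar rule is supplied: the model separates the words only because "aa" and "pustakam" each recur in two different sentence pairs, which is the sense in which SMT learns from samples of human-produced translations.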
[4] This paper explores the Neural Machine Translation methodology and tries to improve its performance by not encoding the source sentence into a fixed-length vector. It focuses on translation from English to French. The data set used is very large, with more than 275M English-French words. The authors compare two models, a conventional RNN Encoder-Decoder and the proposed model that jointly learns to align and translate, training them with the stochastic gradient descent algorithm for about 5 days. The drawbacks are that the approach was tested only for English to French and the models were trained for only 5 days; the data set could also be improved.
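The idea of replacing one fixed-length vector with a weighted view over all encoder states can be sketched as follows. This is a simplified illustration, not the model of [4]: it scores each encoder state with a plain dot product, whereas the cited paper learns the scoring function with a small network, and all vectors here are made-up numbers.

```python
import math


def softmax(scores):
    """Turn arbitrary scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(decoder_state, encoder_states):
    """Return attention weights and the context vector for one decoding step."""
    # Dot-product scoring is an assumption of this sketch.
    scores = [sum(d * h for d, h in zip(decoder_state, state))
              for state in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    # The context vector is the weighted average of the encoder states.
    context = [sum(w * state[k] for w, state in zip(weights, encoder_states))
               for k in range(dim)]
    return weights, context


# One vector per source word (illustrative values) and the current decoder state.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, context = attend([2.0, 0.0], encoder_states)
print(weights)
print(context)
```

Because a fresh context vector is computed for every output word, a long source sentence no longer has to be squeezed into a single vector before decoding starts.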
[5] This paper explores a hybrid approach designed to translate from Malayalam to English; compared to English, Malayalam is a morphologically rich language. The approach combines two methods, example-based machine translation and the transfer approach, for better efficiency and increased correctness. The authors use a Malayalam-to-English dataset. A drawback is that it cannot translate complex sentences.
[6] This paper focuses on translation from English to Telugu with emphasis on prepositions. It examines the different kinds of prepositions used in English and converts them to the corresponding postpositions in Telugu. Time, gender, context and many other features play an important role in selecting the appropriate postposition in Telugu.
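A rule of this kind can be sketched in a few lines. The mapping below uses illustrative transliterations and is an assumption for the example, not the rule set of [6]. The sketch also assumes the object of a preposition is the single word that follows it (after any article), leaves nouns untranslated, and ignores the time, gender and context features that a real system needs when choosing the postposition.

```python
# Illustrative English preposition -> transliterated Telugu postposition table.
POSTPOSITIONS = {
    "in": "lo",
    "to": "ki",
    "with": "tho",
    "for": "kosam",
    "from": "nundi",
    "on": "meeda",
}
ARTICLES = {"a", "an", "the"}


def to_postpositional(tokens):
    """Move each known preposition after its object and map it to a postposition."""
    out = []
    i = 0
    while i < len(tokens):
        word = tokens[i].lower()
        if word in POSTPOSITIONS:
            # Skip articles between the preposition and its object.
            j = i + 1
            while j < len(tokens) and tokens[j].lower() in ARTICLES:
                j += 1
            if j < len(tokens):
                out.append(tokens[j])            # object first
                out.append(POSTPOSITIONS[word])  # postposition after it
                i = j + 1
                continue
        out.append(tokens[i])
        i += 1
    return out


print(to_postpositional("book on the table".split()))
print(to_postpositional("went to school with friends".split()))
```

The reordering step is purely positional; selecting among several possible postpositions for the same English preposition is where the contextual features mentioned above would come in.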
[7] This paper discusses Cross Language Information Retrieval (CLIR), a subtopic of Information Retrieval (IR). It covers dictionary-based translation approaches and presents other methodologies such as machine translation and corpus-based translation. Problems that arise in this process include selecting the translation for a query and selecting the dictionary of possible translations.
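Dictionary-based query translation for CLIR can be sketched as below. The bilingual dictionary (with transliterated Telugu entries), the documents and the overlap-based ranking are all hypothetical and chosen only to show where the selection problems appear: a term may have several candidate translations, and a term may be missing from the dictionary altogether.

```python
def translate_query(query, dictionary):
    """Map each query term to its candidate translations; collect out-of-vocabulary terms."""
    translated, oov = [], []
    for term in query.lower().split():
        candidates = dictionary.get(term)
        if candidates:
            translated.append(list(candidates))  # keep every candidate (ambiguity)
        else:
            oov.append(term)
            translated.append([term])            # pass the term through unchanged
    return translated, oov


def score(document, translated):
    """Count query terms for which at least one candidate occurs in the document."""
    words = set(document.lower().split())
    return sum(1 for candidates in translated if words & set(candidates))


def retrieve(query, dictionary, documents):
    """Rank target-language documents for a source-language query."""
    translated, _ = translate_query(query, dictionary)
    return sorted(documents, key=lambda doc: score(doc, translated), reverse=True)


# Hypothetical dictionary: transliterated Telugu term -> English candidates.
DICTIONARY = {"illu": ["house", "home"], "nadi": ["river"], "pustakam": ["book"]}
DOCUMENTS = [
    "a book about birds",
    "the house near the river",
    "a home for sale",
]

print(retrieve("illu nadi", DICTIONARY, DOCUMENTS))
print(translate_query("illu kotta", DICTIONARY))
```

Keeping all candidates avoids committing to one sense too early, at the cost of noisier matches; choosing a single best candidate per term is the translation-selection problem the paper refers to.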
[8] This paper deals with different approaches to constructing a machine translator and focuses on machine translation system design based on declension rules. A Machine Translator that translates from English to Hindi is built. The efficiency of this method is high, but it cannot deal with complex sentences and it requires a heavy database.
[9] This article explains the procedure for developing a machine translator for five languages. It describes language components, CRL systems, semantic procedures and pragmatic procedures. The drawbacks are that it cannot handle complex statements, the vocabulary data used is small, and the results obtained are not perfect: there were some semantic errors and lexical failures.
[10] This article analyses several systems on the basis of their translation of English texts into Hindi. According to the presented results, systems using a statistical or hybrid approach are more accurate than those using rule-based approaches. The authors add that rule-based systems have their own benefits for translation. The hybrid approach, a combination of rule-based and statistical approaches, is seen as the future of machine translation systems.

| S.No. | Problem statement | Existing system | Drawback |
|---|---|---|---|
| 1 | To translate the Vemana Sathakam using sentence-wise Telugu to English translation. | Neural machine translator with bidirectional LSTM to obtain higher accuracy without a large dataset. | 1. Cannot handle rare or unknown words. 2. Limited to the Vemana Sathakam. |
| 2 | Pre-processing of English to Hindi corpus for Statistical Machine Translation. | The improved pre-processing methodology includes casing, punctuation handling, spelling normalization, etc. The experiments show improved accuracy in English-Hindi translation. | Does not include linguistic features such as re-ordering source sentences to match the target word order using source-side phrase information. |
| 3 | Machine Translator using the Statistical Machine Translation technique. | A Machine Learning based technique that examines many samples of human-produced translations, from which the SMT algorithm learns how to translate automatically. | Extra attention must be taken while translating from a morphologically simple language to a morphologically rich language. Cannot handle rare or unknown words. |
| 4 | Building an English to French Machine Translator using the Neural Machine Translation approach. | Focuses on translation from English to French. The data set used is very large, with more than 275M English-French words. Two models are compared, a conventional RNN Encoder-Decoder and the proposed model that jointly learns to align and translate, trained with the stochastic gradient descent algorithm for about 5 days. | Tested only for English to French; models trained for only 5 days; the data set can be improved. |
| 5 | Malayalam-English machine translation using a hybrid approach. | Designed to translate from Malayalam to English; compared to English, Malayalam is a morphologically rich language. The approach combines example-based machine translation and the transfer approach for better efficiency and increased correctness. A Malayalam-to-English dataset is used. | Cannot handle rare words. Lack of well-aligned bitexts. Cannot handle complex sentences. |
| 6 | Machine translation from English to Telugu using a rule-based approach. | Focuses on the different kinds of prepositions used in English and converts them to the corresponding postpositions in Telugu. | Cannot handle unknown words. |
| 7 | Different approaches to perform better Cross Language Information Retrieval. | Discusses different approaches for CLIR, such as dictionary-based translation, identifying and translating phrases and compound words, out-of-vocabulary terms, etc. | Accuracy of dictionary-based translation is lower than that of other approaches such as WSD. |
| 8 | To construct a Machine Translator that translates from English to Hindi. | Covers approaches to constructing a machine translator and focuses on machine translation system design based on declension rules. A Machine Translator from English to Hindi is built; the efficiency of this method is high. | Cannot deal with complex sentences for translation. Requires a heavy database. |
| 9 | Constructing a multilingual machine translator. | The procedure to develop a machine translator for five languages. It describes language components, CRL systems, semantic procedures and pragmatic procedures. | Cannot handle complex statements. The vocabulary data used is small. The results obtained are not perfect; there were some semantic errors and lexical failures. |
| 10 | To improve the efficiency of a machine translator by examining different approaches. | Analyses several systems on the basis of their translation of English texts into Hindi. According to the presented results, systems using a statistical or hybrid approach are more accurate than those using rule-based approaches. | Considers only translation from English to Hindi. |
CONCLUSION
The aim of this study is to build a Machine Translator that translates different shatakas, proverbs, and poems from Telugu to English without any semantic and syntactic errors. When a Telugu statement is given as input, a proper English sentence with the same meaning is displayed on the screen. The study involves major challenges such as poor-quality content, technology problems and social issues. This paper discusses the different methodologies followed to construct a Machine Translator, highlights the approaches with which one can be built, and compares the methods.

ACKNOWLEDGEMENT
We would like to thank our guide Ms. P. Swaroopa and our project coordinator Mrs. Soppari Kavitha for
their continuous support and guidance. We are also extremely grateful to Dr. M.V.VIJAYA SARADHI, Head,
Department of Computer Science and Engineering, ACE Engineering College for his support and invaluable
time.

REFERENCES
1. P. Sujatha and D. Lalitha Bhaskari, “Sentence Wise Telugu to English Translation of Vemana Sathakam using LSTM”, International Journal of Recent Technology and Engineering (IJRTE)
2. Pre-processing English-Hindi Corpus for Statistical Machine Translation ( Karunesh Kumar Arora
local Sharma S Agarwal centre for development of advanced computing Noida India, KIIT group of
institutions, Sohna road, Bonci Gurugram India go)
3. Adam Lopez, “Statistical Machine Translation”, ACM Computing Surveys, vol. 40, no. 3, Article 8, August 2008, DOI: https://doi.org/10.1145/1380584.1380586
4. Dzmitry Bahdanau, KyungHyun Cho and Yoshua Bengio, “Neural Machine Translation by Jointly Learning to Align and Translate”, published as a conference paper at ICLR 2015
5. Rosna P Haroon and Shaharban T A, “Malayalam Machine Translation using Hybrid Approach”,
International Conference on Electrical, Electronics, and Optimization Techniques (ICEEOT) 2016
6. Keerthi Lingam, E. Rama Lakshmi and L Ravi Theja, “Rule-based Machine Translation From English
to Telugu With Emphasis on Prepositions”, 2014 First International Conference on Networks & Soft
Computing
7. B.N.V Narasimha Raju and M S V S Bhadri Raju, “Statistical Machine Translation System for Indian Languages”, 2016 IEEE 6th International Conference on Advanced Computing
8. Jayashree Nair, Amrutha Krishnan K and Deetha R, “An Efficient English to Hindi Machine
Translation System Using Hybrid Mechanism”, 2016 ICACCI, Sept. 21-24, 2016, Jaipur, India
9. David Farwell and Yorick Wilks, “ULTRA: A Multilingual Machine Translator”, Computing Research Laboratory, New Mexico State University, Box 30001, Las Cruces, NM 88003
10. Sandeep Kharb, Hemant Kumar, Manoj Kumar and Dr. Arun Kumar Chaturvedi, “Efficiency of a
Machine Translation System”, International Conference on Electronics, Communication and
Aerospace Technology ICECA 2017
