
e-ISSN: 2582-5208

International Research Journal of Modernization in Engineering Technology and Science

( Peer-Reviewed, Open Access, Fully Refereed International Journal )
Volume:05/Issue:12/December-2023 Impact Factor- 7.868 www.irjmets.com

HINGLISH TO ENGLISH TRANSLATION SYSTEM

Komal Potdar*1, Namrata Gaikwad*2, Meenakshi Sutar*3, Aditya Kurapati*4,
Ronak Dabade*5, Gaurang Khanderay*6
*1,2,3,4,5,6 Department of Multidisciplinary Engineering, Vishwakarma Institute of Technology,
Pune, 411037, Maharashtra, India.
ABSTRACT
The use of code-mixing, a linguistic phenomenon where speakers blend multiple languages, is widespread
among non-English speakers globally, particularly on social media. In the Indian context, individuals frequently
engage in code-mixing, combining English and Hindi in their online conversations, a practice colloquially
known as Hinglish. This linguistic fusion creates a vast amount of unstructured text on platforms such as social
media, blogs, and reviews. As code-mixing becomes increasingly prevalent, it poses a significant challenge to
machine translation systems. This paper explores the algorithmic techniques developed to address
the complexities of handling code-mixed messages, focusing specifically on the Hinglish context. It discusses the
limitations of the system and its implications for the evolving field of text mining, shedding light on the modern
yet localized way of expression prevalent in Indian online communication. Through this study, the aim is to
contribute to the understanding of code-mixing challenges and advance the capabilities of natural language
processing systems in accommodating the diverse linguistic practices observed in social media discourse.
Keywords: Code-Mixing, Devanagari, English, Hinglish, Natural Language Processing, NLP, Transliteration,
Translation.
I. INTRODUCTION
In today's technology-driven world, it is increasingly common for people to mix languages when they talk or post on social
media, rather than keeping to a single language. This mixing produces large amounts of data that
machines find difficult to interpret. While considerable work exists on translating pure languages, content
written in mixed languages now needs more attention. This study focuses on a
translation model built specifically for a code-mixed language, Hinglish, within Natural Language Processing (NLP).
The model helps machines handle and interpret the mix of languages used in everyday
communication.
Hinglish is a language that mixes Hindi and English, blending them within conversations, individual sentences, and
even words. For example, "nahi mei nahi aa sakta" means "no, I cannot come." This style
of speaking is becoming popular because it is a modern way of talking that still connects with local culture.
India is a diverse country with people at both ends of the spectrum. At one end are Hindi speakers
who can read and understand the Devanagari script. At the other are tourists from abroad who may
or may not fully understand the language. Many local Indian markets, a big draw for foreign tourists, have
vendors who only understand Devanagari, so Hinglish is needed as a way for these two groups to
communicate effectively.
The aim of this work is to translate Hinglish (Hindi-English), a combination of the Hindi and English languages,
into pure English. The proposed model treats Hinglish as a standalone language and
acts as a direct translator from the code-mixed language into a pure form. This facilitates the analysis of
content expressed in mixed languages and improves the interaction between machines and humans,
fostering a more authentic communication experience.
II. RELATED WORK
S. H. Attri, T. V. Prasad, and G. Ramakrishna [4] first determined whether the sentence contained an expression or
idiom and, if so, extracted it. The sentence was then tokenized and each token classed as Hinglish, English, or Hindi, depending on its
original language. They then applied morphological and reverse morphological analysis to each term. After this analysis, POS
tagging sorted the words and translation was carried out. For example, "MujhE file send kar as soon as
possible" and "asap" were translated into Hindi. The resulting phrase in Hindi was "mujhE yathA shIghra
sanchikA bhEj", which translates to "Send me the file as soon as possible" in English. 12,000 Hinglish terms were

labelled with idioms and translations. Hinglish is essentially Hindi with English words added, using Hindi syntactic
and semantic components instead of English ones. Because of this, Hinglish sentences translated into pure Hindi
were found to be more accurate than those translated into pure English.
Considerable research is being done on code-mixed material, especially on language tagging. The model of Jhamtani et al.
(2014) [4] was an ensemble that combined two classifiers to form a language identifier (LID) for
Hindi-English code-mixed text. The first classifier employed features such as word frequency, modified edit distance, and
character n-grams, while the second classifier used the first classifier's output for the current word, together with
the language and POS tags of neighboring words, to produce the final tag.
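As a rough illustration of the character n-gram features mentioned above, the following Python sketch extracts padded character n-grams from a single word. The function name `char_ngrams`, the default of n = 3, and the `#` padding symbol are assumptions made for this example; they are not taken from the cited work.

```python
def char_ngrams(word, n=3, pad="#"):
    """Return the character n-grams of a word, padded at both boundaries."""
    padded = pad + word.lower() + pad
    if len(padded) < n:
        # Word too short for a full n-gram: return the padded word itself.
        return [padded]
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


# Romanised Hindi and English words tend to produce different n-gram profiles,
# which is what makes these features usable by a language-tagging classifier.
print(char_ngrams("nahi"))
print(char_ngrams("file"))
```

Features like these would normally be fed to a trained classifier; the training step is outside the scope of this sketch.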
The authors of [2] propose a four-phase pipeline for automatic Hinglish-to-English translation. They also compare
"code switching" and "code mixing" and examine contextual problems such as "chalega", which can mean both
"moving" and "will it work?" This study used no comparable corpus. The text was language-tagged, transliterated
into Devanagari, its English words translated into Hindi and combined with the Hindi text, and the result translated back into English.
III. METHODOLOGY
Overview
The methodology provides a systematic approach to developing the Hinglish to English Translation System. It focuses
on accurate translation, short-notation replacement, and user interactivity. It uses a dataset of Hindi
idioms for linguistic enrichment. The system offers features such as a virtual keyboard, a virtual assistant, a voice module,
and a file module.
Technological Stack
Frontend
The user interface is developed using HTML, CSS, and JavaScript, encompassing the design elements, virtual
keyboard functionality, and the integration of voice features.
Backend
Python is employed for backend development to handle translation logic, short notation replacement, and
interaction with the SQLite database.
Database
SQLite serves as the database, storing short notations, long notations, and a dataset of Hindi idioms.
Database and Dataset
The database is structured with two primary columns: "Short Notation" and "Long Notation". This design
facilitates efficient retrieval of long notations associated with replaced short notations.
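A minimal sketch of such a two-column table, using Python's built-in `sqlite3` module, might look as follows. The table name `notations`, the column names, and the in-memory database are assumptions for illustration; the "asap" entry comes from the example in Section II, while "pls" is a hypothetical extra row.

```python
import sqlite3


def build_notation_db(pairs):
    """Create an in-memory SQLite table of (short, long) notation pairs."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE notations ("
        "short_notation TEXT PRIMARY KEY, "
        "long_notation TEXT NOT NULL)"
    )
    conn.executemany("INSERT INTO notations VALUES (?, ?)", pairs)
    conn.commit()
    return conn


def lookup_long_notation(conn, short):
    """Return the long notation for a short notation, or None if absent."""
    row = conn.execute(
        "SELECT long_notation FROM notations WHERE short_notation = ?",
        (short.lower(),),
    ).fetchone()
    return row[0] if row else None


conn = build_notation_db([("asap", "as soon as possible"), ("pls", "please")])
print(lookup_long_notation(conn, "ASAP"))
```

Making the short notation the primary key keeps each lookup an indexed search, which is one plausible way to obtain the efficient retrieval described above.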
The dataset consists of Hindi idioms in Devanagari script paired with their corresponding English translations.
This dataset enhances the linguistic component of the translation system, contributing to improved accuracy
and coverage.
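The idiom lookup can be pictured as a simple phrase match against such pairs. The sketch below is an assumption-laden illustration: it keeps the pairs in a Python dictionary rather than SQLite for brevity, the two idioms are common Hindi idioms chosen for the example and are not taken from the authors' dataset, and `find_idioms` is a hypothetical helper name.

```python
import unicodedata

# Illustrative entries only; the real dataset is not reproduced here.
IDIOMS = {
    "नौ दो ग्यारह होना": "to run away",
    "आँखों का तारा": "apple of one's eye",
}


def _nfc(text):
    """Normalise Unicode so visually identical Devanagari strings compare equal."""
    return unicodedata.normalize("NFC", text)


def find_idioms(text, idioms=IDIOMS):
    """Return (idiom, translation) pairs for every known idiom found in the text."""
    text = _nfc(text)
    return [(idiom, meaning) for idiom, meaning in idioms.items()
            if _nfc(idiom) in text]


print(find_idioms("वह मेरी आँखों का तारा है"))
```

Plain substring matching misses inflected forms of an idiom, so a production system would likely need some morphological handling on top of this.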
Proposed System
The system translates the user's Hinglish text into pure English, replacing short notations in the text with their
long notations. It also gives the text box a speech capability so that users can hear the text. The primary objectives of the
current study are to find and replace all short notations in Hinglish text and to translate the text into full English.
From a given Hinglish text, the model recognizes each short notation and extracts it. The remaining text is
translated into Hindi. The translation module then receives the Hindi text and
translates it into English. The dataset is used to identify all short notations and replace them with long ones.
The system supports several input modes, such as keyboard, voice, and file upload, as well as a virtual keyboard as
an early-stage feature of the design. Voice input is useful when the user cannot
write or type properly, and file input when the user prefers to submit a Hinglish text file. In addition, a virtual assistant
answers the user's questions.
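The short-notation replacement step could be sketched in Python as below. This is a minimal illustration under stated assumptions: the mapping is a small in-code dictionary instead of the SQLite table, `expand_short_notations` is a hypothetical name, and matching is done on whole alphabetic words, case-insensitively.

```python
import re

# Hypothetical sample of short-to-long notation pairs.
SHORT_TO_LONG = {"asap": "as soon as possible", "pls": "please"}


def expand_short_notations(text, table=SHORT_TO_LONG):
    """Replace each whole-word short notation with its long notation."""
    def replace(match):
        word = match.group(0)
        # Words that are not in the table are returned unchanged.
        return table.get(word.lower(), word)

    return re.sub(r"[A-Za-z]+", replace, text)


print(expand_short_notations("mujhe file send kar asap"))
```

Matching whole words rather than substrings avoids expanding a short notation that merely happens to occur inside a longer word.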


Figure 1: Proposed Architecture

IV. RESULTS
As shown in the figures, the system receives a text that contains a short notation and Hinglish text.
The system first checks the input for short notations. If any are present, it converts them into the corresponding
long notations from the database. It then checks the input for English words and finds their Hindi meanings.
It performs transliteration and converts the whole text into Devanagari script. Next, the system checks for idioms and
fetches the matching data from the dataset. Finally, translation is performed and the final output is returned.
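The order of these steps can be sketched as a small pipeline runner in Python. Everything here is illustrative: `run_pipeline`, `tag`, and the stage names are hypothetical, and each stage is a placeholder that only labels the text, since the actual transliteration and translation components are not specified in this paper.

```python
def run_pipeline(text, stages):
    """Apply each (name, function) stage in order, recording intermediate results."""
    trace = [("input", text)]
    for name, stage in stages:
        text = stage(text)
        trace.append((name, text))
    return text, trace


def tag(label):
    """Placeholder stage: appends a label instead of doing real processing."""
    return lambda s: f"{s} -> {label}"


# Stage order follows the description above.
STAGES = [
    ("short_notation", tag("short notations expanded")),
    ("english_to_hindi", tag("English words mapped to Hindi")),
    ("transliteration", tag("converted to Devanagari")),
    ("idioms", tag("idioms resolved")),
    ("translation", tag("translated to English")),
]

output, trace = run_pipeline("mujhe file send kar asap", STAGES)
for name, value in trace:
    print(f"{name}: {value}")
```

Keeping the intermediate results in `trace` mirrors the step-by-step console output described for the figures and makes it easier to see which stage introduced an error.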

Figure 2: Output1

Figure 3: Output2
For a translation to succeed, the system needs input in a proper format, free of spelling mistakes and
grammatical errors; otherwise it may get confused and produce abnormal output. The outputs in Figures 2 and 3
are taken from the console, where a step-by-step explanation is given.
V. LIMITATIONS
• Hinglish is most frequently encountered on informal platforms. Since informal writing rarely adheres to
punctuation, correct spelling, and correct grammar, this adds an additional layer of complication to the
translations.
• Several words used in Hinglish are also found in English [2]. In such cases the model struggles to
determine the source language and thus to choose between transliteration and translation.
• There are no standard spellings in Hinglish; most users rely on the phonetics of a word to determine its
romanised spelling [1], which results in several spellings for the same word.
For instance, "Nahi", "Nai", and "Nhi" all mean "No" in English. This complicates the model's task.
• A large number of Hinglish terms have several meanings that can be determined only from the sentence
context [1]. An example is the term "chalega", which in some sentences means "will work" and in
others means "walk".
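To make the spelling-variation problem concrete, the sketch below maps variant spellings onto a canonical form with fuzzy string matching from Python's standard `difflib` module. This is not part of the system described here; the tiny `CANONICAL` vocabulary, the function name, and the 0.6 cutoff are assumptions chosen for illustration.

```python
import difflib

# Hypothetical canonical romanised spellings.
CANONICAL = ["nahi", "sakta", "chalega"]


def normalize_spelling(word, vocabulary=CANONICAL, cutoff=0.6):
    """Map a romanised word to its closest canonical spelling, if one is close enough."""
    matches = difflib.get_close_matches(word.lower(), vocabulary, n=1, cutoff=cutoff)
    # Leave the word unchanged when nothing in the vocabulary is similar enough.
    return matches[0] if matches else word


for variant in ["Nahi", "Nai", "Nhi", "file"]:
    print(variant, "->", normalize_spelling(variant))
```

A fixed similarity cutoff can also merge genuinely different words, so any such normalisation would need to be checked against real Hinglish data before use.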
VI. FUTURE SCOPE
The identified limitations in the current Hinglish to English translation system pave the way for promising
future developments. Future work could focus on refining the system to better handle the informal nature of
Hinglish by incorporating advanced natural language processing (NLP) techniques that account for variations
in punctuation, spelling, and grammar. Additionally, exploring context-aware machine learning models may
contribute to improved disambiguation of meanings associated with Hinglish terms, addressing the challenge of
multiple interpretations. Introducing dynamic spell-checking algorithms that adapt to the phonetic variations
in Hinglish could enhance the accuracy of translations. Furthermore, the integration of sentiment analysis and
cultural context recognition could add another layer of sophistication to the translation process, ensuring that
nuances unique to Hinglish expressions are accurately captured. Overall, future advancements could involve a
holistic approach, combining linguistic analysis, machine learning, and cultural context understanding to create
a more robust and contextually aware Hinglish to English translation system.
VII. CONCLUSION
This paper addresses the complexities of translating Hinglish to English, recognizing challenges related to
informal expressions and diverse linguistic variations. Users are advised to input accurate spellings for optimal
results. While the system is robust, further advancements, such as dynamic spell-checking, remain avenues for

exploration. This project lays the groundwork for future research, aiming to create a more precise and
culturally sensitive Hinglish to English translation experience.
VIII. REFERENCES
[1] Jadhav, I., Kanade, A., Waghmare, V., Chandok, S.S., Jarali, A.: Code-Mixed Hinglish to English Language
Translation Framework. In: 2022 International Conference on Sustainable Computing and Data
Communication Systems (ICSCDS). pp. 684–688. IEEE (2022).
https://doi.org/10.1109/ICSCDS53736.2022.9760834.
[2] Dr. S.V. Kedar, Sakshi Bhangale, Kunal Deokar, “Translation: Code-Mixed Language (Hinglish) to
English”, IJARSCT Volume 2, Issue 3, May 2022.
[3] IJCSI International Journal of Computer Science Issues, Vol. 11, Issue 5, No 2, September 2014. ISSN
(Print): 1694-0814 | ISSN (Online): 1694-0784. www.IJCSI.org
[4] Attri, S.H., Prasad, T.V., Ramakrishna, G.: HiPHET: A Hybrid Approach to Translate Code Mixed
Language (Hinglish) to Pure Languages (Hindi and English). Computer Science. 21, (2020).
https://doi.org/10.7494/csci.2020.21.3.3624.
[5] Rao, Ashwini and DSouza, Nicole and Patel, Devarsh and Saravta, Jigyashu, Development & Study of
Hinglish to English Translation and Classification Techniques. Available at SSRN:
https://ssrn.com/abstract=4510958 or http://dx.doi.org/10.2139/ssrn.4510958.
[6] Chéragui, Mohamed Amine. Theoretical Overview of Machine Translation. Proceedings ICWIT.
2012.
[7] Hutchins W.J, Somers H L. An introduction to machine translation. London: Academic Press.1992:
