
Juhu-Versova Link Road Versova, Andheri(W), Mumbai-53

Department of Computer Engineering


University of Mumbai

NOVEMBER- 2023

A MINI-PROJECT REPORT
ON

“ARTIFICIAL VOICE ASSISTANT”


BY

Arya Patil
Soham Vede
Yash Vichare
Saakshi Wagh

Under the guidance of

Internal Guide
Prof. B. N. Panchal

Juhu-Versova Link Road Versova, Andheri(W), Mumbai-53

CERTIFICATE

Department of Computer Engineering


This is to certify that
1. Arya Patil
2. Soham Vede
3. Yash Vichare
4. Saakshi Wagh

have satisfactorily completed this project entitled

“ARTIFICIAL VOICE ASSISTANT”

Towards the partial fulfilment of the

SECOND YEAR BACHELOR OF ENGINEERING


IN
(COMPUTER ENGINEERING)

as laid down by the University of Mumbai.

Guide: Prof. B. N. Panchal
H.O.D.: Prof. Sunil P. Khachane

Principal
Dr. Sanjay Bokade

Project Report Approval for T. E.

This project report entitled “ARTIFICIAL VOICE ASSISTANT” by


Arya Patil, Soham Vede, Yash Vichare, and Saakshi Wagh is
approved for the Third Year of the Bachelor of Engineering in
Computer Engineering.

Examiners:

1. _____________________________

2. _____________________________

Date:
Place:

Declaration

We wish to state that the work embodied in this project titled "Artificial Voice
Assistant" forms our own contribution to the work carried out under the guidance of Prof. B. N.
Panchal at Rajiv Gandhi Institute of Technology.
We declare that this written submission represents our ideas in our own words, and where
others' ideas or words have been included, we have adequately cited and referenced the original
sources. We also declare that we have adhered to all principles of academic honesty and integrity
and have not misrepresented or fabricated or falsified any idea/data/fact/source in our
submission. We understand that any violation of the above will be cause for disciplinary action
by the Institute and can also invoke penal action from the sources which have thus not been
properly cited or from whom proper permission has not been taken when needed.

(Students’ Signatures)

Arya Patil (B-316) _________________

Soham Vede (B-362) _________________

Yash Vichare (B-363) _________________

Saakshi Wagh(B-364) _________________

Abstract

In this project, we explore the process of creating a voice assistant. We begin by discussing
the various components of a voice assistant, including speech recognition, natural language
processing, and text-to-speech synthesis. We then describe the architecture of our system,
which is based on a modular design that allows for easy customization and extension. We
also discuss the key algorithms and packages required, how they work, and why they are
used. Finally, we present the results of our experiments, which demonstrate the effectiveness
of our approach in terms of accuracy and speed. This research project focuses on the
development of a Python-based Artificial Voice Assistant capable of understanding and
responding to natural language voice commands. The key objectives encompass technology
choice, algorithm enhancement, API integration, user interaction, testing and evaluation,
usability testing, and comprehensive documentation.
The literature survey provides valuable insights into the growth of voice-based assistants,
their applications, and relevant research. Notable works include "An Intelligent Voice
Assistant Using Android Platform" by Sutar Shekhar and colleagues, "Smart Home Using
Internet of Things" by Keerthana S and colleagues, "On the track of Artificial Intelligence:
Learning with Intelligent Personal Assistant" by Nil Goksel and colleagues, and "Speech
recognition using flat models" by Patrick Nguyen and colleagues.
The proposed concept involves the implementation of a personal voice assistant that utilizes a
Speech Recognition library to understand user commands and respond in voice through Text
to Speech functions. Underlying algorithms convert voice to text for processing user
instructions.
The workflow involves capturing user commands and triggering the virtual assistant based on
specific keywords, including "Wikipedia," "Google Chrome," "Time," and more. These
trigger words initiate corresponding actions by the voice assistant.
The architecture comprises four phases: capturing input through a microphone, recognizing
audio data and converting it into text using NLP, processing the data through Python scripts,
and presenting the output as text or speech using TTS.
The algorithm encompasses setting up the speech engine, defining functions for user
greetings, capturing and recognizing user speech, and executing commands based on trigger
words.

Contents

1. Introduction
   1.1 Introduction Description
   1.2 Organization of Report
2. Literature Review
   2.1 Survey of Existing System
   2.2 Limitations of Existing System / Research Gap
   2.3 Problem Statement and Objectives
3. Proposed System
   3.1 Algorithm
   3.2 Details of Hardware & Software
       3.2.1 Hardware Requirements
       3.2.2 Software Requirements
   3.3 Design Details
       3.3.1 System Flow / System Architecture
       3.3.2 Detailed Design
   3.4 Methodology / Procedures
4. Results & Discussions
   4.1 Results
   4.2 Discussion: Comparative Study / Analysis
5. Conclusion and Future Work
References

CHAPTER 1
Introduction

1.1 Introduction Description


A voice assistant is a digital assistant that uses voice recognition, language processing
algorithms, and speech synthesis to listen for specific voice commands and return relevant
information or perform specific functions as requested by the user. Based on the commands,
commonly known as intents, spoken by the user, voice assistants return relevant information
by listening for specific keywords and filtering out ambient noise.
While voice assistants may be entirely software-based and able to integrate into almost any
device, some assistants are designed specifically for a single device application, such as the
Amazon Alexa wall clock. Nowadays, voice assistants are integrated into many of the devices
we use daily, such as cell phones, computers, and smart speakers.
In this project, we have developed a static voice assistant using Python that can perform
operations such as copying and pasting files from one location to another, sending a message
to the user's mobile phone, and ordering items such as a pizza or a mobile phone using voice
commands. The mass adoption of AI in users' everyday lives is additionally fueling the shift
towards voice. Among the most popular voice assistants are Siri from Apple; Amazon Echo,
which responds to the name Alexa, from Amazon; Cortana from Microsoft; Google Assistant
from Google; and the recently introduced intelligent assistant named AIVA.
Furthermore, the voice assistant provides a basic GUI with a single button. The voice
assistant waits for the user to press the button and then greets the user. It can also open the
computer drives C:, D:, and E: (if present on the computer). Note that to open any files stored
offline, the user has to modify the path or extend the elif ladder to add new instructions.
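The snippet below is a minimal sketch of this button-driven interaction and of the elif ladder for opening drives; the widget layout and the helper names (on_button_press, run_command) are illustrative assumptions rather than the project's exact code.

import os
import tkinter as tk
import pyttsx3
import speech_recognition as sr

engine = pyttsx3.init()

def speak(text):
    # Speak the given text aloud through the TTS engine.
    engine.say(text)
    engine.runAndWait()

def listen():
    # Capture one voice command from the microphone and return it as lowercase text.
    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        audio = recognizer.listen(source)
    try:
        return recognizer.recognize_google(audio).lower()
    except sr.UnknownValueError:
        return ""

def run_command(statement):
    # Illustrative elif ladder: append new branches here to add instructions.
    if "open c drive" in statement:
        os.startfile("C:\\")        # Windows-only call
    elif "open d drive" in statement:
        os.startfile("D:\\")
    elif "open e drive" in statement:
        os.startfile("E:\\")
    else:
        speak("Sorry, I do not know that command yet.")

def on_button_press():
    speak("Hello, how can I help you?")   # greet once the button is pressed
    run_command(listen())

root = tk.Tk()
root.title("Artificial Voice Assistant")
tk.Button(root, text="Speak", command=on_button_press).pack(padx=40, pady=20)
root.mainloop()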

1.2 Organization of Report


• Ch. 1: Introduction
• Ch. 2: Literature Review
• Ch. 3: Proposed System
• Ch. 4: Results & Discussion
• Ch. 5: Conclusion and Future Work

CHAPTER 2
Literature Review

2.1 Survey existing system


Voice-based assistants have witnessed rapid advancements and widespread adoption due to
their integration into various devices like smart watches, speakers, and mobile phones. This
growth is driven by the demand for more convenient human-computer interactions.
• "An Intelligent Voice Assistant Using Android Platform" by Sutar Shekhar and
colleagues: This research emphasizes the use of voice commands by mobile users for
daily tasks, reducing the need for typing or manual inputs. The paper also mentions
the implementation of prediction technology for user activity-based recommendations.
• "Smart Home Using Internet of Things" by Keerthana S and colleagues: This
paper discusses the application of smart assistants in creating smart home systems
using Wi-Fi and IoT technology. It utilizes the CC3200 MCU with built-in Wi-Fi
modules and temperature sensors to monitor and control electronic equipment in
response to sensed data.
• "On the Track of Artificial Intelligence: Learning with Intelligent Personal
Assistant" by Nil Goksel and colleagues: This research explores the potential of
Intelligent Personal Assistants (IPAs) that leverage advanced computing technologies
and Natural Language Processing (NLP) for learning and AI applications.
• "Speech recognition using flat models" by Patrick Nguyen and colleagues: This
paper introduces the "Flat Direct Model (FDM)," a novel approach to speech
recognition that improves sentence consistency. The model deviates from the
traditional Markov model and offers a 3% absolute improvement in sentence error
rates over the baseline.

2.2 Limitation of existing system


Voice assistants, while increasingly prevalent, come with a range of shortcomings. Privacy
concerns arise due to their constant listening for wake words, potentially leading to unwanted
recordings. They often struggle with understanding accents, dialects, and non-native
speakers, hindering effective communication. Maintaining context during conversations can
be a challenge, causing interactions to feel disjointed. Misinterpretation of spoken words may
result in incorrect responses. Voice assistants may lack support for various device or software
functions and exhibit inconsistent performance depending on internet connectivity. Security
vulnerabilities and susceptibility to hacking pose risks to user data. Monotonous responses
without emotional depth are common, and complex tasks or those requiring visual feedback
may not be handled well. Dependency on wake words, limited language support,
intrusiveness, and incompatibility issues also persist. Accessibility concerns and over-reliance
on these systems raise questions about their broader societal impact. Data privacy,
update-related changes, and tone appropriateness add further complexities to the voice
assistant landscape.

2.3 Objectives
1. Voice Assistant Development - Create a functional Python-based Artificial Voice
Assistant capable of understanding and responding to natural language voice
commands.
2. Algorithm Enhancement - Implement and optimize algorithms to enhance the
efficiency and accuracy of voice recognition and natural language processing within
the assistant.
3. API Integration - Integrate relevant APIs to expand the voice assistant's capabilities,
enabling it to perform actions such as opening Google, YouTube, setting timers, and
alarms based on user commands.
4. Python Language - Utilize the Python programming language to build the core
functionality of the voice assistant, ensuring code modularity and maintainability.
5. User Interaction - Develop an intuitive user interface that facilitates user interaction
with the voice assistant on Android devices.
6. Testing and Evaluation - Conduct extensive testing to ensure the accuracy of voice
recognition, the effectiveness of API integration, and the overall performance of the
voice assistant.
7. Usability Testing - Gather user feedback through usability testing to identify areas for
improvement and refine the user experience.
8. Documentation - Maintain comprehensive documentation of the development
process, including algorithms, codebase, API integration, and user interface design.

2.4 Scope
The future scope of artificial voice assistants is exceptionally promising, with a trajectory
poised to revolutionize the way we interact with technology. As advancements in natural
language processing and artificial intelligence continue, these digital companions will evolve
to offer unparalleled user experiences. Their ability to comprehend and respond to natural
language will become even more seamless, enhancing their utility across various domains.
Moreover, the integration of voice assistants with emerging technologies like augmented
reality and virtual reality promises to usher in an era of immersive and interactive
interactions. With a focus on personalization, voice assistants will adapt to individual
preferences and needs, making them indispensable in daily life. In addition, they will exhibit
enhanced context awareness, ensuring more intuitive and contextually relevant responses.
Industries, ranging from healthcare to smart homes, will increasingly rely on voice assistants
for tasks, and as global support expands, these assistants will reach a more diverse audience.

While emphasizing privacy and security, voice assistants will continue to advance,
incorporating emotional intelligence, expanding their roles in education, commerce, and even
content creation. The future holds immense potential for these digital aides, and their
integration into various aspects of our lives is inevitable, driving us towards a more
connected and efficient future.

CHAPTER 3
Proposed System

3.1 Algorithm
Step 1: Import the necessary packages.
Step 2: Set up the speech engine using pyttsx3 with voice options (male or female).
Step 3: Define a function to greet the user based on the current time.
Step 4: Create a function to capture and recognize user speech using a microphone and
Google's speech recognition.
Step 5: In the main function, listen for trigger words in the user input. If trigger words are
detected, execute the corresponding commands, such as opening websites, providing
information, or performing actions.
Step 6: End the program.
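A condensed sketch of Steps 1 to 6 is given below. It follows the structure described above, but the exact trigger phrases, voice index, and site URLs are illustrative choices, not necessarily those used in the final code.

import datetime
import webbrowser
import pyttsx3
import speech_recognition as sr
import wikipedia

# Step 2: set up the speech engine and select a voice (index 0 or 1 on most Windows systems).
engine = pyttsx3.init()
voices = engine.getProperty("voices")
engine.setProperty("voice", voices[0].id)

def speak(text):
    engine.say(text)
    engine.runAndWait()

# Step 3: greet the user based on the current time.
def wish_user():
    hour = datetime.datetime.now().hour
    if hour < 12:
        speak("Good morning!")
    elif hour < 18:
        speak("Good afternoon!")
    else:
        speak("Good evening!")

# Step 4: capture and recognize user speech with Google's speech recognition.
def take_command():
    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        recognizer.adjust_for_ambient_noise(source)
        audio = recognizer.listen(source)
    try:
        return recognizer.recognize_google(audio, language="en-in").lower()
    except (sr.UnknownValueError, sr.RequestError):
        return ""

# Step 5: listen for trigger words and execute the corresponding command.
if __name__ == "__main__":
    wish_user()
    while True:
        statement = take_command()
        if "wikipedia" in statement:
            speak(wikipedia.summary(statement.replace("wikipedia", ""), sentences=2))
        elif "youtube" in statement:
            webbrowser.open("https://www.youtube.com")
        elif "time" in statement:
            speak("The time is " + datetime.datetime.now().strftime("%H:%M"))
        elif "stop" in statement or "good bye" in statement:
            speak("Goodbye!")
            break   # Step 6: end the program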

3.2 Hardware and Software:
3.2.1 Software Details:
Libraries and Modules:
speech_recognition: used for voice recognition.
pyttsx3: used for text-to-speech conversion.
datetime: used for working with date and time.
wikipedia: used to fetch information from Wikipedia.
webbrowser: used to open websites in the default browser.
os: used for interacting with the operating system.
time: used for adding delays.
subprocess: used for executing system commands.
wolframalpha: used for answering factual questions.
requests: used for making HTTP requests to fetch weather data.

APIs:
Wolfram Alpha API: used for answering factual questions.
OpenWeatherMap API: used for fetching weather information.

PyQt5: Python library for creating graphical user interfaces (GUIs) for desktop applications.
Tkinter: GUI toolkit bundled with Python.
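The fragment below is a minimal sketch of how the two external APIs can be queried; WOLFRAM_APP_ID, WEATHER_API_KEY, and the example inputs are placeholders that would have to be replaced with real credentials and values.

import requests
import wolframalpha

WOLFRAM_APP_ID = "YOUR_WOLFRAM_APP_ID"        # placeholder credential
WEATHER_API_KEY = "YOUR_OPENWEATHERMAP_KEY"   # placeholder credential

def answer_factual(question):
    # Send a factual question to the Wolfram Alpha API and return the first result.
    client = wolframalpha.Client(WOLFRAM_APP_ID)
    result = client.query(question)
    return next(result.results).text

def current_weather(city):
    # Fetch the current weather for a city from the OpenWeatherMap API.
    response = requests.get(
        "https://api.openweathermap.org/data/2.5/weather",
        params={"q": city, "appid": WEATHER_API_KEY, "units": "metric"},
        timeout=10,
    )
    data = response.json()
    return f"{data['weather'][0]['description']}, {data['main']['temp']} degrees Celsius"

# Example usage: answer_factual("What is the capital of France?"), current_weather("Mumbai")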

3.2.2 Hardware Details:


Microphone: The code expects access to a microphone for voice input.
Speakers: It requires speakers or audio output for text-to-speech responses.
Internet Connection: To fetch data from external sources such as Wikipedia, Wolfram
Alpha, and OpenWeatherMap, an internet connection is needed.
Desktop Requirements: any Windows OS (Windows 7 and above).

3.3 Design Details


The proposed concept is an effective way of implementing a personal voice assistant. The
SpeechRecognition library provides many built-in functions that let the assistant understand
the command given by the user, and the response is sent back to the user as voice through
text-to-speech functions. When the assistant captures the voice command given by the user,
the underlying algorithms convert the voice into text.

3.3.1 System flow
The command given by the user is stored in the variable statement.
If any of the following trigger words is present in the statement given by the user, the
virtual assistant is invoked to execute the action associated with that trigger word (a dispatch
sketch follows the list below).
• Trigger words:
• Wikipedia
• Google Chrome, Gmail and YouTube
• Time
• News
• Search
• Log off, Sign out
• who are you, what can you do
• who made you
• stop, good bye
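The mapping from trigger words to actions can be written either as an elif ladder (as in the algorithm of Section 3.1) or, more compactly, as a dictionary. The sketch below uses the dictionary form; the handler names and URLs are illustrative.

import datetime
import webbrowser

def tell_time():
    print(datetime.datetime.now().strftime("%H:%M:%S"))

# Trigger word -> action; extend this table to support new commands.
TRIGGERS = {
    "google chrome": lambda: webbrowser.open("https://www.google.com"),
    "youtube": lambda: webbrowser.open("https://www.youtube.com"),
    "gmail": lambda: webbrowser.open("https://mail.google.com"),
    "time": tell_time,
}

def dispatch(statement):
    # Invoke the first action whose trigger word appears in the statement.
    for trigger, action in TRIGGERS.items():
        if trigger in statement.lower():
            action()
            return True
    return False   # no trigger word found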

3.3.2 Architecture
The system design consists of:
1. Taking the input as speech patterns through microphone.
2. Audio data recognition and conversion into text.
3. Comparing the input with predefined commands.
4. Giving the desired output

The initial phase takes the input in as speech patterns from the microphone. In the second
phase, the collected data is processed and transformed into textual data using NLP. In the next
step, the resulting text string is manipulated through a Python script to produce the required
output. In the last phase, the produced output is presented either in the form of text or
converted from text to speech using TTS.

Features
The system shall be developed to offer the following features:
1) It keeps listening continuously while idle and wakes into action when called upon for a
particular predetermined functionality.
2) It browses the web based on the user's spoken parameters and then delivers the desired
output through audio, while simultaneously printing the output on the screen.

3.4 Methodology
In the development of our voice assistant, we have set clear objectives, focusing on its
fundamental purpose – to listen and open specific applications as per user commands. We
identified a range of target applications, encompassing web browsers, music players, and
various business tools, broadening the utility of the assistant. To enhance user experience, we
carefully designed the user interface, enabling seamless voice command input and providing
feedback for command confirmation. For accurate command recognition, we integrated a
voice recognition system through APIs. Our command recognition logic was meticulously
crafted to interpret distinct voice commands such as "Open Chrome" or "Start Spotify." We
ensured a user-friendly interaction by implementing a response generation mechanism that
confirms command execution and gracefully handles errors. This includes offering alternative
suggestions and seeking clarification for ambiguous commands, ensuring a smoother and
more effective user experience.
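A small sketch of this confirmation-and-clarification behaviour is shown below; the APP_COMMANDS table, the executable paths, and the speak helper are assumed placeholders, not the project's actual configuration.

import subprocess

# Hypothetical table of recognised applications; the paths are illustrative placeholders.
APP_COMMANDS = {
    "open chrome": r"C:\Program Files\Google\Chrome\Application\chrome.exe",
    "start spotify": r"C:\Spotify\Spotify.exe",
}

def handle_app_command(statement, speak):
    # Confirm execution of a known command; otherwise suggest an alternative or ask again.
    statement = statement.lower().strip()
    if statement in APP_COMMANDS:
        speak("Okay, " + statement + ".")            # confirm before acting
        subprocess.Popen([APP_COMMANDS[statement]])
        return True
    # Ambiguous command: look for a partial match and offer it as a suggestion.
    matches = [cmd for cmd in APP_COMMANDS if any(word in cmd for word in statement.split())]
    if matches:
        speak("Did you mean " + matches[0] + "? Please repeat the command.")
    else:
        speak("Sorry, I did not catch that. You can say, for example, open Chrome or start Spotify.")
    return False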

CHAPTER 4

Conclusion and Future Work

The future of artificial voice assistants holds great promise, with numerous avenues for
advancement. Enhancing natural language understanding (NLU) is a critical focus. Current
voice assistants are impressive, but further improvements in NLU will make them more
accurate and context-aware, supporting complex conversations and nuanced responses.
Multimodal integration is another key aspect. Combining voice commands with other sensory
inputs like vision and touch can lead to more immersive and intuitive interactions. Privacy
and security concerns need to be addressed, with a focus on stronger encryption, on-device
processing, and user data control.
Personalization is essential, tailoring responses to individual users, while expanding language
support fosters inclusivity. Emotional intelligence, accessibility, and ethical considerations
should guide future development. Voice assistants must seamlessly integrate with a growing
number of smart devices and continually improve to ensure responsible and meaningful
benefits to society.

Discussion
In this project, a voice assistant was created with basic experimental features. The voice
assistant includes a GUI implemented using tkinter in Python. The GUI implementation
aims to keep the interface simple and is still at a basic level.
The voice assistant has default features for the implementation of a virtual voice assistant. At
a primary level, it is capable of performing basic tasks and opening required applications
such as YouTube, Gmail, and Google. However, there may be rare instances where the voice
assistant may catch wrong or no input from the user due to environmental disturbance.
It is important to note that the voice assistant's capabilities are limited and it is still in the
early stages of development. Further enhancements and improvements can be made to
expand its functionality and usability.
Overall, this project focuses on creating a voice assistant with basic experimental features and
a simple GUI interface using tkinter in Python. It serves as a starting point for further
development and refinement of the voice assistant's capabilities.
Note: The information provided above is based on the available feedback dated 13th October,
2023.

REFERENCES

• Ms. Preethi G., "Voice Assistant using Artificial Intelligence," Information Technology,
PSG Polytechnic College, Coimbatore, India.
• Daniel Rough, "Poster: APIs for IPAs? Towards End-User Tailoring of Intelligent
Personal Assistants," University College Dublin, Dublin, Ireland.
• Jiansheng Liu, "An Intelligent Personal Assistant Robot: BoBi Secretary," 2017 2nd
International Conference on Advanced Robotics and Mechatronics (ICARM), Shanghai
NewReal Auto-System Co., Ltd.
• Peng Cheng and Ibrahim Ethem Bagci, "Smart Speaker Privacy Control - Acoustic
Tagging for Personal Voice Assistants," 2019 IEEE Security and Privacy Workshops,
Lancaster University, Lancaster, United Kingdom.
• Raviteja Anantha and Srinivas Chappidi, "Learning to Rank Intents in Voice
Assistants," IWSDS 2020 Conference, Apple Inc.
• pypi.org
• wikipedia.org

Acknowledgement

We wish to express our sincere gratitude to Dr. Sanjay U. Bokade, Principal


and Prof. S. P. Khachane, H.O.D. of the Department of Computer Engineering of Rajiv
Gandhi Institute of Technology, for providing us an opportunity to do our project work on
“Artificial Voice Assistant”.

This project bears the imprint of many people. We sincerely thank our project guide, Prof.
B. N. Panchal, for the guidance and encouragement in carrying out this project work.

Finally, we would like to thank all our colleagues and friends who helped us in completing
project work successfully.

1. Arya Patil

2. Soham Vede

3. Yash Vichare

4. Saakshi Wagh
