
A Project Report on
AI BASED - VIRTUAL ASSISTANT

Submitted in partial fulfillment of the requirement
for the award of the degree of
B.Tech Computer Science and Engineering

Under the Supervision of
Dr. Suman Devi

Submitted By:
Vivek Kumar Mishra
20scse1010603

SCHOOL OF COMPUTING SCIENCE AND ENGINEERING
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
GALGOTIAS UNIVERSITY, GREATER NOIDA
INDIA

SCHOOL OF COMPUTING SCIENCE AND ENGINEERING
GALGOTIAS UNIVERSITY, GREATER NOIDA

CANDIDATE’S DECLARATION
I hereby certify that the work presented in the project entitled "AI BASED - VIRTUAL
ASSISTANT (GENESIS)", submitted in partial fulfillment of the requirements for the award of the
degree of B.Tech to the Department of Computer Science and Engineering, School of Computing
Science and Engineering, Galgotias University, Greater Noida, is my own work.
The matter presented in this project has not been submitted by me for the award of any other
degree of this or any other university.
Vivek Kumar Mishra
20scse1010603

This is to certify that the above statement made by the candidate is correct to
the best of my knowledge.
Dr. Suman Devi
CERTIFICATE

The Final Project Viva-Voce examination of Vivek Kumar Mishra (20scse1010603) has been held
on ____________ and his/her work is recommended for the award of B.Tech.

Signature of Examiner(s) Signature of Supervisor(s)

Signature of Project Coordinator: Signature of Dean:

Date:
Place: Greater Noida
Table of Contents
Abstract
1. Introduction
1.1 Motivation behind this Project
1.2 Objectives
1.3 Purpose, Scope & Applicability

2. Requirement & Analysis
2.1 Problem Definition
2.2 Requirements Specification
2.3 Hardware & Software Requirements

3. System Design
3.1 Activity Diagram
3.2 Class Diagram
3.3 Use Case Diagram
3.4 Sequence Diagram
3.5 Data Flow Diagram
3.6 Component Diagram

4. Literature Survey

5. System Architecture

6. Result
6.1 Imported Modules
6.2 Code
6.3 Output

7. Conclusion

8. References
Abstract

The advantages that AI-based virtual assistants provide, such as improved
efficiency, convenience, personalisation, scalability, and innovation, are themselves
a justification for developing one. AI-based virtual assistants automate processes,
improve workflows, and save time so that people or organisations can concentrate on
important work. They are a practical choice for on-the-go support because they are
accessible from anywhere at any time. To provide individualised recommendations
and suggestions, virtual assistants learn from user behaviour and preferences. They
are a desirable option for businesses since they can manage numerous jobs at once
and are simple to scale to meet increasing demand. Overall, creating an AI-based
virtual assistant is a unique and interesting endeavour because it necessitates the use
of cutting-edge technology.

In recent years, virtual assistants have become increasingly popular due to their
ability to provide helpful information and assist users with various tasks. This project
aims to develop a virtual assistant that can interact with users, understand their
queries, and respond with appropriate information.
The virtual assistant is designed to be user-friendly and intuitive, making it easy for
users to access the information they need. It also learns from user interactions,
improving its responses over time. The assistant uses natural language processing
(NLP) techniques to understand user queries and respond with relevant information.
It performs numerous tasks, such as reporting the date and time, searching Wikipedia,
opening YouTube, Google and Stack Overflow, and playing music.

The virtual assistant is developed using machine learning algorithms and cloud-based
technologies, ensuring scalability and reliability. The project also includes an
intuitive user interface, making it easy for users to interact with the virtual
assistant. Overall, this project aims to develop a powerful and user-friendly virtual
assistant that can assist users with various tasks and provide them with useful
information in an efficient and effective manner.
1. INTRODUCTION

In today's era almost all tasks are digitalized. With a smartphone in hand we have the
world at our fingertips, and these days we do not even need our fingers: we simply
speak a task and it is done. There are systems where we can say "Text Dad, 'I'll be
late today'" and the text is sent. That is the job of a virtual assistant. Virtual
assistants also support specialized tasks, such as booking a flight or finding the
cheapest book across e-commerce sites and providing an interface to place the order,
helping automate search, discovery and online ordering.

Virtual assistants are software programs that help ease your day-to-day tasks, such as
showing weather reports, creating reminders, and making shopping lists. They can take
commands via text (online chatbots) or by voice. Voice-based intelligent assistants
need an invoking word, or wake word, to activate the listener, followed by the
command. For this project the wake word is GENESIS.

We already have many virtual assistants, such as Apple's Siri, Amazon's Alexa and
Microsoft's Cortana. For this project, the wake word chosen is GENESIS. This system
is designed to be used efficiently on desktops. Personal assistant software improves
user productivity by managing the user's routine tasks and by providing information
from online sources. GENESIS is effortless to use: say the wake word 'GENESIS'
followed by the command, and within seconds it is executed.

Virtual assistants are turning out to be smarter than ever. Allow your intelligent
assistant to make email work for you. Detect intent, pick out important information,
automate processes, and deliver personalized responses.

This project was started on the premise that there is a sufficient amount of openly
available data and information on the web that can be utilized to build a virtual
assistant capable of making intelligent decisions about routine user activities.
1.1 MOTIVATION BEHIND THIS PROJECT

1. Faster than text

Voice search is 3.7x faster than typing and takes less effort too. With a voice AI
assistant, your users are able to explain an issue in detail, which helps in assessing
the problem at hand.

Unlike text, a voice AI bot can pick up on emotion and intent, ensuring more
transparent communication with the user.

2. Multitasking with hands-free support

Customers prefer frictionless, smooth customer service, and voice AI makes this
possible. Interacting and getting support through a voice AI assistant is hands-free:
all users need to do is speak to the assistant and get their queries answered
effortlessly.

Voice AI also lets customers multitask. Since they do not need to hold a
device to contact customer support, users can focus on other tasks and save time.

Speed, precision, and convenience are the key aspects users look for in customer
support, and voice technology fulfils them.
3. Higher customer satisfaction
The kind of ease and simplicity voice AI brings makes it quite popular among
people. On average, 80% of people who shop using a voice AI assistant are satisfied
with their experience.

Voice AI runs on a powerful NLP-enabled algorithm that quickly understands what the
user is talking about and gives real-time responses. With no frustration from long
waiting queues, your users can easily make purchases.

A voice AI assistant can also give them speedy resolutions if they are stuck
somewhere during the process. Subsequently, this leads to lower cart abandonment
rates and improved customer satisfaction – which looks great to your untapped
market!

4. Two-way communication

More often than not, conventional communication with emails and call centre IVRs
leaves consumers impatient and dissatisfied. This is because there is no natural flow
or consistency in conversations between a business and the customer.

However, voice technology is built on the cornerstones of promptness, clarity, and
precision. A voice AI assistant works together with the customer to arrive at accurate
and speedy results. It encourages a two-way conversation, which lets the user clearly
state their queries and get instant answers and resolutions.
5. Consumer insights with more personalisation

Voice AI technology is sharp: it keeps track of and records any important data
provided by customers on your platform during the interaction.

Your voice AI assistant can store and utilize this information, along with previously
entered data, to offer accurate and relevant suggestions. Where humans are prone to
forget things, voice-technology-based chatbots keep a record of minute details.

The absence of data loss also equips you with better consumer insights. Voice AI
makes it easier for you to assess, understand, and tailor your products and services
to different cohorts.
1.2 OBJECTIVES

The main objective of building personal assistant software (a virtual assistant) is to
use semantic data sources available on the web and user-generated content, and to
provide knowledge from knowledge databases. The main purpose of an intelligent virtual
assistant is to answer questions that users may have. This may be done in a
business environment, for example on the business website, with a chat interface.
On the mobile platform, the intelligent virtual assistant is available as a call-button
operated service where a voice asks the user "What can I do for you?" and then
responds to verbal input.

Virtual assistants can save you a tremendous amount of time. We spend hours on online
research and then writing up the report in our own words; GENESIS can do that for you.
Provide a topic for research and continue with your tasks while GENESIS does the
research. Another difficult task is remembering test dates, birthdays or anniversaries;
it comes as a surprise when you enter the class and realize there is a class test
today. Just tell GENESIS about your tests in advance and it reminds you well ahead of
time so you can prepare.

One of the main advantages of voice search is its rapidity. In fact, voice is
reputed to be nearly four times faster than a written search: whereas we can type
about 40 words per minute, we are capable of speaking around 150 in the same time
(150 / 40 ≈ 3.75, consistent with the 3.7x figure cited earlier). In this respect,
the ability of personal assistants to accurately recognize spoken words is a
prerequisite for their adoption by consumers.
1.3 PURPOSE, SCOPE AND APPLICABILITY

Purpose

The purpose of a virtual assistant is to be capable of voice interaction, music
playback, making to-do lists, setting alarms, streaming podcasts, playing
audiobooks, and providing weather, traffic, sports, and other real-time information,
such as news. Virtual assistants enable users to speak natural-language voice
commands in order to operate the device and its apps. There is an increased overall
awareness and a higher level of comfort demonstrated specifically by millennial
consumers. In this ever-evolving digital world where speed, efficiency, and
convenience are constantly being optimized, it is clear that we are moving towards
less screen interaction.

Scope

Voice assistants will continue to offer more individualized experiences as they get
better at differentiating between voices. However, it is not just developers who need
to address the complexity of developing for voice; brands also need to understand the
capabilities of each device and integration and whether it makes sense for their
specific brand. They will also need to focus on maintaining a consistent user
experience in the coming years as this complexity becomes more of a concern, because
voice assistants lack a visual interface: users simply cannot see or touch a voice
interface.

Applicability

The mass adoption of artificial intelligence in users' everyday lives is also fueling
the shift towards voice. The growing number of IoT devices such as smart thermostats
and speakers gives voice assistants more utility in a connected user's life. Smart
speakers are the number one way we are seeing voice being used, and many industry
experts even predict that nearly every application will integrate voice technology
in some way within the next five years. The use of virtual assistants can also enhance
the Internet of Things (IoT) ecosystem. Twenty years from now, Microsoft and its
competitors will be offering personal digital assistants that provide the services
of a full-time employee usually reserved for the rich and famous.
2. REQUIREMENT AND ANALYSIS

System analysis is about completely understanding existing systems and finding
where they fail, so that a solution can be determined that resolves those issues in
the proposed system. It defines the system: the system is divided into smaller parts,
and the functions and interrelations of these modules are studied in system analysis.
The complete analysis follows below.

2.1 Problem definition

Usually, a user needs to manually juggle multiple applications to complete
one task. For example, a user trying to make a travel plan needs to look up
airport codes for nearby airports and then check travel sites for tickets between
combinations of airports to reach the destination. There is a need for a system that
can manage such tasks effortlessly.

We already have multiple virtual assistants, but we hardly use them. A number of
people have issues with voice recognition: these systems can understand English
phrases but fail to recognize them in our accent, since our pronunciation is quite
distinct. They are also easier to use on mobile devices than on desktop systems.
There is a need for a virtual assistant that can understand English in an Indian
accent and work on a desktop system.

When a virtual assistant is not able to answer questions accurately, it is because it
lacks the proper context or does not understand the intent of the question. Its ability
to answer questions relevantly only comes with rigorous optimization, involving
both humans and machine learning. Continuously applying solid quality control
strategies also helps manage the risk of the virtual assistant learning undesired
behaviours. Such assistants also require a large amount of information to be fed to
them in order to work efficiently.

The virtual assistant should be able to model complex task dependencies and use these
models to recommend optimized plans to the user. It needs to be tested for
finding optimum paths when a task has multiple sub-tasks and each sub-task can
have its own sub-tasks. In such cases there can be multiple candidate paths, and the
assistant should be able to consider user preferences, other active tasks and
priorities in order to recommend a particular plan.
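
To make this requirement concrete, the following minimal sketch shows one way such task dependencies could be modelled and ordered. The plan() helper and the travel example are illustrative assumptions and are not part of the GENESIS code presented in Section 6.2.

# Hypothetical sketch: order sub-tasks so that every prerequisite runs first.
from typing import Dict, List

def plan(dependencies: Dict[str, List[str]]) -> List[str]:
    # dependencies maps each task to the sub-tasks it needs completed first.
    order, visited = [], set()

    def visit(task: str) -> None:
        if task in visited:
            return
        visited.add(task)
        for sub in dependencies.get(task, []):
            visit(sub)              # schedule prerequisites before the task itself
        order.append(task)

    for task in dependencies:
        visit(task)
    return order

# Example: a travel plan broken into dependent sub-tasks.
travel = {
    "book tickets": ["find airport codes", "compare fares"],
    "compare fares": ["find airport codes"],
    "find airport codes": [],
}
print(plan(travel))  # ['find airport codes', 'compare fares', 'book tickets']

User preferences or priorities could then be applied as tie-breakers when several valid orders exist.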
2.2 REQUIREMENT SPECIFICATION

Personal assistant software is required to act as an interface to the digital world
by understanding user requests or commands and then translating them into actions or
recommendations based on the agent's understanding of the world.

GENESIS focuses on relieving the user of entering text and uses voice as the
primary means of input. The agent applies voice recognition algorithms to this input
and records it. It then uses the input to call one of the personal information
management applications, such as the task list or calendar, to record a new entry, or
to search for it on search engines like Google, Bing or Yahoo. The focus is on
capturing the user input through voice, recognizing the input and then executing the
task if the agent understands it. The software takes this input in natural language,
making it easier for the user to express what he or she desires to be done.

Voice recognition software enables hands-free use of the applications and lets users
query or command the agent through a voice interface. This gives users access to the
agent while performing other tasks and thus enhances the value of the system itself.
GENESIS also has ubiquitous connectivity through a Wi-Fi or LAN connection, enabling
distributed applications that can leverage APIs exposed on the web without needing
to store data locally.

Virtual assistants must provide a wide variety of services. These include:

• Providing information such as the weather or facts from e.g. Wikipedia.
• Setting an alarm or making to-do lists and shopping lists.
• Reminding you of birthdays and meetings.
• Playing music from streaming services such as Saavn and Gaana.
• Playing videos, TV shows or movies on televisions, streaming from e.g. Netflix or Hotstar.
• Booking tickets for shows, travel and movies.


Feasibility Study

A feasibility study helps determine whether or not the project should proceed; it is
essential to evaluate the cost and benefit of the proposed system. Five types of
feasibility are taken into consideration.

1. Technical feasibility: This includes identifying the technologies for the project,
both hardware and software. For a virtual assistant, the user must have a microphone
to convey their message and a speaker to listen when the system speaks. These are
very cheap nowadays and most people already possess them. Besides this, the system
needs an internet connection, so while using GENESIS, make sure you have a steady
connection. This is also not an issue in an era where almost every home or office has
Wi-Fi.

2. Operational feasibility: This is the ease and simplicity of operating the proposed
system. The system does not require any special skill set from users. In fact, it is
designed to be used by almost everyone; even children who cannot yet write can speak
their questions to the system and get answers.

3. Economic feasibility: Here we weigh the total cost and benefit of the proposed
system against the current system. For this project, the main cost is the documentation
cost; the user would also have to pay for a microphone and speakers, which again are
cheap and widely available. As far as maintenance is concerned, GENESIS will not cost
much.

4. Organizational feasibility: This shows the management and organizational
structure of the project. This project is not built by a team; the management tasks
are all carried out by a single person. That won't create any management issues and
will increase the feasibility of the project.

5. Cultural feasibility: This deals with the compatibility of the project with the
cultural environment. The virtual assistant is built in accordance with the general
culture, and its persona and responses are designed so as not to undermine local
beliefs.
This project is technically feasible with no special external hardware requirements.
It is also simple to operate and incurs no training or repair costs. The overall
feasibility study reveals that the goals of the proposed system are achievable, so
the decision is taken to proceed with the project.
2.3 HARDWARE AND SOFTWARE REQUIREMENTS

The software is designed to be lightweight so that it is not a burden on the
machine running it. The system is built keeping in mind generally available hardware
and software.
The minimum hardware and software requirements for the virtual assistant are:

Hardware:
• Pentium-pro processor or later.
• RAM 512MB or more.

Software:
• Windows 7(32-bit) or above.
• Python 3.6 or later (the code in Section 6.2 uses f-strings)
• Chrome Driver
• Selenium Web Automation
• SQLite
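
The list above names Selenium and Chrome Driver for web automation. The code in Section 6.2 uses the simpler built-in webbrowser module instead, so the following is only a hypothetical sketch of how Selenium with ChromeDriver could be used for a browser task such as opening a YouTube search; it assumes the selenium package and a matching Chrome/ChromeDriver installation.

# Hypothetical sketch: driving Chrome through Selenium for web automation.
from urllib.parse import quote_plus
from selenium import webdriver

def youtube_search(query):
    driver = webdriver.Chrome()   # launches Chrome via ChromeDriver
    driver.get("https://www.youtube.com/results?search_query=" + quote_plus(query))
    # ...interact with the page here (click a result, scrape titles, etc.)...
    # driver.quit()               # close the browser once the task is done

youtube_search("python tutorial")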
3.1 ACTIVITY DIAGRAM

Initially, the system is in idle mode. When it receives a wake-up call it begins
execution.

The received command is classified as either a question or a task to be performed,
and the appropriate action is taken. After the question has been answered or the task
has been performed, the system waits for another command. This loop continues until
it receives a quit command, at which point it goes back to sleep.
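
A minimal sketch of this activity flow as a Python loop is given below; the listen(), is_question(), answer_question() and perform_task() callables are placeholders for the modules described in Section 6, and the wake-word and quit handling are simplified.

# Hypothetical sketch of the idle / wake / execute loop in the activity diagram.
WAKE_WORD = "genesis"

def run(listen, is_question, answer_question, perform_task):
    while True:
        heard = listen()                       # stays "idle" until speech arrives
        if not heard.lower().startswith(WAKE_WORD):
            continue                           # ignore speech without the wake word
        command = heard[len(WAKE_WORD):].strip().lower()
        if command in ("quit", "go to sleep"):
            break                              # quit command: go back to sleep
        if is_question(command):
            answer_question(command)           # answer a question
        else:
            perform_task(command)              # or perform a task
        # loop back and wait for the next command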
3.2 CLASS DIAGRAM

The User class has two attributes: the command it sends as audio and the response it
receives, which is also audio. It provides functions to listen to the user command,
interpret it, and then reply or send back a response accordingly. The Question class
holds the command in string form, as produced by the interpreter; it routes the
command to a general, about or search function based on its identification.

The Task class also holds the interpreted command in string format. It has various
functions such as reminder, note, mimic, research and reader.
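
The class structure described above can be summarised by the skeleton below; method bodies are omitted, and the attribute and method names are taken from the diagram description (anything beyond those names is an assumption).

# Skeleton of the classes in the class diagram (bodies intentionally omitted).
class User:
    def __init__(self):
        self.command = None      # audio command sent by the user
        self.response = None     # audio response received back
    def listen(self): ...        # capture the spoken command
    def reply(self, audio): ...  # send/speak the response back

class Question:
    def __init__(self, command: str):
        self.command = command   # interpreted command as a string
    def general(self): ...
    def about(self): ...
    def search(self): ...

class Task:
    def __init__(self, command: str):
        self.command = command   # interpreted command as a string
    def reminder(self): ...
    def note(self): ...
    def mimic(self): ...
    def research(self): ...
    def reader(self): ...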
3.3 USE CASE DIAGRAM

In this project there is only one user. The user speaks a command to the
system; the system then interprets it and fetches the answer. The response is sent
back to the user.
3.4 SEQUENCE DIAGRAM

Sequence diagram for Query-Response

The above sequence diagram shows how the answer to a user's question is
fetched from the internet. The audio query is interpreted and sent to the
web scraper, which searches for and finds the answer. The answer is then
sent back to the speaker component, which speaks it to the user.
Sequence diagram for Task Execution

The user sends a command to the virtual assistant in audio form. The command is
passed to the interpreter, which identifies what the user has asked and directs it to
the task executor. If the task is missing some information, the virtual assistant asks
the user about it. The received information is passed back to the task, which is then
accomplished. After execution, feedback is sent back to the user.
3.5 DATA FLOW DIAGRAM

The data flow diagrams cover the following levels and flows:

• DFD Level 0 (Context Level Diagram)
• DFD Level 1
• DFD Level 2: Data Flow in Assistance, Managing User Data, Data Flow in Kid Zone, and Settings of Virtual Assistance
3.6 COMPONENT DIAGRAM

The main component here is the Virtual Assistant. It provides two specific services:
executing a task or answering a question.
4. LITERATURE SURVEY

The field of voice-based assistants has seen major advancements and
innovations. The main reason behind such rapid growth is demand in
devices like smartwatches, fitness bands, speakers, Bluetooth earphones, mobile
phones, laptops, desktops, televisions, etc. Most smart devices being
brought to the market today have built-in voice assistants.
The amount of data generated nowadays is huge, and in order to make our
assistant good enough to handle it and give better results we should equip it with
machine learning and train it according to its use. Along with machine learning,
other equally important technologies are IoT, NLP and big-data access management.
Voice assistants can ease many tasks for us: simply give a voice command to
the system and the assistant completes the task, starting by converting
the speech command to text, then extracting the keywords from the
command and executing queries based on those keywords.
In the paper "Speech Recognition with Flat Direct Models" by Patrick Nguyen et al., a
novel direct modelling approach for speech recognition is put forward
which measures the consistency of whole spoken sentences. They term
this approach the Flat Direct Model (FDM); it does not follow the
conventional Markov model and is not sequential. Using their approach, a
key problem of defining features has been addressed. Moreover, the template-based
features improved the sentence error rate by 3% absolute over the baseline [2].

Again, in the paper "On the track of Artificial Intelligence: Learning with Intelligent
Personal Assistants" by Nil Goksel Canbek et al., the potential use of intelligent
personal assistants (IPAs), which apply advanced computing technologies and Natural
Language Processing (NLP), for learning is examined. Essentially, they review the
working of IPAs within the scope of AI [4].

The application of voice assistants is taken to a higher level in the paper
"Smart Home Using Internet of Things" by Keerthana S et al., where they
discuss how smart assistants can be used to build a smart
home system using Wi-Fi and the Internet of Things. They use the
CC3200 MCU, which has an in-built Wi-Fi module and temperature sensor. The
temperature sensed by the sensor is sent to the microcontroller
unit (MCU) and then posted to a server, and using that data the status of
electrical equipment such as fans and lights is monitored and controlled [5].
The application of voice assistants is also discussed in the paper "An
Intelligent Voice Assistant Using Android Platform" by Sutar Shekhar et al., where
they stress that mobile users can perform their daily tasks using
voice commands instead of typing or using keys on their phones. They also
use a prediction technology that makes recommendations based on user
activity [6].

The incorporation of natural language processing (NLP) in voice assistants is
essential and can lead to the creation of a trendsetting assistant. These factors
are the key focus of the paper "An Intelligent Chatbot using Natural Language
Processing" by Rishabh Shah et al. They discuss how NLP can help
make assistants smart enough to understand commands in a user's native language,
so that no part of society is prevented from enjoying its benefits [7].

We also studied the GTTS-EHU systems developed for the Query-by-Example Spoken Term
Detection (QbE-STD) and Spoken Term Detection (STD) tasks of the Albayzin 2018 Search
on Speech Evaluation. For representing audio documents and spoken queries, stacked
bottleneck features (sBNF) are used as the frame-level acoustic representation.
Spoken queries are synthesized, the average of their sBNF representations is taken,
and the average query is then used for QbE-STD [8].

We have also seen the integration of technologies such as gTTS and AIML (Artificial
Intelligence Mark-up Language) in the paper "JARVIS: An interpretation of AIML
with integration of gTTS and Python" by Tanvee Gawand et al., where they
adopt the Python pyttsx text-to-speech conversion library which, unlike
alternative libraries, works offline [9].
5. SYSTEM ARCHITECTURE

Basic Workflow
The figure below shows the workflow of the main method of the voice assistant. Speech
recognition is used to convert the speech input to text. This text is then sent to the
processor, which determines the nature of the command and calls the appropriate
script for execution. Accuracy is not only a matter of training data, however:
background noise plays a big role in whether the system hears you correctly, because
the speech recognizer may be unable to distinguish your voice from sounds such as a
barking dog or a helicopter flying overhead.

Fig 2 Basic Workflow of the voice assistant
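
The "processor" step above amounts to mapping the recognized text to an appropriate handler. A minimal sketch of such a dispatcher is shown below; the keyword table and handler functions are illustrative assumptions, whereas the actual code in Section 6.2 uses a chain of if/elif checks.

# Hypothetical sketch of the processor step: recognized text -> handler script.
import datetime
import webbrowser

def open_youtube(text):
    webbrowser.open("https://youtube.com")

def tell_time(text):
    print(datetime.datetime.now().strftime("%H:%M:%S"))

HANDLERS = {            # keyword that identifies the command -> script to call
    "youtube": open_youtube,
    "time": tell_time,
}

def process(text):
    for keyword, handler in HANDLERS.items():
        if keyword in text.lower():
            handler(text)
            return
    print("Sorry, I did not understand:", text)

process("Genesis, open YouTube please")   # opens youtube.com in the default browser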


Detailed Workflow
Voice assistants such as Siri, Google Assistant, and Bixby are already available on
our phones. According to a recent NPR study, around one in every six Americans
already has a smart speaker in their home, such as the Amazon Echo or Google
Home, and sales are growing at the same rate as smartphone sales a decade ago. At
work, though, the voice revolution may still seem a long way off. The move toward
open workspaces is one deterrent: nobody wants to be the person who can't
stop talking to their virtual assistant.

Fig 3 Detailed Workflow of the voice assistant


6. Result

Voice assistant applications are based on an Automatic Speech Recognition
(ASR) system. ASR systems record the speech and then break it down into
phonemes, which are subsequently processed into text. A phoneme (not a word or
syllable) is the basic unit of human speech used in recognition.

The main purpose is to facilitate users' daily lives by sensing the voice
and interpreting it into action.

6.1 Imported modules

Speech Recognition module


Since we are creating a voice assistant app, one of the most critical features is that
the assistant recognizes your voice. In the terminal, execute the following command to
install this module:
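The speech_recognition module is published on PyPI under the name SpeechRecognition, so the install command is typically:

    pip install SpeechRecognition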

Date and Time module


The datetime package is used to show the date and time. This datetime module comes
built in with Python.
Wikipedia
Wikipedia is a great and huge source of knowledge, much like GeeksforGeeks and other
sources. We have used the wikipedia module in our project to get information from
Wikipedia and to perform Wikipedia searches. To install this module, use
pip install wikipedia.
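
A minimal usage example of the module is shown below; the topic string is arbitrary.

# Fetch a two-sentence summary of a topic from Wikipedia.
import wikipedia

print(wikipedia.summary("Python (programming language)", sentences=2))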

Web browser
The webbrowser module is used to perform web searches and open pages in the default browser. It comes built in with Python.

OS
The os module in Python provides functions for interacting with the operating system.
It is one of Python's standard utility modules and provides a portable way of using
operating-system-dependent functionality.

Pyaudio
PyAudio is a set of Python bindings for PortAudio, a cross-platform audio I/O library
that interfaces with the audio drivers.

PyQt5
PyQt5 is a comprehensive set of Python bindings for Qt v5. It is implemented as
more than 35 extension modules and enables Python to be used as an alternative
application development language to C++ on all supported platforms, including iOS
and Android. PyQt5 may also be embedded in C++-based applications to allow users
of those applications to configure or enhance their functionality.
Python Backend
The Python backend gets the output from the speech recognition module and then
identifies whether the command is an API call or requires context extraction. The
result is then sent back through the Python backend to give the required output to
the user.
Text to speech module
Text-to-Speech (TTS) refers to the ability of computers to read text aloud. A TTS
Engine converts written text to a phonemic representation, then converts the
phonemic representation to waveforms that can be output as sound. TTS engines with
different languages, dialects and specialized vocabularies are available through
third-party publishers.
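
As a small illustration of a TTS engine, the sketch below configures pyttsx3 (the same library used in Section 6.2); the rate value is an example, while init(), getProperty(), setProperty(), say() and runAndWait() are the library's standard calls.

# Minimal pyttsx3 text-to-speech example: choose a voice, set the rate, speak.
import pyttsx3

engine = pyttsx3.init()                    # uses SAPI5 on Windows by default
voices = engine.getProperty('voices')      # list of installed voices
engine.setProperty('voice', voices[0].id)  # pick the first installed voice
engine.setProperty('rate', 170)            # speaking rate, roughly words per minute (example)

engine.say("Text to speech converts written text into audible speech.")
engine.runAndWait()                        # block until speaking has finished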

Speech to text conversion


Speech recognition is used to convert speech input to textual output. It decodes the
voice and converts it into a textual format that can easily be processed by the PC.

Context Extraction
Context extraction (CE) is the task of automatically extracting structured information
from unstructured and/or semi-structured machine-readable documents. In most
cases, this activity involves processing human-language text using natural language
processing (NLP). Recent work in multimedia document processing, such as
automatic annotation and content extraction from images, audio and video, can also
be seen as a form of context extraction.

Textual output
The assistant decodes the voice command, performs the operation, and then shows the
recognized command as textual output in the terminal.
6.2 Code

import pyttsx3
import speech_recognition as sr
import datetime
import wikipedia
import webbrowser
import os

engine = pyttsx3.init('sapi5')
voices = engine.getProperty('voices')
engine.setProperty('voice', voices[0].id)


def speak(audio):
    """Speak the given text aloud through the TTS engine."""
    engine.say(audio)
    engine.runAndWait()


def wishMe():
    """Greet the user according to the current hour."""
    hour = int(datetime.datetime.now().hour)
    if 0 <= hour < 12:
        speak("Good Morning!")
    elif 12 <= hour < 18:
        speak("Good Afternoon!")
    else:
        speak("Good Evening!")

    speak("I am Genesis Sir. Please tell me how may I help you")
    print("I am Genesis Sir. Please tell me how may I help you")


def takeCommand():
    """Listen on the microphone and return the recognized command as text."""
    r = sr.Recognizer()
    with sr.Microphone() as source:
        print("Listening...")
        r.pause_threshold = 1
        audio = r.listen(source)

    try:
        print("Recognizing...")
        query = r.recognize_google(audio, language='en-in')
        print(f"User said: {query}\n")
    except Exception:
        print("Say that again please...")
        return "None"
    return query


if __name__ == "__main__":
    wishMe()
    while True:
        query = takeCommand().lower()

        if 'wikipedia' in query:
            speak('Searching Wikipedia...')
            query = query.replace("wikipedia", "")
            results = wikipedia.summary(query, sentences=2)
            speak("According to Wikipedia")
            print(results)
            speak(results)

        elif 'open youtube' in query:
            webbrowser.open("youtube.com")

        elif 'open google' in query:
            webbrowser.open("google.com")

        elif 'open stackoverflow' in query:
            webbrowser.open("stackoverflow.com")

        elif 'the time' in query:
            strTime = datetime.datetime.now().strftime("%H:%M:%S")
            speak(f"Sir, the time is {strTime}")

        elif 'open code' in query:
            codePath = "C:\\Users\\Rishabh\\AppData\\Local\\Programs\\Microsoft VS Code\\Code.exe"
            os.startfile(codePath)

        elif 'play music' in query or 'play song' in query:
            speak("Here you go with music")
            # music_dir = "G:\\Song"
            music_dir = "C:\\Users\\Rishabh\\Music"
            songs = os.listdir(music_dir)
            print(songs)
            os.startfile(os.path.join(music_dir, songs[1]))

        elif 'why you came to world' in query:
            speak("Thanks to You and Your Team. further It's a secret")
            print("Thanks to You and Your Team. further It's a secret")

        elif 'is love' in query:
            speak("It is 7th sense that destroy all other senses")
            print("It is 7th sense that destroy all other senses")
6.3 Output
7. CONCLUSION

This report presents a comprehensive overview of the design and development of a
voice-enabled personal assistant for PC using the Python programming language.
In today's lifestyle, such a voice-enabled personal assistant is effective in saving
time and is helpful to differently abled people. The assistant performs the tasks
given by the user correctly. Furthermore, it is capable of many things, such as
sending a message to the user's mobile, YouTube automation, and gathering
information from Wikipedia and Google, with just one voice command.
Through this voice assistant we have automated various services using single-line
voice commands, easing most of the user's tasks, such as searching the web. We aim to
make this project a complete server assistant, smart enough to act as a
replacement for general server administration. The project is built using open-source
software modules with the PyCharm Community edition, so it can easily accommodate
future updates. The modular nature of the project makes it flexible and easy to extend
with additional features without disturbing current system functionality.
8. REFERENCES

[1] D. O'Shaughnessy, "Interacting with Computers by Voice: Automatic Speech Recognition and Synthesis", Proceedings of the IEEE, vol. 91, no. 9, September 2003.
[2] Patrick Nguyen, Georg Heigold, Geoffrey Zweig, "Speech Recognition with Flat Direct Models", IEEE Journal of Selected Topics in Signal Processing, 2010.
[3] David L. Poole and Alan K. Mackworth, Python code for voice assistant: Foundations of Computational Agents, 2019-2020.
[4] Nil Goksel Canbek, Mehmet Emin Mutlu, "On the track of Artificial Intelligence: Learning with Intelligent Personal Assistants", International Journal of Human Sciences, 2016.
[5] Keerthana S, Meghana H, Priyanka K, Sahana V. Rao, Ashwini B, "Smart Home Using Internet of Things", Perspectives in Communication, Embedded Systems and Signal Processing, 2017.
[6] Sutar Shekhar, P. Sameer, Kamad Neha, Prof. Devkate Laxman, "An Intelligent Voice Assistant Using Android Platform", IJARCSMS, ISSN: 2321-7782, 2017.
[7] Rishabh Shah, Siddhant Lahoti, Prof. Lavanya K., "An Intelligent Chatbot using Natural Language Processing", International Journal of Engineering Research, vol. 6, pp. 281-286, 2017.
[8] Luis Javier Rodríguez-Fuentes, Mikel Peñagarikano, Amparo Varona, Germán Bordel, "GTTS-EHU Systems for the Albayzin 2018 Search on Speech Evaluation", Proceedings of IberSPEECH, Barcelona, Spain, 2018.
[9] Ravivanshikumar Sangpal, Tanvee Gawand, Sahil Vaykar, "JARVIS: An interpretation of AIML with integration of gTTS and Python", Proceedings of the 2019 2nd International Conference on Intelligent Computing, Instrumentation and Control Technologies (ICICICT), Kanpur, 2019.
