Alex Project Report
REPORT ON
ALEX: The Personal Virtual Assistant
Submitted in partial fulfilment of the award of the
BACHELOR OF TECHNOLOGY
degree in Computer Science and Engineering
❖ Declaration
❖ Certificate
❖ Acknowledgement
❖ Abstract
❖ Problem Statement
❖ Purpose
❖ Scope
❖ Applicability
❖ Introduction
❖ Objectives
❖ Survey of Technology
❖ Feasibility
❖ Sequence diagram
PROBLEM STATEMENT
Users usually have to manage several applications by hand to complete a single
task. For example, a user planning a trip must look up the codes of nearby
airports and then check travel sites for tickets between each combination of
airports to reach the destination. There is a need for a system that can manage such
tasks effortlessly.
Multiple virtual assistants already exist, but we hardly use them. Many people
have trouble with voice recognition: these systems understand English phrases,
but they often fail to recognize them when spoken in our accent, because our
pronunciation is quite distinct from theirs. They are also easier to use on mobile
devices than on desktop systems. There is a need for a virtual assistant that
understands English spoken in an Indian accent and works on a desktop system.
When a virtual assistant cannot answer a question accurately, it is because it lacks
the proper context or does not understand the intent of the question. Relevant
answers come only with rigorous optimization involving both humans and machine
learning. Consistent quality control also helps manage the risk of the virtual
assistant learning undesired behaviors. Such assistants need a large amount of
information to be fed in before they work efficiently.
A virtual assistant should be able to model complex task dependencies and use these
models to recommend optimized plans for the user. It needs to be tested on finding
optimum paths when a task has multiple sub-tasks and each sub-task can have its own
sub-tasks. In such a case there can be multiple candidate paths, and the assistant
should consider user preferences, other active tasks, and priorities in order to
recommend a particular plan.
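
The nested sub-task model described above can be sketched as a small tree search. This is a minimal illustration under stated assumptions, not the planner used in ALEX: the `Task` class, the cost figures, and the travel example are all hypothetical, and user preferences are reduced to a single numeric cost per task.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Task:
    """A leaf task with a cost, or a choice between alternative plans.

    Each alternative is a list of sub-tasks that must all be completed.
    """
    name: str
    cost: float = 0.0  # cost of performing this task itself (hypothetical units)
    alternatives: List[List["Task"]] = field(default_factory=list)


def best_plan(task: Task) -> Tuple[float, List[str]]:
    """Return (total cost, ordered step names) of the cheapest plan for a task."""
    if not task.alternatives:
        return task.cost, [task.name]

    best_cost, best_steps = float("inf"), []
    for subtasks in task.alternatives:
        total, steps = task.cost, []
        for sub in subtasks:
            sub_cost, sub_steps = best_plan(sub)  # sub-tasks may nest further
            total += sub_cost
            steps.extend(sub_steps)
        if total < best_cost:
            best_cost, best_steps = total, steps
    return best_cost, best_steps


# Hypothetical travel example: two ways to reach the destination.
trip = Task("reach destination", alternatives=[
    [Task("cab to airport A", 20), Task("flight from A", 120)],
    [Task("train to airport B", 35), Task("flight from B", 90)],
])

cost, steps = best_plan(trip)
print(cost, steps)
```

In this toy example the second alternative is cheaper overall even though its first step costs more, which is the kind of trade-off a planner has to weigh. Priorities and other active tasks could be folded into the cost of each step.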
PURPOSE
The purpose of a virtual assistant is to provide voice interaction, music playback,
to-do lists, alarms, podcast streaming, audiobook playback, and real-time
information such as weather, traffic, sports, and news. Virtual assistants let users
speak natural-language voice commands to operate the device and its apps. Overall
awareness is increasing, and millennial consumers in particular show a higher level
of comfort with voice. In this ever-evolving digital world, where speed, efficiency,
and convenience are constantly being optimized, it is clear that we are moving
towards less screen interaction.
SCOPE
Voice assistants will continue to offer more individualized experiences as they get
better at differentiating between voices. However, developers are not the only ones
who must address the complexity of developing for voice: brands also need to
understand the capabilities of each device and integration, and whether voice makes
sense for their specific brand. In the coming years they will also need to focus on
keeping the user experience consistent as complexity becomes more of a concern,
because a voice assistant has no visual interface. Users simply cannot see or touch
a voice interface.
APPLICABILITY
The mass adoption of artificial intelligence in users’ everyday lives is also fueling
the shift towards voice. The growing number of IoT (Internet of Things) devices, such
as smart thermostats and speakers, is giving voice assistants more utility in a
connected user’s life. Smart speakers are currently the most common way voice is
being used. Many industry experts even predict that nearly every application will
integrate voice technology in some way within the next 5 years. Virtual assistants
can in turn enhance IoT systems. Twenty years from now, Microsoft and its
competitors are predicted to offer personal digital assistants that provide the
services of a full-time employee, something usually reserved for the rich and famous.
INTRODUCTION
VIRTUAL ASSISTANT
In today’s era almost every task is digitalized. We have a smartphone in our hands,
and it is nothing less than having the world at our fingertips. These days we do not
even need our fingers: we just speak the task and it is done. There are systems where
we can say, “Text Dad, I’ll be late today,” and the text is sent. That is the job of a
virtual assistant. Virtual assistants also support specialized tasks, such as booking
a flight or finding the cheapest copy of a book across various e-commerce sites and
then providing an interface to place the order. In this way they help automate
search, discovery, and online ordering.
Virtual assistants are software programs that ease your day-to-day tasks, such as
showing the weather report, creating reminders, and making shopping lists. They
can take commands via text (online chatbots) or by voice. Voice-based intelligent
assistants need an invoking word, or wake word, to activate the listener, followed by
the command. There are already many virtual assistants, such as Apple’s Siri,
Amazon’s Alexa, and Microsoft’s Cortana. For this project, the wake word chosen is
ALEX.
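
The wake-word convention can be illustrated on already-transcribed text. The sketch below is a simplified assumption about how such a check might look, not the project's actual listener: the `extract_command` helper is hypothetical, and real wake-word detection usually runs on audio rather than on a transcript.

```python
import re
from typing import Optional

WAKE_WORD = "alex"  # the wake word chosen for this project


def extract_command(transcript: str, wake_word: str = WAKE_WORD) -> Optional[str]:
    """Return the command that follows the wake word, or None if it is absent.

    Matching is case-insensitive and accepts the wake word only as a whole
    word at the start of the utterance.
    """
    pattern = rf"^\s*{re.escape(wake_word)}\b[\s,:]*(.*)$"
    match = re.match(pattern, transcript, flags=re.IGNORECASE | re.DOTALL)
    if match is None:
        return None
    return match.group(1).strip()


print(extract_command("Alex, what is the weather today?"))
print(extract_command("what is the weather today?"))
```

Requiring a word boundary after the wake word keeps utterances such as "Alexander is late" from triggering the assistant.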
Voice search is gaining ground on text search. Web searches conducted via mobile
devices have only just overtaken those carried out on a computer, and analysts
are already predicting that 50% of searches will be made by voice by 2020. Virtual
assistants are also turning out to be smarter than ever. An intelligent assistant can
make email work for you: it can detect intent, pick out important information,
automate processes, and deliver personalized responses.
This project was started on the premise that there is a sufficient amount of openly
available data and information on the web to build a virtual assistant that can make
intelligent decisions for routine user activities.
OBJECTIVES
The main objective of building personal assistant software (a virtual assistant) is to
use semantic data sources available on the web and user-generated content, and to
provide knowledge from knowledge databases. The main purpose of an intelligent
virtual assistant is to answer questions that users may have. This may be done in a
business environment, for example on the business website through a chat interface.
On a mobile platform, the intelligent virtual assistant is available as a
call-button-operated service where a voice asks the user “What can I do for you?” and
then responds to verbal input.
Virtual assistants can save you a tremendous amount of time. We spend hours on online
research and then on writing the report in our own words. ALEX can do that for you:
provide a topic for research and continue with your tasks while ALEX does the
research. Another difficult task is remembering test dates, birthdays, and
anniversaries. It comes as a surprise when you enter the class and realize there is a
class test today. Just tell ALEX about your tests beforehand and she will remind you
well in advance so you can prepare.
One of the main advantages of voice search is its speed. In fact, voice is reputed
to be about four times faster than a typed search: whereas we can type about 40 words
per minute, we can speak around 150 in the same period of time. In
this respect, the ability of personal assistants to recognize spoken words accurately
is a prerequisite for their adoption by consumers.
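
The "four times faster" figure follows directly from the two rates quoted above. As a quick check, using only the numbers in the text plus a hypothetical 12-word query:

```python
TYPING_WPM = 40     # words per minute when typing (figure from the text)
SPEAKING_WPM = 150  # words per minute when speaking (figure from the text)

speedup = SPEAKING_WPM / TYPING_WPM
print(f"Speaking is about {speedup:.2f}x faster than typing")

# Time needed to enter a hypothetical 12-word query, in seconds.
QUERY_WORDS = 12
typing_seconds = QUERY_WORDS / TYPING_WPM * 60
speaking_seconds = QUERY_WORDS / SPEAKING_WPM * 60
print(f"typing: {typing_seconds:.1f} s, speaking: {speaking_seconds:.1f} s")
```

The ratio is 3.75, which the text rounds to four.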
SURVEY OF TECHNOLOGY
Python
Python is a high-level, interpreted programming language that supports
object-oriented programming (OOP). It is a robust, highly useful language focused on
rapid application development (RAD), and it makes code easy to write and run.
Python can implement the same logic in as little as one-fifth of the code needed in
some other object-oriented languages. Its use is not limited to any one kind of
activity: its growing popularity has brought it into some of the most popular and
complex fields, such as artificial intelligence (AI), machine learning (ML), natural
language processing, and data science. Python has libraries for every need of this
project. For ALEX, the libraries used include SpeechRecognition to recognize voice,
Pyttsx for text-to-speech, and Selenium for web automation. Python is reasonably
efficient, and efficiency is usually not a problem for small examples. If your Python
code is not efficient enough, a general procedure to improve it is to find out what
is taking most of the time and implement just that part more efficiently in some
lower-level language. This results in much less programming and more efficient code
(because you have more time to optimize) than writing everything in a low-level
language.
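
The first step of that procedure, finding out what takes most of the time, can be done with the standard library profiler. The following is a minimal sketch; `slow_sum_of_squares` and `profile_call` are hypothetical names used only for illustration.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n: int) -> int:
    """Deliberately naive loop, standing in for a real hot spot."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def profile_call(func, *args) -> str:
    """Run func(*args) under cProfile and return a text report."""
    profiler = cProfile.Profile()
    profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(5)  # show the top 5 entries
    return stream.getvalue()


report = profile_call(slow_sum_of_squares, 100_000)
print(report)
```

Functions that dominate the cumulative time in such a report are the candidates for rewriting in a lower-level language.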
DBpedia
Knowledge bases are playing an increasingly important role in enhancing the
intelligence of Web and enterprise search and in supporting information integration.
DBpedia leverages Wikipedia, a gigantic source of knowledge, by extracting structured
information from it and making this information accessible on the Web.
The DBpedia knowledge base has several advantages over existing knowledge bases:
it covers many domains, it represents real community agreement, it automatically
evolves as Wikipedia changes, and it is truly multilingual. The DBpedia knowledge
base lets you ask quite surprising queries against Wikipedia, for instance “Give
me all cities in New Jersey with more than 10,000 inhabitants” or “Give me all Italian
musicians from the 18th century”.
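
The New Jersey question above could be expressed as a SPARQL query along the following lines. This is a sketch under assumptions: the class and property names (`dbo:City`, `dbo:isPartOf`, `dbo:populationTotal`) are a guess at how DBpedia models this data, the `format` parameter is assumed to be accepted by the public endpoint, and the helper functions are hypothetical. The code only builds the query and the request URL and does not contact the network.

```python
from urllib.parse import urlencode

DBPEDIA_ENDPOINT = "https://dbpedia.org/sparql"  # public SPARQL endpoint


def cities_query(state_resource: str, min_population: int) -> str:
    """Build a SPARQL query for cities in a region above a population threshold.

    The ontology terms used here are assumptions about DBpedia's modelling.
    """
    return f"""
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbr: <http://dbpedia.org/resource/>
SELECT ?city ?population WHERE {{
  ?city a dbo:City ;
        dbo:isPartOf dbr:{state_resource} ;
        dbo:populationTotal ?population .
  FILTER (?population > {min_population})
}}
ORDER BY DESC(?population)
""".strip()


def request_url(query: str) -> str:
    """Return a GET URL that asks the endpoint for JSON results."""
    params = {"query": query, "format": "application/sparql-results+json"}
    return f"{DBPEDIA_ENDPOINT}?{urlencode(params)}"


query = cities_query("New_Jersey", 10_000)
url = request_url(query)
print(query)
```

The resulting URL can be fetched with any HTTP client. The results depend on the current state of DBpedia, so none are shown here.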
Quepy
Quepy is a Python framework that transforms natural language questions into queries
in a database query language. It can be easily customized to different kinds of
natural language questions and database queries, so with little coding you can build
your own system for natural language access to your database.
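
The idea of mapping question patterns to query templates can be shown with a toy example. Note that this is not Quepy's API: it is a standard-library sketch of the general technique, and the patterns, the SPARQL-like templates, and the `question_to_query` function are all hypothetical.

```python
import re
from typing import Optional

# Each rule pairs a question pattern with a query template.
# Both the patterns and the templates are illustrative only.
RULES = [
    (re.compile(r"^who is (?P<name>.+?)\??$", re.IGNORECASE),
     'SELECT ?abstract WHERE {{ ?x rdfs:label "{name}"@en ; dbo:abstract ?abstract . }}'),
    (re.compile(r"^what is the capital of (?P<name>.+?)\??$", re.IGNORECASE),
     'SELECT ?capital WHERE {{ ?x rdfs:label "{name}"@en ; dbo:capital ?capital . }}'),
]


def question_to_query(question: str) -> Optional[str]:
    """Return the query for the first matching rule, or None if nothing matches."""
    for pattern, template in RULES:
        match = pattern.match(question.strip())
        if match:
            return template.format(name=match.group("name").strip())
    return None


print(question_to_query("What is the capital of India?"))
```

A framework such as Quepy replaces these hand-written regular expressions with richer patterns over parsed words, but the overall flow from question to pattern to query is the same.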
Pyttsx
Pyttsx stands for Python Text to Speech. It is a cross-platform Python wrapper for
text-to-speech synthesis. It is a Python package supporting common text-to-speech
engines on Mac OS X, Windows, and Linux. It works with both Python 2.x and 3.x
versions. Its main advantage is that it works offline.
Speech Recognition
This is a library for performing speech recognition, with support for several engines
and APIs, online and offline. It supports APIs such as Google Cloud Speech API, IBM
Speech to Text, and Microsoft Bing Voice Recognition.
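
A minimal sketch of how these two libraries are typically used together is given below. It assumes the `pyttsx3` package (the fork of Pyttsx that supports Python 3) and the `SpeechRecognition` package with PyAudio for microphone access. The function names and the `RecordingEngine` stand-in are hypothetical. The third-party imports are done lazily inside the functions, so the dry run at the end works without audio hardware.

```python
def make_tts_engine():
    """Create a text-to-speech engine (assumes pyttsx3 is installed)."""
    import pyttsx3  # imported lazily so the sketch loads without it
    return pyttsx3.init()


def speak(engine, text: str) -> None:
    """Queue the text and block until it has been spoken."""
    engine.say(text)
    engine.runAndWait()


def listen_once(language: str = "en-IN") -> str:
    """Record one utterance from the microphone and return its transcript.

    Assumes SpeechRecognition plus PyAudio, and an internet connection,
    since recognize_google calls a web API.
    """
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        recognizer.adjust_for_ambient_noise(source)
        audio = recognizer.listen(source)
    try:
        return recognizer.recognize_google(audio, language=language)
    except sr.UnknownValueError:
        return ""  # speech was unintelligible


class RecordingEngine:
    """Stand-in with the same two methods, for dry runs without audio."""

    def __init__(self):
        self.spoken = []

    def say(self, text):
        self.spoken.append(text)

    def runAndWait(self):
        pass


engine = RecordingEngine()  # swap in make_tts_engine() on a real machine
speak(engine, "Hello, I am ALEX.")
```

Passing the language tag `en-IN` asks the recognizer for Indian English, which matches the accent requirement stated in the problem statement.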
SQLite
SQLite is a capable library, providing an in-process relational database for efficient
storage of small-to-medium-sized data sets. It supports most of the common features
of SQL, with a few exceptions. Best of all, most Python users do not need to install
anything to get started with SQLite, as the standard library in most
distributions ships with the sqlite3 module. SQLite runs embedded in the same process
as your application, which lets you easily extend SQLite with your own
Python code. SQLite provides quite a few hooks, a reasonable subset of which are
implemented by the standard library database driver.
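
As an illustration of both points, the bundled `sqlite3` module and extending SQLite with Python code, the sketch below stores reminders and registers a Python function that SQL can call. The table layout, the `days_until` function, and the sample data are hypothetical and are not taken from the project.

```python
import sqlite3
from datetime import date

conn = sqlite3.connect(":memory:")  # in-memory database; use a file path to persist
conn.execute("CREATE TABLE reminders (title TEXT NOT NULL, due_date TEXT NOT NULL)")


def days_until(due_iso: str, today_iso: str) -> int:
    """Number of days from today_iso to due_iso (both YYYY-MM-DD strings)."""
    return (date.fromisoformat(due_iso) - date.fromisoformat(today_iso)).days


# Register the Python function so that it can be called from SQL.
conn.create_function("days_until", 2, days_until)

conn.executemany(
    "INSERT INTO reminders (title, due_date) VALUES (?, ?)",
    [("Class test", "2024-03-15"), ("Mom's birthday", "2024-04-02")],
)

today = "2024-03-12"  # fixed date so the example is reproducible
upcoming = conn.execute(
    "SELECT title, days_until(due_date, ?) AS days_left "
    "FROM reminders WHERE days_until(due_date, ?) BETWEEN 0 AND 7 "
    "ORDER BY days_left",
    (today, today),
).fetchall()
print(upcoming)
```

With the fixed date used here, only the class test falls within the next seven days, so it is the only reminder returned.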
FEASIBILITY STUDY
A feasibility study can help you determine whether or not you should proceed
with your project. It is essential to evaluate the cost and benefit of the proposed
system. Five types of feasibility study are taken into consideration.
1. Technical feasibility: This involves identifying the technologies needed for the
project, both hardware and software. For a virtual assistant, the user must have a
microphone to convey their message and a speaker to hear the system’s responses.
These are very cheap nowadays and most people already own them. In addition, the
system needs an internet connection: while using ALEX, make sure you have a steady
connection. This too is not an issue in an era where almost every home or office has
Wi-Fi.
3. Economic feasibility: Here, we weigh the total cost and benefit of the proposed
system against the current system. For this project, the main cost is documentation
cost. The user would also have to pay for a microphone and speakers, but again these
are cheap and widely available. As far as maintenance is concerned, ALEX will not
cost much.
Hardware:
• Pentium Pro processor or later.
Software:
• Windows 7 (32-bit) or above.
• ChromeDriver
• SQLite
USE CASE DIAGRAM
In this project there is only one user. The user issues a command or query to the
system. The system then interprets it and fetches the answer. The response is sent
back to the user.
SEQUENCE DIAGRAM
The above sequence diagram shows how the answer to a question asked by the user is
fetched from the internet. The audio query is interpreted and sent to the web scraper.
The web scraper searches for and finds the answer. It is then sent back to the
speaker, which speaks the answer to the user.
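
The flow in the sequence diagram can be summarized as a short pipeline. This is a sketch under assumptions rather than the project's actual code: the three stages are passed in as plain functions, and the dry run uses hypothetical stand-ins instead of a real microphone, web scraper, and speaker.

```python
from typing import Callable


def handle_query(
    listen: Callable[[], str],
    fetch_answer: Callable[[str], str],
    speak: Callable[[str], None],
) -> str:
    """Run one listen -> search -> speak round trip and return the answer."""
    query = listen()                  # audio query interpreted as text
    if not query:
        answer = "Sorry, I did not catch that."
    else:
        answer = fetch_answer(query)  # web scraper looks up the answer
    speak(answer)                     # answer is spoken back to the user
    return answer


# Dry run with stand-in stages.
spoken = []
result = handle_query(
    listen=lambda: "capital of India",
    fetch_answer=lambda q: f"(answer for: {q})",
    speak=spoken.append,
)
print(result)
```

Keeping the stages as separate functions means each one can be replaced or tested on its own, for example by swapping the stand-in scraper for one built on Selenium.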