
Artificial Intelligence

(Part B)
Class 10
Unit 1: Introduction to AI
A. Short answer type questions.
1. Intelligence can be defined as the ability to solve complex problems and make decisions;
it enables living beings to adapt to different environments for their survival. It gives
humans the ability to learn from experience, adapt to new situations, understand and
handle abstract concepts and control their environment.

2. People who possess this intelligence are skilled with words. They enjoy reading, express
themselves well in writing and are able to easily recognise and understand the meaning
and sounds of words.

3. Data bias arises when the data used to train an AI system is faulty or contains in-built
biases. For example, if an AI system is trained to recognise faces but the training data
primarily consists of lighter-skinned individuals, it may find it difficult to identify or
categorise persons with darker skin tones.

4. Machine Learning is a field of AI that enables machines to learn on their own and
improve with time through experience. In ML, machines learn from data fed to them
during the training phase and use this knowledge to improve their performance in
making accurate predictions.

5. Two important features of AI are:


 Thinks and learns like humans
 Mimics human intelligence

B. Long answer type questions.


1. Computer Vision (CV) helps computers see and derive meaningful information from
digital content, such as photographs and videos, analyse the information, and then take
decisions from it the same way as humans do. The main objective of Computer Vision is
to teach machines to collect information from pixels and make sense of it. The entire
process involves image acquiring, screening, analysing, identifying, and extracting
information.
In Computer Vision, AI first perceives the image with a sensing device, and then Computer
Vision and other AI algorithms identify and classify the elements in the image to
recognise it.
Self-driving automobiles: Self-driving automobiles use Computer Vision extensively.
Automated cars from companies like Tesla can detect the 360-degree movements of
pedestrians, cyclists, vehicles, etc. Computer Vision helps them detect and analyse
objects in real time and take decisions like braking, stopping or continuing to drive.
Facial recognition: Facial recognition is also an application of Computer Vision. Facial
recognition is a technology that is capable of identifying a person from a digital image or
video. It is used as security for unlocking devices like cell phones and tables and also by
investigation agencies to identify criminals.

2. Every person has different ways of learning and everyone uses different intelligences in
their daily lives. People possess different amounts and types of intelligence. These
intelligences are located in different areas of the brain and can either work
independently or together. For example, some people are good at understanding
rhythms and sound, some are good at physical activity like sports, while others are good
at logical and mathematical thinking. These multiple intelligences include the use of
words, numbers, pictures, music, logical thinking, the importance of social interactions,
introspection, physical movement and being in tune with nature. This difference in
intelligences is reflected in the theory of multiple intelligences. The theory of multiple
intelligences describes the different ways in which people learn and acquire
information.

3. a. Streaming platforms like Netflix, Amazon and SonyLIV use AI-powered recommendation
systems to suggest content based on our viewing history.
b. Navigation apps like Google Maps use AI to provide voice-guided instructions on how
to arrive at a given destination as well as suggest the best route to avoid traffic.

4. Bias is the tendency to be partial to one person or thing over another. AI bias occurs
when an algorithm produces results that are biased because it is trained on biased data.
AI cannot think on its own and, hence, cannot have biases of its own. Bias can transfer
from the developer to the machine while the algorithm is being developed. The data fed
into an AI algorithm could cause bias for the following three reasons:
• The data does not reflect the main population.
• The data has been unethically manipulated.
• It is based on historic data, which itself is biased.

5. a. AI for kids: Young children today are tech-savvy and well-versed with technology.
Consider the scenario of a young child given an assignment to write an essay. In this
scenario, the child uses the AI powered ChatGPT application to automatically generate
and write an essay. This definitely raises some concerns. Though the child may seem
smart and skilled at using technology, getting the essay written by AI will cause the child
to lose the opportunity to think and learn.
b. Data privacy: We avail of many free services on the internet, leaving behind a trail of
data, but we are often not made aware of it. Companies such as Amazon, Alphabet,
Microsoft, Apple, Meta and others use AI to collect this data to gain, maintain, and direct
our attention. We can even say that these AI algorithms may know us better than we
know ourselves. Our data can be used to manipulate our behaviour by using it for
marketing and earning profits.

6. a. Price comparison websites: Websites like Compare India use Big Data to provide a
comparison of the prices of products from multiple vendors in one place.
b. Search engines: Search engines like Google collect massive amounts of data from
various sources, including search queries, web pages, and user behaviour, and analyse this
data to provide better search results to users.

7. Artificial Intelligence (AI), Machine learning (ML), and Deep Learning (DL) are different
concepts:
AI refers to the field of computer science that can mimic human intelligence. The AI
machines are capable of learning on their own without human intervention. AI is a broad
term that includes both Machine Learning and Deep Learning.
Machine Learning enables machines to learn on their own and improve with time
through experience.
Deep Learning enables machines to learn and perform tasks on a large amount of data or
Big data. Due to the large amount of data, the system learns on its own by using multiple
machine learning algorithms working together to perform a specific task.
Machine Learning is a sub-category of AI, and Deep Learning is a sub-set of Machine
Learning, as it includes multiple machine learning modules. Deep Learning is the most
advanced form of Artificial Intelligence among these three. Next in line is Machine
Learning, which demonstrates intermediate intelligence. Artificial Intelligence includes all the
concepts and algorithms that mimic human intelligence.
8. A machine that is trained with data, can think and make predictions on its own is an AI
machine. Not all devices that are termed "smart" are AI-enabled. Some of these
machines, equipped with IoT technology, can connect to the internet and be operated
remotely, but they are not trained to think and take decisions on their own.
For example, an automatic washing machine can run on its own but it requires a human
to do the relevant settings every time before washing. Hence, it cannot be termed an AI
machine. IoT based machines like remotely operated A/Cs that can be switched on and
off via the internet need humans to operate them, so they cannot be considered as AI
machines.

Unit 2: AI Project Cycle


A. Short answer type questions.
1. Data exploration is the process of analysing data to discover patterns and gain insights
using data visualisation methods like graphs. It simplifies complex data, helps in selecting
AI models and makes it easy to communicate insights to others.

2. Regression is a learning-based AI model used to predict continuous numerical values.


Continuous data means data that can have any value within a certain range, for example,
the price of a product or the salary of an employee. It is used to predict the behaviour of
one variable depending on the value of another variable.

3. Artificial Neural Networks (ANNs) are computational networks that are at the heart of
deep learning algorithms, a subfield of Artificial Intelligence. They are designed to mimic
the structure of the human brain and are inspired by how the human brain interprets and
processes information.

4. When it comes to large datasets, neural networks perform much better than traditional
machine learning algorithms. Unlike traditional machine-learning algorithms that reach a
saturation point and stop improving, large neural networks show better performance
with large amounts of data.

5. The output layer receives the data from the last hidden layer and gives it as the final
output to the user. Similar to the input layer, the output layer does not process the data.
It serves as a user-interface, presenting the final outcome of the network's computations
to the user.

6. This stage involves the exploration and analysis of the collected data to interpret
patterns, trends, and relationships. The data is in large quantities. In order to easily
understand the patterns, you can use different visual representations such as graphs,
databases, flowcharts, and maps.

7. The two main types of data used in AI projects are:


Training data: Training data is the initial dataset used to train an AI module. It is a set of
examples that helps the AI model learn and identify patterns or perform particular tasks.
Testing data: Testing data is used to evaluate the performance of the AI module.
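A minimal sketch, assuming scikit-learn is installed, of how a dataset is commonly split into training and testing data; the sample data and the 80/20 ratio are illustrative choices, not taken from the textbook.

from sklearn.model_selection import train_test_split

# Illustrative feature matrix X and labels y (10 samples, 2 features each)
X = [[i, i * 2] for i in range(10)]
y = [0, 1] * 5

# Keep 80% of the samples for training and 20% for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
print(len(X_train), len(X_test))   # 8 2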

8. Hidden layers are the layers where all the processing occurs. Each node in the hidden
layers has its own machine learning algorithm which processes the data received from
the input layer. The last hidden layer passes the final processed data to the output layer.
B. Long answer type questions.
1. Following are the differences between the rule-based approach and the learning-based
approach:

Rule-based approach | Learning-based approach
The machine follows the rules defined by the developer. | The machine learns on its own from the data.
AI is achieved through the rule-based technique. | AI is achieved through the learning technique.
It typically uses labelled data. | It can handle both labelled and unlabelled data.
It may require less training time. | It requires more training time.

2. Following are some advantages of neural networks:


 ANNs are fast, efficient and have powerful parallel processing capacities enabling them
to handle vast amounts of data and carry out multiple tasks simultaneously.
 ANNs have the ability to learn and improve from experience and thus enhance their
performance over time.
 ANNs can learn and generalise from training data, so there is no need for extensive
programming to instruct the network on how to solve a specific problem.
 ANNs are highly fault-tolerant. The data is stored and distributed across the entire
neural network, such that even if some nodes of the network are unavailable or get
corrupted, the entire network keeps functioning.
 ANNs can effectively learn from unorganised, incomplete, complex and non-linear
data and produce accurate output. This makes them suitable for real-world
applications where data may be incomplete.
 ANNs are flexible and can quickly change, adapt and adjust to new environments and
situations.
3. Data visualisation methods are used in the data exploration process of the AI project
cycle to analyse data to discover patterns and gain insights. Different data visualisation
methods are used, like bar chart, line chart, histograms, etc.
Bullet graph: A bullet graph, or a bullet chart, is a variation of a bar chart, consisting of a
primary bar layered on top of a secondary stack of less-prominent bars.
Histogram: A histogram is the graphical representation of data where data is grouped
into continuous number ranges and each range corresponds to a vertical bar. A histogram
divides up the range of possible values in a data set into groups. For each group, a
rectangle (bar) is constructed with a base length equal to the range of values in that
specific group and a length equal to the number of observations (frequency) falling into
that group.
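A minimal sketch, assuming Matplotlib is installed, of plotting a histogram; the marks data below is invented purely for illustration.

import matplotlib.pyplot as plt

# Illustrative data: marks scored by 15 students (not from the textbook)
marks = [35, 42, 47, 51, 55, 58, 60, 62, 65, 68, 71, 75, 78, 84, 91]

# Group the values into 6 continuous ranges (bins); each bar's height is the
# number of observations (frequency) falling into that range
plt.hist(marks, bins=6, edgecolor="black")
plt.xlabel("Marks range")
plt.ylabel("Number of students")
plt.title("Histogram of student marks")
plt.show()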

4. Two sources of data are:


Surveys: A survey is a method of gathering specific information from a group of people by
asking them questions. They enable us to collect valuable data quickly and efficiently.
Surveys can be conducted on paper, through face-to-face or telephone interviews, or
through online forms. For example, census surveys are conducted periodically for
population analysis.
Web scraping: Web scraping is the process of collecting information from websites in an
automated manner. You can copy and paste the data from the website into a document
and use it as a data source. Web scraping can be done manually, but if a large number of
websites need to be accessed, some automated web scraping tools,
like BeautifulSoup or Scrapy, can be used to help speed up the process.
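A minimal, hedged sketch of automated web scraping using the requests and BeautifulSoup libraries mentioned above; the URL and the tag being extracted are placeholders, not real data sources.

import requests
from bs4 import BeautifulSoup

# Fetch the page (example.com is a placeholder URL used only for illustration)
response = requests.get("https://example.com")

# Parse the HTML and pull out the text of every top-level heading on the page
soup = BeautifulSoup(response.text, "html.parser")
for heading in soup.find_all("h1"):
    print(heading.get_text(strip=True))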

5. a. Regression: Regression is a learning-based AI model used to predict continuous


numerical values. It is used to predict the behaviour of one variable depending upon the
value of another variable.
A special function called mapping function helps us
predict future values based on the data we have.
For instance, we can use an employee’s past salary
data to train the AI model to predict the future
salary. The algorithm learns from this data and
creates a solid line on a graph that represents the
pattern in the salaries. In the given graph, the blue
dots represent the data values we have i.e. the
past salaries, and the solid line represents the
function that helps us predict future values.
b. Dimensionality reduction: Dimensionality reduction makes complex data simpler,
even though it comes at the cost of losing some
information. It is a way to simplify complex data
while still making sense of it. Examples are
document classification and image compression.
Imagine you're holding a 3D ball in your hand. If
you take its picture, the image becomes a 2D flat
picture. Thus, when you reduce a dimension, you
lose some information. In the 2D picture of the
ball, you can't see the back of the ball or if it's the
same colour as the front.
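A minimal sketch of a regression model, assuming scikit-learn is installed; the experience and salary figures are invented to mirror the salary example above and are not textbook data.

from sklearn.linear_model import LinearRegression

# Illustrative past data: years of experience vs annual salary (in rupees)
experience = [[1], [2], [3], [4], [5]]              # independent variable
salary = [300000, 350000, 410000, 460000, 520000]   # dependent (continuous) variable

# Fit a line through the data; this line plays the role of the mapping function
model = LinearRegression()
model.fit(experience, salary)

# Predict the salary for an employee with 6 years of experience
print(model.predict([[6]]))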

6. The Problem Statement Template aids in summarising all the key points of the 4Ws
problem canvas into a single template, which enables us to quickly revisit the ideas as
needed in the future. For the purpose of further analysis and decision-making, this
template makes it simple to understand and remember the important aspects of the
problem.

7. The remaining stages are:


 Data acquisition
 Data exploration
 Modelling
 Evaluation

8. Neural Networks are computational networks that are at the heart of deep learning
algorithms, a subfield of Artificial Intelligence. Similar to how our brains learn from
experiences, neural networks learn from examples to understand new situations. A
neural network is initially trained on large amounts of input data. The network recognises
the patterns in this data, learns from it using machine learning techniques and can then
make predictions on a new dataset. It is a fast and efficient way to solve problems for
which the dataset is very large, such as in images and videos.
Features of neural networks:
 Artificial neural networks are extremely powerful computational algorithms or models.
 Neural Network systems are modeled on the structure and function of the human
brain and nervous system.
 The most powerful feature of neural networks is that once trained, they can
independently process new data, take decisions and make predictions without human
intervention.
Unit 3: Advance Python
A. Short answer type questions.
1. Anaconda distribution is a powerful and widely used open-source distribution of the Python
language for scientific computations, machine learning and data science tasks. It is an
essential tool for data scientists, researchers and developers as it includes essential pre-
installed libraries. It simplifies the process of managing software packages and
dependencies.

2. Once you have launched Jupyter Notebook within your virtual environment, you can
execute commands by creating and running Python code cells within a notebook.
 Create a New Notebook or Open an Existing One.
 Once you have a notebook open, you'll see an empty code cell where you can
enter Python code. Click on the cell to select it, and then type or paste your
Python code into the cell.
 After entering your Python code in a cell, you can execute it by either pressing
"Shift + Enter" or clicking the "Run" button in the toolbar. This will run the code in
the selected cell and display the output directly below the cell.

3. The venv module is a tool that allows users to create virtual environments. These virtual
environments contain their own Python interpreter and package installation directories.
Thus, each project can have its own set of libraries and Python versions to avoid conflicts
between different projects.

4. Membership operators 'in' and 'not in' are used to check if a value exists in a list or
sequence or not.
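A short illustration (added here for clarity, not from the textbook) of both membership operators on a list:

fruits = ["apple", "banana", "mango"]
print("mango" in fruits)       # True  - the value exists in the list
print("grape" not in fruits)   # True  - the value does not exist in the list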

5. In some cases, a condition in a 'for' or 'while' loop does not ever become false, hence
the statements between the loop keep repeating indefinitely. This is called an infinite
loop.
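A short illustration (added for clarity) of an infinite loop, where the condition never becomes false:

# The condition is always True, so the loop body keeps repeating indefinitely
while True:
    print("This will print forever")   # press Ctrl+C to interrupt the program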

6. The else statement works with single conditions whereas the elif statement is used to
test multiple conditions.
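A short illustration (added for clarity) of elif testing an additional condition and else handling the remaining case:

marks = 72
if marks >= 90:
    print("Grade A")
elif marks >= 60:          # elif tests an additional condition
    print("Grade B")       # this branch runs, since 72 >= 60
else:                      # else runs only when every condition above is False
    print("Grade C")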

7. A loop inside a loop is called a nested loop.
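A short illustration (added for clarity) of a nested loop, where the inner loop runs fully for every pass of the outer loop:

for i in range(1, 4):
    for j in range(1, 4):
        print(i * j, end=" ")
    print()   # move to the next line after each row of the table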


8. The syntax for nested 'if-else' is:
if condition1:
    Code block 1
    if condition2:
        Code block 2
    else:
        Code block 3
else:
    Code block 4

B. Long answer type questions.


1. The different ways to import Python libraries are:
 Importing the entire library: This allows you to access all functionalities of the library:
import numpy
 Importing the library with an alias: This imports the library with an alias name:
import numpy as np
 Importing all functions from a library: This imports all functions and objects from the
library:
from numpy import *

2. The uses and applications of any four Python libraries are:


1) NumPy: Stands for Numerical Python. It is a fundamental package for numerical
computations.
2) Pandas: It is a powerful data analysis and manipulation library.
3) Matplotlib: A comprehensive library for creating static, animated and interactive
visualisations in Python.
4) NLTK: Natural Language Toolkit, used for working with and processing natural (human) language.

3. The components of Jupyter Notebook are:


Menu: Located at the top of the page. It includes options to create new notebooks, open
existing notebooks and save your work.
Code cells: Used to type and execute programs.
Markdown cells: Used for adding comments, headings and formatted text.
Output area: It displays the output of the code. This could be in the form of text, plots,
graphs etc.

4. The division operator (/) performs division and always returns a floating-point result,
even if both operands are integers.
Example:
result = 7 / 2
print(result) # Output: 3.5
The floor division operator (//) performs division and returns the quotient of the division,
rounded down to the nearest integer. It returns only whole numbers.
Example:
result = 7 // 2
print(result) # Output: 3

5. The input( ) function is used to take input from the user. It accepts input from the console.
Example:
name = input("Enter your name: ")
age = int(input("Enter your age: "))
The print( ) function prints a message or a value. It converts a value into a string before
displaying it.
Example:
print("Hello, ", name) # name is a variable in which a string has been accepted

6. The 'for' loop is used when you are sure about the number of times a loop body will be
executed. It is also known as a definite loop. Whereas, the 'while' loop in Python
executes a set of statements based on a condition. If the test expression evaluates to
true, then the body of the loop gets executed. Otherwise, the loop stops iterating and the
control comes out of the body of the loop.
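A short illustration (added for clarity) of both loops printing the numbers 1 to 5:

# 'for' loop: the number of repetitions (5) is known in advance
for i in range(1, 6):
    print(i, end=" ")
print()

# 'while' loop: repeats only while the condition stays True
count = 1
while count <= 5:
    print(count, end=" ")
    count += 1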

7. Nested if statements in Python refer to if statements that are placed inside other if
statements. The inner if statement gets executed only if the condition of the outer if
statement is true. An example to check whether a non-negative number is even or odd:
num = int(input("Enter a number: "))
# Check if the number is non-negative
if num >= 0:
    # Check if the number is even
    if num % 2 == 0:
        print("The number is even.")
    else:
        print("The number is odd.")

8. The process of converting one data type to another is called typecasting.


The types of typecasting are:
Implicit typecasting: The typecasting where one data type is automatically converted to
another is called implicit typecasting. This makes programming simpler.
Example:
num1=10
num2= 20.5
sum = num1+num2
Here, num1 is automatically converted to a float during the addition, so sum becomes a
float, which is the higher data type.
Explicit typecasting: In this type, the data type conversion is done manually by using int(),
float() and str() functions.
Example:
age = int(input("Enter your age: "))
Here the input by the user is explicitly converted into an integer value and assigned to
the variable age.

9. The program to display numbers divisible by 7 and multiples of 5 between 1200 and 2200
is:
start = 1200
end = 2200
# Iterate through the range and display numbers meeting the criteria
print("Numbers divisible by 7 and multiples of 5 between 1200 and 2200:")
for num in range(start, end + 1):
    if num % 7 == 0 and num % 5 == 0:
        print(num)

10. The program to enter the monthly income of an employee between 40 and 60 years and
calculate the annual income tax is:

monthly_income = float(input("Enter the monthly income of the employee: "))


# Calculate annual income
annual_income = monthly_income * 12
# Calculate annual income tax based on income range
if annual_income <= 300000:
    tax_amount = 0
elif annual_income <= 500000:
    tax_percentage = 5
    tax_amount = (annual_income * tax_percentage) / 100
elif annual_income <= 1000000:
    tax_percentage = 20
    tax_amount = (annual_income * tax_percentage) / 100
else:
    tax_percentage = 30
    tax_amount = (annual_income * tax_percentage) / 100
# Display the annual income tax
print("Annual Income Tax: ₹", tax_amount)

C. Predict the output of the following code snippets:

1. 200

2. Numbers 0 to 99

3. Numbers 1 to 6

4. False

True

5. 11.0

6. 36

7. Pooja you are 15 now, and you will be 16 next year.

8. 2

9. [1, 2, 3, 5, 7]

10. 3
Unit 4: Data Science
A. Short answer type questions.
1. NumPy stands for Numerical Python and is the fundamental package for mathematical
and logical operations on arrays in Python. NumPy is a commonly used package that
offers a wide range of arithmetic operations that make it easy to work with numbers as
well as arrays.
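A minimal sketch, assuming NumPy is installed, of the kind of array arithmetic NumPy makes easy; the numbers are illustrative.

import numpy as np

# Create two arrays and perform element-wise arithmetic on them
a = np.array([1, 2, 3, 4])
b = np.array([10, 20, 30, 40])

print(a + b)       # [11 22 33 44]
print(a * b)       # [10 40 90 160]
print(b.mean())    # 25.0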

2. Data Science applications study the link between DNA and our health and find the
biological connection between genetics, diseases, and response to drugs or medicines.
This enables doctors to offer personalised treatment to people based on the research of
genetics and genomics.

3. Some online sources of data are:


 Open-sourced Government Portals
 Reliable Websites
 World organisations’ open-sourced statistical websites

4. Two features of the k-NN algorithm are:


 Classifies new information based on the closest surrounding points or neighbours to
determine its class or group. This means when new data appears it can be easily
classified into a suitable category by using the k-NN algorithm.
 Utilises the properties of the nearest neighbours to decide how to classify unknown
points.
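A minimal sketch of k-NN classification, assuming scikit-learn is installed; the sample points and the choice of k = 3 are illustrative only.

from sklearn.neighbors import KNeighborsClassifier

# Illustrative training data: points belonging to two groups, 0 and 1
X_train = [[1, 1], [1, 2], [2, 1], [6, 6], [7, 6], [6, 7]]
y_train = [0, 0, 0, 1, 1, 1]

# k = 3: a new point is assigned the class of its 3 closest neighbours
model = KNeighborsClassifier(n_neighbors=3)
model.fit(X_train, y_train)

print(model.predict([[2, 2]]))   # close to the first group  -> [0]
print(model.predict([[6, 5]]))   # close to the second group -> [1]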

5. Histograms are used to accurately represent continuous data. They are particularly suited
for showing how the values in a dataset are distributed across continuous ranges.

6. There was a time when finance companies were facing large amounts of bad debts. Using
data science, the companies analysed the customer profile, past expenditures, and other
essential variables and then analysed the possibilities of risk and default to decide whom
to give loans and how much. Based on this, they were able to reduce losses.

B. Long answer type questions.


1. Pandas is well suited for different kinds of data, like:
 Tabular data with heterogeneously-typed columns, as in an SQL table or Excel
spreadsheet. This is structured data that may consist of data of different data types,
arranged in the form of rows and columns in a table.
 Ordered (data in a sequence) and unordered (data not in a sequence) time series data.
Time series data involves recording observations at multiple time points.
 Arbitrary matrix data (homogeneously typed or heterogeneous) with row and column
labels. This means data arranged in a matrix-like format that can be of the same data
type or different data types across the matrix.
 Any other form of observational or statistical data sets. Pandas can handle data
arranged in different formats and structures.
 The data need not be labelled at all to be placed into a Pandas data structure. Pandas
is flexible and can work with data that may not have labels.
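A minimal sketch, assuming Pandas is installed, of tabular data with heterogeneously-typed columns held in a DataFrame; the names and marks are invented.

import pandas as pd

# A small tabular dataset with columns of different data types
data = {
    "Name": ["Asha", "Ravi", "Meena"],
    "Age": [14, 15, 14],
    "Marks": [88.5, 76.0, 91.25],
}
df = pd.DataFrame(data)

print(df)                  # the data arranged as rows and columns
print(df["Marks"].mean())  # average of the Marks column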

2. Data science can help identify the areas of improvement in order to keep airline
companies profitable. Some of the insights provided by data science are:
 Predict flight delays
 Analyse which flight routes are in demand
 Decide which class of airplanes to buy
 Plan the route – decide if it will be more cost effective to directly land at the
destination or take a halt in between
 Help design strategies to encourage and manage customer loyalty

3. a. CSV: CSV is a simple file format used to store tabular data. Each line of this file is a data
record and each record consists of one or more fields which are separated by commas.
Hence, the name is CSV, i.e., Comma Separated Values.
b. SQL: SQL or Structured Query Language is a specialised programming language used
for designing, programming and managing data within Database Management Systems
(DBMS). It is especially useful in handling structured data.

4. Using data science, finance companies analyse the customer profile, past expenditures,
and other essential variables and then assess the possibilities of risk and default to
decide whom to give loans to and how much. Based on this, they are able to reduce
losses, and it also helps them promote their banking products based on customers'
purchasing power. Real-time data analysis also helps detect any fraudulent online
transactions or illegal activity and enables fraud detection and prevention.

5. a. Website Recommendations: Popular websites and streaming platforms such as


Amazon, X, Google Play, Netflix, LinkedIn, IMDB use data science to recommend
products, movies and shows as per the previous buying patterns of the user and the user's
past searches. This provides for a better user experience and also helps businesses
increase their profits.
b. Search Engines: Search engines like Google, Bing, Ask and AOL use data science
algorithms extensively to provide users with the best search results based on the search
key in less than a second. Data science helps search engines read and analyse the
keywords you are searching for, searches through the content that is available on the
internet and determine which entries have relevant keywords. Data science also enables
Google to use your search history to predict your next Google search, improve search
results and show ads based on your interests on the web pages you reach from your
Google search. Google processes more than 20 petabytes of data everyday – this would
not have been possible without data science.

6. Four disadvantages of the k-nearest neighbour algorithm are:


 Can be computationally expensive for large datasets. This is because it needs to
calculate the distance between each test data point and every training data point,
which can be time-consuming.
 Requires high memory storage
 Can be sensitive to outliers in the data, which affect its performance. Since outliers are
different from the rest of the data, they can have a disproportionate impact on the
algorithm's classification results.
 Requires a good choice of the K parameter that determines the number of nearest
neighbours used for classification. If K is too small or too large the algorithm may not
work correctly.

Unit 5: Computer Vision


A. Short answer type questions.
1. The number of pixels in an image is called its resolution. It determines the level of detail
or clarity of images displayed on screens or captured by cameras.

2. A Convolutional Neural Network is a deep Learning algorithm that is commonly used in


image recognition and processing. CNN analyses an image, extracts the best features and
reduces its size to make it manageable, while still preserving its important features. This
helps it to differentiate one image from the other.

3. An image convolution is simply an element-wise multiplication of the image array with


another array called the kernel, followed by a sum. This results in forming a convolution
matrix or filtered image.

4. The features like corners are easy to find as their exact location can be pinpointed in the
image. Thus, corners are always good features to extract from an image followed by the
edges.

5. The word "pixel" stands for "picture element". Every digital photograph is made up of
tiny elements called pixels. A pixel is the smallest unit of information that makes up a
text, image or video on a computer. Even a small image can contain millions of pixels of
different colours. Pixels are usually arranged in a 2-dimensional grid and are typically
round or square in shape.
6. The objective of computer vision is to replicate both the way humans see and the way
humans make sense of what they see.

7. Computer vision models are trained on massive amounts of visual data. Once a large
amount of data is fed through the model, the computer will "look" at the data and teach
itself to differentiate one image from another using deep learning algorithms.

8. In image processing, the image can have features like a blob, an edge or a corner. These
features help us to perform certain tasks and analysis. Feature extraction refers to the
process of automatically extracting relevant and meaningful features from raw input
images. The features like corners are easy to find as their exact location can be
pinpointed in the image, whereas the patches that are spread over a line or an edge look
the same all along.

B. Long answer type questions.


1. Every RGB image is stored on a computer in the form of three different channels - the
Red channel, the Green channel and the Blue channel. Each channel contains a number of
pixels, with the value of each pixel ranging from 0 to 255. When all channels are
combined together, they form a coloured image. This means that in an RGB image, each
pixel has a set of three different values which together give colour to that particular pixel.
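A minimal sketch, assuming NumPy is installed, showing how an RGB image is just a grid of pixels with three channel values each; the 2 x 2 image below is invented for illustration.

import numpy as np

# An illustrative 2 x 2 RGB image: every pixel holds three values (R, G, B),
# each in the range 0-255
image = np.array([
    [[255, 0, 0], [0, 255, 0]],
    [[0, 0, 255], [255, 255, 255]],
], dtype=np.uint8)

print(image.shape)          # (2, 2, 3) -> height, width, 3 channels
red_channel = image[:, :, 0]
print(red_channel)          # the Red channel as a 2 x 2 grid of pixel values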

2. The Computer Vision domain of artificial intelligence enables machines to interpret visual
data, process it and analyse it using algorithms and methods to interpret real-world
phenomena. It helps machines derive meaningful information from digital images, videos
and other visual inputs and take actions based on that information.
Applications of Computer Vision are:
Face filters: This is one of the popular applications used in apps like Instagram and
Snapchat. A face filter is a filter applied to photographs, or videos in real time, to make
the face look more attractive. You can also use it to combine a face with animal features
to give it a funny appearance.
Facial recognition: With smart homes becoming more popular, computer vision is being
used for making homes more secure. Computer Vision facial recognition is used to verify
the identity of the visitors and guests and to maintain a log of the visitors. This
technology is also used in social networking applications for detecting faces and tagging
friends.

3. Each pixel in a digital image on a computer has a pixel value which determines its
brightness or colour. The most common pixel format is the byte image, where this value
is stored as an 8-bit integer having a range of possible values from 0 to 255. Typically,
zero is considered as no colour or black and 255 is considered to be full colour or white.
4. The CV tasks for a single object in an image are:
Image Classification: This task involves assigning a label to the entire image based on its
content.
Image Classification plus Localisation: This is the task which involves both processes of
identifying what object is present in the image and at the same time identifying at what
location that object is present in that image.

5. Humans see an image with the help of their eyes, and then the brain processes and
identifies the image through learning and experience. In computer vision, AI first
perceives the image with a sensing device, and then computer vision and other AI
algorithms identify and classify the elements in the image to recognise it.

6. The face-lock feature on smartphones uses computer vision to analyse and identify facial
features. When a user activates the face-lock feature, the smartphone's CV system
compares the facial features with pre-registered photographs stored on the device. If the
facial characteristics match, the device grants access to the user. This authentication
method offers convenience and security for the user.

7. Image classification involves identifying the main object category in a photo, while image
classification with localisation determines both the object's category and its precise
location within the image, often by drawing a bounding box around it.
For example, in an image showing a cat, the image classification algorithm will identify
and label the image as a cat. Whereas, the image classification with localisation algorithm
will not only identify the cat, but will also draw a box to indicate the location of the cat in
the image.

8. The convolution operator is a fundamental mathematical operation used in image


processing. It involves multiplying two arrays of numbers, element-wise, to produce a
third array. In image processing, convolution is a common tool used for image editing to
apply filters or effects, such as blurring, sharpening, outlining or embossing on an image.
An image convolution is simply an element-wise multiplication of the image array with
another array called the kernel, followed by a sum. This results in forming a convolution
matrix or filtered image.
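A minimal sketch, assuming NumPy is installed, of one convolution step: an element-wise multiplication of an image patch with a kernel, followed by a sum. The patch values are invented; the kernel shown is a commonly used sharpening kernel.

import numpy as np

# A small 3 x 3 patch of a grayscale image and a 3 x 3 kernel (both illustrative)
patch = np.array([[10, 20, 30],
                  [40, 50, 60],
                  [70, 80, 90]])
kernel = np.array([[0, -1, 0],
                   [-1, 5, -1],
                   [0, -1, 0]])   # a common sharpening kernel

# Element-wise multiplication followed by a sum gives one value of the
# convolution (filtered) matrix
value = np.sum(patch * kernel)
print(value)   # 50*5 - (20 + 40 + 60 + 80) = 50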

9. There are two types of pooling which can be performed on an image. They are:
a. Max Pooling: This returns the maximum value from the portion of the image covered
by the Kernel.
b. Average Pooling: This returns the average value from the portion of the image covered
by the Kernel.
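A minimal sketch, assuming NumPy is installed, of 2 x 2 max pooling and average pooling applied to an invented 4 x 4 feature map:

import numpy as np

# An illustrative 4 x 4 feature map
feature_map = np.array([[1, 3, 2, 4],
                        [5, 6, 1, 2],
                        [7, 2, 8, 3],
                        [1, 4, 9, 6]])

# 2 x 2 max pooling and average pooling over each non-overlapping block
for name, reduce_fn in [("Max", np.max), ("Average", np.mean)]:
    pooled = np.array([[reduce_fn(feature_map[i:i+2, j:j+2])
                        for j in range(0, 4, 2)]
                       for i in range(0, 4, 2)])
    print(name, "pooling:")
    print(pooled)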
10. Visual search algorithms in search engines use computer vision technology to help you
search for different objects using real world images. CV compares different features of
the input image to its database of images, analyses the image features and gives us the
search result. Computer vision, combined with machine learning allows the device not
only to see the image, but also to interpret what is in the picture, helping make decisions
based on it.

11. Role of computer vision in the following fields:


a. Healthcare: Medical imaging has greatly benefited from computer vision. It not only
creates and analyses images, but also acts as an assistant and helps doctors to better
understand a patient's health condition. CV analyses X-rays, CT scans and MRI scans, and is
used to read and convert 2D scanned images into interactive 3D models. This results in an
increase in the accuracy and efficiency of diagnosis since the machines can identify
details invisible to the human eye.
b. Warehouses: CV can be used in warehouses to remove human error during the
receiving and storing process of products by automating the scanning and data entry
process for inventory management. Robots equipped with CV accurately pick parcels and
pack them. CV also automatically checks the order against the contents.

12. Autonomous driving involves identifying objects, getting navigational routes and
monitoring the surroundings. Automated cars from companies like Tesla can detect the
360-degree movements of pedestrians, vehicles, road signs and traffic lights and create
3D maps. CV helps them detect and analyse objects in real-time and take decisions like
braking, stopping or continuing to drive.

Unit 6: Natural Language Processing


A. Short answer type questions.
1. Natural Language Processing is a field of artificial intelligence that enables computers to
understand and interpret human (natural) language. NLP takes a verbal or written input,
processes it and analyses it, based on which appropriate action can be taken.

2. Companies use Natural Language Processing applications, such as sentiment analysis, to


identify the emotions in the text and to categorise opinion about their products and
services as 'good', 'bad' or 'neutral'. This process can be used to identify emotions in text
even when it is not clearly expressed and enables companies to understand what
customers think about their brand and image. It helps companies understand not only
what people like or dislike but also what affects a customer's choice when deciding what
to buy.
3. Some popular virtual assistants are Google Assistant, Cortana and Siri.

4. Script bots are used for simple functions like answering frequently asked questions,
setting appointments and on messaging apps to give predefined responses.

5. Example:
"The bat is hanging upside down on the tree."
"Anju bought a new bat for the cricket match finale."
In the first sentence, "bat" refers to a mammal hanging upside down. In the second, it is
cricket equipment used for hitting balls.

6. Stem: studi
Lemma: study

7. The name "bag" symbolises that the algorithm is not concerned with where the words
occur in the corpus, i.e., the sequence of tokens, but aims at getting unique words from
the corpus and the frequency of their occurrence.

8. The steps involved in the BoW algorithm are:


Step 1: Text Normalisation - Collect data and pre-process it
Step 2: Create Dictionary - Make a list of all the unique words occurring in the corpus
(Vocabulary).
Step 3: Create document vectors for each document - Find out how many times the
unique words from the document have occurred.
Step 4: Create document vectors for all the documents.
B. Long answer type questions.
1. Our brain keeps processing the sounds that it hears and tries to make sense out of them.
Sound travels through air, enters the ear and reaches the eardrum through the ear canal.
The sound striking the eardrum is converted into a nerve impulse and gets transported
to the brain. This signal is then processed by the brain to derive its meaning and helps us
give the required response.

2. Sometimes, a sentence can have a correct syntax but it does not mean anything.
For example, "Purple elephants dance gracefully on my ceiling."
This statement is correct grammatically but does not make any sense.

3. Text normalisation is a process that reduces the randomness and complexity of text by
converting the text data into a standard form. The text is normalised to a lower or
simplified level hence improving the efficiency of the model.

4. a. Sentence Segmentation: In sentence segmentation, the entire corpus is divided into


sentences. Based on punctuation marks the entire corpus is split into sentences.
b. Tokenisation: After segmenting the sentences, each sentence is further divided into
tokens. Tokenisation is the process of separating a piece of text into smaller units called
tokens. Token is a term used for any word or number or special character occurring in a
sentence. Under tokenisation, every word, number and special character is considered as
a separate unit or token.

5. Differences between stemming and lemmatisation are:


 In Stemming, the words left in the corpus are reduced to their root words. Stemming
is the process in which the affixes of words are removed and the words are converted
to their base form or "stem". Stemming does not take into account if the stemmed
word is meaningful or not. It just removes the affixes, hence it is faster. For example,
the words – 'programmer, programming and programs' are reduced to 'program'
which is meaningful, but 'universal' and 'beautiful' are reduced to 'univers' and
'beauti' respectively after removal of the affix and are not meaningful.
 Lemmatisation too has a similar function, removal of affixes. But the difference is that
in lemmatisation, the word we get after affix removal, known as the lemma, is a
meaningful one. Lemmatisation understands the context in which the word is used
and makes sure that the lemma is a word with meaning. Hence, it takes a longer time to
execute than stemming. For example: 'universal' and 'beautiful' are reduced to
'universe' and 'beauty' respectively after removal of the affix and are meaningful.
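A minimal sketch using NLTK (assumed installed, with the WordNet data downloaded); the exact outputs may vary slightly between NLTK versions:

from nltk.stem import PorterStemmer, WordNetLemmatizer
# nltk.download('wordnet') may be needed once before the lemmatizer can run

stemmer = PorterStemmer()
lemmatizer = WordNetLemmatizer()

# Stemming simply chops off affixes; the result may not be a real word
print(stemmer.stem("studies"))        # studi
print(stemmer.stem("programming"))    # program

# Lemmatisation returns a meaningful base word (the lemma)
print(lemmatizer.lemmatize("studies"))   # study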

6. Let us understand the steps involved in implementing a BoW by taking an example of


three documents with one sentence each.
Document 1: Hema is learning about AI
Document 2: Hema asked the smart robot KiBo about AI
Document 3: KiBo explained the basic concepts
Step 1: Text Normalisation - Collecting data and pre-processing it
Document 1: [hema, is, learning, about, ai]
Document 2: [hema, asked, the, smart, robot, kibo, about, ai]
Document 3: [kibo, explained, the, basic, concepts]
No tokens have been removed in the stopwords removal step because we have very little
data and since the frequency of all the words is almost the same, no word can be said to
have lesser value than the other.
Step 2: Create Dictionary - Make a list of all the unique words occurring in the corpus
(Vocabulary).
Listing the unique words from all three documents:

hema, is, learning, about, ai, asked, the, smart, robot, kibo, explained, basic, concepts
Step 3: Create document vector
In this step, a table with frequency of unique words in each document is created. The
vocabulary, i.e., unique words are written in the top row of the table. For each document,
in case the word exists, the number of times the word occurs is written in the rows
below. If the word does not occur in that document, a 0 is put under it.
For example, for the first document:

hema | is | learning | about | ai | asked | the | smart | robot | kibo | explained | basic | concepts
1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0

Step 4: Create document vectors for all documents


In this table, the header row contains the vocabulary of the corpus and the three rows
below it correspond to the three different documents.

hema | is | learning | about | ai | asked | the | smart | robot | kibo | explained | basic | concepts
1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
1 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0
0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 1 | 1
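As a cross-check, a minimal sketch using scikit-learn's CountVectorizer (assumed installed, version 1.0 or newer; older versions use get_feature_names) to build the same document vectors; note that the vectorizer lists its vocabulary in alphabetical order, so the columns will not appear in the same order as in the table above.

from sklearn.feature_extraction.text import CountVectorizer

documents = [
    "Hema is learning about AI",
    "Hema asked the smart robot KiBo about AI",
    "KiBo explained the basic concepts",
]

# CountVectorizer lowercases the text, builds the vocabulary and counts each word
vectorizer = CountVectorizer()
vectors = vectorizer.fit_transform(documents)

print(vectorizer.get_feature_names_out())  # the vocabulary (in alphabetical order)
print(vectors.toarray())                   # one document vector per row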

7. In text processing we pay special attention to the frequency of words occurring in the
text, since it gives us valuable insights into the content of the document. Based on the
frequency of words that occur in the graph, we can see three categories of words. The
words that have the highest occurrence across all the documents of the corpus are
considered to have negligible value. These words, termed as stop words, do not add
much meaning to the text and are usually removed at the pre-processing stage. The
words that have moderate occurrence in the corpus are called frequent words. These
words are valuable since they relate to the subject or topic of the documents and occur in
sufficient numbers throughout the documents. The less common words are termed
rare words. These words appear the least frequently but contribute greatly to the
corpus’ meaning. When processing text, we only take frequent and rare words into
consideration.

8. TFIDF expands to "Term Frequency & Inverse Document Frequency".


Applications of TFIDF:
 Document Classification: TFIDF can help in classifying the type and genre of a
document. This can be used to group similar documents and organise large document
collections.
 Topic Modelling: It helps in predicting the topic for a corpus.
9. Document 1: Neha and Soniya are classmates.
Document 2: Neha likes dancing but Soniya loves to study mathematics.

Step 1: Text Normalization


Document 1: [neha, and, soniya, are, classmates]
Document 2: [neha, likes, dancing, but, soniya, loves, to, study, mathematics]

Step 2: Create Dictionary (Vocabulary) Unique words from all documents:


[neha, and, soniya, are, classmates, likes, dancing, but, loves, to, study, mathematics]

Step 3: Create Document Vector for Document 1


neha | and | soniya | are | classmates | likes | dancing | but | loves | to | study | mathematics
1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0

Step 4: Create document vectors for all documents (1 and 2)


neha | and | soniya | are | classmates | likes | dancing | but | loves | to | study | mathematics
1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0
1 | 0 | 1 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1

Unit 7: Evaluation
A. Short answer type questions.
1. Recall considers True Positive and False Negative cases.

2. Precision calculates the percentage of true positive cases out of all the cases where the
model's prediction is positive (i.e., True Positives and False Positives).

3. The formula for Accuracy is:
Accuracy = [(TP + TN) / (TP + TN + FP + FN)] * 100

4. F1 score can be defined as the measure of balance between Precision and Recall. F1 score
combines both Precision and Recall into a single number to give a better overall picture
of how well the model is performing.

5. A True Positive in model evaluation is a case where the prediction matches reality and the
predicted value is Positive. A True Negative is a case where the prediction matches
reality and the predicted value is Negative.
6. Recall is defined as the fraction of positive cases that are correctly identified.

7. Complete the following table:

B. Long answer type questions.


1. Situation where the cost of a False Negative is high: In the case of autonomous cars, if a
model gives a False Negative while detecting pedestrians or other vehicles, it could result
in accidents and loss of life.
Situation where the cost of a False Positive is high: An airport security screening model
that predicts a security threat when there isn't any can lead to flight delays,
inconvenience and missed flights for travellers. The cost to the airline is also high.

2. A confusion matrix is a summarised table used to analyse and assess the performance of
an AI model. The matrix compares the actual target values with those predicted by the
model. This allows us to visualise how well our classification model is performing and
what kinds of errors it is making.

3. Evaluating model behaviour means checking how well the model "fits" the data. A good
fit means the model has identified the patterns and relationships in the training data
correctly and can make accurate predictions when it is tested with new, unseen data,
while a poor fit means it cannot make reliable predictions.
When a model’s output does not match the true function at all, the model is said to be
underfitting and its accuracy is lower.
When a model’s performance matches well with the true function, i.e., the model has
optimum accuracy, the model is called a perfect fit.
When a model tries to cover all the data samples, even those that are out of alignment
with the true function, the model is said to be overfitting and has a lower
accuracy.

4. Automated trade industry has developed an AI model which predicts the selling and
purchasing of automobiles. During testing, the AI model came up with the following
predictions:
The Confusion Matrix | Reality: 1 | Reality: 0
Predicted: 1 | 55 (TP) | 12 (FP)
Predicted: 0 | 10 (FN) | 20 (TN)

a. How many total tests have been performed in the above scenario?
Ans: Total tests performed: (55+10+12+20) = 97

b. Accuracy, Precision, Recall and F1 Score for the above predictions are:
 Accuracy: [(TP + TN) / Total tests] * 100 = [75 / 97] * 100 = 77.32%
 Precision: [TP / (TP + FP)] * 100 = [55 / 67] * 100 = 82.09%
 Recall: [TP / (TP + FN)] * 100 = [55 / 65] * 100 = 84.62%
 F1 Score: 2 * (Precision * Recall) / (Precision + Recall) = 2 * (0.8209 * 0.8462) /
(0.8209 + 0.8462) ≈ 0.83 (i.e., about 83.33%)
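A minimal Python sketch that reproduces these calculations from the confusion matrix values above:

# Values taken from the confusion matrix above
TP, FP, FN, TN = 55, 12, 10, 20
total = TP + FP + FN + TN

accuracy = (TP + TN) / total
precision = TP / (TP + FP)
recall = TP / (TP + FN)
f1_score = 2 * precision * recall / (precision + recall)

print(f"Accuracy : {accuracy * 100:.2f}%")    # 77.32%
print(f"Precision: {precision * 100:.2f}%")   # 82.09%
print(f"Recall   : {recall * 100:.2f}%")      # 84.62%
print(f"F1 Score : {f1_score:.2f}")           # 0.83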

5. In order to assess if the performance of a model is good, we need two measures: Recall
and Precision. In some cases, you may have a high Precision but low Recall and in others,
low Precision but high Recall. But since both the measures are important, there is a need
for a metric which takes both Precision and Recall into account. The metric that takes into
account both these parameters is F1 Score. F1 score can be defined as the measure of
balance between Precision and Recall. F1 score combines both Precision and Recall into a
single number to give a better overall picture of how well the model is performing.

6. Recently the country was shaken up by a series of earthquakes which caused huge
damage to the people as well as the infrastructure. To address this issue, an AI model has
been created which can predict if there is a chance of an earthquake or not. The confusion
matrix for the same is:
The Confusion Matrix | Reality: 1 | Reality: 0
Predicted: 1 | 50 (TP) | 5 (FP)
Predicted: 0 | 25 (FN) | 20 (TN)

a. How many total cases are True Negative in the above scenario?
Ans: 20
b. Precision, recall and F1 score of the above predictions are:

Precision: [TP / (TP + FP)] * 100


= [50 / (50 + 5)] * 100
= [50 / 55] * 100
≈ 90.91%

Recall (Sensitivity): [TP / (TP + FN)] * 100


= [50 / (50 + 25)] * 100
= [50 / 75] * 100
≈ 66.67%

F1 Score: 2 * (Precision * Recall) / (Precision + Recall)

= [2 * (0.9091 * 0.6667)] / (0.9091 + 0.6667)
≈ 0.77 (i.e., about 76.92%)
