Natural Language Processing Notes Class 10 AI

Natural Language Processing (NLP) is a sub-field of AI that enables computers to understand and process human language, both spoken and written. Key applications include chatbots, sentiment analysis, and virtual assistants, which utilize various techniques such as tokenization, stemming, and lemmatization to analyze text. The document also outlines the project cycle for AI projects and the differences between human and computer languages, highlighting the complexities of human communication.


AI Simplified by Aiforkids.in

NLP Class 10 AI Notes

[Cover mind map: NLP is the ability of a computer to understand text and spoken words, i.e., a process to simplify human language and make it understandable. Branches cover chatbots (e.g., Mitsuku Bot, CleverBot, Jabberwacky, Haptik; script bots vs smart bots), applications of NLP (automatic summarization, sentiment analysis, text classification, virtual assistants), problems in understanding human languages by computers (human language vs computer language; arrangement of words and meanings: syntax and semantics; multiple meanings of a word; perfect syntax, no meaning), and the NLP data-processing pipeline (text normalisation: sentence segmentation, tokenisation, removal of stopwords, converting into the same case, stemming and lemmatization; the bag of words algorithm; TFIDF: term frequency and inverse document frequency; applications of TFIDF).]

Resources: Youtube.com/aiforkids, Aiforkids.in/class-10/nlp

"Learning is not a course, it's a path from passion to …" ~Lalit Kumar

WHAT IS NLP?

Natural Language Processing (NLP) is the sub-field of AI that focuses on the ability of a computer to understand human language (a command) as spoken or written, and to give an output by processing it.

APPLICATIONS OF NLP

Automatic Summarization

Summarizing the meaning of documents and information.
Extracting the key emotional information from the text to understand the reactions (e.g., on social media).

Sentiment Analysis

Identifying sentiments and emotions from one or more posts.
Companies use it to identify opinions and sentiments so as to get feedback. The sentiment can be Positive, Negative or Neutral.

Text Classification

Assigning predefined categories to a document and organizing it to help you find the information you need or to simplify some activities.
Eg: Spam filtering in email.

Virtual Assistants

By accessing our data, they can help us keep notes of our tasks, make calls for us, send messages, and a lot more.
With speech recognition, these assistants can not only detect our speech but also make sense of it.
A lot more advancement is expected in this field in the near future.
Eg: Google Assistant, Cortana, Siri, Alexa, etc.

REVISING AI PROJECT CYCLE

Project Cycle is a step-by-step process to solve problems using proven scientific methods and drawing inferences about them.

1 COMPONENTS OF PROJECT CYCLE

Problem Scoping - Understanding the problem
Data Acquisition - Collecting accurate and reliable data
Data Exploration - Arranging the data uniformly
Modelling - Creating models from the data
Evaluation - Evaluating the project

[Problem Statement Template]
The Stakeholder                          -> Who
Has a(n) Issue/Problem                   -> What
When/While: Context/Situation/Location   -> Where

CHATBOTS

One of the most common applications of Natural Language Processing is a chatbot.

Examples: Mitsuku Bot, CleverBot, Jabberwacky, Haptik, Rose, OChatbot.

Types of ChatBots

SCRIPT BOTS                                  SMART BOTS
Easy to make                                 Comparatively difficult to make
Work on the script programmed into them      Work on bigger databases
Limited functionality                        Wide functionality
No or little language processing skills      Language processing and coding are required
Example: Customer care bots                  Example: Google Assistant, Alexa, Cortana, Siri, etc.

HUMAN LANGUAGE VS COMPUTER LANGUAGE

1 HUMAN LANGUAGE

Humans communicate through language, which we process all the time.
Our brain keeps processing the sounds it hears around itself and tries to make sense of them all the time.
Communications made by humans are complex.

2 COMPUTER LANGUAGE

The computer understands the language of numbers.
Everything that is sent to the machine has to be converted to numbers.
If a single mistake is made, the computer throws an error and does not process that part.
The communications made by machines are very basic and simple.
ERRORS IN PROCESSING HUMAN LANGUAGE

1. Arrangement of words and meaning: different syntax, same meaning; same syntax, different meaning.
2. Multiple meanings of a word.
3. Perfect syntax, no meaning.

(Our brain deals with all of this effortlessly: it listens, prioritizes and processes.)

ARRANGEMENT OF THE WORDS AND MEANING

Syntax: Syntax refers to the grammatical structure of a sentence.
Semantics: It refers to the meaning of the sentence.

Different syntax, same semantics: 2+3 = 3+2
Here the way these statements are written is different, but their meanings are the same, that is 5.

Different semantics, same syntax: 3/2 (Python 2.7) ≠ 3/2 (Python 3)
Here we have the same syntax but their meanings are different. In Python 2.7, this statement would result in 1, while in Python 3 it would give an output of 1.5.
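You can see this "same syntax, different semantics" effect for yourself in a single Python 3 session; this is a minimal sketch, using the fact that Python 3's // operator reproduces the old Python 2.7 integer-division behaviour:

```python
# Same written syntax, different semantics across language versions.
# In Python 2.7, 3 / 2 performed integer (floor) division; in Python 3,
# the same expression performs true division.

print(3 / 2)   # Python 3: 1.5 (true division)
print(3 // 2)  # Python 3: 1   (floor division, what 3 / 2 gave in Python 2.7)
```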

1 MULTIPLE MEANINGS OF A WORD

To understand this, let us take the following three sentences as an example:

1. "His face turned red after he found out that he had taken the wrong bag"
Possibilities: He feels ashamed because he took another person's bag instead of his, OR he is feeling angry because he did not manage to steal the bag that he had been targeting.

2. "The red car zoomed past his nose"
Possibilities: Probably talking about the colour of the car, which travelled close to him in a flash.

3. "His face turns red after consuming the medicine"
Possibilities: Is he having an allergic reaction? Or is he not able to bear the taste of that medicine?

2 PERFECT SYNTAX, NO MEANING

1. "Chickens feed extravagantly while the moon drinks tea"
Meaning: This statement is grammatically correct but makes no sense. In human language, a perfect balance of syntax and semantics is important for better understanding.

DATA PROCESSING

Since we know that the language of computers is numerical, the very first step is to convert our language to numbers.
This conversion takes a few steps. The first step is Text Normalisation.

TEXT NORMALISATION

In Text Normalisation, we go through several steps to normalise the text to a lower level. We work on text from multiple documents, and the term used for the whole textual data from all the documents together is "corpus".

1 SENTENCE SEGMENTATION

Under sentence segmentation, the whole corpus is divided into sentences. Each sentence is taken as different data, so the whole corpus gets reduced to sentences.

Example:

Before sentence segmentation:
"You want to see the dreams with close eyes and achieve them? They'll remain dreams, look for AIMs and your eyes have to stay open for a change to be seen."

After sentence segmentation:
1. You want to see the dreams with close eyes and achieve them?
2. They'll remain dreams, look for AIMs and your eyes have to stay open for a change to be seen.
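Sentence segmentation can be sketched with Python's standard re module; this is a naive splitter that breaks after '.', '!' or '?' (real NLP libraries handle abbreviations and trickier cases):

```python
import re

corpus = ("You want to see the dreams with close eyes and achieve them? "
          "They'll remain dreams, look for AIMs and your eyes have to stay "
          "open for a change to be seen.")

# Naive rule: a sentence ends at '.', '!' or '?' followed by whitespace.
sentences = re.split(r"(?<=[.!?])\s+", corpus)
for s in sentences:
    print(s)
# Prints the two sentences of the example above.
```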

2 TOKENISATION

A "token" is a term used for any word, number or special character occurring in a sentence.

Under tokenisation, every word, number and special character is considered separately, and each of them becomes a separate token.

Corpus: A corpus can be defined as a collection of text documents.

Example: "You want to see the dreams with close eyes and achieve them?" becomes:

You | want | to | see | the | dreams | with | close | eyes | and | achieve | them | ?
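Tokenisation, too, can be sketched with a small regular expression that keeps words and numbers together but treats every other symbol as its own token (a toy rule, not a full tokeniser):

```python
import re

sentence = "You want to see the dreams with close eyes and achieve them?"

# Words/numbers become one token each; any other symbol is its own token.
tokens = re.findall(r"\w+|[^\w\s]", sentence)
print(tokens)
# ['You', 'want', 'to', 'see', 'the', 'dreams', 'with', 'close',
#  'eyes', 'and', 'achieve', 'them', '?']
```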

3 REMOVAL OF STOPWORDS

Stopwords: Stopwords are words that occur very frequently in the corpus but do not add any value to it.

Examples: a, an, and, are, as, for, it, is, into, in, if, on, or, such, the, there, to.

In this step, the tokens which are not necessary are removed from the token list. These words are removed to make it easier for the computer to focus on meaningful terms.

Along with these words, a lot of the time our corpus might also contain special characters and/or numbers. Whether to remove them depends on the task: if you are working on a document containing email IDs, you might not want to remove the special characters and numbers.

Example: For "You want to see the dreams with close eyes and achieve them?", the removed tokens would be: to, the, and, ?

The outcome would be:

-> You want see dreams with close eyes achieve them
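A minimal stopword filter over the token list from the previous step might look like this (the stopword set below is just the examples listed above, not an exhaustive list; the '?' token is dropped here as a special character):

```python
# Minimal stopword and special-character filter.
STOPWORDS = {"a", "an", "and", "are", "as", "for", "it", "is", "into",
             "in", "if", "on", "or", "such", "the", "there", "to"}

tokens = ["You", "want", "to", "see", "the", "dreams", "with", "close",
          "eyes", "and", "achieve", "them", "?"]

filtered = [t for t in tokens
            if t.lower() not in STOPWORDS and t.isalnum()]
print(filtered)
# ['You', 'want', 'see', 'dreams', 'with', 'close', 'eyes', 'achieve', 'them']
```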

4 CONVERTING TEXT TO A COMMON CASE

We convert the whole text into a similar case, preferably lower case. This ensures that the machine's case sensitivity does not treat the same words as different just because they appear in different cases.
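In Python this step is a one-liner over the token list, for example:

```python
tokens = ["You", "want", "see", "dreams", "with", "close", "eyes",
          "achieve", "them"]
tokens = [t.lower() for t in tokens]  # everything in one common (lower) case
print(tokens)
# ['you', 'want', 'see', 'dreams', 'with', 'close', 'eyes', 'achieve', 'them']
```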

5 STEMMING

Stemming is a technique used to extract the base form of words by removing affixes from them. It is just like cutting down the branches of a tree to its stem.

The stemmed word might not be meaningful.

Example:

Word      Affix   Stem
healing   -ing    heal
dreams    -s      dream
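The idea of stemming can be sketched with a toy suffix-stripper; real stemmers (e.g., the Porter stemmer) apply far more careful rules, but even they can produce non-words:

```python
def naive_stem(word: str) -> str:
    """Strip a few common suffixes; the result may not be a real word."""
    for suffix in ("ing", "es", "ed", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

for w in ("healing", "dreams", "caring"):
    print(w, "->", naive_stem(w))
# healing -> heal
# dreams -> dream
# caring -> car   (not a meaningful word; stemming does not guarantee one)
```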

6 LEMMATIZATION

In lemmatization, the word we get after affix removal (known as the lemma) is a meaningful one, and it takes longer to execute than stemming.

Lemmatization makes sure that the lemma is a word with meaning.

Example:

Word      Affix   Lemma
healing   -ing    heal
dreams    -s      dream

DIFFERENCE BETWEEN STEMMING AND LEMMATIZATION

Stemming                                  Lemmatization
The stemmed words might not be            The lemma is a meaningful word.
meaningful.
Caring ➔ Car                              Caring ➔ Care
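For lemmatization in practice, one option, assuming the nltk package and its WordNet data are installed, is the WordNet lemmatizer; the outputs in the comments are what it is expected to return:

```python
# pip install nltk, then fetch the WordNet data once:
import nltk
nltk.download("wordnet", quiet=True)

from nltk.stem import WordNetLemmatizer

lemmatizer = WordNetLemmatizer()
print(lemmatizer.lemmatize("caring", pos="v"))   # care  (a meaningful word)
print(lemmatizer.lemmatize("healing", pos="v"))  # heal
print(lemmatizer.lemmatize("dreams"))            # dream (default pos is noun)
```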

BAG OF WORDS ALGORITHM

Bag of Words creates a set of vectors containing the count of word occurrences in each document (e.g., reviews). Bag of Words vectors are easy to interpret.

The bag of words gives us two things:
1. A vocabulary of words for the corpus.
2. The frequency of these words (the number of times each has occurred in the whole corpus).

Calling this algorithm a "bag" of words symbolizes that the sequence of the sentences or tokens does not matter in this case; all we need are the unique words and their frequencies.

STEPS OF THE BAG OF WORDS ALGORITHM

1. Text Normalisation: Collect the data and pre-process it.
2. Create Dictionary: Make a list of all the unique words occurring in the corpus (the vocabulary).
3. Create document vectors: For each document in the corpus, find out how many times each word from the unique list has occurred.
4. Create document vectors for all the documents.

Example:

Step 1: Collecting data and pre-processing it.

Raw Data                                    Processed Data
Document 1: Aman and Anil are stressed      Document 1: [aman, and, anil, are, stressed]
Document 2: Aman went to a therapist        Document 2: [aman, went, to, a, therapist]
Document 3: Anil went to download a         Document 3: [anil, went, to, download, a, health, chatbot]
            health chatbot

Step 2: Create Dictionary

A dictionary in NLP means a list of all the unique words occurring in the corpus. If some words are repeated in different documents, they are written just once while creating the dictionary.

Dictionary: aman, and, anil, are, stressed, went, to, a, therapist, download, health, chatbot

Some words are repeated in different documents, but they are written just once: while creating the dictionary, we create a list of unique words.

Step 3: Create a document vector

The document vector contains the frequency of each word of the vocabulary in a particular document.

In the document vector, the vocabulary is written in the top row. Then, for each word in the document:
- if it matches the vocabulary, put a 1 under it;
- if the same word appears again, increment the previous value by 1;
- if the word does not occur in that document, put a 0 under it.

Vocabulary:  aman  and  anil  are  stressed  went  to  a  therapist  download  health  chatbot
Document 1:  1     1    1     1    1         0     0   0  0          0         0       0

Step 4: Creating a document vector table for all documents

Vocabulary:  aman  and  anil  are  stressed  went  to  a  therapist  download  health  chatbot
Document 1:  1     1    1     1    1         0     0   0  0          0         0       0
Document 2:  1     0    0     0    0         1     1   1  1          0         0       0
Document 3:  0     0    1     0    0         1     1   1  0          1         1       1

In this table, the header row contains the vocabulary of the corpus and the three rows correspond to the three different documents. This gives us the document vector table for our corpus. But the tokens have still not been converted to numbers. This leads us to the final step of our algorithm: TFIDF.
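The whole Bag of Words procedure fits in a few lines of plain Python; here is a sketch built around the three example documents above:

```python
# Bag of Words over the three pre-processed example documents.
docs = [
    ["aman", "and", "anil", "are", "stressed"],
    ["aman", "went", "to", "a", "therapist"],
    ["anil", "went", "to", "download", "a", "health", "chatbot"],
]

# Step 2: dictionary of unique words, keeping first-seen order.
vocab = list(dict.fromkeys(w for doc in docs for w in doc))

# Steps 3-4: one count vector per document.
vectors = [[doc.count(w) for w in vocab] for doc in docs]

print(vocab)
for vec in vectors:
    print(vec)
# ['aman', 'and', 'anil', 'are', 'stressed', 'went', 'to', 'a',
#  'therapist', 'download', 'health', 'chatbot']
# [1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
# [1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0]
# [0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1]
```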

TFIDF

TFIDF stands for Term Frequency & Inverse Document Frequency.

1 TERM FREQUENCY

1. Term frequency is the frequency of a word in one document.
2. Term frequency can easily be found from the document vector table.

Example:

Vocabulary:  aman  and  anil  are  stressed  went  to  a  therapist  download  health  chatbot
Document 1:  1     1    1     1    1         0     0   0  0          0         0       0
Document 2:  1     0    0     0    0         1     1   1  1          0         0       0
Document 3:  0     0    1     0    0         1     1   1  0          1         1       1

As we can see, the frequency of each word in each document has been recorded in the table. These numbers are nothing but the term frequencies!

2 DOCUMENT FREQUENCY

Document frequency is the number of documents in which the word occurs, irrespective of how many times it has occurred in those documents.

Word:                aman  and  anil  are  stressed  went  to  a  therapist  download  health  chatbot
Document frequency:  2     1    2     1    1         2     2   2  1          1         1       1

We can observe from the table that:
1. The document frequency of 'aman', 'anil', 'went', 'to' and 'a' is 2, as they occur in two documents.
2. The rest occur in just one document, hence their document frequency is 1.

3 INVERSE DOCUMENT FREQUENCY

For inverse document frequency, we put the document frequency in the denominator and the total number of documents in the numerator.

Word:  aman  and  anil  are  stressed  went  to   a    therapist  download  health  chatbot
N/DF:  3/2   3/1  3/2   3/1  3/1       3/2   3/2  3/2  3/1        3/1       3/1     3/1

FORMULA OF TFIDF

The formula of TFIDF for any word W becomes:

TFIDF(W) = TF(W) * log( N / DF(W) )

where TF(W) is the term frequency of W in the document, DF(W) is its document frequency, and N is the total number of documents in the corpus (the values in these notes use log base 10).

We don't need to calculate the log values by ourselves. We can simply use the log function on a calculator and find out!
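The log values can just as easily be checked in Python, for instance:

```python
import math

print(round(math.log10(3 / 2), 3))  # 0.176
print(round(math.log10(3 / 1), 3))  # 0.477
```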
Multiplying each term frequency by the corresponding log(N/DF), with columns in the order aman, and, anil, are, stressed, went, to, a, therapist, download, health, chatbot:

Document 1:  1*log(3/2)  1*log(3)  1*log(3/2)  1*log(3)  1*log(3)  0*log(3/2)  0*log(3/2)  0*log(3/2)  0*log(3)  0*log(3)  0*log(3)  0*log(3)
Document 2:  1*log(3/2)  0*log(3)  0*log(3/2)  0*log(3)  0*log(3)  1*log(3/2)  1*log(3/2)  1*log(3/2)  1*log(3)  0*log(3)  0*log(3)  0*log(3)
Document 3:  0*log(3/2)  0*log(3)  1*log(3/2)  0*log(3)  0*log(3)  1*log(3/2)  1*log(3/2)  1*log(3/2)  0*log(3)  1*log(3)  1*log(3)  1*log(3)

After calculating all the values, we get:

Word:        aman   and    anil   are    stressed  went   to     a      therapist  download  health  chatbot
Document 1:  0.176  0.477  0.176  0.477  0.477     0      0      0      0          0         0       0
Document 2:  0.176  0      0      0      0         0.176  0.176  0.176  0.477      0         0       0
Document 3:  0      0      0.176  0      0         0.176  0.176  0.176  0          0.477     0.477   0.477

Finally, the words have been converted to numbers. These numbers are the TFIDF values of each word in each document.

Here we can see that, since we have a small amount of data, words like 'are' and 'and' also have a high value. But as a word's document frequency increases, its IDF value, and hence its TFIDF weight, decreases.

For example:
Total number of documents: 10
Number of documents in which 'and' occurs: 10
Therefore, N/DF for 'and' = 10/10 = 1, and log(1) = 0. Hence, the value of 'and' becomes 0.

On the other hand, suppose the number of documents in which 'pollution' occurs is 3.
N/DF for 'pollution' = 10/3 = 3.3333…
log(3.3333) = 0.522, which shows that the word 'pollution' has considerable value in the corpus.
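Putting term frequency, document frequency and IDF together, here is a compact Python sketch that reproduces the three-document example end to end (log base 10, as in the tables above):

```python
import math

docs = [
    ["aman", "and", "anil", "are", "stressed"],
    ["aman", "went", "to", "a", "therapist"],
    ["anil", "went", "to", "download", "a", "health", "chatbot"],
]

vocab = list(dict.fromkeys(w for doc in docs for w in doc))
N = len(docs)

# Document frequency: number of documents containing each word.
df = {w: sum(w in doc for doc in docs) for w in vocab}

# TFIDF(W) = TF(W) * log10(N / DF(W)), per document.
for i, doc in enumerate(docs, start=1):
    row = [round(doc.count(w) * math.log10(N / df[w]), 3) for w in vocab]
    print(f"Document {i}:", row)
# Document 1: [0.176, 0.477, 0.176, 0.477, 0.477, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
# Document 2: [0.176, 0.0, 0.0, 0.0, 0.0, 0.176, 0.176, 0.176, 0.477, 0.0, 0.0, 0.0]
# Document 3: [0.0, 0.0, 0.176, 0.0, 0.0, 0.176, 0.176, 0.176, 0.0, 0.477, 0.477, 0.477]
```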

Important concepts to remember:

1. Words that occur in all the documents with high term frequencies have the lowest values and are considered to be stopwords.
2. For a word to have a high TFIDF value, the word needs to have a high term frequency but a low document frequency, which shows that the word is important for one document but is not a common word across all documents.
3. These values help the computer understand which words are to be considered while processing the natural language.

APPLICATIONS OF TFIDF

TFIDF is commonly used in the Natural Language Processing domain. Some of its applications are:

1. Document Classification - Helps in classifying the type and genre of a document.
2. Topic Modelling - Helps in predicting the topic of a corpus.
3. Information Retrieval System - To extract the important information out of a corpus.
4. Stop word filtering - Helps in removing unnecessary words from a text body.
