UNIT-5 Questions - Answers

The document discusses various aspects of Natural Language Processing (NLP), including its definition, components, and applications such as Google Translate and chatbots. It covers key techniques like parsing, syntactic analysis, semantic analysis, stemming, and tokenization using the NLTK library. Additionally, it explains the importance of stop words and regular expressions in NLP.

UNIT-5 Questions:

1. What do you understand by Natural Language Processing?
2. Discuss parsing in the context of NLP.
3. List the components of Natural Language Processing.
4. Explain in detail how sentiment analysis is used in NLP.
5. Explain what syntactic analysis is.
6. Discuss regular expressions.
7. Explain stemming with the help of an example.
8. What are stop words?
9. List any two real-life applications of Natural Language Processing.
10. What is semantic analysis?
11. Elaborate on the libraries used for NLP.
12. How to tokenize a sentence using the NLTK package?
13. Explain how we can do parsing.

What do you understand by Natural Language Processing?

Natural Language Processing is a field of computer science that deals with communication between computer systems and humans. It is a technique used in Artificial Intelligence and Machine Learning. It is used to create automated software that helps understand human-spoken languages and extract useful information from data. Techniques in NLP allow computer systems to process and interpret data in the form of natural languages.

List any two real-life applications of Natural Language Processing.

Two real-life applications of Natural Language Processing are as follows:

1. Google Translate: Google Translate is one of the most famous applications of Natural Language Processing. It helps convert written or spoken sentences into any language. We can also find the correct pronunciation and meaning of a word by using Google Translate. It uses advanced techniques of Natural Language Processing to translate sentences into various languages.
2.​ Chatbots: To provide a better customer support service, companies
have started using chatbots for 24/7 service. AI Chatbots help resolve
the basic queries of customers. If a chatbot is not able to resolve any
query, then it forwards it to the support team, while still engaging the
customer. It helps make customers feel that the customer support
team is quickly attending to them. With the help of chatbots,
companies have become capable of building cordial relations with
customers. It is only possible with the help of Natural Language
Processing.

What are stop words?


Stop words are words that carry little standalone meaning, such as articles and prepositions; for a search engine they are effectively useless data. Common stop words include was, were, is, am, the, a, an, how, and why. In Natural Language Processing, we eliminate stop words so that the analysis focuses on the meaningful words of a sentence. The removal of stop words is one of the most important tasks for search engines: engineers design search algorithms to ignore stop words, which helps return relevant results for a query.
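A minimal sketch of stop-word removal, assuming a small hand-rolled stop list (real pipelines typically use a curated list such as NLTK's stopwords corpus, nltk.corpus.stopwords.words('english')):

```python
# Illustrative subset of common English stop words (not a complete list)
STOP_WORDS = {"was", "were", "is", "am", "the", "a", "an", "how", "why", "to", "and"}

def remove_stop_words(sentence):
    """Keep only the tokens that are not in the stop list."""
    tokens = sentence.lower().split()
    return [tok for tok in tokens if tok not in STOP_WORDS]

print(remove_stop_words("How is the weather in Nagpur"))
# ['weather', 'in', 'nagpur']
```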

What is NLTK?

NLTK is a Python library, which stands for Natural Language Toolkit. We use
NLTK to process data in human-spoken languages. NLTK allows us to apply
techniques such as parsing, tokenization, lemmatization, stemming, and more to
understand natural languages. It helps in categorizing text, parsing linguistic
structure, analyzing documents, etc.

A few of the modules and classes of the NLTK package that we often use in NLP are:

1. SequentialBackoffTagger
2. DefaultTagger
3. UnigramTagger, BigramTagger, and TrigramTagger
4. RegexpTagger
5. treebank
6. wordnet
7. FreqDist
8. patterns
9. backoff_tagger
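As a small illustration of the tagger classes above, a RegexpTagger can fall back to a DefaultTagger when no pattern matches; this is the sequential backoff idea behind SequentialBackoffTagger. The tag patterns below are illustrative assumptions, not a complete English tagset:

```python
from nltk.tag import DefaultTagger, RegexpTagger

# Illustrative regex patterns (assumptions for this sketch)
patterns = [
    (r'.*ing$', 'VBG'),  # words ending in -ing -> gerund
    (r'.*ed$', 'VBD'),   # words ending in -ed -> past tense
    (r'^\d+$', 'CD'),    # digit strings -> cardinal number
]

default = DefaultTagger('NN')                  # tag everything 'NN' as a last resort
tagger = RegexpTagger(patterns, backoff=default)

print(tagger.tag(['running', '30', 'questions']))
# [('running', 'VBG'), ('30', 'CD'), ('questions', 'NN')]
```

No corpus download is needed for this sketch, since neither tagger is trained on data.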

What is Syntactic Analysis?

Syntactic analysis is a technique of analyzing sentences to extract meaning from them. Using syntactic analysis, a machine can analyze and understand the order of words arranged in a sentence. NLP employs the grammar rules of a language to analyze the combination and order of words in documents.

The techniques used for syntactic analysis are as follows:

1. Parsing: It helps in deciding the structure of a sentence or text in a document. It analyzes the words in the text based on the grammar of the language.
2. Word segmentation: It segregates the text into small, significant units (tokens).
3. Morphological segmentation: Its purpose is to break words down into their base forms.
4. Stemming: It is the process of removing suffixes from a word to obtain its root word.
5. Lemmatization: It reduces a word to its dictionary base form (lemma) by handling suffixes, without altering the meaning of the word.
What is Semantic Analysis?

Semantic analysis helps make a machine understand the meaning of a text. It uses various algorithms for the interpretation of words in sentences. It also helps understand the structure of a sentence.

Techniques used for semantic analysis are as given below:

1. Named entity recognition: This is the process of information retrieval that helps identify entities such as the name of a person, organization, place, time, emotion, etc.
2.​ Word sense disambiguation: It helps identify the sense of a word
used in different sentences.
3.​ Natural language generation: It is a process used by the software to
convert structured data into human-spoken languages. By using NLG,
organizations can automate content for custom reports.
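As a toy sketch of named entity recognition using a small hand-built gazetteer (the names and categories below are assumptions for illustration; real NER systems such as nltk.ne_chunk learn entities from annotated data):

```python
# Hypothetical gazetteer mapping known names to entity types
GAZETTEER = {
    "jonas": "PERSON",
    "google": "ORGANIZATION",
    "nagpur": "PLACE",
}

def recognize_entities(sentence):
    """Return (token, entity_type) pairs for tokens found in the gazetteer."""
    return [(tok, GAZETTEER[tok.lower()])
            for tok in sentence.split()
            if tok.lower() in GAZETTEER]

print(recognize_entities("Jonas works at Google in Nagpur"))
# [('Jonas', 'PERSON'), ('Google', 'ORGANIZATION'), ('Nagpur', 'PLACE')]
```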

List the components of Natural Language Processing.

The major components of NLP are as follows:


● Entity extraction: It refers to the retrieval of information such as place, person, organization, etc., by segmenting a sentence. It helps in recognizing the entities present in a text.
● Syntactic analysis: It analyzes the grammatical arrangement of words, which helps draw the specific meaning of a text.
● Pragmatic analysis: It is implemented to extract useful information from a text in its real-world context.
● Morphological and lexical analysis: It helps explain the structure of words by analyzing them through parsing.

What are Regular Expressions?

A regular expression is used to match and tag words. It consists of a sequence of characters that defines a pattern for matching strings.

If A and B are regular expressions, then the following rules hold:

● If {ɛ} is a regular language, then ɛ is a regular expression for it.
● If A and B are regular expressions, then A + B (union) is also a regular expression, denoting the language {A, B}.
● If A and B are regular expressions, then their concatenation A.B is also a regular expression.
● If A is a regular expression, then A* (A occurring zero or more times) is also a regular expression.
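These formal rules map directly onto Python's re module: union A + B is written A|B, concatenation is simple juxtaposition, and A* is the star itself. A quick sketch:

```python
import re

# Union: a|b matches 'a' or 'b'
print(bool(re.fullmatch(r"a|b", "b")))          # True
# Concatenation: ab matches exactly the string 'ab'
print(bool(re.fullmatch(r"ab", "ab")))          # True
# Kleene star: a* matches zero or more 'a's, including the empty string
print(bool(re.fullmatch(r"a*", "aaaa")))        # True
print(bool(re.fullmatch(r"a*", "")))            # True
# The rules compose: (a|b)* followed by c
print(bool(re.fullmatch(r"(a|b)*c", "ababc")))  # True
```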

What is Parsing in the context of NLP?


Parsing in NLP refers to a machine's understanding of a sentence and its grammatical structure. Parsing allows the machine to understand the meaning of a word in a sentence and the grouping of words, phrases, nouns, subjects, and objects. Parsing helps analyze a text or document to extract useful insights from it. For example, the sentence 'Jonas ate an orange' can be parsed into a tree that reveals its structure.
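A minimal sketch of parsing 'Jonas ate an orange' with NLTK's chart parser, using a toy grammar written only for this one sentence (the grammar rules are assumptions for illustration, not a general English grammar):

```python
import nltk

# A toy context-free grammar covering only this sentence (illustrative)
grammar = nltk.CFG.fromstring("""
    S -> NP VP
    NP -> 'Jonas' | Det N
    VP -> V NP
    Det -> 'an'
    N -> 'orange'
    V -> 'ate'
""")

parser = nltk.ChartParser(grammar)
for tree in parser.parse(['Jonas', 'ate', 'an', 'orange']):
    print(tree)
# (S (NP Jonas) (VP (V ate) (NP (Det an) (N orange))))
```

The printed tree shows the grouping the prose describes: 'Jonas' as the subject noun phrase, and 'ate an orange' as the verb phrase containing the object.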

Explain Stemming with the help of an example.

In Natural Language Processing, stemming is the method of extracting the root word by removing suffixes and prefixes from a word.
For example, we can reduce 'stemming' to 'stem' by removing the suffix 'ming.'
We use various algorithms for implementing stemming, and one of them is PorterStemmer.
First, we import PorterStemmer from the nltk package and create an object for it:

from nltk.stem import PorterStemmer

pst = PorterStemmer()
pst.stem("running"), pst.stem("cookies"), pst.stem("flying")

Output:

('run', 'cooki', 'fli')
How to tokenize a sentence using the nltk package?

Tokenization is a process used in NLP to split a sentence into tokens. Sentence tokenization refers to splitting a text or paragraph into sentences.

For sentence tokenization, we import sent_tokenize from the nltk package:

from nltk.tokenize import sent_tokenize

We will use the below paragraph for sentence tokenization:

Para = "Hi Guys. Welcome to SVPCET. This is a blog on the NLP interview questions and answers."
sent_tokenize(Para)

Output:

['Hi Guys.', 'Welcome to SVPCET.', 'This is a blog on the NLP interview questions and answers.']

Tokenizing a word refers to splitting a sentence into words. To tokenize words, we import word_tokenize from the nltk package:

from nltk.tokenize import word_tokenize

Para = "Hi Guys. Welcome to SVPCET. This is a blog on the NLP interview questions and answers."
word_tokenize(Para)

Output:

['Hi', 'Guys', '.', 'Welcome', 'to', 'SVPCET', '.', 'This', 'is', 'a', 'blog', 'on', 'the', 'NLP', 'interview', 'questions', 'and', 'answers', '.']

Explain how we can do parsing.

Parsing is the method of identifying and understanding the syntactic structure of a text. It is done by analyzing the individual elements of the text. The machine parses the text one word at a time, then two at a time, then three, and so on.

● When the machine parses the text one word at a time, it is a unigram.
● When the text is parsed two words at a time, it is a bigram.
● When the machine parses three words at a time, the set of words is a trigram.


Now, let's implement this with the help of the nltk package.

import nltk
from nltk.tokenize import word_tokenize

text = "Top 30 NLP interview questions and answers"

We will now tokenize the text using word_tokenize:

text_token = word_tokenize(text)

Unigrams are simply the individual tokens. For bigrams and trigrams, nltk provides nltk.bigrams and nltk.trigrams (note that NLTK has no nltk.unigrams function, and the n-gram functions must be given the token list, not the raw string):

list(nltk.bigrams(text_token))

Output:

[('Top', '30'), ('30', 'NLP'), ('NLP', 'interview'), ('interview', 'questions'), ('questions', 'and'), ('and', 'answers')]

list(nltk.trigrams(text_token))

Output:

[('Top', '30', 'NLP'), ('30', 'NLP', 'interview'), ('NLP', 'interview', 'questions'), ('interview', 'questions', 'and'), ('questions', 'and', 'answers')]

For extracting n-grams in general, we can use the function nltk.ngrams and give the argument n for the number of words per n-gram:

list(nltk.ngrams(text_token, n))
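The unigram/bigram/trigram idea above can also be sketched in a few lines of plain Python, as a simplified stand-in for nltk.ngrams:

```python
def ngrams(tokens, n):
    """Return consecutive n-token tuples from a token list."""
    return list(zip(*(tokens[i:] for i in range(n))))

tokens = "Top 30 NLP interview questions and answers".split()
print(ngrams(tokens, 2))  # bigrams: [('Top', '30'), ('30', 'NLP'), ...]
print(ngrams(tokens, 3))  # trigrams: [('Top', '30', 'NLP'), ...]
```

The zip over shifted copies of the token list pairs each token with its n-1 successors, which is exactly the sliding window that n-gram extraction performs.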


About the NLTK toolkit

Link: NLTK :: Natural Language Toolkit
