NLP Practical
PRACTICAL NO. 01
AIM:
Apply various text preprocessing techniques for any given text: Tokenization, Filtration, and
Script Validation.
THEORY:
In Natural Language Processing (NLP), raw text data is often noisy, unstructured, and
difficult to analyze directly. Text preprocessing is a crucial step in converting this raw data
into a cleaner and more structured format that is suitable for analysis or modelling. Three of
the most important preprocessing techniques are tokenization, filtration, and script validation.
Each serves a distinct purpose, contributing to improving data quality and relevance.
1. Tokenization
Tokenization is the process of splitting text into smaller, manageable units called tokens.
These tokens can be words, sentences, or subword units, depending on the level of
granularity required for the task at hand.
Word Tokenization: This splits text into individual words or terms. For example:
o Input: "Hello, world!"
o Output: ["Hello", "world"]
Sentence Tokenization: This splits text into sentences.
o Input: "Hello world! Welcome to NLP."
o Output: ["Hello world!", "Welcome to NLP."]
2. Filtration
Filtration refers to the process of removing irrelevant elements from the text, such as
punctuation, stop words, and inconsistencies that may affect analysis.
Removing Punctuation: Punctuation marks like commas and periods do not usually
provide meaningful information in most NLP tasks and are often removed.
Lowercasing: Converting all text to lowercase helps ensure uniformity. For example,
"Word" and "word" should be treated the same way.
Removing Stop Words: Stop words are common words like "the", "is", and "and".
These are often filtered out because they provide little to no useful information.
3. Script Validation
Script validation is the process of ensuring that the text data follows specific format rules or
contains valid characters.
Character Validation: Ensures that tokens consist only of valid characters, such as
letters or digits. For instance, we can validate that a token contains only alphabetic
characters if our focus is on standard text analysis.
Pattern Matching: Regular expressions can be used to validate certain patterns, like
email addresses or phone numbers.
CONCLUSION:
In this practical, we successfully implemented various text preprocessing techniques on a
given text. We applied tokenization to break the text into smaller units, filtration to remove
irrelevant components, and script validation to ensure data correctness. These preprocessing
techniques form the foundation for effective and efficient text analysis in NLP.
PROGRAM:
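A minimal sketch of the three steps using NLTK; it assumes the nltk package is installed and downloads the "punkt" and "stopwords" data at run time. The sample text and the e-mail pattern are chosen for illustration only.

# Practical 01: tokenization, filtration, and script validation
import re
import string

import nltk
from nltk.corpus import stopwords
from nltk.tokenize import sent_tokenize, word_tokenize

nltk.download("punkt", quiet=True)      # sentence/word tokenizer models
nltk.download("stopwords", quiet=True)  # stop word lists

text = "Hello world! Welcome to NLP. Email me at test@example.com."

# 1. Tokenization: split into sentences and words
sentences = sent_tokenize(text)
words = word_tokenize(text)

# 2. Filtration: lowercase, drop punctuation tokens and stop words
stop_words = set(stopwords.words("english"))
filtered = [w.lower() for w in words
            if w not in string.punctuation and w.lower() not in stop_words]

# 3. Script validation: keep purely alphabetic tokens; match e-mail-like patterns
alphabetic = [w for w in filtered if w.isalpha()]
emails = re.findall(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b", text)

print("Sentences:  ", sentences)
print("Words:      ", words)
print("Filtered:   ", filtered)
print("Alphabetic: ", alphabetic)
print("E-mails:    ", emails)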
OUTPUT:
PRACTICAL NO. 02
AIM:
Apply various other text preprocessing techniques for any given text: Stop Word Removal
and Lemmatization/Stemming.
THEORY:
Text preprocessing is a fundamental part of Natural Language Processing (NLP) that helps
prepare raw textual data for analysis. Beyond basic tokenization and filtration, techniques
such as stop word removal, lemmatization, and stemming are essential for improving the
quality and efficiency of text analysis.
1. Stop Word Removal
Stop words are commonly used words in a language that typically do not contribute
significant meaning to the text. These include words like "and," "the," "is," "in," and "on."
While they are important in forming grammatically correct sentences, they often carry little
importance in text analysis and are removed to reduce noise.
Example:
o Input: "The cat is on the mat."
o Output after stop word removal: "cat mat"
2. Lemmatization
Lemmatization is the process of reducing words to their base or dictionary form (lemma).
Unlike stemming (which simply chops off word endings), lemmatization uses linguistic rules
to ensure that the base form is a valid word. For example, the words "running" and "ran" are
both reduced to the base form "run."
Example:
o Input: "The boys are running quickly."
o Output after lemmatization: "The boy be run quickly"
3. Stemming
Stemming is a process of reducing words to their root form by stripping affixes (such as
suffixes and prefixes) from words. Unlike lemmatization, stemming does not necessarily
produce valid words but still helps in reducing word variations.
Example:
o Input: "The boys are running quickly."
o Output after stemming: "the boy are run quickli" (note that "quickli" is not a valid English word)
PROCEDURE:
We will implement the following preprocessing steps on a sample text:
1. Stop Word Removal
2. Lemmatization
3. Stemming
CONCLUSION:
In this practical, we applied additional text preprocessing techniques: stop word removal,
lemmatization, and stemming. Stop word removal filtered out common, insignificant words
to reduce noise in the text. Lemmatization reduced words to their dictionary base forms,
while stemming truncated words to their root forms. These preprocessing steps help prepare
text data for more effective analysis in NLP tasks by reducing variability in word forms and
focusing on the most relevant information.
PROGRAM:
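A minimal sketch using NLTK's stop word list, WordNet lemmatizer, and Porter stemmer; it assumes nltk is installed and its data can be downloaded. Exact outputs depend on the tools chosen.

# Practical 02: stop word removal, lemmatization, and stemming
import nltk
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer, WordNetLemmatizer
from nltk.tokenize import word_tokenize

for pkg in ("punkt", "stopwords", "wordnet"):
    nltk.download(pkg, quiet=True)

text = "The boys are running quickly."
tokens = [t for t in word_tokenize(text.lower()) if t.isalpha()]

# 1. Stop word removal
stop_words = set(stopwords.words("english"))
content = [t for t in tokens if t not in stop_words]

# 2. Lemmatization: WordNet needs a POS tag, so try verb first, then noun
lemmatizer = WordNetLemmatizer()
lemmas = [lemmatizer.lemmatize(lemmatizer.lemmatize(t, pos="v"), pos="n")
          for t in content]

# 3. Stemming with the Porter algorithm
stemmer = PorterStemmer()
stems = [stemmer.stem(t) for t in content]

print("After stop word removal:", content)  # ['boys', 'running', 'quickly']
print("After lemmatization:    ", lemmas)   # ['boy', 'run', 'quickly']
print("After stemming:         ", stems)    # ['boy', 'run', 'quickli']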
OUTPUT:
PRACTICAL NO. 03
AIM:
Perform Morphological Analysis and Word Generation for any given text.
THEORY:
Morphological analysis and word generation are essential in Natural Language Processing
(NLP), as they help break words into morphemes—the smallest meaningful units in a
language. Morphemes are used to understand and generate new words. This analysis enables
a better understanding of how words are structured and how they can be modified to convey
different meanings or grammatical roles.
Types of Morphemes:
1. Free Morphemes:
Morphemes that can stand alone as words. Examples include "book," "walk," and
"help."
2. Bound Morphemes:
Morphemes that cannot stand alone and must be attached to other morphemes. These
include prefixes (e.g., "dis-") and suffixes (e.g., "-ed," "-ly").
Morphological Analysis:
In morphological analysis, we break down words into their base forms (stems) and identify
their affixes (prefixes or suffixes), revealing their meaning and grammatical function.
Example Words:
Let’s consider the following words for morphological analysis:
"disagreement"
"playing"
"bikes"
"driver"
Morphological Breakdown:
1. "disagreement"
o Base Word: agree
o Prefix: dis- (indicates negation or opposition)
o Suffix: -ment (indicates a state or condition)
o Morpheme Breakdown: dis- + agree + -ment
o Meaning: The state of not agreeing or being in opposition.
2. "playing"
o Base Word: play
o Suffix: -ing (indicates present participle)
o Morpheme Breakdown: play + -ing
o Meaning: The act of engaging in a playful or recreational activity.
3. "bikes"
o Base Word: bike
o Suffix: -s (indicates plural)
o Morpheme Breakdown: bike + -s
o Meaning: More than one bike.
4. "driver"
o Base Word: drive
o Suffix: -er (indicates a person who performs the action)
o Morpheme Breakdown: drive + -er
o Meaning: A person who drives.
Word Generation:
Word generation is the process of creating new words by applying morphological rules, such
as adding prefixes or suffixes, or by modifying the tense or form of the base word.
CONCLUSION:
Morphological analysis and word generation are crucial processes in NLP, as they help us
understand the structure of language and generate new words from existing morphemes.
These tasks enhance language processing in applications such as text generation, machine
translation, and sentiment analysis. Through the analysis of morphemes and the generation of
new words, we can improve language comprehension and produce more sophisticated NLP
models.
PROGRAM:
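Since no standard library performs this exact breakdown, the sketch below uses a small hand-written affix table; PREFIXES, SUFFIXES, analyze, and generate are illustrative names, and the stripping rules are deliberately naive (for instance, "driver" yields the stem "driv" rather than "drive").

# Practical 03: simple rule-based morphological analysis and word generation
# (illustrative sketch with a tiny hand-written affix table, not a full analyzer)

PREFIXES = {"dis": "negation/opposition", "un": "negation", "re": "again"}
SUFFIXES = {"ment": "state or condition", "ing": "present participle",
            "s": "plural", "er": "agent (one who does)"}

def analyze(word):
    """Strip one known prefix and one known suffix, if present."""
    prefix = suffix = None
    stem = word
    for p in PREFIXES:
        if stem.startswith(p):
            prefix, stem = p, stem[len(p):]
            break
    for s in sorted(SUFFIXES, key=len, reverse=True):  # try longest suffix first
        if stem.endswith(s) and len(stem) > len(s) + 1:
            suffix, stem = s, stem[:-len(s)]
            break
    return prefix, stem, suffix

def generate(stem, prefix=None, suffix=None):
    """Word generation: build a new word from a stem plus optional affixes."""
    return (prefix or "") + stem + (suffix or "")

for word in ["disagreement", "playing", "bikes", "driver"]:
    prefix, stem, suffix = analyze(word)
    parts = [x for x in (prefix, stem, suffix) if x]
    print(f"{word:14} -> {' + '.join(parts)}")

print("Generated:", generate("agree", prefix="dis"), generate("play", suffix="ing"))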
OUTPUT:
PRACTICAL NO. 04
AIM:
Implement Word Sense Disambiguation (WSD) using LSTM or GRU.
THEORY:
Word Sense Disambiguation (WSD) is a key task in Natural Language
Processing (NLP) that involves identifying which sense (or meaning) of a word
is used in a given context. Many words in natural language have multiple
meanings (polysemy), and determining the correct meaning based on context is
critical for tasks like machine translation, information retrieval, and question
answering.
Recurrent Neural Networks (RNNs), particularly Long Short-Term Memory
(LSTM) and Gated Recurrent Units (GRU), are commonly used to model
sequences and capture contextual dependencies in text, making them suitable
for WSD tasks. They can learn word representations and use context from
surrounding words to predict the correct sense of a target word.
Model Overview:
Input: A sequence of words (text) where one word is ambiguous.
Output: The correct sense of the ambiguous word, based on the context
provided by the surrounding words.
LSTM and GRU models can be used to learn the context of a word in a sentence
and then classify the sense of the ambiguous word.
CONCLUSION:
In this practical, we implemented a basic LSTM-based model to perform Word
Sense Disambiguation (WSD) on a small dataset. The model learns to predict
the correct sense of an ambiguous word based on its context in the sentence. In
real-world applications, a larger, labeled dataset such as SemCor would be used
to achieve better accuracy.
PROGRAM:
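A toy sketch assuming TensorFlow/Keras; the six "bank" sentences and the two sense labels are made up for illustration, and a model trained on so little data only demonstrates the pipeline, not an accurate disambiguator.

# Practical 04: toy Word Sense Disambiguation with an LSTM
import numpy as np
from tensorflow.keras import layers, models
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.preprocessing.text import Tokenizer

# Two senses of "bank": 0 = financial institution, 1 = river bank (toy data)
sentences = [
    "he deposited money in the bank",
    "the bank approved my loan",
    "she opened an account at the bank",
    "we sat on the bank of the river",
    "fish swam near the muddy bank",
    "the river bank was covered in grass",
]
labels = np.array([0, 0, 0, 1, 1, 1])

# Encode words as integers and pad every sentence to the same length
tokenizer = Tokenizer()
tokenizer.fit_on_texts(sentences)
X = pad_sequences(tokenizer.texts_to_sequences(sentences), maxlen=8)
vocab_size = len(tokenizer.word_index) + 1

# Embedding -> LSTM -> sigmoid: the LSTM summarizes the context around "bank"
model = models.Sequential([
    layers.Embedding(vocab_size, 16),
    layers.LSTM(32),
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, labels, epochs=50, verbose=0)

test = pad_sequences(tokenizer.texts_to_sequences(
    ["i withdrew cash from the bank"]), maxlen=8)
senses = ["financial institution", "river bank"]
pred = model.predict(test, verbose=0)[0, 0]
print("Predicted sense:", senses[int(pred > 0.5)])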
OUTPUT:
PRACTICAL NO. 05
AIM:
Implement N-gram Model for the given text input.
THEORY:
An N-gram model is a probabilistic language model used in Natural Language
Processing (NLP) and computational linguistics to predict the next item in a
sequence. It operates by analyzing the probability of a word based on the N-1 words
that came before it. In essence, an N-gram represents a contiguous
sequence of N items (words, characters, etc.) from a given text.
Types of N-grams:
1. Unigram (1-gram): One word at a time.
2. Bigram (2-gram): Sequence of two words.
3. Trigram (3-gram): Sequence of three words.
4. N-gram: A sequence of N words, for any chosen N.
For example, consider the sentence:
Sentence: "The quick brown fox"
o Unigrams: "The", "quick", "brown", "fox"
o Bigrams: "The quick", "quick brown", "brown fox"
o Trigrams: "The quick brown", "quick brown fox"
The N-gram model is widely used in text prediction, machine translation, and
speech recognition systems.
OUTPUT EXPLANATION:
For the input text "The quick brown fox jumps over the lazy dog. The quick
fox is fast.", the output will include:
1. Unigrams:
o ('The',), ('quick',), ('brown',), ('fox',), ('jumps',), ...
2. Bigrams:
o ('The', 'quick'), ('quick', 'brown'), ('brown', 'fox'), ('fox', 'jumps'), ...
3. Trigrams:
o ('The', 'quick', 'brown'), ('quick', 'brown', 'fox'), ('brown', 'fox',
'jumps'), ...
4. Bigram Frequencies:
o ('The', 'quick'): 2
o ('quick', 'brown'): 1
o ('fox', 'jumps'): 1
o ('lazy', 'dog'): 1
CONCLUSION:
By using N-grams, we can understand the sequence of words in a given text and the
probability of certain word combinations. This is particularly useful in predictive text models,
where knowing the likelihood of a sequence of words can help make accurate predictions
about the next word. N-gram models are fundamental in many NLP applications, such as
speech recognition, text generation, and machine translation.
PROGRAM:
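A minimal sketch using NLTK's ngrams helper and collections.Counter; the input text matches the example above.

# Practical 05: N-grams and bigram frequencies with NLTK
from collections import Counter

import nltk
from nltk import ngrams, word_tokenize

nltk.download("punkt", quiet=True)

text = "The quick brown fox jumps over the lazy dog. The quick fox is fast."
tokens = [t for t in word_tokenize(text) if t.isalpha()]  # drop punctuation

for n, name in [(1, "Unigrams"), (2, "Bigrams"), (3, "Trigrams")]:
    print(f"{name}:", list(ngrams(tokens, n)))

# Frequency of each bigram, e.g. ('The', 'quick') occurs twice
bigram_counts = Counter(ngrams(tokens, 2))
for bigram, count in bigram_counts.most_common():
    print(bigram, ":", count)

# A simple bigram probability estimate: P(w2 | w1) = count(w1 w2) / count(w1)
unigram_counts = Counter(tokens)
print("P(quick | The) =", bigram_counts[("The", "quick")] / unigram_counts["The"])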
OUTPUT:
PRACTICAL NO. 06
AIM:
Study the different Part-of-Speech (POS) taggers and perform POS tagging on
the given text.
THEORY:
Part-of-Speech (POS) tagging is a core task in Natural Language Processing
(NLP) that involves assigning a grammatical label (or tag) to each word in a text
based on its syntactic role. POS tags can include categories such as nouns,
verbs, adjectives, adverbs, pronouns, and more. POS tagging helps in
understanding the grammatical structure of a sentence and is useful for tasks
like syntactic parsing, named entity recognition, text summarization, and
machine translation.
There are various POS tagging techniques and models, which use different
methods to classify the POS of words.
Types of POS Taggers:
1. Rule-Based Taggers:
o These taggers use manually created rules to determine the POS of a
word based on its context. These rules take into account the word
itself and the surrounding words to determine its grammatical role.
o Example: The Brill Tagger is a famous rule-based tagger.
2. Stochastic (Probabilistic) Taggers:
o Stochastic taggers use statistical models to assign POS tags based
on probabilities derived from a large corpus. These models often
use algorithms such as Hidden Markov Models (HMM) to predict
the most likely sequence of POS tags.
o Example: The HMM Tagger and Maximum Entropy Taggers are
well-known probabilistic taggers.
3. Neural Network-Based Taggers:
o These taggers use deep learning models to predict POS tags.
Neural networks, especially Recurrent Neural Networks (RNNs)
like LSTM and BiLSTM (Bidirectional Long Short-Term
Memory), and transformer-based models such as BERT, are
commonly used for POS tagging and provide highly accurate
results.
o Example: Models like BiLSTM-CRF or BERT are state-of-the-art
for POS tagging.
4. Hybrid Taggers:
o Hybrid taggers combine rule-based methods with probabilistic or
machine learning techniques to achieve better performance and
accuracy. This approach aims to balance between hand-crafted
rules and the power of machine learning models.
CONCLUSION:
POS tagging is a crucial task in NLP, allowing for the grammatical
classification of words within a sentence. Different POS taggers employ various
techniques, from rule-based systems to advanced neural networks. These
taggers help in analyzing and understanding text more deeply, which is
fundamental for further NLP applications like machine translation, text
summarization, and sentiment analysis.
PROGRAM:
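A minimal sketch using NLTK's averaged perceptron tagger, one of the stochastic taggers described above; it assumes nltk is installed and the tagger model can be downloaded.

# Practical 06: POS tagging with NLTK's averaged perceptron tagger
import nltk
from nltk import pos_tag, word_tokenize

nltk.download("punkt", quiet=True)
nltk.download("averaged_perceptron_tagger", quiet=True)

text = "The quick brown fox jumps over the lazy dog."
tagged = pos_tag(word_tokenize(text))  # (word, Penn Treebank tag) pairs

for word, tag in tagged:
    print(f"{word:8} -> {tag}")
# Expected tags include DT (determiner), JJ (adjective), NN (noun), VBZ (verb)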
OUTPUT:
PRACTICAL NO. 07
AIM:
Implement Named Entity Recognition (NER) for the given text input.
THEORY:
Named Entity Recognition (NER) is a task in Natural Language Processing
(NLP) that involves identifying and classifying named entities (e.g., people,
organizations, locations, dates, quantities) mentioned in a text. The goal is to tag
each word or phrase with a label indicating whether it is a named entity and
what type of entity it is.
NER helps in various applications such as information retrieval, question
answering, summarization, and machine translation by extracting meaningful
entities from text.
Types of Named Entities:
PERSON: Names of people.
ORGANIZATION: Companies, institutions, etc.
LOCATION: Cities, countries, geographical entities.
DATE: Dates or periods.
GPE (Geopolitical Entities): Countries, cities, states.
MONEY: Monetary values.
TIME: Specific times, durations.
PERCENT: Percentage values.
NER Approaches:
1. Rule-Based Systems:
o Use hand-crafted rules and patterns to identify named entities.
2. Statistical Models:
o Use probabilistic models such as Hidden Markov Models (HMM)
or Conditional Random Fields (CRF) to predict named entities.
CONCLUSION:
Named Entity Recognition (NER) is an essential task in NLP that identifies and
classifies key entities in text. By recognizing important entities such as people,
organizations, and places, NER helps in understanding the content of a text and
extracting meaningful information for downstream applications such as
information extraction, summarization, and question answering.
In this practical, we implemented a basic NER system using spaCy, a state-of-
the-art NLP library, which successfully identified and labeled entities in the
provided text.
PROGRAM:
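A minimal sketch using spaCy, as noted in the conclusion; it assumes the small English model has been installed separately (python -m spacy download en_core_web_sm). The sample sentence is chosen to exercise several entity types.

# Practical 07: Named Entity Recognition with spaCy
# Requires: python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")
text = ("Apple Inc. was founded by Steve Jobs in California in 1976 "
        "and is now worth over $2 trillion.")

doc = nlp(text)
for ent in doc.ents:
    print(f"{ent.text:15} -> {ent.label_} ({spacy.explain(ent.label_)})")
# Expected entity types here include ORG, PERSON, GPE, DATE and MONEY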
OUTPUT:
PRACTICAL NO. 08
AIM:
Perform Exploratory Data Analysis (EDA) of the given text and generate a
Word Cloud.
THEORY:
Exploratory Data Analysis (EDA) is an approach to summarizing and
visualizing the important characteristics of a dataset, often used to uncover
patterns, trends, or anomalies. When applied to textual data, it involves
analyzing the frequency and distribution of words, which can reveal insights
about the text's content and structure.
One popular technique for visualizing textual data is generating a Word Cloud,
which is a visual representation where the size of each word is proportional to
its frequency in the text. The more frequently a word appears, the larger it is in
the word cloud. Word clouds provide an intuitive way to understand the most
significant words in a dataset.
EXAMPLE TEXT:
For demonstration, we will use the following sample text:
Text: "Data science is the field of study that combines domain expertise,
programming skills, and knowledge of mathematics and statistics to
extract meaningful insights from data."
EXPECTED OUTPUT:
The code will generate a Word Cloud visual that highlights important
words like "data," "science," "study," "programming," "knowledge," and
"statistics." Words that occur more frequently will appear larger and
bolder in the word cloud.
CONCLUSION:
In this practical, we performed exploratory data analysis on the given text and
visualized word frequencies as a word cloud. The word cloud makes the most frequent
and significant words immediately visible, providing an intuitive summary of the
text's content and a useful starting point for deeper analysis.
PROGRAM:
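A minimal sketch assuming the wordcloud and matplotlib packages are installed; it counts word frequencies for basic EDA and then renders the word cloud for the sample text above.

# Practical 08: simple EDA and a word cloud
from collections import Counter

import matplotlib.pyplot as plt
from wordcloud import STOPWORDS, WordCloud

text = ("Data science is the field of study that combines domain expertise, "
        "programming skills, and knowledge of mathematics and statistics to "
        "extract meaningful insights from data.")

# Basic EDA: total word count and the most frequent content words
words = [w.strip(".,").lower() for w in text.split()]
content = [w for w in words if w not in STOPWORDS]
print("Total words:", len(words))
print("Most common:", Counter(content).most_common(5))

# Word cloud: each word's size is proportional to its frequency
cloud = WordCloud(width=800, height=400, background_color="white",
                  stopwords=STOPWORDS).generate(text)
plt.imshow(cloud, interpolation="bilinear")
plt.axis("off")
plt.show()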
OUTPUT: