Morphological Colab

Uploaded by

Payal Khuspe

# Install the NLTK library

!pip install nltk

Requirement already satisfied: nltk in /usr/local/lib/python3.10/dist-packages (3.8.1)


Requirement already satisfied: click in /usr/local/lib/python3.10/dist-packages (from nltk) (8.1.7)
Requirement already satisfied: joblib in /usr/local/lib/python3.10/dist-packages (from nltk) (1.4.2)
Requirement already satisfied: regex>=2021.8.3 in /usr/local/lib/python3.10/dist-packages (from nltk) (2024.5.15)
Requirement already satisfied: tqdm in /usr/local/lib/python3.10/dist-packages (from nltk) (4.66.5)

# Importing necessary libraries from NLTK


import nltk
from nltk.tokenize import word_tokenize
from nltk.stem import PorterStemmer, WordNetLemmatizer
from nltk.corpus import wordnet, stopwords

# Downloading required NLTK packages


nltk.download('punkt') # Tokenization
nltk.download('wordnet') # WordNet for lemmatization
nltk.download('averaged_perceptron_tagger') # POS tagging
nltk.download('stopwords') # Stopwords

[nltk_data] Downloading package punkt to /root/nltk_data...


[nltk_data] Unzipping tokenizers/punkt.zip.
[nltk_data] Downloading package wordnet to /root/nltk_data...
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data] /root/nltk_data...
[nltk_data] Unzipping taggers/averaged_perceptron_tagger.zip.
[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data] Unzipping corpora/stopwords.zip.
True

# Function to perform stemming on a list of words


def steam_words(words):
    ps = PorterStemmer()
    return [ps.stem(word) for word in words]

# Function to map POS tags to WordNet tags for lemmatization
def get_wordnet_pos(word):
    """Map POS tag to the first character lemmatize() accepts"""
    # The word is tagged on its own, so sentence context is not used
    tag = nltk.pos_tag([word])[0][1][0].upper()  # Get the first character of the POS tag
    tag_dict = {"J": wordnet.ADJ,   # Adjective
                "N": wordnet.NOUN,  # Noun
                "V": wordnet.VERB,  # Verb
                "R": wordnet.ADV}   # Adverb
    return tag_dict.get(tag, wordnet.NOUN)  # Default to NOUN if no match

# Function to perform lemmatization on a list of words


def lemmatize_words(words):
    lemmatizer = WordNetLemmatizer()
    return [lemmatizer.lemmatize(word, get_wordnet_pos(word)) for word in words]
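
The mapping inside get_wordnet_pos only looks at the first letter of a Penn Treebank tag, so it can also be applied to tags produced for a whole sentence. The sketch below shows that mapping on its own, without NLTK. The names penn_to_wordnet and wordnet_pos_for_tagged are hypothetical helpers, the tags are hand-written for illustration, and the WordNet constants are written out as the single characters they stand for ("a", "n", "v", "r").

```python
# WordNet's POS constants are single characters; they are spelled out here
# so the sketch runs without NLTK (wordnet.ADJ == "a", wordnet.NOUN == "n",
# wordnet.VERB == "v", wordnet.ADV == "r").
WORDNET_POS = {"J": "a", "N": "n", "V": "v", "R": "r"}


def penn_to_wordnet(tag, default="n"):
    """Map a Penn Treebank tag such as 'VBZ' to a WordNet POS character."""
    return WORDNET_POS.get(tag[:1].upper(), default)


def wordnet_pos_for_tagged(tagged_tokens):
    """Convert (token, penn_tag) pairs into (token, wordnet_pos) pairs."""
    return [(token, penn_to_wordnet(tag)) for token, tag in tagged_tokens]


# Hand-written tags for illustration only (an assumption); in the notebook
# they could come from nltk.pos_tag(words) called on the full token list.
tagged = [("she", "PRP"), ("is", "VBZ"), ("one", "CD"),
          ("year", "NN"), ("old", "JJ")]
print(wordnet_pos_for_tagged(tagged))
```

Tags with no entry in the table, such as PRP or CD, fall back to the noun default, which matches the behaviour of get_wordnet_pos.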

# Function to preprocess the text: convert to lowercase and tokenize


def preprocess_text(text):
    text1 = text.lower()  # Convert text to lowercase
    return word_tokenize(text1)  # Tokenize the text
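
The notebook imports and downloads the stopwords corpus but never applies it. As a rough sketch of how a filtering step might follow tokenization, the helper below drops stopwords and punctuation-only tokens. The function name remove_stopwords is hypothetical, and SAMPLE_STOPWORDS is a small hand-picked set standing in for set(stopwords.words("english")), so that the example runs without NLTK data.

```python
import string

# Small hand-picked stand-in for set(stopwords.words("english")).
# This is an assumption for illustration, not the real NLTK list.
SAMPLE_STOPWORDS = {"there", "is", "in", "the", "i", "have", "a", "few", "my"}


def remove_stopwords(tokens, stop_words=SAMPLE_STOPWORDS):
    """Drop stopwords and tokens made up only of punctuation characters."""
    return [
        tok for tok in tokens
        if tok not in stop_words
        and not all(ch in string.punctuation for ch in tok)
    ]


# Lowercased tokens, as preprocess_text would return them.
tokens = ["there", "is", "one", "minute", "left", "in", "the", "game", "."]
print(remove_stopwords(tokens))
```

Hyphenated tokens such as 'one-dollar' are kept, because only tokens consisting entirely of punctuation are removed.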

# Example text for processing


text = "Adjective There is one minute left in the game. I have a few one-dollar bills in my purse. She is one year old."

# Preprocess the text by converting to lowercase and tokenizing


words = preprocess_text(text)

# Apply stemming to the tokenized words


stemmed_words = steam_words(words)

# Apply lemmatization to the tokenized words


lemmatized_words = lemmatize_words(words)

# Print the original, stemmed, and lemmatized words


print("Original Words: ", words)
print("Stemmed Words: ", stemmed_words)
print("Lemmatized Words: ", lemmatized_words)

Original Words:  ['adjective', 'there', 'is', 'one', 'minute', 'left', 'in', 'the', 'game', '.', 'i', 'have', 'a', 'few', 'one-dollar', 'bills', 'in', 'my
Stemmed Words:  ['adject', 'there', 'is', 'one', 'minut', 'left', 'in', 'the', 'game', '.', 'i', 'have', 'a', 'few', 'one-dollar', 'bill', 'in', 'my', 'pu
Lemmatized Words:  ['adjective', 'there', 'be', 'one', 'minute', 'left', 'in', 'the', 'game', '.', 'i', 'have', 'a', 'few', 'one-dollar', 'bill', 'in', 'm
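
The three printed lists are easier to compare when only the tokens that changed are shown. The sketch below does that for the first ten tokens of each list, copied from the output above. The helper name changed_tokens is hypothetical.

```python
def changed_tokens(original, stemmed, lemmatized):
    """Return (original, stem, lemma) triples where either form differs."""
    return [
        (orig, stem, lemma)
        for orig, stem, lemma in zip(original, stemmed, lemmatized)
        if stem != orig or lemma != orig
    ]


# First ten tokens of each list, copied from the printed output.
original = ["adjective", "there", "is", "one", "minute",
            "left", "in", "the", "game", "."]
stemmed = ["adject", "there", "is", "one", "minut",
           "left", "in", "the", "game", "."]
lemmatized = ["adjective", "there", "be", "one", "minute",
              "left", "in", "the", "game", "."]

for orig, stem, lemma in changed_tokens(original, stemmed, lemmatized):
    print(f"{orig:10} stem={stem:10} lemma={lemma}")
```

In this sample the stemmer truncates 'adjective' and 'minute' to forms that are not dictionary words, while the lemmatizer leaves them intact and maps 'is' to 'be'.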
