NLP Lab 4

The document outlines a procedure for implementing word prediction using the N-Gram language model. It details steps including data preprocessing, tokenization, n-gram generation, probability calculation, and model testing in Python.

EX. NO : 4        WORD PREDICTION USING N-GRAM MODELING

DATE :

AIM :
To implement word prediction using the N-Gram language model.

ALGORITHM :
STEP 1: Data Preprocessing: Collect and clean the text dataset by removing special characters and extra spaces, and by converting the text to lowercase.
STEP 2: Tokenization: Split the text into individual words (tokens).
STEP 3: Building N-Grams: Generate n-grams (bigrams, trigrams, etc.) from the tokenized text.
STEP 4: Probability Calculation: Compute the conditional probability of each word given the previous words from frequency counts (see the sketch after this list).
STEP 5: Prediction Model: Given an input sequence, select the most probable next word by maximum likelihood estimation.
STEP 6: Smoothing (if needed): Apply a smoothing technique such as Laplace smoothing to handle unseen n-grams (also illustrated in the sketch below).
STEP 7: Implementation & Testing: Implement the model in Python and test its predictions on different input sequences.
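
The sketch below illustrates STEP 4 and STEP 6 on a tiny hand-built corpus: conditional probabilities are estimated by maximum likelihood from bigram counts, and Laplace (add-one) smoothing gives unseen bigrams a small non-zero probability. The corpus and the helper names mle_prob and laplace_prob are illustrative, not part of the prescribed procedure.

from collections import Counter

tokens = "the future of ai is the future of technology".split()
bigrams = list(zip(tokens, tokens[1:]))
bigram_counts = Counter(bigrams)
unigram_counts = Counter(tokens)
vocab_size = len(unigram_counts)

def mle_prob(prev_word, word):
    # STEP 4: P(word | prev_word) = count(prev_word, word) / count(prev_word)
    if unigram_counts[prev_word] == 0:
        return 0.0
    return bigram_counts[(prev_word, word)] / unigram_counts[prev_word]

def laplace_prob(prev_word, word):
    # STEP 6: add one pseudo-count to every bigram so that
    # unseen pairs receive a small non-zero probability
    return (bigram_counts[(prev_word, word)] + 1) / (unigram_counts[prev_word] + vocab_size)

print(mle_prob("future", "of"))      # 1.0   ("of" always follows "future" here)
print(mle_prob("future", "is"))      # 0.0   (unseen bigram under MLE)
print(laplace_prob("future", "is"))  # 0.125 (non-zero after smoothing)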

PROGRAM:
import nltk
from nltk.util import ngrams
from collections import Counter

nltk.download('punkt')  # tokenizer models used by word_tokenize

# Sample training corpus
text = """Artificial intelligence is the future of technology.
The future of AI is bright and promising.
AI is transforming the world with automation and intelligence."""

# Lowercase, tokenize, and count bigram frequencies
tokens = nltk.word_tokenize(text.lower())
bigrams = list(ngrams(tokens, 2))
bigram_freq = Counter(bigrams)

def predict_next_word(prev_word):
    # Collect every word observed after prev_word, with its bigram count
    candidates = {pair[1]: freq for pair, freq in bigram_freq.items()
                  if pair[0] == prev_word}
    if not candidates:
        return "No prediction available"
    # Return the most frequent successor (maximum likelihood choice)
    return max(candidates, key=candidates.get)

print("Predicted word after 'the':", predict_next_word("the"))
print("Predicted word after 'ai':", predict_next_word("ai"))
print("Predicted word after 'future':", predict_next_word("future"))

OUTPUT :
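
Tracing the bigram counts in the sample corpus ('the' is followed by 'future' twice and 'world' once, 'ai' by 'is' twice, and 'future' by 'of' twice), the script should print:

Predicted word after 'the': future
Predicted word after 'ai': is
Predicted word after 'future': of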

RESULT :
The N-Gram model successfully predicts the next word based on the preceding word sequence.
