
NLP Assignment

Name: Ridhyal Chauhan

Registration No: RA2211056010047


Problem-1: Continuous Bag of Words (CBOW)

(a) List of Target Words for Each Context Window


Given the sentence: "The quick brown fox jumps over the lazy dog."

With a context window of size 2 (here: two context words in total, one on each side), the target words and their context windows are listed below. Edge words with incomplete windows ("The" and "dog") are omitted; a short Python sketch follows the table.

Context Window    Target Word
[The, brown]      quick
[quick, fox]      brown
[brown, jumps]    fox
[fox, over]       jumps
[jumps, the]      over
[over, lazy]      the
[the, dog]        lazy
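
For illustration, these pairs can be generated with a few lines of Python. This is a minimal sketch: the helper name `build_cbow_pairs` is hypothetical, and unlike the table above it also emits the edge words "The" and "dog" with their truncated one-word contexts.

```python
# Minimal sketch: build (context, target) pairs with one context word
# on each side. `build_cbow_pairs` is an illustrative helper, not a
# library function.
def build_cbow_pairs(tokens, half_window=1):
    pairs = []
    for i, target in enumerate(tokens):
        # Context = words within `half_window` positions of the target.
        context = [tokens[j]
                   for j in range(max(0, i - half_window),
                                  min(len(tokens), i + half_window + 1))
                   if j != i]
        pairs.append((context, target))
    return pairs

sentence = "The quick brown fox jumps over the lazy dog".split()
for context, target in build_cbow_pairs(sentence):
    print(context, "->", target)
```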

(b) How CBOW Works


The CBOW model predicts a target word from the surrounding context words. The steps involved are (a runnable sketch follows the list):

1. **Input Representation**: Each context word is converted into a one-hot encoded vector, which selects that word's row (its embedding) from the input weight matrix.
2. **Averaging the Context Vectors**: The embeddings of the context words are averaged (or summed) to form a single projection vector.
3. **Feeding into a Neural Network**: This averaged vector is passed through a shallow network, usually a single hidden (projection) layer with no nonlinearity.
4. **Output Layer (Softmax Function)**: The network outputs a probability for every word in the vocabulary, and the most probable word is taken as the predicted target.
5. **Backpropagation & Training**: The model adjusts its weights based on the prediction error, improving accuracy over many iterations.
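
The following is a minimal NumPy sketch of a single (untrained) forward pass through steps 1-4; the toy vocabulary, dimensions, and random weights are assumptions made purely for illustration.

```python
import numpy as np

# Minimal sketch of one CBOW forward pass (steps 1-4 above).
rng = np.random.default_rng(0)
vocab = ["the", "quick", "brown", "fox", "jumps", "over", "lazy", "dog"]
V, D = len(vocab), 5                   # vocabulary size, embedding size
W_in = rng.normal(size=(V, D))         # input embeddings (one row per word)
W_out = rng.normal(size=(D, V))        # output projection weights

word_to_id = {w: i for i, w in enumerate(vocab)}
context = ["the", "brown"]             # trying to predict the target "quick"

# Steps 1-2: look up the context embeddings and average them.
h = W_in[[word_to_id[w] for w in context]].mean(axis=0)

# Steps 3-4: project to vocabulary scores and apply softmax.
scores = h @ W_out
probs = np.exp(scores - scores.max())
probs /= probs.sum()

print(vocab[int(np.argmax(probs))])    # weights are untrained, so the
                                       # prediction here is arbitrary

# Step 5 (training) would compute the cross-entropy loss against the
# true target and backpropagate into W_in and W_out; omitted here.
```

In practice, word2vec replaces the full softmax in step 4 with hierarchical softmax or negative sampling to stay tractable on large vocabularies.
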
Problem-2: Skip-gram vs GloVe Models

(a) Skip-gram Model (Word2Vec) Processing the Sentence


Given the sentence: "Natural language processing is amazing."

The Skip-gram model predicts context words given a target word (a short sketch follows the table below). The steps are:

1. Target Word Selection: A word is chosen as the center (target) word.
2. Context Window Definition: With a window size of 2, the model considers up to two words before and up to two words after the target word.
3. Prediction Pairs Generation: The model generates one training pair per context word, in the form (target word, context word).

For example, with a window size of 2, the Skip-gram model generates pairs like:

Target Word    Context Words
Natural        (language, processing)
language       (Natural, processing, is)
processing     (Natural, language, is, amazing)
is             (language, processing, amazing)
amazing        (processing, is)
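
A minimal sketch that enumerates these skip-gram training pairs with a window of two words on each side; `skipgram_pairs` is an illustrative name, not a library API.

```python
# Minimal sketch: yield (target, context) pairs with a +/-2 word window,
# matching the table above.
def skipgram_pairs(tokens, window=2):
    for i, target in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                yield (target, tokens[j])

sentence = "Natural language processing is amazing".split()
for pair in skipgram_pairs(sentence):
    print(pair)  # e.g. ('Natural', 'language'), ('Natural', 'processing'), ...
```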

(b) Difference Between Skip-gram and GloVe


The Skip-gram and GloVe models differ in their approach to learning word embeddings.

1. Skip-gram Model (Word2Vec)


- Predicts context words given a target word.
- Trained using local context windows.
- Maximizes the probability of seeing correct context words for a target word.
- Performs well with small datasets and infrequent words.

2. GloVe Model

- Builds a global word co-occurrence matrix instead of predicting context words directly.
- Trained on corpus-wide co-occurrence statistics rather than local context windows.
- Factorizes the (log) co-occurrence matrix to capture word relationships.
- Requires a large corpus for effective training.

In summary, Skip-gram is a **predictive model**, while GloVe is a **count-based model**: Skip-gram learns embeddings through context prediction, whereas GloVe captures word relationships by analyzing word co-occurrences across the entire corpus.
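
This contrast is easiest to see in the two standard training objectives (from Mikolov et al., 2013, and Pennington et al., 2014). Skip-gram maximizes the average log probability of the observed context words:

$$\frac{1}{T}\sum_{t=1}^{T}\;\sum_{\substack{-c \le j \le c \\ j \ne 0}} \log p(w_{t+j} \mid w_t)$$

while GloVe minimizes a weighted least-squares loss over the global co-occurrence counts $X_{ij}$:

$$J = \sum_{i,j=1}^{V} f(X_{ij})\,\bigl(w_i^{\top}\tilde{w}_j + b_i + \tilde{b}_j - \log X_{ij}\bigr)^2$$

where $f$ down-weights very frequent co-occurrences.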
