Lecture 02 - NLU concepts

The document covers key concepts in Natural Language Understanding (NLU) for Conversational AI, focusing on text pre-processing techniques such as tokenization, stemming, lemmatization, and case normalization. It also discusses Named Entity Recognition (NER), Part of Speech (POS) tagging, sentiment analysis, and Text-to-Speech synthesis, highlighting their importance and applications in NLP. Various tools and methods for implementing these techniques are provided, emphasizing their role in enhancing model performance and user interaction.

NLU concepts for Conversational AI

Lecture 2

BITS Pilani
Pilani Campus
Session Content

• Importance of text pre-processing


• Overview of prior learnings in NLP
• Stemming
• Tokenization
• POS Tagging
• Text to Speech
◦ Understanding the basic layers of TTS algorithms
◦ How different algorithms work
◦ Code and examples via different libraries
◦ Importance of TTS synthesis, ASR, etc.



Importance of text pre-processing
• Text pre-processing is a crucial step in Natural Language Processing (NLP) that involves
cleaning and preparing text data for analysis.

• It includes techniques such as:


• Tokenization
• Stop Word Removal
• Stemming
• Lemmatization
• Text Cleaning

Importance: These steps help in converting raw text into a more manageable and
analyzable format, facilitating better model performance and more accurate results.



Tokenization
• Tokenization is the process of splitting text into smaller units called tokens, which can be
words, phrases, or symbols.

• Example:
• Input: "Natural Language Processing is fascinating."
• Output: ["Natural", "Language", "Processing", "is", "fascinating"]

• Tools
• NLTK: A leading platform for building Python programs to work with human language data.
• SpaCy: An open-source software library for advanced NLP in Python.
• Tokenizer APIs: Available in various programming languages and platforms.

• Importance: Tokenization is the first step in the text pre-processing pipeline and is crucial
for the performance of subsequent steps.
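
• Example code (a minimal sketch, assuming NLTK is installed and its "punkt" tokenizer data can be downloaded):

```python
# Word tokenization with NLTK.
import nltk
nltk.download("punkt", quiet=True)  # one-time download of the tokenizer model
from nltk.tokenize import word_tokenize

text = "Natural Language Processing is fascinating."
print(word_tokenize(text))
# ['Natural', 'Language', 'Processing', 'is', 'fascinating', '.']
```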



Tokenization
Inconsistent Data Representation: Without tokenization, text data remains as large, unmanageable
strings, making it difficult to analyze and process.

Model Performance: Machine learning models require numerical or categorical input. Without
tokenization, converting text into a suitable format is impossible, leading to poor model performance.

Difficulty in Feature Extraction: Tokenization allows for extracting features such as word
frequencies, n-grams, and more. Skipping this step hinders effective feature extraction.

Ineffective Text Cleaning: Tokenization is often the first step in text cleaning. Without it, removing
stop words, punctuation, and performing stemming/lemmatization becomes challenging.

Error Propagation: Errors in initial steps propagate through the pipeline, leading to inaccuracies in
tasks like sentiment analysis, NER, and POS tagging.



Stop-Words
• Stop words are common words that are usually ignored in text processing because they do not
carry significant meaning.
• Examples: Common stop words include "a", "the", "and", "in", "to", etc.

• Purpose:
• Noise Reduction: Removing stop words helps in reducing the noise in the text data.
• Efficiency: It reduces the size of the text data, making processing faster and more efficient.

• Tools:
• NLTK: Provides a predefined list of stop words and functions for their removal.
• SpaCy: Offers built-in support for stop word removal in various languages.

• Example:
• Input: "The quick brown fox jumps over the lazy dog."
• Output: "quick brown fox jumps lazy dog"

• Importance: Removing stop words helps in focusing on the words that are more likely to be
significant in the analysis, thereby improving the performance of NLP models.
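
• Example code (a sketch using NLTK's predefined stop-word list; assumes the "stopwords" and "punkt" data have been downloaded):

```python
# Stop word removal with NLTK.
import nltk
nltk.download("stopwords", quiet=True)
nltk.download("punkt", quiet=True)
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize

stop_words = set(stopwords.words("english"))
tokens = word_tokenize("The quick brown fox jumps over the lazy dog.")
filtered = [t for t in tokens if t.lower() not in stop_words and t.isalpha()]
print(filtered)  # ['quick', 'brown', 'fox', 'jumps', 'lazy', 'dog']
```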



Stemming and Lemmatization
• Stemming: Reduces words to their base or root form by removing suffixes. It may not
always produce a real word.

• Lemmatization: Converts words to their base form (lemma) using morphological analysis. It
always returns a valid word.

• Examples:
• Stemming: "running" -> "run", "jumps" -> "jump"
• Lemmatization: "better" -> "good", "running" -> "run"

• Tools:
• NLTK:
• PorterStemmer for stemming
• WordNetLemmatizer for lemmatization
• SpaCy: Offers built-in lemmatization capabilities.
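
• Example code (a sketch comparing NLTK's PorterStemmer and WordNetLemmatizer; assumes the "wordnet" corpus is available):

```python
# Stemming vs. lemmatization with NLTK.
import nltk
nltk.download("wordnet", quiet=True)
from nltk.stem import PorterStemmer, WordNetLemmatizer

stemmer = PorterStemmer()
lemmatizer = WordNetLemmatizer()

print(stemmer.stem("running"))                   # run
print(stemmer.stem("jumps"))                     # jump
print(lemmatizer.lemmatize("running", pos="v"))  # run
print(lemmatizer.lemmatize("better", pos="a"))   # good
```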



Stemming and Lemmatization
• Differences:
• Accuracy: Lemmatization is generally more accurate than stemming.
• Complexity: Stemming is simpler and faster but less accurate.

• Use Case: Choose stemming for quick and dirty text processing; use lemmatization for
tasks requiring higher accuracy.

• Importance: Both techniques help in normalizing words to their base forms, which reduces
the dimensionality of the text data and improves the performance of NLP models.

• Problems if we don’t do stemming or lemmatization:


• High dimensionality
• Inconsistent Data
• Reduced model performance and difficulty in text analysis
• Lower accuracy in search and retrieval



Case Normalisation
• Case normalization is the process of converting all characters in the text to a uniform case, either lower case or
upper case, to ensure consistency.

• Purpose:
• Consistency: Ensures that words are treated equally regardless of their case.
• Reduction of Redundancy: Helps in reducing redundancy by treating "Apple" and "apple" as the same word.
• Example:
• Input: "Natural Language Processing"
• Output: "natural language processing"
• Tools:
• Python String Methods: .lower() and .upper()
• NLTK: Provides functions for case normalization.
• SpaCy: Built-in support for case normalization.
• Importance:
• Improves Text Processing: Case normalization simplifies text processing by reducing the number of unique tokens.
• Enhances Model Performance: Models become more efficient as they deal with fewer variations of the same word.

• Note: Case normalization is particularly useful when the case of the text does not carry significant meaning for the
analysis.
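
• Example code (a trivial sketch using Python's built-in string method):

```python
# Case normalization with str.lower().
text = "Natural Language Processing"
print(text.lower())  # natural language processing
```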



Text Cleaning
• Text cleaning involves removing unwanted elements from the text to make it suitable for analysis.
• Common Techniques:
• Removing Punctuation: Eliminating characters such as periods, commas, and exclamation marks.
• Removing Special Characters: Removing symbols like #, $, %, etc.
• Removing Numbers: Excluding digits unless they are relevant to the analysis.
• Removing HTML Tags: Stripping HTML content from web-scraped text.
• Handling Contractions: Expanding contractions (e.g., "don't" to "do not").

• Tools:
• Regular Expressions (Regex): Powerful for pattern matching and substitution.
• NLTK: Provides functions for various text cleaning tasks.
• SpaCy: Built-in functions for text cleaning.

• Example:
• Input: "Hello, world! Visit us at https://example.com #NLP"
• Output: "Hello world Visit us at example com NLP"

• Importance:
• Enhances Data Quality: Cleaned text is more consistent and easier to analyze.
• Improves Model Accuracy: Cleaner data leads to better-performing models by reducing noise and irrelevant information.
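
• Example code (a sketch using regular expressions; note that here the URL is dropped entirely, a slightly different choice than the example above):

```python
# Regex-based text cleaning: strip URLs, punctuation/special characters, and extra whitespace.
import re

def clean_text(text: str) -> str:
    text = re.sub(r"https?://\S+", " ", text)    # remove URLs
    text = re.sub(r"[^A-Za-z0-9\s]", " ", text)  # remove punctuation and special characters
    text = re.sub(r"\s+", " ", text)             # collapse repeated whitespace
    return text.strip()

print(clean_text("Hello, world! Visit us at https://example.com #NLP"))
# Hello world Visit us at NLP
```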



Named Entity Recognition (NER)
• Named Entity Recognition (NER) is a subtask of information extraction that identifies and classifies named
entities in text into predefined categories such as person names, organizations, locations, dates, etc.
• Categories:
• Person: Names of people (e.g., "John Doe")
• Organization: Names of organizations (e.g., "Google")
• Location: Geographical locations (e.g., "Paris")
• Date/Time: Dates and times (e.g., "January 1, 2020")
• Others: Money, percentages, etc.
• Purpose:
• Information Extraction: Helps in extracting structured information from unstructured text.
• Improved Search and Retrieval: Enhances the performance of search engines and information retrieval systems.
• Tools:
• SpaCy: Provides a pre-trained NER model and tools for custom training.
• NLTK: Offers NER capabilities through its chunking module.
• Stanford NER: A widely used NER tool developed by Stanford University.
Importance: NER is essential for understanding the context and extracting relevant information from large volumes
of text, making it a key component in various NLP applications.



Named Entity Recognition (NER)
Rule-based Approaches: Use predefined rules and patterns to identify
entities.
• Example: Regular expressions to identify dates and email addresses.
• Pros: Simple to implement and interpret.
• Cons: Limited flexibility and scalability.

Machine Learning Approaches: Use statistical models trained on labeled data to identify entities.
• Conditional Random Fields (CRF)
• Hidden Markov Models (HMM)
• Pros: More flexible and accurate than rule-based approaches.
• Cons: Require labeled training data and computational resources.



Named Entity Recognition (NER)
• Deep Learning Approaches: Use neural networks to automatically learn features and
patterns from data
• Recurrent Neural Networks (RNN)
• Long Short-Term Memory (LSTM)
• Transformer-based models (e.g., BERT)
• Pros: High accuracy and ability to capture complex patterns.
• Cons: Require large datasets and significant computational power.
• Example:
• Input: "Apple is looking at buying U.K. startup for $1 billion."
• Output: Entities: [("Apple", "ORG"), ("U.K.", "LOC"), ("$1 billion", "MONEY")]

Importance:
Choosing the right technique depends on the specific requirements of the task, available
data, and resources. Machine learning and deep learning approaches are preferred for
their accuracy and scalability.
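
Example code (a sketch with spaCy's small English model; assumes it was installed via `python -m spacy download en_core_web_sm`; note that spaCy labels "U.K." as GPE rather than LOC):

```python
# Named Entity Recognition with spaCy.
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Apple is looking at buying U.K. startup for $1 billion.")
print([(ent.text, ent.label_) for ent in doc.ents])
# [('Apple', 'ORG'), ('U.K.', 'GPE'), ('$1 billion', 'MONEY')]
```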



Part of Speech Tagging (POS)
• Part-of-Speech (POS) tagging is the process of assigning a part of speech to each word in a sentence. The
parts of speech include nouns, verbs, adjectives, adverbs, pronouns, conjunctions, prepositions, and
interjections

• Understanding Syntax: Helps in understanding the grammatical structure of sentences.


• Disambiguation: Resolves ambiguities by providing context to words (e.g., "book" as a noun vs. "book" as a verb).

Example: "The quick brown fox jumps over the lazy dog."
• POS Tags:
The (DT) quick (JJ) brown (JJ) fox (NN) jumps (VBZ) over (IN) the (DT) lazy (JJ) dog (NN)
Tools:
• NLTK: Provides a comprehensive POS tagging module.
• SpaCy: Offers efficient and accurate POS tagging capabilities.
• Stanford POS Tagger: A robust tool developed by Stanford University.

Importance:
POS tagging is fundamental for many NLP tasks such as parsing, text-to-speech conversion, and information
extraction. It enables a deeper understanding of the syntactic and semantic properties of text.



Part of Speech Tagging (POS)

• Rule-based Tagging: Uses a set of hand-crafted rules to assign POS tags.


• Example: "If a word ends in 'ly', tag it as an adverb (RB)."
• Statistical Tagging: Uses probabilistic models based on the likelihood of a word's POS tag given its context.
• Hidden Markov Models (HMM)
• Maximum Entropy Models
• Machine Learning Approaches: Use supervised learning techniques to predict POS tags.
• Support Vector Machines (SVM)
• Conditional Random Fields (CRF)
• Deep Learning Approaches: Use neural networks to automatically learn features from data.
• Recurrent Neural Networks (RNN)
• Long Short-Term Memory (LSTM)
• Transformer-based models (e.g., BERT)
Example:
• Sentence: "The cat sat on the mat."
• POS Tags: The (DT) cat (NN) sat (VBD) on (IN) the (DT) mat (NN)
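
Example code (a sketch with NLTK's averaged perceptron tagger; the tagger and tokenizer data must be downloaded once):

```python
# POS tagging with NLTK.
import nltk
nltk.download("punkt", quiet=True)
nltk.download("averaged_perceptron_tagger", quiet=True)

tokens = nltk.word_tokenize("The cat sat on the mat.")
print(nltk.pos_tag(tokens))
# [('The', 'DT'), ('cat', 'NN'), ('sat', 'VBD'), ('on', 'IN'), ('the', 'DT'), ('mat', 'NN'), ('.', '.')]
```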



Sentiment Analysis
• Sentiment analysis, also known as opinion mining, is the process of
determining the sentiment expressed in a piece of text. It classifies
text into positive, negative, or neutral sentiments.
Understanding Public Opinion: Helps businesses understand customer opinions and
feedback.
Market Research: Analyzing trends and opinions about products, services, or events.
Social Media Monitoring: Tracking sentiment trends on social media platforms.
Applications:
Customer Feedback Analysis: Understanding customer satisfaction and improving
services.
Brand Monitoring: Tracking public sentiment towards a brand.
Political Sentiment: Analyzing public opinion on political issues or candidates.
Product Reviews: Assessing the sentiment in product reviews to gauge consumer
reactions.



Sentiment Analysis
Techniques:
Lexicon-based Methods: Use predefined lists of positive and negative words.
Machine Learning Approaches: Train classifiers (e.g., SVM, Naive Bayes) on labeled data.
Deep Learning Approaches: Use neural networks (e.g., LSTM, BERT) for higher accuracy.

Example:
Input: "I love the new design of your website!"
Output: Positive

Importance:
Sentiment analysis provides valuable insights into the emotions and opinions
expressed in text, enabling better decision-making and strategy formulation.
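
Example code (a lexicon-based sketch using NLTK's VADER analyzer; the "vader_lexicon" data must be downloaded once, and the ±0.05 thresholds are a common convention, not the only choice):

```python
# Lexicon-based sentiment analysis with VADER.
import nltk
nltk.download("vader_lexicon", quiet=True)
from nltk.sentiment import SentimentIntensityAnalyzer

sia = SentimentIntensityAnalyzer()
scores = sia.polarity_scores("I love the new design of your website!")
compound = scores["compound"]
label = "Positive" if compound > 0.05 else "Negative" if compound < -0.05 else "Neutral"
print(scores, label)  # the compound score is strongly positive for this sentence
```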



Text to Speech Synthesis
• Text-to-Speech (TTS) synthesis is the process of converting written text into spoken words using
computational methods.
• Purpose:
• Accessibility: Provides access to information for visually impaired users.
• User Experience: Enhances user interaction in virtual assistants and conversational agents.
• Automation: Automates tasks that require reading text, such as news reading or announcements.
• Components:
• Text Analysis:
• Tokenization: Breaking text into smaller units such as sentences and words.
• Linguistic Analysis: Determining the part of speech, phonetic transcription, and prosody (intonation, stress, rhythm).
• Acoustic Modeling:
• Phoneme Synthesis: Generating speech sounds based on phonetic transcription.
• Prosody Generation: Adding natural intonation, stress, and rhythm to synthesized speech.
• Speech Synthesis:
• Concatenative Synthesis: Combining pre-recorded speech segments.
• Formant Synthesis: Using mathematical models to generate speech sounds.
• Waveform Synthesis: Using neural networks to generate high-quality, natural-sounding speech.



Text to Speech Synthesis
• Tools:
• Google Text-to-Speech: A widely used TTS engine with high-quality
voices.
• Amazon Polly: A cloud-based service that converts text into lifelike
speech.
• IBM Watson Text to Speech: Provides a range of customizable voices.
• Open Source Tools: eSpeak, Festival, and MaryTTS
• Example:
• Input: "Welcome to the world of Text-to-Speech synthesis."
• Output: (Spoken audio)
• Importance: TTS synthesis plays a crucial role in improving
accessibility and enhancing user experiences across various
applications and devices.
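
• Example code (a sketch using the third-party gTTS package, which calls Google's TTS service over the network; the output file name is arbitrary):

```python
# Text-to-Speech with gTTS: write spoken audio to an MP3 file.
from gtts import gTTS

tts = gTTS("Welcome to the world of Text-to-Speech synthesis.", lang="en")
tts.save("welcome.mp3")
```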



Text Generation Techniques
Text generation involves creating coherent and contextually relevant text from a given input or set of rules.

Key Techniques:
Rule-based Systems: Use predefined rules and templates to generate text.
Example: Fill-in-the-blank templates for automated report generation.
Markov Chains: Use probabilistic models based on the likelihood of word sequences.
Example: Generating text by predicting the next word based on the previous one.
Recurrent Neural Networks (RNNs): Use neural networks with loops to maintain context over sequences.
Example: Generating poetry or short stories.
Long Short-Term Memory Networks (LSTMs): A type of RNN designed to better handle long-term dependencies.
Example: Generating more coherent paragraphs and articles.
Transformer Models: Use self-attention mechanisms to capture long-range dependencies in text.
Example: GPT-3 generating articles, stories, and dialogue.
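
Example code (a toy first-order Markov chain generator over a tiny hard-coded corpus, for illustration only):

```python
# Minimal Markov-chain text generation: predict the next word from the previous one.
import random
from collections import defaultdict

corpus = "the cat sat on the mat the cat ran after the dog".split()

transitions = defaultdict(list)           # word -> words observed to follow it
for current, nxt in zip(corpus, corpus[1:]):
    transitions[current].append(nxt)

def generate(start, length=8):
    word, output = start, [start]
    for _ in range(length - 1):
        followers = transitions.get(word)
        if not followers:
            break
        word = random.choice(followers)
        output.append(word)
    return " ".join(output)

print(generate("the"))                    # e.g. "the cat sat on the mat the cat"
```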



State-based and Rule-based Dialogue Systems
Dialogue systems designed to follow specific states or rules to manage interactions with
users.
State-based Dialogue Systems : Systems that manage conversations using predefined
states and transitions between those states.
Rule-based Dialogue Systems: Systems that use predefined rules and templates to
generate responses and manage interactions.
Examples:
If user says "Hello", respond with "Hi, how can I help you today?"
If user asks for account balance, respond with "Please provide your account number."
Components:
Rule Engine: Processes user input based on predefined rules.
Template Manager: Uses templates to generate responses.
Context Manager: Maintains context of the conversation to apply relevant rules.
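
Example code (a minimal rule-engine sketch with hypothetical patterns and response templates, matching the examples above):

```python
# Rule-based dialogue: the first matching rule produces the response.
import re

RULES = [
    (re.compile(r"\b(hello|hi)\b", re.I), "Hi, how can I help you today?"),
    (re.compile(r"\baccount balance\b", re.I), "Please provide your account number."),
]

def respond(user_input):
    for pattern, template in RULES:
        if pattern.search(user_input):
            return template
    return "Sorry, I did not understand that."    # fallback when no rule matches

print(respond("Hello there"))                     # Hi, how can I help you today?
print(respond("What is my account balance?"))     # Please provide your account number.
```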



State-based and Rule-based Dialogue Systems
State-based Dialogue Systems

[Diagram: the USER's utterance goes to the NLU module, which extracts intents & slots; the DIALOG MANAGER consults the STATE MACHINE and a DATABASE (personal or public information) and drives the ACTION MANAGER, with NLG producing the response back to the user.]


Slot Filling & Intent Recognition
Intent Recognition: The process of identifying the goal or purpose behind a user's input in a conversation.
Purpose: Helps in understanding user requests and guiding the conversation appropriately.

Example Intents:
Booking a flight
Slot Filling: Extracting specific pieces of information (slots) from the user's input that are necessary to complete the intent.
Purpose: Provides detailed information required to fulfill the user's request.
Example
Slots for Flight Booking:
Destination
Departure Date
Return Date
Number of Passengers



Slot Filling & Intent Recognition
Components:
Natural Language Understanding (NLU) Module: Processes user input to recognize intents and extract slots.
Dialogue Manager: Uses recognized intents and filled slots to manage the conversation flow and fulfill the user’s
request.

Techniques:
Rule-based Methods: Use predefined patterns and templates to recognize intents and extract slots.
Machine Learning Approaches: Train classifiers on labeled datasets to predict intents and extract slots.
Deep Learning Approaches: Use neural networks, particularly sequence-to-sequence models, to handle more complex and varied
inputs.
Example:
User Input: "I want to book a flight to New York on June 5th."
Recognized Intent: Book Flight
Extracted Slots:
Destination: New York
Departure Date: June 5th

A popular slot filling dataset: https://github.com/howl-anderson/ATIS_dataset/blob/master/README.en-US.md
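
Example code (a rule-based sketch for the flight-booking example; the regex patterns and slot names are illustrative assumptions, not a production NLU module):

```python
# Rule-based intent recognition and slot filling.
import re

def parse(utterance):
    result = {"intent": None, "slots": {}}
    if re.search(r"\bbook\b.*\bflight\b", utterance, re.I):
        result["intent"] = "BookFlight"
    dest = re.search(r"\bto ([A-Z][A-Za-z ]+?)(?: on |\.|$)", utterance)
    date = re.search(r"\bon (\w+ \d{1,2}(?:st|nd|rd|th)?)", utterance)
    if dest:
        result["slots"]["Destination"] = dest.group(1).strip()
    if date:
        result["slots"]["Departure Date"] = date.group(1)
    return result

print(parse("I want to book a flight to New York on June 5th."))
# {'intent': 'BookFlight', 'slots': {'Destination': 'New York', 'Departure Date': 'June 5th'}}
```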



Slot filling & Intent Recognition
Pre-processing - Input Sentence: "Set an alarm for 6:30 AM tomorrow."
• Tokenizing: "Set", "an", "alarm", "for", "6:30", "AM", "tomorrow".
• Filtering Unwanted Words: "Set", "alarm", "6:30", "AM", "tomorrow".
• Gold Tags: "Set" (None), "alarm" (None), "6:30" (B-time), "AM" (I-time), "tomorrow" (B-date).
• Labeling Intents: Assigning the intent "SetAlarm" to the sentence.
Model Training:
• Training for Slot Extraction: The model learns from the preprocessed data to understand patterns and
relationships.
• Training for Intent Recognition: The model learns from sentences labeled with intents like "SetAlarm",
"BookFlight", "GetWeather".



Slot filling & Intent Recognition
Model Prediction:
For a new sentence - "Remind me to call the doctor at 8 PM today and send an email at 10 AM tomorrow".
• Prediction of Slots: "Remind" (None), "me" (None), "to" (None), "call" (None), "the" (None), "doctor" (None),
"at" (None), "8" (B-time), "PM" (I-time), "today" (B-date), "and" (None), "send" (None), "an" (None), "email"
(None), "at" (None), "10" (B-time), "AM" (I-time), "tomorrow" (B-date).
• Prediction of Intent: "SetReminder" and "SendEmail".
Postprocessing:
• Normalizing Slots: Convert "8 PM" to "20:00", "10 AM" to "10:00", and "today" and "tomorrow" to their respective
dates.
• Normalizing Intents: Check that the predicted intents "SetReminder" and "SendEmail" are in a standard format (e.g., map "BookFlight" or "book_flight" to BookFlight).
• Filtering Incorrect Cases: Check that "at" is not tagged as "B-time" and that the intents are correctly identified.
• Identifying Obvious Cases: Check that "8 PM" and "10 AM" are correctly tagged as "B-time"/"I-time" and that clear-case intents like "alarm" are correctly identified as "SetAlarm".

Slots: Time 1: 20:00, Date 1: 2024-11-30, Time 2: 10:00, Date 2: 2024-12-01

Intents Recognized: SetReminder and SendEmail



Normalization techniques
➢ Date Normalization: Converting various date formats into a standard format.
   Input: "12th December 2024", "12/12/2024", "December 12, 2024".
   Normalized: "2024-12-12".
➢ Time Normalization: Converting different time formats into a 24-hour format.
   Input: "7 PM", "07:00 PM", "19:00".
   Normalized: "19:00".
➢ Currency Normalization: Standardizing currency representations.
   Input: "5000 rupees", "₹5000", "INR 5000".
   Normalized: "INR 5000".
➢ Phone Number Normalization: Converting various phone number formats into a standard international format.
   Input: "9876xx3210", "09876xx3210".
   Normalized: "+91-9876xx3210".
➢ Address Normalization: Standardizing address components.
   Input: "123 MG Road, Bangalore", "123, MG Rd, Bengaluru".
   Normalized: "123 MG Road, Bangalore".
➢ Text Case Normalization: Converting text to a consistent case (e.g., all lowercase).
   Input: "Hello World", "HELLO WORLD", "hello world".
   Normalized: "hello world".
➢ Whitespace Normalization: Removing extra spaces and ensuring consistent spacing.
   Input: variants of "Hello World" with extra or inconsistent spaces.
   Normalized: "Hello World".
➢ Synonym Normalization: Converting synonyms to a standard term.
   Input: "car", "vehicle".
   Normalized: "car".

