Lecture 02 - NLU concepts

The document covers key concepts in Natural Language Understanding (NLU) for Conversational AI, focusing on text pre-processing techniques such as tokenization, stemming, lemmatization, and case normalization. It also discusses Named Entity Recognition (NER), Part of Speech (POS) tagging, sentiment analysis, and Text-to-Speech synthesis, highlighting their importance and applications in NLP. Various tools and methods for implementing these techniques are provided, emphasizing their role in enhancing model performance and user interaction.

NLU concepts for Conversational AI

Lecture 2

BITS Pilani
Pilani Campus
Session Content

• Importance of text pre-processing


• Overview of prior learnings in NLP
• Stemming
• Tokenization
• POS Tagging
• Text to Speech
◦ Understanding the basic layers of TTS algorithms
◦ How different algorithms work
◦ Code and examples via different libraries
◦ Importance of TTS synthesis, ASR, etc.



Importance of text pre-processing
• Text pre-processing is a crucial step in Natural Language Processing (NLP) that involves
cleaning and preparing text data for analysis.

• It includes techniques such as:


• Tokenization
• Stop Word Removal
• Stemming
• Lemmatization
• Text Cleaning

Importance: These steps help in converting raw text into a more manageable and
analyzable format, facilitating better model performance and more accurate results.



Tokenization
• Tokenization is the process of splitting text into smaller units called tokens, which can be
words, phrases, or symbols.

• Example:
• Input: "Natural Language Processing is fascinating."
• Output: ["Natural", "Language", "Processing", "is", "fascinating"]

• Tools
• NLTK: A leading platform for building Python programs to work with human language data.
• SpaCy: An open-source software library for advanced NLP in Python.
• Tokenizer APIs: Available in various programming languages and platforms.

• Importance: Tokenization is the first step in the text pre-processing pipeline and is crucial
for the performance of subsequent steps.
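
• Example code (a minimal sketch, assuming NLTK is installed and its "punkt" tokenizer data can be downloaded):

```python
# Word tokenization with NLTK.
import nltk
nltk.download("punkt", quiet=True)  # one-time download of the tokenizer model
from nltk.tokenize import word_tokenize

text = "Natural Language Processing is fascinating."
print(word_tokenize(text))
# ['Natural', 'Language', 'Processing', 'is', 'fascinating', '.']
```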



Tokenization
Inconsistent Data Representation: Without tokenization, text data remains as large, unmanageable
strings, making it difficult to analyze and process.

Model Performance: Machine learning models require numerical or categorical input. Without
tokenization, converting text into a suitable format is impossible, leading to poor model performance.

Difficulty in Feature Extraction: Tokenization allows for extracting features such as word
frequencies, n-grams, and more. Skipping this step hinders effective feature extraction.

Ineffective Text Cleaning: Tokenization is often the first step in text cleaning. Without it, removing
stop words, punctuation, and performing stemming/lemmatization becomes challenging.

Error Propagation: Errors in initial steps propagate through the pipeline, leading to inaccuracies in
tasks like sentiment analysis, NER, and POS tagging.



Stop-Words
• Stop words are common words that are usually ignored in text processing because they do not
carry significant meaning.
• Examples: Common stop words include "a", "the", "and", "in", "to", etc.

• Purpose:
• Noise Reduction: Removing stop words helps in reducing the noise in the text data.
• Efficiency: It reduces the size of the text data, making processing faster and more efficient.

• Tools:
• NLTK: Provides a predefined list of stop words and functions for their removal.
• SpaCy: Offers built-in support for stop word removal in various languages.

• Example:
• Input: "The quick brown fox jumps over the lazy dog."
• Output: "quick brown fox jumps lazy dog"

• Importance: Removing stop words helps in focusing on the words that are more likely to be
significant in the analysis, thereby improving the performance of NLP models.
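
• Example code (a sketch using NLTK's predefined stop-word list; assumes the "stopwords" and "punkt" data have been downloaded):

```python
# Stop word removal with NLTK.
import nltk
nltk.download("stopwords", quiet=True)
nltk.download("punkt", quiet=True)
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize

stop_words = set(stopwords.words("english"))
tokens = word_tokenize("The quick brown fox jumps over the lazy dog.")
filtered = [t for t in tokens if t.lower() not in stop_words and t.isalpha()]
print(filtered)  # ['quick', 'brown', 'fox', 'jumps', 'lazy', 'dog']
```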



Stemming and Lemmatization
• Stemming: Reduces words to their base or root form by removing suffixes. It may not
always produce a real word.

• Lemmatization: Converts words to their base form (lemma) using morphological analysis. It
always returns a valid word.

• Examples:
• Stemming: "running" -> "run", "jumps" -> "jump"
• Lemmatization: "better" -> "good", "running" -> "run"

• Tools:
• NLTK:
• PorterStemmer for stemming
• WordNetLemmatizer for lemmatization
• SpaCy: Offers built-in lemmatization capabilities.
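
• Example code (a sketch comparing NLTK's PorterStemmer and WordNetLemmatizer; assumes the "wordnet" corpus is available):

```python
# Stemming vs. lemmatization with NLTK.
import nltk
nltk.download("wordnet", quiet=True)
from nltk.stem import PorterStemmer, WordNetLemmatizer

stemmer = PorterStemmer()
lemmatizer = WordNetLemmatizer()

print(stemmer.stem("running"))                   # run
print(stemmer.stem("jumps"))                     # jump
print(lemmatizer.lemmatize("running", pos="v"))  # run
print(lemmatizer.lemmatize("better", pos="a"))   # good
```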



Stemming and Lemmatization
• Differences:
• Accuracy: Lemmatization is generally more accurate than stemming.
• Complexity: Stemming is simpler and faster but less accurate.

• Use Case: Choose stemming for quick and dirty text processing; use lemmatization for
tasks requiring higher accuracy.

• Importance: Both techniques help in normalizing words to their base forms, which reduces
the dimensionality of the text data and improves the performance of NLP models.

• Problems if we don’t do stemming or lemmatization:


• High dimensionality
• Inconsistent Data
• Reduced model performance and difficulty in text analysis
• Lower accuracy in search and retrieval



Case Normalisation
• Case normalization is the process of converting all characters in the text to a uniform case, either lower case or
upper case, to ensure consistency.

• Purpose:
• Consistency: Ensures that words are treated equally regardless of their case.
• Reduction of Redundancy: Helps in reducing redundancy by treating "Apple" and "apple" as the same word.
• Example:
• Input: "Natural Language Processing"
• Output: "natural language processing"
• Tools:
• Python String Methods: .lower() and .upper()
• NLTK: Provides functions for case normalization.
• SpaCy: Built-in support for case normalization.
• Importance:
• Improves Text Processing: Case normalization simplifies text processing by reducing the number of unique tokens.
• Enhances Model Performance: Models become more efficient as they deal with fewer variations of the same word.

• Note: Case normalization is particularly useful when the case of the text does not carry significant meaning for the
analysis.
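
• Example code (a trivial sketch using Python's built-in string method):

```python
# Case normalization with str.lower().
text = "Natural Language Processing"
print(text.lower())  # natural language processing
```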



Text Cleaning
• Text cleaning involves removing unwanted elements from the text to make it suitable for analysis.
• Common Techniques:
• Removing Punctuation: Eliminating characters such as periods, commas, and exclamation marks.
• Removing Special Characters: Removing symbols like #, $, %, etc.
• Removing Numbers: Excluding digits unless they are relevant to the analysis.
• Removing HTML Tags: Stripping HTML content from web-scraped text.
• Handling Contractions: Expanding contractions (e.g., "don't" to "do not").

• Tools:
• Regular Expressions (Regex): Powerful for pattern matching and substitution.
• NLTK: Provides functions for various text cleaning tasks.
• SpaCy: Built-in functions for text cleaning.

• Example:
• Input: "Hello, world! Visit us at https://example.com #NLP"
• Output: "Hello world Visit us at example com NLP"

• Importance:
• Enhances Data Quality: Cleaned text is more consistent and easier to analyze.
• Improves Model Accuracy: Cleaner data leads to better-performing models by reducing noise and irrelevant information.
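
• Example code (a sketch using regular expressions; note that here the URL is dropped entirely, a slightly different choice than the example above):

```python
# Regex-based text cleaning: strip URLs, punctuation/special characters, and extra whitespace.
import re

def clean_text(text: str) -> str:
    text = re.sub(r"https?://\S+", " ", text)    # remove URLs
    text = re.sub(r"[^A-Za-z0-9\s]", " ", text)  # remove punctuation and special characters
    text = re.sub(r"\s+", " ", text)             # collapse repeated whitespace
    return text.strip()

print(clean_text("Hello, world! Visit us at https://example.com #NLP"))
# Hello world Visit us at NLP
```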



Named Entity Recognition (NER)
• Named Entity Recognition (NER) is a subtask of information extraction that identifies and classifies named
entities in text into predefined categories such as person names, organizations, locations, dates, etc.
• Categories:
• Person: Names of people (e.g., "John Doe")
• Organization: Names of organizations (e.g., "Google")
• Location: Geographical locations (e.g., "Paris")
• Date/Time: Dates and times (e.g., "January 1, 2020")
• Others: Money, percentages, etc.
• Purpose:
• Information Extraction: Helps in extracting structured information from unstructured text.
• Improved Search and Retrieval: Enhances the performance of search engines and information retrieval systems.
• Tools:
• SpaCy: Provides a pre-trained NER model and tools for custom training.
• NLTK: Offers NER capabilities through its chunking module.
• Stanford NER: A widely used NER tool developed by Stanford University.
Importance: NER is essential for understanding the context and extracting relevant information from large volumes
of text, making it a key component in various NLP applications.



Named Entity Recognition (NER)
Rule-based Approaches: Use predefined rules and patterns to identify
entities.
• Example: Regular expressions to identify dates and email addresses.
• Pros: Simple to implement and interpret.
• Cons: Limited flexibility and scalability.

Machine Learning Approaches: Use statistical models trained on labeled data to identify entities.
• Conditional Random Fields (CRF)
• Hidden Markov Models (HMM)
• Pros: More flexible and accurate than rule-based approaches.
• Cons: Require labeled training data and computational resources.



Named Entity Recognition (NER)
• Deep Learning Approaches: Use neural networks to automatically learn features and
patterns from data
• Recurrent Neural Networks (RNN)
• Long Short-Term Memory (LSTM)
• Transformer-based models (e.g., BERT)
• Pros: High accuracy and ability to capture complex patterns.
• Cons: Require large datasets and significant computational power.
• Example:
• Input: "Apple is looking at buying U.K. startup for $1 billion."
• Output: Entities: [("Apple", "ORG"), ("U.K.", "LOC"), ("$1 billion", "MONEY")]

Importance:
Choosing the right technique depends on the specific requirements of the task, available
data, and resources. Machine learning and deep learning approaches are preferred for
their accuracy and scalability.
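
Example code (a sketch with spaCy's small English model; assumes it was installed via `python -m spacy download en_core_web_sm`; note that spaCy labels "U.K." as GPE rather than LOC):

```python
# Named Entity Recognition with spaCy.
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Apple is looking at buying U.K. startup for $1 billion.")
print([(ent.text, ent.label_) for ent in doc.ents])
# [('Apple', 'ORG'), ('U.K.', 'GPE'), ('$1 billion', 'MONEY')]
```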



Part of Speech Tagging (POS)
• Part-of-Speech (POS) tagging is the process of assigning a part of speech to each word in a sentence. The
parts of speech include nouns, verbs, adjectives, adverbs, pronouns, conjunctions, prepositions, and
interjections

• Understanding Syntax: Helps in understanding the grammatical structure of sentences.


• Disambiguation: Resolves ambiguities by providing context to words (e.g., "book" as a noun vs. "book" as a verb).

Example: "The quick brown fox jumps over the lazy dog."
• POS Tags:
The (DT) quick (JJ) brown (JJ) fox (NN) jumps (VBZ) over (IN) the (DT) lazy (JJ) dog (NN)
Tools:
• NLTK: Provides a comprehensive POS tagging module.
• SpaCy: Offers efficient and accurate POS tagging capabilities.
• Stanford POS Tagger: A robust tool developed by Stanford University.

Importance:
POS tagging is fundamental for many NLP tasks such as parsing, text-to-speech conversion, and information
extraction. It enables a deeper understanding of the syntactic and semantic properties of text.



Part of Speech Tagging (POS)

• Rule-based Tagging: Uses a set of hand-crafted rules to assign POS tags.


• Example: "If a word ends in 'ly', tag it as an adverb (RB)."
• Statistical Tagging: Uses probabilistic models based on the likelihood of a word's POS tag given its context.
• Hidden Markov Models (HMM)
• Maximum Entropy Models
• Machine Learning Approaches: Use supervised learning techniques to predict POS tags.
• Support Vector Machines (SVM)
• Conditional Random Fields (CRF)
• Deep Learning Approaches: Use neural networks to automatically learn features from data.
• Recurrent Neural Networks (RNN)
• Long Short-Term Memory (LSTM)
• Transformer-based models (e.g., BERT)
Example:
• Sentence: "The cat sat on the mat."
• POS Tags: The (DT) cat (NN) sat (VBD) on (IN) the (DT) mat (NN)
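
Example code (a sketch with NLTK's averaged perceptron tagger; the tagger and tokenizer data must be downloaded once):

```python
# POS tagging with NLTK.
import nltk
nltk.download("punkt", quiet=True)
nltk.download("averaged_perceptron_tagger", quiet=True)

tokens = nltk.word_tokenize("The cat sat on the mat.")
print(nltk.pos_tag(tokens))
# [('The', 'DT'), ('cat', 'NN'), ('sat', 'VBD'), ('on', 'IN'), ('the', 'DT'), ('mat', 'NN'), ('.', '.')]
```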



Sentiment Analysis
• Sentiment analysis, also known as opinion mining, is the process of
determining the sentiment expressed in a piece of text. It classifies
text into positive, negative, or neutral sentiments.
Understanding Public Opinion: Helps businesses understand customer opinions and
feedback.
Market Research: Analyzing trends and opinions about products, services, or events.
Social Media Monitoring: Tracking sentiment trends on social media platforms.
Applications:
Customer Feedback Analysis: Understanding customer satisfaction and improving
services.
Brand Monitoring: Tracking public sentiment towards a brand.
Political Sentiment: Analyzing public opinion on political issues or candidates.
Product Reviews: Assessing the sentiment in product reviews to gauge consumer
reactions.



Sentiment Analysis
Techniques:
Lexicon-based Methods: Use predefined lists of positive and negative words.
Machine Learning Approaches: Train classifiers (e.g., SVM, Naive Bayes) on labeled data.
Deep Learning Approaches: Use neural networks (e.g., LSTM, BERT) for higher accuracy.

Example:
Input: "I love the new design of your website!"
Output: Positive

Importance:
Sentiment analysis provides valuable insights into the emotions and opinions
expressed in text, enabling better decision-making and strategy formulation.
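
Example code (a lexicon-based sketch using NLTK's VADER analyzer; the "vader_lexicon" data must be downloaded once, and the ±0.05 thresholds are a common convention, not the only choice):

```python
# Lexicon-based sentiment analysis with VADER.
import nltk
nltk.download("vader_lexicon", quiet=True)
from nltk.sentiment import SentimentIntensityAnalyzer

sia = SentimentIntensityAnalyzer()
scores = sia.polarity_scores("I love the new design of your website!")
compound = scores["compound"]
label = "Positive" if compound > 0.05 else "Negative" if compound < -0.05 else "Neutral"
print(scores, label)  # the compound score is strongly positive for this sentence
```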



Text to Speech Synthesis
• Text-to-Speech (TTS) synthesis is the process of converting written text into spoken words using
computational methods.
• Purpose:
• Accessibility: Provides access to information for visually impaired users.
• User Experience: Enhances user interaction in virtual assistants and conversational agents.
• Automation: Automates tasks that require reading text, such as news reading or announcements.
• Components:
• Text Analysis:
• Tokenization: Breaking text into smaller units such as sentences and words.
• Linguistic Analysis: Determining the part of speech, phonetic transcription, and prosody (intonation, stress, rhythm).
• Acoustic Modeling:
• Phoneme Synthesis: Generating speech sounds based on phonetic transcription.
• Prosody Generation: Adding natural intonation, stress, and rhythm to synthesized speech.
• Speech Synthesis:
• Concatenative Synthesis: Combining pre-recorded speech segments.
• Formant Synthesis: Using mathematical models to generate speech sounds.
• Waveform Synthesis: Using neural networks to generate high-quality, natural-sounding speech.



Text to Speech Synthesis
• Tools:
• Google Text-to-Speech: A widely used TTS engine with high-quality
voices.
• Amazon Polly: A cloud-based service that converts text into lifelike
speech.
• IBM Watson Text to Speech: Provides a range of customizable voices.
• Open Source Tools: eSpeak, Festival, and MaryTTS
• Example:
• Input: "Welcome to the world of Text-to-Speech synthesis."
• Output: (Spoken audio)
• Importance: TTS synthesis plays a crucial role in improving
accessibility and enhancing user experiences across various
applications and devices.
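
• Example code (a sketch using the third-party gTTS package, which calls Google's TTS service over the network; the output file name is arbitrary):

```python
# Text-to-Speech with gTTS: write spoken audio to an MP3 file.
from gtts import gTTS

tts = gTTS("Welcome to the world of Text-to-Speech synthesis.", lang="en")
tts.save("welcome.mp3")
```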



Text Generation Techniques
Text generation involves creating coherent and contextually relevant text from a given input or set of rules.

Key Techniques:
Rule-based Systems: Use predefined rules and templates to generate text.
Example: Fill-in-the-blank templates for automated report generation.
Markov Chains: Use probabilistic models based on the likelihood of word sequences.
Example: Generating text by predicting the next word based on the previous one.
Recurrent Neural Networks (RNNs): Use neural networks with loops to maintain context over sequences.
Example: Generating poetry or short stories.
Long Short-Term Memory Networks (LSTMs): A type of RNN designed to better handle long-term dependencies.
Example: Generating more coherent paragraphs and articles.
Transformer Models: Use self-attention mechanisms to capture long-range dependencies in text.
Example: GPT-3 generating articles, stories, and dialogue.
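
Example code (a toy first-order Markov chain generator over a tiny hard-coded corpus, for illustration only):

```python
# Minimal Markov-chain text generation: predict the next word from the previous one.
import random
from collections import defaultdict

corpus = "the cat sat on the mat the cat ran after the dog".split()

transitions = defaultdict(list)           # word -> words observed to follow it
for current, nxt in zip(corpus, corpus[1:]):
    transitions[current].append(nxt)

def generate(start, length=8):
    word, output = start, [start]
    for _ in range(length - 1):
        followers = transitions.get(word)
        if not followers:
            break
        word = random.choice(followers)
        output.append(word)
    return " ".join(output)

print(generate("the"))                    # e.g. "the cat sat on the mat the cat"
```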



State-based and Rule-based Dialogue Systems
Dialogue systems designed to follow specific states or rules to manage interactions with
users.
State-based Dialogue Systems : Systems that manage conversations using predefined
states and transitions between those states.
Rule-based Dialogue Systems: Systems that use predefined rules and templates to
generate responses and manage interactions.
Examples:
If user says "Hello", respond with "Hi, how can I help you today?"
If user asks for account balance, respond with "Please provide your account number."
Components:
Rule Engine: Processes user input based on predefined rules.
Template Manager: Uses templates to generate responses.
Context Manager: Maintains context of the conversation to apply relevant rules.
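
Example code (a minimal rule-engine sketch with hypothetical patterns and response templates, matching the examples above):

```python
# Rule-based dialogue: the first matching rule produces the response.
import re

RULES = [
    (re.compile(r"\b(hello|hi)\b", re.I), "Hi, how can I help you today?"),
    (re.compile(r"\baccount balance\b", re.I), "Please provide your account number."),
]

def respond(user_input):
    for pattern, template in RULES:
        if pattern.search(user_input):
            return template
    return "Sorry, I did not understand that."    # fallback when no rule matches

print(respond("Hello there"))                     # Hi, how can I help you today?
print(respond("What is my account balance?"))     # Please provide your account number.
```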



State-based and Rule-based Dialogue Systems
State-based Dialogue Systems

[Diagram: the USER's utterance goes to the NLU module, which extracts intents & slots; the DIALOG MANAGER consults the STATE MACHINE and a DATABASE (personal or public information) and drives the ACTION MANAGER, with NLG producing the response back to the user.]


Slot Filling & Intent Recognition
Intent Recognition: The process of identifying the goal or purpose behind a user's input in a conversation.
Purpose: Helps in understanding user requests and guiding the conversation appropriately.

Example Intents:
Booking a flight
Slot Filling: Extracting specific pieces of information (slots) from the user's input that are necessary to complete the intent.
Purpose: Provides detailed information required to fulfill the user's request.
Example
Slots for Flight Booking:
Destination
Departure Date
Return Date
Number of Passengers



Slot Filling & Intent Recognition
Components:
Natural Language Understanding (NLU) Module: Processes user input to recognize intents and extract slots.
Dialogue Manager: Uses recognized intents and filled slots to manage the conversation flow and fulfill the user’s
request.

Techniques:
Rule-based Methods: Use predefined patterns and templates to recognize intents and extract slots.
Machine Learning Approaches: Train classifiers on labeled datasets to predict intents and extract slots.
Deep Learning Approaches: Use neural networks, particularly sequence-to-sequence models, to handle more complex and varied
inputs.
Example:
User Input: "I want to book a flight to New York on June 5th."
Recognized Intent: Book Flight
Extracted Slots:
Destination: New York
Departure Date: June 5th

A popular slot filling dataset: https://github.com/howl-anderson/ATIS_dataset/blob/master/README.en-US.md
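
Example code (a rule-based sketch for the flight-booking example; the regex patterns and slot names are illustrative assumptions, not a production NLU module):

```python
# Rule-based intent recognition and slot filling.
import re

def parse(utterance):
    result = {"intent": None, "slots": {}}
    if re.search(r"\bbook\b.*\bflight\b", utterance, re.I):
        result["intent"] = "BookFlight"
    dest = re.search(r"\bto ([A-Z][A-Za-z ]+?)(?: on |\.|$)", utterance)
    date = re.search(r"\bon (\w+ \d{1,2}(?:st|nd|rd|th)?)", utterance)
    if dest:
        result["slots"]["Destination"] = dest.group(1).strip()
    if date:
        result["slots"]["Departure Date"] = date.group(1)
    return result

print(parse("I want to book a flight to New York on June 5th."))
# {'intent': 'BookFlight', 'slots': {'Destination': 'New York', 'Departure Date': 'June 5th'}}
```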



Slot filling & Intent Recognition
Pre-processing - Input Sentence: "Set an alarm for 6:30 AM tomorrow."
• Tokenizing: "Set", "an", "alarm", "for", "6:30", "AM", "tomorrow".
• Filtering Unwanted Words: "Set", "alarm", "6:30", "AM", "tomorrow".
• Gold Tags: "Set" (None), "alarm" (None), "6:30" (B-time), "AM" (I-time), "tomorrow" (B-date).
• Labeling Intents: Assigning the intent "SetAlarm" to the sentence.
Model Training:
• Training for Slot Extraction: The model learns from the preprocessed data to understand patterns and
relationships.
• Training for Intent Recognition: The model learns from sentences labeled with intents like "SetAlarm",
"BookFlight", "GetWeather".



Slot filling & Intent Recognition
Model Prediction:
For a new sentence - "Remind me to call the doctor at 8 PM today and send an email at 10 AM tomorrow".
• Prediction of Slots: "Remind" (None), "me" (None), "to" (None), "call" (None), "the" (None), "doctor" (None),
"at" (None), "8" (B-time), "PM" (I-time), "today" (B-date), "and" (None), "send" (None), "an" (None), "email"
(None), "at" (None), "10" (B-time), "AM" (I-time), "tomorrow" (B-date).
• Prediction of Intent: "SetReminder" and "SendEmail".
Postprocessing:
• Normalizing Slots: Convert "8 PM" to "20:00", "10 AM" to "10:00", and "today" and "tomorrow" to their respective
dates.
• Normalizing Intents: Check that the predicted intents "SetReminder" and "SendEmail" are in a standard format (e.g., map "BookFlight" or "book_flight" to BookFlight).
• Filtering Incorrect Cases: Check that "at" is not tagged as "B-time" and that the intents are correctly identified.
• Identifying Obvious Cases: Check that "8 PM" and "10 AM" are correctly tagged as "B-time"/"I-time" and that clear-case intents like "alarm" are correctly identified as "SetAlarm".

Slots: Time 1: 20:00, Date 1: 2024-11-30, Time 2: 10:00, Date 2: 2024-12-01

Intents Recognized: SetReminder and SendEmail



Normalization techniques
➢ Date Normalization: Converting various date formats into a standard format.
   Input: "12th December 2024", "12/12/2024", "December 12, 2024".
   Normalized: "2024-12-12".
➢ Time Normalization: Converting different time formats into a 24-hour format.
   Input: "7 PM", "07:00 PM", "19:00".
   Normalized: "19:00".
➢ Currency Normalization: Standardizing currency representations.
   Input: "5000 rupees", "₹5000", "INR 5000".
   Normalized: "INR 5000".
➢ Phone Number Normalization: Converting various phone number formats into a standard international format.
   Input: "9876xx3210", "09876xx3210".
   Normalized: "+91-9876xx3210".
➢ Address Normalization: Standardizing address components.
   Input: "123 MG Road, Bangalore", "123, MG Rd, Bengaluru".
   Normalized: "123 MG Road, Bangalore".
➢ Text Case Normalization: Converting text to a consistent case (e.g., all lowercase).
   Input: "Hello World", "HELLO WORLD", "hello world".
   Normalized: "hello world".
➢ Whitespace Normalization: Removing extra spaces and ensuring consistent spacing.
   Input: variants of "Hello World" with extra or inconsistent spaces.
   Normalized: "Hello World".
➢ Synonym Normalization: Converting synonyms to a standard term.
   Input: "car", "vehicle".
   Normalized: "car".

