Module-3 Part A
Analysis
What Is Semantic Analysis?
Semantic analysis refers to a process of understanding natural language (text) by extracting insightful
information such as context, emotions, and sentiments from unstructured data. It gives computers and
systems the ability to understand, interpret, and derive meanings from sentences, paragraphs, reports,
registers, files, or any document of a similar kind.
Semantic analysis analyzes the grammatical format of sentences, including the arrangement of words,
phrases, and clauses, to determine relationships between independent terms in a specific context. This
is a crucial task of natural language processing (NLP) systems. It is also a key component of several
machine learning tools available today, such as search engines, chatbots, and text analysis software.
The semantic analysis process begins by studying the dictionary definitions and meanings of
individual words, a step referred to as lexical semantics. Following this, the relationships
between words in a sentence are examined to provide a clear understanding of the context.
When combined with natural language processing and machine learning, semantic analysis systems can
approach human-level accuracy on some tasks. Several companies rely heavily on semantic
analysis-driven tools that automatically extract valuable information from unstructured data such
as emails, client reports, and customer reviews.
The purpose of semantic analysis is to draw the exact meaning, or dictionary meaning, from the
text. The work of a semantic analyzer is to check the text for meaningfulness. Semantic analysis
can be divided into two parts −
The first part is the study of the meaning of individual words. This part is called lexical
semantics.
In the second part, the individual words are combined to provide meaning in sentences.
The most important task of semantic analysis is to get the proper meaning of the sentence. For
example, consider the sentence ‘Ram is great’. In this sentence, the speaker is talking either
about Lord Ram or about a person whose name is Ram. Resolving this kind of ambiguity to get the
proper meaning of the sentence is why the job of the semantic analyzer is important.
Hyponymy
It may be defined as the relationship between a generic term and instances of that generic term.
Here the generic term is called the hypernym and its instances are called hyponyms. For example,
the word color is a hypernym, and the colors blue, yellow, etc. are its hyponyms.
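The hypernym/hyponym relation can be pictured as a small lookup table. The following is a minimal Python sketch, not a standard implementation: the HYPERNYM_OF table, its entries (including the extra word navy), and the is_hyponym helper are all assumptions made for illustration.

```python
# Toy taxonomy (assumed data): each word maps to its direct hypernym.
HYPERNYM_OF = {
    "blue": "color",
    "yellow": "color",
    "navy": "blue",
}

def is_hyponym(word, candidate_hypernym):
    """Return True if candidate_hypernym is reached by walking up from word."""
    current = HYPERNYM_OF.get(word)
    while current is not None:
        if current == candidate_hypernym:
            return True
        current = HYPERNYM_OF.get(current)
    return False

print(is_hyponym("blue", "color"))   # True
print(is_hyponym("navy", "color"))   # True (navy -> blue -> color)
print(is_hyponym("color", "blue"))   # False
```

Walking up the chain makes the relation transitive, so a hyponym of blue also counts as a hyponym of color.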
Homonymy
It may be defined as the relation between words that have the same spelling or form but
different and unrelated meanings. For example, the word bat is a homonym because a bat can be
an implement used to hit a ball or a nocturnal flying mammal.
Polysemy
The word polysemy comes from Greek and means many signs. A polysemous word or phrase has
different but related senses. In other words, such a word keeps the same spelling while carrying
different, related meanings. For example, the word bank is polysemous, with the following meanings −
A financial institution.
The building in which such an institution is located.
A synonym for to rely on.
Difference between Polysemy and Homonymy
Both polysemous and homonymous words have the same form or spelling. The main difference
between them is that in polysemy the meanings of the word are related, whereas in homonymy they
are not. For example, for the same word bank, the meanings a financial institution and a river
bank are unrelated to each other, so that pair of meanings is an example of homonymy.
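One way to picture the distinction is a sense inventory in which related senses share a dictionary entry. The sketch below assumes such an inventory. The SENSES data, the entry numbers, and the relation_between helper are hypothetical and hand-written for this example.

```python
# Toy sense inventory (assumed data). Senses that share an "entry" number are
# treated as related, the way a dictionary groups them under one headword.
SENSES = {
    "bank": [
        {"entry": 1, "gloss": "a financial institution"},
        {"entry": 1, "gloss": "the building housing such an institution"},
        {"entry": 2, "gloss": "the land alongside a river"},
    ],
    "bat": [
        {"entry": 1, "gloss": "an implement used to hit a ball"},
        {"entry": 2, "gloss": "a nocturnal flying mammal"},
    ],
}

def relation_between(word, i, j):
    """Label the relation between senses i and j of the same word."""
    a, b = SENSES[word][i], SENSES[word][j]
    return "polysemy" if a["entry"] == b["entry"] else "homonymy"

print(relation_between("bank", 0, 1))  # polysemy
print(relation_between("bank", 0, 2))  # homonymy
print(relation_between("bat", 0, 1))   # homonymy
```

The same word can therefore show both relations, depending on which pair of senses is compared.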
Synonymy
It is the relation between two lexical items having different forms but expressing the same or a
close meaning. Examples are author/writer, fate/destiny.
Antonymy
It is the relation between two lexical items having symmetry between their semantic
components relative to an axis. The scope of antonymy is as follows −
Meaning Representation
Semantic analysis creates a representation of the meaning of a sentence. But before getting into
the concepts and approaches related to meaning representation, we need to understand the
building blocks of a semantic system.
Entities − These represent individuals such as a particular person, location, etc. For
example, Haryana, India, and Ram are all entities.
Concepts − These represent general categories of individuals, such as a person, city,
etc.
Relations − These represent the relationships between entities and concepts. For example,
Ram is a person.
Predicates − These represent verb structures. For example, semantic roles and case
grammar are examples of predicates.
Now, we can understand that meaning representation shows how to put together the building
blocks of semantic systems. In other words, it shows how to put together entities, concepts,
relations, and predicates to describe a situation. It also enables reasoning about the semantic
world.
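The four building blocks can be sketched as plain data structures. This is only an illustration under stated assumptions: the class names, the field names, and the visit predicate with its agent and location roles are invented here and do not come from any particular formalism.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Entity:
    name: str          # a particular individual, e.g. Ram or Haryana

@dataclass(frozen=True)
class Concept:
    name: str          # a general category, e.g. person or city

@dataclass(frozen=True)
class Relation:
    subject: Entity
    label: str
    target: Concept

@dataclass(frozen=True)
class Predicate:
    verb: str
    roles: tuple       # pairs of (role name, filler), as in case grammar

def describe(relation):
    """Render a relation as a short sentence."""
    return f"{relation.subject.name} {relation.label} {relation.target.name}"

ram = Entity("Ram")
person = Concept("person")
is_a = Relation(ram, "is a", person)
visits = Predicate("visit", (("agent", ram), ("location", Entity("Haryana"))))

print(describe(is_a))               # Ram is a person
print(dict(visits.roles)["agent"])  # Entity(name='Ram')
```

Putting the pieces together in this way is what lets a program answer simple questions about the described situation, such as who the agent of an action is.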
Approaches to meaning representation include the following −
Semantic Nets
Frames
Rule-based architecture
Case Grammar
Conceptual Graphs
Lexical Semantics
The first part of semantic analysis, studying the meaning of individual words, is called lexical
semantics. It covers words, sub-words, affixes (sub-units), compound words, and phrases.
All of these are collectively called lexical items. In other words, lexical semantics is the
relationship between lexical items, the meaning of sentences, and the syntax of sentences.
Lexical semantics involves the following steps −
Classification of lexical items such as words, sub-words, affixes, etc.
Decomposition of lexical items such as words, sub-words, affixes, etc.
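As a rough illustration of decomposing a lexical item into sub-units, the sketch below strips at most one known prefix and one known suffix from a word. The PREFIXES and SUFFIXES lists and the decompose helper are assumptions chosen for the example. Real morphological analysis is considerably more involved, and this naive version will mis-split words such as red.

```python
# Small, assumed affix lists; a real system would use a much larger inventory.
PREFIXES = ("un", "re", "dis")
SUFFIXES = ("ness", "ing", "ed", "s")

def decompose(word):
    """Split word into (prefix, root, suffix); a part is '' when absent."""
    prefix = next((p for p in PREFIXES if word.startswith(p)), "")
    rest = word[len(prefix):]
    # Only strip a suffix if something is left over to serve as the root.
    suffix = next(
        (s for s in SUFFIXES if rest.endswith(s) and len(rest) > len(s)), ""
    )
    root = rest[:len(rest) - len(suffix)]
    return prefix, root, suffix

print(decompose("unhappiness"))  # ('un', 'happi', 'ness')
print(decompose("replayed"))     # ('re', 'play', 'ed')
print(decompose("cats"))         # ('', 'cat', 's')
```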
Automated semantic analysis allows customer service teams to focus on complex customer inquiries
that require human intervention and understanding, while machines analyze the messages received
through social media platforms, chatbots, and emails. This improves the overall productivity of
employees, as the technology frees them from mundane tasks and allows them to concentrate on
critical inquiries or operations.
Semantic analysis helps fine-tune search engine optimization (SEO) strategy by allowing
companies to analyze and decode users’ searches, for example the queries users type into Google.
The approach helps deliver optimized and suitable content to users, thereby boosting
traffic and improving result relevance.
How Does Semantic Analysis Work?
The semantic analysis method begins with a language-independent step of analyzing the set of
words in the text to understand their meanings. This step is termed ‘lexical semantics’ and refers to
fetching the dictionary definitions of the words in the text. Subsequently, the words or elements
are parsed. Each element is assigned a grammatical role, and the whole structure is processed to
reduce the confusion caused by ambiguous words that have multiple meanings.
After parsing, the analysis proceeds to the interpretation step, which is critical for artificial
intelligence algorithms. For example, the word ‘Blackberry’ could refer to a fruit, a company, or its
products, along with several other meanings. Context is equally important while
processing language, as it takes into account the environment of the sentence and then
attributes the correct meaning to it.
For example, ‘Blackberry is known for its sweet taste’ refers directly to the fruit, but ‘I got a
blackberry’ may refer to a fruit or a Blackberry product. As such, context is vital in semantic analysis,
and additional information is often required to assign the correct meaning to a sentence.
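The role of context can be sketched with a very simple overlap count: each candidate sense gets a set of cue words, and the sense whose cues appear most often in the sentence wins. This is a simplified illustration, not the method of any specific system. The SENSE_CUES table and the disambiguate helper are assumptions made for the example.

```python
# Assumed cue words for two senses of "blackberry" (illustrative only).
SENSE_CUES = {
    "fruit": {"sweet", "taste", "jam", "ripe", "eat"},
    "product": {"phone", "keyboard", "email", "device", "battery"},
}

def disambiguate(sentence):
    """Pick the sense whose cue words overlap most with the sentence.

    Returns None when no single sense wins, meaning the sentence alone
    does not settle the meaning and more context is needed.
    """
    words = {w.strip(".,!?'\"").lower() for w in sentence.split()}
    scores = {sense: len(cues & words) for sense, cues in SENSE_CUES.items()}
    best = max(scores.values())
    winners = [sense for sense, score in scores.items() if score == best]
    return winners[0] if best > 0 and len(winners) == 1 else None

print(disambiguate("Blackberry is known for its sweet taste"))  # fruit
print(disambiguate("I got a blackberry"))                       # None
```

The second call returns None because the sentence contains no cue for either sense, which mirrors the point that ‘I got a blackberry’ cannot be resolved without additional information.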
In outline, the process involves the following steps:
Data processing
Defining features, parameters, and characteristics of processed data
Data representation
Defining grammar for data analysis
Assessing semantic layers of processed data
Performing semantic analysis based on the linguistic formalism
Semantic analysis techniques
Semantic analysis uses two distinct techniques to obtain information from a text or corpus of data.
The first technique is text classification, and the second is text extraction.
1. Semantic classification
Semantic classification implies text classification wherein predefined categories are assigned to the
text for faster task completion. Following are the various types of text classification covered under
semantic analysis:
Topic classification: This classifies text into preset categories on the basis of the content type. For
example, customer support teams in a company may intend to classify the tickets raised by
customers at the help desk into separate categories so that the concerned teams can address them.
In this scenario, ML-based semantic analysis tools may recognize tickets based on their content and
classify them under a ‘payment concern’ or ‘delayed delivery’ category.
Sentiment analysis: Today, sentiment analysis is used by several social media platforms such as
Twitter, Facebook, Instagram, and others to detect positive, negative, or neutral emotions expressed
in text (posts, stories). These sentiments can signal urgency and may raise ‘call to action’ alarms
for the respective platforms. Sentiment analysis helps brands identify dissatisfied customers or
users in real time and gain a sense of how customers feel about the brand as a whole.
Intent classification: Intent classification refers to the classification of text based on what
customers intend to do next. You can use it to tag customers as
‘interested’ or ‘not interested’ to effectively reach out to those customers who may intend to buy a
product or show an inclination toward buying it.
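The classification types above share one shape: a piece of text goes in and a predefined label comes out. The sketch below uses plain keyword counting as a stand-in for the ML-based tools mentioned in the text, so it should be read as an illustration of the input and output only. The classify helper and the keyword lists are assumptions.

```python
import re

def classify(text, categories):
    """Return the category whose keywords occur most often in text, or None."""
    tokens = re.findall(r"[a-z']+", text.lower())
    scores = {
        label: sum(tokens.count(keyword) for keyword in keywords)
        for label, keywords in categories.items()
    }
    # On a tie, max() keeps the first category in dictionary order.
    label, score = max(scores.items(), key=lambda item: item[1])
    return label if score > 0 else None

# Assumed keyword lists for topic and sentiment classification.
TICKET_TOPICS = {
    "payment concern": ["refund", "charged", "payment", "invoice"],
    "delayed delivery": ["late", "delayed", "delivery", "shipping"],
}
SENTIMENTS = {
    "positive": ["great", "love", "excellent", "happy"],
    "negative": ["bad", "terrible", "disappointed", "angry"],
}

print(classify("I was charged twice, please issue a refund", TICKET_TOPICS))
# payment concern
print(classify("My order is delayed and the delivery date keeps moving", TICKET_TOPICS))
# delayed delivery
print(classify("Terrible service, I am very disappointed", SENTIMENTS))
# negative
```

The same function serves topic and sentiment classification because only the category table changes. A trained model would replace the keyword counts with learned scores.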
2. Semantic extraction
Semantic extraction refers to extracting or pulling out specific data from the text. Extraction types
include:
Keyword extraction: This technique helps identify relevant terms and expressions in the text
and gives deep insights when combined with the above classification techniques.
For example, one can analyze keywords in multiple tweets that have been labeled as
positive or negative and then detect or extract words from those tweets that have been
mentioned the maximum number of times. One can later use the extracted terms for
automatic tweet classification based on the word type used in the tweets.
Entity extraction: This technique is used to identify and extract entities in text, such as
names of individuals, organizations, places, and others.
This method is typically helpful for customer support teams who intend to extract relevant
information from customer support tickets automatically, including customer name, phone
number, query category, shipping details, etc.
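Both extraction types can be sketched in a few lines. The example below counts the most frequent non-stopword terms across a set of tweets (keyword extraction) and pulls out phone numbers with a regular expression (a very narrow form of entity extraction). The stopword list, the sample tweets, and the 555-123-4567 phone format are assumptions. Production systems typically rely on trained models rather than hand-written patterns.

```python
import re
from collections import Counter

# Small assumed stopword list; real lists are much longer.
STOPWORDS = {"the", "a", "is", "it", "and", "i", "my", "this", "was", "to"}

def top_keywords(texts, n=3):
    """Return the n most frequent non-stopword tokens across texts."""
    counts = Counter(
        token
        for text in texts
        for token in re.findall(r"[a-z]+", text.lower())
        if token not in STOPWORDS
    )
    return [word for word, _ in counts.most_common(n)]

def extract_phone_numbers(text):
    """Find digit groups shaped like 555-123-4567 (an assumed format)."""
    return re.findall(r"\b\d{3}-\d{3}-\d{4}\b", text)

negative_tweets = [
    "The delivery was late and the box was damaged",
    "Late again, this delivery service is slow",
    "My parcel is late",
]
print(top_keywords(negative_tweets, n=2))
# ['late', 'delivery']
print(extract_phone_numbers("Call me on 555-123-4567 about my order"))
# ['555-123-4567']
```

The extracted terms late and delivery could then serve as features when classifying new tweets, as described above.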
Machine learning algorithm-based automated semantic analysis
One can train machines to make near-accurate predictions by providing text samples as
input to semantically-enhanced ML algorithms. Such estimations are based on previous
observations or data patterns. Machine learning-based semantic analysis involves sub-tasks
such as relationship extraction and word sense disambiguation.
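1. Word sense disambiguation
Word sense disambiguation identifies which meaning of a word or phrase applies in a given
context. For example, ‘raspberry’ can refer to a fruit, while ‘Raspberry Pi’ refers to a
single-board computer or to the UK-based foundation behind it. Hence, it is critical to identify
which meaning suits the word depending on its usage.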
2. Relationship extraction
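Relationship extraction identifies the entities mentioned in a text and the relationships
between them. Consider the sentence ‘Elon Musk is one of the co-founders of Tesla, which is
based in Austin, Texas.’ Here the entities are Elon Musk (person), Tesla (organization), and
Austin, Texas (place), and the relationships connecting them are ‘co-founder of’ and ‘based in’.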
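A pattern-based sketch shows the kind of output relationship extraction produces for this sentence. The two regular expressions and the extract_relations helper are hand-written assumptions that only fit this phrasing. The ML-based systems described in this section learn such patterns from data instead.

```python
import re

# Hand-written patterns (assumed): each relation label maps to a regex with
# named groups for the two arguments of the relation.
PATTERNS = {
    "co-founder of": re.compile(
        r"(?P<subject>[A-Z]\w+(?: [A-Z]\w+)*) is one of the co-founders of "
        r"(?P<object>[A-Z]\w+)"
    ),
    "based in": re.compile(
        r"(?P<subject>[A-Z]\w+), which is based in "
        r"(?P<object>[A-Z]\w+(?:, [A-Z]\w+)*)"
    ),
}

def extract_relations(sentence):
    """Return (subject, relation, object) triples found in the sentence."""
    triples = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(sentence):
            triples.append((match.group("subject"), label, match.group("object")))
    return triples

sentence = (
    "Elon Musk is one of the co-founders of Tesla, "
    "which is based in Austin, Texas."
)
print(extract_relations(sentence))
# [('Elon Musk', 'co-founder of', 'Tesla'), ('Tesla', 'based in', 'Austin, Texas')]
```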
Hyponyms: This refers to a specific lexical entity that has a relationship with a more generic
lexical entity called a hypernym. For example, red, blue, and green are all hyponyms of color,
their hypernym.
Meronymy: This refers to a word that denotes a constituent part of
something. For example, mango is a meronym of mango tree.
Polysemy: This refers to a word having more than one related meaning, all represented
under one dictionary entry. For example, the term ‘dish’ is a noun. In the sentence ‘arrange the
dishes on the shelf,’ the word dishes refers to a kind of plate, whereas in ‘a dish of pasta’ it
refers to prepared food.
Synonyms: This refers to words with similar meanings. For example, the noun abstract has the
synonyms summary and synopsis.
Antonyms: This refers to words with opposite meanings. For example, cold has the
antonyms warm and hot.
Homonyms: This refers to words with the same spelling and pronunciation but entirely
different meanings. For example, bark (of a tree) and bark (of a dog).
Apart from these vital elements, semantic analysis also uses semiotics and collocations
to understand and interpret language. Semiotics refers to what a word means and also the
meaning it evokes or communicates. For example, ‘tea’ refers to a hot beverage, while it
also evokes refreshment, alertness, and many other associations. Collocations, on the other
hand, are two or more words that often go together, for example fast food, dark
chocolate, etc.
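Collocations can be approximated by counting which adjacent word pairs recur in a corpus. The sketch below does this with raw bigram counts. The frequent_bigrams helper, the two-sentence corpus, and the min_count threshold are assumptions for illustration, and practical systems usually add a statistical association measure on top of raw counts.

```python
import re
from collections import Counter

def frequent_bigrams(texts, min_count=2):
    """Return adjacent word pairs that occur at least min_count times."""
    counts = Counter()
    for text in texts:
        tokens = re.findall(r"[a-z]+", text.lower())
        counts.update(zip(tokens, tokens[1:]))  # pair each word with the next
    return {pair: c for pair, c in counts.items() if c >= min_count}

corpus = [
    "She prefers dark chocolate to fast food",
    "Fast food is cheap but dark chocolate is a treat",
]
print(frequent_bigrams(corpus))
# {('dark', 'chocolate'): 2, ('fast', 'food'): 2}
```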
Text Extraction