Unit 1-Part2
Nominal Data
Nominal data label variables without any order or quantitative value. Hair
color is an example of nominal data, because one color cannot be ranked above
or below another.
The name “nominal” comes from the Latin word “nomen,” which means
“name.” Nominal data cannot be used for arithmetic and cannot be sorted into
a meaningful order; the values simply fall into distinct categories.
Examples of Nominal Data:
● Hair color (Blonde, Red, Brown, Black, etc.)
● Marital status (Single, Widowed, Married)
● Nationality (Indian, German, American)
● Gender (Male, Female, Others)
● Eye color (Black, Brown, etc.)
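As a minimal sketch of what can be done with nominal data, the snippet below counts categories and finds the most frequent one (the mode). The sample values are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical sample of hair colors (nominal: labels only, no order).
hair_colors = ["Black", "Brown", "Black", "Blonde", "Red", "Black", "Brown"]

# Counting how often each category occurs is meaningful for nominal data.
counts = Counter(hair_colors)

# The mode (most frequent category) is the only sensible "average".
mode_color, mode_count = counts.most_common(1)[0]

print(dict(counts))
print("Mode:", mode_color, "seen", mode_count, "times")
```

A mean or a median of these labels is not defined, which is why only counts and the mode are computed.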
Ordinal Data
Ordinal data have a natural ordering: each value occupies a position on a
scale. They are used to record observations such as customer satisfaction or
happiness, but arithmetic on the values is not meaningful, because the gaps
between positions are not known to be equal.
Ordinal data are qualitative data whose values have a relative position, so
they can be considered “in-between” qualitative and quantitative data. They
show sequence only, which limits statistical analysis to order-based measures
such as the median, the mode, and rank comparisons. Unlike nominal data,
ordinal data carry an order.
Examples of Ordinal Data:
● Feedback, experience, or satisfaction ratings that companies collect on a scale of 1 to 10
● Letter grades in an exam (A, B, C, D, etc.)
● Ranking of people in a competition (First, Second, Third, etc.)
● Economic status (High, Medium, Low)
● Education level (Higher, Secondary, Primary)
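The ordering of ordinal categories can be sketched in code by mapping each label to its rank. The example below uses the economic-status levels from the list above; the responses and the names `LEVELS` and `RANK` are assumptions made for illustration.

```python
from statistics import median_low

# Assumed order of the categories, from lowest to highest.
LEVELS = ["Low", "Medium", "High"]
RANK = {level: position for position, level in enumerate(LEVELS)}

# Hypothetical survey responses.
responses = ["High", "Low", "Medium", "Medium", "High", "Low", "Medium"]

# Sorting is meaningful because the categories are ordered.
ordered = sorted(responses, key=RANK.get)

# The median works on ranks; median_low always returns an observed rank,
# so it maps back to a real category.
median_level = LEVELS[median_low(RANK[r] for r in responses)]

print(ordered)
print("Median level:", median_level)
```

The ranks are used only for ordering and for the median. Averaging them would assume equal spacing between levels, which ordinal data do not guarantee.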
Continuous Data
Continuous data are measured values that can include fractions, such as
the height of a person or the length of an object. Continuous
data represent information that can be divided into ever
finer levels; a continuous variable can take any value
within a range.
The key difference between discrete and continuous data is
that discrete data contain only integers or whole numbers,
whereas continuous data store fractional numbers to record
quantities such as temperature, height, width, time, speed,
etc.
Examples of Continuous Data:
● Height of a person
● Speed of a vehicle
● Time taken to finish the work
● Wi-Fi frequency
● Market share price
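Because a continuous variable can take any value in a range, it is usually summarized with a mean or grouped into intervals. The sketch below assumes a hypothetical list of heights in centimetres and a bin width of 10 cm; `bin_label` is a helper introduced only for this example.

```python
# Hypothetical heights in centimetres (continuous: fractions are allowed).
heights_cm = [158.2, 171.5, 165.0, 180.3, 169.9, 174.6]

# Arithmetic such as the mean is meaningful for continuous data.
mean_height = sum(heights_cm) / len(heights_cm)

def bin_label(value, width=10):
    """Return the interval of the given width that contains value."""
    lower = int(value // width) * width
    return f"{lower}-{lower + width}"

# Group the measurements into intervals, as a histogram would.
bins = {}
for h in heights_cm:
    label = bin_label(h)
    bins[label] = bins.get(label, 0) + 1

print(round(mean_height, 2))
print(bins)
```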
Discrete Data
The term discrete means distinct or separate. Discrete
data contain values that are integers or whole numbers.
The total number of students in a class is an example of
discrete data. These data cannot be broken into decimal
or fractional values.
Discrete data are countable, and their values cannot be
subdivided. They are represented mainly by a bar graph,
number line, or frequency table.
Examples of Discrete Data:
● Total number of students present in a class
● Number of employees in a company
● Total number of players who participated in a competition
● Days in a week
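A frequency table and a bar graph, the representations mentioned above, can be sketched with the standard library. The attendance figures below are hypothetical.

```python
from collections import Counter

# Hypothetical daily attendance counts (discrete: whole numbers only).
students_present = [28, 30, 28, 29, 30, 30, 27]

# Frequency table: how many days each count was observed.
frequency = Counter(students_present)

# Text bar graph, one row per distinct value.
for value in sorted(frequency):
    print(f"{value:>3} | {'#' * frequency[value]}")
```

Each distinct value gets its own bar because there are no values between 28 and 29 students to account for.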
Graph Data:
Data that represent relationships between entities, often modeled as
nodes and edges (e.g., social networks, network graphs).
Examples of Graph Data:
Social Networks:
Facebook, Twitter, LinkedIn, and other social media platforms represent individuals
(nodes) and their connections or friendships (edges).
Network Graphs:
In computer networks, nodes can represent devices (such as computers or routers),
and edges represent the connections or links between them.
Citation Networks:
In academic research, nodes can represent papers or authors, and edges represent
citations or collaborations between them.
Recommendation Systems:
Nodes can represent users or items, and edges can represent user interactions or
preferences. Graphs are used to model and recommend items to users based on their
preferences and connections.
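The node-and-edge model can be sketched as an adjacency list, here for a small social network. The names and friendships are hypothetical, and `degree` and `mutual_friends` are helper functions introduced only for this illustration.

```python
from collections import defaultdict

# Hypothetical friendships: each pair is an undirected edge.
edges = [
    ("Asha", "Ben"),
    ("Asha", "Chen"),
    ("Ben", "Dana"),
    ("Chen", "Dana"),
    ("Dana", "Eli"),
]

# Adjacency list: each node maps to the set of its neighbours.
graph = defaultdict(set)
for a, b in edges:
    graph[a].add(b)
    graph[b].add(a)

def degree(person):
    """Number of direct connections of a node."""
    return len(graph.get(person, set()))

def mutual_friends(a, b):
    """Nodes connected to both a and b."""
    return graph.get(a, set()) & graph.get(b, set())

print(degree("Dana"))
print(sorted(mutual_friends("Asha", "Dana")))
```

Mutual connections of this kind are one simple signal a recommendation system could use, although real systems combine many more.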
High-Dimensional Data:
Image Data:
Each pixel in an image can be considered a dimension, and high-resolution images result in
high-dimensional datasets.
Text Data:
In natural language processing, the representation of text data using techniques like
TF-IDF or word embeddings can result in high-dimensional feature spaces.
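To give a feel for how quickly the number of dimensions grows, the sketch below counts the values in one image and builds word-count vectors for three short texts. It assumes a 1920 x 1080 RGB image and hypothetical documents, and it uses plain word counts as a simpler stand-in for TF-IDF.

```python
# Image: every pixel value is one dimension.
width, height, channels = 1920, 1080, 3  # assumed Full HD RGB image
image_dimensions = width * height * channels

# Text: every distinct word in the collection is one dimension.
documents = [
    "data science uses data",
    "graphs model relationships",
    "science models data",
]
vocabulary = sorted({word for doc in documents for word in doc.split()})

# One count vector per document, with one entry per vocabulary word.
vectors = [
    [doc.split().count(word) for word in vocabulary]
    for doc in documents
]

print(image_dimensions)
print(len(vocabulary), vocabulary)
print(vectors[0])
```

Even these three short sentences produce a seven-dimensional space; a real corpus typically has tens of thousands of distinct words.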
1. Structured Data:
Definition: Well-organized data with a fixed schema, often stored in
relational databases.
Examples:
Relational Database Table:
• Attributes (Columns): ID, Name, Age, Address
• Records (Rows):
• 1, John Doe, 30, 123 Main St
• 2, Jane Smith, 25, 456 Oak St
Excel Spreadsheet:
• Columns represent different attributes (e.g., Date, Sales, Product).
• Rows represent individual entries for each date.
SQL Database:
• Tables with predefined columns and data types.
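The relational table from the example above can be sketched with Python's built-in `sqlite3` module. The table name `people` and the column types are assumptions; the two rows are the ones listed in the example.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Fixed schema: every row must fit these predefined columns.
conn.execute(
    "CREATE TABLE people ("
    "id INTEGER PRIMARY KEY, name TEXT NOT NULL, age INTEGER, address TEXT)"
)
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?, ?)",
    [
        (1, "John Doe", 30, "123 Main St"),
        (2, "Jane Smith", 25, "456 Oak St"),
    ],
)

# The schema makes precise queries possible.
rows = conn.execute(
    "SELECT name, age FROM people WHERE age > ? ORDER BY age", (26,)
).fetchall()
total = conn.execute("SELECT COUNT(*) FROM people").fetchone()[0]

print(rows)
print(total)
conn.close()
```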
2. Semi-structured Data:
Definition: Falls between structured and unstructured data, having some
organizational elements but not adhering to a strict schema.
Examples:
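JSON is a commonly cited semi-structured format: each record names its own fields, but records need not share the same fields. The sketch below is an illustration only; the records and field names are hypothetical.

```python
import json

# Hypothetical records: both have "id" and "name", but the other
# fields differ, so there is no strict schema.
raw = """[
  {"id": 1, "name": "John Doe", "email": "john@example.com"},
  {"id": 2, "name": "Jane Smith", "phones": ["555-0100", "555-0101"]}
]"""

records = json.loads(raw)

# The keys give some organization, but the full set is only known
# after looking at every record.
all_keys = sorted({key for record in records for key in record})

# Code that reads the data has to tolerate missing fields.
emails = [record.get("email") for record in records]

print(all_keys)
print(emails)
```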
Sources of data in data science
2. APIs (Application Programming Interfaces):
• Access data from web APIs that provide a structured way to interact
with web services.
• Examples include the Twitter API, the Google Maps API, and financial
market APIs.
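A typical API workflow is to build a request URL, send it, and parse the JSON response. The sketch below is not tied to any real service: the endpoint `https://api.example.com/v1/prices`, its `symbol` and `days` parameters, and the response layout are all hypothetical, so `fetch_prices` is defined but not called.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://api.example.com/v1/prices"  # hypothetical endpoint

def build_url(symbol, days):
    """Encode the query parameters into the request URL."""
    return f"{BASE_URL}?{urlencode({'symbol': symbol, 'days': days})}"

def parse_prices(payload):
    """Turn the assumed JSON layout into (date, close) tuples."""
    data = json.loads(payload)
    return [(item["date"], float(item["close"])) for item in data["prices"]]

def fetch_prices(symbol, days=5, timeout=10):
    """Send the request; works only against a service with this layout."""
    with urlopen(build_url(symbol, days), timeout=timeout) as response:
        return parse_prices(response.read().decode("utf-8"))

# Offline demonstration with a hypothetical response body.
sample = (
    '{"prices": [{"date": "2024-01-02", "close": "101.5"},'
    ' {"date": "2024-01-03", "close": "99.0"}]}'
)
print(build_url("ACME", 5))
print(parse_prices(sample))
```

Real APIs usually also require an access key and enforce rate limits; the provider's documentation gives the actual endpoints and parameters.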
3. Web Scraping: