
Q-2) Discuss the pyramid of analytics with a diagram.

ANS)

The Pyramid of Analytics, also known as the DIKW (Data-Information-Knowledge-Wisdom) Pyramid, is a hierarchical model that represents the
transformation of raw data into actionable insights. It consists of four stages:
1. Data: The foundation of the pyramid, data refers to raw, unprocessed
facts and figures collected from various sources. This includes numbers,
text, images, and other digital content.
2. Information: The first stage of processing, information involves
organizing and structuring data into a meaningful format. This includes
data cleaning, filtering, and summarization.
3. Knowledge: The second stage, knowledge, involves analyzing and
interpreting the information to extract insights and patterns. This
includes data modeling, statistical analysis, and machine learning.
4. Wisdom: The pinnacle of the pyramid, wisdom represents the
application of knowledge to inform decision-making and drive business
outcomes. This includes using insights to identify opportunities, mitigate
risks, and optimize processes.

Q-3) Describe three components of Business Analytics.


ANS)
The three primary components are:
1. Descriptive Analytics
 Purpose: This component focuses on summarizing historical data to
understand what has happened in the past. It involves collecting,
organizing, and visualizing data using techniques like data aggregation,
dashboards, and reports.
 Tools/Techniques: Visualization tools (e.g., Tableau, Power BI), SQL
queries, and reporting software.
 Example: A company uses descriptive analytics to track sales figures,
customer behavior, and inventory levels over the past year.
2. Predictive Analytics
 Purpose: Predictive analytics aims to forecast future outcomes based on
historical data. It uses statistical models, machine learning algorithms,
and data mining techniques to identify patterns and trends that can help
predict future events.
 Tools/Techniques: Regression analysis, time series analysis, machine
learning models (e.g., decision trees, neural networks).
 Example: An e-commerce company predicts future sales based on
previous sales trends, seasonal effects, and marketing campaigns.
3. Prescriptive Analytics
 Purpose: Prescriptive analytics suggests the best course of action to
achieve desired outcomes by analyzing data and forecasting the effects
of different decisions. It often combines optimization techniques and
simulation models to recommend specific strategies.
 Tools/Techniques: Optimization algorithms, decision trees, simulations,
AI-based tools.
 Example: A logistics company uses prescriptive analytics to optimize
delivery routes by considering traffic, fuel costs, and delivery deadlines
to minimize costs and improve efficiency.
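To make the contrast concrete, here is a minimal Python sketch using made-up monthly sales figures (purely for illustration, not tied to any real dataset). It performs a descriptive summary of past sales and then a simple predictive forecast by fitting a linear trend; prescriptive analytics would go one step further and recommend an action based on such a forecast.

import numpy as np

# Hypothetical monthly sales figures (units sold) over the past year
months = np.arange(1, 13)
sales = np.array([120, 135, 128, 150, 160, 155,
                  170, 165, 180, 190, 185, 200])

# Descriptive analytics: summarize what has already happened
print("total units sold:", sales.sum())
print("average monthly sales:", sales.mean())
print("best month:", months[sales.argmax()])

# Predictive analytics: fit a simple linear trend and forecast the next month
slope, intercept = np.polyfit(months, sales, deg=1)
print("forecast for month 13:", round(slope * 13 + intercept, 1))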

Q-4) Give the difference between descriptive analytics and predictive analytics.
ANS)
The key differences between Descriptive Analytics and Predictive Analytics lie
in their purpose, techniques, and time orientation:
1. Purpose:
 Descriptive Analytics: Focuses on summarizing and analyzing historical
data to understand past events or performance. It answers the question,
"What happened?"
 Predictive Analytics: Uses historical data to make predictions about
future events or trends. It answers the question, "What is likely to
happen?"
2. Time Orientation:
 Descriptive Analytics: Primarily concerned with past and current data. It
helps in understanding past trends and generating insights based on
historical data.
 Predictive Analytics: Looks forward by using past data to forecast future
outcomes, identifying potential risks or opportunities.
3. Techniques:
 Descriptive Analytics: Involves techniques like data aggregation, data
mining, visualization (e.g., charts, graphs), and basic statistical analysis
(e.g., averages, frequencies).
 Predictive Analytics: Employs more advanced techniques, including
machine learning algorithms, regression analysis, and time series
forecasting to identify patterns and predict future behavior.
4. Outcome:
 Descriptive Analytics: Provides insights into what has happened,
allowing businesses to evaluate past performance and understand data
trends.
 Predictive Analytics: Provides actionable insights by anticipating future
outcomes, helping businesses make proactive decisions and strategies.

Q-5) Explain the Framework for the Data-Driven Decision Making process.
ANS)
This framework encompasses key elements from various sources, including
business intelligence, higher education institutions, and organizational
development.
1. Establish a Data-Driven Culture
 Encourage critical thinking and curiosity across all job levels
 Foster a culture that values data-driven decision-making
 Develop core capabilities: data proficiency, analytics agility, and
community
2. Collect and Analyze Data
 Use business intelligence (BI) reporting tools to simplify data
visualization and collection
 Identify relevant data sources and ensure data quality
 Analyze data to identify patterns, inferences, and insights
 Use data visualization tools to create charts and graphs for easier
understanding
3. Triangulate Data
 Validate data insights through multiple sources and methods
 Ensure data accuracy and reliability
 Address potential biases and flaws in data collection or interpretation
4. Act on Insights
 Develop a structured decision-making process
 Use data-driven insights to inform decisions
 Implement data analytics technology to support decision-making
 Monitor and measure the impact of decisions
5. Organizational Supports
 Develop a data-driven decision-making framework (e.g., Collect, Analyze,
Triangulate, and Act - CAT)
 Establish organizational structures and processes to support DDDM
 Provide education and training on data best practices and ethics
6. Continuous Improvement
 Regularly measure and monitor the effectiveness of DDDM
 Refine the framework and processes based on feedback and lessons
learned
 Ensure ongoing data quality and integrity

Q-8) Define, with examples: Nominal Scale, Ordinal Scale, and Interval Scale.
ANS)
Nominal Scale: A nominal scale assigns categories or labels to data without
implying any inherent order or magnitude. It is used to identify unique
categories or groups.
Example: Gender (Male/Female), Political preference
(Independent/Democrat/Republican), or Residential area (Suburbs/City/Town)
Ordinal Scale: An ordinal scale assigns categories or labels to data with a
natural order or ranking. The ranking itself is meaningful, but the differences
between categories are not necessarily equal or directly comparable.
Example: Customer satisfaction survey ratings (Poor, Fair, Good, Excellent),
Ranking of favorite sports teams (1st to 5th place), or Job titles (Entry-level,
Mid-level, Senior-level)
Interval Scale: An interval scale has all the properties of an ordinal scale, plus
equal intervals between values. Differences between values are meaningful, and
statistics such as the mean and standard deviation can be calculated, but there
is no true zero point.
Example: Temperature in Celsius (0°C does not mean an absence of temperature),
calendar dates, or IQ scores (no true zero point)

Q-10) Explain the different types of data measurement scales.


ANS)
In statistical analysis, data is classified into one of four fundamental
measurement scales: nominal, ordinal, interval, and ratio. Each scale has
distinct characteristics, determining the types of mathematical operations that
can be performed on the data.
1. Nominal Scale
 Variables measured on a nominal scale have distinct categories or labels,
but no inherent order or magnitude.
 Examples: Sex (male/female), blood type (A/B/O), country of origin.
 Mathematical operations: Only equality tests (e.g., “is this value equal to
that value?”) can be performed.
2. Ordinal Scale
 Variables measured on an ordinal scale have a natural order, but no
equal intervals between categories.
 Examples: Likert scales (e.g., strongly agree, agree, neutral, disagree,
strongly disagree), ranking (e.g., 1st, 2nd, 3rd place).
 Mathematical operations: In addition to equality tests, ordinal data
allows for ranking and comparison of values (e.g., “is this value greater
than that value?”).
3. Interval Scale
 Variables measured on an interval scale have equal intervals between
categories, but no true zero point.
 Examples: Temperature in Celsius (0°C is not absolute zero), date of birth
(days, months, years have equal intervals).
 Mathematical operations: In addition to equality tests and ranking,
interval data allows for addition and subtraction, but not division or
multiplication (since there is no true zero).
4. Ratio Scale
 Variables measured on a ratio scale have equal intervals between
categories, with a true zero point.
 Examples: Weight in kilograms, distance in meters, time in seconds.
 Mathematical operations: All mathematical operations are permitted,
including addition, subtraction, multiplication, and division.
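As a rough illustration (a sketch using pandas with made-up values, not tied to any particular dataset), the four scales map to different representations and permissible operations in code:

import pandas as pd

# Nominal: labels only; equality checks make sense, ordering does not
blood_type = pd.Categorical(["A", "O", "B", "AB", "O"])
print((blood_type == "O").sum())            # count of "O" values

# Ordinal: ordered categories; ranking comparisons make sense
satisfaction = pd.Categorical(
    ["Good", "Poor", "Excellent", "Fair"],
    categories=["Poor", "Fair", "Good", "Excellent"],
    ordered=True)
print(satisfaction > "Fair")                # element-wise rank comparison

# Interval: differences are meaningful, ratios are not (no true zero)
temps_celsius = pd.Series([10.0, 20.0, 30.0])
print(temps_celsius.diff())                 # valid: differences between values
# 20 degC is not "twice as hot" as 10 degC, so ratios are not meaningful here

# Ratio: true zero point, so ratios are also meaningful
weights_kg = pd.Series([50.0, 75.0, 100.0])
print(weights_kg / weights_kg.iloc[0])      # 2.0 really means twice as heavy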
Q-11) Define the following terms:
1. Entropy 2. Information Gain 3. Population
ANS)
Entropy is a measure of the disorder or impurity of a set of occurrences or a
probability distribution. It quantifies the amount of uncertainty or randomness
in a system. In the context of information theory and machine learning,
entropy is typically measured in bits and is used to evaluate the uncertainty or
unpredictability of a random variable or a dataset.
Formally, entropy is defined as the expected value of the negative logarithm of
the probability of each outcome (the probability mass function for a discrete
random variable). For a binary random variable X with probabilities p and 1-p,
the entropy is:
H(X) = - p log2 p - (1-p) log2 (1-p)
where log2 is the logarithm to the base 2.
In decision trees, entropy is used to measure the impurity of a node’s labels,
with the goal of minimizing entropy by splitting the node into more
homogeneous subsets.
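A short Python sketch of the binary-entropy formula above (a direct translation of the expression, using only the standard library):

import math

def binary_entropy(p):
    # Entropy in bits of a binary random variable with P(X = 1) = p
    if p in (0.0, 1.0):   # a certain outcome carries no uncertainty
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

print(binary_entropy(0.5))   # 1.0 bit  -> maximum uncertainty
print(binary_entropy(0.9))   # ~0.47    -> less uncertainty
print(binary_entropy(1.0))   # 0.0      -> no uncertainty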
Information Gain
Information Gain (IG) is a metric used in decision trees to evaluate the
reduction in entropy (or impurity) caused by splitting a node based on a
particular attribute or feature. It measures the amount of knowledge gained
about the target variable by partitioning the data based on the attribute.
IG is calculated as the difference between the entropy of the parent node and
the weighted average of the entropies of the child nodes. The weights are the
proportions of instances in each child node.
IG = H(parent) - ∑ (weight_i * H(child_i))
where H(parent) is the entropy of the parent node, and H(child_i) is the
entropy of the i-th child node.
Information Gain is used to select the best attribute to split a node, with the
goal of maximizing the reduction in entropy and creating more homogeneous
subsets.
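The same calculation can be sketched in Python; the parent and child label lists below are hypothetical and chosen only to illustrate how the weighted entropies combine:

from collections import Counter
import math

def entropy(labels):
    # Shannon entropy (in bits) of a list of class labels
    total = len(labels)
    return -sum((count / total) * math.log2(count / total)
                for count in Counter(labels).values())

def information_gain(parent, children):
    # IG = H(parent) - sum_i (weight_i * H(child_i))
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted

# Hypothetical split: a parent node with 5 "yes" and 5 "no" labels,
# divided by some attribute into two child nodes
parent = ["yes"] * 5 + ["no"] * 5
children = [["yes"] * 4 + ["no"] * 1,
            ["yes"] * 1 + ["no"] * 4]
print(information_gain(parent, children))   # about 0.278 bits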
Population
Population refers to the complete set of individuals or units within a specified
group, species, or geographic area. In the context of data analysis and machine
learning, the population is the entire set of instances about which conclusions
are to be drawn; a dataset typically contains only a sample drawn from that
population.
In demographic studies, population typically refers to the total number of
people within a specific geographic area, such as a city, country, or region.

Q-13) What is the need for skewness and kurtosis? Explain their types with
examples.
ANS)
Skewness and Kurtosis are two important statistical measures that help
describe the shape of a probability distribution. They are essential in
understanding the characteristics of a dataset, enabling more accurate data
analysis and decision-making.
Skewness:
Skewness measures the degree of asymmetry of a distribution. It indicates
whether the data is:
1. Symmetric (zero skewness): The distribution looks the same to the left
and right of the center point.
2. Right-skewed (positive skewness): The tail extends to the right; a few
large values pull the mean above the median.
3. Left-skewed (negative skewness): The tail extends to the left; a few
small values pull the mean below the median.
Example:
 Income distribution: Typically positively skewed, with a long tail of high-
income earners.
 Stock returns: Often positively skewed, with occasional extreme gains.
Kurtosis:
Kurtosis measures how heavy the tails of a distribution are, and how
pronounced its peak is, compared to a normal distribution. It indicates:
1. Leptokurtic (positive excess kurtosis): A distribution with heavier tails
and a more pronounced peak than a normal distribution.
2. Platykurtic (negative excess kurtosis): A distribution with lighter tails
and a flatter peak than a normal distribution.
3. Mesokurtic (zero excess kurtosis): A distribution with tails and a peak
similar to a normal distribution.
Example:
 Daily stock or asset returns: often leptokurtic, with heavy tails and
frequent extreme values.
 A uniform distribution: platykurtic, with light tails and no pronounced
peak.
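Both measures are easy to compute in practice. The sketch below uses scipy.stats on simulated data (assuming scipy and numpy are available; the exact values depend on the random sample, but the pattern is stable):

import numpy as np
from scipy.stats import skew, kurtosis

rng = np.random.default_rng(seed=0)
normal_data = rng.normal(loc=0.0, scale=1.0, size=10_000)   # roughly symmetric, mesokurtic
right_skewed = rng.exponential(scale=1.0, size=10_000)      # long right tail, heavy-tailed

print("normal:      skew =", round(skew(normal_data), 2),
      " excess kurtosis =", round(kurtosis(normal_data), 2))   # both close to 0
print("exponential: skew =", round(skew(right_skewed), 2),
      " excess kurtosis =", round(kurtosis(right_skewed), 2))  # both clearly positive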

Q-14) Below is a dataset of pizza prices in two cities. Find the mean and
median for both cities.

ANS)
New Delhi Pizza Prices:
1$, 2$, 3$, 3$, 4$, 5$, 6$, 7$, 9$, 11$, 66$
Lucknow Pizza Prices:
1$, 2$, 3$, 4$, 5$, 6$, 7$, 8$, 9$, 10$
The mean and median for both cities are calculated below.
Mean and Median of Pizza Prices:
 New Delhi:
o Mean: 10.64$
o Median: 5.00$
 Lucknow:
o Mean: 5.50$
o Median: 5.50$
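These figures can be verified with the Python standard library's statistics module:

import statistics

new_delhi = [1, 2, 3, 3, 4, 5, 6, 7, 9, 11, 66]
lucknow = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

print(statistics.mean(new_delhi), statistics.median(new_delhi))   # ~10.64, 5
print(statistics.mean(lucknow), statistics.median(lucknow))       # 5.5, 5.5

Note how the single 66$ outlier in New Delhi pulls the mean well above the median, while the median stays robust; this is exactly the kind of situation where the median is the more informative summary of a typical price.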
