Statistics Course Work Presentation

Uploaded by

nadiope mark
Copyright
© © All Rights Reserved

BUSINESS STATISTICS

COURSE WORK
PRESENTATION
Kaire Agnes-VU-BPL-2307-0632-EVE
Kayongo Daniel- VU-BPL-2307-0646-EVE
Question 4
Discuss the concept of data presentation
• Tabular, e.g. Univariate frequency distributions, Simple frequency
distribution, Grouped frequency distribution.
• Graphical e.g. Histogram, Frequency polygon, Cumulative frequency
distribution.
• Diagrammatical e.g. Charts
Introduction
• Data presentation is the process of organizing and displaying data in a way that
makes it easy to understand and analyze. Effective data presentation transforms
raw data into meaningful information through various formats and techniques.
One of the most common formats for presenting data is tabular presentation. Let's
explore some specific types of tabular data presentations:
o Univariate Frequency Distributions
o Simple Frequency Distributions, and
o Grouped Frequency Distributions.
1. Univariate Frequency Distributions
• A univariate frequency distribution is used to show the frequency (i.e., the number
of times) each unique value occurs in a dataset for a single variable. This type of
distribution is helpful for understanding the distribution and central tendency of a
particular variable.
• Example:
• A univariate frequency distribution table showing the exam scores of year 1.3
students at Victoria University:
• 20,30,30,30,40,40,50,50,50,20,30,40,40,40,40,40,40,50,50,50,50,50,50,50,50,50,50
,50,50,60,60,60,60,60,60,60,60,60,60,70,70,70,70,70,70,80,80,80,90,90.
Table 01. Exam scores of students at Victoria University in a year

Score    Frequency    Cumulative frequency
20       2            2
30       4            6
40       8            14
50       12           26
60       10           36
70       6            42
80       3            45
90       2            47
Total    47
2. Simple Frequency Distributions
• A simple frequency distribution is similar to a univariate frequency distribution
but can be applied to both categorical and numerical data. It involves listing
each unique value of the variable and counting how often each value occurs.
• Example:
• A simple frequency distribution of a dataset containing the ages of participants
in a survey may look like this.
• Ages of the participants:
• 18, 20, 20, 22, 22, 22, 25, 25, 28, 30, 30, 30, 35, 35, 22
Table 02. A dataset containing the ages of participants in a survey

Age      Frequency    Cumulative frequency
18       1            1
20       2            3
22       4            7
25       2            9
28       1            10
30       3            13
35       2            15
Total    15
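The counting behind Table 02 can be sketched in a few lines of Python. This is only an illustration using the ages listed in the example; the variable names are assumptions made for the sketch.

```python
from collections import Counter
from itertools import accumulate

# Ages of the survey participants, as listed in the example.
ages = [18, 20, 20, 22, 22, 22, 25, 25, 28, 30, 30, 30, 35, 35, 22]

# Count how often each distinct value occurs, then sort by value.
counts = Counter(ages)
values = sorted(counts)
frequencies = [counts[v] for v in values]

# The running total of the frequencies gives the cumulative frequency column.
cumulative = list(accumulate(frequencies))

for value, freq, cum in zip(values, frequencies, cumulative):
    print(f"{value:<8} {freq:<12} {cum}")
```

The last cumulative frequency always equals the total number of observations, which is a quick check that no value was missed.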
3. Grouped Frequency Distributions

• When dealing with a large range of data, it can be helpful to group the data
into intervals or classes. A grouped frequency distribution summarizes data
by grouping adjacent values into class intervals and showing the frequency
of data points within each interval. This method is particularly useful for
continuous data or when the dataset is large (e.g. over 30 observations).
• Example:
A grouped frequency distribution for a dataset of 25 observations might
look like this:
Table 03. A dataset of scores for a class of students

Age range    Frequency    Relative frequency    Cumulative frequency
10-19        5            0.20                  5
20-29        8            0.32                  13
30-39        6            0.24                  19
40-49        4            0.16                  23
50-59        2            0.08                  25
Total        25           1.00
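The relative and cumulative frequency columns of Table 03 follow directly from the class frequencies. The sketch below assumes the first class is 10-19 and takes the frequencies from the table; the names are illustrative only.

```python
from itertools import accumulate

# Class intervals and frequencies from Table 03 (first class assumed to be 10-19).
classes = ["10-19", "20-29", "30-39", "40-49", "50-59"]
frequencies = [5, 8, 6, 4, 2]

total = sum(frequencies)

# Relative frequency = class frequency / total number of observations.
relative = [round(f / total, 2) for f in frequencies]

# Cumulative frequency = running total of the class frequencies.
cumulative = list(accumulate(frequencies))

for label, f, rf, cf in zip(classes, frequencies, relative, cumulative):
    print(f"{label:<8} {f:<4} {rf:<6} {cf}")
```

The relative frequencies should add up to 1 and the final cumulative frequency should equal the total.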
Advantages of Tabular Data Presentation
• Clarity: Tables can present data in a clear and concise manner, making it
easier to identify patterns and trends.
• Ease of Comparison: Tables facilitate the comparison of different data
points or categories.
• Compactness: Tables can effectively summarize large amounts of data
without overwhelming the reader.
• Accessibility: Well-designed tables are straightforward to read and
understand, even for those without advanced statistical knowledge.
Graphical data presentation
Graphical data representation is a key component of data analysis, allowing
for the visualization of data patterns, trends, and distributions.
Here, we will delve into three specific types of graphical data presentations:
Histograms, Frequency polygons, and Cumulative frequency
distributions.
1. Histogram
• Concept: A histogram is a graphical representation of the distribution of
numerical data. It is an estimate of the probability distribution of a
continuous variable and consists of contiguous (touching) bars where each
bar represents the frequency of data points falling within the specific
interval.
• Example: a histogram can be drawn from the following dataset of exam marks.
Table 04. A dataset of exam marks for a class of BPL 1.3 business statistics

Class interval    Frequency    Class boundary    Cumulative frequency
20-25             10           19.5-25.5         10
26-30             28           25.5-30.5         38
31-35             32           30.5-35.5         70
36-40             45           35.5-40.5         115
41-45             50           40.5-45.5         165
46-50             35           45.5-50.5         200
51-55             12           50.5-55.5         212
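Class boundaries sit halfway between the upper limit of one class and the lower limit of the next, which is what makes the histogram bars touch. A minimal sketch, assuming the integer class limits of Table 04 with a gap of 1 between neighbouring classes (the function name is hypothetical):

```python
def class_boundaries(lower, upper, gap=1):
    """Return (lower boundary, upper boundary) for a class with the given limits.

    `gap` is the distance between one class's upper limit and the next
    class's lower limit (1 for whole-number marks).
    """
    half = gap / 2
    return lower - half, upper + half

# Class limits from Table 04.
limits = [(20, 25), (26, 30), (31, 35), (36, 40), (41, 45), (46, 50), (51, 55)]
boundaries = [class_boundaries(lo, hi) for lo, hi in limits]

for (lo, hi), (lb, ub) in zip(limits, boundaries):
    print(f"{lo}-{hi}: {lb}-{ub}")
```

Because each upper boundary equals the next lower boundary, the bars drawn on these boundaries are contiguous.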
Histogram
[Figure: Histogram showing marks for the BPL 1.3 business statistics class. x-axis: class boundary (19.5-25.5, 25.5-30.5, 30.5-35.5, 35.5-40.5, 40.5-45.5, 45.5-50.5, 50.5-55.5); y-axis: frequency (scale 10 to 60).]

Histogram interpretation

• The x-axis represents the class boundaries.
• The y-axis represents the frequency (number of scores) in each interval.
• Each bar's height corresponds to the frequency of scores falling within that interval.
• The histogram shows that the most common scores fall within the range of 41-45,
with a frequency of 50.
• The distribution appears slightly skewed to the left (negatively skewed): the peak
lies towards the higher classes and the tail extends towards the lower scores, so there
are more high scores than low scores.
• A histogram easily shows the shape of the data distribution.
• It highlights the central tendency, variability, and skewness of the data.
2. Frequency Polygon
A frequency polygon is a graphical device for understanding the shapes of distributions. It is
similar to a histogram but uses a line graph to represent frequencies instead of bars.
Construction:
• X-axis: Midpoints of the intervals used in a histogram (class marks).
• Y-axis: Frequencies of the corresponding intervals.
• Points are plotted at the midpoints of each interval and connected with straight lines.
• Steps to Construct a Frequency Polygon:
• Calculate the midpoint for each interval.
• Plot points corresponding to the frequency at each midpoint.
• Connect the points with straight lines.
• Optionally, connect the first and last points to the x-axis to close the polygon.
Table 05. A dataset of exam marks for a class of BPL 1.3 business statistics

Class interval    Frequency    Class mark
20-25             10           22.5
26-30             28           28
31-35             32           33
36-40             45           38
41-45             50           43
46-50             35           48
51-55             12           53
Total             212
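The construction steps for a frequency polygon can be sketched as follows. The class limits and frequencies are the ones used in the example; the midpoint of each class is (lower limit + upper limit) / 2, so the first class, 20-25, works out to 22.5. The variable names are assumptions for the sketch.

```python
# Class limits and frequencies from the exam-marks example.
limits = [(20, 25), (26, 30), (31, 35), (36, 40), (41, 45), (46, 50), (51, 55)]
frequencies = [10, 28, 32, 45, 50, 35, 12]

# Step 1: calculate the midpoint (class mark) of each interval.
class_marks = [(lo + hi) / 2 for lo, hi in limits]

# Step 2: pair each class mark with its frequency; these are the points
# that are plotted and then joined with straight lines.
points = list(zip(class_marks, frequencies))

for mark, freq in points:
    print(mark, freq)
```

Any plotting tool can then draw a line through `points`; the plotting step itself is left out here to keep the sketch free of extra dependencies.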
Graph 2.2: Frequency polygon
[Figure: A frequency polygon showing exam marks for a class of BPL 1.3 business statistics. x-axis: class mark (20 to 55); y-axis: frequency (0 to 60).]
Notes
• The frequency polygon can serve as an alternative to a histogram. Both
visual representations show the shape of a distribution.
• The frequency polygon represents the frequency distribution of
continuous data graphically. Its relevance lies in its ability to visually
represent data, allowing for more straightforward interpretation and
analysis. As a result, it is a valuable tool in statistics, helping researchers
to identify patterns and trends in large data sets.
3. Cumulative Frequency Distribution
• A cumulative frequency distribution is a tabular summary that shows, for each value
or class (sorted from the smallest to the largest), the running total of the
frequencies up to and including it.
• Plotted as a graph (the cumulative frequency curve, or ogive), it shows the
cumulative frequency of data points up to a certain value.
• It is constructed by plotting the cumulative frequency on the y-axis against the
corresponding upper class boundaries on the x-axis, since each cumulative frequency
counts every observation up to the top of its class.
• Cumulative frequency distributions are useful for analyzing the proportion of data
points below or above certain thresholds and identifying percentiles and quartiles.
Construction of a Graph for Cumulative
Frequency Distribution:
• Data Preparation: Organize the dataset in ascending order.
• Calculation: Calculate the cumulative frequency for each class by adding
up the frequencies as you progress through the dataset.
• Plotting: Plot the cumulative frequencies on the y-axis against the
upper boundaries of their classes on the x-axis.
• For example, consider the dataset of exam scores of a BBA statistics class:
70, 75, 80, 85, 90, 75, 85, 80, 85, 90, 95, 85, 90, 85, 80, 65, 70, 75
Table 06. Cumulative frequency distribution for the exam scores of the BBA statistics class

Class interval    Class boundary    Class mark    Frequency    Cumulative frequency
65-69             64.5-69.5         67            1            1
70-74             69.5-74.5         72            2            3
75-79             74.5-79.5         77            3            6
80-84             79.5-84.5         82            3            9
85-89             84.5-89.5         87            5            14
90-94             89.5-94.5         92            3            17
95-99             94.5-99.5         97            1            18
Total                                             18
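The frequency and cumulative frequency columns of Table 06 can be reproduced from the raw scores. The sketch below bins the 18 scores into the classes of the table and then builds the points of a cumulative frequency curve. Pairing each cumulative frequency with the upper boundary of its class is an assumption made here (the usual convention for a "less than" ogive); the names are illustrative.

```python
from itertools import accumulate

# Raw exam scores from the example.
scores = [70, 75, 80, 85, 90, 75, 85, 80, 85, 90, 95, 85, 90, 85, 80, 65, 70, 75]

# Class limits used in Table 06.
limits = [(65, 69), (70, 74), (75, 79), (80, 84), (85, 89), (90, 94), (95, 99)]

# Frequency: how many scores fall inside each class (limits inclusive).
frequencies = [sum(lo <= s <= hi for s in scores) for lo, hi in limits]

# Cumulative frequency: running total of the class frequencies.
cumulative = list(accumulate(frequencies))

# Assumed convention: plot each cumulative frequency at the class's upper boundary.
upper_boundaries = [hi + 0.5 for _, hi in limits]
ogive_points = list(zip(upper_boundaries, cumulative))

for boundary, cum in ogive_points:
    print(boundary, cum)
```

Reading the curve at half the total (9 of 18 observations) gives an estimate of the median, which is one way the graph is used to find percentiles and quartiles.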
Cumulative frequency distribution curve (Ogive)
[Figure: A graph showing the cumulative frequency distribution of exam scores of the BBA statistics class. x-axis labelled "Lower Class Boundary" (60 to 100); y-axis: cumulative frequency (0 to 20).]
Graph interpretation
In this cumulative frequency distribution graph:
• Each point corresponds to one class, with the x-coordinate representing the class boundary and the
y-coordinate representing the cumulative frequency up to that boundary.
• The graph shows how the cumulative frequency increases as we move through the sorted dataset.
• The graph allows us to observe the distribution of scores and how the cumulative frequency
accumulates as scores increase.
• It provides insights into the spread of scores and the relative frequencies of different score ranges.
• Graphical presentation of cumulative frequency distribution facilitates a visual understanding of the
distribution of data, making it easier to interpret and analyze.
Advantages of graphical data presentation
• Visual clarity: Graphs and charts provide a clear and concise representation of data, making it easier to interpret and
understand complex patterns and relationships.

• Comparative analysis: Graphical presentations enable comparison between different datasets or variables, facilitating
insights into trends, differences, and similarities.

• Engaging communication: Visualizations are often more engaging and memorable than numerical tables, enhancing
audience comprehension and retention of information.

• Decision-making support: Graphical representations of data can aid decision-making processes by highlighting key
insights and trends, enabling stakeholders to make informed decisions based on evidence.
Disadvantages of graphical presentations
• Accessibility Issues: Certain types of graphs or charts may not be
accessible to all individuals, particularly those with visual impairments or
certain disabilities. Without alternative formats or accommodations, such
individuals may struggle to interpret the information presented graphically.
• Time-consuming to Create: Creating high-quality graphical presentations
can be time-consuming, particularly when dealing with large or complex
datasets. This can be a disadvantage when time is limited or when frequent
updates to the data presentation are required.
Disadvantages…cont’d
• Resource Intensive: Graphical presentations may require specialized
software or tools to create, edit, and display effectively. Additionally, they
may consume more computational resources than simpler forms of data
presentation, particularly when dealing with interactive or dynamic
visualizations.
• Subjectivity: The design and interpretation of graphical presentations can be
subjective, influenced by the choices made by the presenter or designer.
Different individuals may interpret the same graph differently, leading to
potential disagreements or misunderstandings.
Disadvantages…cont’d
• Potential for Misinterpretation: Graphical representations can sometimes be
misinterpreted, especially if they are not properly labeled or if the scale is distorted.
For example, changing the scale of a graph can exaggerate or minimize differences
between data points, leading to inaccurate conclusions.

• Limited Detail: Graphs and charts often provide a condensed summary of data,
which may omit certain details or nuances present in the raw data. This can be a
disadvantage when a more comprehensive understanding of the data is required.
Disadvantages…cont’d
• Overemphasis on Visual Appeal: Sometimes, graphical presentations
prioritize aesthetics over clarity or accuracy. This can result in visually
appealing but misleading visualizations that fail to effectively communicate the
intended message.

• Difficulty with Complex Data: Graphical representations may struggle to
effectively convey complex relationships or multivariate data. In such cases,
graphical presentations may oversimplify the data or fail to capture all relevant
dimensions, leading to incomplete or distorted insights.
Diagrammatical e.g. Charts

• Pie Chart:
• A pie chart is a circular chart divided into slices, each representing a proportion of
the whole data.
• It is useful for showing the composition of a categorical variable as parts of a whole.
• Pie charts are ideal for visualizing percentages and relative proportions but become less
effective when there are many categories, because the slices get too small to compare.
• Example: shares owned by shareholders of a company:
• Shareholder A: 40%, shareholder B: 30%, shareholder C: 20%, shareholder D: 10%
Pie chart
[Figure: A pie chart showing percentages of shares owned by shareholders. Slices: Shareholder A 40, Shareholder B 30, Shareholder C 20, Shareholder D 10.]
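To draw a pie chart by hand, each percentage is converted into a slice angle out of 360 degrees. A minimal sketch using the shareholder percentages from the example (the dictionary and variable names are assumptions for illustration):

```python
# Percentage of shares owned by each shareholder, from the example.
shares = {
    "Shareholder A": 40,
    "Shareholder B": 30,
    "Shareholder C": 20,
    "Shareholder D": 10,
}

total = sum(shares.values())

# Slice angle = (share of the whole) x 360 degrees.
angles = {name: pct * 360 / total for name, pct in shares.items()}

for name, angle in angles.items():
    print(f"{name}: {angle} degrees")
```

The angles should add up to 360 degrees, which confirms that the slices cover the whole circle.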


Conclusion
• It's important to select the most appropriate type of graph or chart based on
the nature of the data and the insights to be conveyed. Additionally,
graphical presentations should be accompanied by clear labels, titles, and
legends to ensure proper interpretation and understanding by the audience.

END
Question Seven

Discuss the concept of probability theory


• Key concepts,
• Probability of an event,
• Rules of probability,
• Probability distributions such as Normal distributions,
• Binomial distributions and
• Poisson distributions
PROBABILITY THEORY

• Probability theory is a branch of mathematics that deals with the study of chance
events and their likelihood of occurrence.
• It provides a mathematical framework for analyzing and modelling uncertain events,
making predictions and estimating the likelihood of occurrences, e.g. in:
o weather forecasting
o sports outcomes
o card games and other chance games
o insurance
o medical diagnosis
o election outcomes
o Shopping recommendations, etc.
Key concepts in probability theory
Events – These are the occurrences or outcomes of a random experiment or
activity. E.g., when tossing a coin, the events are either getting a head or a
tail.
Probability – This is a number between 0 and 1 representing the
likelihood of an event occurring. It can also be expressed as percentages
ranging from 0% to 100%. A probability of 0 indicates that there is no
chance that a particular event will occur, whereas a probability of 1(100%)
indicates that an event is certain to occur. A probability of 0.45 (45%)
indicates that there are 45 chances out of 100 of the event occurring.
Key concepts….cont’d
Random variables – These are variables whose possible values are determined by chance.
E.g., when tossing a coin, the number of heads obtained (0 or 1) is a random variable.
Sample space (S) – The set of all possible outcomes of a random experiment. E.g., when a
coin is tossed, there are only two possible outcomes, head and tail. So the sample space is
S = {H, T}, which contains 2 outcomes.
Independence – This is when the probability of an event is not affected by another event’s
probability. E.g. when tossing a coin, the probability of getting a head is independent of the
probability of getting a 6 when a die is rolled.
Conditional probability- This is the probability of an event given that another event has
occurred.
Probability of an event
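Probability, P, of an event, E, is the likelihood that the event will occur.
For any event E, 0 ≤ P(E) ≤ 1, where P(E) is the probability of E.
When all outcomes are equally likely,
$$P(E) = \frac{\text{number of outcomes favourable to } E}{\text{total number of possible outcomes}}$$
For example, when we toss a coin, the probability, P, of getting a head (H)
is calculated as below:
$$P(H) = \frac{\text{number of ways of getting a head}}{\text{total number of possible outcomes}} = \frac{1}{2}$$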
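The coin calculation can be checked with a short sketch. It assumes every outcome in the sample space is equally likely, and the helper name `probability` is hypothetical.

```python
from fractions import Fraction

def probability(event, sample_space):
    """Favourable outcomes divided by all outcomes (equally likely outcomes assumed)."""
    favourable = [outcome for outcome in sample_space if outcome in event]
    return Fraction(len(favourable), len(sample_space))

# Sample space for one toss of a coin.
coin = {"H", "T"}

p_head = probability({"H"}, coin)
print(p_head, float(p_head))
```

Using `Fraction` keeps the result exact, so it can be compared directly with the hand calculation.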
Rules of Probability
• Non-negativity: There’s no negative probability. The probability of an
impossible event is 0 and the probability of a certain event is 1. Therefore,
for any event A, the range of possible probabilities is: 0 ≤ P(A) ≤ 1
• Normalization: The sum of the probabilities of all possible outcomes
(the sample space) of a random experiment is equal to 1. E.g., when a coin is
tossed, the sum of the probability of getting a head, P(H), and the
probability of getting a tail, P(T), is equal to 1.
P(H) + P(T) = 1
Rules of Probability…cont’d
• Complementarity: The probability of the complement (opposite) of an event is 1
minus the probability of the event. Thus, for any event A, P(A’) = 1 - P(A).
• Mutual exclusivity: If two events, A and B, are mutually exclusive (also called
disjoint events), then A and B cannot occur at the same time. Thus the probability
that both events occur is P(A ∩ B) or P(A and B) = 0.
The probability of either event happening is given by P(A ∪ B) or P(A or B) = P(A)
+ P(B).
If the two events are NOT mutually exclusive, then P(A or B) = P(A) + P(B) - P(A and
B).
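These rules can be checked on a small example. The sketch below assumes one roll of a fair six-sided die; the three events (`even`, `above_four`, `low`) are hypothetical and chosen only so that one pair overlaps and one pair is mutually exclusive.

```python
from fractions import Fraction

# Sample space for one roll of a fair six-sided die.
die = {1, 2, 3, 4, 5, 6}

def p(event):
    """Probability of an event (a set of outcomes), all outcomes equally likely."""
    return Fraction(len(event & die), len(die))

even = {2, 4, 6}      # hypothetical event A
above_four = {5, 6}   # hypothetical event B (overlaps A at 6)
low = {1, 2}          # hypothetical event C (shares no outcome with B)

# Complement rule: P(A') = 1 - P(A).
p_not_even = p(die - even)

# Mutually exclusive events: P(B or C) = P(B) + P(C).
p_low_or_above_four = p(low | above_four)

# Overlapping events: P(A or B) = P(A) + P(B) - P(A and B).
p_even_or_above_four = p(even | above_four)

print(p_not_even, p_low_or_above_four, p_even_or_above_four)
```

In each case the probability counted directly from the combined set agrees with the value given by the rule.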
Rules of Probability…cont’d
• Dependency (Conditional Probability): This is the probability of an event
given that another event has occurred. For events A and B, the probability
that both events occur is
P(A and B) = P(A)*P(B|A) or P(B)*P(A|B).
• Note: The vertical bar, |, does not mean divide! It means "conditional on"
or "given". For instance, P(A|B) means the probability that event A occurs given
that event B has occurred. It is given by:
• $P(A|B) = \frac{P(A \text{ and } B)}{P(B)}$ and $P(B|A) = \frac{P(A \text{ and } B)}{P(A)}$, provided P(B) > 0 and P(A) > 0 respectively.
For mutually exclusive events, $P(A|B) = \frac{P(A \text{ and } B)}{P(B)} = \frac{0}{P(B)} = 0$. Also, $P(B|A) = \frac{0}{P(A)} = 0$.
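A small numerical check of the conditional probability rule, again assuming one roll of a fair die. The events and the helper `p_given` are hypothetical names introduced for this sketch.

```python
from fractions import Fraction

# Sample space for one roll of a fair six-sided die.
die = {1, 2, 3, 4, 5, 6}

def p(event):
    """Probability of an event, all outcomes equally likely."""
    return Fraction(len(event & die), len(die))

def p_given(a, b):
    """P(A|B) = P(A and B) / P(B); only defined when P(B) > 0."""
    return p(a & b) / p(b)

even = {2, 4, 6}      # hypothetical event A
above_four = {5, 6}   # hypothetical event B
low = {1, 2}          # hypothetical event, mutually exclusive with B

# Given that the roll is above four (5 or 6), only 6 is even.
p_even_given_above_four = p_given(even, above_four)

# Mutually exclusive events: the conditional probability is 0.
p_low_given_above_four = p_given(low, above_four)

print(p_even_given_above_four, p_low_given_above_four)
```

Multiplying back, P(B)*P(A|B) and P(A)*P(B|A) both reproduce P(A and B), which is the multiplication rule stated in the text.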
Rules of Probability…cont’d
• Independence: If A and B are independent events, neither event
influences or affects the probability that the other event occurs. The
probability of independent events is given by P(A ∩ B) or P(A and B) =
P(A)*P(B). This particular rule extends to more than two independent
events. E.g., P(A and B and C) = P(A)*P(B)*P(C).
• Inclusion–Exclusion principle (Rule): This states that the probability of the
union of any two events is the sum of their probabilities minus the
probability of their intersection, i.e. P(A ∪ B) = P(A) + P(B) – P(A ∩ B).
PROBABILITY DISTRIBUTIONS
• A probability distribution is a statistical function that describes all the
possible values and probabilities for a random variable within a given
range.
• This range is bounded by the minimum and maximum possible values,
but where a particular value is likely to fall within the probability
distribution is determined by a number of factors, such as the mean
(average), standard deviation, skewness, and kurtosis of the distribution.
Types of Probability Distribution
Probability distributions are divided into two types:
• Discrete Probability Distributions
• Continuous Probability Distributions
Discrete Probability Distribution
• A discrete distribution describes the probability of occurrence of
each value of a discrete random variable(one which may take on
only a countable number of distinct values such as 0,1,2,3,4).
Discrete Probability Distribution

• The probability distribution of a discrete random variable X is a list of
each possible value of X together with the probability that X takes that
value in one trial of the experiment.
The probabilities in the probability distribution of a random variable X must
satisfy the following two conditions:
• Each probability P(X) must be between 0 and 1. ie, 0≤P(X)≤1.
• The sum of all the probabilities is 1: ΣP(X)=1.
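The two conditions can be checked mechanically. The sketch below uses a hypothetical example, the number of heads in two tosses of a fair coin, and a hypothetical helper `is_valid_distribution`.

```python
from fractions import Fraction

def is_valid_distribution(dist):
    """Check that every probability lies in [0, 1] and that they sum to 1."""
    probs = list(dist.values())
    return all(0 <= pr <= 1 for pr in probs) and sum(probs) == 1

# Hypothetical example: X = number of heads in two tosses of a fair coin.
heads_in_two_tosses = {
    0: Fraction(1, 4),
    1: Fraction(1, 2),
    2: Fraction(1, 4),
}

print(is_valid_distribution(heads_in_two_tosses))
```

Exact fractions are used so that the sum can be compared with 1 directly; with decimal (floating-point) probabilities a small tolerance would be needed instead.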
Types of Discrete Probability Distributions

Binomial distribution
A binomial probability distribution describes the number of successes in a fixed
number of repeated, independent trials, where each trial has only two possible
outcomes. The result of each trial is classified as either success or failure
(often coded as one or zero), and the probability of success is the same on every
trial. E.g., flipping a coin gives you an outcome from the list {Heads, Tails}.
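For n independent trials with success probability p, the standard binomial formula is P(X = k) = C(n, k) p^k (1 - p)^(n - k). A minimal sketch, assuming a hypothetical experiment of 10 flips of a fair coin with "heads" counted as success:

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k): probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Hypothetical experiment: 10 flips of a fair coin, success = heads.
n, p = 10, 0.5

p_five_heads = binomial_pmf(5, n, p)

# The probabilities of 0, 1, ..., n successes must add up to 1.
total = sum(binomial_pmf(k, n, p) for k in range(n + 1))

print(p_five_heads, total)
```

Here `p_five_heads` equals 252/1024, since there are 252 ways to choose which 5 of the 10 flips are heads.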
Types …contd
Bernoulli distribution
Bernoulli distributions are similar to binomial distributions because there are
two possible outcomes, but only one trial is conducted. The outcomes in a
Bernoulli distribution are labeled as either a zero or a one. A one indicates
success, and a zero means failure (one trial is called a Bernoulli trial).
E.g., if you placed one green marble (for success) and one red marble (for
failure) in a covered bowl and chose without looking, you would record the
result as a zero or one rather than success or failure for your sample.
Discrete Probability Distributions…cont’d

Poisson Distribution
The Poisson distribution expresses the probability that a given number of
events will occur over a fixed period, assuming the events occur independently
and at a constant average rate.
For instance, say you have a covered bowl with one red and one green
marble, and your chosen period is two minutes. Your test is to record
whether you pick the green or red marble, with the green indicating success.
After each test, you place the marble back in the bowl and record the results.
The quantity of interest is the number of successes recorded within the two-minute period.
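If events occur at an average rate of λ per period, the standard Poisson formula is P(X = k) = e^(-λ) λ^k / k!. The sketch below assumes a hypothetical rate of λ = 2 events per period; that value is not taken from the example above.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k): probability of exactly k events when the average is lam per period."""
    return exp(-lam) * lam**k / factorial(k)

# Hypothetical average rate: 2 events per fixed period.
lam = 2.0

p_no_events = poisson_pmf(0, lam)
p_at_most_two = sum(poisson_pmf(k, lam) for k in range(3))

print(p_no_events, p_at_most_two)
```

Unlike the binomial distribution, there is no upper limit on the count k, but the probabilities become negligible once k is far above λ.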
Discrete Probability Distributions…cont’d

Multinomial distributions.
Multinomial distributions occur when each trial has more than two possible
outcomes and the number of times each outcome occurs is counted.
For instance, say you have a covered bowl with one green, one red, and one
yellow marble. For your test, you record the number of times you randomly
choose each of the marbles for your sample.
Continuous Probability Distributions
• A continuous distribution describes the probabilities of possible values of
a continuous random variable.
• A continuous random variable has an infinite and uncountable set of
possible values (known as the range). E.g., height could be any one of the
infinitely many values between 5 and 5.9 feet.
• The probability that a continuous random variable falls within an interval is
given by the area under the curve of the Probability Density Function (PDF)
over that interval.
Probability density function
• The probability density function, f(x), describes the probability that the
variable X falls between two values (a and b). That probability is equal to
the area under the curve from a to b:
$$P(a \le X \le b) = \int_a^b f(x)\,dx, \quad f(x) \ge 0$$
X is a continuous random variable that can take on any value within a given
range of values.
Types of continuous probability distribution

• Normal Distribution (also called Gaussian distribution)


This probability distribution is symmetrical around its mean value. In a normal
probability distribution, most of the observations cluster around the central peak. The
normal PDF is symmetric with zero skewness, so its median and mode
are the same as its mean.
The area under the normal distribution curve represents probability, and the curve is given by
the function:
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$
where µ is the mean of the distribution, σ is the standard deviation, and σ² is the variance.
Shape of normal distribution curve
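The normal density and the area under it can be evaluated with Python's standard `math` module. The sketch below assumes a hypothetical distribution with mean 50 and standard deviation 10, and uses the error function `erf` to obtain the area under the curve; the function names are illustrative.

```python
from math import erf, exp, pi, sqrt

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Height of the normal curve at x."""
    return exp(-((x - mu) ** 2) / (2 * sigma**2)) / (sigma * sqrt(2 * pi))

def normal_cdf(x, mu=0.0, sigma=1.0):
    """Area under the normal curve to the left of x."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

def normal_prob(a, b, mu=0.0, sigma=1.0):
    """P(a <= X <= b): area under the curve between a and b."""
    return normal_cdf(b, mu, sigma) - normal_cdf(a, mu, sigma)

# Hypothetical distribution: mean 50, standard deviation 10.
mu, sigma = 50.0, 10.0

within_one_sd = normal_prob(mu - sigma, mu + sigma, mu, sigma)
print(round(within_one_sd, 4))
```

About 68% of the area lies within one standard deviation of the mean, whatever the values of µ and σ, which reflects the clustering around the central peak described above.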
continuous probability distribution…cont’d
• Continuous Uniform Distribution
It describes a continuous random variable that can take any value lying
between certain bounds, with an infinite number of equally likely measurable
values.
In a continuous uniform distribution, all outcomes are equally likely, i.e.
every value between the bounds has the same probability density.
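For a uniform distribution on the interval from a to b, the probability of falling in a sub-interval is simply its length divided by (b - a). A minimal sketch, reusing the height range of 5 to 5.9 feet mentioned earlier purely as an assumed illustration (real heights are not uniformly distributed):

```python
def uniform_prob(c, d, a, b):
    """P(c <= X <= d) for X uniform on [a, b]: overlapping length / (b - a)."""
    lo, hi = max(a, c), min(b, d)
    return max(0.0, hi - lo) / (b - a)

# Assumed bounds for illustration: X is uniform between 5.0 and 5.9.
a, b = 5.0, 5.9

p_middle_third = uniform_prob(5.3, 5.6, a, b)
print(p_middle_third)
```

Any two sub-intervals of the same length have the same probability, which is what "equally likely" means for a continuous variable.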
Conclusion
• Probability provides a foundation for quantifying our uncertainty in the
world. As such, understanding probability is necessary for us to make
decisions. The concept of probability occupies an important role in the
decision-making process, whether the problem is one faced in business, in
engineering, in government, in sciences, or just in one’s own everyday
life. “Most decisions are made in the face of uncertainty”

Thank you
