Eportfolio
Our Statistics 1040 class was assigned a project that we would contribute to
and assess throughout the semester. Each class member purchased a 2.17-ounce
bag of Skittles, recorded the number of candies of each color in his or her bag, and
compared the data with the rest of the class. The objective was to learn how to
organize and collect data, predict potential probabilities, draw conclusions from the
of candies per color in each Skittles bag. Each person recorded the following
We then submitted the data to our professor who created a dotplot of the
total candies in each bag:
Using the Skittles data, I created the charts and graphs I had learned about in
class. The first example shown is a pie chart displaying the total data from the
class. The second example shown is a pie chart expressing the data from my
Skittles data. These groups remained the same throughout the semester. We
contributed to the group on a discussion board, and also submitted our individual
the probability, or chance, that the colors in the Skittles bags were randomly
assumed each color would be roughly equal, at about 20% of the total. If you
look at the results of the class data set, you can see that the deviation from 20%
ranges from +0.014 to -0.010. With a sample size of 2,268, a 20% share would mean a count
of about 454 of each color. We think that the estimate of 20% per color is valid and
relatively accurate. We resisted the temptation to make an inference from our own
bags of Skittles because we considered the sample size to be too small, although
If you consider the overall class data as a sample then the population would
purchased by our class as the population then of course every individual bag would
be a sample. In either case, unless there are distribution factors we are unaware of,
               Count Red   Count Orange   Count Yellow   Count Green   Count Purple   Total
My Bag                13             12             10             9             16      60
Class Counts         464            439            485           449            431    2268
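The 20% check described above can be reproduced from the class counts in the table (464 red, 439 orange, 485 yellow, 449 green, 431 purple). The following is a minimal sketch in Python; the variable names are my own and are not part of the original project, which used a TI-84.

```python
# Class counts per color, read from the class row of the table.
class_counts = {"Red": 464, "Orange": 439, "Yellow": 485, "Green": 449, "Purple": 431}

total = sum(class_counts.values())        # total candies in the class sample
expected_share = 1 / len(class_counts)    # 20% if all five colors are equally likely
expected_count = total * expected_share   # expected candies per color

# Deviation of each color's observed share from the expected 20%.
deviations = {}
for color, count in class_counts.items():
    share = count / total
    deviations[color] = share - expected_share
    print(f"{color:<7} count={count:4d}  share={share:.3f}  deviation={deviations[color]:+.3f}")

print(f"Expected count per color: {expected_count:.1f}")
print(f"Largest deviation:  {max(deviations.values()):+.3f}")
print(f"Smallest deviation: {min(deviations.values()):+.3f}")
```

Yellow sits furthest above 20% and purple furthest below, which is where the +0.014 and -0.010 figures come from.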
59.5
2.9
1.
The frequency histogram shows the total candies in each bag to be approximately
symmetrical and bell-shaped. The box plot is also symmetric because the
observations are split almost equally at the median. I expected the graphs to look
as pictured because I assumed the number of Skittles in each bag would be kept
consistent to match the weight advertised on the package. In this day and age, when
people are measuring 6-inch Subway sandwiches, Wrigley had better get it right! The overall
data agreed with my single bag data.
2.
Categorical (qualitative) data classify individuals based on some attribute or
characteristic. The categories do not necessarily need to have a logical order. I always
remember the definition of categorical or qualitative data by the word quality, which
is used as a descriptive term. So, to me, categorical = description. Some examples
of categorical data are gender, race, educational level, makes and models of cars,
and colors.
Quantitative data give numerical measures of individuals. These values can be
added or subtracted. I associate quantitative with the word quantity, which means
amount or number. The amount can be indefinite or definite. Some examples of
quantitative data are height, weight, size of a room, amount of ingredients in
recipes, and monetary tips at restaurants.
There are several graphs that work well for categorical data. Side-by-side bar graphs
are optimal because they can make comparisons without requiring numerical
information and can give a visual explanation. Frequency tables can list categories
of data along with occurrences for each category, such as colors in a Skittles bag. A
Pareto chart expresses the same information as a bar graph with the bars arranged in
descending order of frequency. Finally, pie charts show each category as a sector
whose size is proportional to its frequency.
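As an illustration of a frequency table in Pareto order, here is a minimal Python sketch using the class color counts reported earlier in this portfolio. The cumulative column is an extra I added because Pareto charts often include it; it was not part of the original assignment, and the variable names are my own.

```python
# Class color counts from the Skittles project.
class_counts = {"Red": 464, "Orange": 439, "Yellow": 485, "Green": 449, "Purple": 431}
total = sum(class_counts.values())

# Pareto order: categories sorted from most frequent to least frequent.
pareto = sorted(class_counts.items(), key=lambda item: item[1], reverse=True)

rows = []
cumulative = 0.0
print(f"{'Color':<8}{'Count':>6}{'Rel. freq.':>12}{'Cumulative':>12}")
for color, count in pareto:
    relative = count / total   # this is also the share of a pie chart sector
    cumulative += relative
    rows.append((color, count, relative, cumulative))
    print(f"{color:<8}{count:>6}{relative:>12.3f}{cumulative:>12.3f}")
```

The relative frequency column holds the same proportions a pie chart would display as sectors.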
Quantitative data have a lot of options for graphs such as histograms, frequency
polygons, ogives, time-series plots, and stem-and-leaf plots. Histograms show the
frequency or relative frequency by using rectangles for each class of data.
Frequency polygons use points connected by line segments, ogives present
cumulative frequency and cumulative relative frequency in the graphs, and time-
series plots show the value of the variable measured at different points in time.
Stem-and-leaf plots are useful for quantitative data because they display raw data,
but they are not helpful with large amounts of data, and would not be applicable to
categorical data because stem-and-leaf plots require specific digits and not categories.
Histograms, I've learned, can be confused with bar charts, but they are not the
same. Bar charts are used for categorical data because they compare counts across
separate categories. Histograms are used for quantitative data and show how often
values fall within each numerical class.
For the fourth group project, we each created a confidence interval for a
population proportion and a population mean of the Skittles bags and added them to
Group Project 4
1) n = 2268 (total Skittles), x = 485 (total yellow Skittles), α = .01, α/2 = .005, critical value = 2.576
Using TI-84 1-PropZInt with C-Level 99%: Lower bound = .1917, Upper bound = .2360
I am 99% confident that the population proportion of yellow Skittles lies between .1917 and .2360.
I am 95% confident that the true value of the population mean of candies per bag lies between 59.38 and
59.62.
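The proportion interval in item 1 can be checked without a calculator. The sketch below assumes the usual normal-approximation formula for a one-proportion z-interval, p̂ ± z*·sqrt(p̂(1 − p̂)/n), which I believe is what 1-PropZInt computes; the function name is my own.

```python
from math import sqrt
from statistics import NormalDist


def one_prop_z_interval(x, n, confidence):
    """Normal-approximation confidence interval for a population proportion."""
    p_hat = x / n                                  # sample proportion
    alpha = 1 - confidence                         # e.g. 0.01 for 99% confidence
    z_star = NormalDist().inv_cdf(1 - alpha / 2)   # critical value for alpha/2 in each tail
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin, z_star


# Yellow Skittles in the class sample: x = 485 out of n = 2268, 99% confidence.
lower, upper, z_star = one_prop_z_interval(485, 2268, 0.99)
print(f"critical value = {z_star:.3f}")
print(f"99% interval   = ({lower:.4f}, {upper:.4f})")
```

The result agrees with the calculator output reported above to four decimal places.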
What I've learned
Not only did I discover that the TI-84 is my best friend, but I was also able to use critical
thinking skills to break down complicated problems and find solutions. As shown,
the Skittles project utilized many facets of statistics we'd learned throughout the
fast-paced semester. I learned a lot visually with pie charts, Pareto charts,
histograms, dotplots and boxplots. This seems to paint a clearer picture of stats to a
suspected the color distribution was based on cost rather than chance, but with
tools such as mean, median and mode, probability and confidence intervals, the
data show that Skittles colors appear to be random. I can see the bigger picture now
and, unlike algebra, I can see myself using statistics on a day-to-day basis.
Reflection Paper
Initially, I panicked about the Skittles term project. Not only did I have to show my work,
but I would have to display it to my peers and my teacher. I felt a little vulnerable and was not
sure I was up to the task. When it was disclosed that the project would be ongoing throughout the
semester, I audibly groaned. However, like most projects, there is a learning curve and once that
The Skittles project was surprising. It reminded me that big corporations like Mars are
selling more than sugar in a red package; they are selling accountability. Just like the guys who
sued Subway for selling subs an inch shorter than advertised, Mars could find itself in hot
water if it didn't have a consistent ratio of colors in each bag. I am not a Skittles fan
but when offered Skittles I will always choose the purple or the red. Yellow is my least favorite.
If I knew that Skittles only offered a few purples or reds per bag I would be far less inclined to
buy them in the future. All it would take is one bag with a higher proportion of the disliked color
for me to assume that is how all the bags are. How would Mars know which colors are preferred?
Statistics, of course.
I thought about how using statistics with something as simple as a bag of Skittles could
be compared to other uses of statistics in everyday life, such as relying on the weather forecast
for an outing or a trip, reading quality reviews of products I want to buy online, deciding on the
safety of fluoride for my young son at the dentist's office, finding certain brands that aren't always
eye-level at the grocery store, and planning which vegetables to plant in my garden, to name a
few.
I wouldn't say the term project was life-changing. It did buoy my resolve to go out of my
comfort zone and allow myself to make mistakes for the sake of education and learning, even
while being scrutinized. I will never look at a Skittles bag the same way again, not to mention
other candies as well. The biggest thing I learned from the term project is that I have not once