Q13 and 14
Ans:
A box plot (also called a box-and-whisker plot) uses boxes and lines to depict the distribution of one or more groups
of numeric data. The box limits indicate the range of the central 50% of the data, with a central line marking the
median value, the lower box limit marking the first quartile (Q1) and the upper box limit marking the third quartile (Q3).
Lines (whiskers) extend from each box to capture the range of the remaining data, with dots placed past the whisker ends to
indicate outliers.
The distance between Q3 and Q1 is known as the interquartile range (IQR), and it largely determines how long
the whiskers extending from the box are. Each whisker extends to the furthest data point on its side that lies
within 1.5 times the IQR of the box edge (Q1 for the lower whisker, Q3 for the upper whisker). Any data point
beyond that distance is considered an outlier and is marked with a dot.
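The whisker rule above can be sketched in a few lines of Python. This is a minimal illustration, not the method used to draw the chart in this answer: the function name `box_plot_stats` and the sample values are hypothetical, and the quartiles come from the standard library's `statistics.quantiles` with the "inclusive" method, which is only one of several common quartile conventions.

```python
from statistics import quantiles


def box_plot_stats(values):
    """Return quartiles, whisker ends and outliers using the 1.5 * IQR rule."""
    data = sorted(values)
    # Quartiles with the "inclusive" convention (an assumption; other
    # conventions give slightly different Q1 and Q3).
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers end at the most extreme data points still inside the fences.
    inside = [x for x in data if lower_fence <= x <= upper_fence]
    outliers = [x for x in data if x < lower_fence or x > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": outliers,
    }


# Hypothetical sample with one unusually large value.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]
print(box_plot_stats(sample))
```

In this sample the value 30 lies above Q3 + 1.5 * IQR, so it is reported as an outlier and the upper whisker stops at 9.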
When a data distribution is symmetric, you can expect the median (Q2) to be in the exact center of the box: the
distance between Q1 and Q2 should be the same as between Q2 and Q3. Outliers should be evenly present on
either side of the box. If a distribution is skewed, then the median will not be in the middle of the box, but
off to one side. You may also find an imbalance in the whisker lengths, where one side is short with no
outliers, and the other has a long tail with many more outliers.
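One rough way to check where the median sits inside the box is to express it as a fraction of the box length. The sketch below is only an illustration: the helper name `median_position` and both samples are hypothetical, and the quartiles again use the "inclusive" convention of `statistics.quantiles`, which is an assumption.

```python
from statistics import quantiles


def median_position(values):
    """Return (Q2 - Q1) / (Q3 - Q1): 0.5 means the median is centered.

    Values below 0.5 put the median nearer Q1 (longer upper half of the
    box); values above 0.5 put it nearer Q3 (longer lower half).
    """
    q1, q2, q3 = quantiles(sorted(values), n=4, method="inclusive")
    if q3 == q1:
        raise ValueError("IQR is zero, so the position is undefined")
    return (q2 - q1) / (q3 - q1)


# Hypothetical samples: one symmetric, one with a long upper tail.
symmetric = [1, 2, 3, 4, 5, 6, 7, 8, 9]
long_upper_tail = [1, 1, 2, 2, 3, 4, 6, 9, 15]

print(median_position(symmetric))        # 0.5
print(median_position(long_upper_tail))  # 0.25
```

A single number like this only summarizes the box; the whisker lengths and the outliers still need to be read from the plot itself.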
The following chart is a box plot showing the variation of daily DUV observed during the 12 months of 1997 in
the city of Big Bend, USA. Daily DUV was measured in J/m^2.
From the plot, we can see that there is no DUV data for January. The values of DUV gradually
increased from 2500 J/m^2 in February, peaked in June and July at around 5100 J/m^2, and then
decreased drastically in the following months to around 1400 J/m^2 in the last month (December).
There appears to be a dramatic decrease in median daily DUV in October and November.
The points show outliers in March (2), May (2), August (1), September (3), November (1) and December (2).
As most of the outliers lie below the lower whisker, and the lower whiskers are longer than the
upper whiskers, we can say that the data distribution is skewed. Additionally, within the box, the central
line representing the median (Q2) is closer to the upper box limit, indicating that the distribution is
skewed towards Q3, except in September, October and November.
Similarly, we can also see that the data is more spread out in the months from February to August 1997 than in
the remaining 4 months from September to December.
Q.No14. The following table gives the pre-monsoon rainfall (PMR) and monsoon rainfall (MR) from 1968 to 1978
(source: J.L. Shrestha, “Topoclimatology of the Kathmandu Valley”). Generate the 2-year and 3-year moving
averages, plot the graph, and draw your conclusion about which gives the better approximation.
Ans:
      Observed          2-year forecast   Absolute error    3-year forecast   Absolute error
Year  PMR      MR       PMR      MR       PMR      MR       PMR      MR       PMR      MR
1968  180.4    1000.3   -        -        -        -        -        -        -        -
1969  161.9    965      -        -        -        -        -        -        -        -
1970  154.6    1081.6   171.15   982.65   16.55    98.95    -        -        -        -
1971  318.9    1101.7   158.25   1023.3   160.65   78.4     165.63   1015.6   153.27   86.067
1972  160.8    968      236.75   1091.7   75.95    123.65   211.8    1049.4   51       81.433
1973  154.9    1454     239.85   1034.9   84.95    419.15   211.43   1050.4   56.533   403.57
1974  162.2    983.2    157.85   1211     4.35     227.8    211.53   1174.6   49.333   191.37
1975  119.2    1221.1   158.55   1218.6   39.35    2.5      159.3    1135.1   40.1     86.033
1976  222      1199     140.7    1102.2   81.3     96.85    145.43   1219.4   76.567   20.433
1977  -        -        170.6    1210.1   170.6    1210.1   167.8    1134.4   167.8    1134.4
No observed values are listed for 1977, so the error columns in that row repeat the forecasts.
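The forecast columns in the table can be reproduced with a short script. This is a sketch under stated assumptions: each forecast is taken as the plain mean of the preceding 2 or 3 observed values, which matches the tabulated figures, while the function names and the use of mean absolute error as the comparison measure are illustrative choices, not something stated in the original answer.

```python
def moving_average_forecast(series, window):
    """Forecast each value as the mean of the `window` values before it.

    The result has one more entry than `series`: the first `window`
    entries are None (no forecast possible) and the last entry is the
    forecast for the period after the series ends.
    """
    forecasts = [None] * window
    for i in range(window, len(series) + 1):
        forecasts.append(sum(series[i - window:i]) / window)
    return forecasts


def mean_absolute_error(series, forecasts):
    """Average |observed - forecast| over periods that have both values."""
    errors = [abs(a - f) for a, f in zip(series, forecasts) if f is not None]
    return sum(errors) / len(errors)


# Observed rainfall for 1968 to 1976, copied from the table.
years = list(range(1968, 1977))
pmr = [180.4, 161.9, 154.6, 318.9, 160.8, 154.9, 162.2, 119.2, 222.0]
mr = [1000.3, 965.0, 1081.6, 1101.7, 968.0, 1454.0, 983.2, 1221.1, 1199.0]

for name, series in (("PMR", pmr), ("MR", mr)):
    for window in (2, 3):
        forecasts = moving_average_forecast(series, window)
        mae = mean_absolute_error(series, forecasts)
        # Note: the 2-year errors cover 1970-1976, the 3-year errors 1971-1976.
        print(f"{name} {window}-year: 1977 forecast {forecasts[-1]:.2f}, MAE {mae:.2f}")
```

Because the two windows are scored over slightly different sets of years, restricting both to 1971 to 1976 would give a stricter comparison.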
But, on the contrary,
In the case of PMR,