IV AI-DS AD3491 FDSA Unit2
ANALYTICS
II Year/IV Semester
NOTES
Prepared By,
Mrs. S. Porkodi, AP/AI&DS
AD3491_FDSA
4931_Grace College of Engineering,Thoothukudi.
UNIT-II
PART – A
1. What is a Frequency Distribution?
A frequency distribution organizes collected data in table form. The data could be
marks scored by students, temperatures of different towns, points scored in a volleyball match,
etc. After data collection, the data must be presented in a meaningful way for better understanding:
organize it so that all its features are summarized in a table.
2. List the Types of Frequency Distribution.
* Ungrouped frequency distribution: It shows the frequency of each separate data
value rather than of groups of data values.
* Grouped frequency distribution: In this type, the data is arranged and separated into groups
called class intervals. The frequency of data belonging to each class interval is noted in a
frequency distribution table.
3. State the Frequency Distribution Table.
A frequency distribution table is a chart that shows the frequency of each item in a data
set. For example, consider making a frequency distribution table using tally marks for a jar
containing beads of different colors: red, green, blue, black, red, green, blue,
yellow, red, red, green, green, green, yellow, red, green, yellow.
Tallying each color gives red 5, green 6, blue 2, black 1 and yellow 3, for a total of 17 beads.
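The tally for the bead example can also be produced programmatically. The following is a minimal sketch using Python's standard `collections.Counter`; the variable names are chosen here for illustration and are not part of the original notes.

```python
from collections import Counter

# Bead colors exactly as listed in the example above.
beads = [
    "red", "green", "blue", "black", "red", "green", "blue", "yellow",
    "red", "red", "green", "green", "green", "yellow", "red", "green",
    "yellow",
]

# Counter maps each distinct color to the number of times it occurs.
frequency = Counter(beads)

# Print one row per color: name, tally marks, frequency.
print(f"{'Color':<8}{'Tally':<8}Frequency")
for color, count in frequency.most_common():
    print(f"{color:<8}{'|' * count:<8}{count}")
print(f"{'Total':<16}{sum(frequency.values())}")
```

Each row of the printed table corresponds to one row of an ungrouped frequency distribution table.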
4. Define an outlier.
Outliers are data points that lie far from the other data points. In other words, they are unusual values
in a dataset. Outliers are problematic for many statistical analyses because they can cause tests to
either miss significant findings or distort real results.
5. How do you use Z-scores to Detect Outliers?
Z-scores can quantify the unusualness of an observation when your data follow the normal
distribution. A Z-score is the number of standard deviations above or below the mean that each
value falls: Z = (value - mean) / standard deviation. For example, a Z-score of 2 indicates that an
observation is two standard deviations above the mean, while a Z-score of -2 signifies it is two
standard deviations below the mean. A Z-score of zero represents a value that equals the mean.
The farther a Z-score is from zero, the more unusual the value; a common convention treats
observations with a Z-score beyond about +3 or -3 as potential outliers.
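The Z-score idea can be sketched in a few lines of Python. The data below are hypothetical, made up for illustration, and the cutoff of 3 is a common convention rather than a fixed rule. The sketch uses the sample standard deviation (`statistics.stdev`); note that with very small samples no value can reach a Z-score of 3, so the cutoff only makes sense for reasonably sized data sets.

```python
from statistics import mean, stdev


def z_scores(values):
    """Return the Z-score of every value: (value - mean) / standard deviation."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1 in the denominator)
    return [(x - m) / s for x in values]


def find_outliers(values, threshold=3.0):
    """Return the values whose absolute Z-score exceeds the threshold."""
    return [x for x, z in zip(values, z_scores(values)) if abs(z) > threshold]


# Hypothetical readings: fourteen values near 50 and one extreme value.
readings = [48, 50, 52, 49, 51, 50, 47, 53, 50, 49, 51, 52, 48, 50, 95]

outliers = find_outliers(readings)
print(outliers)  # only the extreme reading, 95, is flagged
```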
6. What is Data Interpretation?
Data interpretation refers to the process of using diverse analytical methods to review data and
arrive at relevant conclusions. The interpretation of data helps researchers to categorize,
manipulate, and summarize the information in order to answer critical questions. Before any
serious data analysis can begin, the scale of measurement must be decided for the data, as this
will have a long-term impact on data interpretation.
7. Define T-Test.
A t-test is a statistical method for comparing the means of two groups, where the
sample(s) come from normally distributed populations.
8. Define F-Test.
An F-test is any statistical test in which the test statistic has an F-distribution under the null
hypothesis. It is most often used when comparing statistical models that have been fitted to a
data set, in order to identify the model that best fits the population from which the data were
sampled.
9. What is analysis of variance?
Analysis of variance (ANOVA) is a collection of statistical models and their associated estimation
procedures used to analyze the differences among means. ANOVA was developed by the
statistician Ronald Fisher.
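As an illustration of how a one-factor ANOVA compares means, the sketch below computes the sums of squares, degrees of freedom, mean squares and the F ratio for three groups. The scores are made-up illustrative data, and the function name `one_way_anova` is an assumption of this sketch, not something taken from the notes.

```python
from statistics import mean


def one_way_anova(groups):
    """Return sums of squares, degrees of freedom, mean squares and F."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)

    # Between-group variability: group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group variability: scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)

    ms_between = ss_between / df_between
    ms_within = ss_within / df_within

    return {
        "ss_between": ss_between, "ss_within": ss_within,
        "df_between": df_between, "df_within": df_within,
        "ms_between": ms_between, "ms_within": ms_within,
        "F": ms_between / ms_within,
    }


# Hypothetical scores for three treatment groups (illustrative only).
groups = [[0, 4, 2], [3, 6, 6], [6, 8, 10]]
result = one_way_anova(groups)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

For these illustrative scores the F ratio is about 7.36 with 2 and 6 degrees of freedom. The tabled critical F at the .05 level for those degrees of freedom is 5.14, so the null hypothesis of equal population means would be rejected for this made-up data set.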
PART – B
1. A library system lends books for periods of 21 days. This policy is being
reevaluated in view of a possible new loan period that could be either longer or shorter than 21
days. To aid in making this decision, book-lending records were consulted to determine the
loan periods actually used by the patrons. A random sample of 8 records revealed the
following loan periods in days: 21, 15, 12, 24, 20, 21, 13 and 16. Test the null hypothesis with a
t-test, using the .05 level of significance.
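The arithmetic for this problem can be checked with a short script. This is a sketch under stated assumptions: the null hypothesis is taken to be a population mean of 21 days (the current loan period), the test is two-tailed, and the critical value 2.365 is the tabled t for 7 degrees of freedom at the .05 level.

```python
from math import sqrt
from statistics import mean, stdev

loan_days = [21, 15, 12, 24, 20, 21, 13, 16]
mu_hyp = 21  # assumed null hypothesis: population mean equals 21 days

n = len(loan_days)
sample_mean = mean(loan_days)
sample_sd = stdev(loan_days)         # sample standard deviation (n - 1)
std_error = sample_sd / sqrt(n)      # estimated standard error of the mean
t_ratio = (sample_mean - mu_hyp) / std_error

t_critical = 2.365                   # t table, df = n - 1 = 7, two-tailed .05
reject_null = abs(t_ratio) > t_critical

print(f"mean = {sample_mean:.2f}, s = {sample_sd:.2f}, SE = {std_error:.2f}")
print(f"t = {t_ratio:.2f}, reject H0: {reject_null}")
```

Under these assumptions the t ratio is about -2.12, which does not exceed the critical value in absolute terms, so the null hypothesis would be retained.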
2. A consumers’ group randomly samples 10 “one-pound” packages of ground wheat sold by a
supermarket. Calculate the mean and the estimated standard error of the mean for this sample,
given the following weights in ounces: 16, 15, 14, 15, 14, 15, 16, 14, 14, 14.
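A quick way to verify the hand calculation is sketched below. It uses the usual definitions: the sample standard deviation with n - 1 in the denominator, and the estimated standard error of the mean as s divided by the square root of n.

```python
from math import sqrt
from statistics import mean, stdev

weights_oz = [16, 15, 14, 15, 14, 15, 16, 14, 14, 14]

n = len(weights_oz)
sample_mean = mean(weights_oz)
sample_sd = stdev(weights_oz)      # sample standard deviation (n - 1)
std_error = sample_sd / sqrt(n)    # estimated standard error of the mean

print(f"mean = {sample_mean:.2f} oz")
print(f"s = {sample_sd:.2f} oz")
print(f"estimated standard error = {std_error:.2f} oz")
```

The sample mean works out to 14.7 ounces and the estimated standard error to about 0.26 ounces.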
3. Illustrate one-factor ANOVA in detail with an example.
4. Estimate the calculations for the t test for the gas mileage investigation. Showcase the hypothesis
analysis and the t ratio calculation with three panels, along with the confidence interval.
5. Estimate the calculations for the t test using two independent samples for the EPO experiment.
Showcase the hypothesis analysis, sampling distribution, t ratio calculation with three
panels, and p value estimation, along with the confidence interval.
6. State the use of counterbalancing and explain the EPO experiment with repeated measures.
Give a detailed summary table of t tests for population means for one sample, two
independent samples and two related samples.
7. Suggest the hypothesis test summary for the t test for a population correlation coefficient
for the case study on Greeting Card Exchange.
8. Suggest the hypothesis test summary using the One-Factor F Test for the Sleep Deprivation
Experiment, and also give the variance estimates, mean squares and sums of squares with their
degrees of freedom.
9. Blood pressure readings of 8 patients were recorded before and after. Before: 180, 200, 230,
240, 170, 190, 200 and 165. After: 140, 145, 150, 155, 120, 130, 140 and 130. Determine whether there is a
significant difference between the BP readings before and after by applying a two-sample t-test.
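Because each patient contributes both a before and an after reading, one reasonable reading of this problem treats the data as two related samples. The sketch below makes that assumption, and further assumes the two lists are given in the same patient order; neither point is stated in the problem. If two independent samples are intended instead, the standard error would be built from the pooled variance of the two groups rather than from the difference scores.

```python
from math import sqrt
from statistics import mean, stdev

before = [180, 200, 230, 240, 170, 190, 200, 165]
after = [140, 145, 150, 155, 120, 130, 140, 130]

# Difference score for each patient (assumes matching order in both lists).
differences = [b - a for b, a in zip(before, after)]

n = len(differences)
mean_diff = mean(differences)
sd_diff = stdev(differences)       # sample standard deviation of differences
std_error = sd_diff / sqrt(n)
t_ratio = mean_diff / std_error    # null hypothesis: mean difference is 0

t_critical = 2.365                 # t table, df = n - 1 = 7, two-tailed .05
reject_null = abs(t_ratio) > t_critical

print(f"mean difference = {mean_diff:.2f}")
print(f"t = {t_ratio:.2f}, reject H0: {reject_null}")
```

Under these assumptions the mean drop is about 58.1 and the t ratio is about 9.39, well beyond the critical value, so the difference between the before and after readings would be judged significant at the .05 level.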
10. The marks of students are 10.5, 9, 7, 12, 8.5, 7.5, 6.5, 8, 11 and 9.5. The population mean score is 12
and the standard deviation is 1.80. Does the mean of the students' marks differ significantly from the
population mean?
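A short check of this problem is sketched below. Computing the sample standard deviation of the ten marks gives about 1.80, which matches the figure quoted in the problem, so the sketch treats 1.80 as the sample standard deviation and uses a one-sample t test. The two-tailed .05 level is an assumption, since the problem does not state a significance level; 2.262 is the tabled critical t for 9 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(scores, mu_hyp):
    """Return (sample mean, sample sd, t ratio) for a one-sample t test."""
    n = len(scores)
    m = mean(scores)
    s = stdev(scores)                  # sample standard deviation (n - 1)
    t = (m - mu_hyp) / (s / sqrt(n))
    return m, s, t


marks = [10.5, 9, 7, 12, 8.5, 7.5, 6.5, 8, 11, 9.5]
sample_mean, sample_sd, t_ratio = one_sample_t(marks, mu_hyp=12)

t_critical = 2.262                     # t table, df = 9, two-tailed .05 (assumed level)
reject_null = abs(t_ratio) > t_critical

print(f"mean = {sample_mean:.2f}, s = {sample_sd:.2f}")
print(f"t = {t_ratio:.2f}, reject H0: {reject_null}")
```

With these assumptions the sample mean is 8.95 and the t ratio is about -5.35, so the students' mean would be judged significantly different from the population mean of 12.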