
4931_Grace College of Engineering,Thoothukudi.

B.Tech- Artificial Intelligence and Data Science

Anna University Regulation: 2021

AD3491 - FUNDAMENTALS OF DATA SCIENCE AND ANALYTICS

II Year/IV Semester

UNIT II DESCRIPTIVE ANALYTICS

NOTES

Prepared By,
Mrs. S. Porkodi, AP/AI&DS

AD3491_FDSA

UNIT-II
PART – A
1. What is Frequency Distribution?
A frequency distribution organizes the collected data in table form. The data could be
marks scored by students, temperatures of different towns, points scored in a volleyball match,
etc. After data collection, the data must be presented in a meaningful manner for better understanding,
so it is organized in such a way that all its features are summarized in a table.
2. List the Types of Frequency Distribution.
* Ungrouped frequency distribution: It shows the frequency of each separate data
value rather than of groups of data values.
* Grouped frequency distribution: In this type, the data is arranged and separated into groups
called class intervals. The frequency of data belonging to each class interval is noted in a
frequency distribution table.
3. State the Frequency Distribution Table.
A frequency distribution table is a chart that shows the frequency of each of the items in a data
set. Let's consider an example to understand how to make a frequency distribution table using
tally marks. A jar contains beads of different colors: red, green, blue, black, red, green, blue,
yellow, red, red, green, green, green, yellow, red, green, yellow.
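The tally for the bead example can be checked with a short Python sketch. The colors are copied from the list above; the variable names (`beads`, `freq`) are illustrative and not part of the notes.

```python
from collections import Counter

# Bead colors in the order they are listed in the example.
beads = [
    "red", "green", "blue", "black", "red", "green", "blue", "yellow",
    "red", "red", "green", "green", "green", "yellow", "red", "green", "yellow",
]

# Counter maps each distinct color to the number of times it occurs.
freq = Counter(beads)

# Print a simple frequency distribution table with tally marks.
print(f"{'Color':<8}{'Tally':<8}Frequency")
for color, count in freq.most_common():
    print(f"{color:<8}{'|' * count:<8}{count}")
```

The frequencies add up to 17, the total number of beads listed.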
4. Define an outlier.
Outliers are data points that are far from other data points. In other words, they are unusual values
in a dataset. Outliers are problematic for many statistical analyses because they can cause tests to
either miss significant findings or distort real results.
5. How do you use Z-scores to Detect Outliers?
Z-scores can quantify the unusualness of an observation when your data follow the normal
distribution. A Z-score is the number of standard deviations above or below the mean that each
value falls. For example, a Z-score of 2 indicates that an observation is two standard deviations
above the average, while a Z-score of -2 signifies it is two standard deviations below the mean. A
Z-score of zero represents a value that equals the mean.
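As a minimal sketch of this idea, the Python code below computes Z-scores and flags values whose absolute Z-score exceeds a cutoff. The data in `readings` and the cutoff of 2 are assumptions chosen for illustration (a cutoff of 3 is also common with larger samples); neither comes from the notes.

```python
import statistics

def z_scores(values):
    """Return the Z-score of each value, using the sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

# Hypothetical data: nine similar readings and one unusual value.
readings = [10, 12, 11, 13, 12, 11, 10, 12, 11, 40]
CUTOFF = 2.0  # assumed threshold, not prescribed by the notes

scores = z_scores(readings)
outliers = [v for v, z in zip(readings, scores) if abs(z) > CUTOFF]
print(outliers)  # the value 40 is flagged
```

In a small sample a single extreme value inflates the standard deviation, which limits how large its own Z-score can become; that is why a lower cutoff is assumed here.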
6. What is Data Interpretation?
Data interpretation refers to the process of using diverse analytical methods to review data and
arrive at relevant conclusions. The interpretation of data helps researchers to categorize,
manipulate, and summarize the information in order to answer critical questions. Before any
serious data analysis can begin, the scale of measurement must be decided for the data, as this
will have a long-term impact on data interpretation.
7. Define T-Test.
A t-test is a statistical method for comparing the means of two groups of normally
distributed sample(s).
8. Define F-Test.
An F-test is any statistical test in which the test statistic has an F-distribution under the null
hypothesis. It is most often used when comparing statistical models that have been fitted to a
data set, in order to identify the model that best fits the population from which the data were
sampled.
9. What is analysis of variance?
Analysis of variance is a collection of statistical models and their associated estimation
procedures used to analyze the differences among means. ANOVA was developed by the
statistician Ronald Fisher.
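To make the comparison of means concrete, the sketch below computes the F ratio for a one-factor ANOVA as the between-group mean square divided by the within-group mean square. The three groups are hypothetical data invented for illustration, and the function name `one_way_f` is an assumption, not something defined in the notes.

```python
import statistics

def one_way_f(groups):
    """Return (F ratio, df between, df within) for a one-factor ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = statistics.mean(all_values)

    # Variability of the group means around the grand mean.
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Variability of the scores around their own group mean.
    ss_within = sum((v - statistics.mean(g)) ** 2 for g in groups for v in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)

    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between / ms_within, df_between, df_within

# Hypothetical scores for three treatment groups.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_ratio, df_b, df_w = one_way_f(groups)
print(f"F({df_b}, {df_w}) = {f_ratio:.2f}")
```

The observed F would then be compared with a critical F for the same degrees of freedom, taken from an F table.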


10. Define effect size estimation.
Effect size estimates provide important information about the impact of a treatment on the
outcome of interest or on the association between variables. They also provide a
common metric to compare the direction and strength of the relationship between
variables across studies.
11. What is meant by multiple comparisons, multiplicity or multiple testing?
The multiple comparisons, multiplicity or multiple testing problem occurs when one
considers a set of statistical inferences simultaneously or infers a subset of parameters
selected based on the observed values.
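One way to see why simultaneous inferences are a problem is to compute the chance of at least one false rejection across several tests. The sketch below assumes the tests are independent and each uses the same significance level; the function name `familywise_error_rate` is illustrative and not taken from the notes.

```python
def familywise_error_rate(alpha, n_tests):
    """Probability of at least one Type I error across n independent tests."""
    return 1 - (1 - alpha) ** n_tests

# With alpha = .05 per test, the overall error rate grows with the number of tests.
for m in (1, 5, 10, 20):
    print(m, round(familywise_error_rate(0.05, m), 3))
```

Under these assumptions, ten tests at the .05 level already carry roughly a 40% chance of at least one false positive.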
12. What do you mean by two-factor factorial design?
A two-factor factorial design is an experimental design in which data is collected for all possible
combinations of the levels of the two factors of interest. If equal sample sizes are taken for each
of the possible factor combinations then the design is a balanced two-factor factorial design.
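The phrase "all possible combinations of the levels" can be illustrated by enumerating the cells of a design. The factor names, their levels and the sample size per cell below are hypothetical, chosen only to show the structure.

```python
from itertools import product

# Hypothetical factors and levels (not from the notes).
factor_a = ["low dose", "high dose"]          # factor A: 2 levels
factor_b = ["morning", "afternoon", "night"]  # factor B: 3 levels

# A factorial design collects data in every combination of levels.
cells = list(product(factor_a, factor_b))

# Assumed sample size for each cell; equal sizes make the design balanced.
n_per_cell = 4
design = {cell: n_per_cell for cell in cells}
is_balanced = len(set(design.values())) == 1

for cell, n in design.items():
    print(cell, n)
print("balanced:", is_balanced)
```

A 2 x 3 design therefore has six cells, and with four observations in each it is balanced.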
13. Define statistical test in F-test
An F-test is any statistical test in which the test statistic has an F-distribution under the null
hypothesis. It is most often used when comparing statistical models that have been fitted to a
data set, in order to identify the model that best fits the population from which the data were
sampled.
14. What is two-way analysis of variance?
The two-way analysis of variance is an extension of the one-way ANOVA that examines the
influence of two different categorical independent variables on one continuous dependent
variable.
15. What are the types of ANOVA?
There are two main types of ANOVA: one-way (or unidirectional) and two-way. There are also
variations of ANOVA. For example, MANOVA (multivariate ANOVA) differs from ANOVA
as the former tests for multiple dependent variables simultaneously while the latter assesses
only one dependent variable at a time.
16. What is the Trending Market Test?
In an up-trending market, previous resistance becomes support, while in a down-trending market,
past support becomes resistance. Once price breaks out to a new high or low, it often retraces to
test these levels before resuming in the direction of the trend. Momentum traders can use the test
of a previous swing high or swing low to enter a position at a more favorable price than if they
would have chased the initial breakout. A stop-loss order should be placed directly below the
test area to close the trade if the trend unexpectedly reverses.
17. State the term Correlation.
Correlation refers to a process for establishing the relationship between two variables. One
way to get a general idea about whether or not two variables are related is to plot them
on a "scatter plot". While there are many measures of association for variables which are
measured at the ordinal or higher level of measurement, correlation is the most commonly used
approach.
18. State in brief Correlation Coefficient.
The correlation coefficient, r, is a summary measure that describes the extent of the statistical
relationship between two interval or ratio level variables. The correlation coefficient is scaled so
that it is always between -1 and +1. When r is close to 0, there is little relationship
between the variables; the farther r is from 0, in either the positive or negative
direction, the greater the relationship between the two variables.
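A minimal sketch of the Pearson correlation coefficient is given below, computed as the sum of products of deviations divided by the square root of the product of the two sums of squares. The data lists are hypothetical and the function name `pearson_r` is an assumption for illustration.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations from the two means.
    sp_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each variable.
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp_xy / math.sqrt(ss_x * ss_y)

# Hypothetical data: one variable rises with x, the other falls with x.
hours = [1, 2, 3, 4, 5]
score_up = [2, 4, 6, 8, 10]
score_down = [10, 8, 6, 4, 2]

print(pearson_r(hours, score_up))    # perfect positive relationship, r near +1
print(pearson_r(hours, score_down))  # perfect negative relationship, r near -1
```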


19. List the Types of Correlation.
* Positive correlation: the values of the two variables move in the same direction, so that an
increase/decrease in the value of one variable is accompanied by an increase/decrease in the value of
the other variable.
* Negative correlation: the values of the two variables move in opposite directions, so that
an increase/decrease in the value of one variable is accompanied by a decrease/increase in the value of
the other variable.
* No correlation: there is no linear dependence or no relation between the two variables.

PART – B
1. A library system lends books for a period of 21 days. This policy is being
reevaluated in view of a possible new loan period that could be either longer or shorter than 21
days. To aid in making this decision, book-lending records were consulted to determine the
loan periods actually used by the patrons. A random sample of 8 records revealed the
following loan periods in days: 21, 15, 12, 24, 20, 21, 13 and 16. Test the null hypothesis with a
t-test, using the .05 level of significance.
2. A consumers’ group randomly samples 10 “one-pound” packages of ground wheat sold by a
supermarket. Calculate the mean and the estimated standard error of the mean for this sample,
given the following weights in ounces: 16, 15, 14, 15, 14, 15, 16, 14, 14, 14.
3. Explain one-factor ANOVA in detail with an example.
4. Carry out the calculations for the t test for the gas mileage investigation. Show the hypothesis
analysis and the t ratio calculation with three panels, along with the confidence interval.
5. Carry out the calculations for the t test using two independent samples for the EPO experiment.
Show the hypothesis analysis, sampling distribution, t ratio calculation with three
panels and p-value estimation, along with the confidence interval.
6. State the use of counterbalancing and explain the EPO experiment with repeated measures.
Give a detailed summary table of t tests for population means for one sample, two
independent samples and two related samples.
7. Give the hypothesis test summary for the t test for a population correlation coefficient
for the case study on Greeting Card Exchange.
8. Give the hypothesis test summary using the One-Factor F Test for the Sleep Deprivation
Experiment, along with the variance estimates, mean squares and sums of squares with degrees of
freedom.
9. The blood pressure of 8 patients was recorded before and after: Before: 180, 200, 230,
240, 170, 190, 200 and 165; After: 140, 145, 150, 155, 120, 130, 140 and 130. Determine whether there is any
significant difference between the BP readings before and after by applying a two-sample t-test.
10. The marks of a group of students are 10.5, 9, 7, 12, 8.5, 7.5, 6.5, 8, 11 and 9.5. The population mean score is 12
and the standard deviation is 1.80. Does the mean for these students differ significantly from the
population mean?
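
The numerical exercises above (1, 2, 9 and 10) can be checked with the following Python sketch, offered as one possible working and not as the official solution. It rests on several assumptions: exercise 9 is treated as a t test for two related samples because the same 8 patients are measured twice; exercise 10 treats 1.80 as the population standard deviation and therefore uses a z test; the critical values 2.365 (t, df = 7) and 1.96 (z) are the standard two-tailed table values at the .05 level. The helper name `one_sample_t` is illustrative.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """Return (mean, estimated standard error, t ratio) for a one-sample t test."""
    mean = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    return mean, se, (mean - mu0) / se

T_CRIT_DF7 = 2.365  # two-tailed critical t, alpha = .05, df = 7 (t table)
Z_CRIT = 1.96       # two-tailed critical z, alpha = .05

# Exercise 1: loan periods, null hypothesis mu = 21 days.
loans = [21, 15, 12, 24, 20, 21, 13, 16]
loan_mean, loan_se, loan_t = one_sample_t(loans, 21)
reject_loans = abs(loan_t) > T_CRIT_DF7
print(f"Ex 1: mean={loan_mean:.2f}, t={loan_t:.2f}, reject H0: {reject_loans}")

# Exercise 2: package weights in ounces; only the mean and standard error are asked for.
weights = [16, 15, 14, 15, 14, 15, 16, 14, 14, 14]
weight_mean = statistics.mean(weights)
weight_se = statistics.stdev(weights) / math.sqrt(len(weights))
print(f"Ex 2: mean={weight_mean:.2f}, standard error={weight_se:.2f}")

# Exercise 9: before/after readings, analysed through the paired differences
# (assumption: related samples, null hypothesis mean difference = 0).
before = [180, 200, 230, 240, 170, 190, 200, 165]
after = [140, 145, 150, 155, 120, 130, 140, 130]
diffs = [b - a for b, a in zip(before, after)]
bp_mean, bp_se, bp_t = one_sample_t(diffs, 0)
reject_bp = abs(bp_t) > T_CRIT_DF7
print(f"Ex 9: mean difference={bp_mean:.3f}, t={bp_t:.2f}, reject H0: {reject_bp}")

# Exercise 10: z test (assumption: 1.80 is the known population standard deviation).
marks = [10.5, 9, 7, 12, 8.5, 7.5, 6.5, 8, 11, 9.5]
marks_mean = statistics.mean(marks)
z = (marks_mean - 12) / (1.80 / math.sqrt(len(marks)))
reject_marks = abs(z) > Z_CRIT
print(f"Ex 10: mean={marks_mean:.2f}, z={z:.2f}, reject H0: {reject_marks}")
```

Under these assumptions the null hypothesis is retained in exercise 1 and rejected in exercises 9 and 10. A different reading of exercise 9 (two independent samples) or of exercise 10 (sample standard deviation) would change the test statistic.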
