II CSE CS3352 FDS QB Unit2

Uploaded by

ucebittrichy2020

4931_Grace College of Engineering, Thoothukudi

CS3352-FOUNDATIONS OF DATA SCIENCE


UNIT II
PART-A
1. What are the types of data?
The precise form of a statistical analysis often depends on whether data are
qualitative, ranked, or quantitative.
Qualitative data consist of words (Yes or No), letters (Y or N), or numerical
codes (0 or 1) that represent a class or category. Ranked data consist of
numbers (1st, 2nd, . . . 40th place) that represent relative standing within a
group. Quantitative data consist of numbers (weights of 238, 170, . . . 185 lbs)
that represent an amount or a count.

2. Define Variable.
Variable: A characteristic or property that can take on different values.

3. Define Constant.
Constant: A characteristic or property that can take on only one value.

4. What is a discrete variable?
Discrete variable: A variable that consists of isolated numbers separated by gaps.

5. What is a continuous variable?
Continuous variable: A variable that consists of numbers whose values, at
least in theory, have no restrictions.

6. What are an experiment and an independent variable?
Experiment: A study in which the investigator decides who receives the special
treatment.
Independent variable: The treatment manipulated by the investigator in an
experiment.

7. What is an observational study?
Observational study: A study that focuses on detecting relationships between
variables not manipulated by the investigator.

8. Define Confounding variable.
Confounding variable: An uncontrolled variable that compromises the
interpretation of a study.

9. What are frequency distributions for ungrouped data and for grouped data?
Frequency distribution for ungrouped data: A frequency distribution produced
whenever observations are sorted into classes of single values.
Frequency distribution for grouped data: A frequency distribution produced
whenever observations are sorted into classes of more than one value.
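
The two kinds of frequency distribution can be sketched in Python as follows. The scores are made up for illustration, and the helper name grouped_frequencies and the class width of 2 are assumptions, not part of the course material.

```python
from collections import Counter

# Made-up scores, used only for illustration.
scores = [1, 3, 2, 3, 7, 3, 2, 9, 1, 3]

# Ungrouped data: every class is a single value.
ungrouped = Counter(scores)


def grouped_frequencies(values, width):
    """Count how many values fall into each class of the given width."""
    counts = Counter()
    for v in values:
        lower = (v // width) * width              # lower limit of the class holding v
        counts[(lower, lower + width - 1)] += 1   # class is (lower limit, upper limit)
    return dict(sorted(counts.items()))


# Grouped data: every class covers more than one value (here 0-1, 2-3, ...).
grouped = grouped_frequencies(scores, 2)

print(sorted(ungrouped.items()))  # [(1, 2), (2, 2), (3, 4), (7, 1), (9, 1)]
print(grouped)                    # {(0, 1): 2, (2, 3): 6, (6, 7): 1, (8, 9): 1}
```

Classes with a frequency of zero (such as 4-5 here) are simply absent from the result; a printed table would normally list them with a frequency of 0.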

10. What is an outlier?
Outlier: A very extreme score.

11. What is relative frequency?
Relative frequency distribution: A frequency distribution showing the
frequency of each class as a fraction of the total frequency for the entire distribution.
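
A minimal sketch of a relative frequency distribution, assuming made-up scores and a hypothetical helper named relative_frequencies. Each class frequency is divided by the total frequency, so the fractions add up to 1.

```python
from collections import Counter
from fractions import Fraction

# Made-up scores, used only for illustration.
scores = [1, 3, 2, 3, 7, 3, 2, 9, 1, 3]


def relative_frequencies(values):
    """Return each class frequency as a fraction of the total frequency."""
    counts = Counter(values)
    total = len(values)
    return {value: Fraction(count, total) for value, count in sorted(counts.items())}


rel = relative_frequencies(scores)
print(rel[3])             # 2/5, because 4 of the 10 scores equal 3
print(sum(rel.values()))  # 1
```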

12. What is Cumulative Frequency?
Cumulative frequency distribution: A frequency distribution showing the
total number of observations in each class and all lower-ranked classes.
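
A cumulative frequency distribution can be sketched as a running total over the classes, ordered from lowest to highest. The scores and the helper name cumulative_frequencies are assumptions made for illustration.

```python
from collections import Counter
from itertools import accumulate

# Made-up scores, used only for illustration.
scores = [1, 3, 2, 3, 7, 3, 2, 9, 1, 3]


def cumulative_frequencies(values):
    """Return, for each class, the count in that class plus all lower-ranked classes."""
    counts = Counter(values)
    classes = sorted(counts)                          # lowest class first
    running = accumulate(counts[c] for c in classes)  # running total of frequencies
    return dict(zip(classes, running))


cum = cumulative_frequencies(scores)
print(cum)  # {1: 2, 2: 4, 3: 8, 7: 9, 9: 10}
```

The last cumulative frequency always equals the total number of observations, which gives a quick check on the arithmetic.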

13. Define Histogram.
Histogram: A bar-type graph for quantitative data. The common boundaries
between adjacent bars emphasize the continuity of the data, as with continuous
variables.

14. What is a frequency polygon?
Frequency polygon: A line graph for quantitative data that also emphasizes the
continuity of continuous variables.

15. What are the two types of skewed distribution?
Positively skewed distribution: A distribution that includes a few extreme
observations in the positive direction (to the right of the majority of observations).
Negatively skewed distribution: A distribution that includes a few extreme
observations in the negative direction (to the left of the majority of observations).

16. What are the mean, mode, and median?
Mean: The sum of all scores divided by the number of scores.
Mode: The value of the most frequent score.
Median: The middle value when observations are ordered from least to most.
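
The three measures named in question 16 can be checked with Python's standard statistics module. The scores are made up for illustration, and the mean is taken here as the sum of the scores divided by their number.

```python
import statistics

# Made-up scores, used only for illustration.
scores = [2, 3, 3, 5, 7]

mode = statistics.mode(scores)      # most frequent score
median = statistics.median(scores)  # middle value of the ordered scores
mean = statistics.mean(scores)      # sum of scores divided by their number

print(mode, median, mean)  # 3 3 4
```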

17. Write the procedure to find the median.
1. Order scores from least to most.
2. Find the middle position by adding one to the total number of scores and dividing
by 2.
3. If the middle position is a whole number, use this number to count into the set
of ordered scores.
4. The value of the median equals the value of the score located at the middle position.
5. If the middle position is not a whole number, use the two nearest whole numbers
to count into the set of ordered scores.
6. The value of the median then equals the value midway between those of the two
middlemost scores; to find the midway value, add the two given values and divide by
2.
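
The steps above can be sketched as a short Python function. The name median_by_position and the sample scores are assumptions made for illustration; positions are counted from 1, as in the procedure, while Python lists are indexed from 0.

```python
def median_by_position(scores):
    """Find the median by locating the middle position, as in the procedure above."""
    if not scores:
        raise ValueError("at least one score is required")
    ordered = sorted(scores)             # step 1: order scores from least to most
    middle = (len(ordered) + 1) / 2      # step 2: middle position, counted from 1
    if middle.is_integer():              # steps 3-4: a whole-number position
        return ordered[int(middle) - 1]
    lower = ordered[int(middle) - 1]     # steps 5-6: the two middlemost scores
    upper = ordered[int(middle)]
    return (lower + upper) / 2           # value midway between them


print(median_by_position([7, 1, 3, 5, 9]))  # 5 (position 3 of 1, 3, 5, 7, 9)
print(median_by_position([4, 1, 3, 2]))     # 2.5 (midway between 2 and 3)
```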

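18. What are the population mean and the sample mean?
Population: A complete set of scores.
Sample: A subset of scores.
Population mean (μ): The mean of all scores in the population.
Sample mean (X̄): The mean of the scores in a sample.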

19. What is Standard Deviation?
Standard deviation: A rough measure of the average (or standard) amount by
which scores deviate on either side of their mean.

20. Write the procedure to find the population standard deviation.
1. Assign a value to N representing the number of X scores.
2. Sum all X scores.
3. Obtain the mean of these scores.
4. Subtract the mean from each X score to obtain a deviation score.
5. Square each deviation score.
6. Sum all squared deviation scores to obtain the sum of squares, SS.
7. Substitute numbers into the formula σ² = SS/N to obtain the population variance, σ².
8. Take the square root of σ² to obtain the population standard deviation, σ.
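
A minimal Python sketch of this procedure, assuming the population variance is the sum of squares divided by N. The function name population_std and the scores are made up for illustration.

```python
import math


def population_std(scores):
    """Population standard deviation, following the procedure step by step."""
    n = len(scores)                               # N, the number of X scores
    if n == 0:
        raise ValueError("at least one score is required")
    total = sum(scores)                           # sum of all X scores
    mean = total / n                              # mean of the scores
    deviations = [x - mean for x in scores]       # deviation of each score from the mean
    squared = [d ** 2 for d in deviations]        # squared deviation scores
    sum_of_squares = sum(squared)                 # sum of squares, SS
    variance = sum_of_squares / n                 # population variance, SS / N
    return math.sqrt(variance)                    # population standard deviation


# Made-up scores: mean is 5, SS is 32, variance is 4.
print(population_std([2, 4, 4, 4, 5, 5, 7, 9]))  # 2.0
```

The standard library function statistics.pstdev computes the same quantity and can serve as a cross-check.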
PART-B

1. (a) Construct a frequency distribution for the number of different residences occupied
by graduating seniors during their college career, namely 1, 4, 2, 3, 3, 1, 6, 7, 4, 3, 3, 9,
2, 4, 2, 2, 3, 2, 3, 4, 4, 2, 3, 3, 5.
(b) What is the shape of this distribution?

2. In some racing events, downhill skiers receive the average of their times for three
trials.
Would you prefer the average time to be the mean or the median if usually you have
(a) one very poor time and two average times?
(b) one very good time and two average times?
(c) two good times and one average time?
(d) three different times, spaced at about equal intervals?

3. During their first swim through a water maze, 15 laboratory rats made the following
number of errors (blind alleyway entrances): 2, 17, 5, 3, 28, 7, 5, 8, 5, 6, 2, 12, 10, 4,
3.
(a) Find the mode, median, and mean for these data.
(b) Without constructing a frequency distribution or graph, would you characterize
the shape of this distribution as balanced, positively skewed, or negatively skewed?

4. Given that the mean equals 5, what must be the value of the one missing observation
from each of the following sets of observations?
(a) 1, 2, 10
(b) 2, 4, 1, 5, 7, 7
(c) 6, 9, 2, 7, 1, 2

5. Determine the values of the range and the IQR for the following sets of data.
(a) Retirement ages: 60, 63, 45, 63, 65, 70, 55, 63, 60, 65, 63
(b) Residence changes: 1, 3, 4, 1, 0, 2, 5, 8, 0, 2, 3, 4, 7, 11, 0, 2, 3, 4

6. Indicate whether each of the following statements about degrees of freedom is true or
false.
(a) Degrees of freedom refer to the number of values free to vary in the population.
(b) One degree of freedom is lost because, when expressed as a deviation from the
sample mean, the final deviation in the sample fails to supply information about
population variability.
(c) Degrees of freedom make sense only if we wish to estimate some unknown
characteristic of a population.
(d) Degrees of freedom reflect the poor quality of one or more observations.
