CC4 Statistics and Computer Application
Statistics is a branch of applied mathematics that involves the collection, description, analysis, and
inference of conclusions from quantitative data. The mathematical theories behind statistics rely heavily
on differential and integral calculus, linear algebra, and probability theory. Statisticians, people who do
statistics, are particularly concerned with determining how to draw reliable conclusions about large
groups and general events from the behavior and other observable characteristics of small samples. These
small samples represent a portion of the large group or a limited number of instances of a general
phenomenon.
The two major areas of statistics are known as descriptive statistics, which describes the properties of
sample and population data, and inferential statistics, which uses those properties to test hypotheses and
draw conclusions.
Some common statistical tools and procedures include the following:
Descriptive:
Mean (average)
Variance
Skewness
Kurtosis
Q2 Limitations of Statistics
Despite the importance of statistics in every field of life, it has some limitations. The main limitations
of statistics are:
a) Statistics does not deal with individuals: By definition, statistics deals with aggregates of facts, that
is, only with mass phenomena. A single item or an isolated figure cannot be regarded as statistics. This
is a serious limitation of statistics. For example, the fact that one student scored 75 marks in English
does not constitute statistics, but the fact that the average mark of a group of students in English is
75 does.
b) Statistics does not study qualitative phenomena: The science of statistics studies only the
quantitative aspect of a problem. Statistics cannot be used directly to study qualitative
phenomena such as honesty, intelligence, beauty, or poverty. However, some statistical techniques
can be used to study such phenomena indirectly by expressing them as numbers. For
example, the intelligence of a group of boys can be studied with the help of the marks they obtain in an
examination.
c) Statistical laws are not exact: 100% accuracy is rare in statistical work because statistical laws are
true only on average. They are not exact in the way the laws of physics and mathematics are. For
example, the probability of getting a head in a single toss of a coin is ½. This does not imply that exactly
3 heads will be obtained if a coin is tossed 6 times. Any number of heads, from none to all six, may be
obtained.
d) Statistics is only a means: Statistical methods provide only one way of studying a problem. There are
other methods as well, and these should be used to supplement the conclusions derived with the
help of statistics.
e) Statistics is liable to be misused: The most important limitation of statistics is that it must be handled
by experts. Statistical methods are dangerous tools in the hands of the inexpert. Since statistics
deals with masses of figures, it can easily be manipulated by inexperienced and unskilled persons.
Statistical methods, if properly used, may yield useful results; if misused by inexpert,
unskilled persons, they may lead to fallacious conclusions. We have the following example of a
result concluded by an inexpert and unskilled person.
Q3 What is statistical significance?
"Statistical significance helps quantify whether a result is likely due to chance or to some factor of
interest," says Redman. When a finding is significant, it simply means you can feel confident that it's
real, not that you just got lucky (or unlucky) in choosing the sample.
When you run an experiment, conduct a survey, take a poll, or analyze a set of data, you're taking a
sample of some population of interest, not looking at every single data point that you possibly can.
Consider the example of a marketing campaign. You've come up with a new concept and you want to see
if it works better than your current one. You can't show it to every single target customer.
When you run the results, you find that those who saw the new campaign spent $10.17 on average, more
than the $8.41 those who saw the old one spent. This $1.76 might seem like a big, and perhaps
important, difference. But in reality you may have been unlucky, drawing a sample of people who do
not represent the larger population; in fact, maybe there was no difference between the two campaigns
and their influence on consumers' purchasing behaviors. This is called a sampling error, something you
must contend with in any test that does not include the entire population of interest.
Redman notes that there are two main contributors to sampling error: the size of the sample and the
variation in the underlying population. Sample size may be intuitive enough. Think about flipping a coin
five times versus flipping it 500 times. The more times you flip, the less likely you'll end up with a great
majority of heads. The same is true of statistical significance: with bigger sample sizes, you're less likely
to get results that reflect randomness. All else being equal, you'll feel more comfortable in the accuracy
of the campaigns' $1.76 difference if you showed the new one to 1,000 people rather than just 25. Of
course, showing the campaign to more people costs more, so you have to balance the need for a larger
sample
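The coin-flip comparison above can be checked exactly. The following is a minimal sketch, assuming a fair coin and taking "a great majority of heads" to mean at least 80% heads; that threshold and the function name are illustrative assumptions, not part of the original text.

```python
from math import comb


def prob_at_least(n_flips, min_heads):
    """Exact probability of getting at least min_heads heads in n_flips fair coin flips."""
    favourable = sum(comb(n_flips, k) for k in range(min_heads, n_flips + 1))
    return favourable / 2 ** n_flips


# At least 80% heads: 4 of 5 flips versus 400 of 500 flips (threshold is an assumption).
small_sample = prob_at_least(5, 4)
large_sample = prob_at_least(500, 400)

print(f"5 flips:   {small_sample:.4f}")
print(f"500 flips: {large_sample:.3e}")
```

With only five flips a lopsided result is quite common, while with 500 flips it is practically impossible, which is why larger samples are less likely to reflect pure randomness.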
Q4 Diagrammatic Representations: Basics, Types, Examples
Diagrammatic Representations: Diagrams are an essential tool for illustrating statistical data.
One of the best ways to represent numerical data obtained in statistics is through diagrammatic
representations. As the saying goes, "A picture is worth a thousand words." Compared with
tabular or textual representations, a diagrammatic display of data gives an immediate
understanding of the situation the data describes.
Diagrammatic Representation of Data: Meaning
Representation of any numerical data by using diagrams is known as diagrammatic representation.
Diagrammatic representations make numerical data simpler and easier to understand than the
tabular or textual form of the same data. They translate the complex ideas contained in the
numerical data into a concrete, simple, and understandable form.
Basics of Diagrammatic Presentations
Diagrammatic representation of data conveys a lot of information about numerical data. It is a more
attractive and easier way of representing numerical data in statistics. Diagrammatic representations
act as a visual aid for readers. They use geometrical figures as
diagrams to improve the data representation, such as cartograms, pictographs, pie charts, bar
diagrams, etc.
1. In cartograms, we use maps to represent the geographical location of certain things.
2. Bar graphs are drawn with rectangular bars. The height of each bar gives the value or
frequency of the variable. All rectangular bars should have equal width.
3. In the pie charts, a circle is divided into parts, such that each part shows the proportion of
various data.
4. In a line representation of data, we use the line to connect the various portions or parts of the
plotted data on the graph.
Advantages of Diagrammatic Presentations
The various advantages of diagrammatic representations are listed below:
1. Diagrammatic representations of data are more attractive and impressive
than the tabular or textual form of the data.
2. Diagrammatic representations of data are easy to remember as they use geometrical
figures as diagrams.
3. The diagrammatic representation of data is easy to understand.
4. Diagrammatic representations translate the complex ideas contained in the
numerical data into a concrete, simple, and understandable form.
Types of Diagrammatic Representations
Diagrammatic representations use geometrical figures as diagrams to improve the data
representation, such as cartograms, pictographs, pie charts, bar diagrams, etc.
1. Line Diagrams: In a line diagram, we plot two variables on the horizontal
and vertical axes and use a line to connect the plotted points.
2. Bar Diagrams: In a bar diagram, the data is represented by
rectangular bars. The height of each bar gives the value or frequency of the variable. All rectangular
bars should have equal width. This is one of the most widely used tools for comparing data.
3. Histograms: Histograms are similar to bar diagrams in that they use rectangular bars to represent the
data, but the bars are drawn without any gaps between them.
4. Pie Diagrams: A pie diagram is a diagrammatic representation of data using a circle. In
a pie diagram, the circle is divided into parts such that each part shows the proportion of one
category of the data.
5. Pictographs: A pictograph shows the given data graphically by using images or
symbols. Each symbol or image represents the frequency of an
object in the given set of data.
Q5 What is Dispersion in Statistics?
Dispersion is the state of getting dispersed or spread. Statistical dispersion means the extent to which
numerical data is likely to vary about an average value. In other words, dispersion helps to understand the
distribution of the data.
Measures of Dispersion
In statistics, the measures of dispersion help to interpret the variability of data, i.e. to know how
homogeneous or heterogeneous the data is. In simple terms, they show how squeezed or scattered the
variable is.
Types of Measures of Dispersion
There are two main types of dispersion methods in statistics which are:
Absolute Measure of Dispersion
Relative Measure of Dispersion
Absolute Measure of Dispersion
An absolute measure of dispersion is expressed in the same unit as the original data set. The absolute
dispersion method expresses the variation in terms of the average of the deviations of observations, such
as the standard deviation or the mean deviation. It includes range, standard deviation, quartile deviation, etc.
The types of absolute measures of dispersion are:
1. Range: It is simply the difference between the maximum value and the minimum value in a
data set. Example: 1, 3, 5, 6, 7 => Range = 7 − 1 = 6
2. Variance: Subtract the mean from each value in the set, square each difference, add the squares, and
finally divide the sum by the total number of values in the data set to get the variance. Variance (σ²) =
∑(X − μ)²/N
3. Standard Deviation: The square root of the variance is known as the standard deviation, i.e. S.D. =
√(σ²) = σ.
4. Quartiles and Quartile Deviation: The quartiles are values that divide a list of numbers into
quarters. The quartile deviation is half of the distance between the third and the first quartile.
5. Mean and Mean Deviation: The average of numbers is known as the mean and the arithmetic
mean of the absolute deviations of the observations from a measure of central tendency is known
as the mean deviation (also called mean absolute deviation).
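As a quick check of the range, variance, and standard deviation definitions, here is a minimal sketch that applies them to the data set 1, 3, 5, 6, 7 used in the range example. The variable names are illustrative, and the variance is the population form (dividing by N) as defined in the list.

```python
import math

data = [1, 3, 5, 6, 7]

# Range: maximum value minus minimum value.
data_range = max(data) - min(data)

# Variance: mean of the squared deviations from the mean (population form, divide by N).
mean = sum(data) / len(data)
variance = sum((x - mean) ** 2 for x in data) / len(data)

# Standard deviation: square root of the variance.
std_dev = math.sqrt(variance)

print("Range:", data_range)
print("Mean:", mean)
print("Variance:", round(variance, 2))
print("Standard deviation:", round(std_dev, 3))
```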
Q6 What is Mean?
Mean is the most commonly used measure of central tendency. It actually represents the average of the
given collection of data. It is applicable for both continuous and discrete data.
It is equal to the sum of all the values in the collection of data divided by the total number of values.
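Suppose we have n values in a set of data, namely $x_1, x_2, \ldots, x_n$; then the mean of the data is given by:
$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n}$$
It can also be denoted as:
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$
For grouped data, we can calculate the mean using three different methods.
Direct method:
$$\bar{x} = \frac{\sum f_i x_i}{\sum f_i}$$
Here,
xi = Class mark (midpoint) of the ith class
fi = Frequency of the ith class
∑fi = Sum of all frequencies
Assumed mean method:
$$\bar{x} = a + \frac{\sum f_i d_i}{\sum f_i}$$
Here,
a = Assumed mean
di = xi – a
∑fi = Sum of all frequencies
Step deviation method:
$$\bar{x} = a + h\left(\frac{\sum f_i u_i}{\sum f_i}\right)$$
Here,
a = Assumed mean
ui = (xi – a)/h
h = Class size
∑fi = Sum of all frequencies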
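The three grouped-data methods (direct, assumed mean, step deviation) always give the same mean. A minimal sketch follows, assuming a hypothetical frequency table with class marks 5, 15, 25, 35, 45, an assumed mean a = 25, and a class size h = 10; none of these figures come from the original text.

```python
# Hypothetical grouped data: class marks (midpoints) and their frequencies.
class_marks = [5, 15, 25, 35, 45]
frequencies = [3, 5, 7, 4, 1]
total_frequency = sum(frequencies)

# Direct method: sum(fi * xi) / sum(fi)
direct_mean = sum(f * x for f, x in zip(frequencies, class_marks)) / total_frequency

# Assumed mean method: a + sum(fi * di) / sum(fi), with di = xi - a
a = 25  # assumed mean (any convenient value works)
assumed_mean_result = a + sum(f * (x - a) for f, x in zip(frequencies, class_marks)) / total_frequency

# Step deviation method: a + h * sum(fi * ui) / sum(fi), with ui = (xi - a) / h
h = 10  # class size
step_deviation_result = a + h * sum(f * ((x - a) / h) for f, x in zip(frequencies, class_marks)) / total_frequency

print(direct_mean, assumed_mean_result, step_deviation_result)
```

The assumed mean and step deviation methods only simplify the arithmetic when working by hand; they do not change the answer.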
Q7 What is Median?
Generally, the median represents the middle value of the given set of data when arranged in a particular order.
Median: Given that the data collection is arranged in ascending or descending order, the following
method is applied:
If the number of values or observations n in the given data is odd, then the median is given by
the $\left(\frac{n+1}{2}\right)$th observation.
If the number of values or observations n in the given data set is even, then the median is given by
the average of the $\left(\frac{n}{2}\right)$th and $\left(\frac{n}{2}+1\right)$th observations.
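The odd and even cases can be sketched as a small function. This is an illustrative implementation, not a prescribed one; the function name is an assumption, and the sample lists are hypothetical.

```python
def median(values):
    """Median of a list: middle item if n is odd, mean of the two middle items if n is even."""
    ordered = sorted(values)          # the data must be arranged in order first
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    middle = n // 2
    if n % 2 == 1:
        # Odd n: the ((n + 1) / 2)-th observation (index `middle` when counting from 0).
        return ordered[middle]
    # Even n: average of the (n / 2)-th and (n / 2 + 1)-th observations.
    return (ordered[middle - 1] + ordered[middle]) / 2


print(median([7, 1, 5]))       # odd number of values
print(median([7, 1, 5, 3]))    # even number of values
```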
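Median for grouped data can be calculated using the formula,
$$\text{Median} = l + \left(\frac{\frac{n}{2} - cf}{f}\right) \times h$$
Here, l = lower limit of the median class, n = total number of observations, cf = cumulative frequency
of the class preceding the median class, f = frequency of the median class, and h = class size.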
Q8 What is Mode?
The most frequent number occurring in the data set is known as the mode.
Consider the following data set which represents the marks obtained by different students in a subject.
| Name | Anmol | Kushagra | Garima | Ashwini | Geetika | Shakshi |
|---|---|---|---|---|---|---|
| Marks Obtained (out of 100) | 73 | 80 | 73 | 70 | 73 | 65 |
The observation with the maximum frequency is 73 (as three students scored 73 marks), so the mode of the given
data collection is 73.
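The frequency count behind this result can be sketched with Python's standard `collections.Counter`. The dictionary below simply restates the marks from the table; the variable names are illustrative.

```python
from collections import Counter

# Marks obtained (out of 100), taken from the table above.
marks = {
    "Anmol": 73,
    "Kushagra": 80,
    "Garima": 73,
    "Ashwini": 70,
    "Geetika": 73,
    "Shakshi": 65,
}

# Count how often each mark occurs, then keep the mark(s) with the highest count.
counts = Counter(marks.values())
highest_frequency = max(counts.values())
modes = sorted(mark for mark, count in counts.items() if count == highest_frequency)

print("Frequencies:", dict(counts))
print("Mode(s):", modes)
```

Returning a list allows for data sets with more than one mode.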
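We can calculate the mode for grouped data using the formula below:
$$\text{Mode} = l + \left(\frac{f_1 - f_0}{2f_1 - f_0 - f_2}\right) \times h$$
Here, l = lower limit of the modal class, h = class size, f1 = frequency of the modal class, f0 = frequency
of the class preceding the modal class, and f2 = frequency of the class succeeding the modal class.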
Q9 Quartile Deviation Definition
The Quartile Deviation can be defined mathematically as half of the difference between the upper and
lower quartile. Here, quartile deviation can be represented as QD; Q3 denotes the upper quartile and
Q1 indicates the lower quartile.
Quartile Deviation is also known as the Semi Interquartile range.
Quartile Deviation Formula
Suppose Q1 is the lower quartile, Q2 is the median, and Q3 is the upper quartile for the given data set, then
its quartile deviation can be calculated using the following formula.
QD = (Q3 – Q1)/2
In the next section, you will learn how to calculate these quartiles for both ungrouped and grouped data
separately.
Quartile Deviation for Ungrouped Data
For ungrouped data, quartiles can be obtained using the following formulas,
Q1 = [(n+1)/4]th item
Q2 = [(n+1)/2]th item
Q3 = [3(n+1)/4]th item
Where n represents the total number of observations in the given data set.
Also, Q2 is the median of the given data set, Q1 is the median of the lower half of the data set and Q3 is
the median of the upper half of the data set.
Before estimating the quartiles, we have to arrange the given data values in ascending order. If the value
of n is even, we can follow a procedure similar to that used for finding the median.
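The item-position formulas for ungrouped data can be sketched as follows. The data values are hypothetical, and interpolating linearly between neighbouring items when the position is fractional is an assumption made here; the text only states the position rule.

```python
def quartile(values, r):
    """r-th quartile (r = 1, 2, 3) taken as the [r(n + 1)/4]-th item of the sorted data."""
    ordered = sorted(values)                 # arrange in ascending order first
    n = len(ordered)
    position = r * (n + 1) / 4               # 1-based position, possibly fractional
    lower = int(position)                    # whole part of the position
    fraction = position - lower
    if lower < 1:
        return ordered[0]
    if lower >= n:
        return ordered[-1]
    # Assumption: interpolate linearly between the two neighbouring items.
    return ordered[lower - 1] + fraction * (ordered[lower] - ordered[lower - 1])


def quartile_deviation(values):
    """QD = (Q3 - Q1) / 2, the semi-interquartile range."""
    return (quartile(values, 3) - quartile(values, 1)) / 2


data = [2, 4, 6, 8, 10, 12, 14]              # hypothetical data, n = 7
print("Q1:", quartile(data, 1))
print("Q2:", quartile(data, 2))
print("Q3:", quartile(data, 3))
print("QD:", quartile_deviation(data))
```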
Quartile Deviation for Grouped Data
For grouped data, we can find the quartiles using the formula,
$$Q_r = l_1 + \frac{r\left(\frac{N}{4}\right) - c}{f}\,(l_2 - l_1)$$
Here,
Qr = the rth quartile
l1 = the lower limit of the quartile class
l2 = the upper limit of the quartile class
f = the frequency of the quartile class
c = the cumulative frequency of the class preceding the quartile class
N = Number of observations in the given data set
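A minimal sketch of the grouped-data calculation, using the symbols l1, l2, f, c, and N defined above. It assumes the standard interpolation form Qr = l1 + ((rN/4 − c)/f)(l2 − l1); the frequency table and function name are hypothetical.

```python
def grouped_quartile(class_limits, frequencies, r):
    """r-th quartile of grouped data via Qr = l1 + ((r * N / 4 - c) / f) * (l2 - l1)."""
    total = sum(frequencies)                 # N
    target = r * total / 4                   # r(N/4)
    cumulative = 0                           # c: cumulative frequency before the current class
    for (l1, l2), f in zip(class_limits, frequencies):
        if cumulative + f >= target:         # first class whose cumulative frequency reaches r(N/4)
            return l1 + (target - cumulative) / f * (l2 - l1)
        cumulative += f
    raise ValueError("quartile class not found")


# Hypothetical frequency table: five classes of width 10, N = 20.
class_limits = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50)]
frequencies = [3, 5, 7, 4, 1]

q1 = grouped_quartile(class_limits, frequencies, 1)
q3 = grouped_quartile(class_limits, frequencies, 3)
print("Q1:", q1)
print("Q3:", q3)
print("QD:", (q3 - q1) / 2)
```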
Q10 What are the differences between Mean, Median, and Mode?
These three terms are related to each other; the relationship between mean, median and mode is
called the empirical relationship. Below are some of the most important differences between
the mean, median and mode.
| Sl. No. | Mean | Median | Mode |
|---|---|---|---|
| 1 | The average taken for a set of numbers is called the mean. | The middle value in the data set is called the median. | The number that occurs most often in a given list of numbers is called the mode. |
| 2 | Add all of the numbers together and divide this sum by the total number of numbers. | Place all the given numbers in ascending order. | It shows the frequency of occurrence. |
| 3 | The result is the mean or average score. | The next step is to find the middle number on the list. It is called the median. | We can have more than one mode or no mode at all. |
| 4 | Example: To find the average of the four numbers 2, 4, 6, 8, we first add the numbers: 2 + 4 + 6 + 8 = 20. We then divide the sum by the total number of numbers, i.e. 4. 20/4 = 5 is the average or mean. | Example: If the given list is 4, 2, 8, 10, 19, arrange the numbers in ascending order, i.e. 2, 4, 8, 10, 19. As there are 5 numbers in total, the middle number 8 is the median here. | Example: In the given series 3, 3, 5, 6, 7, 7, 8, 1, 1, 1, 4, 5, 6, find the frequency of each number. For number 3 it's 2, for 5 it's 2, for 6 it's 2, for 7 it's 2, for 8 it's 1, for 1 it's 3, for 4 it's 1. The number with the highest frequency is the mode, which here is 1. |
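The three worked examples in the comparison can be reproduced with Python's standard `statistics` module. This is only a cross-check of the hand calculations; `multimode` requires Python 3.8 or later and returns every value that shares the highest frequency.

```python
from statistics import mean, median, multimode

mean_example = [2, 4, 6, 8]
median_example = [4, 2, 8, 10, 19]
mode_example = [3, 3, 5, 6, 7, 7, 8, 1, 1, 1, 4, 5, 6]

print("Mean:", mean(mean_example))
print("Median:", median(median_example))      # sorts the data internally
print("Mode(s):", multimode(mode_example))    # a list, since a data set can have several modes
```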
Q11 Mean Deviation Definition
The mean deviation is defined as a statistical measure that is used to calculate the average deviation from the mean
value of the given data set. The mean deviation of the data values can be easily calculated using the below
procedure.
Step 1: Find the mean value for the given data values
Step 2: Now, subtract the mean value from each of the data values given (Note: Ignore the minus symbol)
Step 3: Now, find the mean of those values obtained in step 2.
Mean Deviation Formula
The formula to calculate the mean deviation for the given data set is given below.
Mean Deviation = [Σ |X – µ|]/N
Here,
Σ represents the addition of values
X represents each value in the data set
µ represents the mean of the data set
N represents the number of data values
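The three steps and the formula can be followed line by line in a short sketch. The data values are hypothetical and chosen only to keep the arithmetic easy to verify by hand.

```python
data = [2, 4, 6, 8, 10]   # hypothetical data values

# Step 1: find the mean of the data values.
mean = sum(data) / len(data)

# Step 2: subtract the mean from each value and ignore the sign (absolute value).
absolute_deviations = [abs(x - mean) for x in data]

# Step 3: find the mean of those absolute deviations.
mean_deviation = sum(absolute_deviations) / len(data)

print("Mean:", mean)
print("Absolute deviations:", absolute_deviations)
print("Mean deviation:", mean_deviation)
```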
Q12 What is Standard deviation?
Standard Deviation: Grouped Data and Ungrouped Data (Prof. Karl Pearson 1894) Standard Deviation
is the square root of the mean of the squares of the individual deviations from the mean of the distribution.
Standard deviation is a measure which shows how much variation (spread or dispersion)
from the mean exists. The standard deviation indicates a "typical" deviation from the mean. It is a popular
measure of variability because it is expressed in the original units of measure of the data set. As with the
variance, if the data points are close to the mean, the standard deviation is small, whereas if the data points are
widely spread out from the mean, it is large. Standard deviation calculates the extent to
which the values differ from the average. Standard deviation, the most widely used measure of
dispersion, is based on all values. Therefore a change in even one value affects the value of the standard
deviation. It is independent of origin but not of scale. It is also useful in certain advanced statistical
problems.
Variance and Standard Deviation Formula
The formulas for the variance and the standard deviation are given below:
Standard Deviation Formula
The population standard deviation formula is given as:
$$\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(X_i - \mu)^2}$$
The population variance is the square of this quantity, σ².
Here,
σ = Population standard deviation
N = Number of observations in population
Xi = ith observation in the population
μ = Population mean
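The statement that the standard deviation is independent of origin but not of scale can be illustrated with a minimal sketch. It assumes the population formula (dividing by N); the data set, the shift of 10, and the scale factor of 3 are hypothetical.

```python
import math


def population_sd(values):
    """Population standard deviation: sqrt of the mean squared deviation from the mean."""
    n = len(values)
    mu = sum(values) / n
    return math.sqrt(sum((x - mu) ** 2 for x in values) / n)


data = [2, 4, 4, 4, 5, 5, 7, 9]                      # hypothetical population

original = population_sd(data)
shifted = population_sd([x + 10 for x in data])      # change of origin: add a constant
scaled = population_sd([3 * x for x in data])        # change of scale: multiply by a constant

print("Original SD:", original)
print("After adding 10:", shifted)
print("After multiplying by 3:", scaled)
```

Adding a constant leaves the spread unchanged, while multiplying by a constant multiplies the standard deviation by the same factor.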
Q13 Spearman correlation coefficient: Definition, Formula and Calculation with Example
Spearman correlation coefficient: Definition
Spearman's rank coefficient of correlation is a nonparametric measure of rank correlation (the statistical
dependence between the rankings of two variables).
Named after Charles Spearman, it is often denoted by the Greek letter 'ρ' (rho) and is primarily used
for data analysis.
It measures the strength and direction of the association between two ranked variables. But before we talk
about the Spearman correlation coefficient, it is important to understand Pearson's correlation first. A
Pearson correlation is a statistical measure of the strength of a linear relationship between paired data.
For the calculation and significance testing of Pearson's correlation, the data must satisfy the following
assumptions:
Interval or ratio level
Linearly related
Bivariate normally distributed
If your data doesn't meet the above assumptions, then you would need Spearman's coefficient. To
understand the Spearman correlation coefficient, it is necessary to know what a monotonic function is. A
monotonic function is one that either never decreases or never increases as its independent variable
increases. A monotonic function can be explained using the image below:
[Figure: monotonic function]
When there are no tied ranks, the Spearman coefficient is calculated using the formula:
$$\rho = 1 - \frac{6\sum d_i^2}{n(n^2 - 1)}$$
Here,
n = number of data points of the two variables
di = difference in ranks of the "ith" element
The Spearman coefficient, ρ, can take a value between -1 and +1, where:
A ρ value of +1 means a perfect positive association of ranks
A ρ value of 0 means no association of ranks
A ρ value of -1 means a perfect negative association of ranks
The closer the ρ value is to 0, the weaker the association between the two ranks.
We must be able to rank the data before proceeding with Spearman's rank coefficient of correlation.
It is also important to check whether, as one variable increases, the other follows a monotonic relation.
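The rank-difference calculation can be sketched as below. This is a minimal illustration that assumes the standard no-ties formula ρ = 1 − 6Σd²/(n(n² − 1)), with n and d as defined above; the helper names and the data are hypothetical, and tied values are not handled.

```python
def ranks(values):
    """Rank each value from 1 (smallest) to n. Assumes there are no tied values."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def spearman_rho(x, y):
    """Spearman's rho via 1 - 6 * sum(d^2) / (n * (n^2 - 1)); valid only without ties."""
    if len(x) != len(y):
        raise ValueError("both variables need the same number of data points")
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


x = [10, 20, 30, 40, 50]
y = [1, 3, 2, 5, 4]                                # mostly increasing with x
print(spearman_rho(x, y))

# A monotonic but non-linear relation still has perfectly matching ranks.
print(spearman_rho(x, [v ** 3 for v in x]))

# Ranks in exactly opposite order.
print(spearman_rho(x, [5, 4, 3, 2, 1]))
```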
Q14 Karl Pearson’s Coefficient of Correlation
Karl Pearson's coefficient is one of the most widely used measures in statistics. It is defined as a
measure of linear correlation that takes values in the range -1 to +1.
This is a quantitative method that gives a numeric value for the intensity of the linear relationship
between the X and Y variables. But is it really useful for economic calculations? Let us delve
into this topic to get more detailed information on the subject matter: the Karl Pearson Coefficient of
Correlation.
What do You mean by Correlation Coefficient?
Before delving into details about Karl Pearson Coefficient of Correlation, it is vital to brush up on
fundamental concepts about correlation and its coefficient in general.
The correlation coefficient can be defined as a measure of the relationship between two quantitative or
qualitative variables, i.e., X and Y. It serves as a statistical tool that helps to analyze and, in turn, measure
the degree of the linear relationship between the variables.
For example, a change in the monthly income (X) of a person leads to a change in their monthly
expenditure (Y). With the help of correlation, you can measure the degree to which a change in one
variable is associated with a change in the other.
Types of Correlation Coefficient
Depending on the direction of the relationship between variables, correlation can be of three types,
namely –
Positive Correlation (0 to +1)
Negative Correlation (0 to -1)
Zero Correlation (0)
Positive Correlation (0 to +1): In this case, the direction of change between X and Y is the same. For
instance, an increase in the duration of a workout leads to an increase in the number of calories one burns.
Negative Correlation (0 to -1): Here, the direction of change between X and Y variables is opposite. For
example, when the price of a commodity increases its demand decreases.
Zero Correlation (0): There is no relationship between the variables in this case. For instance, an
increase in height has no impact on one’s intelligence.
Now that we have refreshed our memory of these basics, let's move on to Karl Pearson Coefficient of
Correlation.
What is Karl Pearson’s Coefficient of Correlation?
This method is also known as the Product Moment Correlation Coefficient and was developed by Karl
Pearson. It is one of the three most potent and extensively used methods to measure the level of
correlation, besides the Scatter Diagram and Spearman's Rank Correlation.
The Karl Pearson correlation coefficient method is quantitative and offers numerical value to
establish the intensity of the linear relationship between X and Y.
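The text does not state the formula, so the sketch below assumes the standard product-moment form, r = Σ(x − x̄)(y − ȳ) / √(Σ(x − x̄)² · Σ(y − ȳ)²). The function name and the X and Y values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Product-moment correlation: co-deviation sum divided by the root of the squared-deviation sums."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    co_deviation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return co_deviation / math.sqrt(ss_x * ss_y)


x = [1, 2, 3, 4, 5]
print(pearson_r(x, [2, 1, 4, 3, 5]))     # positive, but not perfect
print(pearson_r(x, [2, 4, 6, 8, 10]))    # perfect positive linear relationship
print(pearson_r(x, [10, 8, 6, 4, 2]))    # perfect negative linear relationship
```

The result always lies between -1 and +1, matching the range described for the coefficient.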
Q15 Use of Chi-Square Test
A chi-squared test (symbolically represented as χ²) is a method of analysing data based on
observations of a random set of variables. Usually, it is a comparison of two statistical data sets. This test
was introduced by Karl Pearson in 1900 for categorical data analysis and distribution, so it is also
referred to as Pearson's chi-squared test.
The chi-square test is used to estimate how likely the observations that are made would be,
assuming the null hypothesis is true.
A hypothesis is a supposition that a given condition or statement might be true, which we can test
afterwards. Chi-squared test statistics are usually constructed from a sum of squared errors, or from the
sample variance.
Chi-Square Distribution
When the null hypothesis is true, the sampling distribution of the test statistic is called
the chi-squared distribution. The chi-squared test helps to determine whether there is a notable
difference between the expected frequencies and the observed frequencies in one or more classes or
categories. The resulting p-value gives the probability of seeing a difference at least this large when the
null hypothesis (for example, that the variables are independent) is true.
Note: The chi-squared test is applicable only to categorical data, such as gender (men and women), or
age and height grouped into categories.
Properties
The following are the important properties of the chi-square distribution:
The variance is equal to two times the number of degrees of freedom.
The mean of the distribution is equal to the number of degrees of freedom.
The chi-square distribution curve approaches the normal distribution as the number of degrees of
freedom increases.
Formula
The chi-squared test is done to check if there is any difference between the observed value and expected
value. The formula for chi-square can be written as:
χ² = ∑(Oi – Ei)²/Ei
where Oi is the observed value and Ei is the expected value.
Chi-Square Test of Independence
The chi-square test of independence, also known as the chi-square test of association, is used to
determine whether there is an association between categorical variables. It is considered a non-parametric
test and is mostly used to test statistical independence.
The chi-square test of independence is not appropriate when the categorical variables represent
pre-test and post-test observations. For this test, the data must meet the following requirements:
Two categorical variables
Relatively large sample size
Categories of variables (two or more)
Independence of observations
Q What is a chi-square test?
Pearson's chi-square (Χ²) tests, often referred to simply as chi-square tests, are among the most
common nonparametric tests. Nonparametric tests are used for data that don't follow the assumptions of
parametric tests, especially the assumption of a normal distribution.
If you want to test a hypothesis about the distribution of a categorical variable you'll need to use a
chi-square test or another nonparametric test. Categorical variables can be nominal or ordinal and represent
groupings such as species or nationalities. Because they can only have a few specific values, they can't
have a normal distribution.
The chi-square formula
Both of Pearson's chi-square tests use the same formula to calculate the test statistic, chi-square (Χ²):
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
Where:
Χ² is the chi-square test statistic
Σ is the summation operator (it means "take the sum of")
O is the observed frequency
E is the expected frequency
When to use a chi-square test
A Pearson's chi-square test may be an appropriate option for your data if all of the following are true:
1. You want to test a hypothesis about one or more categorical variables. If one or more of your
variables is quantitative, you should use a different statistical test. Alternatively, you could convert
the quantitative variable into a categorical variable by separating the observations into intervals.
2. The sample was randomly selected from the population.
3. There are a minimum of five observations expected in each group or combination of groups.
Types of chi-square tests
The two types of Pearson's chi-square tests are:
Chi-square goodness of fit test
Chi-square test of independence
Mathematically, these are actually the same test. However, we often think of them as different tests
because they're used for different purposes.
Chi-square goodness of fit test
You can use a chi-square goodness of fit test when you have one categorical variable. It allows you to
test whether the frequency distribution of the categorical variable is significantly different from your
expectations. Often, but not always, the expectation is that the categories will have equal proportions.
Example: Hypotheses for chi-square goodness of fit test
Expectation of equal proportions
Null hypothesis (H0): The bird species visit the bird feeder in equal proportions.
Alternative hypothesis (HA): The bird species visit the bird feeder in different proportions.
Expectation of different proportions
Null hypothesis (H0): The bird species visit the bird feeder in the same proportions as the average
over the past five years.
Alternative hypothesis (HA): The bird species visit the bird feeder in different proportions from
the average over the past five years.
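A goodness of fit calculation for the equal-proportions hypothesis can be sketched as follows. The visit counts for three bird species are hypothetical. With three categories there are 2 degrees of freedom, and for that specific case the chi-square upper-tail probability has the closed form exp(−χ²/2); that shortcut does not apply to other degrees of freedom.

```python
import math


def chi_square_statistic(observed, expected):
    """Chi-square statistic: sum over categories of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts of visits to the feeder by three bird species.
observed = [30, 50, 40]
total_visits = sum(observed)

# Null hypothesis: equal proportions, so each species is expected equally often.
expected = [total_visits / len(observed)] * len(observed)

chi2 = chi_square_statistic(observed, expected)
degrees_of_freedom = len(observed) - 1

# Closed-form p-value, valid only because degrees_of_freedom == 2 here.
p_value = math.exp(-chi2 / 2)

print("Chi-square:", chi2)
print("Degrees of freedom:", degrees_of_freedom)
print("p-value:", round(p_value, 4))
print("Reject H0 at the 5% level:", p_value < 0.05)
```

To test against different expected proportions, such as a five-year average, only the `expected` list changes.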
Chi-square test of independence
You can use a chi-square test of independence when you have two categorical variables. It allows you
to test whether the two variables are related to each other. If two variables are independent (unrelated),
the probability of belonging to a certain group of one variable isn't affected by the other variable.
Example: Chi-square test of independence
Null hypothesis (H0): The proportion of people who are left-handed is the same for Americans
and Canadians.
Alternative hypothesis (HA): The proportion of people who are left-handed differs between
nationalities.
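A test of independence for this kind of example can be sketched on a 2 × 2 table. The counts are hypothetical. Expected frequencies are computed as (row total × column total) / grand total, no continuity correction is applied, and the p-value uses the identity that holds only for 1 degree of freedom: p = erfc(√(χ²/2)).

```python
import math


def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for a contingency table of counts."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were independent.
            expected = row_totals[i] * column_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    degrees_of_freedom = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, degrees_of_freedom


# Hypothetical counts: rows = Americans, Canadians; columns = left-handed, right-handed.
table = [
    [12, 88],
    [10, 90],
]

chi2, degrees_of_freedom = chi_square_independence(table)

# Upper-tail probability, valid only because degrees_of_freedom == 1 here.
p_value = math.erfc(math.sqrt(chi2 / 2))

print("Chi-square:", round(chi2, 4))
print("Degrees of freedom:", degrees_of_freedom)
print("Reject H0 at the 5% level:", p_value < 0.05)
```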
Q16 Characteristics of Chi square test in Statistics
The characteristics of the chi-square test in statistics are given below:
1. This test (as a non-parametric test) is based on frequencies and not on the parameters like mean
and standard deviation.
2. The test is used for testing the hypothesis and is not useful for estimation.
3. This test possesses the additive property as has already been explained.
4. This test can also be applied to a complex contingency table with several classes and as such is a
very useful test in research work.
5. This test is an important non-parametric test as no rigid assumptions are necessary in regard to
the type of population, no need of parameter values and relatively less mathematical details are
involved.
Q17 Meaning and Types of Correlation
Correlation is a statistical measure that indicates the extent to which two variables are linearly related
(meaning that they change together at a constant rate). It is a simple and popular tool for describing
relationships without making a statement about cause and effect.
In simple words, correlation is a statistical measure of the degree to which two variables
move in relation to each other.
A perfect positive correlation means that the correlation coefficient is exactly +1. It indicates that
when one variable moves upward or downward, the other variable moves in the same direction.
A perfect negative correlation (a coefficient of -1) indicates that the two variables move in opposite
directions. When there is zero correlation, it means that there is no linear relationship at all.
What is Correlation?
Correlation measures the relationship, or association, between two variables by looking at how the
variables change with respect to each other. Statistical correlation also corresponds to simultaneous
changes between two variables, and it is usually represented by linear relationships. Importantly,
correlation does not necessarily mean causation. This is because a correlation describes how two or more
variables are related, and not whether they cause changes in one another.
Types of Correlation
High and Low Correlation
High correlation describes a stronger correlation between two variables, wherein a change in the first has
a close association with a change in the second. Low correlation describes a weaker correlation, meaning
that the two variables are probably not related.
Positive, Negative, and No Correlation
A correlation in statistics denotes a linear relationship. A positive correlation means that this linear
relationship is positive, and the two variables increase or decrease in the same direction. A negative
correlation is just the opposite, wherein the relationship line has a negative slope and the variables change
in opposite directions (i.e, one variable decreases while the other increases). No correlation simply means
that the variables behave very differently and thus, have no linear relationship.
As the corresponding graphs show, we can conclude the following correlations:
Temperature and ice cream sales: the hotter the day, the higher the ice cream sales. This is a
positive correlation.
Length of workout and body mass index (BMI): the longer the workout, the lower the BMI. This is
a negative correlation.
Shoe size and hair color: shoe size has no relation to hair color. This has no correlation.
Correlation Coefficient
The correlation coefficient is an important statistical indicator of a correlation and how the two variables
are indeed correlated (or not). This is a value denoted by the letter r, and it ranges between -1 and +1.
r < 0 implies negative correlation
r > 0 implies positive correlation
r = 0 implies no linear correlation
For example, if the hot days and ice cream sales correlation coefficient was found to be 0.8, this means
that the correlation between the two variables is positive and strong.
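The coefficient r can be computed directly from paired observations. The sketch below is a minimal illustration of the usual Pearson formula; the temperature and sales figures are hypothetical and are not taken from these notes.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: daily temperature (degrees C) and ice cream sales (units)
temperature = [20, 24, 28, 32, 36]
sales = [110, 150, 170, 230, 260]

r = pearson_r(temperature, sales)
print(round(r, 2))  # close to +1, i.e. a strong positive correlation
```

A value of r near +1 or -1 indicates a strong linear relationship, while a value near 0 indicates a weak one.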
Q18 Different Types of Correlation
(A) MEANING OF CORRELATION
a. Two variables can have some kind of relationship, i.e., a change in one may be accompanied by a
change in the other. Examples: Price and demand, height and weight, temperature and demand for soft
drinks
b. If a change in the value of one variable is accompanied by a simultaneous change in the other
variable in the same or opposite direction, then it is termed as correlation, or these variables are
said to be correlated.
(B) TYPES OF CORRELATION
There are three types of correlation:
a. Positive and negative correlation
b. Linear and non-linear correlation
c. Simple, multiple, and partial correlation
Q19 Importance of Statistics
The important functions of statistics are:
Statistics helps in gathering information about the appropriate quantitative data
It depicts the complex data in the graphical form, tabular form and in diagrammatic representation,
to understand it easily
It provides the exact description and better understanding
It helps in designing the effective and proper planning of the statistical inquiry in any field
It gives valid inferences with the reliability measures about the population parameters from the
sample data
It helps to understand the variability pattern through the quantitative observations
Characteristics of Statistics
The important characteristics of Statistics are as follows:
Statistics are numerically expressed
They are aggregates of facts
Data are collected in a systematic order
Data should be comparable with each other
Data are collected for a planned purpose
Q20 Arithmetic mean
Arithmetic mean (or, simply, "mean") is nothing but the average. It is computed by adding all the values
in the data set and dividing the sum by the number of observations in it. If we have the raw data, the
mean is given by the formula:
Mean (x̄) = ∑x / n
where ∑ (the uppercase Greek letter sigma) refers to summation, x refers to the individual value and n is
the number of observations in the sample (sample size). Research articles published in journals often do
not provide raw data and, in such a situation, readers can compute the mean from the
frequency distribution (if provided):
Mean (x̄) = ∑fx / n
where f is the frequency, x is the midpoint of the class interval and n is the number of observations.[3]
The standard statistical notations (in relation to measures of central tendency) are mentioned in [Table 1].
Readers are cautioned that the mean calculated from the frequency distribution is not exactly the same as
that calculated from the raw data. It approaches the mean calculated from the raw data as the number of
intervals increases.
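The difference between the two formulas can be seen in a short sketch. The raw values and class intervals below are hypothetical, chosen only to illustrate that the grouped mean approximates the raw mean.

```python
def mean_raw(values):
    """Mean of raw data: sum of values divided by their count."""
    return sum(values) / len(values)

def mean_grouped(classes):
    """Mean from a frequency distribution.

    classes is a list of (lower, upper, frequency) tuples; each class
    is represented by its midpoint.
    """
    n = sum(f for _, _, f in classes)
    total = sum(f * (lower + upper) / 2 for lower, upper, f in classes)
    return total / n

# Hypothetical raw data and the same data grouped into three classes
raw = [3, 7, 12, 14, 18, 21, 25, 29]
grouped = [(0, 10, 2), (10, 20, 3), (20, 30, 3)]

print(mean_raw(raw))          # 16.125
print(mean_grouped(grouped))  # 16.25
```

The two results are close but not identical, because grouping replaces every value in a class with the class midpoint.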
Q20 MS Excel
Introduction: MS Excel is a commonly used Microsoft Office application. It is a spreadsheet program
which is used to save and analyse numerical data.
In this article, we bring to you the important features of MS Excel, along with an overview of how to use
the program, its benefits and other important elements. A few sample MS Excel question and answers are
also given further below in this article for the reference of Government exam aspirants.
Basics of MS Excel
MS Excel is a spreadsheet program where one can record data in the form of tables. It is easy to analyse
data in an Excel spreadsheet.
How to open MS Excel?
To open MS Excel on your computer, follow the steps given below:
Click on Start
Then All Programs
Next step is to click on MS Office
Then finally, choose the MS-Excel option
Features of MS Excel
Various editing and formatting can be done on an Excel spreadsheet. Discussed below are the various
features of MS Excel.
Home: Comprises options like font size, font styles, font colour, background colour, alignment,
formatting options and styles, insertion and deletion of cells and editing options
Insert: Comprises options like table format and style, inserting images and figures, adding graphs,
charts and sparklines, header and footer option, equation and symbols
Page Layout: Themes, orientation and page setup options are available under the page layout
option
Formulas: Since tables with a large amount of data can be created in MS excel, under this feature,
you can add formulas to your table and get quicker solutions
Data: Adding external data (from the web), filtering options and data tools are available under this
category
Review: Proofreading can be done for an excel sheet (like spell check) in the review category and
a reader can add comments in this part
View: Different views in which we want the spreadsheet to be displayed can be edited here.
Options to zoom in and out and pane arrangement are available under this category
Benefits of Using MS Excel
MS Excel is widely used for various purposes because data is easy to save, and information can be
added and removed with little effort.
Given below are a few important benefits of using MS Excel:
Easy To Store Data: A worksheet can hold a very large amount of information (up to 1,048,576 rows
and 16,384 columns in current versions), so MS Excel is widely used to save and analyse data.
Filtering information in Excel is easy and convenient.
Easy To Recover Data: If information is written on a piece of paper, finding it may take
longer. This is not the case with Excel spreadsheets, where finding and recovering data is easy.
Application of Mathematical Formulas: Doing calculations has become easier and less
time-consuming with the formulas option in MS Excel
More Secure: These spreadsheets can be password protected on a laptop or personal computer, and
the probability of losing them is much lower than for data written in registers or on pieces of
paper.
Data at One Place: Earlier, when work was done on paper, data had to be kept in different files and
registers. Now this has become convenient, as more than one worksheet can be added to a single
MS Excel file.
Neater and Clearer Visibility of Information: When data is saved in the form of a table,
analysing it becomes easier. Thus, information in a spreadsheet is more readable and
understandable.
Excel Basic Functions
Below is a list of simple but helpful functions that are essential for building your skill in
Excel.
SUM: This is the first Excel function I will be familiarizing you with. It performs the
basic arithmetic operation of addition. A SUM formula in Excel should include at least one number,
a reference to a cell, or a range of cells.
AVERAGE: The second function we will be looking at is the average. The Excel AVERAGE function does
exactly what its name implies; that is, it finds the average, or arithmetic mean, of numbers.
MAXIMUM & MINIMUM: The MAX and MIN formulas in Excel get the highest and lowest value in a
set of numbers, respectively.
COUNT & COUNTA: Two other important functions in Excel are COUNT and COUNTA. If you
need to know how many cells in a given range contain numeric values (numbers or dates), you
don't need to waste time counting them by hand; the Excel COUNT function will do the trick. COUNTA
counts all non-empty cells in the range, whatever they contain.
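What each of these functions computes can be mirrored in a few lines of Python. This is only an analogy under stated assumptions: the list `cells` is a hypothetical cell range, with `None` standing for an empty cell, and the Excel formulas in the comments assume the range A1:A6.

```python
# Hypothetical cell range A1:A6; None represents an empty cell
cells = [12, 7.5, "n/a", None, 20, 3]

# Like Excel, ignore text and empty cells when doing arithmetic
numbers = [c for c in cells
           if isinstance(c, (int, float)) and not isinstance(c, bool)]

total = sum(numbers)              # like =SUM(A1:A6)
average = total / len(numbers)    # like =AVERAGE(A1:A6)
highest = max(numbers)            # like =MAX(A1:A6)
lowest = min(numbers)             # like =MIN(A1:A6)
count = len(numbers)              # like =COUNT(A1:A6): numeric cells only
counta = sum(1 for c in cells if c is not None)  # like =COUNTA(A1:A6)

print(total, average, highest, lowest, count, counta)
# 42.5 10.625 20 3 4 5
```

COUNT returns 4 here because only four cells hold numbers, while COUNTA returns 5 because the text cell is also non-empty.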
Q21 MS Office
Introduction: Microsoft Office is a software suite that was first announced by Microsoft in 1988. This Office
suite comprises various applications which form the core of computer usage in today's world.
From the examination point of view, questions from MS Office and its applications are frequently asked
in all the major Government Exams conducted in the country.
In this article, we shall discuss at length Microsoft Office, its applications, important notes to prepare for
the upcoming examinations and some sample questions and answers for the reference of candidates.
Competitive exams including Bank, SSC, Railways, Insurance, etc. have Computer Knowledge as an
integral part of their exam syllabus and candidates must note that it can be the most scoring too.
Thus, candidates must focus on this section to improve their overall performance and improve their mark
sheet.
MS Office Applications & its Functions
Currently, MS Office 2016 version is being used across the world and all its applications are widely used
for personal and professional purposes.
Discussed below are the applications of Microsoft Office along with each of their functions.
1. MS Word
First released on October 25, 1983
Extension for Doc files is ".doc" (".docx" since Word 2007)
It is useful in creating text documents
Templates can be created for Professional use with the help of MS Word
WordArt, colours, images and animations can be added along with the text in the same file, which is
downloadable in the form of a document
Authors can use it for writing/editing their work
2. MS Excel
Majorly used for making spreadsheets
A spreadsheet consists of grids in the form of rows and columns which is easy to manage and can
be used as a replacement for paper
It is a data processing application
Large data can easily be managed and saved in tabular format using MS Excel
Calculations can be done based on the large amount of data entered into the cells of a spreadsheet
within seconds
File extension, when saved in the computer, is ".xls" (".xlsx" since Excel 2007)
3. MS PowerPoint
It was released on April 20, 1987
Used to create audiovisual presentations
Each presentation is made up of various slides displaying data/ information
Each slide may contain audio, video, graphics, text, bullet numbering, tables etc.
The extension for PowerPoint presentations is ".ppt" (".pptx" since PowerPoint 2007)
Used majorly for professional usage
Using PowerPoint, presentations can be made more interactive
In terms of graphical user interface, MS PowerPoint makes it possible to create interesting and appealing
presentations and documents.
4. MS Access
It was released on November 13, 1992
It is Database Management Software (DBMS)
Tables, queries, forms and reports can be created in MS Access
Import and export of data into other formats can be done
The file extension is ".accdb"
5. MS Outlook
It was released on January 16, 1997
It is a personal information management system
It can be used both as a single-user application or multi-user software
Its functions also include task managing, calendaring, contact managing, journal logging and web
browsing
It is the email client of the Office Suite
The file extension for an Outlook file is ".pst"
6. MS OneNote
It was released on November 19, 2003
It is a note-taking application
When introduced, it was a part of the Office suite only. Later, the developers made it free,
standalone and easily available on the Play Store for Android devices
The notes may include images, text, tables, etc.
The extension for OneNote files is ".one"
It can be used both online and offline and is a multi-user application
Apart from the applications mentioned above, various other applications are included in the MS Office
suite but these are most commonly used ones and questions based on the same may be asked in the
upcoming exams as well.
Q22 MS Word
Microsoft Word is one of the most popular applications for creating documents, used by all types of users.
This guide is for users who want to learn Microsoft Word basics but don't have much experience with
computers or Microsoft software. It will provide you with a solid foundation in MS Word, allowing you to
progress to greater levels of proficiency.
Basics of Microsoft Word
You may use Microsoft Office Word to create and modify personal and business documents like letters,
reports, invoices, emails, and books. Documents saved in Word are saved with the .docx extension by
default.
Microsoft Word can be used for a variety of tasks:
Creating business papers with a variety of images, such as photos, charts, and diagrams.
Saving and reusing pre-formatted text and elements like cover pages and sidebars.
Making letters and letterheads for both personal and professional use.
Creating a variety of documents, including resumes and invitation cards.
Producing a variety of letters, ranging from simple office memos to legal copies and reference
documents.
Now, let us first understand some basic aspects of the application. You can open the application on your
personal computer while following these simple steps:
Start → All Programs → MS Office → MS Word
List of Features of MS Word
Home: This feature of MS word has options like font colour, font size, font style, alignment, bullets,
line spacing, etc. Additionally, all the basic elements that one may need to edit their document are
available under the Home option.
Insert: You can insert tables, shapes, images, charts, graphs, headers, footers, page numbers, etc., into
the document. These features of MS Word are available under the “Insert” tab.
Design: Under the Design tab, you can create or select the template or design in which you
want your document to appear. Moreover, choosing an appropriate
template will enhance the appearance of your document in MS Word.
Page Layout: The Page Layout tab comes with options like margins,
orientation, columns, lines, indentation, spacing, etc.
References: This tab is the most useful feature of MS word for those who are creating a thesis or
writing books or lengthy documents. Options like citation, footnote, table of contents, caption,
bibliography, etc. are present under this tab.
Review: Spell check, grammar, thesaurus, word count, language, translation, comments, etc.,
everything is trackable under the review tab. Additionally, it benefits those who review their
documents in Microsoft Word.
Advanced Features Of MS Word
With the basic features of MS Word out of the way, here are a few advanced features that many of you
are most likely in the dark about. These features offer a cleaner and
more customised MS Word experience.
Moreover, we have listed the shortcuts to these features so that you don't have to waste much of your
time. Just use these shortcuts and see the magic! What are we waiting for? Let's hop right in!
Turn on the Distraction Free Mode by using the Alt + W + F shortcut
Quickly summon the Clipboard and hold up to 24 items for you to cut, copy and paste around
using the Ctrl + C Double Press shortcut
You can translate documents anytime anywhere by heading over to Review > Translate
Transform tables into graphs by navigating to Insert > Object > Object Types > Microsoft
Graph Chart
You can easily hide the Ribbon Panel by using the Ctrl + F1 shortcut
Q23 MS PowerPoint
MS PowerPoint is a program that is included in the Microsoft Office suite. It is used to make
presentations for personal and professional purposes.
In this article, we shall discuss in detail the functions and features of a PowerPoint presentation, followed
by some sample questions based on this topic for the upcoming competitive exams.
Basics of MS PowerPoint
Discussed below are a few questions that one must be aware of while discussing the basics of MS
PowerPoint. Once this is understood, using the program and analysing how to use it more creatively shall
become easier.
Question: What is MS PowerPoint?
Answer: PowerPoint (PPT) is a powerful, easy-to-use presentation graphics software program that allows
you to create professional-looking electronic slide shows.
Features of MS PowerPoint
There are multiple features that are available in MS PowerPoint which can customise and optimise a
presentation. The same have been discussed below.
Slide Layout
Multiple options and layouts are available based on which a presentation can be created. This option is
available under the "Home" section and one can select from the multiple layout options provided.
Insert – Clipart, Video, Audio, etc.
Under the "Insert" category, multiple options are available where one can choose what feature they want
to insert in their presentation. This may include images, audio, video, header, footer, symbols, shapes,
etc.
Slide Design
MS PowerPoint has various themes using which background colour and designs or textures can be added
to a slide. This makes the presentation more colourful and attracts the attention of the people looking at it.
Animations
During the slide show, the slides appear on the screen one after the other. In case one wants to add some
animations to the way in which a slide presents itself, they can refer to the "Animations" category.
Uses of PowerPoint Presentation
PowerPoint presentations are useful for both personal and professional usage. Given below are a few of
the major fields where PPT is extremely useful:
Education – With e-learning and smart classes being chosen as a common mode of education
today, PowerPoint presentations can help in making education more interactive and attract students
towards the modified version of studying
Marketing – In the field of marketing, PowerPoint presentations can be extremely important.
Using graphs and charts, numbers can be shown more evidently and clearly which may be ignored
by the viewer if being read
Business – To invite investors or to show the increase or decrease in profits, MS PowerPoint can
be used
Creating Resumes – Digital resumes can be formed using MS PowerPoint. Different patterns,
photograph, etc. can be added to the resume
Depicting Growth – Since both graphics and text can be added in a presentation, depicting the
growth of a company, business, student's marks, etc. is easier using PPT.
Q24 Use of computer and internet in social work practice
Overview Of Information Technology In Social Work Practice
In this session, we will look at applications that have been developed specifically for use in social work or
more broadly human service practice.
Assessment and Testing: Assessment and testing applications constitute by far the largest number of
computer applications in social work practice. There are more than 250 such programs currently
available, most of them consisting of computer administered testing, scoring, and interpretation packages.
Most of the commonly used assessment devices are now available in computerized versions, including
DSM-IV interviewing and diagnostic programs. The Clinical Measurement Package developed by Walter
Hudson provides administration and scoring of all Hudson rapid assessment scales, now numbering more
than 30 (Nurius & Hudson, 1993b). Demonstration and educational versions of this package are available
free of cost. Information about many of these applications is readily available on the internet.
Computerized Clinical Records: Computerized clinical record keeping systems maintain the complete
case record on the computer and usually incorporate case management and caseload management
functions as well. Typically, clinical record keeping systems operate on a network or large computer, and
the individual worker accesses the system through a computer located in his or her office. Because of the
complexities and idiosyncrasies of record keeping requirements, most of these systems have been custom
designed for a particular agency and are not available on the open market. Other record keeping programs
emphasize treatment planning and case management. These systems assist the practitioner in conducting
the assessment and selecting treatment goals, and then help monitor client change over time (Corcoran &
Gingerich, 1994; Gingerich, 1995b). Increasingly, case management programs will be able to use the
information contained in the case record to assist workers in managing their caseloads more efficiently
and may even reduce the amount of time spent on paperwork.
Q Obtain Quartile deviation and standard deviation of the following (Discrete Series)
X    F    C.F
0    2    2
1    4    2+4 = 6
2    7    6+7 = 13
3    6    13+6 = 19
4    2    19+2 = 21
5    1    21+1 = 22
     N = 22
Q.D = (Q3-Q1)/2
Q1 = value of the (N+1)/4 th item = (22+1)/4 = 5.75th item
Q3 = value of the 3(N+1)/4 th item = 3(22+1)/4 = 17.25th item
From the cumulative frequencies, Q1 = 1, Q3 = 3
Quartile Deviation = (Q3-Q1)/2 = (3-1)/2 = 2/2 = 1 Ans
X    f    f.x   d = (x - x̄)          d2      f.d2
0    2    0     0-2.22 = -2.22      4.92    9.84
1    4    4     1-2.22 = -1.22      1.48    5.92
2    7    14    2-2.22 = -0.22      0.04    0.28
3    6    18    3-2.22 = 0.78       0.60    3.60
4    2    8     4-2.22 = 1.78       3.16    6.32
5    1    5     5-2.22 = 2.78       7.72    7.72
     22   49                                33.68
Mean x̄ = ∑fx/N = 49/22 = 2.22
Standard Deviation = σ = √(∑fd2/N) = √(33.68/22) = √1.53 = 1.24 Ans
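The working for this discrete series can be checked with a short sketch. The helper names `item_at`, `quartile_deviation` and `std_dev` are illustrative, not from the notes. The code uses the unrounded mean, so intermediate figures differ slightly from a hand calculation that rounds the mean to 2.22.

```python
import math

def item_at(values, freqs, position):
    """Return the value whose cumulative frequency first reaches position."""
    cumulative = 0
    for x, f in zip(values, freqs):
        cumulative += f
        if cumulative >= position:
            return x
    raise ValueError("position exceeds total frequency")

def quartile_deviation(values, freqs):
    """Return (Q1, Q3, quartile deviation) using the (N+1)/4 rule."""
    n = sum(freqs)
    q1 = item_at(values, freqs, (n + 1) / 4)
    q3 = item_at(values, freqs, 3 * (n + 1) / 4)
    return q1, q3, (q3 - q1) / 2

def std_dev(values, freqs):
    """Return (mean, standard deviation) of a discrete frequency series."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / n
    variance = sum(f * (x - mean) ** 2 for x, f in zip(values, freqs)) / n
    return mean, math.sqrt(variance)

x = [0, 1, 2, 3, 4, 5]
f = [2, 4, 7, 6, 2, 1]

q1, q3, qd = quartile_deviation(x, f)
mean, sigma = std_dev(x, f)
print(q1, q3, qd)                       # 1 3 1.0
print(round(mean, 2), round(sigma, 2))  # 2.23 1.24
```

Note that the standard deviation is the square root of ∑fd2/N, not ∑fd2/N itself.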
Q Standard Deviation Continuous Series
Class    Frequency (f)   x (Mid Value)     f.x    d = (x - x̄)     d2     f.d    f.d2
0-10     1               (0+10)/2 = 5      5      5-26 = -21      441    -21    441
10-20    2               (10+20)/2 = 15    30     15-26 = -11     121    -22    242
20-30    3               (20+30)/2 = 25    75     25-26 = -1      1      -3     3
30-40    3               (30+40)/2 = 35    105    35-26 = 9       81     27     243
40-50    1               (40+50)/2 = 45    45     45-26 = 19      361    19     361
         N = 10                            260                           0      1290
Mean = Sum of all observations / Total number
Mean x̄ = ∑f.x/N = 260/10 = 26
Standard Deviation σ = √(∑f.d2/N – (∑f.d/N)2) = √(1290/10 – (0/10)2) = √129 = 11.36 Ans
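The same calculation can be sketched in code, taking each class midpoint as the representative value. The variable names are illustrative.

```python
import math

# (lower limit, upper limit, frequency) for each class interval
classes = [(0, 10, 1), (10, 20, 2), (20, 30, 3), (30, 40, 3), (40, 50, 1)]

freqs = [f for _, _, f in classes]
mids = [(lower + upper) / 2 for lower, upper, _ in classes]
n = sum(freqs)

mean = sum(f * m for f, m in zip(freqs, mids)) / n

# Deviations d are taken from the mean, so sum_fd comes out as zero
sum_fd = sum(f * (m - mean) for f, m in zip(freqs, mids))
sum_fd2 = sum(f * (m - mean) ** 2 for f, m in zip(freqs, mids))

sigma = math.sqrt(sum_fd2 / n - (sum_fd / n) ** 2)
print(mean, sum_fd, sum_fd2, round(sigma, 2))  # 26.0 0.0 1290.0 11.36
```

The correction term (∑f.d/N)2 vanishes here because the deviations are measured from the actual mean. It matters only when an assumed mean is used instead.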
Q Quartile Deviation (Discrete Series)
Marks    No of Students (f)    C.f
10       4                     4
20       7                     11
30       15                    26
40       8                     34
50       7                     41
60       2                     43
Quartile Deviation = (Q3-Q1)/2
Q1 = value of the (N+1)/4 th item = (43+1)/4 = 44/4 = 11th item
Q3 = value of the 3(N+1)/4 th item = 3(43+1)/4 = 132/4 = 33rd item
From the cumulative frequencies, Q1 = 20, Q3 = 40
Quartile Deviation = (Q3-Q1)/2 = (40-20)/2 = 20/2 = 10
Coefficient of Quartile Deviation = (Q3-Q1)/(Q3+Q1) = (40-20)/(40+20) = 20/60 = 1/3 Ans
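Locating an item in the cumulative frequency column is a search for the first cumulative frequency that reaches the item position. A compact sketch using the standard-library `bisect` and `itertools` modules (one possible approach among several):

```python
from bisect import bisect_left
from itertools import accumulate

marks = [10, 20, 30, 40, 50, 60]
students = [4, 7, 15, 8, 7, 2]

cum = list(accumulate(students))   # cumulative frequencies
n = cum[-1]                        # total number of students

# bisect_left returns the index of the first cumulative frequency
# that is greater than or equal to the item position
q1 = marks[bisect_left(cum, (n + 1) / 4)]
q3 = marks[bisect_left(cum, 3 * (n + 1) / 4)]

qd = (q3 - q1) / 2
coefficient = (q3 - q1) / (q3 + q1)
print(cum)                             # [4, 11, 26, 34, 41, 43]
print(q1, q3, qd, round(coefficient, 2))  # 20 40 10.0 0.33
```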
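Q Quartile deviation (Continuous Series)
C.I       Frequency (f)    Cumulative frequency (C.f)
0-10      3                3
10-20     5                8
20-30     7                15
30-40     10               25
40-50     12               37
50-60     15               52
60-70     12               64
70-80     6                70
80-90     2                72
90-100    8                80
Quartile Deviation = (Q3-Q1)/2
Q1 position = N/4 = 80/4 = 20th item, Q3 position = 3N/4 = (3x80)/4 = 60th item
Q = l + ((position - c.f)/f) x h
l = Lower Boundary of Quartile Group, h = Width of Quartile Group,
c.f = cumulative frequency of the class before the Quartile Group, f = frequency of the Quartile Group
Q1 lies in 30-40: Q1 = 30 + ((20-15)/10) x 10 = 35
Q3 lies in 60-70: Q3 = 60 + ((60-52)/12) x 10 = 66.67
Quartile Deviation = (66.67-35)/2 = 15.83 Ans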
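For a continuous series the quartile is interpolated inside its class. The sketch below assumes the usual interpolation formula Q = l + ((position - c.f)/f) x h, where c.f is the cumulative frequency before the quartile class and f is the frequency of that class. The function name `grouped_quantile` is illustrative.

```python
def grouped_quantile(classes, position):
    """Interpolate the value at a given item position in grouped data.

    classes is a list of (lower, upper, frequency) tuples in ascending order.
    """
    cumulative = 0  # cumulative frequency before the current class
    for lower, upper, freq in classes:
        if cumulative + freq >= position:
            width = upper - lower
            return lower + (position - cumulative) / freq * width
        cumulative += freq
    raise ValueError("position exceeds total frequency")

classes = [(0, 10, 3), (10, 20, 5), (20, 30, 7), (30, 40, 10), (40, 50, 12),
           (50, 60, 15), (60, 70, 12), (70, 80, 6), (80, 90, 2), (90, 100, 8)]
n = sum(f for _, _, f in classes)   # 80

q1 = grouped_quantile(classes, n / 4)       # 20th item, class 30-40
q3 = grouped_quantile(classes, 3 * n / 4)   # 60th item, class 60-70
qd = (q3 - q1) / 2
print(q1, round(q3, 2), round(qd, 2))  # 35.0 66.67 15.83
```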
Median = L + ((N/2 - f)/fm) x i
N/2 = 52/2 = 26, fm = 12, L = 15-0.5 = 14.5, f = 9, i = 5
Median = 14.5 + ((26-9)/12) x 5 = 14.5 + 7.08 = 21.58 Ans
Mode = L + (fm1/(fm1+fm2)) x i
fm1 = Frequency of the class interval lying just above the class interval containing the highest frequency
fm2 = Frequency of the class interval lying just below the class interval containing the highest frequency
L = 2.5, fm1 = 9, fm2 = 7, i = 5
Mode = 2.5 + (9/(9+7)) x 5 = 2.5 + 2.81 = 5.31
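The two formulas can be written as small functions and evaluated with the values printed above. The frequency table for this problem does not appear in these notes, so the inputs are taken as given. The function and parameter names are illustrative, and the mode formula is the one used in the notes rather than the only one in use.

```python
def grouped_median(lower, n, cf_before, f_median, width):
    """Median = L + ((N/2 - f) / fm) x i for grouped data."""
    return lower + (n / 2 - cf_before) / f_median * width

def grouped_mode(lower, fm1, fm2, width):
    """Mode = L + (fm1 / (fm1 + fm2)) x i, as given in the notes."""
    return lower + fm1 / (fm1 + fm2) * width

median = grouped_median(lower=14.5, n=52, cf_before=9, f_median=12, width=5)
mode = grouped_mode(lower=2.5, fm1=9, fm2=7, width=5)
print(round(median, 2))  # 21.58
print(mode)              # 5.3125
```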
Q Chi-Square
O1    E1    (O1-E1)    (O1-E1)2    (O1-E1)2/E1
12    17    -5         25          25/17 = 1.47
14    17    -3         9           9/17 = 0.53
16    17    -1         1           1/17 = 0.06
17    17    0          0           0/17 = 0
19    17    2          4           4/17 = 0.24
24    17    7          49          49/17 = 2.88
X2 = chi square
O1 = observed value
E1 = expected value
X2 = ∑ (O1-E1)2/E1 = 1.47+0.53+0.06+0+0.24+2.88 = 5.18
df = (R-1) (C-1) = (6-1) (2-1) = 5x1 = 5
Table value of X2 for df = 5 at the 5% level of significance = 11.070
X2 (5.18) < table value (11.070)
The result is not significant, so the null hypothesis is not rejected (it is accepted).
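The statistic and the decision rule can be verified with a short sketch. Summing the unrounded terms gives about 5.18. The critical value 11.070 is the table value for 5 degrees of freedom, which corresponds to the 5% level of significance.

```python
observed = [12, 14, 16, 17, 19, 24]
expected = [17] * len(observed)   # equal expected value in every category

# Chi-square statistic: sum of (O - E)^2 / E over all categories
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

df = len(observed) - 1            # 6 categories give 5 degrees of freedom
critical_value = 11.070           # table value for df = 5 at the 5% level

# The null hypothesis is rejected only if the statistic exceeds the table value
reject_null = chi_square > critical_value
print(round(chi_square, 2), df, reject_null)  # 5.18 5 False
```

Because the calculated value is smaller than the table value, the difference between observed and expected values is not significant and the null hypothesis is retained.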
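Q Mean Deviation
Mean Deviation = ∑f|D| / ∑f
where |D| is the absolute deviation of each value from the mean (or median) and f is its frequency.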
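A minimal sketch of the mean deviation formula, taking deviations from the mean. The values and frequencies below are hypothetical, since no data is given for this question.

```python
def mean_deviation(values, freqs):
    """Mean deviation about the mean: sum of f|D| divided by sum of f."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / n
    return sum(f * abs(x - mean) for x, f in zip(values, freqs)) / n

# Hypothetical discrete series
x = [1, 2, 3, 4, 5]
f = [1, 2, 4, 2, 1]

md = mean_deviation(x, f)
print(md)  # 0.8
```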