STATISTICS Grand Viva
1. What do you mean by the term “Statistics”? What are the uses, importance, scope and
limitations of Statistics?
Answer: Statistics may be defined as the collection, presentation, analysis and interpretation of
numerical data.
Statistics is a set of decision-making techniques which helps businessmen in making suitable
policies from the available data. In fact, every businessman needs a sound background of
statistics as well as of mathematics.
The purpose of statistics and mathematics is to manipulate, summarize and investigate data so
that useful decisions can be made.
Statistics Meaning
The term ‘statistics’ has been derived from the Latin word ‘status’, the Italian word ‘statista’ or
the German word ‘statistik’.
All these words mean ‘Political state’. In ancient days, states were required to collect
statistical data mainly on the number of young men, so that they could be recruited into the Army,
and to calculate the total amount of land revenue that could be collected. For this reason,
statistics is also called ‘Political Arithmetic’.
Statistics Definition
Statistics has been defined in different ways by different authors.
Statistics are numerical statements of facts in any department of enquiry placed in relation to
each other. (Bowley)
By statistics, we mean quantitative data affected to a marked extent by multiplicity of
causes. (Yule and Kendall)
By statistics, we mean aggregate of facts affected to a marked extent by multiplicity of causes,
numerically expressed, enumerated or estimated according to reasonable standards of accuracy,
collected in a systematic manner for a predetermined purpose and placed in relation to each
other. (Horace Secrist)
Statistics may be defined as the collection, presentation, analysis and interpretation of numerical
data. (Croxton and Cowden)
With the help of statistical methods, quantitative information about production, sale,
purchase, finance, etc. can be obtained. This type of information helps businessmen in
formulating suitable policies.
By using the techniques of time series analysis which are based on statistical methods,
the businessman can predict the effect of a large number of variables with a fair
degree of accuracy.
In business decision theory, statistical techniques are widely used in taking
business decisions, which helps in running the business with less uncertainty.
By using ‘Bayesian Decision Theory’, businessmen can select the optimal
decision by directly evaluating the payoff for each alternative course of action.
Uses of Mathematics for Decision Making
The number of defects in a roll of paper, bale of cloth, sheet of a photographic film
can be judged by means of Control Chart based on Normal distribution.
In statistical quality control, we analyse the data which are based on the principles
involved in Normal curve.
Uses of Statistics in Economics
Statistics is the basis of economics. The consumer’s maximum satisfaction can be determined on
the basis of data pertaining to income and expenditure. The various laws of demand depend on
the data concerning price and quantity. The price of a commodity is well determined on the basis
of data relating to its buyers, sellers, etc.
Functions of Statistics
Statistics can be well-defined as a branch of research which is concerned with the development
and application of techniques for collecting, organising, presenting, analysing and interpreting
data in such a manner that the reliability of conclusions may be evaluated in terms of probability
statements.
Statistical methods and processes are useful for business development and, hence, applied to
enormous numerical facts with an objective that “behind every figure, there’s a story”.
Comparison
Statistics facilitates comparing different quantities. For example, the price-to-earnings ratio of
ITC as of January 22, 2021 was 19.54, whereas HUL quoted a price-to-earnings ratio of 71.
On this measure, HUL is overvalued relative to ITC.
Forecast
Statistics helps forecast by looking at trends of a variable. It is essential for planning and
decision-making. Predictions or forecasts based on intuition can be disastrous for any business.
For example, to decide the production capacity for a vehicle-manufacturing plant, we need to
predict the demand for the product mix, supply of components, cost of manpower, competitor
strategy, etc., over the next 5 to 10 years, before committing an investment.
Testing of hypotheses
Hypotheses are statements about population parameters based on knowledge from literature that
a researcher would like to test for validity in the light of new information. Drawing inferences
about the population using sample estimates involves an element of risk.
Preciseness
Statistics visualises and presents facts precisely in a quantitative form. Facts and information
conveyed in quantitative terms are more convincing than qualitative data. For example, ‘increase
in profit margin is less in the year 2020 than in the year 2019’ does not convey a precise and
complete piece of information.
On the other hand, statistics summarise the information more precisely. For example, ‘profit
margin is 5% of the turnover in the year 2020 against 7% in the year 2019’.
Expectation
Statistics can act as the basic building block for framing clear plans and policies. For example,
how much raw material to be imported in a year, how much capacity to be expanded, or
manpower to be recruited, etc., depends on the expected value of outcome of our decisions taken
under different situations.
Importance of Statistics
Statistics in today’s life has become an essential part of various business activities which is clear
from the following points.
For example, by testing hypotheses, we can accept or reject the null hypothesis, which
is based upon an assumption made about the population or universe.
By using ‘Bayesian Decision Theory’ or ‘Decision Theory’, we can select the optimal decision
by directly evaluating the payoff for each alternative course of action.
Mathematics and statistics have become ingredients of various decision problems, as is clear
from the following:
In Calculating E.O.L., C.O.L., etc.: In business, opportunity loss arises very often.
It can be defined as the difference between the highest possible profit for an event
and the actual profit obtained for the action actually taken. The expected opportunity
loss (E.O.L.) and conditional opportunity loss (C.O.L.) can be easily calculated by
using the concept of maximum and minimum criteria of pay-off.
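The definition of opportunity loss above can be sketched in a few lines of Python. The payoff table, the event probabilities and all names below are hypothetical, chosen only for illustration. The sketch assumes that the conditional opportunity loss is the best payoff for an event minus the payoff of the chosen action, and that the E.O.L. weights each conditional loss by the probability of its event.

```python
# Hypothetical payoff table (profit); every figure here is illustrative only.
# Outer keys are actions (units stocked), inner keys are events (units demanded).
payoff = {
    "stock_10": {"demand_10": 50, "demand_20": 50, "demand_30": 50},
    "stock_20": {"demand_10": 30, "demand_20": 100, "demand_30": 100},
    "stock_30": {"demand_10": 10, "demand_20": 80, "demand_30": 150},
}
# Assumed probabilities of the events (they must sum to 1).
prob = {"demand_10": 0.3, "demand_20": 0.5, "demand_30": 0.2}

events = list(prob)

# Highest possible profit for each event, across all actions.
best = {e: max(payoff[a][e] for a in payoff) for e in events}

# Conditional opportunity loss: best profit for the event minus actual profit.
col = {a: {e: best[e] - payoff[a][e] for e in events} for a in payoff}

# Expected opportunity loss: probability-weighted sum of the conditional losses.
eol = {a: sum(prob[e] * col[a][e] for e in events) for a in payoff}

# The action with the smallest E.O.L. is preferred under this criterion.
best_action = min(eol, key=eol.get)

for action in payoff:
    print(action, col[action], round(eol[action], 2))
print("Preferred action:", best_action)
```

With these assumed figures, the action with the smallest expected opportunity loss is the one that would be chosen.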
Scope of Statistics
The following are the main scope of statistics:
For example, using statistical techniques a firm can know the tastes and preferences of the
consumers and decide to make its product accordingly.
Helps in forecasting
The success of planning by the Government or of a business depends to a large extent upon the
accuracy of their forecasts. Statistics provides a scientific basis for making such forecasts.
Limitations of Statistics
Statistics is considered to be a science as well as an art, which is used as an instrument of
research in almost every sphere of our activities.
The characteristics like honesty, goodwill, duty, character, beauty, intelligence, efficiency,
integrity etc. are not capable of quantitative measurement and hence cannot be directly dealt
with statistical methods. These characteristics are qualitative in nature.
For such characteristics, only comparison is possible. The use of statistical methods is
limited to quantitative characteristics and those qualitative characteristics which are capable of
being expressed numerically.
Statistical Results are not Exact
The task of statistical analysis is performed under certain conditions. It is not always possible,
rather not advisable, to consider the entire population during statistical investigations.
The use of samples is called for in statistical investigations. And the results obtained by using
samples may not be universally true for the entire population. Data collected for a statistical
enquiry may not be hundred percent true. Statistical results are true on an average.
It cannot help in taking remedial steps to improve the result of that class. Statistics should be
taken as a means and not as an end. The methods of statistics are used to study the various
aspects of the data.
2. Explain the difference between:
(a) Central tendencies and measures of dispersion
What is central tendency?
Central tendency refers to and locates the center of the distribution of values. Mean,
mode, and median are the most commonly used indices in describing the central
tendency of a data set. If a data set is symmetric, then both the median and the mean
of the data set coincide with each other.
Given a data set, the mean is calculated by taking the sum of all the data values and
then dividing it by the number of data. For example, the weights of 10 people (in
kilograms) are measured to be 70, 62, 65, 72, 80, 70, 63, 72, 77 and 79. Then the
mean weight of the ten people (in kilograms) can be calculated as follows. Sum of
the weights is 70 + 62 + 65 + 72 + 80 + 70 + 63 + 72 + 77 + 79 = 710. Mean =
(sum) / (number of data) = 710 / 10 = 71 (in kilograms). It is understood that
outliers (data points that deviate from the normal trend) tend to affect the mean.
Thus, in the presence of outliers mean alone will not give a correct picture about the
center of the data set.
The median is the data point found at the exact middle of the data set. One way to
compute the median is to order the data points in ascending order, and then locate
the data point in the middle. For example, once ordered, the previous data set
reads 62, 63, 65, 70, 70, 72, 72, 77, 79, 80. With ten values, the middle falls between
the fifth and sixth values, so the median is (70+72)/2 = 71. From this, it is seen that
the median need not be in the data set. The median is far less affected by outliers
than the mean. Hence, the median will serve as a better measure of central tendency
in the presence of outliers.
The mode is the most frequently occurring value in the set of data. In the previous
example, the values 70 and 72 both occur twice and thus, both are modes. This
shows that, in some distributions, there is more than one modal value. If there is
only one mode, the data set is said to be unimodal. In this case, the data set is
bimodal.
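The three measures worked out above can be checked with Python's standard statistics module. This is a minimal sketch using the ten weights from the example. The variable names are illustrative, and statistics.multimode assumes Python 3.8 or later.

```python
import statistics

# Weights of the ten people (in kilograms), taken from the example above.
weights = [70, 62, 65, 72, 80, 70, 63, 72, 77, 79]

mean_w = statistics.mean(weights)        # sum of values / number of values
median_w = statistics.median(weights)    # average of the two middle values here
modes_w = statistics.multimode(weights)  # every value with the highest frequency

print("mean:", mean_w)
print("median:", median_w)
print("modes:", modes_w)
```

multimode is used instead of mode because the data set is bimodal, and mode would report only one of the two values.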
What is dispersion?
Dispersion is the amount of spread of data about the center of the distribution.
Range and standard deviation are the most commonly used measures of dispersion.
The range is simply the highest value minus the lowest value. In the previous
example, the highest value is 80 and the lowest value is 62, so the range is 80-62 =
18. But range does not provide a sufficient picture about the dispersion.
To calculate the standard deviation, first the deviations of the data values from the
mean are calculated. The root mean square of the deviations is called the standard
deviation. In the previous example, the respective deviations from the mean are
(70 – 71) = -1, (62 – 71) = -9, (65 – 71) = -6, (72 – 71) = 1, (80 – 71) = 9,
(70 – 71) = -1, (63 – 71) = -8, (72 – 71) = 1, (77 – 71) = 6 and (79 – 71) = 8.
The sum of squares of the deviations is
(-1)² + (-9)² + (-6)² + 1² + 9² + (-1)² + (-8)² + 1² + 6² + 8² = 366. The
standard deviation is √(366/10) = 6.05 (in kilograms). Unless the data set is greatly
skewed, from this it can be concluded that the majority of the data is in the interval
71±6.05, and it is indeed so in this particular example (six of the ten values lie in it).
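The range and standard deviation above can be reproduced step by step as follows. This is a sketch that follows the text in dividing by the number of values n (the population form). A sample standard deviation would divide by n - 1 instead, which is what statistics.stdev does.

```python
import math

# Weights of the ten people (in kilograms), taken from the example above.
weights = [70, 62, 65, 72, 80, 70, 63, 72, 77, 79]

n = len(weights)
mean_w = sum(weights) / n

# Range: highest value minus lowest value.
data_range = max(weights) - min(weights)

# Deviations from the mean, then the sum of their squares.
deviations = [w - mean_w for w in weights]
sum_sq = sum(d ** 2 for d in deviations)

# Root mean square of the deviations (divisor n, as in the text).
pop_sd = math.sqrt(sum_sq / n)

# How many values fall inside mean ± one standard deviation.
within = [w for w in weights if mean_w - pop_sd <= w <= mean_w + pop_sd]

print("range:", data_range)
print("sum of squares:", sum_sq)
print("standard deviation:", round(pop_sd, 2))
print("values within one s.d. of the mean:", len(within))
```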
Thus, index numbers occupy an important place due to their efficacy in measuring the extent of
economic changes across a stipulated period. They help to study the effects of changes due to
factors that cannot be directly measured.
How would You identify an Index Number? – Features and Characteristics of Index
Numbers
The main features of index numbers are as follows:
It is a special category of average for measuring relative changes in instances where
absolute measurement cannot be undertaken
An index number only shows the tentative changes in factors that may not be directly
measured. It gives a general idea of the relative changes
The method of measuring an index number varies from one variable to another related variable
It helps in comparing the level of a phenomenon at a specific date with
that at a previous date
It is a special case of an average, especially a weighted average
Index numbers have universal utility. The index that is used to ascertain the changes in
price can also be used for industrial and agricultural production.
Value Index
A value index number is formed from the ratio of the aggregate value for a particular period with
that of the aggregate value that is found in the base period. The value index is utilized for
inventories, sales, and foreign trade, among others.
Quantity Index
A quantity index number is used to measure changes in the volume or quantity of goods that are
produced, consumed, and sold within a stipulated period. It shows the relative change across a
period for particular quantities of goods. Index of Industrial Production (IIP) is an example of
Quantity Index.
Price Index
A price index number is used to measure how price alters across a period. It will indicate the
relative value and not the absolute value. The Consumer Price Index (CPI) and Wholesale Price
Index (WPI) are major examples of a price index.
Index numbers are useful in many studies, from basic to complicated. For example, they are used
in the basic study of the human population of a country and also to determine the extinction rate
of rare animals in a particular region. There are many more uses of index numbers:
It helps in measuring changes in the standard of living as well as the price level.
Wage rate regulation is consistent with the changes in the price level. With the
determination of price levels, wage rates may be revised.
Government policies are framed following the index number of prices. This price
stability inherent to fiscal and economic policies is based on index numbers.
It gives a pointer for international comparison concerning different economic variables—
for instance, living standards between two countries.
3. Using the formula calculate mean, mode, median, mean deviation and standard deviation
from the following data:
X: 40 45 42 28 48 20 36 40
Y: 50 47 40 38 45 28 38 48
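One way to check the hand calculation for the two series is sketched below. The question does not say which form of each measure is wanted, so the sketch assumes the mean deviation is taken about the mean and the standard deviation uses the divisor n. The function name summarize is illustrative.

```python
import statistics

def summarize(values):
    """Return the five measures asked for, for one series of raw values."""
    n = len(values)
    mean = sum(values) / n
    # Mean deviation about the mean: average absolute deviation (an assumption;
    # the question could also intend the deviation about the median).
    mean_dev = sum(abs(v - mean) for v in values) / n
    return {
        "mean": mean,
        "median": statistics.median(values),
        "mode": statistics.multimode(values),     # list of all modal values
        "mean_deviation": mean_dev,
        "std_dev": statistics.pstdev(values),     # divisor n (population form)
    }

X = [40, 45, 42, 28, 48, 20, 36, 40]
Y = [50, 47, 40, 38, 45, 28, 38, 48]

summary_x = summarize(X)
summary_y = summarize(Y)

for name, summary in (("X", summary_x), ("Y", summary_y)):
    print(name, summary)
```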
5. “Statistics is most dangerous in the hands of the inexpert.” Discuss and explain this
limitation of statistics.
Despite its immense use, Statistics has many limitations. These are as follows:
Statistics deals only with quantitative data and not the qualitative and descriptive facts
like efficiency, intelligence, honesty, blindness, etc.
Statistics does not deal with individuals but with groups. This is one of the biggest limitations of
statistics. To give you an example, the income of an individual or profit of a particular business unit
is not statistics since those figures are unrelated and incomparable.
On the other hand, the aggregate of figures relating to prices and consumption of various
commodities and over varying time periods are statistics.
Statistical laws are not exact. In fact, the results are true only on averages. Also, they are valid only
under a certain set of assumptions. Therefore, the science of statistics is less exact than natural
sciences like physics, chemistry, etc.
Misuse
Statistics deal with figures which are innocent in themselves and can be easily manipulated or
distorted by people for their selfish motives. Therefore, it is a dangerous tool in the hands of a non-
expert.
It is hence important that the user of the statistical methods has sound knowledge of the subject
along with the self-control of an artist.
According to Prof. W.I. King, “Statistics are like clay from which you can make a God or a devil as
you please.” He also said that “The science of statistics is the most useful servant but only of great
value to those who understand its proper use.”
Distrust of Statistics
By definition, distrust means a lack of confidence or belief. Further, the science of statistics is
always subject to doubt and suspicion because of its misuse by unscrupulous elements for their
selfish motives. The common beliefs about statistics are:
There are three types of lies – lies, damn lies, and statistics
Numbers, though accurate, are open to manipulation by selfish people to conceal the
truth and present a distorted picture of the facts.
Therefore, it is important to understand that statistics is a tool, which if misused can cause a
disaster. Statistics neither proves nor disproves anything. Hence, you must take utmost care and
precaution while interpreting statistical data in all manifestations.
2. Proper sequence of questions: Questions must be placed in the proper sequence:
simple and direct questions at the start of the questionnaire, and hard and
indirect questions at the end.
3. Simplicity: The language of the questions should be simple and easy to understand,
and the questions should be short. Complex questions must be avoided.
4. Instructions: A good questionnaire must have clear and proper instructions for filling
out the forms.
6. Non-controversial question: Questions should be asked in such a way that they can
be answered impartially.
4. No Personal Questions:
No personal questions should be asked of respondents. Such questions should be
avoided.
6. Avoidance of Calculations:
Questions should not require calculations. Only those questions should be asked which
the respondents can answer immediately. Moreover, questions should not rely on the respondent's memory.
9. Pre-testing:
Before sending the questionnaire to the respondents, it must be properly tested.
10. Instructions:
Precise and simple instructions for filling in the questionnaire should be added in a footnote.
Q7. What do you mean by Primary data? What are the methods of collecting primary data?
Also specify their pros and cons.
Answer: What is Primary Data ?
The data collected by the researcher himself for finding the solution of a particular problem
or situation, is known as primary data. This type of data is characterized by its originality as
it is freshly collected. Various organisations conduct surveys, observations, interviews, etc.
and as a result generate primary data. Although secondary data provides a basic
understanding of the research problem, it sometimes becomes necessary to collect
primary data, as the previously generated secondary data may not serve the purpose. Just as
with secondary data, researchers should take additional care while collecting primary data
so that it is accurate, reliable, and unbiased. For collecting primary data, researchers need
to take many decisions regarding proper selection of relevant sources, sampling techniques,
research tools, etc.
4) Volume of Data :
The primary data is at raw stage and researchers need to do thorough study step-by-step and
summarize it to use it efficiently. Researchers find the values of data statistically in a
presentable format or in simple statements so that the outcome is easily understood by the
general public.
Primary data are collected in the course of experiments in experimental
research. In descriptive research and surveys, whether
sample surveys or census surveys, primary data can be obtained either through observation or
through direct communication with respondents in one form or another, or through personal
interviews. In other words, there are several methods of collecting primary
data, particularly in surveys and descriptive research. The important ones are:
Methods of collecting primary data
Major tools and techniques for collecting primary data are as follows :
1) Interview :
An interview is an exchange of ideas which takes place between two or more people with the
purpose of getting information from the respondent. In this method, the interviewer organises
a meeting with the respondent regarding an object or issue related to the research objective,
and asks some questions. The responses of the interviewee are recorded and compiled to get
a better insight into the research problem. Interviews can be conducted through various
methods such as personal interview, telephonic interview, mail interview, panel interview, etc.
2) Questionnaire :
In order to collect the relevant information from the respondents by asking questions, it is
necessary to design a questionnaire comprising of questions related to the research problem.
Questionnaire is used to explore the unidentified facts and figures about a particular
objective or issue. The responses of the individuals about the research problem are kept
confidential. Questionnaires are the standardized and structured forms that are usually filled
by the respondents. Questionnaires can be administered personally as well as through mail.
When the questionnaire is filled in by the researcher himself by putting the questions to the
respondents, it is called a "schedule". With the help of questionnaires, researchers can gather
genuine responses from the respondents, which enhance the effectiveness of data analysis.
3) Schedules :
Just like the questionnaire, a schedule is also a collection of questions. These questions are
separated through different sub headings, as per the research problem. Questions are placed
in a specific sequence, following the pattern of relevant topic. The researcher or the field
worker describes the questions to the individuals and records the responses. The major
difference between a questionnaire and a schedule is that schedules are filled in by the field worker
or the enumerator specifically appointed for this purpose, whereas in a questionnaire,
respondents fill in the form. The enumerator explains the purpose of the research and data
collection to the respondents and collects their responses. By explaining the objective to the
participants, enumerators help them understand the research topic easily.
4) Observation :
Another technique for gathering primary data is observation. When the researcher records
information about a person, organisation, or situation, without making any personal contact,
it is known as "observation method". In this, the researcher or the field executive observes
the activity of the concerned person or organisation, to draw a pattern of behavior or
response to a particular incident. Sometimes, an artificial environment is created to collect
the actual responses of the participants.
5) Experimentation :
An important method to collect primary data is experimentation. In experimentation, the
causal relationship between variables is determined and analysed. Experimentation is
carried out with the objective of studying the effect on a dependent variable of a change in
the independent variable. For example, research can be conducted to analyse the influence
of guidelines and instructions on learning in schools.
6) Other Methods :
Other methods for collection of data are described below :
i) Warranty Cards :
Warranty cards are generally used by dealers of consumer durables to get feedback on
products from their consumers. These are postcard-sized cards placed within the package of the
product. These cards contain various questions regarding the performance of the product and
the needs of consumers. Customers are requested to fill them in and mail them back. This helps
the manufacturer in new product development.
ii) Auditing :
Auditing is a technique for assessing the performance and current position of any department
or the organisation. Sometimes, it is also used for understanding the market and buying
behavior of customers. Distributors or manufacturers use this tool for gaining the
competitive advantage and satisfying the needs of customers. It is also used by researchers
for inspecting the products, services or food purchased by consumers, which is also known as a
pantry audit.
iv) Simulation :
Simulation is a quantitative technique for data collection. It is the creation of an artificial
environment resembling a real life situation. This real life situation is simulated by using
various mathematical equations and variables. Researchers can determine the relation
between different variables by altering one of the variables and finding its effect on the
others.
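The idea of altering one variable and observing its effect on another can be illustrated with a very small simulation. Everything in this sketch is hypothetical: the linear demand equation, the noise level and the function name simulate_mean_demand are assumptions made only to show the technique, not a model taken from the text.

```python
import random

def simulate_mean_demand(price, trials=1000, seed=0):
    """Average simulated demand at a given price over many random trials."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = 0.0
    for _ in range(trials):
        # Assumed equation: demand falls by 2 units per unit of price,
        # plus random noise standing in for unmeasured influences.
        demand = 100 - 2 * price + rng.gauss(0, 5)
        total += max(demand, 0.0)  # demand cannot be negative
    return total / trials

# Alter one variable (price) and compare its effect on the other (demand).
low_price_demand = simulate_mean_demand(price=10)
high_price_demand = simulate_mean_demand(price=20)

print("mean demand at price 10:", round(low_price_demand, 1))
print("mean demand at price 20:", round(high_price_demand, 1))
```

Using the same seed for both runs means the two prices face the same random noise, so the difference in the averages reflects the change in price alone.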
1) Reliability :
As the primary data is collected originally by the researcher and it is current and accurate, it
is more reliable than secondary data.
2) Variety of Techniques :
Primary data can be collected through various techniques. There are numerous tools and
techniques available to record and analyse primary data such as interviews, questionnaires,
observation, audits, etc. It allows the researchers to explore effectively in almost every area
where research is possible.
1) Costly Affair :
Primary data collection is an expensive task. It involves different activities, like selecting
the type of technique, preparing questions, and hiring trained professionals for collecting
information or observing targets, etc. In this process, a huge amount is spent, which makes it
costly to conduct.
2) Time Consuming :
Collecting primary data effectively takes more time. Developing research plan, deciding
sources of information, and selecting the methods of data collection are time consuming
activities.
3) Sometimes Infeasible :
Although primary data is considered to be a reliable source of information, it is sometimes
not an easy task to collect, as the sources of information may be out of the
researcher's reach or collection may cost a huge amount of money.
5) Unwillingness to Answer :
Sometimes participants do not cooperate in data collection by showing unwillingness to
answer or by giving wrong information. These factors act as hurdles in primary data
collection and also introduce bias into the responses.
8. What do you mean by Secondary data? What are the methods of collecting primary
data? What are the advantages of collecting secondary data as compared to primary data?
Answer: Secondary data is data that has already been collected for another purpose
but has some relevance to your research needs. In addition, the data is collected by
someone else instead of the researcher himself.
Secondary data is second-hand information. It is not used for the first time. That is why it
is called secondary.
Secondary data sources provide valuable interpretations and analysis based on primary
sources. They may explain primary sources in detail and often use them to support a
specific thesis or a point of view.
Ease of Access
Secondary data sources are very easy to access. The internet has changed how
secondary research is done. Nowadays, a great deal of information is available with just a
few clicks.
Low Cost or Free
The majority of secondary sources are absolutely free for use or at very low costs. It
saves not only your money but your efforts. In comparison with primary research where
you have to design and conduct a whole primary study process from the beginning,
secondary research allows you to gather data without having to put any money on the
table.
Time-saving
As the above advantage suggests, you can perform secondary research quickly.
Sometimes it is a matter of a few Google searches to find a credible source of
information.
Generating new insights and understandings from previous analysis
Reanalyzing old data can bring unexpected new understandings and points of view or
even new relevant conclusions.
Larger sample size
Big datasets often use a larger sample than those that can be gathered by primary data
collection. Larger samples generally give more precise estimates, which makes the final
inference more reliable.
Longitudinal analysis
Secondary data allows you to perform a longitudinal analysis which means the studies
are performed spanning over a large period of time. This can help you to determine
different trends. In addition, you can find secondary data from many years back up to a
couple of hours ago. It allows you to compare data over time.
Anyone can collect the data
Secondary data research can be performed by people that aren’t familiar with the
different types of quantitative and qualitative research methods. Practically, anyone can
collect secondary data.
Disadvantages:
Not specific to your needs
Here is the main difference from the primary method. Secondary data is not specific to
the researcher’s need because it was collected in the past for another reason.
That is why secondary data might be unreliable or of little use in many business
and marketing cases. Secondary data sources can give you a huge amount of information,
but quantity does not mean appropriateness.
Lack of control over data quality
You have no control over the data quality at all. In comparison with primary methods,
which are largely controlled by the data-driven marketer, secondary data might lack quality.
It means the quality of secondary data should be examined in detail since the source of
the information may be questionable. As you are relying on secondary data for your
decision-making process, you must evaluate the reliability of the information by finding
out how the information was collected and analyzed.
Bias
As the secondary data is collected by someone other than you, the data may be biased
in favor of the person who gathered it. It might also not cover your requirements as a
researcher or marketer.
Not timely
Secondary data was collected in the past, which means it might be out of date. This issue
can be crucial in many different situations.
Not proprietary Information
Generally, secondary data is not collected specifically for your company. Instead, it is
available to many companies and people either for free or for a small fee. So this is not
exactly an “information advantage” for you, as your competitors also have access to the
data.
Line Graphs – Line graph or the linear graph is used to display the continuous data and
it is useful for predicting future events over time.
Bar Graphs – Bar Graph is used to display the category of data and it compares the data
using solid bars to represent the quantities.
Histograms – The graph that uses bars to represent the frequency of numerical data that
are organised into intervals. Since all the intervals are equal and continuous, all the bars
have the same width.
Line Plot – It shows the frequency of data on a given number line. An ‘x’ is placed above the
number line each time that data value occurs.
Frequency Table – The table shows the number of pieces of data that fall within each
given interval.
Circle Graph – Also known as the pie chart, it shows the relationships of the parts to
the whole. The full circle represents 100%, and each category occupies a sector
matching its percentage, like 15%, 56%, etc.
Stem and Leaf Plot – In the stem and leaf plot, the data are organised from least value to
the greatest value. The digits of the least place value form the leaves and the next place
value digits form the stems.
Box and Whisker Plot – The plot summarises the data by dividing it into four
parts. Box and whisker plots show the range (spread) and the middle (median) of the data.
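As a small illustration of the frequency table and histogram entries above, the sketch below groups raw values into equal intervals and prints one mark per observation. The data are the ten weights from the central tendency example, reused here for illustration only, and the interval width of 5 is an arbitrary assumption.

```python
from collections import Counter

# Sample values (kilograms) reused from the weights example; illustrative only.
weights = [70, 62, 65, 72, 80, 70, 63, 72, 77, 79]
width = 5  # assumed class-interval width

# Map each value to the lower limit of its interval, then count per interval.
counts = Counter((w // width) * width for w in weights)

# Frequency table rows: (lower limit, upper limit exclusive, frequency).
freq_table = [(lo, lo + width, counts[lo]) for lo in sorted(counts)]

# Text histogram: equal-width intervals, one 'x' per observation.
for lo, hi, freq in freq_table:
    print(f"{lo}-{hi - 1}: {'x' * freq}")
```

Because all intervals have the same width, the length of each row of marks is directly comparable, which is the property the histogram relies on.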
General Rules for Graphical Representation of Data
There are certain rules to effectively present the information in the graphical representation.
They are:
Suitable Title: Make sure that the appropriate title is given to the graph which indicates
the subject of the presentation.
Measurement Unit: Mention the measurement unit in the graph.
Proper Scale: To represent the data in an accurate manner, choose a proper scale.
Index: Index the appropriate colours, shades, lines, design in the graphs for better
understanding.
Data Sources: Include the source of information wherever it is necessary at the bottom
of the graph.
Keep it Simple: Construct a graph in an easy way that everyone can understand.
Neat: Choose the correct size, fonts, colours etc. so that the graph serves as a
visual aid for the presentation of information.
Histogram
Smoothed frequency graph
Pie diagram
Cumulative or ogive frequency graph
Frequency Polygon
10. Calculate :
i) Laspeyres’ Index number
ii) Paasche’s Index number
iii) Fisher’s Index Number
iv) Dorbish-Bowley’s Index Number
Answer: https://www.shaalaa.com/question-bank-solutions/calculate-laspeyre-s-paasche-s-dorbish-bowley-s-and-marshall-edgeworth-s-price-index-numbers-construction-of-index-numbers-weighted-aggregate-method_156587
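The four index numbers can be computed with the standard weighted aggregate formulas, sketched below. The data table for this question is not reproduced in these notes, so the prices and quantities used here are hypothetical and serve only to show the method. Here p0 and q0 are base-year price and quantity, and p1 and q1 are current-year price and quantity.

```python
import math

# Hypothetical data: (p0, q0, p1, q1) for each commodity. Illustrative only.
data = {
    "A": (2, 10, 4, 12),
    "B": (5, 6, 6, 5),
    "C": (4, 5, 5, 4),
}

# The four aggregates that the weighted formulas are built from.
sum_p0q0 = sum(p0 * q0 for p0, q0, p1, q1 in data.values())
sum_p1q0 = sum(p1 * q0 for p0, q0, p1, q1 in data.values())
sum_p0q1 = sum(p0 * q1 for p0, q0, p1, q1 in data.values())
sum_p1q1 = sum(p1 * q1 for p0, q0, p1, q1 in data.values())

# Laspeyres: base-year quantities as weights.
laspeyres = sum_p1q0 / sum_p0q0 * 100
# Paasche: current-year quantities as weights.
paasche = sum_p1q1 / sum_p0q1 * 100
# Fisher: geometric mean of Laspeyres and Paasche.
fisher = math.sqrt(laspeyres * paasche)
# Dorbish-Bowley: arithmetic mean of Laspeyres and Paasche.
dorbish_bowley = (laspeyres + paasche) / 2

print("Laspeyres:", round(laspeyres, 2))
print("Paasche:", round(paasche, 2))
print("Fisher:", round(fisher, 2))
print("Dorbish-Bowley:", round(dorbish_bowley, 2))
```

To answer the question itself, replace the hypothetical rows with the prices and quantities from the question's own table.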