RM Full Note
Definitions of research
The term research is derived from the French word recherche, which broadly means to go about seeking or to search again. In common parlance it refers to a "search for knowledge".
According to Albert Szent-Györgyi, "Research is to see what everybody else has seen and to think what nobody else has thought."
OR
Definition of Research
• A search for the truth.
• A movement from the known to the unknown.
• An organised and systematic way of finding answers to questions.
• A systematic process of identifying a question
or problem, setting forth a plan of action to
answer the question or resolve the problem
and rigorously collecting, analysing and
interpreting data for the purpose.
Definition of Manager
A manager is a person who is responsible for a part of a company, i.e., they ‘manage‘ the
company. Managers may be in charge of a department and the people who work in it. In some
cases, the manager is in charge of the whole business. For example, a ‘restaurant manager’ is
in charge of the whole restaurant.
OBJECTIVES OF RESEARCH
• It extends knowledge of human beings, social life and environment.
• It brings to light hidden information that might never be discovered fully during the ordinary
course of life.
• Research establishes generalisations and general laws and contributes to theory building in
various fields of knowledge.
• It verifies and tests existing facts and theory, and this helps improve our knowledge and ability to handle situations and events.
• It aims to analyse inter-relationships between variables and to derive causal explanations
and thus enables us to have a better understanding of the world in which we live.
Need For Business Research
Business research is one of the most effective ways to understand customers, the market
and competitors. Such research helps companies to understand the demand and supply of
the market. Using such research will help businesses reduce costs, and create solutions or
products that are targeted to the demand in the market and the correct audience.
• Estimating expenses
• Determination of price
• Assisting managers in the decision making
process
• Evaluating market trends
• Achieving competitive advantage
Eg: The different ways in which women professionals adapt to manage work-family conflict.
Conclusive research:
Descriptive Research: It is a fact-finding investigation with adequate interpretation. Definite conclusions can be arrived at, but it does not establish a cause-and-effect relationship. It tries to describe the characteristics of the respondents in relation to a particular product.
Eg: Trends in the consumption of soft drinks with respect to socioeconomic characteristics
such as age, family, income, education level.
Causal Research: This is conducted to determine the cause-and-effect relationship between two variables.
Basic research, also called pure research or fundamental research, is a type of scientific
research with the aim of improving scientific theories for better understanding and prediction
of natural or other phenomena. It is not directly concerned with practical problems, and there is no intention to apply it in practice.
Applied Research
It aims at finding solution to a real life problem requiring an action or policy decision. It is
often referred to as a scientific method of inquiry or contractual research because it involves
the practical application of scientific methods to everyday problems.
Eg: To measure the purchase intentions for Nanos as a function of the demographic variables
of income, family size, and distance travelled, one would need to use quantitative methods.
Qualitative research can be used to explore, describe, or understand the reasons for a certain
phenomenon.
Eg: To understand what a low cost car means to an Indian consumer, this kind of investigation
would be required.
EXPLORATORY RESEARCH
Exploratory research is defined as research used to investigate a problem which is not clearly defined. It is conducted to gain a better understanding of the existing problem, but it will not provide conclusive results. For such research, a researcher starts with a general idea and uses the research as a medium to identify issues that can become the focus of future research. An important aspect here is that the researcher should be willing to change direction subject to the revelation of new data or insights. Such research is usually carried out when the problem is at a preliminary stage. It is often referred to as the grounded theory approach or interpretive research, as it is used to answer questions like what, why and how.
For example: Consider a scenario where a juice bar owner feels that increasing the variety of juices will bring in more customers; however, he is not sure and needs more information. The owner therefore decides to carry out exploratory research to find out whether expanding the juice selection will bring him more customers, or whether there is a better idea.
While it may sound difficult to research something about which very little information exists, there are several methods which can help a researcher figure out the best research design, data collection methods and choice of subjects. Research can be conducted in two ways, namely primary and secondary. Under these two types, there are multiple methods which can be used by a researcher. The data gathered from such research can be qualitative or quantitative. Some of the most widely used research designs include the following:
1. PRIMARY RESEARCH METHODS
Primary research is information gathered directly from the subject. It can be gathered from a group of people or even an individual. Such research can be carried out by the researcher himself, or a third party can be employed to conduct it on his behalf. Primary research is specifically carried out to explore a certain problem which requires an in-depth study.
For example: A survey is sent to a given set of respondents to understand their opinions about the size of mobile phones when they purchase one. Based on such information, the organization can dig deeper into the topic and make business-related decisions.
• Interviews: While you may get a lot of information from public sources, sometimes an in-person interview can give in-depth information on the subject being studied. This is a qualitative research method. An interview with a subject-matter expert can give you meaningful insights that a generalized public source cannot provide. Interviews are carried out in person or over the telephone using open-ended questions to get meaningful information about the topic.
For example: An interview with an employee can give you more insights into the degree of job satisfaction, or an interview with a subject-matter expert on quantum theory can give you in-depth information on that topic.
• Focus groups: The focus group is yet another widely used method in exploratory research. In this method a group of people is chosen and allowed to express their insights on the topic being studied. However, it is important to make sure that the individuals chosen for a focus group have a common background and comparable experiences.
For example: A focus group helps a researcher identify the opinions of consumers if they were to buy a phone. Such research can help the researcher understand what consumers value while buying a phone: it may be screen size, brand value or even the dimensions. Based on this, the organization can understand consumer buying attitudes, consumer opinions, etc.
2. SECONDARY RESEARCH METHODS
Secondary research is gathering information from previously published primary research. In such research you gather information from sources like case studies, magazines, newspapers, books, etc.
• Online research: In today's world, this is one of the fastest ways to gather information on any topic. A lot of data is readily available on the internet and the researcher can download it whenever needed. An important aspect to be noted in such research is the genuineness and authenticity of the source websites from which the researcher is gathering the information.
For example: A researcher needs to find out what percentage of people prefer a specific phone brand. The researcher just enters the required terms into a search engine and gets multiple links with related information and statistics.
• Literature research: Literature research is one of the most inexpensive methods used for discovering a hypothesis. There is a tremendous amount of information available in libraries, online sources, or even commercial databases. Sources can include newspapers, magazines, library books, documents from government agencies, topic-specific articles, literature, annual reports, published statistics from research organizations and so on.
• Case study research: Case study research can help a researcher find more information by carefully analysing existing cases which have gone through a similar problem. Such analysis is very important and critical, especially in today's business world. The researcher just needs to make sure he analyses the case carefully with regard to all the variables present in the previous case against his own case. It is very commonly used by business organizations, the social sciences or even the health sector.
For example: A particular orthopaedic surgeon has the highest success rate for performing knee surgeries. Many other hospitals and doctors have taken up this case to understand and benchmark the method this surgeon uses for the procedure, in order to increase their own success rates.
Steps in conducting exploratory research:
• Identify the problem: A researcher identifies the subject of research and the problem, which is then addressed by carrying out multiple methods to answer the questions.
• Create the hypothesis: When the researcher has found out that there are no prior studies
and the problem is not precisely resolved, the researcher will create a hypothesis based on
the questions obtained while identifying the problem.
• Further research: Once the data has been obtained, the researcher will continue his study
through descriptive investigation. Qualitative methods are used to further study the subject
in detail and find out if the information is true or not.
Advantages of exploratory research:
• The researcher has a lot of flexibility and can adapt to changes as the research progresses.
• It is usually low cost.
• It helps lay the foundation of a research project, which can lead to further research.
• It enables the researcher to understand at an early stage whether the topic is worth investing time and resources in and whether it is worth pursuing.
• It can assist other researchers to find out possible causes for the problem, which can be further studied in detail to find out which of them is the most likely cause of the problem.
Disadvantages of exploratory research:
• Even though it can point you in the right direction towards the answer, it is usually inconclusive.
• The main disadvantage of exploratory research is that it provides qualitative data. Interpretation of such information can be judgmental and biased.
• Most of the time, exploratory research involves a smaller sample, hence the results cannot be accurately interpreted for a generalized population.
• Many a time, if the data is collected through secondary research, there is a chance of that data being old and not updated.
CONCLUSIVE RESEARCH
Conclusive research design, as the name implies, is applied to generate findings that are practically useful in reaching conclusions or making decisions. In this type of study, research objectives and data requirements need to be clearly defined. Findings of conclusive studies usually have specific uses.
Conclusive research design provides a way to verify and quantify findings of exploratory studies.
Conclusive research design usually involves the application of quantitative methods of data collection
and data analysis. Moreover, conclusive studies tend to be deductive in nature and research
objectives in these types of studies are achieved via testing hypotheses.
The main difference in outcome between conclusive and exploratory research design is that the findings of conclusive research are used as input to decision making, whereas exploratory research is generally followed by further exploratory or conclusive research.
It has to be noted that “conclusive research is more likely to use statistical tests, advanced
analytical techniques, and larger sample sizes, compared with exploratory studies. Conclusive
research is more likely to use quantitative, rather than qualitative techniques”. Conclusive
research is helpful in providing a reliable or representative picture of the population through the application of a valid research instrument.
1. Descriptive Research
Descriptive research can be explained as a statement of affairs as they exist at present, with the researcher having no control over the variables. Moreover, "descriptive studies may be characterized as simply the attempt to determine, describe or identify what is, while analytical research attempts to establish why it is that way or how it came to be".
Descriptive research is “aimed at casting light on current issues or problems through a process
of data collection that enables them to describe the situation more completely than was
possible without employing this method.”
In its essence, descriptive studies are used to describe various aspects of the phenomenon.
In its popular format, descriptive research is used to describe characteristics and/or behaviour
of sample population.
An important characteristic of descriptive research relates to the fact that while descriptive
research can employ a number of variables, only one variable is required to conduct a
descriptive study. Three main purposes of descriptive studies can be explained as describing,
explaining and validating research findings.
Descriptive studies are closely associated with observational studies, but they are not limited to the observational data collection method. Case studies and surveys can also be specified as popular data collection methods used with descriptive studies.
2. Causal Research
Causal research, also known as explanatory research, is conducted in order to identify the
extent and nature of cause-and-effect relationships. Causal research can be conducted in
order to assess impacts of specific changes on existing norms, various processes etc.
Causal studies focus on an analysis of a situation or a specific problem to explain the patterns
of relationships between variables. Experiments are the most popular primary data collection
methods in studies with causal research design.
The presence of cause-and-effect relationships can be confirmed only if specific causal evidence exists. Causal evidence has three important components:
1. Temporal sequence. The cause must occur before the effect. For example, it would not be
appropriate to credit the increase in sales to rebranding efforts if the increase had started
before the rebranding.
2. Concomitant variation. The variation must be systematic between the two variables. For
example, if a company doesn’t change its employee training and development practices, then
changes in customer satisfaction cannot be caused by employee training and development.
3. Nonspurious association. Any covariation between a cause and an effect must be true and not simply due to another variable. In other words, there should be no 'third' factor that relates to both the cause and the effect.
Basic Research
Basic research is a type of research approach that is aimed at gaining a better understanding
of a subject, phenomenon or basic law of nature. This type of research is primarily focused
on the advancement of knowledge rather than solving a specific problem. Basic research is
also referred to as pure research or fundamental research.
Typically, basic research can be exploratory, descriptive or explanatory; although in many
cases, it is explanatory in nature. The primary aim of this research approach is to gather
information in order to improve one’s understanding, and this information can then be useful
in proffering solutions to a problem.
Applied Research
Applied research is a type of research design that seeks to solve a specific problem or provide
innovative solutions to issues affecting an individual, group or society. It is often referred to
as a scientific method of inquiry or contractual research because it involves the practical
application of scientific methods to everyday problems.
When conducting applied research, the researcher takes extra care to identify a problem,
develop a research hypothesis and goes ahead to test these hypotheses via an experiment. In
many cases, this research approach employs empirical methods in order to solve practical
problems.
Types of Applied Research
There are 3 types of applied research. These are evaluation research, research and
development, and action research.
Evaluation Research
Evaluation research is a type of applied research that analyses existing information about a
research subject to arrive at objective research outcomes or reach informed decisions. This
type of applied research is mostly applied in business contexts, for example, an organisation
may adopt evaluation research to determine how to cut down overhead costs.
Research and Development
Research and development is a type of applied research that is focused on developing new
products and services based on the needs of target markets. It focuses on gathering
information about marketing needs and finding ways to improve on an existing product or
create new products that satisfy the identified needs.
Action Research
Action research is a type of applied research that is set on providing practical solutions to
specific business problems by pointing the business in the right directions. Typically, action
research is a process of reflective inquiry that is limited to specific contexts and situational in
nature.
1. Marketing
• Product identification
• Demand estimation
• Demand-supply analysis
• Product development
• Market segmentation
• Sales promotion programme
• Product launching
• Buyer behaviour
• Product Research
• Pricing Research
• Promotional research
• Place Research
• Packaging
2. Human Resources Management
• Manpower planning
• Performance appraisal systems
• Conflict management
• Study of organisational climate
• Design of incentive plans
• Leadership styles
• Training and Development
• Change management
• Negotiation and wage settlement
• Labour welfare study
3. Finance
• Economic evaluation of alternatives
• Study of financial parameters of an organisation
• Capital budgeting
• Ratio analysis
• Working capital
• Portfolio management
• Balance of payment
• Inflation
• Deflation
• Behavioural finance
4. Production & Operations Management
• Operation planning
• Demand forecasting and decision analysis
• Process planning
• Project management and maintenance management studies
• Logistics and supply chain and inventory management analysis.
• Quality estimation and assurance studies.
Cross-functional Research
Cross-functional research requires an open orientation where experts from across disciplines contribute to and gain from the study.
For example, an area such as new product development requires commitment of the
marketing, production, and consumer insights team to exploit new opportunities.
Managerial Effectiveness
The term ‘managerial effectiveness’ could mean achievement of organizational goals,
increase in productivity, profit, workers’ satisfaction, growth, diversification etc. Managerial
effectiveness aims at optimum allocation and utilization of scarce organizational resources in
order to achieve the goals at minimum cost. It aims at deriving maximum output out of
minimum input.
MODULE 2
Research Methodology
Research: The word research is composed of two syllables, "Re" and "Search". "Re" is a prefix meaning 'again, over again, or anew' and "Search" means 'to examine closely and carefully' or 'to test and try'.
2. Systematic and scientific search for relevant answers on any specific topic taken up.
According to Bulmer, research comprises:
• Collecting
• Organizing
• Evaluating data
• Making decisions
• Suggesting solutions
• Reaching conclusions
not using a particular method or technique so that research results are capable
of being evaluated either by the researcher or others.
Steps:
3) In what way and why has the hypothesis (basic idea) been formulated?
4) Why is a particular technique of analysing data used? (or) How were the data collected?
4. Valid & Verifiable: The findings should be valid and capable of being verified by you or others at any time.
Types of Research
3) Exploratory
4) Explanatory
1) Pure Research: (Basic or Fundamental Research)
1) Descriptive Research:
2) Correlative Research :
Analytical Research:
The researcher has to use facts / information already existing and analyze
these data to make a critical evaluation.
Explanatory Research:
1) Quantitative Research:
• In this type of research, the objectives, design, sample and all the other factors influencing the research are predetermined.
Quantitative Research:
Empirical research
Qualities of a Researcher
10) Systematic: Check, check and check again. Spending a proper amount
of time for checking always pays.
1) Economics:
2) Business Decisions:
All the above three are responsible for business decision making.
Social scientists gain knowledge for its own sake and for the development of society.
5. Formulate objectives
7. Double check
The best way to understand the problem is to discuss it with one's own colleagues or guide.
Conceptual literature :
Empirical Literature : Concerning studies made earlier which are similar to the
one proposed.
Step II
Stage III:
A hypothesis should be very specific and limited to the piece of research in hand, because it has to be tested.
1) Discuss with colleagues / experts about the problem, its origin, its objectives and solutions.
STEP IV:
4) Operational Research Design: How the above three are carried out.
So we select a few items from the entire population for our study purpose. The items so selected constitute what is technically called a "sample".
Non Probability sampling: All the items do not have an equal chance of being
selected for the study.
Data can be collected in several ways, either through (1) experiments or (2) surveys.
1) By observations
4) By mailing of questionnaires
The researcher should select one of these methods of collecting the data, taking into account the
1) Nature of investigation
3) Financial Resources
4) Time frame
8. Analysis of Data:
After the data are collected, the researcher turns to the task of analysing them. The analysis of data requires closely related operations such as editing, coding and tabulation.
Coding: The collected data are transformed into symbols that may be tabulated or counted.
Tabulation: A technical procedure in which the data are put in the form of tables.
Research Design:
Research Design
1) Sampling design: Deals with the method of selecting items for the study.
3) Statistical design: Deals with the number of items selected for the study and how the selected data will be analysed.
1) Helps to identify the type and source of information needed for the study.
3) Specifies the time schedule of the research and the monetary budget involved.
Continuous variable: Values that can be expressed even in decimal points are known as continuous variables.
Non-continuous variables: Values that can be expressed only in integer values are called non-continuous variables.
Endogenous variables:
When the change in one variable depends on the change in another variable, it is known as a dependent or endogenous variable.
The variable that causes the change in the dependent variable is known
as independent or exogenous variable.
Extraneous variable :
Control:
Confounded Relationship
Research Hypothesis:
8. Treatments:
The different conditions to which the experimental and control groups are subjected are known as treatments.
Absolute Experiment:
Comparative Experiment:
Eg : animal testing
Adopt procedures that not only reduce bias but also enhance reliability and facilitate deriving inferences (results) about the research problem.
Helps yield maximum information with minimum effort, time and money.
Attain reliability
Time schedule
HYPOTHESIS
dependent variables. (e.g.) The female students perform as well as the male students.
Characteristics of Hypothesis
1) A hypothesis should be precise and clear. If not clear, the inferences will
not be reliable.
Null Hypothesis: Denoted by H0. If both the variables (say male or female, or head or tail) are equally good, it is the null hypothesis.
Null Hypothesis
H0 : μ = 100
Alternative Hypothesis
Ha : μ ≠ 100 (two-tailed)
Ha : μ > 100 (one-tailed)
Ha : μ < 100 (one-tailed)
2) Rejecting the null hypothesis when it is actually true involves great risk, so the level of significance should be considered carefully.
3. Decision Rule
(i) The researcher may reject H0 when it is true – Type I error (H0 should have been accepted).
(ii) The researcher may accept H0 when it is false – Type II error (H0 should have been rejected).
(i) One-tailed test: Rejects the null hypothesis only when the sample mean is significantly greater than (or, if the direction is reversed, significantly lower than) the hypothesized value of the population mean; the rejection region lies in one tail only.
(ii) Two-tailed test: Rejects the null hypothesis when the sample mean is significantly different, either greater or lower, than the hypothesized value of the population mean.
Example: Testing whether the population mean differs from 100 (Ha: μ ≠ 100) is a two-tailed test; testing whether it exceeds 100 (Ha: μ > 100) is a one-tailed test.
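As a small illustration of these decision rules, the sketch below runs a one-sample t-test in Python with SciPy against the hypothesized mean of 100 used above; the sample values and the choice of SciPy are assumptions for demonstration only.

```python
# Hedged sketch: one-sample t-test against a hypothesized mean of 100.
# The sample values below are invented purely for illustration.
from scipy import stats

sample = [102, 98, 105, 110, 97, 103, 108, 99, 101, 106]
mu0 = 100  # hypothesized population mean (H0: mu = 100)

# Two-tailed test: Ha is mu != 100
t_two, p_two = stats.ttest_1samp(sample, popmean=mu0)

# One-tailed test (Ha: mu > 100); the `alternative` argument needs SciPy >= 1.6
t_one, p_one = stats.ttest_1samp(sample, popmean=mu0, alternative="greater")

alpha = 0.05  # level of significance
print(f"two-tailed: t = {t_two:.3f}, p = {p_two:.3f}, reject H0: {p_two < alpha}")
print(f"one-tailed: t = {t_one:.3f}, p = {p_one:.3f}, reject H0: {p_one < alpha}")
```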
Nominal
Ordinal
Interval
Ratio
•The four scale types are ordered in that all later scales have all the properties
of earlier scales— plus additional properties
Nominal Scale
• Not really a ‘scale’ because it does not scale objects along any dimension
Male = 1
Female = 2
Religious Affiliation
Catholic= 1
Protestant= 2
Jewish = 3
Muslim = 4
Other = 5
Categorical data are measured on nominal scales which merely assign labels to
distinguish categories
None = 0
Mild = 1
Moderate = 2
Severe = 3
Ordinal Scale
Interval Scale
Fahrenheit Scale
• A 10-degree difference has the same meaning anywhere along the scale
Ratio Scale
It isn’t so straight-forward??
• A constant is a number that does not change its value (is constant) in a given
situation
•Independent variables:
•Dependent variables:
Statistical Notation
10 12 25 7 40
To refer to a single score, without specifying which one, we will use Xi, where i
can take on any value from 1 to 5, or 1 to N.
Summation Notation
One of the most common symbols in statistics is the uppercase Greek letter
sigma (Σ)
ΣXi = 10 + 12 + 25 + 7 + 40 = 94
ΣXi = ΣX
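A tiny sketch of this summation notation in Python, using the same five scores:

```python
# Minimal sketch: summation notation applied to the five scores listed above.
X = [10, 12, 25, 7, 40]

total = sum(X)      # corresponds to ΣXi = ΣX
print(total)        # 94

# Xi refers to a single score; e.g. i = 3 gives the third score
i = 3
print(X[i - 1])     # 25 (Python lists are 0-indexed, so X[i - 1] is the i-th score)
```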
Probability sampling methods
1. Simple random sampling
In a simple random sample, every member of the population has an equal chance of being selected. Your sampling frame should include the whole population.
To conduct this type of sampling, you can use tools like random number
generators or other techniques that are based entirely on chance.
Example
You want to select a simple random sample of 100 employees of Company X.
You assign a number to every employee in the company database from 1 to 1000,
and use a random number generator to select 100 numbers.
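A minimal sketch of this example in Python, assuming the employee database is simply a list of IDs from 1 to 1000:

```python
# Hedged sketch: simple random sample of 100 employees out of 1000.
import random

population = list(range(1, 1001))   # assumed employee IDs 1..1000
random.seed(42)                     # fixed seed only so the example is reproducible

sample = random.sample(population, k=100)   # every member has an equal chance
print(len(sample), sorted(sample)[:10])
```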
2. Systematic sampling
Example
All employees of the company are listed in alphabetical order. From the first 10
numbers, you randomly select a starting point: number 6. From number 6
onwards, every 10th person on the list is selected (6, 16, 26, 36, and so on), and
you end up with a sample of 100 people.
If you use this technique, it is important to make sure that there is no hidden
pattern in the list that might skew the sample. For example, if the HR database
groups employees by team, and team members are listed in order of seniority,
there is a risk that your interval might skip over people in junior roles, resulting
in a sample that is skewed towards senior employees.
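Continuing the same hypothetical 1000-employee list, a sketch of systematic selection with a random starting point:

```python
# Hedged sketch: systematic sample of every 10th employee from a random start.
import random

employees = [f"employee_{i}" for i in range(1, 1001)]   # assumed alphabetical list

k = 10                            # sampling interval (population 1000 / sample 100)
random.seed(7)
start = random.randint(1, k)      # random starting point within the first 10 names
sample = employees[start - 1::k]  # pick every 10th person from that point onwards

print(start, len(sample))         # e.g. start 6 -> positions 6, 16, 26, ...
```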
3. Stratified sampling
Stratified sampling involves dividing the population into subpopulations that may differ in important ways. It allows you to draw more precise conclusions by ensuring that every subgroup is properly represented in the sample.
To use this sampling method, you divide the population into subgroups (called
strata) based on the relevant characteristic (e.g., gender, age range, income
bracket, job role).
Based on the overall proportions of the population, you calculate how many
people should be sampled from each subgroup. Then you use random or
systematic sampling to select a sample from each subgroup.
Example
The company has 800 female employees and 200 male employees. You want to
ensure that the sample reflects the gender balance of the company, so you sort the
population into two strata based on gender. Then you use random sampling on
each group, selecting 80 women and 20 men, which gives you a representative
sample of 100 people.
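A sketch of the proportional allocation described above (80 women and 20 men out of a sample of 100); the employee records are invented:

```python
# Hedged sketch: proportional stratified sampling by gender (800 women, 200 men).
import random

random.seed(1)
employees = ([("F", f"f_{i}") for i in range(800)] +
             [("M", f"m_{i}") for i in range(200)])

sample_size = 100
strata = {"F": [e for e in employees if e[0] == "F"],
          "M": [e for e in employees if e[0] == "M"]}

sample = []
for gender, members in strata.items():
    n = round(sample_size * len(members) / len(employees))   # proportional allocation
    sample.extend(random.sample(members, n))                  # random sampling within the stratum

print(len(sample))   # 100 in total: 80 women and 20 men
```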
4. Cluster sampling
Cluster sampling also involves dividing the population into subgroups, but each
subgroup should have similar characteristics to the whole sample. Instead of
sampling individuals from each subgroup, you randomly select entire subgroups.
If it is practically possible, you might include every individual from each sampled
cluster. If the clusters themselves are large, you can also sample individuals from
within each cluster using one of the techniques above. This is called multistage
sampling.
This method is good for dealing with large and dispersed populations, but there
is more risk of error in the sample, as there could be substantial differences
between clusters. It’s difficult to guarantee that the sampled clusters are really
representative of the whole population.
Example
The company has offices in 10 cities across the country (all with roughly the same
number of employees in similar roles). You don’t have the capacity to travel to
every office to collect your data, so you use random sampling to select 3 offices
– these are your clusters.
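A short sketch of the office example: 3 of 10 city offices are chosen as clusters, and then individuals can optionally be sampled within them (multistage sampling). The office names and sizes are invented.

```python
# Hedged sketch: cluster sampling of offices, then multistage sampling within clusters.
import random

random.seed(3)
offices = {f"city_{c}": [f"city_{c}_emp_{i}" for i in range(100)] for c in range(1, 11)}

clusters = random.sample(list(offices), k=3)    # randomly select 3 offices as clusters

# Option 1: include every individual from each sampled cluster
full_cluster_sample = [emp for c in clusters for emp in offices[c]]

# Option 2 (multistage): sample 30 individuals within each selected cluster
multistage_sample = [emp for c in clusters for emp in random.sample(offices[c], 30)]

print(clusters, len(full_cluster_sample), len(multistage_sample))
```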
Non-probability sampling methods
In a non-probability sample, individuals are selected based on non-random
criteria, and not every individual has a chance of being included.
This type of sample is easier and cheaper to access, but it has a higher risk of
sampling bias. That means the inferences you can make about the population are
weaker than with probability samples, and your conclusions may be more
limited. If you use a non-probability sample, you should still aim to make it as
representative of the population as possible.
1. Convenience sampling
This is an easy and inexpensive way to gather initial data, but there is no way to
tell if the sample is representative of the population, so it can’t produce
generalizable results.
Example
You are researching opinions about student support services in your university,
so after each of your classes, you ask your fellow students to complete a survey
on the topic. This is a convenient way to gather data, but as you only surveyed
students taking the same classes as you at the same level, the sample is not
representative of all the students at your university.
2. Voluntary response sampling
Example
You send out the survey to all students at your university and a lot of students
decide to complete it. This can certainly give you some insight into the topic, but
the people who responded are more likely to be those who have strong opinions
about the student support services, so you can’t be sure that their opinions are
representative of all students.
3. Purposive sampling
It is often used in qualitative research, where the researcher wants to gain detailed
knowledge about a specific phenomenon rather than make statistical inferences,
or where the population is very small and specific. An effective purposive sample
must have clear criteria and rationale for inclusion.
Example
You want to know more about the opinions and experiences of disabled students
at your university, so you purposefully select a number of students with different
support needs in order to gather a varied range of data on their experiences with
student services.
4. Snowball sampling
Example
You are researching experiences of homelessness in your city. Since there is no
list of all homeless people in the city, probability sampling isn’t possible. You
meet one person who agrees to participate in the research, and she puts you in
contact with other homeless people that she knows in the area.
SAMPLE SIZE DETERMINATION
The size of a sample depends upon the basic characteristics of the population, the type of information required from the survey, and the cost involved.
The researcher must keep in mind the following points while determining the
sample size
• Researchers may arbitrarily decide the size of the sample without giving any explicit consideration to the accuracy of the sample results or the cost of sampling. This arbitrary approach should be avoided.
• For some of the projects, the total budget for the field survey in a project
proposal is allocated. If the cost of sampling per sample unit is known, one can
easily obtain the sample size by dividing the total budget allocation by the cost of
sampling per unit.
• This method concentrates only on the cost aspect of sampling rather than
the value of information obtained from such a sample.
• There are other researchers who decide on the sample size based on what
was done by other researchers in similar studies. This cannot be a substitute for
the formal scientific approach.
• The most commonly used approach for determining the sample size of the population is the confidence interval approach, as illustrated in the sketch after this list.
The following points are taken into account for estimating sample size in estimation problems involving means:
• In case the universe is divided into different strata, the required accuracy for determining the sample size of each stratum may be different.
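A minimal numerical sketch of the confidence interval approach for a mean, using the standard formula n = (z·σ/e)²; the population standard deviation and tolerable error below are assumed values chosen only for illustration.

```python
# Hedged sketch: sample size for estimating a mean via the confidence interval approach,
# n = (z * sigma / e) ** 2, rounded up. sigma and e are assumed for illustration.
import math
from scipy import stats

confidence = 0.95
z = stats.norm.ppf(1 - (1 - confidence) / 2)   # ~1.96 for 95% confidence
sigma = 25    # assumed population standard deviation
e = 5         # tolerable sampling error (desired precision), same units as sigma

n = math.ceil((z * sigma / e) ** 2)
print(n)      # about 97 respondents
```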
SAMPLING ERROR
A sampling error is a statistical error that occurs when an analyst does not select a sample that represents the entire population of data. As a result, the findings from the sample do not represent the results that would be obtained from the entire population.
Sampling is an analysis performed by selecting a number of observations
from a larger population. The method of selection can produce both sampling errors
and non-sampling errors.
Sampling Error
• The difference between the sample mean and the population mean is called
sampling error.
Non-sampling error
1. The respondents, when asked about a particular variable, may not give the correct answers.
2. Errors can arise while transferring the data from the questionnaire onto the computer.
• If the population of the study is not properly defined, it can lead to errors.
• The chosen respondent may not be available to answer the questions or may
refuse to be part of the study
DATA COLLECTION
Data is a collection of facts, figures, objects, symbols, and events gathered from
different sources. Organizations collect data to make better decisions. Without
data, it would be difficult for organizations to make appropriate decisions, and so
data is collected at various points in time from different audiences.
Primary data is collected from first-hand experience and has not been used in the past. The data gathered by primary data collection methods is specific to the research's motive and highly accurate.
Primary data collection methods can be divided into two categories: quantitative
methods and qualitative methods.
Quantitative Methods:
Quantitative techniques for market research and demand forecasting usually
make use of statistical tools. In these techniques, demand is forecast based on
historical data. These methods of primary data collection are generally used to
make long-term forecasts. Statistical methods are highly reliable as the element
of subjectivity is minimum in these methods.
Qualitative Methods:
Qualitative methods are especially useful in situations when historical data is
not available. Or there is no need of numbers or mathematical calculations.
Qualitative research is closely associated with words, sounds, feeling, emotions,
colors, and other elements that are non-quantifiable. These techniques are based
on experience, judgment, intuition, conjecture, emotion, etc.
Quantitative methods do not provide the motive behind participants’ responses,
often don’t reach underrepresented populations, and span long periods to collect
the data. Hence, it is best to combine quantitative methods with qualitative
methods.
Surveys
Surveys are used to collect data from the target audience and gather insights into their preferences, opinions, choices, and feedback related to products and services. Most survey software offers a wide range of question types to select from. You can also use a ready-made survey template to save time and effort.
Online surveys can be customized as per the business’s brand by changing the
theme, logo, etc. They can be distributed through several distribution channels
such as email, website, offline app, QR code, social media, etc. Depending on
the type and source of your audience, you can select the channel.
Once the data is collected, survey software can generate various reports and run
analytics algorithms to discover hidden insights. A survey dashboard can give
you the statistics related to response rate, completion rate, filters based on
demographics, export and sharing options, etc. You can maximize the effort
spent on online data collection by integrating survey builder with third-party
apps.
Interviews
In this method, the interviewer asks questions either face-to-face or through
telephone to the respondents. In face-to-face interviews, the interviewer asks a
series of questions to the interviewee in person and notes down responses. In
case it is not feasible to meet the person, the interviewer can go for a telephonic
interview. This form of data collection is suitable when there are only a few
respondents. It is too time-consuming and tedious to repeat the same process if
there are many participants.
Delphi Technique
In this method, market experts are provided with the estimates and assumptions
of forecasts made by other experts in the industry. Experts may reconsider and
revise their estimates and assumptions based on the information provided by
other experts. The consensus of all experts on demand forecasts constitutes the
final demand forecast.
Questionnaire
A questionnaire is a printed set of questions, either open-ended or closed-ended.
The respondents are required to answer based on their knowledge and
experience with the issue concerned. The questionnaire is a part of the survey,
whereas the questionnaire’s end-goal may or may not be a survey.
Secondary data is data that has already been collected and used in the past. The researcher can obtain data from sources both internal and external to the organization.
Internal sources of secondary data:
• Sales reports
• CRM software
• Executive summaries
External sources of secondary data:
• Government reports
• Press releases
• Business journals
• Libraries
• Internet
The secondary data collection methods, too, can involve both quantitative and
qualitative techniques. Secondary data is easily available and hence, less time
consuming and expensive as compared to the primary data. However, with the
secondary data collection methods, the authenticity of the data gathered cannot
be verified.
Data Processing
• Once the data has been collected, it has to be processed and reduced so that it
can be analysed by the researcher.
• This form of data often has errors and inconsistencies which are not relevant
for the study.
• This raw data has to be transformed into a relevant set by the researcher
through the process of editing, coding, and tabulation.
• This stage is very important for effective research work, as processing the data reduces errors and bias, resulting in relevant and specific data which is appropriate for analysis.
Data Editing
• Raw data is subjected to many errors and omissions which can occur during
the data collection process.
• In editing, these errors are corrected so that readers do not get confused or
misled. Editing converts the raw data into a presentable format, so that further
analysis and interpretations can be done efficiently.
Essentials of Editing
• Completeness
• Accuracy
• Consistency
Stages of Editing
• Field Editing
• While collecting the data, the researchers instantly check for accuracy,
uniformity, and completeness of the answers.
• By the researcher
• By the supervisor
Office/ Central Editing
• When all completed forms are brought at the central office, an individual or a
team of individuals performs the editing activities on these forms.
Coding
• Coding is the process of converting the data into meaningful categories and
then assigning symbols to each of these categories.
Example: While coding the responses, the researcher assigns a code '1' to all male respondents and '2' to all female respondents. It can be seen that this process reduces the entire set of responses into two mutually exclusive classes of 'Male' and 'Female'. Also, the logic that governs the assignment of a code is uniform, i.e., 'Gender'.
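A small sketch of this coding step using pandas; the column names and responses are invented for illustration.

```python
# Hedged sketch: coding 'Gender' responses into numeric symbols (1 = Male, 2 = Female).
import pandas as pd

responses = pd.DataFrame({"respondent": [101, 102, 103, 104],
                          "gender": ["Male", "Female", "Female", "Male"]})

code_book = {"Male": 1, "Female": 2}                       # the coding scheme
responses["gender_code"] = responses["gender"].map(code_book)

print(responses)
print(responses["gender_code"].value_counts())             # tabulation of the coded classes
```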
Classification/ Categorisation
For example, a researcher can classify a group of 100 respondents into smokers
(60) and non-smokers (40). By analysing this classification, it can be noticed
that all respondents in the respective categories are either smokers, or
nonsmokers, and hence are homogenous in the sense that they share the habit of
smoking or non-smoking.
• The data gathered from the questionnaires should be clear and legible so
that process of data coding can be carried out properly.
• The researcher may also face problems like incomplete responses and
unanswered questions while analysing the recorded data.
• If the response sheet has too many blanks, illegible entries or multiple responses for a single answer, the form is not worth correcting and editing, so it is better to completely discard the whole questionnaire.
• If too many forms are discarded, the sample for the study might become too small for analysis or generalization, so here it is advisable to carry out another round of field visits.
QUESTIONNAIRE
• One of the most difficult steps in the research process is designing a well-
structured instrument.
• Open-ended Questions
These are questions where respondents are free to answer in their own words.
Note: No options are given for this type of question, so the respondent is free to answer in his own words.
• Closed-ended Questions
These are questions where the respondents have options to choose from within the question.
Note: Options are given for this type of question, and the respondent is free to choose any one of them.
• Dichotomous Questions
These questions have only two answers: "Yes" or "No", "True" or "False".
Scales
Do you intend to buy a new car within the next six months?
TESTING QUESTIONNAIRE
Yes No
If Yes, then why:
If No, then why
Note: Here the researcher has made some amendments to the questionnaire by adding reasons, to learn why the respondent will or will not suggest the XYZ brand.
• Good layout and physical attractiveness are crucial, particularly in mail, Internet, and other self-administered questionnaires.
• Questionnaires should be presented in a professional, attractive and uncluttered
format.
• Questionnaires should be designed to appear as short as possible.
• Always allow enough room for respondents to answer questions and provide plenty
of white space between questions.
PROJECTIVE METHODS
OBSERVATION METHOD
Sensation is gained through the sense organs, which depend upon the physical alertness of the observer. The sense organs are receptive to stimuli and get attracted, leading to the first stage in observation.
Then comes attention or concentration which is largely a matter of
commitment and will-power. Adequate training and experience can make it
almost a matter of habit.
There are different types of observation method of data collection in research. The important ones are listed below:
Scientific observation, on the other hand, is carried out with due preparation and is done with the help of the right tools of measurement, experienced enumerators and able guidance. Scientific observations yield thorough and accurate data.
2. Simple and Systematic Observation
Simple Observation is found in almost all research studies, at least in the initial
stages of exploration. Its practice is not very standardized. It befits the heuristic
nature of exploratory research. Participant studies are also usually classified as
simple observation because participant roles do not permit systematic observation.
In every act of observation there are two components namely, the object (or what is
observed) and the subject (or the observer). It may be that sometimes one may have
to observe one’s own immediate experience. That is called Subjective Observation
or Self-Observation or introspection. Prejudices and biases are generally parts of
subjective observation. Many data of psychological interest are gathered by the
method of subjective observation. To avoid such prejudices, the observer takes stock of himself and discovers what prejudices and biases would prevent an impartial study and a disinterested point of view. Persistent self-observation and criticism by others may
ultimately overcome prejudice and biases. Such introspection may have another
social value i.e., it sensitizes the observer to the problems of others and creates
sympathetic insight which facilitates, at least to some degree, the understanding of
people’s behavior in similar circumstances and similar cultural contexts. The net
result is impartial subjective observation. When the observer is an entity apart from
the thing observed, the observation of this type is objective.
In factual observation, things or phenomena observed with the naked eye are reported.
In the case of direct observation, the observer is physically present and personally monitors what takes place. This approach is very flexible because it allows the observer to react to events and behavior as they occur. He is also free to shift places, change the focus of the observation, or concentrate on unexpected events should they occur.
GOODNESS-OF-FIT
The goodness-of-fit test is a statistical hypothesis test to see how well sample
data fit a distribution from a population with a normal distribution. Put
differently, this test shows if your sample data represents the data you would
expect to find in the actual population or if it is somehow skewed. Goodness-of-
fit establishes the discrepancy between the observed values and those that would
be expected of the model in a normal distribution case.
There are multiple methods for determining goodness-of-fit. Some of the most popular methods used in statistics include the chi-square test, the Kolmogorov-Smirnov test, the Anderson-Darling test, and the Shapiro-Wilk test.
• Goodness-of-fit tests are statistical tests aiming to determine whether a set of
observed values match those expected under the applicable model.
• There are multiple types of goodness-of-fit tests, but the most common is the
chi-square test.
• Chi-square determines if a relationship exists between categorical data.
• The Kolmogorov-Smirnov test—used for large samples—determines
whether a sample comes from a specific distribution of a population.
• Goodness-of-fit tests can show you whether your sample data fit an expected
set of data from a population with normal distribution.
Goodness-of-fit tests are commonly used to test for the normality of residuals or
to determine whether two samples are gathered from identical distributions.
Goodness-of-Fit tests help determine if observed data aligns with what is expected.
Decisions can be made based on the outcome of the hypothesis test conducted.
For example, a retailer wants to know what product offering appeals to young
people. The retailer surveys a random sample of old and young people to identify
which product is preferred. Using chi-square, they identify that, with 95%
confidence, a relationship exists between product A and young people. Based on
these results, it could be determined that this sample represents the population of
young adults. Retail marketers can use this to reform their campaigns.
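As a minimal sketch of a goodness-of-fit test, the snippet below uses SciPy's chi-square test to check whether observed purchase counts for four products match the expectation of equal preference; both the counts and the choice of SciPy are assumptions for illustration.

```python
# Hedged sketch: chi-square goodness-of-fit test.
# H0: customers prefer all four products equally; the observed counts are invented.
from scipy.stats import chisquare

observed = [50, 30, 60, 60]                 # observed purchases per product
expected = [sum(observed) / 4] * 4          # equal preference under H0

stat, p_value = chisquare(f_obs=observed, f_exp=expected)

alpha = 0.05
print(f"chi-square = {stat:.2f}, p = {p_value:.4f}")
print("reject H0: observed data do not fit the expected distribution" if p_value < alpha
      else "fail to reject H0: observed data fit the expected distribution")
```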
RELIABILITY AND VALIDITY IN RESEARCH
I.RELIABILITY
For example, a weighing scale is reliable if it gives the same reading when the same object is weighed several times.
1. Stability
2. Equivalence
1.STABILITY
• Stability aspect stands for securing consistent results with repeated measurements
by the same person with the same instrument.
• Equivalence aspect considers how much error may get introduced by different
investigators or different sample of items being studied.
A. Test-Retest method
B. Split-half method
C. Equivalence –form method
1.Test-Retest method
• If the measure is stable over time, the test administered under the same conditions each time should obtain similar results.
For example- The researcher measures job satisfaction and finds that 64 percent
of the population is satisfied with their jobs.
If the study is repeated a few weeks later under similar conditions and the researcher again finds that 64 percent of the population is satisfied with their jobs, it appears that the measure has repeatability. The high stability correlation or consistency between the two measures at time 1 and time 2 indicates a high degree of reliability.
2.Split-half method
• In the split-half method the researcher may take the results obtained from one half of the scale items (e.g. odd-numbered items) and check them against the results from the other half of the items (e.g. even-numbered items).
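A hedged sketch of the split-half idea: the item scores below are invented, the odd-half and even-half totals are correlated, and, as an assumption beyond the note above, the Spearman-Brown formula is applied to estimate full-scale reliability.

```python
# Hedged sketch: split-half reliability on an invented 6-item, 8-respondent score matrix.
import numpy as np

scores = np.array([[4, 5, 4, 4, 5, 4],
                   [2, 3, 2, 2, 3, 3],
                   [5, 5, 4, 5, 5, 5],
                   [3, 2, 3, 3, 2, 3],
                   [4, 4, 5, 4, 4, 4],
                   [1, 2, 1, 2, 1, 2],
                   [3, 3, 4, 3, 4, 3],
                   [5, 4, 5, 5, 4, 5]])

odd_half = scores[:, 0::2].sum(axis=1)     # items 1, 3, 5
even_half = scores[:, 1::2].sum(axis=1)    # items 2, 4, 6

r_half = np.corrcoef(odd_half, even_half)[0, 1]   # correlation between the two halves
r_full = 2 * r_half / (1 + r_half)                # Spearman-Brown step-up (added assumption)

print(f"split-half r = {r_half:.2f}, estimated full-scale reliability = {r_full:.2f}")
```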
3. Equivalent-form method
• If there is a high correlation between the two forms, the researcher concludes that the scale is reliable.
• There are certain social characteristics which are highly abstract in nature and
can be measured only indirectly.
Types of Validity
1. Content Validity
2. Concurrent Validity
3. Predictive Validity
1.Content Validity
2.Concurrent Validity
3.Predictive Validity
• If this coefficient is greater than 0.50, the predictive validity of the admission
test is established.
4. Sensitivity
• A more sensitive measure, with numerous items on the scale may be needed.
For example- adding strongly agree, mildly agree, neither agree nor disagree,
mildly disagree, and strongly disagree as categories increases a scale’s
sensitivity.
Conclusion
• Measuring instruments are evaluated in terms of reliability, validity and
sensitivity.
• Validity refers to the degree to which the instrument measures the concept, the
researcher wants to measure.
DATA ANALYSIS
Data analysis can be defined as the process of gathering, modelling, and transforming data so as to obtain useful information, suggestions, and conclusions that support decision making.
DESCRIPTIVE STATISTICS
• Descriptive statistics or descriptive analysis are used to present quantitative descriptions in a manageable form.
• Descriptive statistics help in simplifying large amounts of data in a sensible way.
• Each descriptive statistic reduces a lot of data into a simple summary.
• This sort of analysis may describe data on one variable, two variables, or more than two variables.
• Mean, median, mode, variance, range and standard deviation are the most widely applied descriptive statistics.
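A minimal sketch computing the descriptive statistics just listed on a small invented data set, using Python's standard library:

```python
# Hedged sketch: common descriptive statistics on invented data.
import statistics

data = [12, 15, 12, 18, 20, 22, 12, 17, 19, 21]

print("mean    :", statistics.mean(data))
print("median  :", statistics.median(data))
print("mode    :", statistics.mode(data))
print("variance:", statistics.variance(data))   # sample variance
print("std dev :", statistics.stdev(data))      # sample standard deviation
print("range   :", max(data) - min(data))
```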
UNIVARIATE ANALYSIS
Univariate analysis examines one variable at a time. If gender was measured, we would look at how many participants were men and how many were women.
BI-VARIATE ANALYSIS
Bivariate analysis is concerned with the relationships between pairs of variables in a data set.
MULTIVARIATE ANALYSIS
Multivariate analysis is the analysis of the simultaneous relationships among three or more phenomena.
Eg: relationship between age, weight and height.
Measures of Dispersion
• Distribution cannot be clearly depicted by measuring the averages or central tendency.
• Averages provide the observations of only the central part of the distribution. So the study
of scatteredness of observation is very important and this study is known as measure of
dispersion.
• The word ‘dispersion’ literally means ‘fluctuation’, scatter, variation, deviation or spread.
• So the measure of dispersion shows the variation of an individual item from its average.
Methods of measuring the dispersion
• Range
• Coefficient of Range
Range
• Range is the simplest absolute measure of dispersion
• It is the difference between the highest and lowest value in a series.
Range=Highest value – Lowest value
• Range therefore measures the maximum variation in the values of a series.
Coefficient of Range
• Coefficient of range is defined as:
Coefficient of Range = (Largest value – Smallest value) / (Largest value + Smallest value)
Quartile Deviation
• Quartile deviation partially solves the limitations of range.
• It performs the calculation over middle half of the values in a data set thereby it minimizes
the influence of extreme values.
• A quartile helps break up the observations into four intervals based upon the values of the data. It also examines how far the distribution of the observations is from the mean.
• Quartiles help to measure the spread of values above and below the mean by dividing the
data into four groups.
Mean Deviation
• Mean Deviation is defined as the arithmetic mean of deviations of all the values in a series
from their average, counting all such deviations as positive. The average selected may be
mean, median, or mode.
• It can be calculated from any one of the measures of central tendency such as mean,
median, and mode.
• It is also known as first moment of dispersion.
Standard Deviation
• The standard deviation is the square-root of the arithmetic average of the squares of the
deviations measured from the mean.
• Standard deviation is used to measure the spread of items in a set of observation.
• If all the observation values are identical, i.e. the distribution of items in the set is uniform, then the deviation of every value from the mean is zero. When the elements of the set are more dispersed, the standard deviation becomes larger.
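A sketch computing the dispersion measures described in this section (range, coefficient of range, quartile deviation, mean deviation and standard deviation) on an invented series:

```python
# Hedged sketch: measures of dispersion on an invented series.
import numpy as np

x = np.array([8, 12, 15, 15, 18, 20, 24, 30])

value_range = x.max() - x.min()                              # range
coeff_range = (x.max() - x.min()) / (x.max() + x.min())      # coefficient of range

q1, q3 = np.percentile(x, [25, 75])
quartile_dev = (q3 - q1) / 2                                 # quartile (semi-interquartile) deviation

mean_dev = np.mean(np.abs(x - x.mean()))                     # mean deviation about the mean
std_dev = x.std()                                            # population standard deviation

print(value_range, round(coeff_range, 3), quartile_dev, round(mean_dev, 2), round(std_dev, 2))
```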
Descriptive analysis of bivariate data
• Bivariate analysis is concerned with the relationship between pairs of variables in a data set.
• It is the simultaneous analysis of 2 variables.
• It is usually undertaken to see if one variable, such as gender is related to another variable,
perhaps attitudes toward male/female equality.
Bivariate Statistical Techniques
• Correlation analysis
• Regression Analysis
Correlation Analysis
• It is the study of the linear relationship between two variables.
• If there are 2 variables and changes in the value of one variable will affect the value of the
other variable, then both the variables are correlated.
Example- An increase in income will lead to improved job performance.
Regression Analysis
• It is used for prediction.
• It is used to find the variations in the dependent variable due to changes in the independent
variable.
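A short sketch of both techniques using SciPy on an invented income versus job-performance data set: the Pearson coefficient measures the linear relationship, and simple linear regression predicts performance from income.

```python
# Hedged sketch: correlation and simple linear regression on invented data.
from scipy import stats

income = [20, 25, 30, 35, 40, 45, 50, 55]          # independent variable (e.g. in thousands)
performance = [52, 55, 60, 58, 65, 68, 70, 74]     # dependent variable (performance score)

r, p = stats.pearsonr(income, performance)          # correlation analysis
print(f"Pearson r = {r:.2f}, p = {p:.4f}")

reg = stats.linregress(income, performance)         # regression analysis (prediction)
print(f"performance = {reg.intercept:.2f} + {reg.slope:.2f} * income")
print("predicted performance at income 60:", round(reg.intercept + reg.slope * 60, 1))
```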
ANCOVA
Analysis of covariance is used to test the main and interaction effects of categorical variables
on a continuous dependent variable, controlling for the effects of selected other continuous
variables, which co-vary with the dependent. The control variables are called the "covariates."
ANCOVA is used for several purposes:
* In experimental designs, to control for factors which cannot be randomized but which can
be measured on an interval scale.
* In observational designs, to remove the effects of variables which modify the relationship
of the categorical independents to the interval dependent.
* In regression models, to fit regressions where there are both categorical and interval independents. (This third purpose has largely been displaced by logistic regression and other methods.)
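A hedged sketch of a one-way ANCOVA using statsmodels (the library choice and the data are assumptions, not part of the note): a categorical training group is tested on a continuous outcome while controlling for a continuous pretest covariate.

```python
# Hedged sketch: ANCOVA as an OLS model with a categorical factor plus a continuous covariate.
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

df = pd.DataFrame({
    "group":    ["A", "A", "A", "B", "B", "B", "C", "C", "C"],   # categorical independent
    "pretest":  [50, 55, 60, 52, 58, 61, 49, 54, 63],            # covariate
    "posttest": [62, 66, 70, 68, 72, 76, 60, 64, 73],            # continuous dependent
})

model = smf.ols("posttest ~ C(group) + pretest", data=df).fit()
ancova_table = sm.stats.anova_lm(model, typ=2)   # tests the group effect controlling for pretest
print(ancova_table)
```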
SEM
Structural equation modeling (SEM) is a set of statistical techniques used to measure and
analyze the relationships of observed and latent variables. Similar but more powerful than
regression analyses, it examines linear causal relationships among variables, while
simultaneously accounting for measurement error.
Nonparametric Tests
Non-parametric tests are statistical tests without parameters. For these types of tests, you
need not characterize your population’s distribution based on specific parameters. They are
also referred to as distribution-free tests due to the fact that they are based on fewer
assumptions (e.g. normal distribution). These tests are particularly used for testing hypotheses whose data are usually non-normal and resist transformation of any kind. Due to
the lesser amount of assumptions needed, these tests are relatively easier to perform. They
are also more robust. An added advantage is the reduction in the effect of outliers and
variance heterogeneity on our results. This test can be used for ordinal and sometimes even
for nominal data. However, nonparametric tests do have their own disadvantages as well.
Firstly, the results that they provide may be less powerful compared to the results provided
by the parametric tests. To overcome this problem, it is preferred that a larger number of
samples be taken if one is adopting this approach. Secondly, their results are usually more
difficult to interpret than the results of parametric tests. This is because we usually assign
ranks to samples in the case of non-parametric tests rather than using the original data. This
further complicates the system and distorts our intuitive understanding of the data. Non-
parametric tests are useful and important in many cases, but they may not provide us with
the ideal results.
When to use Non-parametric testing?
- When the outcome is a rank or an ordinal variable – For example in the case of movie
ranking etc.
- When there are a number of extreme outliers – The samples may show a continuous
pattern with some very extreme outliers.
- When the outcome has a clear limit of detection – The outcome can only be measured up
to (or down to) a certain limit, so values beyond that limit are recorded imprecisely.
CHI-SQUARE TEST
Χ2 quantity
• The quantity χ2 describes the magnitude of the discrepancy between theory and observation.
• It describes the magnitude of the difference between observed frequencies and the frequencies
expected under certain assumptions.
• It is a statistical test which tests the significance of the difference between the observed
frequencies and the expected frequencies.
• The χ2 value ranges from 0 to infinity. It is 0 when the expected and observed frequencies
completely coincide.
• So, the greater the value of χ2, the greater is the discrepancy between observed and expected
frequencies.
Characteristics of Chi-Square test
• It is a non-parametric test. Assumptions about the form of the distribution or its parameters
are not required.
• It is a distribution free test, which can be used in any type of distribution of population.
• It analyses a set of differences between a set of observed frequencies and a set of
corresponding expected frequencies.
Applications of Chi-Square Test
• Useful for the test of goodness of fit.
• Useful for the test of independence of attributes.
• Useful for testing homogeneity
• Useful for testing given population variance
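As an illustration of the test of independence, here is a short Python sketch using scipy and a hypothetical 2 x 2 contingency table (e.g. gender versus product preference):

from scipy.stats import chi2_contingency

# Hypothetical observed frequencies: rows = gender, columns = preference
observed = [[30, 20],
            [25, 35]]

chi2, p_value, dof, expected = chi2_contingency(observed)
print(f"Chi-square = {chi2:.3f}, degrees of freedom = {dof}, p-value = {p_value:.4f}")
# A small p-value (e.g. below 0.05) suggests the two attributes are not independent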
TWO-SAMPLE SIGN TEST
Introduction
• A two- sample sign test is a non-parametric test based upon the sign of a pair of
observations.
• Suppose a sample of respondents is selected and their views on the image of a company are
sought. After some time, these respondents are shown an advertisement, and thereafter, the
data is again collected on the image of a company.
• For those respondents whose image of the company has improved, a positive sign is assigned;
for those where the image has declined, a negative sign is assigned; and where there is no
change, the corresponding observation is dropped from the analysis and the sample size is
reduced accordingly.
• The key concept underlying the test is that if the advertisement is not effective in improving
the image of the company, the number of positive signs should be approximately equal to the
number of negative signs. For small samples, a binomial distribution could be used, whereas
for a large sample, a normal approximation to the binomial distribution could be used, as
already explained in the one sample test.
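A minimal sketch of this logic in Python, assuming a reasonably recent scipy (for binomtest) and hypothetical before/after image scores:

from scipy.stats import binomtest

# Hypothetical image scores for the same respondents before and after the advertisement
before = [6, 7, 5, 8, 6, 7, 5, 6, 7, 6]
after  = [7, 8, 5, 9, 7, 8, 6, 6, 8, 7]

# Keep only the signs of the differences; drop ties (zero differences)
signs = [(a - b) for a, b in zip(after, before) if a != b]
n_plus = sum(1 for d in signs if d > 0)
n = len(signs)

# Under H0 (advertisement has no effect), positive signs follow Binomial(n, 0.5)
result = binomtest(n_plus, n, p=0.5, alternative="greater")
print(f"Positive signs: {n_plus} out of {n}, p-value = {result.pvalue:.4f}")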
Mann-Whitney U Test
Introduction
• This test is used to examine whether two samples have been drawn from populations with the same
mean.
• This test is an alternative to the t-test for testing the equality of means of two independent samples.
H0 : The two samples come from identical populations, i.e., the two populations have identical
probability distributions.
H1 : The two samples come from different populations, i.e., the two populations differ in location.
• The two samples are combined (pooled) into one large sample and then the rank of each observation
in the pooled sample is determined. If two or more sample values in the pooled sample are identical,
i.e., there are ties, each tied value is assigned a rank equal to the mean of the ranks that would
otherwise be assigned.
Let R1 and R2 represent the sums of the ranks of the first and the second sample, and let n1 and n2
be the respective sample sizes. For convenience, if the sample sizes are unequal, label the smaller
sample as sample 1 so that n1 < n2. A significant difference between R1 and R2 implies a
significant difference between the samples.
• If n1 or n2 is greater than 10, a large sample approximation can be used for the distribution of the
Mann-Whitney U statistic.
• For this purpose, either U1 or U2 could be used for a one-tailed or a two-tailed test. In this
discussion, U2 is used.
• Under the assumption that the null hypothesis is true, the U2 statistic follows an approximately
normal distribution with mean μU2 = n1 n2 / 2 and standard deviation
σU2 = √( n1 n2 (n1 + n2 + 1) / 12 ).
If the absolute sample value of Z is greater than the absolute critical value of Z, the null hypothesis is
rejected.
Example
• The table below represents the number of bounced cheques in two banks- Bank A and Bank B- on
randomly chosen 12 days for Bank A and 15 days for Bank B. Use a Mann-Whitney U test to examine
at a 5 percent level of significance whether Bank A has more bounced cheques as compared to Bank
B.
We consider the sample of Bank B as coming from population B and that of Bank A as coming from
population A. R1 = sum of ranks of Bank A = 249; R2 = sum of ranks of Bank B = 129.
The critical value of Z at a 5 percent level of significance is 1.645. The sample value of Z exceeds the
critical value, so the null hypothesis is rejected. Therefore, Bank A has a larger number of bounced
cheques than Bank B.
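A hedged Python sketch of the same kind of comparison using scipy's mannwhitneyu; the daily counts below are hypothetical, not the textbook data:

from scipy.stats import mannwhitneyu

# Hypothetical daily bounced-cheque counts (illustrative only)
bank_a = [8, 12, 10, 15, 9, 11, 14, 13, 10, 12, 16, 9]
bank_b = [5, 7, 6, 8, 4, 6, 7, 5, 6, 8, 7, 5, 6, 7, 4]

# H1: Bank A tends to have more bounced cheques than Bank B
u_stat, p_value = mannwhitneyu(bank_a, bank_b, alternative="greater")
print(f"U = {u_stat}, p-value = {p_value:.4f}")
# Reject H0 at the 5 percent level if p-value < 0.05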
WILCOXON SIGNED RANK TEST FOR PAIRED SAMPLES
Introduction
• There are instances where the sample data consists of paired observations.
• Examples of paired samples include a study where husband and wife are matched or where subjects
are studied before and after experimentation or observations are taken on a variable for brother and
sister.
• In the two-sample sign test, only the sign of the difference (positive or negative) was taken into
account and no weightage was attached to the magnitude of the difference.
• The Wilcoxon matched-pairs signed rank test takes care of this limitation and attaches greater
weightage to the matched pair with a larger difference.
• The test, therefore, incorporates and makes use of more information than the sign test.
Test procedure for the Wilcoxon signed rank test for paired samples
• Let di denote the difference in score for the ith matched pair. Retain the signs, but discard any pair
for which di = 0.
• Ignoring the signs of the differences, rank all the di's from the lowest to the highest absolute value.
• Compute the sums of the ranks of the negative and the positive differences, denoted T− and
T+ respectively.
• When the number of pairs of observations (n) for which the difference is not zero is greater
than 15, the T statistic follows an approximately normal distribution under the null hypothesis that the
population differences are centred at 0.
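A minimal Python sketch of the paired test using scipy's wilcoxon, with hypothetical before/after scores (zero differences are discarded by the default settings):

from scipy.stats import wilcoxon

# Hypothetical appraisal scores for the same salesmen before and after training
before = [62, 70, 65, 68, 72, 60, 66, 71, 64, 69, 67, 73, 61, 68, 70, 65]
after  = [66, 72, 64, 74, 75, 63, 70, 71, 68, 73, 69, 78, 65, 70, 74, 68]

# H0: training has caused no change in the appraisal scores
stat, p_value = wilcoxon(before, after)
print(f"T = {stat}, p-value = {p_value:.4f}")
# Reject H0 at the 5 percent level if p-value < 0.05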
Problem
• A sample of 16 salesmen was selected in an organisation and their score on the performance
appraisal was noted. The salesmen were sent for a three-week training programme and in the next
appraisal, their scores were noted again. The appraisal scores before and after the training are given
below:
Use a 5 percent level of significance to test the hypothesis that the training has not caused any change
in the performance appraisal system score
The number of typing errors per page made by 17 students who joined a typing institute before and
after the training is given below. Use a 5 percent level of significance to test the hypothesis that the
average number of typing errors decreased after the training.
KRUSKAL-WALLIS TEST
Introduction
• When testing the equality of more than two population means, the one-way ANOVA
technique is used.
• One of the assumptions of ANOVA is that all the populations from which the
samples are drawn are normally distributed. If this assumption does not hold, the F-
statistic used in ANOVA becomes invalid.
• The normality assumption may not hold when we are dealing with ordinal data
or when the size of the sample is very small.
• The Kruskal-Wallis test comes to our rescue in such situations. It is, in fact, the non-
parametric counterpart to one-way ANOVA. The test is an extension of the Mann-Whitney
U test.
• Both methods require that the scale of measurement of a sample value should be at
least ordinal.
• Obtain random samples of sizes n1, ….., nk from each of the k populations. Therefore,
the total sample size is n = n1 + n2 + ….. + nk.
• Pool all the samples and rank them, with the lowest score receiving a rank of 1. Ties
are to be treated in the usual fashion by assigning an average rank to the tied positions.
The Kruskal-Wallis test uses the χ2 distribution to test the null hypothesis. The test statistic is given by:
H = [12 / (n(n + 1))] Σ (Ri² / ni) − 3(n + 1),
where Ri is the sum of the ranks of the ith sample; under H0, H follows a χ2 distribution with k − 1
degrees of freedom.
Problem
• Three machines are used in the packaging of 16 kg bags of wheat flour. Each machine is
designed to pack an average of 16 kg of flour per bag. Samples of six bags were
selected from each machine and the amount of wheat packaged in each bag is shown
below. Use a 5 percent level of significance to test the hypothesis that the amount of wheat
packaged by the three machines is the same.
Solution
H0 : Amount of wheat packaged by the three machines is the same.
H1 : Amount of wheat packaged by at least two machines is different.
Pool the elements of the different samples and rank them, assigning average ranks to ties. A portion
of the pooled ranking (weight in kg, rank, machine) is shown below:
15.4  1  2     16.0  10  2
15.7  3  1     16.1  11  2
15.7  3  3     16.2  13  1
15.7  3  3     16.2  13  2
15.9  8  2     16.5  18  2
r1 (Total of ranks of machine 1)=50.5
r2 (Total of ranks of machine 2)=61
r3 ( Total of ranks of machine 3)=59.5
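A short Python sketch of the Kruskal-Wallis test with scipy; the bag weights below are hypothetical, not the full textbook sample:

from scipy.stats import kruskal

# Hypothetical weights (kg) of six bags sampled from each machine
machine_1 = [15.7, 16.2, 15.8, 16.0, 16.1, 15.9]
machine_2 = [15.4, 15.9, 16.0, 16.1, 16.2, 16.5]
machine_3 = [15.7, 15.7, 15.8, 16.3, 16.4, 15.6]

# H0: the three machines pack the same amount of wheat
h_stat, p_value = kruskal(machine_1, machine_2, machine_3)
print(f"H = {h_stat:.3f}, p-value = {p_value:.4f}")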
CORRELATION ANALYSIS
What is Correlation?
• If changes in the value of one variable are accompanied by changes in the value of the other
variable, the two variables are correlated.
• For example, when the price of a commodity changes, the demand for that commodity also changes.
Types of Correlation
• Positive correlation
• Negative correlation
Positive correlation
• Correlation is positive when two variables vary in the same direction.
Negative Correlation
• Two variables are said to be negatively correlated when they vary in opposite directions: when one
variable increases, the other variable decreases, and vice versa.
• For example, consider the correlation between production and price of a crop.
• Simple correlation
• Multiple correlation
• Partial correlation
Simple Correlation
• When we measure the linear relationship between only two variables, it is known as simple
correlation.
Ex- Relationship between sales and expenses, income and consumption etc.
Multiple Correlation
• When we study the combined relationship of two or more variables with a single variable, it is known
as multiple correlation. For example, if we try to find out the relationship of rainfall and temperature
with the yield of wheat, this is known as multiple correlation.
Partial Correlation
• If we have several related variables and try to find out the relationship between two of them while
eliminating the effect of the others, it is known as partial correlation.
For example, consider the two variables height and weight, whose relationship is affected by a third
variable, 'age'. If we remove the effect of age and study the relationship between height and weight
alone, this correlation is known as partial correlation.
• Linear correlation
• Non-linear correlation
Linear Correlation
• In a linear correlation, changes in the value of one variable bear a constant ratio to changes in the
value of the other variable.
• When these variables are plotted on a graph, then plotted points would fall on a straight line. For
example, consider the following relationship shown in the table.
Non-Linear Correlation
• In a non-linear correlation, changes in the value of one variable do not bear a constant ratio to
changes in the value of the other variable.
• When these variables are plotted on a graph, then plotted points would fall on a curve.
Degree of Correlation
• Perfect correlation- If two variables change in the same direction and in the same proportion, the
correlation between the two variables is called perfect positive correlation. The correlation coefficient
in this case is +1.
• If two variables change in the opposite direction and in the same proportion, the correlation
between the two variables is called perfect negative correlation. In this case, the coefficient of
correlation is -1
• If the two variables show no relation, i.e., a change in one variable does not lead to a change in the
other variable, there is no relationship between them. In this case, the coefficient of correlation is
zero.
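A brief Python illustration of computing the correlation coefficient for two hypothetical series:

from scipy.stats import pearsonr

# Hypothetical paired observations, e.g. advertising spend and sales
advertising = [10, 12, 15, 18, 20, 22, 25]
sales       = [40, 44, 50, 58, 61, 66, 72]

r, p_value = pearsonr(advertising, sales)
print(f"Correlation coefficient r = {r:.3f} (ranges from -1 to +1), p-value = {p_value:.4f}")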
REGRESSION ANALYSIS
• Regression analysis means the estimation or prediction of the unknown value of one variable
from the known value of the other variable.
• For example, if an agriculturist knows that the yield of rice and rainfall are closely related, he
will want to know the amount of rain required to achieve a certain level of production. In this
situation, he will use regression analysis.
• The variable whose value is influenced or is to be predicted is called dependent variable and the
variable which influences the values or is used for prediction is called independent variable
Classification of Regression
• Simple Regression
• Multiple Regression
• Linear Regression
• Non-linear Regression
Simple Regression
• When there are only two variables, the regression equation obtained is called simple regression
equation
Multiple Regression
• In multiple regression analysis, there are more than two variables and we try to find out the effect of
two or more independent variables on one dependent variable. Let X, Y, and Z be three variables,
where X and Y are the independent variables and Z is the dependent variable.
Then we use multiple regression analysis to study the movement of Z for a unit movement in
X and Y. For example, consider three variables: yield, rainfall, and temperature. If yield depends
on rainfall and temperature, we obtain the regression equation of Z on X and Y, where Z is yield,
X is rainfall, and Y is temperature.
Linear Regression
• On the basis of the proportion of changes in the variables, the regression can be classified as Linear
and non-linear regression.
• If the given bivariate data is plotted on a graph, the points so obtained on the scatter diagram will
more or less concentrate around a curve, called the curve of regression.
• If the regression curve is a straight line, we say that there is a linear relationship between the
variables under study. Mathematically, the relation between x and y in a linear regression can be
expressed as y = a + bx.
• In a linear regression, the change in the dependent variable is proportionate to the changes in the
independent variable
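A minimal sketch of fitting the line y = a + bx in Python with scipy; the rainfall and yield figures are hypothetical:

from scipy.stats import linregress

# Hypothetical data: rainfall (cm) as the independent variable, yield as the dependent variable
rainfall = [50, 60, 70, 80, 90, 100]
crop_yield = [2.0, 2.4, 2.9, 3.1, 3.6, 4.0]

result = linregress(rainfall, crop_yield)
a, b = result.intercept, result.slope
print(f"Fitted line: y = {a:.3f} + {b:.3f}x")

# Predict the yield expected for a given amount of rainfall
print(f"Predicted yield at 85 cm of rainfall: {a + b * 85:.2f}")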
Non-linear Regression
• If the curve of regression is not a straight line, the regression is termed curved or non-linear
regression. In this case, the dependent variable does not change by a constant amount for a unit
change in the independent variable.
DISCRIMINANT ANALYSIS
Introduction
• This technique is used to classify individuals/objects into one of the alternative groups on the basis
of a set of predictor variables.
• The dependent variable in the discriminant analysis is categorical and on a nominal scale, whereas
the independent variables or predictor variables are either interval or ratio scale in nature.
• For example, the dependent variable may be choice of a brand of a personal computer (brand A, B,
or C) and the independent variables may be ratings of attributes of PC’s on a 7 point Likert Scale.
• When there are two groups (categories) of dependent variable, we have two-group discriminant
analysis and when there are more than two groups, it is a case of multiple discriminant analysis.
• Thus, each respondent rates the product according to each of the four characteristics and then
indicates whether he would be a prospective buyer of the product or not.
• The rating is done on an 11-point scale (where 0 represents very poor and 10 represents excellent).
• The question of interest is which variables (durability, light weight, low investment, and rot
resistance) are relatively better at discriminating between the two groups.
• Examples of such groups: those who buy our brand versus those who buy a competitor's brand, or
those who shop at Food World versus those who buy from a kirana shop.
• It is used for scale construction: discriminant analysis identifies the variables or statements that
discriminate well, i.e., on which people with diverse views respond differently.
• Segment discrimination
• Perceptual mapping.
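A hedged sketch of a two-group discriminant analysis in Python, assuming scikit-learn; the attribute ratings and buyer labels below are illustrative, not the textbook data:

from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Hypothetical ratings on four attributes: durability, light weight, low investment, rot resistance
ratings = [
    [8, 7, 6, 9],
    [7, 8, 7, 8],
    [9, 6, 8, 9],
    [3, 4, 5, 2],
    [4, 3, 4, 3],
    [2, 5, 3, 4],
]
# 1 = prospective buyer, 0 = non-buyer
buyer = [1, 1, 1, 0, 0, 0]

lda = LinearDiscriminantAnalysis()
lda.fit(ratings, buyer)

# Coefficients indicate the relative weight of each attribute in discriminating the groups
print("Discriminant coefficients:", lda.coef_)
print("Predicted group for a new respondent:", lda.predict([[8, 6, 7, 8]]))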
CONJOINT ANALYSIS
Introduction
• It attempts to identify the most desirable attributes that could be offered in a product or service.
• An attempt is made to determine the relative importance that consumers attach to the attributes
and the utilities that they attach to the levels of attributes.
• The utilities describe the importance that consumers attach to the levels of each attribute.
• Here the respondents are told about the various combinations of the attribute levels and are asked
to evaluate the combinations in terms of their desirability.
• Conjoint analysis makes use of subjective evaluation of the combinations presented to the
consumer.
• Conjoint analysis is a statistical technique used in market research to determine how people value
different attributes (feature, function, benefits) that make up an individual product or service.
• Conjoint analysis seeks to develop part-worth utility functions which describe the utility
consumers attach to the levels of each attribute.
• Determine relative importance of the attributes in the choice process of the consumers.
• Determine the market share of brands that differ in attribute levels.
• The idea is that a product or service can be broken down into its constituent parts - so for instance
a mobile phone has a size, weight, battery life, size of address book, type of ring.
• When we compare between mobile phones each will have a different specification on each of these
attributes.
• You might have choices in terms of battery life between 72, 108, 120 hours of battery life.
Suppose we ask a set of respondents to express their preference among movies that vary on three
attributes, each with two levels, as shown below:
There are in total 2 × 2 × 2 = 8 combinations. Each of these combinations is presented to, say,
respondent number 1. One of the attributes might be: hero of the movie – Shah Rukh Khan or
Akshay Kumar.
• The respondent could be presented with the above 8 combinations and asked to give their
preferences in terms of desirability of the feature, either on an interval scale or an ordinal scale.
• Identification of Attributes
• Aggregation of judgements
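A simplified sketch of estimating part-worth utilities for a 2 x 2 x 2 design using ordinary least squares in Python; the attribute coding and preference ratings are hypothetical:

import numpy as np

# Each profile is coded 0/1 on three two-level attributes (e.g. hero, genre, length)
profiles = np.array([
    [0, 0, 0],
    [0, 0, 1],
    [0, 1, 0],
    [0, 1, 1],
    [1, 0, 0],
    [1, 0, 1],
    [1, 1, 0],
    [1, 1, 1],
])
# Hypothetical preference ratings given by one respondent to the 8 combinations
ratings = np.array([3, 4, 5, 6, 6, 7, 8, 9])

# Add an intercept column and estimate part-worths by least squares
X = np.column_stack([np.ones(len(profiles)), profiles])
part_worths, *_ = np.linalg.lstsq(X, ratings, rcond=None)

print("Intercept:", round(part_worths[0], 2))
print("Part-worths of level 1 versus level 0 for each attribute:", np.round(part_worths[1:], 2))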
FACTOR ANALYSIS
Introduction
It is a very useful method for reducing a very large number of variables, which create data
complexity, to a few manageable factors.
• When the objective is to summarise information from a large set of variables into fewer
factors, principal component factor analysis is used.
• On the other hand, if the researcher wants to analyse the components of the main
factor, common factor analysis is used.
• Example: common factor – inconvenience inside a car. Its component variables may be:
1. Leg room
2. Seat arrangement
3. Entering the rear seat
4. Inadequate dickey space
5. Door locking mechanism
Method – The Marketing Researcher prepares a questionnaire to study the customer feedback. The
researcher has identified 6 variables or factors for this purpose.
Grouping of variables
• A, B, D, E into factor-1
• F into Factor 2
• C into Factor 3
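A brief sketch of principal component factor analysis in Python with scikit-learn, using a small hypothetical response matrix for six questionnaire variables (A to F):

import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Hypothetical responses of 8 customers on 6 variables (columns A..F)
responses = np.array([
    [5, 4, 2, 5, 4, 1],
    [4, 5, 1, 4, 5, 2],
    [2, 1, 5, 2, 1, 4],
    [5, 5, 2, 4, 4, 1],
    [1, 2, 4, 1, 2, 5],
    [4, 4, 1, 5, 5, 2],
    [2, 2, 5, 2, 1, 5],
    [5, 4, 2, 5, 5, 1],
])

# Standardise the variables, then extract three principal components (factors)
scaled = StandardScaler().fit_transform(responses)
pca = PCA(n_components=3)
pca.fit(scaled)

print("Variance explained by each factor:", np.round(pca.explained_variance_ratio_, 2))
print("Loadings (factors x variables):")
print(np.round(pca.components_, 2))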
CLUSTER ANALYSIS
Cluster Analysis
• Cluster analysis is a grouping technique. The basic assumption underlying the technique is
the fact that similarity is based on multiple variables, and the technique attempts to measure
the proximity in terms of the study variables.
• The emerging groups are homogeneous in their composition and heterogeneous when compared
to other groups.
• The grouping can be done for objects, individuals, and products.
• The researcher identifies a set of clustering variables which have been assumed as
significant for the purpose of classifying the objects into groups.
• It is also referred to as classification technique, numerical taxonomy, and Q analysis.
• Cluster analysis starts with an undifferentiated group of people, events, or objects and
attempts to reorganise them into homogeneous groups. It classifies objects so
that each object is similar to the others in its cluster with respect to some predetermined
selection criteria.
• The resulting clusters of objects then exhibit high internal (within cluster) homogeneity and
high external (between cluster) heterogeneity. Thus if successful, the objects within cluster
will be close together when plotted geometrically, and different clusters will be far apart.
Example: a simple cluster solution of breakfast foods based on people who seek
nutrition and those who seek convenience (ease of preparation).
A person might use different criteria for a weekday breakfast and for a weekend breakfast. A bakery/
confectionery shop selling sandwiches and bread rolls would want to know:
1. The lucrative segment
2. The segment which might be motivated to buy if one takes care of their weekday/weekend
needs
3. The segment which is currently not interested in a ready-to-eat breakfast solution and
might not look at the bakery as an outlet to visit in the morning.
• Once the homogeneous clusters emerge, the next step is to determine the profile of each
group in terms of who they are.
• What is their gender, age group, family size, etc.?
• What deals motivate them to buy from a particular store when they are buying eatables in
general?
Applications
• It is used to segment the market in marketing. It helps marketers discover distinct groups in
their customer bases, and then use this knowledge in targeted marketing programs.
• It is used in social networking sites in making new groups based on user’s data.
• It is also used in city planning – identifying groups of houses according to their house type,
value, and geographical location.
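A minimal Python sketch of clustering people on two study variables (nutrition seeking and convenience seeking) with scikit-learn's KMeans; the scores are hypothetical:

import numpy as np
from sklearn.cluster import KMeans

# Hypothetical scores: [importance of nutrition, importance of convenience] for 10 people
scores = np.array([
    [9, 2], [8, 3], [9, 1],          # nutrition seekers
    [2, 9], [3, 8], [1, 9],          # convenience seekers
    [5, 5], [6, 4], [4, 6], [5, 6],  # middle group
])

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = kmeans.fit_predict(scores)

print("Cluster assigned to each person:", labels)
print("Cluster centres (nutrition, convenience):")
print(np.round(kmeans.cluster_centers_, 2))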