Jasika RM Lab
PRACTICAL FILE
Submitted by
“JASIKA”
“111”
STUDENT NAME
Before we get into the project, I would like to add a few words of appreciation
for the people who have been a part of this project right from its inception. The
writing of this project has been one of the significant academic challenges I
have faced and without the support, patience, and guidance of the people
involved, this task would not have been completed. It is to them I owe my
deepest gratitude.
It gives me immense pleasure to present this project report on “Research
Methodology (using MS Excel and R Studio)”. The success of this project is the
result of hard work and determination on my part, together with the help of my
project guide. I take this opportunity to add a special note of thanks to
DR. AANCHAL AGGARWAL, who undertook to act as my mentor despite
her many other academic and professional commitments. Her wisdom,
knowledge and commitment to the highest standards inspired and motivated me.
Without her insight, support and energy, this project would neither have started
nor reached fruition.
INDEX
TOPIC
Data Analysis
Descriptive statistics
Histogram frequency distribution
Correlation (Positive, Negative, zero)
HYPOTHESIS TESTING
One sample t test using dummy (one-tailed)
Two sample t test (two-tailed)
Two sample - t test (one tailed)
Paired Sample t test
Two sample z test
F test
ANOVA – Single Factor
ANOVA – Two Factor without replication
ANOVA – Two Factor with replication
HYPOTHESIS TESTING in R Studio
How to install R Studio
Introduction to R studio
Import of Data Sheet in R studio
Descriptive statistics
Correlation
Hypothesis Testing: One sample t test (two tail)
Hypothesis Testing: Two independent sample t test
Hypothesis Testing: Paired Sample t test (alpha 10%, one tail )
Hypothesis Testing: Paired Sample t test 2 (One tail)
Hypothesis Testing: F test
Hypothesis Testing: One-way ANOVA
DATA ANALYSIS
Descriptive statistics
AGE
5
25
15
45
26
45
48
59
15
48
47
77
16
28
25
84
75
59
25
48
13
20
30
69
                     AGE
Mean                 39.45833333
Standard Error       4.598353189
Median               37.5
Mode                 25
Standard Deviation   22.52723794
Sample Variance      507.4764493
Kurtosis             -0.838323041
Skewness             0.442649084
Range                79
Minimum              5
Maximum              84
Sum                  947
Count                24
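The same figures can be cross-checked in R. The sketch below is a minimal illustration, assuming the AGE values are typed in directly from the table above; the variable names are illustrative only. Base R has no built-in mode, skewness or kurtosis functions, so those three rows are left out.

```r
# AGE values copied from the data table above
age <- c(5, 25, 15, 45, 26, 45, 48, 59, 15, 48, 47, 77,
         16, 28, 25, 84, 75, 59, 25, 48, 13, 20, 30, 69)

n <- length(age)

# Collect the statistics that base R can compute directly
stats <- c(
  mean     = mean(age),
  se       = sd(age) / sqrt(n),      # standard error of the mean
  median   = median(age),
  sd       = sd(age),                # sample standard deviation (n - 1)
  variance = var(age),               # sample variance (n - 1)
  range    = max(age) - min(age),
  minimum  = min(age),
  maximum  = max(age),
  sum      = sum(age),
  count    = n
)

print(round(stats, 4))
```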
HISTOGRAM FREQUENCY DISTRIBUTION
STEPS:
1) Go to the Data tab -> Data Analysis option -> select the Histogram option and
click OK
2) Select the Input Range, Labels and Output Range, tick the Pareto, Chart Output
and Cumulative Percentage options, and click OK
3) The Output is Displayed
Frequency table (bins in ascending order)
BINS   Frequency   Cumulative %
25     0           0.00%
30     3           1.36%
35     4           3.17%
40     2           4.07%
45     0           4.07%
50     6           6.79%
55     11          11.76%
60     8           15.38%
65     14          21.72%
70     15          28.51%
75     11          33.48%
80     19          42.08%
85     47          63.35%
90     37          80.09%
95     35          95.93%
100    9           100.00%
More   0           100.00%

Pareto table (bins sorted by descending frequency)
BINS   Frequency   Cumulative %
85     47          21.27%
90     37          38.01%
95     35          53.85%
80     19          62.44%
70     15          69.23%
65     14          75.57%
55     11          80.54%
75     11          85.52%
100    9           89.59%
60     8           93.21%
50     6           95.93%
35     4           97.74%
30     3           99.10%
40     2           100.00%
25     0           100.00%
45     0           100.00%
More   0           100.00%
[Figure: Histogram (Pareto order) showing Frequency and Cumulative % for each of the BINS]
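The cumulative percentages in the output can be reproduced from the frequencies alone. The sketch below is only an illustration: the raw observations are not listed in this file, so it starts from the frequency column, and the object names (`tab`, `pareto`) are assumptions.

```r
# Bin upper limits and frequencies taken from the histogram output;
# Inf stands in for the "More" bin
bins <- c(seq(25, 100, by = 5), Inf)
freq <- c(0, 3, 4, 2, 0, 6, 11, 8, 14, 15, 11, 19, 47, 37, 35, 9, 0)

# Cumulative percentage in ascending bin order
tab <- data.frame(
  bin            = bins,
  frequency      = freq,
  cumulative_pct = round(100 * cumsum(freq) / sum(freq), 2)
)
print(tab)

# Pareto order: sort by descending frequency, then accumulate again
pareto <- tab[order(-tab$frequency), c("bin", "frequency")]
pareto$cumulative_pct <- round(100 * cumsum(pareto$frequency) / sum(freq), 2)
print(pareto)
```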
CORRELATION
The correlation coefficient (a value between -1 and +1) tells you how strongly two variables
are related to each other.
a. POSITIVE CORRELATION
What is the correlation between the advertisement of a product in a month and its sales in
crores?
Advertisement in month   Sales in crores
32 5
54 10
67 15
65 20
98 24
112 34
101 25
34 34
Result:
                         Advertisement in month   Sales in crores
Advertisement in month   1
Sales in crores          0.485149134              1
Inference:
Here r ≈ +0.49, therefore there is a moderate positive correlation between advertisements and sales.
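As a cross-check, the coefficient can be computed in R with `cor()`, which uses the Pearson method by default. This is a minimal sketch with the values typed in from the table above; the vector names are illustrative.

```r
# Advertisement and sales figures copied from the table above
advertisement <- c(32, 54, 67, 65, 98, 112, 101, 34)
sales         <- c(5, 10, 15, 20, 24, 34, 25, 34)

# Pearson correlation coefficient
r <- cor(advertisement, sales)
print(round(r, 4))
```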
b. NEGATIVE CORRELATION
What is the correlation between the number of cigarettes smoked in a week and life
expectancy?
Cigarettes   Life expectancy
5 80
23 78
25 60
48 53
17 85
8 84
4 73
26 79
11 81
19 75
14 68
35 72
29 58
4 92
23 65
Inference:
Here r = -0.71, therefore there is a negative correlation between number of cigarettes in a
week and life expectancy.
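The negative coefficient can also be checked in R. The sketch below uses `cor.test()`, which additionally reports whether the correlation differs significantly from zero; the significance test is an extra step that is not part of the Excel output above.

```r
# Values copied from the cigarettes / life expectancy table above
cigarettes      <- c(5, 23, 25, 48, 17, 8, 4, 26, 11, 19, 14, 35, 29, 4, 23)
life_expectancy <- c(80, 78, 60, 53, 85, 84, 73, 79, 81, 75, 68, 72, 58, 92, 65)

# Pearson correlation with a test of H0: true correlation = 0
res <- cor.test(cigarettes, life_expectancy)

r       <- unname(res$estimate)
p_value <- res$p.value
print(round(c(r = r, p_value = p_value), 4))
```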
c. ZERO CORRELATION
Inference:
Here r=0, therefore there is no correlation between shoe size and IQ level.
HYPOTHESIS TESTING
ONE SAMPLE T-TEST USING A DUMMY (ONE-TAILED)
Problem: To determine whether the population mean of age is greater than 40
at α=0.05
Age Dummy
42 0
76 0
56 0
67
65
65
89
45
45
65
78
55
44
65
76
89
54
56
56
76
45
Hypothesis Testing:
Null hypothesis (H0): The mean age of the population is not greater than 40.
Alternate hypothesis (H1): The mean age of the population is greater than 40.
H0: µ ≤ 40
H1: µ > 40
Result:
t-Test: Two-Sample Assuming Equal Variances
Age Dummy
Mean 62.33333 0
Variance 208.6333 0
Observations 21 3
Pooled Variance 189.6667
Hypothesized Mean Difference 40
df 22
t Stat 2.627379
P(T<=t) one-tail 0.007691
t Critical one-tail 1.717144
P(T<=t) two-tail 0.015382
t Critical two-tail 2.073873
Decision Rule:
If t-stat is greater than t-critical, reject Null Hypothesis.
If p(t) is less than α, reject Null Hypothesis
Inference:
Since t Stat (2.62) is greater than t critical (1.71), reject null hypothesis.
Since P (0.007) is less than α (0.05), reject null hypothesis.
Conclusion:
The population mean age is greater than 40 at α=0.05
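For comparison, a one-sample t test can be run directly in R without a dummy column. The sketch below is an illustration on the Age values above, not the Excel procedure itself. Because the Excel output pools the variance with the three dummy zeros and uses df = 22, its t Stat (2.627) differs from the direct one-sample statistic, which has df = 20; the decision is the same in both cases.

```r
# Age values copied from the table above (dummy column not needed)
age <- c(42, 76, 56, 67, 65, 65, 89, 45, 45, 65, 78, 55, 44, 65,
         76, 89, 54, 56, 56, 76, 45)

# One-tailed one-sample t test of H0: mu <= 40 against H1: mu > 40
res <- t.test(age, mu = 40, alternative = "greater")
print(res)

t_stat  <- unname(res$statistic)
df      <- unname(res$parameter)
p_value <- res$p.value
```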
TWO SAMPLE T-TEST (TWO TAILED)
Problem: To analyse that there is a significant difference between the marks
scored by class groups A & B in mathematics at α=10%
Group A Group B
76 95
87 97
98 87
78 89
76 87
78 45
76 76
88 56
78 76
87 87
87 76
87 76
76 45
89 88
65 76
78 66
89 78
87 56
87 77
Hypothesis Testing:
Null hypothesis (H0): There is no significant difference between the marks
scored by class groups A & B in mathematics at α=10%
Alternate hypothesis (H1): There is a significant difference between the marks
scored by class groups A & B in mathematics at α=10%
H0: µA = µB; µA - µB = 0
H1: µA ≠ µB; µA - µB ≠ 0
Result:
t-Test: Two-Sample Assuming Equal Variances
                               Group A    Group B
Mean                           82.47368   75.42105
Variance                       57.37427   238.8129
Observations                   19         19
Pooled Variance                148.0936
Hypothesized Mean Difference   0
df                             36
t Stat                         1.786261
P(T<=t) one-tail               0.041241
t Critical one-tail            1.305514
P(T<=t) two-tail               0.082482
t Critical two-tail            1.688298
Decision Rule:
If t-stat is greater than t-critical, reject Null Hypothesis.
If p(t) is less than α, reject Null Hypothesis
Inference:
Since t Stat (1.78) is greater than t critical (1.68), reject null hypothesis.
Since P (0.08) is less than α (0.1), reject null hypothesis.
Conclusion:
There is a significant difference between the marks scored by class groups A &
B in mathematics at α=10%.
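The Excel result can be reproduced in R. This is a minimal sketch: `var.equal = TRUE` matches the "Assuming Equal Variances" option used above, and `conf.level = 0.90` corresponds to α = 10%. The vector names are illustrative.

```r
# Marks copied from the Group A / Group B table above
group_a <- c(76, 87, 98, 78, 76, 78, 76, 88, 78, 87, 87, 87, 76, 89,
             65, 78, 89, 87, 87)
group_b <- c(95, 97, 87, 89, 87, 45, 76, 56, 76, 87, 76, 76, 45, 88,
             76, 66, 78, 56, 77)

# Two-tailed pooled-variance t test at alpha = 0.10
res <- t.test(group_a, group_b, var.equal = TRUE, conf.level = 0.90)
print(res)

t_stat  <- unname(res$statistic)
df      <- unname(res$parameter)
p_value <- res$p.value
```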
TWO SAMPLE T TEST (ONE TAILED)
Problem: To analyse whether the time spent by full-time students
studying statistics is more than the time spent by part-time
students at α = 0.05
Full time   Part time
3.2 3.1
1.5 3.4
6.5 4.6
0.2 2.8
3.7 2.3
3.3 1.5
1.7 3.8
3.6 9.5
3.8 4.3
5.3 2.7
6.9 1.6
3.6 1.6
1.7 3.2
1.2 4.2
7.2 3.9
3.9 1.2
1.9 0
5.3 0
t-Test: Two-Sample Assuming Unequal Variances
                               Full time     Part time
Mean                           3.583333333   2.983333
Variance                       4.133235294   4.566176
Observations                   18            18
Hypothesized Mean Difference   0
df                             34
t Stat                         0.86306312
P(T<=t) one-tail               0.19707508
t Critical one-tail            1.690924255
P(T<=t) two-tail               0.394150159
t Critical two-tail            2.032244509
DECISION RULE:
If T stat > T Critical, Reject Null Hypothesis
If P< Alpha, Reject Null Hypothesis
INFERENCE:
Since T stat (0.86) is less than t critical one-tail (1.69), accept Null
Hypothesis.
Since the one-tail P value (0.197) is greater than alpha (0.05), accept Null
Hypothesis.
CONCLUSION:
Therefore, there is not enough evidence that the time spent by full-time students in studying
statistics is more than the time spent by part-time students at Alpha = 0.05
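A one-tailed Welch test in R gives the same statistic. In the sketch below, `alternative = "greater"` encodes the alternative hypothesis that full-time students spend more time, and `var.equal = FALSE` matches the "Assuming Unequal Variances" option. R reports a fractional Welch df (Excel rounds it to 34), so the p value may differ slightly in the later decimals.

```r
# Hours copied from the Full time / Part time table above
full_time <- c(3.2, 1.5, 6.5, 0.2, 3.7, 3.3, 1.7, 3.6, 3.8, 5.3, 6.9,
               3.6, 1.7, 1.2, 7.2, 3.9, 1.9, 5.3)
part_time <- c(3.1, 3.4, 4.6, 2.8, 2.3, 1.5, 3.8, 9.5, 4.3, 2.7, 1.6,
               1.6, 3.2, 4.2, 3.9, 1.2, 0, 0)

# One-tailed Welch t test: H1 is mean(full_time) > mean(part_time)
res <- t.test(full_time, part_time,
              alternative = "greater", var.equal = FALSE)
print(res)

t_stat  <- unname(res$statistic)
p_value <- res$p.value
```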
PAIRED SAMPLE T-TEST
Problem: To determine whether there is a significant difference
between the time to finish the race when the race is completed with local
shoes and with branded shoes.
Athlete   Local shoes   Branded shoes
1 3.2 3.1
2 1.5 3.4
3 6.5 4.6
4 0.2 2.8
5 3.7 2.3
6 3.3 1.5
7 1.7 3.8
8 3.6 9.5
9 3.8 4.3
10 5.3 2.7
11 6.9 1.6
12 3.6 1.6
13 1.7 3.2
14 1.2 4.2
15 7.2 3.9
Hypothesis Testing:
Null hypothesis (H0): There is no significant difference between the
time to finish the race when race is completed with local shoes and
branded shoes.
Alternate hypothesis (H1): There is a significant difference between
the time to finish the race when race is completed with local shoes
and branded shoes.
H0: µA = µB; µA - µB = 0 or tl = tb, tl – tb = 0
H1: µA ≠ µB; µA - µB ≠ 0 or tl ≠ tb, tl – tb ≠ 0
Result:
t-Test: Paired Two Sample for Means
                               Local shoes   Branded shoes
Mean                           3.56          3.5
Variance                       4.598286      3.76
Observations                   15            15
Pearson Correlation            -0.02216
Hypothesized Mean Difference   0
df                             14
t Stat                         0.079506
P(T<=t) one-tail               0.468878
t Critical one-tail            1.76131
P(T<=t) two-tail               0.937755
t Critical two-tail            2.144787
Decision Rule:
If t-stat is greater than t-critical, reject Null Hypothesis.
If p(t) is less than α, reject Null Hypothesis
Inference:
Since t Stat (0.079) is less than t critical (2.14), accept null
hypothesis.
Since P (0.93) is greater than α (0.05), accept null hypothesis.
Conclusion:
There is no significant difference between the time to finish the race
when race is completed with local shoes and branded shoes.
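A paired t test is a one-sample t test on the per-athlete differences. The sketch below works through that arithmetic by hand in R as an illustration of where the t Stat comes from; the variable names are assumptions.

```r
# Times copied from the athlete table above
local_shoes   <- c(3.2, 1.5, 6.5, 0.2, 3.7, 3.3, 1.7, 3.6, 3.8, 5.3,
                   6.9, 3.6, 1.7, 1.2, 7.2)
branded_shoes <- c(3.1, 3.4, 4.6, 2.8, 2.3, 1.5, 3.8, 9.5, 4.3, 2.7,
                   1.6, 1.6, 3.2, 4.2, 3.9)

# Per-athlete differences
d  <- local_shoes - branded_shoes
n  <- length(d)
df <- n - 1

# t = mean difference / standard error of the mean difference
t_stat <- mean(d) / (sd(d) / sqrt(n))

# Two-tailed p value and critical value at alpha = 0.05
p_two_tail <- 2 * pt(-abs(t_stat), df)
t_critical <- qt(1 - 0.05 / 2, df)

print(round(c(mean_diff = mean(d), t = t_stat,
              p = p_two_tail, t_crit = t_critical), 4))
```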
TWO SAMPLE Z TEST
PROBLEM- The net annual returns (the returns on investment after deducting
all relevant fees) in percentage are given. Can investors do better by buying
mutual funds directly from banks or other financial institutions than by
purchasing mutual funds through brokers? Can we conclude at the 5%
significance level that directly-purchased mutual funds outperform mutual funds
bought through brokers?
Direct   Broker
9.33 3.24
6.94 -6.76
16.17 12.8
16.97 11.1
5.94 2.73
12.61 -0.13
3.33 18.22
16.13 -0.8
11.2 -5.75
1.14 2.59
4.68 3.71
3.09 13.15
7.26 11.05
2.05 -3.12
13.07 8.94
0.59 2.74
13.57 4.07
0.35 5.6
2.69 -0.85
18.45 -0.28
4.23 16.4
10.28 6.39
7.1 -1.9
-3.09 9.49
5.6 6.7
5.27 0.19
8.09 12.39
15.05 6.54
13.21 10.92
1.72 -2.15
14.69 4.36
-2.97 -11.07
10.37 9.24
-0.63 -2.67
-0.15 8.97
0.27 1.87
4.59 -1.53
6.38 5.23
-0.24 6.87
10.32 -1.69
10.29 9.43
4.39 8.31
-2.06 -3.99
7.66 -4.44
10.83 8.63
14.48 7.06
4.8 1.57
13.12 -8.44
-6.54 -5.72
-1.06 6.95
HYPOTHESIS TESTING
Null Hypothesis: Directly purchased mutual funds do not outperform mutual
funds bought through brokers.
Alternate Hypothesis: Directly purchased mutual funds do outperform mutual
funds bought through brokers.
H0: µ0 ≤ µ1
H1 : µ0 > µ1
RESULT:
z-Test: Two Sample for Means
                               DIRECT     BROKER
Mean                           6.6312     3.7232
Known Variance                 37.4882    43.3393
Observations                   50         50
Hypothesized Mean Difference   0
z                              2.287177
P(Z<=z) one-tail               0.011093
z Critical one-tail            1.644854
P(Z<=z) two-tail               0.022185
z Critical two-tail            1.959964
DECISION RULE :
If z stat is less than z critical, accept null hypothesis
If P(Z) is greater than α, accept null hypothesis
INFERENCE:
Since z-stat (2.28) is greater than z-critical (1.64), we will reject Null
hypothesis.
Since p(z) value (0.011) is less than α(0.05), we will reject Null hypothesis.
CONCLUSION:
Directly purchased mutual funds outperform funds bought through brokers.
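Base R has no built-in two-sample z test, so the statistic can be computed from its formula. The sketch below is an illustration that starts from the means, known variances and sample sizes reported in the Excel output rather than from the 100 raw returns.

```r
# Summary figures taken from the z-Test output above
mean_direct <- 6.6312
mean_broker <- 3.7232
var_direct  <- 37.4882
var_broker  <- 43.3393
n_direct    <- 50
n_broker    <- 50

# z = difference in means / standard error of the difference
z <- (mean_direct - mean_broker) /
  sqrt(var_direct / n_direct + var_broker / n_broker)

# One-tailed p value and critical value at alpha = 0.05
p_one_tail <- pnorm(z, lower.tail = FALSE)
z_critical <- qnorm(0.95)

print(round(c(z = z, p = p_one_tail, z_crit = z_critical), 4))
```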
F TEST
Determine whether the variance of Class 1 is greater than the variance of Class 2 in
mathematics.
Class1 Class2
65 76
76 54
65 67
76 65
56 76
45 66
HYPOTHESIS TESTING
NULL HYPOTHESIS: Variance of class 1 is not greater than variance of class 2.
ALTERNATE HYPOTHESIS: Variance of class 1 is greater than variance of class 2.
H0: V1 ≤ V2; V1 - V2 ≤ 0
H1: V1 > V2; V1 - V2 > 0
Decision rule
If F stat is less than F critical, accept null hypothesis
If P(F) is greater than α, accept null hypothesis
INFERENCE
Since F stat = 2.13 is less than F critical = 5.05, accept null hypothesis
Since P = 0.21 is greater than α = 0.05, accept null hypothesis
CONCLUSION
Therefore, variance of class 1 is not greater than class 2
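The F statistic is the ratio of the two sample variances. It can be checked in R with `var.test()`, as in the sketch below, where `alternative = "greater"` encodes the one-tailed alternative V1 > V2. The vector names are illustrative.

```r
# Marks copied from the Class1 / Class2 table above
class1 <- c(65, 76, 65, 76, 56, 45)
class2 <- c(76, 54, 67, 65, 76, 66)

# One-tailed F test of H0: V1 <= V2 against H1: V1 > V2
res <- var.test(class1, class2, alternative = "greater")
print(res)

f_stat  <- unname(res$statistic)   # equals var(class1) / var(class2)
p_value <- res$p.value

# Critical value at alpha = 0.05 with (5, 5) degrees of freedom
f_critical <- qf(0.95, df1 = length(class1) - 1, df2 = length(class2) - 1)
```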
ANOVA – SINGLE FACTOR
Hypothesis Testing:
Null hypothesis: There is no significant difference between the means of the populations.
H0: μ1 = μ2 = μ3
Alternate hypothesis: There is a significant difference between the means of the populations.
H1: at least one of the means is different.
Anova: Single Factor
SUMMARY
Groups      Count   Sum   Average    Variance
Economics   9       435   48.33333   23.5
Science     7       420   60         32.33333
History     9       393   43.66667   50.5
ANOVA
Source of Variation   SS        df   MS         F          P-value    F crit
Between Groups        1085.84   2    542.92     15.19623   7.16E-05   3.443357
Within Groups         786       22   35.72727
Total                 1871.84   24
INFERENCE
Since f stat (15.196) is greater than f critical (3.443), reject the null hypothesis.
Since the P value (0.0000716) is less than alpha, reject the null hypothesis.
CONCLUSION
Therefore, the mean marks of the students in Economics, Science and History are not all equal, assuming α = 0.05.
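The raw marks for this test are not listed in the file, but the ANOVA table can be rebuilt from the SUMMARY block alone. The sketch below is an illustration of the single-factor formulas using the reported counts, sums and variances; the critical value assumes α = 0.05.

```r
# Figures taken from the SUMMARY block above
n        <- c(Economics = 9, Science = 7, History = 9)
total    <- c(435, 420, 393)
variance <- c(23.5, 32.33333, 50.5)

group_mean <- total / n
grand_mean <- sum(total) / sum(n)

# Between-groups and within-groups sums of squares
ss_between <- sum(n * (group_mean - grand_mean)^2)
ss_within  <- sum((n - 1) * variance)

df_between <- length(n) - 1
df_within  <- sum(n) - length(n)

# F statistic, p value and critical value (alpha = 0.05 assumed)
f_stat  <- (ss_between / df_between) / (ss_within / df_within)
p_value <- pf(f_stat, df_between, df_within, lower.tail = FALSE)
f_crit  <- qf(0.95, df_between, df_within)

print(round(c(SSB = ss_between, SSW = ss_within, F = f_stat,
              p = p_value, F_crit = f_crit), 5))
```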
ANOVA-TWO FACTOR WITHOUT REPLICATION
Students   Eco   Sci   History
A 42 69 35
B 53 54 40
C 49 58 53
D 53 64 42
E 43 64 50
ROW WISE (STUDENTS)
NULL HYPOTHESIS: THERE IS NO SIGNIFICANT DIFFERENCE IN MARKS STUDENT WISE
ALTERNATIVE HYPOTHESIS: THERE IS A SIGNIFICANT DIFFERENCE IN MARKS
STUDENT WISE
COLUMN WISE (SUBJECTS)
NULL HYPOTHESIS: THERE IS NO SIGNIFICANT DIFFERENCE IN MARKS SUBJECT WISE
ALTERNATIVE HYPOTHESIS: THERE IS A SIGNIFICANT DIFFERENCE IN MARKS
SUBJECT WISE
SUMMARY   Count   Sum   Average    Variance
A         3       146   48.66667   322.3333
B         3       147   49         61
C         3       160   53.33333   20.33333
D         3       159   53         121
E         3       157   52.33333   114.3333
Eco       5       240   48         28
Sci       5       309   61.8       34.2
History   5       220   44         54.5
ANOVA
Source of Variation   SS         df   MS         F          P-value    F crit
Rows                  60.93333   4    15.23333   0.300263   0.869889   3.837853
Columns               872.1333   2    436.0667   8.595269   0.010172   4.45897
Error                 405.8667   8    50.73333
Total                 1338.933   14
DECISION RULE:
If f is greater than f critical, reject null hypothesis.
If p(f) is less than α, reject null hypothesis.
INFERENCE:
ROW WISE
Since f (0.30) is less than f critical (3.84), we will accept null hypothesis.
Since p(f) value (0.87) is greater than α (0.05), we will accept null hypothesis.
COLUMN WISE
Since f (8.59) is greater than f critical (4.46), we will reject null hypothesis.
Since p(f) value (0.010) is less than α (0.05), we will reject null hypothesis.
CONCLUSION:
ROW WISE- there is not enough evidence of a significant difference between the marks
of the students.
COLUMN WISE- there is enough evidence that there is a significant difference between
marks of the three subjects – Economics, Science, History.
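The same two-factor analysis can be run in R with `aov()`. The sketch below is an illustration: it reshapes the marks table above into long format, with one row per student and subject, and fits an additive model without an interaction term, which corresponds to "without replication". The data frame and column names are assumptions.

```r
# Marks table above in long format: five students by three subjects
marks <- data.frame(
  student = factor(rep(c("A", "B", "C", "D", "E"), times = 3)),
  subject = factor(rep(c("Eco", "Sci", "History"), each = 5)),
  score   = c(42, 53, 49, 53, 43,    # Eco
              69, 54, 58, 64, 64,    # Sci
              35, 40, 53, 42, 50)    # History
)

# Additive model: rows (students) + columns (subjects), no interaction
fit <- aov(score ~ student + subject, data = marks)
tab <- summary(fit)[[1]]
print(tab)

# F values in model order: student (rows), subject (columns), residuals (NA)
f_values <- tab[["F value"]]
p_values <- tab[["Pr(>F)"]]
```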
ANOVA-TWO FACTOR WITH REPLICATION
Problem : To test whether or not marks of students differ with respect to school, subject wise
and school wise in conjunction with the subjects.
53 54 40
49 58 53
53 64 42
43 64 50
SCHOOL B 44 55 39
45 56 55
52 0 39
54 0 40
0 0 0
Hypothesis Testing:
Row Wise:
H0 : There is no significant difference between school A and School B
H1: There is a significant difference between school A and School B
Column Wise:
H02 : There is no significant difference between economics, science and history
H2: There is a significant difference between economics, science and history
Interaction Wise:
H03: There is no significant difference between school A and School B subject-wise (in
conjunction with subjects)
H3: There is a significant difference between school A and School B subject-wise (in
conjunction with subjects)
Anova: Two-Factor With Replication
42
Count      5          5          10
44
Count      5          5          10
Sum        111        173        284
Average    22.2       34.6       28.4
Variance   924.2      420.3      640.2667
Total
Count      10         10
Sum        420        393
Average    42         39.3
Variance   861.5556   235.5667
ANOVA
Source of Variation   SS        df   MS        F          P-value    F crit
Sample                3001.25   1    3001.25   8.376361   0.010568   4.493998
Columns               36.45     1    36.45     0.10173    0.753889   4.493998
Interaction           1140.05   1    1140.05   3.181831   0.093437   4.493998
Within                5732.8    16   358.3
Total                 9910.55   19
DECISION RULE:
If f stat is greater than f critical, reject null hypothesis.
If p value is less than α, reject null hypothesis.
INFERENCE:
Row wise:
Here, f stat (8.376) is greater than f-critical (4.493), we will reject Null hypothesis.
Here, p value (0.01) is less than α (0.05), we will reject Null hypothesis.
Column wise:
Here, f stat (0.101) is less than f-critical (4.493), we will accept Null hypothesis.
Here, p value (0.753) is greater than α (0.05), we will accept Null hypothesis.
Interaction wise:
Here, f stat (3.181) is less than f-critical (4.493), we will accept Null hypothesis.
Here, p value (0.093) is greater than α (0.05), we will accept Null hypothesis.
CONCLUSION:
Row wise:
There is enough evidence that marks of students differ significantly school wise.
Column wise:
There is not enough evidence of a difference between the marks of the three
subjects, i.e., Economics, Science and History.
Interaction:
There is no significant difference between the marks of School A and School B subject-wise.
HYPOTHESIS TESTING IN R STUDIO
How to install R studio
To install R Studio, we first need to install R. The steps to install R are as
follows:
1. Go to CRAN, click Download R for Windows, click Base, and download the installer for the
latest R version.
2. Right-click the installer file and select Run as Administrator from the pop-up menu.
3. Select the language to be used during installation.
This doesn’t change the language used by R; all messages and Help files remain in English.
4. Follow the instructions of the installer.
You can safely use the default settings and just keep clicking Next until R starts installing.
After installing R, we can install R Studio. Following are the steps
to install R Studio:
R and RStudio are not separate versions of the same program, and cannot be substituted for
one another. R may be used without RStudio, but RStudio may not be used without R.
As soon as you create a new script, the windows within your RStudio session adjust
automatically so you can see both your script and the results in your console when you run
your syntax.
Even better is the ability to call up potential syntax options while you are writing just by
using the tab key.
For example, suppose I am trying to access a variable in a data set called “teachers”, but I
haven’t memorized the variable names:
2) RStudio makes it convenient to view and interact with the objects stored in your
environment.
In the basic R GUI, you can always list the objects you have stored in your environment. But
RStudio has a very useful “Environment” window available.
This shows all of the objects that you have stored, including data; scalars, vectors, and
matrices; model outputs; etc., along with a summary of the information that is stored in those
objects.
You can even click on your data sets directly to open them and view them as spreadsheets.
3) RStudio makes it easy to set your working directory and access files on your computer.
Especially if you are working in Windows, one of the most tedious parts of programming in
R is setting your working directory to access your files.
With RStudio, you can navigate to folders on your computer in the “Files” window, view any
files you have in that folder, and set that folder as the working directory.
setwd("c:/Documents/my/working/directory")
A default working directory is a folder where RStudio goes, every time you open it. You can
change the default working directory from RStudio menu under: Tools –> Global options –>
click on “Browse” to select the default working directory you want.
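The working directory can also be checked and changed from the console. The sketch below is a minimal illustration; `tempdir()` is used only as a stand-in for a real project folder, so that the example runs on any machine.

```r
# Remember the current working directory so it can be restored
old_dir <- getwd()

# Stand-in target folder (assumption: replace with your own project path)
target <- tempdir()

# Change the working directory and confirm where we are now
setwd(target)
visited <- getwd()
print(visited)

# List the files RStudio would show for this folder in the "Files" window
print(list.files())

# Restore the original working directory
setwd(old_dir)
```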
The basic R GUI requires you to go to some lengths to save graphics as you go. But RStudio
has a window that does exactly that.
You can easily click back and forth between plots, change the sizes of your plot without
rerunning the code, and export or copy plots to include in other documents.
Top-left panel: Code editor allowing you to create and open a file containing R script. The R
script is where you keep a record of your work. An R script can be created as follows: File –>
New –> R Script.
Workspace tab: shows the list of R objects you created during your R session.
Bottom-right panel:
Plots tab: shows the history of plots you created. From this tab, you can export a plot to a PDF
or an image file.
Packages tab: shows external R packages available on your system. If checked, the package is
loaded in R.
IMPORT OF DATA SHEET IN R STUDIO
1. In the File tab, click on Import Dataset, then click From Excel.
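The same import can be scripted instead of using the menu. The sketch below assumes the `readxl` package is installed; `import_sheet` is a hypothetical helper name and `Correlation.xlsx` is a hypothetical file name, neither of which is part of this project's files.

```r
# Hypothetical helper: read one sheet of an Excel workbook into a data frame
import_sheet <- function(path, sheet = 1) {
  # readxl is an add-on package, so check for it before use
  if (!requireNamespace("readxl", quietly = TRUE)) {
    stop("Package 'readxl' is required: install.packages('readxl')")
  }
  as.data.frame(readxl::read_excel(path, sheet = sheet))
}

# Example call (hypothetical file name, left commented out):
# Correlation <- import_sheet("Correlation.xlsx")
# View(Correlation)
```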
Result
For Summary Statistics
Min. 1st Qu. Median Mean 3rd Qu. Max.
3.00 22.00 42.50 42.05 58.75 98.00
For Standard Deviation
[1] 27.31006
For Variance
[1] 745.8395
CORRELATION
Coding
cor.test(Correlation$`Advertisement in month`,Correlation$`Sales in crores`)
Result
Pearson's product-moment correlation
HYPOTHESIS TESTING: ONE SAMPLE T TEST (TWO TAILED)
Problem: To determine whether there is a significant difference between the calculated mean
age of the population and the estimated mean age of the population (40)
Age Dummy
42 0
76 0
56 0
67
65
65
89
45
45
65
78
55
44
65
76
89
54
56
56
76
45
Null Hypothesis: There is no significant difference between the calculated mean age
of the population and the estimated mean age of the population (40).
Alternate Hypothesis: There is a significant difference between the calculated mean age
of the population and the estimated mean age of the population (40).
H0 : µ = 40
H1 : µ ≠ 40
Output:
One Sample t-test
data: one_sample_t_test_2$Age
t = 7.0855, df = 20, p-value = 7.209e-07
alternative hypothesis: true mean is not equal to 40
95 percent confidence interval:
55.75844 68.90823
sample estimates:
mean of x
62.33333
Decision Rule:
If p value is less than α, reject Null Hypothesis
Inference:
Since p (7.209e-07) is less than α (0.05), reject null hypothesis.
Conclusion:
There is a significant difference between the calculated mean age of the population and
the estimated mean age of the population (40)
HYPOTHESIS TESTING: TWO SAMPLE T TEST (TWO TAILED)
Problem: To analyse that there is a significant difference between the marks scored by class
groups A & B in mathematics at α=10%
Group A Group B
76 95
87 97
98 87
78 89
76 87
78 45
76 76
88 56
78 76
87 87
87 76
87 76
76 45
89 88
65 76
78 66
89 78
87 56
87 77
Alternate Hypothesis: There is a significant difference between the marks scored by class
groups A & B in mathematics at α=10%
Null Hypothesis: There is no significant difference between the marks scored by class
groups A & B in mathematics at α=10%
H0= µa=µb, µa-µb=0
H1= µa≠µb, µa-µb≠0
Coding
t.test(twosample_t_test2$`Group A`, twosample_t_test2$`Group B`)
Output
Welch Two Sample t-test
data: twosample_t_test2$`Group A` and twosample_t_test2$`Group B`
t = 1.7863, df = 26.177, p-value = 0.08565
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-1.060474 15.165737
sample estimates:
mean of x mean of y
82.47368 75.42105
Decision Rule
If p value is less than α, reject Null Hypothesis
Inference:
Since P (0.08) is less than α (0.10), reject null hypothesis.
Conclusion:
There is a significant difference between the marks scored by class groups A & B in
mathematics at α=10%
HYPOTHESIS TESTING: TWO INDEPENDENT SAMPLE T TEST
Problem: To analyse that the time spent by full time students in studying statistics is more
than the time spent by part time students at α=0.05.
Full time   Part time
3.2 3.1
1.5 3.4
6.5 4.6
0.2 2.8
3.7 2.3
3.3 1.5
1.7 3.8
3.6 9.5
3.8 4.3
5.3 2.7
6.9 1.6
3.6 1.6
1.7 3.2
1.2 4.2
7.2 3.9
3.9 1.2
1.9 0
5.3 0
Coding
t.test(two_sample_t_test1$`Full time`, two_sample_t_test1$`Part time`, alternative = "greater")
Output
Welch Two Sample t-test
Conclusion:
There is not enough evidence that the time spent by full time students in studying statistics
is more than the time spent by part time students at α=0.05.
HYPOTHESIS TESTING: PAIRED SAMPLE T TEST
Problem: To determine that there is a significant difference between the time to finish the
race when race is completed with local shoes and branded shoes.
Athlete   Local shoes   Branded shoes
1 3.2 3.1
2 1.5 3.4
3 6.5 4.6
4 0.2 2.8
5 3.7 2.3
6 3.3 1.5
7 1.7 3.8
8 3.6 9.5
9 3.8 4.3
10 5.3 2.7
11 6.9 1.6
12 3.6 1.6
13 1.7 3.2
14 1.2 4.2
15 7.2 3.9
Hypothesis Testing:
Null hypothesis (H0): There is no significant difference between the time to finish the race
when race is completed with local shoes and branded shoes.
Alternate hypothesis (H1): There is a significant difference between the time to finish the
race when race is completed with local shoes and branded shoes.
H0 = µA = µB; µA - µB = 0 or tl = tb, tl – tb = 0
H1 = µA ≠ µB; µA - µB ≠ 0 or tl ≠ tb, tl – tb ≠ 0
Coding
t.test(`Local shoes`, `Branded shoes`, mu = 0, alternative = "two.sided", paired = TRUE, conf.level = 0.95)
Output
Paired t-test
data: Local shoes and Branded shoes
t = 0.079506, df = 14, p-value = 0.9378
alternative hypothesis: true mean difference is not equal to 0
95 percent confidence interval:
-1.558575 1.678575
sample estimates:
mean difference
0.06
Decision Rule
If p value is less than α, reject Null Hypothesis
Inference:
Since p (0.93) is greater than alpha (0.10), we will accept null hypothesis.
Conclusion:
There is no significant difference between the time to finish the race when race is completed
with local shoes and branded shoes.
HYPOTHESIS TESTING: PAIRED SAMPLE T TEST 2 (ONE TAIL)
HYPOTHESIS TESTING: F TEST
Null hypothesis:
Group 1 variance is equal to group 2 variance.
Alternate hypothesis:
Group 1 variance is greater than group 2 variance.
HYPOTHESIS TESTING: ONE-WAY ANOVA
RESEARCH PROBLEM:
x <- aov(MARKS ~ SUBJECT)
summary(x)
RESULT:-
DECISION RULE:
INFERENCE:
Since p is less than the alpha value, we will reject the null hypothesis.
CONCLUSION: