JOURNAL
IN THE MAJOR COURSE
INFERENTIAL STATISTICS
SUBMITTED BY
Mr. KASH SHARMA
SYDS
Roll No.: SDDS050A
SEMESTER III
UNDER THE GUIDANCE OF
DR. SANJAY MISHRA SIR
ACADEMIC YEAR 2023 - 2024

CERTIFICATE
This is to certify that Mr. KASH SHARMA of Second Year B.SC.DS, Div.: A, Roll No. SDDS050A, of Semester III (2023 - 2024) has successfully completed the Journal for the Major course INFERENTIAL STATISTICS as per the guidelines of KES’ Shroff College of Arts and Commerce, Kandivali (W), Mumbai-400067.
Teacher In-charge: DR. SANJAY MISHRA SIR
Principal: Dr. L Bhushan

INDEX
Sr. No.  Practical No.  Practical Name                                            Page No.
1.       1              Perform One-Sample t-test using R.                        4
2.       2              Perform Two-Sample t-test using R.                        5
3.       3              Perform Paired t-test using R.                            6
4.       4              Perform Chi-Square test using R.                          7
5.       5              Perform One way ANOVA using R.                            8
6.       6              Perform Two way ANOVA using R.                            9
7.       7              Perform Correlation and Linear Regression using R.        10
8.       8              Perform Sign Test for One-sample data using R.            11
9.       9              Perform Median Test for Two-sample data using R.          12
10.      10             Perform Sign Test for Two-sample data using R.            13

Practical No. 1:
Aim: Perform One-Sample t-test using R.
The one-sample t-test is a statistical hypothesis test that compares the mean of a sample to a known value. It's also known as a single-sample t-test. The one-sample t-test determines whether the mean of a population is statistically different from a known or hypothesized value. For example, you might want to know how your sample mean compares to the population mean.
Question: A local bakery claims that their average muffin weight is 150 grams. To test this claim, you randomly select 10 muffins and weigh them. The weights (in grams) are as follows: 155, 148, 152, 147, 150, 153, 149, 151, 156, and 154. Can you determine if the bakery's claim is accurate using a one-sample t-test in R?
Code:
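The original code listing is not reproduced in this copy. A minimal sketch of how the test could be run with base R's t.test() follows. The weights come from the question; the 0.05 significance level and the variable names are assumptions.

```r
# Muffin weights in grams, as listed in the question
weights <- c(155, 148, 152, 147, 150, 153, 149, 151, 156, 154)

# H0: population mean = 150 g; H1: population mean != 150 g (two-sided)
result <- t.test(weights, mu = 150)
print(result)

# Decision at an assumed 5% significance level
alpha <- 0.05
if (result$p.value < alpha) {
  cat("Reject H0: the mean weight differs from 150 g.\n")
} else {
  cat("Fail to reject H0: the data are consistent with a mean of 150 g.\n")
}
```

The t statistic, degrees of freedom, and p-value are available as result$statistic, result$parameter, and result$p.value.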
Output:
4|Inferential Statistics Roll No.: SDDS050A
Practical No. 2:
Aim: Perform Two-Sample t-test using R.
A two-sample t-test is a statistical test that compares the means of two different samples to determine if there is a significant difference between them. It's also known as an independent t-test. The test assumes that the samples are drawn independently from populations with normal distributions. It is typically used when two small samples (n < 30) are taken from two different populations and compared, although it also applies to larger samples.
Question: Two different ice cream shops, Shop One and Shop Two, are known for their ice cream scoop sizes. Shop One claims that their average scoop size is 140 grams, while Shop Two claims that their average scoop size is 150 grams. To investigate these claims, a random sample of 50 scoops was taken from each shop, and their weights were measured. Using the data provided, can you determine if there is a significant difference in the average scoop sizes between the two ice cream shops, assuming equal variances?
Code:
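The data referred to in the question is not reproduced in this copy, so the sketch below simulates placeholder samples with rnorm(). The seed, the standard deviation of 5 grams, and the variable names are assumptions for illustration only; real measurements should replace them.

```r
# Placeholder data: simulated because the original measurements are not available
set.seed(42)                                  # assumed seed, for reproducibility
shop_one <- rnorm(50, mean = 140, sd = 5)     # sd = 5 g is an assumption
shop_two <- rnorm(50, mean = 150, sd = 5)     # sd = 5 g is an assumption

# H0: the two population means are equal; H1: they differ
# var.equal = TRUE requests the pooled-variance (Student) t-test
result <- t.test(shop_one, shop_two, var.equal = TRUE)
print(result)
```

Leaving out var.equal = TRUE would run Welch's t-test instead, which does not assume equal variances.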
Output:
Practical No. 3:
Aim: Perform Paired t-test using R.
A paired t-test is a statistical test that compares the means of two related groups by testing whether the mean of the paired differences is zero. It's also known as a dependent or correlated t-test. The test determines if there's a significant difference between the two groups. It's used when you have two samples where observations in one sample can be paired with observations in the other sample.
Question: Do the chocolate truffle weights of two bakeries, Sweet One and Sweet Two, differ significantly, based on a sample of 100 truffles each, where Sweet One's truffles have an average weight of 14g (SD = 0.3g) and Sweet Two's truffles have an average weight of 13g (SD = 0.2g), using a paired t-test in R?
Code:
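Only summary statistics are given in the question, so the sketch below simulates placeholder weights from those means and standard deviations. The seed, the variable names, and the pairing of truffles by position are assumptions. A paired t-test compares means, so it does not measure consistency (spread) directly.

```r
# Placeholder data simulated from the summary statistics in the question
set.seed(123)                                   # assumed seed
sweet_one <- rnorm(100, mean = 14, sd = 0.3)
sweet_two <- rnorm(100, mean = 13, sd = 0.2)

# Paired t-test: H0 says the mean of the pairwise differences is zero.
# Pairing the i-th truffle of each bakery is an assumption made for illustration.
result <- t.test(sweet_one, sweet_two, paired = TRUE)
print(result)

# The sample standard deviations show the spread of each bakery's weights
cat("SD Sweet One:", sd(sweet_one), "\n")
cat("SD Sweet Two:", sd(sweet_two), "\n")
```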
Output:
Practical No. 4:
Aim: Perform Chi-Square test using R.
The chi-square test is a statistical tool that compares two categorical variables to determine if they are related or independent. The test is based on the differences between the observed values and those that would be expected if the variables were independent. Small differences indicate little dependence between the variables, while large differences indicate a dependence.
Question: Researchers collected data on the number of occurrences of events X and Y in two categories, Category A and Category B. In Category A, there were 30 occurrences of event X and 20 occurrences of event Y. In Category B, there were 15 occurrences of event X and 25 occurrences of event Y. Using a chi-square test in R, can you determine if there is a significant association between the categories and the occurrences of X and Y?
Code:
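A minimal sketch using base R's chisq.test() on the counts from the question. The matrix layout and the dimension names are assumptions. For a 2 x 2 table, chisq.test() applies Yates' continuity correction by default, so both variants are shown.

```r
# Observed counts from the question: rows are categories, columns are events
counts <- matrix(c(30, 20,
                   15, 25),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(Category = c("A", "B"),
                                 Event = c("X", "Y")))

# Default for a 2 x 2 table: Yates' continuity correction is applied
result <- chisq.test(counts)
print(result)

# Expected counts under independence
print(result$expected)

# Same test without the continuity correction
result_uncorrected <- chisq.test(counts, correct = FALSE)
print(result_uncorrected)
```

The two variants can lead to different conclusions near the 5% level, so it is worth stating which one is reported.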
Output:
Practical No. 5:
Aim: Perform One way ANOVA using R.
One-way ANOVA (analysis of variance) is a statistical test that compares the means of two or more independent groups defined by a single factor. It's also known as One-Factor ANOVA. It works by comparing the variance between the group means with the variance within the groups. It tests the null hypothesis (H0) that all population means are equal vs. the alternative hypothesis (H1) that at least one mean is different.
Question: In a study comparing three different groups (A, B, C), researchers collected data on a specific variable. Group A had values {25, 28, 30, 32, 27}, Group B had values {22, 26, 24, 30, 28}, and Group C had values {18, 20, 19, 22, 21}. Using a one-way ANOVA in R, can you determine if there are any statistically significant differences in the variable among these groups, and provide a summary of the ANOVA result?
Code:
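A minimal sketch using base R's aov() on the values from the question. The variable names are assumptions.

```r
# Observations from the question, stacked into one vector
values <- c(25, 28, 30, 32, 27,   # Group A
            22, 26, 24, 30, 28,   # Group B
            18, 20, 19, 22, 21)   # Group C
group <- factor(rep(c("A", "B", "C"), each = 5))

# Fit the one-way model and print the ANOVA table
fit <- aov(values ~ group)
anova_table <- summary(fit)
print(anova_table)

# Extract the F statistic and p-value for the group effect (first row)
f_value <- anova_table[[1]][["F value"]][1]
p_value <- anova_table[[1]][["Pr(>F)"]][1]
cat("F =", f_value, " p =", p_value, "\n")
```

If the group effect is significant, TukeyHSD(fit) can be used as a follow-up to see which pairs of groups differ.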
Output:
Practical No. 6:
Aim: Perform Two way ANOVA using R.
A two-way ANOVA (analysis of variance) is a statistical test that determines whether there is a statistically significant difference between the means of three or more independent groups that have been split on two variables. It can also be used to test for interaction between the two independent variables.
Question: In a study comparing three different groups (A, B, C), researchers collected data on a specific variable. Group A had values {25, 28, 30, 32, 27}, Group B had values {22, 26, 24, 30, 28}, and Group C had values {18, 20, 19, 22, 21}. Using a one-way ANOVA in R, can you determine if there are any statistically significant differences in the variable among these groups, and provide a summary of the ANOVA result?
Code:
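The question supplies only one factor (the group), while a two-way ANOVA needs two. The sketch below therefore introduces a hypothetical second factor, block, that treats the position of each value within its group as a block. This factor is an assumption made purely to illustrate the syntax and is not part of the original data.

```r
# Observations from the question
values <- c(25, 28, 30, 32, 27,   # Group A
            22, 26, 24, 30, 28,   # Group B
            18, 20, 19, 22, 21)   # Group C
group <- factor(rep(c("A", "B", "C"), each = 5))

# Hypothetical second factor: position of each value within its group
block <- factor(rep(1:5, times = 3))

# Additive two-way model. With one observation per cell, the interaction
# term cannot be estimated, so only the two main effects are included.
fit <- aov(values ~ group + block)
anova_table <- summary(fit)
print(anova_table)
```

With replicated data (several observations per combination of the two factors), the formula values ~ group * block would also test the interaction.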
Output:
Practical No. 7:
Aim: Perform Correlation and Linear Regression using R.
Correlation is a statistical measure that describes the relationship between two variables. Pearson's correlation coefficient measures the extent to which two variables are linearly related, meaning they change together at a constant rate. It ranges from -1 to +1. Correlation is a common tool for describing simple relationships without making a statement about cause and effect.
Linear regression is a statistical method used in data science and machine learning to model the relationship between two variables and to predict one from the other. It assumes a linear relationship between the independent variable and the dependent variable. The goal is to find the best-fitting line that describes the relationship.
Question: Suppose you are analyzing the relationship between the number of hours spent studying (variable x) and the exam scores achieved (variable y) for a group of students. Using R, can you determine the strength and direction of the relationship between study hours and exam scores by calculating the correlation coefficient (Pearson's) and performing a linear regression analysis?
Code:
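The question does not list any data, so the sketch below uses a small hypothetical data set. The values of hours and scores are assumptions for illustration only.

```r
# Hypothetical data (not from the original journal)
hours  <- c(1, 2, 3, 4, 5, 6, 7, 8)          # study hours (x)
scores <- c(52, 55, 61, 64, 70, 73, 79, 82)  # exam scores (y)

# Pearson correlation coefficient and its significance test
r <- cor(hours, scores, method = "pearson")
cor_result <- cor.test(hours, scores, method = "pearson")
cat("Pearson r =", r, "\n")
print(cor_result)

# Simple linear regression of scores on hours
fit <- lm(scores ~ hours)
print(summary(fit))

# Intercept and slope of the fitted line
intercept <- unname(coef(fit)["(Intercept)"])
slope <- unname(coef(fit)["hours"])
cat("Fitted line: scores =", intercept, "+", slope, "* hours\n")
```

The sign of r gives the direction of the relationship and its magnitude gives the strength. In simple linear regression the slope equals r multiplied by sd(scores) / sd(hours).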
Output:
Practical No. 8:
Aim: Perform Sign Test for One-sample data using R.
The one-sample sign test is a non-parametric hypothesis test that determines if there is a statistically significant difference between the median of a non-normally distributed continuous data set and a standard.
Question: In statistical analysis, determining whether a dataset's positive and negative values are balanced is a crucial step. You have a dataset with values: [-1, 2, -3, 4, -5, 6, -7, 8]. Using the binomial test in R, can you assess whether the number of positive values differs significantly from the number of negative values with a null hypothesis of equal proportions? Please present the results of the binomial test and your interpretation.
Code:
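A minimal sketch using base R's binom.test() on the values from the question. The variable names are assumptions. Zeros, if there were any, would be dropped because they carry no sign.

```r
# Data from the question
x <- c(-1, 2, -3, 4, -5, 6, -7, 8)

# Count positive and negative signs (zeros are excluded)
n_pos <- sum(x > 0)
n_neg <- sum(x < 0)

# H0: positive and negative signs are equally likely (p = 0.5)
result <- binom.test(n_pos, n_pos + n_neg, p = 0.5)
print(result)
```

Here the signs are split evenly, four positive and four negative, so the test gives no evidence against the null hypothesis of equal proportions.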
Output:
Practical No. 9:
Aim: Perform Median Test for Two-sample data using R.
The median test is a nonparametric test that determines if two independent samples were drawn from populations with the same median.
Question: Using the provided data for Group 1 and Group 2, perform a Median Test to assess whether their medians differ significantly.
Code:
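The data for Group 1 and Group 2 is not reproduced in this copy, so the sketch below uses hypothetical values. Base R has no dedicated median-test function, so the test is built by hand: each observation is classified as above or not above the grand median, and the resulting 2 x 2 table is tested. Using fisher.test() instead of the chi-square approximation is an assumption made here because the samples are small.

```r
# Hypothetical data (not from the original journal)
group1 <- c(12, 15, 14, 10, 18, 20, 11, 16)
group2 <- c(22, 25, 17, 19, 24, 21, 23, 13)

# Grand median of the pooled sample
pooled <- c(group1, group2)
grand_median <- median(pooled)

# Classify each observation as above the grand median or not
above <- pooled > grand_median
group <- factor(rep(c("Group 1", "Group 2"),
                    times = c(length(group1), length(group2))))
tab <- table(group, above)
print(tab)

# H0: both groups come from populations with the same median
result <- fisher.test(tab)
print(result)
```

For larger samples, chisq.test(tab) gives the usual chi-square version of the median test.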
Output:
Practical No. 10:
Aim: Perform Sign Test for Two-sample data using R.
The two-sample paired sign test is a nonparametric test that assesses the number of observations in one group that are greater than paired observations in the other group. The test is based on the direction of the plus and minus signs of the differences, and not on their numerical magnitude.
Question: Given paired datasets 'before' and 'after' representing symptom scores of patients before and after taking a new medication, perform a Sign Test to ascertain whether there is a statistically significant reduction in symptoms due to the medication.
Code:
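The 'before' and 'after' values are not reproduced in this copy, so the sketch below uses hypothetical symptom scores. The test counts how many patients improved among those whose score changed and applies binom.test(). The one-sided alternative is an assumption that follows from the question asking about a reduction.

```r
# Hypothetical symptom scores (not from the original journal)
before <- c(8, 7, 9, 6, 8, 7, 9, 8, 6, 7)
after  <- c(6, 7, 7, 5, 8, 5, 6, 7, 5, 4)

# Positive difference means the score went down after medication
differences <- before - after

# Ties (zero differences) carry no sign and are dropped
n_reduced <- sum(differences > 0)
n_nonzero <- sum(differences != 0)

# H0: a reduction and an increase are equally likely (p = 0.5)
# H1: reductions are more likely than increases (one-sided)
result <- binom.test(n_reduced, n_nonzero, p = 0.5, alternative = "greater")
print(result)
```

A small p-value would indicate that reductions occur more often than chance alone would explain.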
Output: