Chapter 1 INTRODUCTION TO DATA

Notes for computer science

Uploaded by zarahrasheed1

Statistics is the science of collecting, summarizing, analyzing, interpreting, and presenting data in order to draw valid conclusions. It is divided into two branches: descriptive and inferential statistics.

Descriptive Statistics: This involves methods for collecting and presenting information using graphs and numerical values.

Inferential Statistics: This involves the use of probability to generalize from a sample to the larger population from which it was drawn, in order to draw conclusions about that population.

DATA AND DATA SOURCES

Statistical data are the raw facts of statistics. They may relate to an activity under study, a phenomenon, or a situation of interest. Statistical data are obtained by measuring, counting, and/or observing. An activity or phenomenon that generates data is termed a variable. In other words, a variable is a characteristic that takes on different values upon successive measurements. In statistics, data are classified into two categories: quantitative data and qualitative data. This classification is based on the kind of characteristic that is measured.

Quantitative Data: These are data that can be expressed numerically, that is, quantified in definite units of measurement.

Examples: age of students taking STS 102, scores in the UTME exam, etc. These observations are expressed using numbers.

Depending on the nature of the variable being measured, quantitative data can be further categorized as continuous or discrete data.

Qualitative Data: These are data that cannot be expressed in numbers or quantified in units of measurement. Examples include blood group, sex, nationality, etc. These data are further classified as nominal or rank (ordinal) data.

DATA SOURCES

The sources of data are divided into primary and secondary sources.

Primary Data: These are data collected directly from the respondents. They are regarded as first-hand information collected by the researcher. Primary data can be obtained from, for example:

• Census

• Survey

Secondary Data: These are data that already exist in published or unpublished sources. They are available from existing source(s) but may not necessarily be in the form actually required.

Examples of sources of secondary data include:

• Journal publications

• Research or media organizations

Methods of Data Collection

The method of data collection depends on the problem at hand. There are various methods of collecting data, namely:

• Interviewing

• Questionnaire

• Observation

• Telephone

Data Presentation

A set of raw data is organized numerically for ease of analysis and presentation. This is done by creating a frequency table, also known as a frequency distribution. Presenting data in tables, charts, and graphs makes the meaning of the data clearer.

Basic Terms

Class Interval: A symbol defining a class, e.g. 60–62, is called a class interval. The end numbers, 60 and 62, are called class limits; the smaller number (60) is the lower class limit, and the larger number (62) is the upper class limit.

Class Boundaries: The class boundaries are obtained by adding the upper limit of one class interval to the lower limit of the next-higher class interval and dividing by 2.

Class Width or Class Size: The size, or width, of a class interval is the difference between the lower and upper class boundaries; it is also referred to as the class width, class size, or class length. If all class intervals of a frequency distribution have equal widths, this common width is denoted by c. In such a case, c is equal to the difference between two successive lower class limits or two successive upper class limits.

Class Mark: The class mark is the midpoint of the class interval and is obtained by adding the lower and upper class limits and dividing by 2. The class mark is also called the class midpoint.

Frequency: A frequency is the number of times a value of the data occurs.

Relative Frequency: A relative frequency is the ratio (fraction or proportion) of the number of times a value of the data occurs to the total number of observations. To find the relative frequencies, divide each frequency by the total number of observations in the sample, n.

Cumulative Frequency: The cumulative frequency of a class is the sum of the frequency of that class and the frequencies of all the classes before it.
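As a rough illustration of the terms above, the following Python sketch computes class boundaries, class marks, and class widths. Only the class 60–62 comes from these notes; the neighbouring classes 63–65 and 66–68 are assumed for illustration, as is the convention that the outer boundaries lie half a gap beyond the outer limits.

```python
# Class limits as (lower, upper) pairs. Only 60-62 appears in the notes;
# the other two classes are assumed for illustration.
limits = [(60, 62), (63, 65), (66, 68)]

# Gap between the upper limit of one class and the lower limit of the next.
gap = limits[1][0] - limits[0][1]

rows = []
for lower, upper in limits:
    # Boundary = (upper limit of one class + lower limit of the next) / 2,
    # which is the same as moving half a gap outward from each limit.
    lower_boundary = lower - gap / 2
    upper_boundary = upper + gap / 2
    class_mark = (lower + upper) / 2           # midpoint of the class limits
    width = upper_boundary - lower_boundary    # difference of the boundaries
    rows.append((lower_boundary, upper_boundary, class_mark, width))

for (lower, upper), (lb, ub, mark, width) in zip(limits, rows):
    print(f"{lower}-{upper}: boundaries {lb}-{ub}, class mark {mark}, width {width}")
```

Under these assumptions the common width c is 3, which matches the difference between successive lower limits (63 - 60).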

Frequency Distribution

Frequency distributions are classified as grouped or ungrouped.

Ungrouped Frequency Distribution: This is used for quantitative data sets and works best when the range of the data is less than 10 units. The range is the difference between the largest data value and the smallest data value. For example, twenty students were asked how many hours they worked per day. Their responses, in hours, are as follows:

5; 6; 3; 3; 2; 4; 8; 5; 2; 3; 5; 6; 5; 4; 4; 3; 5; 2; 5; 3.

Range = 8 - 2 = 6

Since the range is 6, we keep each data value separate and do not group them. Creating an ungrouped frequency distribution is a simple task. List the data values from smallest to largest in the first column, without skipping any values. Place the frequency, the count of each data value, in the corresponding row of the second column.

The table below shows the data values in ascending order with their frequencies. Notice that every value in the range is listed, including 7, which does not occur in the original data set.

Data Values    Frequency (f)
2              3
3              5
4              3
5              6
6              2
7              0
8              1

Table: Frequency distribution of students' work hours
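The ungrouped table for the work-hours data can be reproduced with a short Python sketch. The use of `collections.Counter` and the extra relative and cumulative frequency columns are choices made for this illustration, not part of the original notes.

```python
from collections import Counter

# Hours worked per day by the twenty students (from the notes).
hours = [5, 6, 3, 3, 2, 4, 8, 5, 2, 3, 5, 6, 5, 4, 4, 3, 5, 2, 5, 3]

counts = Counter(hours)
n = len(hours)

table = []
running_total = 0
# List every value from smallest to largest, including values that never occur.
for value in range(min(hours), max(hours) + 1):
    f = counts[value]          # Counter gives 0 for a value that is absent
    running_total += f
    table.append((value, f, f / n, running_total))

print("value  f  relative  cumulative")
for value, f, relative, cumulative in table:
    print(f"{value:5d}  {f}  {relative:8.2f}  {cumulative:10d}")
```

The value 7 appears with frequency 0, and the last cumulative frequency equals the number of students.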

Grouped Frequency Distribution

This second type of frequency distribution is also used for quantitative data. However, it is used when the range is large and the data values need to be grouped together. For example, 28 students were asked how many hours they worked per week. Their responses, in hours, are as follows:

15; 26; 13; 33; 22; 14; 27; 15; 32; 23; 5; 26; 25; 14; 34; 13; 15; 22; 15; 28; 10; 18; 21; 24; 20; 18; 34; 20.

Here there are too many different data values to list separately, as was done in the ungrouped frequency distribution. Notice that the range is 29 (highest - lowest = 34 - 5). We therefore construct a grouped frequency distribution and group the data values into classes.

A class is an interval; the lowest value of the interval is known as the lower limit and the highest value is known as the upper limit.

Guidelines for classes:

• There should be between 5 and 20 classes.

• Classes must be mutually exclusive (no overlap of data values).

• Classes must be all-inclusive and continuous.

• Classes must be equal in width.

Constructing a Grouped Frequency Distribution:

1. Find the range (R): highest data value - lowest data value.

2. Determine the number of classes (C), usually a minimum of 5 and a maximum of 20. There are several suggested guidelines to help decide how many class intervals to use. Two such rules are:

(a) C = 1 + 3.322 log10(n)

(b) C = √n

where n is the number of observations.

3. Determine the width of the class interval (W), given by W = R/C, where R is the range of values and C is the number of classes.

Note: The class width is rounded up to the next whole number.

4. Choose the first lower limit (usually the lowest data value).

5. Create the other lower limits by adding the class width to the previous lower limit.

6. Create the upper limits so that the classes do not overlap; for whole-number data, each upper limit is one less than the next lower limit.

7. Determine the number of observations falling into each class interval, i.e. find the class frequencies.
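The construction steps above can be sketched in Python as follows. This is a minimal sketch for whole-number data; the function names `sturges_classes`, `sqrt_classes`, and `grouped_frequency` are hypothetical, and rounding the class count to the nearest whole number is an assumption.

```python
import math

def sturges_classes(n):
    """Rule (a): C = 1 + 3.322 * log10(n), rounded to the nearest whole number."""
    return round(1 + 3.322 * math.log10(n))

def sqrt_classes(n):
    """Rule (b): C = sqrt(n), rounded to the nearest whole number."""
    return round(math.sqrt(n))

def grouped_frequency(data, num_classes):
    """Return [((lower, upper), frequency), ...] for whole-number data."""
    lowest, highest = min(data), max(data)
    data_range = highest - lowest                          # step 1
    width = max(1, math.ceil(data_range / num_classes))    # step 3, rounded up
    table = []
    lower = lowest                                         # step 4
    while lower <= highest:                                # steps 5 to 7
        upper = lower + width - 1                          # no overlap with next class
        frequency = sum(lower <= x <= upper for x in data)
        table.append(((lower, upper), frequency))
        lower += width
    return table

# Hours worked per week by the 28 students (from the notes), using 5 classes.
hours = [15, 26, 13, 33, 22, 14, 27, 15, 32, 23, 5, 26, 25, 14,
         34, 13, 15, 22, 15, 28, 10, 18, 21, 24, 20, 18, 34, 20]
table = grouped_frequency(hours, 5)
for (lower, upper), frequency in table:
    print(f"{lower}-{upper}: {frequency}")
```

One design note: when R/C is already a whole number, the loop adds one extra class so that the largest value is still covered, so the result may contain C + 1 classes.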
Example 1: The following are the marks of 50 students in STS 102:

48 70 60 47 51 55 59 63 68 63 47 53 72 53 67 62 64 70 57 56

48 51 58 63 65 62 49 64 53 59 63 50 61 67 72 56 64 66 49 52

61 71 58 53 63 69 59 64 73 56.

(a) Construct a frequency table for the above data.

(b) Answer the following questions using the table obtained:

(i) How many students scored between 51 and 62?

(ii) How many students scored above 50?

(iii) What is the probability that a student selected at random from the class scored less than 63?

Solution:

(a) Range (R) = largest value - smallest value = 73 - 47 = 26

Number of classes (C) = √n = √50 = 7.07 ≈ 7

Class size or width (W) = R/C = 26/7 = 3.7 ≈ 4

Frequency Table

Marks    Tally          Frequency (f)
47-50    |||| ||        7
51-54    |||| ||        7
55-58    |||| ||        7
59-62    |||| |||       8
63-66    |||| |||| |    11
67-70    |||| |         6
71-74    ||||           4

(b) (i) Students who scored between 51 and 62: 7 + 7 + 8 = 22

(ii) Students who scored above 50: 7 + 7 + 8 + 11 + 6 + 4 = 43

(iii) Students who scored less than 63: 7 + 7 + 7 + 8 = 29

Total number of students = 50

Prob(less than 63) = 29/50 = 0.58
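The answers in Example 1 can be checked against the raw marks with a short Python sketch. The variable names are illustrative only; the classes are rebuilt from the lowest mark (47) and the class width (4) found in the solution.

```python
# Marks of the 50 students in STS 102 (from Example 1).
marks = [48, 70, 60, 47, 51, 55, 59, 63, 68, 63, 47, 53, 72, 53, 67, 62, 64, 70, 57, 56,
         48, 51, 58, 63, 65, 62, 49, 64, 53, 59, 63, 50, 61, 67, 72, 56, 64, 66, 49, 52,
         61, 71, 58, 53, 63, 69, 59, 64, 73, 56]

width = 4
# Lower limits 47, 51, ..., 71; each upper limit is width - 1 above its lower limit.
classes = [(lower, lower + width - 1) for lower in range(47, 74, width)]
frequencies = [sum(lower <= m <= upper for m in marks) for lower, upper in classes]

n = len(marks)
between_51_and_62 = sum(51 <= m <= 62 for m in marks)   # part (b)(i)
above_50 = sum(m > 50 for m in marks)                    # part (b)(ii)
below_63 = sum(m < 63 for m in marks)                    # part (b)(iii)
prob_below_63 = below_63 / n

for (lower, upper), f in zip(classes, frequencies):
    print(f"{lower}-{upper}: {f}")
print(between_51_and_62, above_50, below_63, prob_below_63)
```

Counting directly from the raw marks gives the same totals as adding the class frequencies, because the cut-off points 50 and 62 are upper limits of classes in the table.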

Example 2: Twenty-eight students were asked how many hours they worked per week. Their responses, in hours, are as follows:

15; 26; 13; 33; 22; 14; 27; 15; 32; 23; 5; 26; 25; 14; 34; 13; 15; 22; 15; 28; 10; 18; 21; 24; 20; 18; 34; 20.

Construct a grouped frequency distribution using 5 classes.

Solution:

1. Range = 34 - 5 = 29

2. Use 5 classes.

3. Class width = 29/5 = 5.8, rounded up to 6.

4. The first lower limit is 5, which is the minimum data value.

5. The other lower limits are 11, 17, 23, and 29, obtained by adding the class width of 6 to the previous lower limit.

6. The first upper limit is 10, since the next class begins at 11. Adding the class width again, the other upper limits are 16, 22, 28, and 34.

Class    Tally       Frequency (f)
5-10     ||          2
11-16    |||| |||    8
17-22    |||| ||     7
23-28    |||| ||     7
29-34    ||||        4
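The Example 2 table can be extended with class boundaries, class marks, relative frequencies, and cumulative frequencies, as defined under Basic Terms. The following Python sketch assumes whole-number data, so each boundary lies 0.5 beyond the class limit; the dictionary keys are illustrative names.

```python
# Classes and frequencies from the Example 2 table.
classes = [(5, 10), (11, 16), (17, 22), (23, 28), (29, 34)]
frequencies = [2, 8, 7, 7, 4]
n = sum(frequencies)

rows = []
cumulative = 0
for (lower, upper), f in zip(classes, frequencies):
    cumulative += f
    rows.append({
        "boundaries": (lower - 0.5, upper + 0.5),  # halfway to the neighbouring class
        "mark": (lower + upper) / 2,               # class midpoint
        "relative": f / n,                         # frequency divided by n
        "cumulative": cumulative,                  # running total of frequencies
    })

for (lower, upper), row in zip(classes, rows):
    print(f"{lower}-{upper}", row)
```

The final cumulative frequency equals n = 28, and the relative frequencies add up to 1.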

ASSIGNMENT 1

The following data represent the ages (in years) of people living in a housing estate in Abeokuta.

18 31 30 6 16 17 18 43 2 8 32 33 9 18 33 19 21 13 13 14

14 6 52 45 61 23 26 15 14 15 14 27 36 19 37 11 12 11 20 12

39 20 40 69 63 29 64 27 15 28.

Present the above data in a frequency table showing the following columns: class interval, class boundary, class mark (mid-point), tally, frequency, and cumulative frequency.

ASSIGNMENT 2

The grade points of 40 students are given below. Using 8 classes, construct a frequency distribution and a relative frequency distribution.

48 70 60 47 51 55 59 63 68 63 47 53 72 53 67 62 64 70 57 56

48 51 58 63 65 62 49 64 53 59 63 50 61 67 72 56 64 66 49 52
