CH 4 Handout

This chapter discusses data merging techniques in data science, highlighting the importance of combining datasets from different sources while addressing potential issues such as naming conventions and data formatting. It covers various types of joins (one-to-one, one-to-many, and many-to-many) and explains statistical concepts like standard deviation, z-scores, percentiles, quartiles, and deciles. The chapter aims to provide a comprehensive understanding of how to effectively merge and analyze data for better insights.

Uploaded by

ayan.infernogod

CHAPTER

DATA MERGING

Studying this chapter should enable you to understand:
1. How to merge data sets?
2. What is Standard Deviation and what are different ways to calculate it?

1. Overview of Data Merging

In Data Science, data merging is the process of combining two or more data sets into a single data frame. This process is necessary when we have raw data stored in multiple files or data tables that we want to analyze all in one go.

However, when merging data from different sources, many issues arise that must be corrected for the merge to succeed. Different data sources often follow naming conventions that differ from those of the main data source. They may also group the data differently, and so on. Often the additional data source was created at a very different time, by different people, with a different objective and different use cases. Owing to all these factors, it should not be surprising if there are many differences between data sources.

In this topic, we will explore various ways of simplifying the process of data merging. There are many places where these data merging techniques will help you. For example, you may have two different systems that operate in parallel with each other. Suppose you have to analyze the relationship between them, where one is a legacy system with very poorly formatted data that you want to integrate with your new system. This is where data merging comes into the picture. Let us now dive deep into data merging techniques.

We can perform data merging by implementing data joins on the data tables involved. There are three categories of data joins:

1. One to One Joins
2. One to Many Joins
3. Many to Many Joins

One to One Joins

A one-to-one join is probably one of the simplest join techniques. In this type of join, each row in one table is linked to a single row in another table using a “key” column.

For example, in a company database, each employee has only one Employee ID, and each Employee ID is assigned to only one employee.

In the database, a one-to-one relationship looks like this:

[Figure: Employees table and Contact Info table linked by Employee ID]

In this example, the “key” field in each table is “Employee ID”. This “key” field is designed to contain unique values. In the Employees table, the Employee ID field is the primary key; in the Contact Info table, the Employee ID field is a foreign key.

The one-to-one relationship returns the related records when the value in the Employee ID field in the Contact Info table is the same as the value in the Employee ID field in the Employees table.

This is how a one-to-one join works: by merging the data tables using this primary key.

One to Many Joins

In a one-to-many join, one record in a table can be related to one or many records in another table. For example, each student can borrow multiple books from the school library.

In the database, a one-to-many relationship looks like this:

[Figure: Students table and Library table linked by Student ID]

In this example, the primary key field in the Students table, Student ID, is designed to contain unique values. The foreign key field in the Library table, Student ID, is designed to allow multiple instances of the same value.
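The two table designs described so far can be sketched with Python's built-in sqlite3 module. This is only an illustrative sketch: the lowercase table and column names, the sample rows, and the UNIQUE constraint used to keep the Contact Info relationship one-to-one are assumptions made for the example, not part of the handout.

```python
import sqlite3

# In-memory database; every row below is invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, name TEXT);
    -- UNIQUE on the foreign key keeps the relationship one-to-one.
    CREATE TABLE contact_info (
        employee_id INTEGER UNIQUE REFERENCES employees(employee_id),
        phone TEXT
    );
    CREATE TABLE students (student_id INTEGER PRIMARY KEY, name TEXT);
    -- No UNIQUE here: the same student_id may appear on many rows.
    CREATE TABLE library (
        book_title TEXT,
        student_id INTEGER REFERENCES students(student_id)
    );
    INSERT INTO employees VALUES (1, 'Asha'), (2, 'Ravi');
    INSERT INTO contact_info VALUES (1, '555-0101'), (2, '555-0102');
    INSERT INTO students VALUES (10, 'Meera'), (11, 'Kabir');
    INSERT INTO library VALUES ('Algebra', 10), ('Physics', 10), ('History', 11);
""")

# One-to-one: each employee matches exactly one contact row.
one_to_one = conn.execute("""
    SELECT e.employee_id, e.name, c.phone
    FROM employees AS e
    JOIN contact_info AS c ON c.employee_id = e.employee_id
    ORDER BY e.employee_id
""").fetchall()

# One-to-many: a student appears once per borrowed book.
one_to_many = conn.execute("""
    SELECT s.student_id, s.name, l.book_title
    FROM students AS s
    JOIN library AS l ON l.student_id = s.student_id
    ORDER BY s.student_id, l.book_title
""").fetchall()
conn.close()

print(one_to_one)   # two rows, one per employee
print(one_to_many)  # three rows, two of them for student 10
```

In the one-to-many result, the student with two books appears on two rows, which is what distinguishes it from the one-to-one result.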
The one-to-many relationship returns the related records when the value in the Student ID field in the Library table is the same as the value in the Student ID field in the Students table.

This is how a one-to-many join works: by merging the tables using the primary key, which demonstrates the one-to-many relationship.

Many to Many Joins

A many-to-many relationship is said to occur when multiple records in one table are related to multiple records of another table. For example, a many-to-many relationship exists between students and courses. A student can register for multiple courses. A course can have multiple students.

It is not easy to perform a join on tables which have a many-to-many relationship. As a workaround, you can break a many-to-many relationship into two one-to-many relationships by using a third table, called a join table. Every record in a join table contains match fields that hold the values of the primary keys of the two tables it joins. In a join table, these match fields are usually called foreign keys. The foreign keys are populated with data as records in the join table are created from either table that it joins.

The example below shows the Students table, which contains a record for every student, and the Courses table, which contains a record for each course. A join table called Enrollments creates two one-to-many relationships, one with each of the two tables.

[Figure: Students, Courses and Enrollments tables]

The primary key Student ID is a unique identifier for every student in the Students table. The primary key Course ID is a unique identifier for every course in the Courses table. The Enrollments table carries the foreign keys Student ID and Course ID.

To set up a join table for a many-to-many relationship:

1. Using the above example, create a table called Enrollments. This will act as the join table.
2. In the Enrollments table, make a Student ID field and a Course ID field.
3. Create a relationship between the Student ID fields in the Students and Enrollments tables. Then create a relationship between the Course ID fields in the Courses and Enrollments tables.

With this design, if a student registers for four courses, the student has only one record in the Students table and four records in the Enrollments table, one for each course the student is enrolled in.

2. What is Z-Score?

A z-score describes the position of a point in terms of its distance from the mean, measured in standard deviation units. The z-score is positive if the value lies above the mean and negative if it lies below the mean.

The z-score is also known as the standard score, as it allows comparison of scores on different types of variables by standardizing the distribution.

A standard normal distribution is a normally shaped distribution with a mean of 0 and a standard deviation of 1.

3. How to calculate a Z-score?

The mathematical formula for calculating the z-score is as follows:

z = (x - μ)/σ

Where,

x = raw score
μ = population mean
σ = population standard deviation

Thus, the z-score is the raw score minus the population mean, divided by the population standard deviation.

When the population mean and the population standard deviation are unknown, the standard score can be calculated using the sample mean (x̄) and the sample standard deviation as estimates of the population values.

Now we will consider an example that illustrates the use of the z-score formula. Consider a population of children whose weights are normally distributed. Suppose further that the mean of the distribution is 10 kg and the standard deviation is 2 kg. Now consider the questions below:

1. What is the z-score for 12 kg?
2. What is the z-score for 5 kg?
3. How many kg correspond to a z-score of 1.25?

For the first question, we simply plug x = 12 into the z-score formula. The result is: (12 - 10)/2 = 1.

This means that 12 is one standard deviation above the mean.
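The z-score formula and the three questions above can be checked with a short Python sketch. The function names `z_score` and `raw_score` are hypothetical helpers introduced here for illustration; `raw_score` is simply the formula rearranged to x = μ + zσ.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / std_dev


def raw_score(z, mean, std_dev):
    """Invert the z-score formula: x = mean + z * std_dev."""
    return mean + z * std_dev


# Values from the weights example: mean 10 kg, standard deviation 2 kg.
MEAN_KG = 10.0
STD_DEV_KG = 2.0

print(z_score(12, MEAN_KG, STD_DEV_KG))      # 1.0
print(z_score(5, MEAN_KG, STD_DEV_KG))       # -2.5
print(raw_score(1.25, MEAN_KG, STD_DEV_KG))  # 12.5
```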
The second question is very similar. Simply put x = 5 into the formula. The result is:

(5 - 10)/2 = -2.5

The interpretation of this is that 5 is 2.5 standard deviations below the mean.

For the last question, we already know the z-score. We plug z = 1.25 into the formula and use basic algebra to solve for x:

1.25 = (x - 10)/2

Multiply both sides by 2:

2.5 = (x - 10)

Add 10 to both sides:

12.5 = x

Hence, we see that 12.5 kg corresponds to a z-score of 1.25.

4. How to interpret the Z-score?

The value of a z-score always tells us how many standard deviations we are away from the mean. For example, if the z-score is equal to 0, the value is on the mean.

A positive z-score tells us that the raw score is higher than the mean. For example, if the z-score is equal to +2, the value is 2 standard deviations above the mean.

A negative z-score tells us that the raw score is below the mean. For example, if the z-score is equal to -3, the value is 3 standard deviations below the mean.

5. Why is a Z-score so important?

It is very helpful to standardize the values of a normal distribution by converting them into z-scores because:

1. It gives us an opportunity to calculate the probability of a value occurring within a normal distribution.
2. It allows us to compare two values that come from different samples.

6. Concept of Percentiles

The maximum value of a distribution can be considered in an alternative way. We can represent it as the value in a set of data having 100% of the observations at or below it. When we consider the maximum value this way, it is called the 100th percentile.

The percentile of a value can be defined as the percentage of the total ordered observations at or below it. Therefore, the pth percentile of a distribution is the value such that p percent of the ordered observations fall at or below it.

Consider the following data set: [10, 12, 15, 17, 13, 22, 16, 23, 20, 24]

Here, if we want to find the percentile for the element 22, we follow the steps below:

▪ Sort the dataset in ascending order. Once sorted, the dataset will look like [10, 12, 13, 15, 16, 17, 20, 22, 23, 24]
▪ The number of values at or below the element 22 is 8. The total number of elements in the dataset is 10.
▪ Thus, going by the definition, 80 percent of the values are at or below the element 22. The element 22 is therefore at the 80th percentile.

7. Quartiles

The quartiles of a dataset partition the data into four equal parts, with one-fourth of the data values in each part. The total of 100% is divided at 25%, 50%, 75% and 100%. Since the median is defined as the middlemost value of the observations, the median has 50% of the observations at or below it. Thus, the second quartile (Q2), or the 50th percentile, is the median.

The most frequently used percentiles other than the median are the 25th percentile and the 75th percentile. The 25th percentile defines the first quartile, the 75th percentile defines the third quartile, and the 100th percentile represents the fourth quartile.

The first quartile is the median of all the values to the left of the actual median (Q2). Similarly, the third quartile is the median of all the values to the right of the actual median (Q2).

Using the values of the quartiles, we can also find the interquartile range. The interquartile range (IQR) measures the spread of the middle 50% of the values when ordered from lowest to highest. It is calculated by subtracting the first quartile (Q1) from the third quartile (Q3).

IQR = Q3 - Q1

Let us consider the following 10 data points:

[10, 20, 30, 40, 50, 60, 70, 80, 90, 100]

Here, as there are ten values (an even number of values), the median is halfway between the fifth and sixth data values, which gives us 55 as the median, or Q2.

The first quartile, Q1, is the median of all the values to the left of Q2. Here those values are 10, 20, 30, 40 and 50, so Q1 = 30.

The third quartile, Q3, is the median of all the values to the right of Q2. Here those values are 60, 70, 80, 90 and 100, so Q3 = 80.

The interquartile range (IQR) can be calculated as Q3 - Q1, which is 80 - 30 = 50.
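The percentile and quartile calculations above can be reproduced with a minimal Python sketch. The helper names `percentile_rank` and `quartiles` are hypothetical, and the handling of an odd number of values (the middle value is left out of both halves) is an assumption, since the text only works an even-sized example. Statistical libraries often use other quartile conventions and may return slightly different values.

```python
from statistics import median


def percentile_rank(data, value):
    """Percentage of observations that are at or below `value`."""
    at_or_below = sum(1 for v in data if v <= value)
    return 100 * at_or_below / len(data)


def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves rule from the text."""
    ordered = sorted(data)
    if len(ordered) < 2:
        raise ValueError("need at least two values")
    half = len(ordered) // 2
    lower_half = ordered[:half]    # values to the left of the median
    upper_half = ordered[-half:]   # values to the right of the median
    # Assumption: for an odd count, the middle value is in neither half.
    return median(lower_half), median(ordered), median(upper_half)


scores = [10, 12, 15, 17, 13, 22, 16, 23, 20, 24]
print(percentile_rank(scores, 22))  # 80.0

points = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
q1, q2, q3 = quartiles(points)
print(q1, q2, q3)  # 30 55.0 80
print(q3 - q1)     # 50
```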
An important application of quartiles is in the temperature ranges for the day given in a weather report. In the presence of irregularities, the range can be significantly influenced by them. Hence, it is preferred to use the IQR instead, thereby ignoring the top 25 percent and the bottom 25 percent of the data points. In the presence of irregularities, the IQR is more robust and a better representation of the amount of spread in the data.

8. Deciles

Just like quartiles, we have deciles. While quartiles sort the data into four quarters, deciles sort the data into ten equal parts: the 10th, 20th, 30th, 40th, 50th, 60th, 70th, 80th, 90th and 100th percentiles.

The higher the place in the decile ranking, the higher is the overall ranking. For example, a person scoring at the 99th percentile in a test would be placed in decile 10. However, a person scoring at the 5th percentile in the same test would be placed in decile 1.

The mathematical formula to calculate a decile is:

Di = i * (n + 1)/10 th data

Where n is the number of data values in the population or sample, and Di is the ith decile, that is, the value found at position i * (n + 1)/10 in the sorted data. The deciles can be represented as:

1st Decile, D1 = 1 * (n + 1)/10 th data
2nd Decile, D2 = 2 * (n + 1)/10 th data

and so on.

Steps to calculate a decile:

a. Find the number of data values in the sample or population. This is denoted by n.
b. Sort all the data values in the sample or population in ascending order.
c. Based on the decile that is required, calculate its position using the formula Di = i * (n + 1)/10 th data.
d. Lastly, based on the position, determine the corresponding value from the sorted data.

Let us look at an example to understand the concept in detail:

Suppose we have been given 23 random numbers between 20 and 80. We need to represent them as deciles.
Let’s say the raw numbers are: [24, 32, 27, 32, 23, 62, 45, 77, 60, 63, 36, 54, 57, 36, 72, 55, 51, 32, 56, 33, 42, 55, 30]

Following the steps mentioned above, we first determine the number of values in the sample (n). Here n = 23.

We then need to sort the 23 random numbers in ascending order, as shown below.

Sr. No    Value
1         23
2         24
3         27
4         30
5         32
6         32
7         32
8         33
9         36
10        36
11        42
12        45
13        51
14        54
15        55
16        55
17        56
18        57
19        60
20        62
21        63
22        72
23        77

We can now calculate the positions of decile D1 to decile D9. When a position falls between two values, we interpolate between them.

D1 = 1 * (n + 1)/10 th data
= 1 * (23 + 1)/10
= 2.4th data, i.e. between the 2nd and 3rd values
Which is 24 + 0.4 * (27 - 24) = 25.2

Similarly,

D2 = 2 * (n + 1)/10 th data
= 2 * (23 + 1)/10
= 4.8th data, i.e. between the 4th and 5th values
Which is 30 + 0.8 * (32 - 30) = 31.6

D3 = 3 * (n + 1)/10 th data
= 3 * (23 + 1)/10
= 7.2th data, i.e. between the 7th and 8th values
Which is 32 + 0.2 * (33 - 32) = 32.2

D4 = 4 * (n + 1)/10 th data
= 4 * (23 + 1)/10
= 9.6th data, i.e. between the 9th and 10th values
Which is 36 + 0.6 * (36 - 36) = 36

D5 = 5 * (n + 1)/10 th data
= 5 * (23 + 1)/10
= 12th data, i.e. the 12th value
Which is 45

D6 = 6 * (n + 1)/10 th data
= 6 * (23 + 1)/10
= 14.4th data, i.e. between the 14th and 15th values
Which is 54 + 0.4 * (55 - 54) = 54.4

D7 = 7 * (n + 1)/10 th data
= 7 * (23 + 1)/10
= 16.8th data, i.e. between the 16th and 17th values
Which is 55 + 0.8 * (56 - 55) = 55.8

D8 = 8 * (n + 1)/10 th data
= 8 * (23 + 1)/10
= 19.2th data, i.e. between the 19th and 20th values
Which is 60 + 0.2 * (62 - 60) = 60.4

D9 = 9 * (n + 1)/10 th data
= 9 * (23 + 1)/10
= 21.6th data, i.e. between the 21st and 22nd values
Which is 63 + 0.6 * (72 - 63) = 68.4

Thus, we can represent the deciles for the data set with their positions and corresponding values in a table as shown below:

Decile    Data position    Value
1         2.4              25.2
2         4.8              31.6
3         7.2              32.2
4         9.6              36
5         12               45
6         14.4             54.4
7         16.8             55.8
8         19.2             60.4
9         21.6             68.4

One example of the use of deciles is in school rankings. Students in the top 10% or highest decile will be rewarded, whereas students in the last 10% or lowest decile will be given extra assistance to improve their scores.
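The decile steps worked through above can be sketched in Python as follows. The function name `decile` is a hypothetical helper, and the explicit error for samples too small to hold the requested position is an added assumption; the position formula i * (n + 1)/10 and the interpolation between neighbouring values follow the worked example.

```python
def decile(data, i):
    """Return (position, value) of the i-th decile using i * (n + 1) / 10."""
    if i not in range(1, 10):
        raise ValueError("i must be between 1 and 9")
    ordered = sorted(data)
    n = len(ordered)
    # Integer arithmetic keeps the fractional part exact (tenths of a position).
    whole, tenths = divmod(i * (n + 1), 10)
    position = whole + tenths / 10
    if whole < 1 or position > n:
        raise ValueError("sample too small for this decile")
    lower = ordered[whole - 1]      # positions are 1-based
    if tenths == 0:
        return position, lower
    upper = ordered[whole]
    # Interpolate between the two neighbouring values.
    return position, lower + (tenths / 10) * (upper - lower)


numbers = [24, 32, 27, 32, 23, 62, 45, 77, 60, 63, 36, 54,
           57, 36, 72, 55, 51, 32, 56, 33, 42, 55, 30]

for i in range(1, 10):
    position, value = decile(numbers, i)
    print(f"D{i}: position {position:.1f}, value {value:.1f}")
# First line printed: D1: position 2.4, value 25.2
# Last line printed:  D9: position 21.6, value 68.4
```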
Recap

• In Data Science, data merging is the process of combining two or more data sets into a single data frame.
• In one-to-one join, each row in one table is linked to a single row in
another table using a “key” column.
• In a one to many join, one record in a table can be related to one or
many records in another table.
• A many-to-many relationship is said to occur when multiple records in one table are related to multiple records of another table.
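The join-table design for many-to-many relationships covered in this chapter can be sketched with Python's built-in sqlite3 module. The lowercase table and column names, the sample rows, and the composite primary key on the Enrollments table are assumptions made for illustration only.

```python
import sqlite3

# In-memory database; every row below is invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (student_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE courses (course_id INTEGER PRIMARY KEY, title TEXT);
    -- Join table: one row per (student, course) pair.
    CREATE TABLE enrollments (
        student_id INTEGER REFERENCES students(student_id),
        course_id INTEGER REFERENCES courses(course_id),
        PRIMARY KEY (student_id, course_id)
    );
    INSERT INTO students VALUES (1, 'Meera'), (2, 'Kabir');
    INSERT INTO courses VALUES
        (101, 'Maths'), (102, 'Physics'), (103, 'Art'), (104, 'Music');
    INSERT INTO enrollments VALUES
        (1, 101), (1, 102), (1, 103), (1, 104), (2, 101);
""")

# Two one-to-many joins through the join table resolve the many-to-many link.
rows = conn.execute("""
    SELECT s.name, c.title
    FROM students AS s
    JOIN enrollments AS e ON e.student_id = s.student_id
    JOIN courses AS c ON c.course_id = e.course_id
    ORDER BY s.student_id, c.course_id
""").fetchall()
conn.close()

for name, title in rows:
    print(name, title)
```

Here the student enrolled in four courses has one row in `students` and four rows in `enrollments`, and the 'Maths' course is shared by both students.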

Exercises

Objective Type Questions

Please choose the correct option in the questions below.
1. The pth percentile of a distribution is such that:
a) p percent of the observations fall at it
b) p percent of the observations fall below it
c) p percent of the observations fall at or below it
d) the value is p.

2. Which of the following functions is used to compute quantiles of quantitative values?

a) Quantile
b) Quantity
c) Quantiles
d) All of the mentioned

3. The heights of Indian women aged 18 to 24 are approximately normally distributed with a mean of 65.5 inches and a standard deviation of 2.5 inches. Calculate the z-score for a woman six feet (72 inches) tall.
a) 2.60
b) 4.11
c) 1.04
d) 1.33

4. What is a z-score?
a) It is the number of standard deviations a particular score lies above or below
the mean of the set of scores.
b) It is a standardized measure of the mean of a set of data.
c) It is the average frequency of scores in a sample
d) It is a measure of central tendency in the data.

5. The median, mode, deciles and percentiles are all considered as measures of
a) Mathematical averages
b) Population averages
c) Sample averages
d) Averages of position

6. Which percentile corresponds to the median?

a) 80th
b) 40th
c) 50th
d) 100th

7. Which measure of position divides the distribution into 10 equal parts?

a) Quartiles
b) Deciles
c) Percentiles
d) Range

8. Which measure of position divides the distribution into 4 equal parts?

a) Quartiles
b) Deciles
c) Percentiles
d) Range

Standard Questions
Please answer the questions below in no fewer than 100 words.
1. What is data merging?
2. Why is data merging required in data science?
3. Name the different ways of merging data sets.
4. Explain a one-to-one join with the help of an example.
5. Explain a one-to-many join with the help of an example.
