Validating Direct Mapping with SQL

The document outlines a SQL-based approach for validating direct mapping between two tables, Table 1 and Table 2, focusing on counting records, identifying mapped columns, comparing data, and checking for missing records. It provides specific SQL queries for each step, including data type and format validation. An example scenario illustrates the mapping and SQL application for practical understanding.

Uploaded by

mmyybabybaby

You're dealing with a common ETL scenario: Table 1 (10 columns) and Table 2 (8
columns). You want to validate the "direct mapping," meaning specific columns from
Table 1 are copied to Table 2 without transformation. Here's a SQL-based approach:

Assumptions
● You have read access to both tables.
● You know which columns in Table 1 should map to which columns in Table 2.
● Both tables are in the same database or you can use fully qualified names (e.g.,
database1.schema1.table1).
● One or more "key" columns uniquely identify rows in Table 1 and Table 2.
This is crucial for comparing data. Let's call these key columns Table1_Key and
Table2_Key, respectively. They may or may not have the same name.
Steps
1. Count the Records
○ This is a basic sanity check. It doesn't guarantee data accuracy, but it can
quickly reveal if a large number of records are missing.
-- Count records in Table 1
SELECT COUNT(*) AS SourceRecordCount FROM Table1;

-- Count records in Table 2
SELECT COUNT(*) AS TargetRecordCount FROM Table2;

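○ Compare the counts. The TargetRecordCount should equal the SourceRecordCount
for a full direct mapping, and be lower only if the load intentionally filters rows.
If it's unexpectedly lower, records are missing from Table 2. If it's higher, Table 2
contains duplicates or rows that did not come from Table 1, which is also a problem.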
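When the target count comes out higher, a grouped count on the key is a quick way to see whether duplicates are the cause. This is a minimal sketch that reuses the Table2 and Table2_Key names from above and assumes the key is a single column:

```sql
-- List every key that appears more than once in Table 2
SELECT
    Table2_Key,
    COUNT(*) AS Occurrences
FROM Table2
GROUP BY Table2_Key
HAVING COUNT(*) > 1;
```

An empty result means the key is unique in Table 2, so any surplus rows must have a different cause, such as rows that never existed in Table 1.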
2. Identify Mapped Columns
○ This is the most crucial part. Let's say your mapping is as follows:
■ Table 1.ColumnA -> Table 2.ColumnA
■ Table 1.ColumnB -> Table 2.ColumnB
■ Table 1.ColumnC -> Table 2.ColumnC
■ Table 1.ColumnD -> Table 2.ColumnD
■ Table 1.Table1_Key -> Table 2.Table2_Key
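Once the mapping is written down, it can help to confirm that each pair of mapped columns has compatible types and lengths before comparing values. The sketch below assumes your database exposes the standard INFORMATION_SCHEMA.COLUMNS view (SQL Server, MySQL, and PostgreSQL do) and that the tables are literally named Table1 and Table2; adjust the names, and their casing, to your environment:

```sql
-- Show the declared type, length, and nullability of each mapped column
SELECT
    TABLE_NAME,
    COLUMN_NAME,
    DATA_TYPE,
    CHARACTER_MAXIMUM_LENGTH,
    IS_NULLABLE
FROM INFORMATION_SCHEMA.COLUMNS
WHERE (TABLE_NAME = 'Table1'
       AND COLUMN_NAME IN ('Table1_Key', 'ColumnA', 'ColumnB', 'ColumnC', 'ColumnD'))
   OR (TABLE_NAME = 'Table2'
       AND COLUMN_NAME IN ('Table2_Key', 'ColumnA', 'ColumnB', 'ColumnC', 'ColumnD'))
ORDER BY COLUMN_NAME, TABLE_NAME;
```

A target column that is shorter or narrower than its source is a likely place for truncation, so those pairs deserve a closer look in the data comparison.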
3. Compare Data Using a JOIN
○ Use a JOIN (usually a LEFT JOIN or INNER JOIN) to compare the data in the
mapped columns.
-- Compare mapped columns
SELECT
    T1.Table1_Key,
    T1.ColumnA AS T1_ColumnA,
    T2.ColumnA AS T2_ColumnA,
    T1.ColumnB AS T1_ColumnB,
    T2.ColumnB AS T2_ColumnB,
    T1.ColumnC AS T1_ColumnC,
    T2.ColumnC AS T2_ColumnC,
    T1.ColumnD AS T1_ColumnD,
    T2.ColumnD AS T2_ColumnD
FROM
    Table1 AS T1
    LEFT JOIN Table2 AS T2 ON T1.Table1_Key = T2.Table2_Key
WHERE
    T2.Table2_Key IS NULL OR -- no matching row in Table 2
    EXISTS (
        -- NULL-safe comparison: EXCEPT treats two NULLs as equal, whereas
        -- != evaluates to unknown and silently skips rows where one side is NULL
        SELECT T1.ColumnA, T1.ColumnB, T1.ColumnC, T1.ColumnD
        EXCEPT
        SELECT T2.ColumnA, T2.ColumnB, T2.ColumnC, T2.ColumnD
    );

○ Explanation:
■ The FROM and LEFT JOIN clauses join the two tables on their key
columns. A LEFT JOIN is used to include all rows from Table 1, even if
there's no matching row in Table 2.
■ The SELECT clause retrieves the key columns and the mapped data
columns from both tables, aliasing them (e.g., T1_ColumnA, T2_ColumnA)
to distinguish between them.
■ The WHERE clause filters the results to show only rows where the data in
the mapped columns is different or where a key from Table 1 is not found
in Table 2 (indicating a missing row in the target).
■ If the query returns any rows, it indicates a data discrepancy.
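If the comparison returns many rows, it is often easier to start from a per-column tally of mismatches. The following is a sketch under the same naming assumptions as above. It spells out the NULL cases explicitly, because a plain <> or != comparison evaluates to unknown when either side is NULL and would not count those rows:

```sql
-- Count mismatches per mapped column for rows present in both tables
SELECT
    SUM(CASE WHEN T1.ColumnA <> T2.ColumnA
               OR (T1.ColumnA IS NULL AND T2.ColumnA IS NOT NULL)
               OR (T1.ColumnA IS NOT NULL AND T2.ColumnA IS NULL)
             THEN 1 ELSE 0 END) AS ColumnA_Mismatches,
    SUM(CASE WHEN T1.ColumnB <> T2.ColumnB
               OR (T1.ColumnB IS NULL AND T2.ColumnB IS NOT NULL)
               OR (T1.ColumnB IS NOT NULL AND T2.ColumnB IS NULL)
             THEN 1 ELSE 0 END) AS ColumnB_Mismatches,
    SUM(CASE WHEN T1.ColumnC <> T2.ColumnC
               OR (T1.ColumnC IS NULL AND T2.ColumnC IS NOT NULL)
               OR (T1.ColumnC IS NOT NULL AND T2.ColumnC IS NULL)
             THEN 1 ELSE 0 END) AS ColumnC_Mismatches,
    SUM(CASE WHEN T1.ColumnD <> T2.ColumnD
               OR (T1.ColumnD IS NULL AND T2.ColumnD IS NOT NULL)
               OR (T1.ColumnD IS NOT NULL AND T2.ColumnD IS NULL)
             THEN 1 ELSE 0 END) AS ColumnD_Mismatches
FROM Table1 AS T1
INNER JOIN Table2 AS T2 ON T1.Table1_Key = T2.Table2_Key;
```

An INNER JOIN is used here so that rows missing from Table 2 are not counted as mismatches in every column; those rows are covered by the missing-record check.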
4. Check for Missing Target Records
○ The LEFT JOIN in the previous query also helps identify missing records in
Table 2. The T2.Table2_Key IS NULL condition in the WHERE clause will find
these. You can also write a separate query:
-- Find missing records in Table 2
SELECT
T1.Table1_Key
FROM
Table1 AS T1
LEFT JOIN Table2 AS T2 ON T1.Table1_Key = T2.Table2_Key
WHERE
T2.Table2_Key IS NULL;
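The same pattern works in the opposite direction. As a sketch, swapping the roles of the two tables finds rows in Table 2 whose key does not exist in Table 1, which a direct mapping should never produce:

```sql
-- Find records in Table 2 that have no source record in Table 1
SELECT
    T2.Table2_Key
FROM
    Table2 AS T2
    LEFT JOIN Table1 AS T1 ON T1.Table1_Key = T2.Table2_Key
WHERE
    T1.Table1_Key IS NULL;
```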

5. Data Type and Format Validation

○ SQL can also help with basic data type and format validation. For example:
-- Check for non-numeric values in a numeric column (e.g., age)
SELECT Table2_Key FROM Table2
WHERE TRY_CAST(ColumnB AS INT) IS NULL AND ColumnB IS NOT NULL;

-- Check for dates in an incorrect format
SELECT Table2_Key FROM Table2
WHERE ISDATE(ColumnC) = 0 AND ColumnC IS NOT NULL;

-- Check for NULLs in non-nullable columns
SELECT Table2_Key FROM Table2 WHERE ColumnD IS NULL;

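○ These queries use the T-SQL functions TRY_CAST and ISDATE (available in SQL
Server) to check if the data in ColumnB and ColumnC is of the expected type; other
databases need their own equivalents. The IS NOT NULL condition excludes rows that
are already NULL, since casting a NULL also returns NULL and would otherwise be
reported as a failed conversion.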
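ISDATE accepts several date formats and its result depends on the session's language and date-format settings, so it cannot confirm one specific format. If ColumnC is stored as text and is expected to hold dates as YYYY-MM-DD (an assumption made only for this illustration), a stricter check could combine a LIKE pattern with TRY_CONVERT in SQL Server:

```sql
-- Flag values that are not shaped like YYYY-MM-DD or are not real calendar dates
SELECT
    Table2_Key,
    ColumnC
FROM Table2
WHERE ColumnC IS NOT NULL
  AND (
        -- shape check: four digits, dash, two digits, dash, two digits
        ColumnC NOT LIKE '[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]'
        -- validity check: style 23 is the yyyy-mm-dd style in T-SQL
        OR TRY_CONVERT(date, ColumnC, 23) IS NULL
      );
```

The LIKE pattern catches values in the wrong layout, and TRY_CONVERT catches values that have the right layout but are not valid dates, such as a 30th of February.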
Example Scenario

Let's say:
● Table 1: SourceData (SourceDataID, Name, Age, City, ProductID, OrderDate,
Email, Phone, Address, Status)
● Table 2: TargetData (TargetDataID, CustomerName, CustomerAge, CustomerCity,
ProductID, OrderDate, EmailAddress, Status)
● Mapping:
○ SourceData.Name -> TargetData.CustomerName
○ SourceData.Age -> TargetData.CustomerAge
○ SourceData.City -> TargetData.CustomerCity
○ SourceData.ProductID -> TargetData.ProductID
○ SourceData.OrderDate -> TargetData.OrderDate
○ SourceData.Email -> TargetData.EmailAddress
○ SourceData.Status -> TargetData.Status
○ SourceData.SourceDataID -> TargetData.TargetDataID
Here's how you'd apply the SQL:

-- 1. Count Records
SELECT COUNT(*) AS SourceCount FROM SourceData;
SELECT COUNT(*) AS TargetCount FROM TargetData;

-- 2. Compare Data
SELECT
    SD.SourceDataID,
    SD.Name AS SD_Name,
    TD.CustomerName AS TD_CustomerName,
    SD.Age AS SD_Age,
    TD.CustomerAge AS TD_CustomerAge,
    SD.City AS SD_City,
    TD.CustomerCity AS TD_CustomerCity,
    SD.ProductID AS SD_ProductID,
    TD.ProductID AS TD_ProductID,
    SD.OrderDate AS SD_OrderDate,
    TD.OrderDate AS TD_OrderDate,
    SD.Email AS SD_Email,
    TD.EmailAddress AS TD_EmailAddress,
    SD.Status AS SD_Status,
    TD.Status AS TD_Status
FROM
    SourceData AS SD
    LEFT JOIN TargetData AS TD ON SD.SourceDataID = TD.TargetDataID
WHERE
    TD.TargetDataID IS NULL OR -- no matching row in TargetData
    EXISTS (
        -- NULL-safe comparison: EXCEPT treats two NULLs as equal, whereas
        -- != silently skips rows where one side is NULL
        SELECT SD.Name, SD.Age, SD.City, SD.ProductID,
               SD.OrderDate, SD.Email, SD.Status
        EXCEPT
        SELECT TD.CustomerName, TD.CustomerAge, TD.CustomerCity, TD.ProductID,
               TD.OrderDate, TD.EmailAddress, TD.Status
    );

-- 3. Check for Missing Target Records
-- NOT EXISTS is used instead of NOT IN, which returns no rows at all
-- if the subquery produces a NULL
SELECT SD.SourceDataID
FROM SourceData AS SD
WHERE NOT EXISTS (
    SELECT 1 FROM TargetData AS TD WHERE TD.TargetDataID = SD.SourceDataID
);

-- 4. Data Type/Format Validation
SELECT TargetDataID FROM TargetData
WHERE TRY_CAST(CustomerAge AS INT) IS NULL AND CustomerAge IS NOT NULL;
SELECT TargetDataID FROM TargetData
WHERE ISDATE(OrderDate) = 0 AND OrderDate IS NOT NULL;
SELECT TargetDataID FROM TargetData WHERE EmailAddress IS NULL;
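For a quick overview, the row-level checks can be rolled up into a single summary row. This is only a sketch using the SourceData and TargetData names from the example; the output column names are made up for illustration, and a SELECT without a FROM clause works in SQL Server, MySQL, and PostgreSQL but not in every database:

```sql
-- One-row summary of the count and key checks
SELECT
    (SELECT COUNT(*) FROM SourceData) AS SourceCount,
    (SELECT COUNT(*) FROM TargetData) AS TargetCount,
    (SELECT COUNT(*)
       FROM SourceData AS SD
      WHERE NOT EXISTS (SELECT 1 FROM TargetData AS TD
                         WHERE TD.TargetDataID = SD.SourceDataID)) AS MissingInTarget,
    (SELECT COUNT(*)
       FROM TargetData AS TD
      WHERE NOT EXISTS (SELECT 1 FROM SourceData AS SD
                         WHERE SD.SourceDataID = TD.TargetDataID)) AS NotInSource;
```

For a clean load you would expect SourceCount and TargetCount to match and both MissingInTarget and NotInSource to be 0.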

This comprehensive SQL approach will help you thoroughly validate the direct
mapping from Table 1 to Table 2. Adapt the table and column names to your specific
scenario.
