Data Warehouse Developer Curriculum

The document provides an apprenticeship curriculum for a data warehouse developer program. The curriculum consists of 1450 hours of theory and 4550 hours of practical, on-the-job training over 36 months. Key topics covered include data warehouse fundamentals, processes and technologies, data warehousing and OLAP, data mining, cluster analysis, and graph mining. To complete the program, trainees must score at least 40 out of 100 marks in the theory component and 180 out of 300 marks in the practical component. Upon completion, trainees will be qualified for jobs involving data analysis, data warehouse development and support, and software development life cycle support.

Uploaded by vaishali

APPRENTICESHIP

CURRICULUM
For
DATA WAREHOUSE DEVELOPER
Under
IT SECTOR
1. Program title: Data Warehouse Developer
2. Program code, if any: NA
3. Duration (hours and months) for theory (Block I): NA
4. Duration (hours and months) for On the Job Training (Block II): 6000 hrs. (36 months)
5. Certifying body for theory component: Yashaswi Academy for Skills
6. Certifying body for On the Job Training/practical component: Yashaswi Academy for Skills
7. Minimum eligibility criteria (educational qualification and/or technical qualification and experience): Degree in Computer Science
   Exemptions, if any:
8. Trainer's qualification and experience (BT and OJT): Degree in Computer Technology/Engineering from a recognized university, with two years' experience in the relevant field.
9. Indicative list of training tools required to deliver this qualification: Refer to Annexure A
10. Formal structure of the curriculum

On the Job Training Program
Module | Duration of training, theory (hours) | Duration of training, practical (hours) | Total duration (hours)
1. Data Warehouse Fundamentals | 100 | 350 | 450
2. Data Warehouse Process and Technology | 100 | 300 | 400
3. Data Warehouse & OLAP | 100 | 300 | 400
4. Data Mining | 150 | 350 | 500
5. Cluster Analysis | 150 | 300 | 450
6. Graph Mining | 50 | 300 | 350
7. Weka Tool | 100 | 350 | 450
8. Association Rule Mining | 100 | 300 | 400
9. Relational Data Model and Language | 100 | 300 | 400
10. Mining Complex Types of Data | 100 | 300 | 400
11. ETL Testing | 100 | 300 | 400
12. Business Analysis | 100 | 300 | 400
13. Business Requirements and Data Warehouse | 100 | 200 | 300
14. Data Extraction, Transformation, and Loading | 100 | 300 | 400
Total duration of OJT | 1450 | 4550 | 6000
11. Total and pass marks

Program | Total and pass marks, theory | Total and pass marks, practical
Basic Training Program | NA | NA
On the Job Training Program | 40 out of 100 marks | 180 out of 300 marks
12. Job description (brief)
   1. Use technologies to access and interpret information effectively.
   2. Provide day-to-day support and mentoring to end users who are interacting with the data.
   3. Use data mining techniques to analyze the source data and determine the best reporting solutions to build for the customer.
   4. Provide support to the software development life cycle.
   5. Provide training to all employees of the development team.
   6. Develop and prepare the schedule for a new data warehouse.
13. Employment avenues/opportunities
   1. Product development and operations in various IT industries and telecommunications.
   2. Product development, marketing, and operations in airline, banking, aircraft, health care, investment and insurance, personal care, and the public sector.
   3. Distribution and marketing.
14. Curriculum update version and date: 04 March 2020
15. Curriculum revision date: 25 Feb 2025

Curriculum

I. Practical/On the job Training component (Block II)

Units Topics/Expected Key Learning outcomes

Data Warehouse Fundamentals
• Introduction to Data Warehouse, OLTP Systems; Differences between OLTP Systems and Data Warehouse.
• Characteristics of Data Warehouse; Functionality of Data Warehouse; Advantages and Applications of Data Warehouse; Top-Down and Bottom-Up Development Methodology.
• Tools for Data Warehouse development; Data Warehouse Types.
• Planning Data Warehouse and Key Issues; Planning and Project Management in constructing a Data Warehouse.
• Data Warehouse Development Life Cycle, Kimball Lifecycle Diagram, Requirements Gathering Approaches, Team Organization, Roles, and Responsibilities.
• Components of Data Warehouse Architecture; Tool Selection; Federated Data Warehouse Architecture.

Data Warehouse Process and Technology
• Warehousing Strategy, Warehouse Management and Support Processes, Warehouse Planning and Implementation.
• Hardware and Operating Systems for Data Warehousing, Client/Server Computing Model and Data Warehousing.
• Parallel Processors and Cluster Systems, Distributed DBMS Implementations, Warehousing Software, Warehouse Schema Design, Data Extraction, Clean-up and Transformation Tools, Warehouse Metadata.
• Categorizing Metadata, Metadata Management in Practice, Metadata Requirements Gathering.
• Metadata Classification, Metadata Collection Strategies, Metadata Management in Oracle and SAS, Tools for Metadata Management.

Data Warehouse & OLAP
• Characteristics of OLAP, Steps in the OLAP Creation Process, Advantages of OLAP, What is Multidimensional Data.
• OLAP Architectures: MOLAP, ROLAP, HOLAP; Data Warehouse and OLAP; Hypercube and Multicubes.
• Aggregation, Historical Information, Query Facility, OLAP Functions and Tools.
• Data Mining Interface, Security, Backup and Recovery, Tuning Data Warehouse, Testing Data Warehouse.
• Warehousing Applications and Recent Trends: Types of Warehousing Applications, Web Mining, Spatial Mining and Temporal Mining.

Data Mining
• Data Processing, Forms of Data Pre-processing, Data Cleaning: Missing Values, Noisy Data (Binning, Clustering, Regression, Computer and Human Inspection), Inconsistent Data, Data Integration and Transformation.
• Data Reduction: Data Cube Aggregation, Dimensionality Reduction, Data Compression, Numerosity Reduction, Discretization and Concept Hierarchy Generation, Decision Tree.
• Architecture for Data Mining; Profitable Applications; Data Mining Tools.
• Data Mining Stages, Data Mining Models, Data Warehousing (DWH) and On-Line Analytical Processing (OLAP).
• Need for Data Warehousing, Challenges, Application of Data Mining Principles, OLTP vs DWH, Applications of DWH.
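
Smoothing noisy data by binning, one of the data cleaning topics in this unit, can be illustrated with a minimal sketch. It assumes equal-frequency bins and smoothing by bin means; the function name `smooth_by_bin_means` and the sample values are illustrative assumptions, not part of the curriculum.

```python
def smooth_by_bin_means(values, bin_size):
    """Sort the values, split them into bins of bin_size, and
    replace every value with the mean of its bin."""
    ordered = sorted(values)
    smoothed = []
    for start in range(0, len(ordered), bin_size):
        bin_values = ordered[start:start + bin_size]
        mean = sum(bin_values) / len(bin_values)
        smoothed.extend([mean] * len(bin_values))
    return smoothed


# Hypothetical sample data: nine sorted readings in three bins of three.
prices = [4, 8, 15, 21, 21, 24, 25, 28, 34]
print(smooth_by_bin_means(prices, 3))
# prints [9.0, 9.0, 9.0, 22.0, 22.0, 22.0, 29.0, 29.0, 29.0]
```

Smoothing by bin medians or bin boundaries follows the same pattern, with only the replacement value changed.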

Cluster Analysis
• Cluster Analysis, Clustering Methods: K-means, Hierarchical Clustering, Agglomerative Clustering, Divisive Clustering, Clustering and Segmentation Software, Evaluating Clusters.
• Categories of Web Mining: Web Content Mining, Web Structure Mining, Web Usage Mining; Applications of Web Mining; Agent-Based and Database Approaches; Web Mining Software.
• Similarity and Distance Measures, Hierarchical and Partitional Algorithms. Hierarchical Clustering: CURE and Chameleon.
• Density-Based Methods: DBSCAN, OPTICS. Grid-Based Methods: STING, CLIQUE. Model-Based Method: Statistical Approach.
• Association Rules: Introduction, Large Itemsets, Basic Algorithms, Parallel and Distributed Algorithms, Neural Network Approach.
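
The K-means method named in this unit can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions: Euclidean distance, the first k points used as starting centroids (real tools usually pick them at random), and hypothetical sample points. The function name `k_means` is also an assumption.

```python
import math


def k_means(points, k, iterations=10):
    """Alternate between assigning points to the nearest centroid
    and moving each centroid to the mean of its cluster."""
    # Assumption: the first k points seed the centroids.
    centroids = [points[i] for i in range(k)]
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        new_centroids = []
        for i, cluster in enumerate(clusters):
            if cluster:
                new_centroids.append(
                    tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
                )
            else:
                # Keep the old centroid if a cluster ends up empty.
                new_centroids.append(centroids[i])
        if new_centroids == centroids:
            break  # assignments are stable
        centroids = new_centroids
    return centroids, clusters


# Hypothetical two-dimensional sample data.
points = [(1.0, 1.0), (1.5, 2.0), (3.0, 4.0), (5.0, 7.0),
          (3.5, 5.0), (4.5, 5.0), (3.5, 4.5)]
centroids, clusters = k_means(points, k=2)
print(centroids)
print(clusters)
```

The result depends on the starting centroids, which is one reason the unit also lists methods for evaluating clusters.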

Graph mining
• Real-world network properties (node degree distribution, shortest paths, clustering coefficient, small world phenomenon, properties of time evolving graphs)
• Network generative models (random graphs, preferential attachment model, Kronecker graphs, stochastic blockmodels)
• Graph clustering and community detection (spectral clustering, graph partitioning, modularity-based algorithms, community structure of real-world graphs, overlapping communities)
• Graph kernels and graph similarity
• Graph classification
• Graph mining and applications
• Dense subgraph detection
• Rich network structures (signed and multilayer networks) and applications
• Anomaly detection
• Geo-social and location-based networks

Weka Tool
• Iris plants database, Breast cancer database, Auto imports database
• Introduction to WEKA
• The Explorer: Getting started, Exploring the Explorer, Learning algorithms, Clustering algorithms, Association-rule learners.
• Classification rule process on WEKA data-set using J48 algorithm.
• Classification rule process on WEKA data-set using Naive Bayes algorithm.
• Clustering rule process on data-set iris.arff using simple k-means.

Association Rule Mining
• Mining Frequent Patterns, Associations and Correlations
• Mining Methods, Mining Various Kinds of Association Rules, Correlation Analysis, Constraint-Based Association Mining
• Classification and Prediction
• Decision Tree Induction, Bayesian Classification, Rule-Based Classification
• Classification by Backpropagation, Support Vector Machines
• Associative Classification, Lazy Learners, Other Classification Methods, Prediction.

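
Frequent patterns and association rules in this unit rest on two measures, support and confidence. The sketch below computes both over a small hypothetical basket data set. It enumerates candidate itemsets by brute force rather than using Apriori-style pruning, and the function names and transactions are assumptions made for illustration.

```python
from itertools import combinations

# Hypothetical market-basket transactions.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def frequent_itemsets(transactions, min_support, max_size=2):
    """Brute-force search: test every itemset up to max_size items."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, max_size + 1):
        for candidate in combinations(items, size):
            s = support(candidate, transactions)
            if s >= min_support:
                result[candidate] = s
    return result


def confidence(antecedent, consequent, transactions):
    """Confidence of the rule antecedent -> consequent."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


print(frequent_itemsets(transactions, min_support=0.6))
print(confidence({"diapers"}, {"beer"}, transactions))
```

A rule is usually reported only when both its support and its confidence exceed chosen thresholds.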
Relational Data Model and Language
• Relational data model concepts, integrity constraints, entity integrity, referential integrity, key constraints, domain constraints, relational algebra, relational calculus, tuple and domain calculus.
• Introduction to SQL: characteristics of SQL, advantages of SQL. SQL data types and literals. Types of SQL commands. SQL operators and their precedence. Tables, views and indexes.
• Queries and subqueries. Aggregate functions. Insert, update and delete operations, joins, unions, intersection, minus, cursors, triggers, procedures in SQL/PL SQL.

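
Several of the SQL topics above (keys, referential integrity, joins, aggregate functions, subqueries) can be tried out with a small sketch. It uses Python's built-in `sqlite3` module and the SQLite dialect as an assumption for convenience, not the Oracle or MS SQL environments listed in Annexure A. The table and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE department (
    dept_id INTEGER PRIMARY KEY,            -- entity integrity
    name    TEXT NOT NULL
);
CREATE TABLE employee (
    emp_id  INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    salary  REAL CHECK (salary >= 0),       -- domain constraint
    dept_id INTEGER REFERENCES department(dept_id)  -- referential integrity
);
INSERT INTO department VALUES (1, 'Warehouse'), (2, 'Reporting');
INSERT INTO employee VALUES
    (1, 'Asha', 50000, 1),
    (2, 'Ravi', 60000, 1),
    (3, 'Meera', 55000, 2);
""")

# Join with aggregate functions: headcount and average salary per department.
per_department = conn.execute("""
    SELECT d.name, COUNT(*) AS headcount, AVG(e.salary) AS avg_salary
    FROM employee AS e
    JOIN department AS d ON d.dept_id = e.dept_id
    GROUP BY d.name
    ORDER BY d.name
""").fetchall()
print(per_department)

# Subquery: employees paid more than the overall average.
above_average = conn.execute("""
    SELECT name FROM employee
    WHERE salary > (SELECT AVG(salary) FROM employee)
    ORDER BY name
""").fetchall()
print(above_average)

# Referential integrity: department 99 does not exist, so this is rejected.
try:
    conn.execute("INSERT INTO employee VALUES (4, 'Kiran', 40000, 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)
```

Cursors, triggers and stored procedures are dialect-specific and are not covered by this sketch.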
Mining Complex Types of Data
• Multidimensional Analysis and Descriptive Mining of Complex Data Objects, Mining Spatial Databases, Mining Multimedia Databases.
• Mining Time-Series and Sequence Data, Mining Text Databases
• Mining the World Wide Web.
• Mining Complex Data Objects: Generalization of Structured Data
• Generalizing Spatial and Multimedia Data
• Plan Mining by Divide and Conquer
• Multidimensional Analysis
• Spatial Data Warehousing
• Methods for Computation of Spatial Data Cube

ETL testing
• Creating, running and analysing sessions/workflows in Informatica
• Preparation of test strategy, test plan and estimations
• Testing scenarios, creation of test cases and scripts
• Test case execution, defect tracking and reporting.

Business Analysis
• Reporting and Query Tools and Applications
• Tool Categories
• The Need for Applications
• Cognos Impromptu
• Online Analytical Processing (OLAP)
• Need; Multidimensional Data Model
• OLAP Guidelines; Multidimensional versus Multirelational OLAP
• Categories of Tools; OLAP Tools and the Internet.

Business Requirements and Data Warehouse
• Dimensional nature of Business Data and Dimensional Analysis
• Dimension hierarchies and categories, Key Business Metrics (Facts)
• Requirements Gathering methods and Requirements Definition Document (contents)
• Business Requirements and Data Design: Structure for Business Dimensions and Key Measurements, Levels of detail
• Business Requirements and the Architecture plan
• Business Requirements and Data Storage Specifications
• Business Requirements and Information Delivery Strategy

Data Extraction, Transformation, and Loading
• Requirements of ETL and steps
• Data extraction: identification of sources and techniques
• Data transformation: basic tasks, transformation types, data integration and consolidation, transformation for dimension attributes
• Data loading: techniques and processes, data refresh versus update, procedures for dimension tables and fact tables
• History and incremental loads; ETL tool options
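
The extraction, transformation and loading steps in this unit, including an incremental load, can be sketched as three small functions. This is a simplified illustration under stated assumptions: the source is an in-memory list, the target dimension table is a dictionary keyed by customer id, and changed rows are detected with a hypothetical `updated` date. A tool such as Informatica PowerCenter would handle these steps differently.

```python
from datetime import date


def extract(source_rows, last_loaded):
    """Extraction: pick only rows changed since the previous load."""
    return [r for r in source_rows if r["updated"] > last_loaded]


def transform(row):
    """Transformation: rename the key and normalize case and whitespace."""
    return {
        "customer_id": row["id"],
        "name": row["name"].strip().title(),
        "city": row["city"].strip().upper(),
        "updated": row["updated"],
    }


def load(target, rows):
    """Loading: insert new keys and overwrite existing ones (an update,
    not a full refresh). Returns the newest change date that was loaded."""
    for row in rows:
        target[row["customer_id"]] = row
    return max((r["updated"] for r in rows), default=None)


# Hypothetical source system rows.
source = [
    {"id": 1, "name": " asha rao ", "city": "pune", "updated": date(2020, 3, 1)},
    {"id": 2, "name": "ravi k", "city": " Mumbai", "updated": date(2020, 3, 5)},
]
# Hypothetical warehouse state after an earlier load on 1 March 2020.
warehouse = {
    1: {"customer_id": 1, "name": "Asha Rao", "city": "PUNE",
        "updated": date(2020, 3, 1)},
}
last_loaded = date(2020, 3, 1)

changed = [transform(r) for r in extract(source, last_loaded)]
last_loaded = load(warehouse, changed) or last_loaded
print(warehouse)
print(last_loaded)
```

A full refresh would instead clear the target and reload every source row. The incremental version touches only rows changed after the stored date.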

Annexure A

A: Trade Details
Sr. | Particulars | Details
1 | Name of the Trade | Data Warehouse Developer
2 | Duration (in semesters) | 6
3 | Intake | 15 per shift
6 | Space required (in sq. meter) | 50 sq. m.
7 | Power required (in kW) | 4

B: Workshop/Lab Furniture
Sr. | Name of Item | Category | Qty | Unit | Remark
1 | Black/White Board with Stand - 4 X 3 Feet | Furniture | 1 | Number | Per 1 Unit in a Shift
2 | Book Shelf/Glass Shelf | Furniture | 1 | Number | Per 1 Unit in a Shift
3 | Discussion Table/Working Table = L:W:H = 8:4:3 Feet - Heavy Wooden Top | Furniture | 1 | Number | Per 1 Unit in a Shift
4 | Instructor/Office Chair | Furniture | 2 | Number | Per 1 Unit in a Shift
5 | Instructor/Office Table | Furniture | 1 | Number | Per 1 Unit in a Shift
6 | Notice Board - 2 X 3 Feet | Furniture | 1 | Number | Per 1 Unit in a Shift
7 | Steel Almirah, Large (Optional) | Furniture | 2 | Number | Per 1 Unit in a Shift
8 | Steel Locker - 12 Pigeon Hole | Furniture | 2 | Number | Per 1 Unit in a Shift
10 | Steel Rack (Optional) | Furniture | 1 | Number | Per 1 Unit in a Shift
11 | Stool - Height 450 mm | Furniture | 10 | Number | Per 1 Unit in a Shift

C: Workshop/Lab Infrastructure (Tools, Equipment, Machines, etc.)

Sr. | Name of Item | Category | Qty | Unit
1 | For IT Lab sessions: Computer Lab with 1:1 PC-to-trainee ratio and having internet connection, commonly used search engines, MS Office/Open Office, Browser, HTML, etc. | Equipment | 15 | Number
2 | Printer | Equipment | 1 | Number
3 | Scanner | Equipment | 1 | Number
4 | Business Objects XI 9.0 version | Software | - | -
5 | C, CPP | Software | |
6 | MS SQL, Microsoft Office | Software | |
7 | Informatica PowerCenter, IBM Cognos | Software | |
8 | JDK 5.5, VB 6 | Software | |
9 | Linux 8, Oracle 11g | Software | |
10 | XAMPP | Software | |
11 | VMware | Software | |
12 | Routers, Hubs | Hardware | |
13 | Windows Operating System 7/8/10/XP | Software | |
14 | Antivirus | Software | |
