07 ODI - Change Data Capture

The document discusses Oracle Data Integrator's (ODI) capability for change data capture (CDC). CDC in ODI tracks changes made to source data and extracts only the changed data for integration processes like synchronization and replication. It can capture changes using triggers or by mining database logs. ODI supports simple and consistent set journalizing to track changes with or without ensuring referential integrity between related data stores.

Uploaded by

Dileep
Copyright
© All Rights Reserved

ODI - Change Data Capture

The goal of Change Data Capture is to track changes in the source data. When running an
integration interface, ODI-EE can reduce the volume of source data processed in the flow by
extracting only the changed data.
Reducing the volume of source data is useful in many fields, such as:

• synchronization
• replication

These changes are captured by Oracle Data Integrator and transformed into events that are
propagated throughout the information system.
Changes tracked by Changed Data Capture constitute data events. The ability to track these
events and process them regularly in batches or in real time is key to the success of an event-
driven integration architecture.
Changed Data Capture is performed by journalizing models. Journalizing a model consists of
setting up the infrastructure to capture the changes (inserts, updates and deletes) made to the
records of this model's datastores.
Oracle Data Integrator supports two journalizing modes:

• Simple Journalizing tracks changes in individual datastores in a model.

• Consistent Set Journalizing tracks changes to a group of the model's datastores, taking
into account the referential integrity between these datastores. The group of datastores
journalized in this mode is called a Consistent Set.

Publish-and-subscribe model

Changed Data Capture uses a publish-and-subscribe model. This model works in three
steps:
o An identified subscriber, usually an integration process, subscribes to changes
that might occur in a datastore. Multiple subscribers can subscribe to these
changes.
o The Changed Data Capture framework captures changes in the datastore and
then publishes them for the subscriber.
o The subscriber—an integration process—can process the tracked changes at any
time and consume these events. Once consumed, events are no longer available
for this subscriber.
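
The three steps above can be sketched with a toy in-memory journal. This is only an illustration of the publish-and-subscribe idea, not ODI's API: the `Journal` class, its method names, and the subscriber names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Journal:
    """Toy in-memory journal for one datastore (illustrative only, not ODI's API)."""

    # Each subscriber gets its own list of pending (change type, key) events.
    pending: dict[str, list[tuple[str, int]]] = field(default_factory=dict)

    def subscribe(self, subscriber: str) -> None:
        # Step 1: a named subscriber registers interest in the datastore.
        self.pending.setdefault(subscriber, [])

    def publish(self, change_type: str, key: int) -> None:
        # Step 2: a captured change is published to every current subscriber.
        for events in self.pending.values():
            events.append((change_type, key))

    def consume(self, subscriber: str) -> list[tuple[str, int]]:
        # Step 3: the subscriber takes its events; they are gone for it only.
        events, self.pending[subscriber] = self.pending[subscriber], []
        return events


journal = Journal()
journal.subscribe("ods_load")
journal.subscribe("catalog_sync")
journal.publish("I", 101)
journal.publish("D", 57)

first = journal.consume("ods_load")      # both events
second = journal.consume("ods_load")     # nothing left for this subscriber
other = journal.consume("catalog_sync")  # the other subscriber still sees both
```

Keeping a separate pending list per subscriber is one simple way to make consumption by one subscriber invisible to the others.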

Process

ODI-EE processes datastore changes in two ways:

o Regularly in batches (pull mode): for example, processing new orders from the
Web site every five minutes and loading them into the operational data store (ODS).
o In real time (push mode) as the changes occur: for example, when a product is
changed in the enterprise resource planning (ERP) system, immediately updating
the on-line catalog.
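
The difference between the two modes can be sketched as follows. The `ChangeFeed` class and the `(change type, key)` event shape are assumptions made for this example. In push mode a callback runs as each change is recorded, while in pull mode a scheduled job drains whatever has accumulated since its last run.

```python
from typing import Callable

Change = tuple[str, int]  # (change type, record key); hypothetical event shape


class ChangeFeed:
    """Illustrative change feed supporting both processing styles."""

    def __init__(self) -> None:
        self.backlog: list[Change] = []
        self.listeners: list[Callable[[Change], None]] = []

    def on_change(self, listener: Callable[[Change], None]) -> None:
        # Push mode: register a callback that fires as each change occurs.
        self.listeners.append(listener)

    def record(self, change: Change) -> None:
        self.backlog.append(change)      # kept for the next batch run
        for listener in self.listeners:  # delivered immediately
            listener(change)

    def drain(self) -> list[Change]:
        # Pull mode: a scheduled job collects everything since the last run.
        batch, self.backlog = self.backlog, []
        return batch


feed = ChangeFeed()
pushed: list[Change] = []
feed.on_change(pushed.append)

feed.record(("U", 7))
feed.record(("I", 8))
batch = feed.drain()  # what a periodic batch job would pick up
```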

The Journalizing Components

The journalizing components are:

o Journals: Where changes are recorded. Journals only contain references to the
changed records along with the type of changes (insert/update, delete).
o Capture processes: Journalizing captures the changes in the source datastores
either by creating triggers on the data tables, or by using database-specific
programs to retrieve log data from data server log files. See the documentation
on journalizing knowledge modules for more information on the capture
processes used.
o Subscribers: CDC uses a publish/subscribe model. Subscribers are entities
(applications, integration processes, etc) that use the changes tracked on a
datastore or on a consistent set. They subscribe to a model's CDC to have the
changes tracked for them. Changes are captured only if there is at least one
subscriber to the changes. When all subscribers have consumed the captured
changes, these changes are discarded from the journals.
o Journalizing views: Provide access to the changes and the changed data
captured. They are used by the user to view the changes captured, and by
integration processes to retrieve the changed data.

These components are implemented in the journalizing infrastructure.
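
As a rough sketch of how a journal and a journalizing view relate, the following uses SQLite from Python. The table, view, and column names are made up for the example and are not what a journalizing knowledge module generates. The journal stores only the key of each changed row and a change flag, and the view joins those references back to the current data.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT);

    -- Journal: only the key of the changed row, the change type and the subscriber.
    CREATE TABLE j_product (subscriber TEXT, flag TEXT, id INTEGER);

    -- Journalizing view: journal entries joined back to the current data.
    CREATE VIEW jv_product AS
        SELECT j.subscriber, j.flag, j.id, p.name
        FROM j_product AS j
        LEFT JOIN product AS p ON p.id = j.id;

    INSERT INTO product VALUES (1, 'widget');
    INSERT INTO j_product VALUES ('ods_load', 'I', 1);  -- insert/update
    INSERT INTO j_product VALUES ('ods_load', 'D', 2);  -- delete: row no longer exists
""")

# An integration process reads the changed data for its own subscription.
rows = con.execute(
    "SELECT flag, id, name FROM jv_product WHERE subscriber = ? ORDER BY id",
    ("ods_load",),
).fetchall()
```

For a deleted record the view can return only the key, because the row itself is no longer in the source table.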

Simple vs. Consistent Set Journalizing

Simple Journalizing enables you to journalize one or more datastores. Each journalized
datastore is treated separately when capturing the changes.
This approach has a limitation, illustrated in the following example: Say you need to
process changes in the ORDER and ORDER_LINE datastores (with a referential integrity
constraint based on the fact that an ORDER_LINE record should have an associated
ORDER record). If you have captured insertions into ORDER_LINE, you have no
guarantee that the associated new records in ORDER have also been captured.
Processing ORDER_LINE records with no associated ORDER records may cause
referential constraint violations in the integration process.
Consistent Set Journalizing provides the guarantee that when you have an ORDER_LINE
change captured, the associated ORDER change has also been captured, and vice versa.
Note that consistent set journalizing guarantees the consistency of the captured
changes. The set of available changes for which consistency is guaranteed is called the
Consistency Window. Changes in this window should be processed in the correct
sequence (ORDER followed by ORDER_LINE) by designing and sequencing integration
interfaces into packages.
Although consistent set journalizing is more powerful, it is also more difficult to set up.
It should be used when referential integrity constraints need to be ensured when
capturing the data changes. For performance reasons, consistent set journalizing is also
recommended when a large number of subscribers are required.
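
The ORDER and ORDER_LINE example can be modelled in a few lines. This is a toy model with hypothetical data. It assumes every captured order line belongs to a newly inserted order, and it does not reflect how ODI builds a consistency window internally.

```python
# Captured changes as (datastore, primary key, parent ORDER key or None).
# Hypothetical data: with simple journalizing, the insert into ORDER 10
# has not been captured yet when the ORDER_LINE change is processed.
simple_capture = [("ORDER_LINE", 501, 10)]
consistent_window = [("ORDER_LINE", 501, 10), ("ORDER", 10, None)]


def orphans(changes):
    """ORDER_LINE changes whose parent ORDER change is absent from the set."""
    orders = {key for store, key, _ in changes if store == "ORDER"}
    return [c for c in changes if c[0] == "ORDER_LINE" and c[2] not in orders]


def in_processing_order(changes):
    """Sort a window so parents come first: ORDER, then ORDER_LINE."""
    rank = {"ORDER": 0, "ORDER_LINE": 1}
    return sorted(changes, key=lambda c: rank[c[0]])


simple_orphans = orphans(simple_capture)         # the line has no captured parent
window_orphans = orphans(consistent_window)      # nothing is orphaned
ordered = in_processing_order(consistent_window)
```

The consistent set rules out orphans within the window, but the integration process must still apply the changes parent-first.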

Implementation

Tracking changes

ODI-EE provides two methods for tracking changes from source datastores to the
Changed Data Capture framework:

o triggers
o and relational database management system (RDBMS) log mining.

The triggers method creates triggers on the source tables to track changes as data is
inserted, updated, or deleted. This method can be implemented on most RDBMS, but it
can have an impact on the transactional performance of the source systems.
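
A minimal sketch of trigger-based capture, using SQLite from Python. The `orders` and `j_orders` tables and the trigger names are hypothetical, and real journalizing knowledge modules generate database-specific code. Each trigger writes the key of the affected row and a change flag into a journal table.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT);
    CREATE TABLE j_orders (flag TEXT, id INTEGER);  -- hypothetical journal table

    -- One trigger per DML event; inserts and updates share the 'I' flag.
    CREATE TRIGGER t_orders_ins AFTER INSERT ON orders
    BEGIN INSERT INTO j_orders VALUES ('I', NEW.id); END;

    CREATE TRIGGER t_orders_upd AFTER UPDATE ON orders
    BEGIN INSERT INTO j_orders VALUES ('I', NEW.id); END;

    CREATE TRIGGER t_orders_del AFTER DELETE ON orders
    BEGIN INSERT INTO j_orders VALUES ('D', OLD.id); END;
""")

con.execute("INSERT INTO orders VALUES (1, 'new')")
con.execute("UPDATE orders SET status = 'paid' WHERE id = 1")
con.execute("DELETE FROM orders WHERE id = 1")

journal = con.execute("SELECT flag, id FROM j_orders ORDER BY rowid").fetchall()
```

Every statement against `orders` now performs an extra insert into the journal, which is where the impact on transactional performance comes from.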
The second method involves mining the RDBMS logs, which are the internal change
history of the database engine. This method has no effect on the system's transactional
performance, but it is database-specific. It is supported out-of-the-box for:

o Oracle Database (through the LogMiner feature)
o and IBM DB2.

The Changed Data Capture framework used to manage changes is generic and open.
The change tracking method can be customized, and any third-party change provider
can be used to load the framework with changes.

Processing the change

Developers define the declarative rules for the captured changes within the integration
processes in the ODI-EE Designer graphical user interface—without having to code.
With the Designer, customers declaratively specify set-based maps between sources and
targets, and then the system automatically generates the data flow from the set-based
maps. The technical processes required for processing the changes captured are
implemented in Knowledge Modules.

Ensuring Data Consistency

Changes frequently involve several datastores at one time. For example, when an order
is created, updated, or deleted, it involves both the orders table and the order lines table.
When processing a new order line, the new order to which this line is related must be
taken into account.
ODI provides a mode of tracking changes, called Consistent Set Changed Data Capture,
for this purpose. This mode allows you to process sets of changes that guarantee data
consistency.

Source:

http://gerardnico.com/wiki/dit/odi/cdc
