NimmagaddaDreher 2006 ObjectRelationalDataWarehousing
Mapping and Modeling of Oil and Gas Relational Data Objects for Warehouse
Development and Efficient Data Mining
All content following this page was uploaded by Shastri Lakshman Nimmagadda on 16 September 2016.
Several data cube views have been interpreted through OLAP in [7, 12, 14], and issues relevant to data mining in [...]
[Figure: “Surveys” facts linked by one-to-many associations to the dimensions, including Period and Survey Schedules.]
Fig. 1: A star schema data model of “surveys” object
Ahmadi, Kuwait City, 61003, Kuwait. (phone: 965 398 3639; e-mail: NIMM@chevron.com).
Heinz Dreher, Curtin University of Technology, GPO Box U1987 Perth, Western Australia 6845. (e-mail: h.dreher@curtin.edu.au)
1-4244-9701-0/06/$20.00 © 2006 IEEE
As shown in Fig. 1, the star schema represents a dice of four object dimensions (contractor, survey lines, survey schedule and survey period) with survey object facts. The fact object class refers to every associated object dimension, which is to be placed in a cell of the cube. The dimensions in our example are:
1) The contractor object class describes each individual contractor of the contracting company (chosen by the oil & gas company), identified by name and ID. Contractors not obliged to the terms and conditions are not part of the data warehouse.
2) Survey schedule is the schedule of surveys to be conducted in a basin.
3) Survey lines are the several survey lines laid for this survey.
4) Period is the start of the survey (including the ending date of the survey).
5) Type of survey conducted in a type of basin.
The survey object fact here describes a single survey engaged by a contractor at a period, choosing a specific survey with a number of survey lines. The analytical space can be the whole cube, or it can be sliced according to the dimensions into smaller pieces. Each dimension is described according to an object class, which is specified according to a business subject. This is most important for the success of data warehouse design. Typical users are an exploration manager, an exploration database analyst, or a marketing engineer. The fact table itself is another object, represented again by a class object. The fact table refers to every dimension. The association between the fact tables and dimension tables is always one to many, which means each fact is associated with exactly one unit of the dimension.
In the implementation model, primary keys of dimensions are generated when the object model is transferred to the data model. The facts of the oil and gas data thus refer to the dimensions using the key migration from the migration tables. When the data models are composed, foreign keys are identified. Slicing and dicing in star schemas amount to a limitation (selection) of dimensions. It is a run-time issue, not a modeling one, but the model has to recognize the need for it.
Alternatively, all the links or associations among dimension and fact tables can also be stored in separate, logically created tables, called associative schemas. During the design and implementation stages, these associative schemas are triggered for mapping dimensional and fact data tables.

B. Snowflake Schemas
The basic star schema does not satisfy all the needs of data mining. Composite dimensions are used, such as the period of survey in the example above. The analyst investigates the patterns of data according to day, week, month, quarter, year, etc. In such a case, during the mapping process, dimension relations have to be normalized. Redundancy of dimension tables needs to be addressed to reduce the complexity of data slicing. The schema that is derived with such a process is called a snowflake. As shown in Fig. 3, the period dimension is normalized to days, weeks, months, quarters, and years. For every additional normalized dimension (day, week, month, quarter and year), slices can be made from the cubes. For this reason the dimension of period is pre-selected using one of the normalized tables, which is a simple addition to the data mining queries.
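The star and snowflake structures described in this section can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The table and key names follow the key labels visible in the paper's diagrams (Survey_ID, Period_ID, Day_ID and so on); the non-key columns, the helper names build_warehouse and surveys_per_contractor, and every row value are assumptions made only for illustration, not the authors' implementation.

```python
import sqlite3

# Hypothetical snowflake layout: key names follow the paper's diagrams,
# non-key columns and all rows are invented for illustration.
SCHEMA = """
CREATE TABLE Year  (Year_ID  INTEGER PRIMARY KEY, calendar_year INTEGER);
CREATE TABLE Month (Month_ID INTEGER PRIMARY KEY, month_no INTEGER);
CREATE TABLE Day   (Day_ID   INTEGER PRIMARY KEY, day_no INTEGER);
CREATE TABLE Period (
    Period_ID INTEGER PRIMARY KEY,
    Day_ID    INTEGER REFERENCES Day(Day_ID),
    Month_ID  INTEGER REFERENCES Month(Month_ID),
    Year_ID   INTEGER REFERENCES Year(Year_ID)
);
CREATE TABLE Contractor     (Contractor_ID     INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE SurveySchedule (SurveySchedule_ID INTEGER PRIMARY KEY, basin TEXT);
CREATE TABLE SurveyLine     (SurveyLine_ID     INTEGER PRIMARY KEY, label TEXT);
CREATE TABLE Surveys (
    Survey_ID         INTEGER PRIMARY KEY,
    SurveyLine_ID     INTEGER REFERENCES SurveyLine(SurveyLine_ID),
    SurveySchedule_ID INTEGER REFERENCES SurveySchedule(SurveySchedule_ID),
    Contractor_ID     INTEGER REFERENCES Contractor(Contractor_ID),
    Period_ID         INTEGER REFERENCES Period(Period_ID)
);
"""


def build_warehouse():
    """Create an in-memory warehouse and load a few invented rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO Year VALUES (?, ?)", [(1, 1976), (2, 1977)])
    conn.executemany("INSERT INTO Month VALUES (?, ?)", [(1, 4), (2, 12)])
    conn.executemany("INSERT INTO Day VALUES (?, ?)", [(1, 15), (2, 25)])
    conn.executemany("INSERT INTO Period VALUES (?, ?, ?, ?)",
                     [(1, 1, 1, 1), (2, 2, 2, 2)])
    conn.executemany("INSERT INTO Contractor VALUES (?, ?)",
                     [(1, "Contractor A"), (2, "Contractor B")])
    conn.execute("INSERT INTO SurveySchedule VALUES (1, 'Basin X')")
    conn.executemany("INSERT INTO SurveyLine VALUES (?, ?)",
                     [(1, "Line 1"), (2, "Line 2")])
    # Each fact row migrates one key from every dimension (one-to-many).
    conn.executemany("INSERT INTO Surveys VALUES (?, ?, ?, ?, ?)",
                     [(1, 1, 1, 1, 1), (2, 2, 1, 1, 2), (3, 1, 1, 2, 2)])
    return conn


def surveys_per_contractor(conn, year):
    """Slice the cube on one year, then count survey facts per contractor."""
    return conn.execute(
        """
        SELECT c.name, COUNT(*)
        FROM Surveys s
        JOIN Period p     ON p.Period_ID = s.Period_ID
        JOIN Year y       ON y.Year_ID = p.Year_ID
        JOIN Contractor c ON c.Contractor_ID = s.Contractor_ID
        WHERE y.calendar_year = ?
        GROUP BY c.name
        ORDER BY c.name
        """,
        (year,),
    ).fetchall()
```

The slice on the normalized Year table is the "simple addition" to the query that the text mentions: the selection on a period level is one extra join and one WHERE condition, while the fact table itself stays unchanged.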
[...] shown in Fig. 4.
[Figures: UML class diagrams of the star and snowflake schemas for the “Surveys” object. Recoverable classes and keys:
Surveys (facts): [PK]Survey_ID : Integer; [FK]SurveyLine_ID : Integer; [FK]SurveySchedule_ID : Integer; [FK]Contractor_ID : Integer; [FK]Period_ID : Integer
Period (dimension): [PK]Period_ID : Integer; [FK]Day_ID : Integer; [FK]Month_ID : Integer; [FK]Year_ID : Integer
Day, Month, Year (normalized dimensions): [PK]Day_ID : Integer; [PK]Month_ID : Integer; [PK]Year_ID : Integer
Survey Schedule (dimension): [PK]SurveySchedule_ID : Integer
Contractor (dimension): [PK]Contractor_ID : Integer
Survey Lines (dimension): [PK]SurveyLine_ID : Integer
The dimensions are linked to the facts by one-to-many associations, most of them marked <<Non-Identifying>>.]
[...] dimension associated (many-to-many) with surveys object
Fig. 5: A snowflake schema model with normalized period dimension and association with “Surveys” object

C. Many-to-many relationships
On every survey or investigation, there may be many survey lines. There can be several of them if the survey is extended to offshore and transitional zones from onshore areas. In such a case, one-to-many relationships will not be [...] many-to-many associations exist. However, such an association cannot be implemented in the star schema. A [...] analytical space.
[Figure: “Surveys” facts with the dimensions Period and Survey Schedules, and a “SurveyLine Relation” table ([FK]Survey_ID : Integer, [FK]SurveyLine_ID : Integer).]
[...] dimension of “Survey Lines”
The many-to-many association cannot be implemented [...]

D. Object class hierarchies
Data mining discovers both corporate and operational information from the oil and gas company data that is hidden at the bottom of the operational sub-systems. Demography data can be built based on the contractor dimension in different hierarchies. Contractors are grouped according to city, state or ZIP code, and then according to country, as shown in Fig. 8.
[Figure: dimensions Contractor, ZIP Code and Country arranged as a hierarchy and attached to the “Surveys” facts; Contractor carries [PK]Contractor_ID : Integer and [FK]ZIPCode_ID : Integer.]
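Sections C and D can be sketched together in a few lines of Python with sqlite3. The sketch assumes that the many-to-many link between surveys and survey lines is held in a separate relation table keyed by Survey_ID and SurveyLine_ID, as the “SurveyLine Relation” labels in the diagram suggest, and that the contractor hierarchy runs Contractor, ZIP code, country. The non-key columns, the helper names and all row values are hypothetical.

```python
import sqlite3

# Hypothetical tables: key names follow the diagram labels, rows are invented.
SCHEMA = """
CREATE TABLE Country (Country_ID INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE ZIPCode (
    ZIPCode_ID INTEGER PRIMARY KEY,
    code       TEXT,
    Country_ID INTEGER REFERENCES Country(Country_ID)
);
CREATE TABLE Contractor (
    Contractor_ID INTEGER PRIMARY KEY,
    name          TEXT,
    ZIPCode_ID    INTEGER REFERENCES ZIPCode(ZIPCode_ID)
);
CREATE TABLE SurveyLine (SurveyLine_ID INTEGER PRIMARY KEY);
CREATE TABLE Surveys (
    Survey_ID     INTEGER PRIMARY KEY,
    Contractor_ID INTEGER REFERENCES Contractor(Contractor_ID)
);
-- Relation table: one row per (survey, survey line) pair.
CREATE TABLE SurveyLineRelation (
    Survey_ID     INTEGER REFERENCES Surveys(Survey_ID),
    SurveyLine_ID INTEGER REFERENCES SurveyLine(SurveyLine_ID),
    PRIMARY KEY (Survey_ID, SurveyLine_ID)
);
"""


def build_example():
    """Create the in-memory tables and load a few invented rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO Country VALUES (?, ?)",
                     [(1, "Country A"), (2, "Country B")])
    conn.executemany("INSERT INTO ZIPCode VALUES (?, ?, ?)",
                     [(1, "Z1", 1), (2, "Z2", 1), (3, "Z3", 2)])
    conn.executemany("INSERT INTO Contractor VALUES (?, ?, ?)",
                     [(1, "C1", 1), (2, "C2", 2), (3, "C3", 3)])
    conn.executemany("INSERT INTO SurveyLine VALUES (?)", [(1,), (2,), (3,)])
    conn.executemany("INSERT INTO Surveys VALUES (?, ?)",
                     [(1, 1), (2, 2), (3, 3)])
    # Survey 1 uses lines 1 and 2; line 2 is shared with survey 2.
    conn.executemany("INSERT INTO SurveyLineRelation VALUES (?, ?)",
                     [(1, 1), (1, 2), (2, 2), (3, 3)])
    return conn


def lines_per_survey(conn):
    """Number of survey lines attached to each survey via the relation table."""
    rows = conn.execute(
        "SELECT Survey_ID, COUNT(*) FROM SurveyLineRelation GROUP BY Survey_ID"
    ).fetchall()
    return dict(rows)


def surveys_per_country(conn):
    """Roll survey facts up the Contractor -> ZIP code -> Country hierarchy."""
    return conn.execute(
        """
        SELECT co.name, COUNT(*)
        FROM Surveys s
        JOIN Contractor c ON c.Contractor_ID = s.Contractor_ID
        JOIN ZIPCode z    ON z.ZIPCode_ID = c.ZIPCode_ID
        JOIN Country co   ON co.Country_ID = z.Country_ID
        GROUP BY co.name
        ORDER BY co.name
        """
    ).fetchall()
```

Keeping the pairs in their own table leaves the fact table with one row per survey, so counts per contractor or per country are not inflated by the number of lines a survey happens to have.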
[...] is depicted in Fig. 10. A geometrical dice is an example of a three-dimensional space with all three dimensions of the same size. Imagining a cube with four object class dimensions of four units each, we get 4^4 = 256 cells of equal structure. The multidimensional analysis space (or a data warehouse dice) differs just in details from a geometrical space.
[Figure labels: Oil & Gas, Production, Data Mart, Data Mining, Data Delivery, Information, Decision Support]
Fig. 11: Position of the data mart within an oil and gas company
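The cell count quoted for the data warehouse dice is the product of the dimension sizes. The short sketch below assumes four dimensions of four units each, which is the reading under which the stated figure of 256 cells holds; the dimension names and the helper functions cell_count and slice_count are illustrative only.

```python
from math import prod

# Assumed sizes: four units along each of the four dimensions.
dimension_sizes = {
    "contractor": 4,
    "survey_line": 4,
    "survey_schedule": 4,
    "period": 4,
}


def cell_count(sizes):
    """Total number of cells in the dice: the product of all dimension sizes."""
    return prod(sizes.values())


def slice_count(sizes, fixed):
    """Cells left after fixing a single member in every dimension in `fixed`."""
    return prod(n for name, n in sizes.items() if name not in fixed)
```

With these sizes, cell_count(dimension_sizes) gives 256, and slicing on a single period, slice_count(dimension_sizes, {"period"}), leaves 64 cells.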
[...] architecture
[Figure labels: Contractor, Time, Wells, Fact; META DATA; Cleaning, Integration, Etc.; DW Server; Warehouse; data-mining]
[...] capacity and preserving the flexibility of data structuring. The data warehouse, developed using the relational objects of the oil and gas data items, has several architectural views. With the data structures designed and the application developed for an oil and gas company, these architectures [...] Fig. 11.
[Figure labels: Field1, Field3, Field4; Exploration, Production; March, April, May; Facts; Data Mining, Knowledge Mapping, Petro Data Mining]
[Figure label: Petroleum bearing field objects of Gulf Basins]
Fig. 13: Interoperability of data objects among petro-fields

[...] data is to keep the operational managers active and sharp in their managerial decision support, so that they act promptly. Million-dollar decisions are taken based on the information processed by the data warehouse with accountability, and it provides an added value in return: tens of millions of dollars saved. Data objects from different warehouse marts, such as exploration, drilling and production, are integrated, and the knowledge built from data integration is interpreted in terms [...]

Two such situations discussed in this paper, the contractor-surveys and operator-wells problems, have been processed and interpreted as shown in Figs. 16, 17a and 17b. Oil-field base data objects are reused (interoperability) [...]

[Figure: aggregated object views of the “Surveys” object, drawn as a cube with axes Time, Contractor and Depth/Distance and labelled cells (B), (E), (H). Cell annotations: Start of Survey = 4/15/76; End of Survey = 12/25/77; Date of VSP = 12/5/7; Period ID = 15; Survey ID = SR1001; Start of Project = 15/4/76; Contractor ID = CR1100; Contractor ID = CT1001; Logger ID = LG1003; Special Survey = 3D-Time Lapse; Survey ID = WS1001; Date Logging = 12/5/78; Geologist ID = EN1001; Geologist Name = Reuben; Survey Name = 2D Seismic; Start Date = 15/4/78; End Date = 12/05/77; Contractor ID = CT0100.]
[Figure labels: Line Objects, Point Object, Net Oil-Pay Region Object, Net Reservoir Region Object, Structure Region Object]
Fig. 14: OLAP model for petroleum exploration

Exploration managers are responsible for providing the valued processed exploration information, so that critical decisions made in planning new exploratory or development wells in the frontier oil-bearing sedimentary basins are accurate and precise. Fig. 15 exhibits the interoperability of petroleum data objects among basins, when data objects are [...]

Multidimensional object data views for the super class object “wells” are shown in Figs. 17a and 17b. All the available data and information are now integrated (Fig. 16) and available in a centrally located enterprise data warehouse (EDW), making it easy for all managers to take timely decisions. Different data warehouse architectures may be tried by creating several data marts that can map and process individual operational units’ object classes and make them available to exploration managers. Similar data marts may be initiated for drilling, production, marketing, human resources and other support engineering class objects.
petroleum data objects among basins, when data objects are Aggregated Object Views
Time
F
r
o
ct
ra
A
nt
o
C
Time E
Wells
Wells
(G)
Contractor ID = CT1001
Engineer ID = EN1001 C
Aggregated Object Views
Contractor ID = CT1001 F
petroleum bearing sedimentary basins Logger ID = LG1003
Logger Name = Edward
Well ID = WS1001
r
to
(H)
All the survey information processed by OLAP is
tr
on
Contractor ID = CT1001
Geologist ID = EN1001
C
E
presented by different combination of aggregated views as (E)
Geologist Name = Reuben
B
shown in Fig. 16.
Contractor ID = CT1001
Contractor Name = Charles
Well ID = WS1001
Period ID = 21
(B)
Well facts stored in the oil and gas data warehouse or data Well ID = WS1001
Well Name = Karrata1
Wells
Start Date = 15/4/78
marts, are processed by OLAP procedure [7] and presented End Date = 12/25/78
Contractor ID = CT0012
Aggregated Object Views
in aggregated data views convenient to interpret and extract
knowledge from the past historical oil and gas business data. Fig. 17b: Computed aggregate views of “wells” object
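An aggregated view of this kind can be sketched as a count of facts grouped by a chosen subset of dimensions. The example below is a plain-Python illustration, not the OLAP procedure of [7]; the fact rows are invented, with identifiers written in the style of those in the figures, and aggregate_view is a hypothetical helper name.

```python
from collections import Counter

# Invented well facts: one row per well, with two dimension keys.
well_facts = [
    {"well": "WS1001", "contractor": "CT1001", "period": 21},
    {"well": "WS1002", "contractor": "CT1001", "period": 21},
    {"well": "WS1003", "contractor": "CT0012", "period": 21},
    {"well": "WS1004", "contractor": "CT0012", "period": 22},
]


def aggregate_view(facts, dims):
    """Count facts per combination of the chosen dimensions.

    Dimensions not listed in `dims` are rolled up (summed over).
    """
    return Counter(tuple(fact[d] for d in dims) for fact in facts)
```

Calling aggregate_view(well_facts, ["contractor"]) gives the per-contractor view, aggregate_view(well_facts, ["contractor", "period"]) the finer contractor-by-period view, and an empty dimension list rolls everything up into a single total.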
REFERENCES
[1] D’Orazio, R. and Happel, G. (1996). Practical Data Modelling for Database Design. The Information Technology Series, John Wiley & Sons Australia Ltd., pp. 180-280.
[2] Gornik, D. (2000). Data Modelling for Data Warehouses. Rational Software White Paper, www.rational.com/worldwide.
[3] Hoffer, J.A., Prescott, M.B. and McFadden, F.R. (2002). Modern Database Management, Sixth Edition, Prentice Hall, pp. 260-520.
[4] Huynh, T.N., Mangisengi, O. and Tjoa, A.M. (2000). Metadata for Object-Relational Data Warehouse. Proceedings of the International Workshop on Design and Management of Data Warehouses, Stockholm, Sweden, June 5-6.
[5] King, E. (2000). Data Warehousing and Data Mining. Computer Technology Research Corporation, pp. 50-110, http://www.ctrcorp.com.
[6] Longley, I.M., Bradshaw, M.T. and Hebberger, J. (2001). Australian petroleum provinces of the 21st century. In Downey, M.W., Threet, J.C. and Morgan, W.A. (2001), Petroleum provinces of the 21st century, AAPG Memoir 74, pp. 287-317.
[7] Moody, D.L. and Kortink, M.A.R. (2003). From ER Models to Dimensional Models: Bridging the Gap between OLTP and OLAP Design, Part 1 and Part 2. Business Intelligence Journal, Summer and Fall editions, Vol. 8(3), http://www.tdwi.org.
[8] Nimmagadda, S.L., Dreher, H. and Rudra, A. (2005a). Data Warehouse Structuring Methodologies for Efficient Mining of Western Australian Petroleum Data Sources. Proceedings of the IEEE conference INDIN 05, Perth, Australia.
[9] Nimmagadda, S.L., Dreher, H. and Rudra, A. (2005a). Ontology of Western Australian petroleum exploration data for effective data warehouse design and data mining. Proceedings of the 3rd International IEEE Conference on Industrial Informatics, Perth, Australia, August.