Geometry in an RDBMS
-Anubhav Kishore
INDEX
GEOMETRIC OBJECTS STANDARDS
ARCSDE AND GEOMETRY OBJECTS STORAGE
GEOMETRY METADATA
GEOMETRY STORAGE
ARCSDE COMPRESSED BINARY REPRESENTATION
SQL SPATIAL TYPE
These different storage schemas are defined in abstract form by the standards organizations and may take
advantage of DBMS-specific data types.
For example, the Binary storage schema may be implemented using any DBMS-supported binary data type;
the specification defines the binary type only in the abstract.
ARCSDE AND GEOMETRY OBJECTS STORAGE
ArcSDE manages the physical storage of geometry for features using standard data types provided by the host
DBMS.
Some DBMSs have spatial data types, while others provide standard binary or binary large object (BLOB)
storage types.
ArcSDE geometry storage is dependent on the capabilities of your DBMS. No matter what geometry storage
type you use, an ArcSDE client application does not need to know what the geometry storage type is.
In the case of Oracle, you can use any combination of available storage methods. You may choose to store a
point layer as Oracle Spatial geometry types and a polygon layer as binary.
The decision on how to store your geometry will be based on the DBMS you use and other requirements
specific to your implementation.
This table summarizes the DBMS and geometry storage options available.
Note: ST_GEOMETRY is a superclass with several subclasses (e.g., ST_polygon).
DBMS GEOMETRY STORAGE COLUMN TYPE
GEOMETRY METADATA
This information is managed in three DBMS tables, called the LAYERS, SPATIAL_REF_SYS, and
GEOMETRY_COLUMNS tables. Three separate tables are used so that data can be normalized (in the
case of spatial references) and for ease of integration with the OpenGIS Simple Features in SQL
Specification.
LAYERS Table:
The LAYERS table stores a row for each spatial column in the database.
Applications use the layer properties to discover available spatial data sources.
The layer properties are used by ArcSDE to constrain and validate the contents of the spatial column,
to index geometry values, and to properly create and manage the associated DBMS tables.
The entries found in the LAYERS table vary slightly depending on the underlying DBMS.
GEOMETRY_COLUMNS Table:
The GEOMETRY_COLUMNS table stores a row for each column of type Geometry in the database that
complies with the OpenGIS SQL specification.
The ArcSDE application server treats this table as "write only"—the only time it is accessed by the
ArcSDE server is when a layer is added or deleted that uses an OpenGIS SQL data format.
This table is defined by the OpenGIS SQL specification, and may be updated by other applications,
with geometry columns not managed by ArcSDE.
When a new Geometry column is created in an OpenGIS-compliant format, the fully qualified table name,
column name, and spatial reference ID (SRID) are added to the GEOMETRY_COLUMNS table.
SPATIAL_REF_SYS Table:
Each Geometry column is associated with a Spatial Reference System.
ArcSDE stores information on each Spatial Reference System in the SPATIAL_REF_SYS table.
The columns of this table are those defined by the OpenGIS SQL Specification (SRID, SRTEXT,
AUTH_NAME, and AUTH_SRID), and those required by ArcSDE for internal coordinate transformation.
The Spatial Reference System identifies the coordinate system for a geometry and gives meaning to
the numeric coordinate values for the Geometry.
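As a rough illustration of how the two OpenGIS metadata tables relate, the sketch below uses Python's built-in sqlite3 module as a stand-in DBMS. The tables are cut down to a few columns, the F_TABLE_SCHEMA, F_TABLE_NAME, and F_GEOMETRY_COLUMN names follow the OpenGIS naming, and the sample rows and helper function names are made up for the example. It is not the ArcSDE schema itself.

```python
import sqlite3


def build_metadata(conn):
    """Create cut-down metadata tables and load one made-up geometry column."""
    conn.executescript("""
        CREATE TABLE spatial_ref_sys (
            srid      INTEGER PRIMARY KEY,
            auth_name TEXT,
            auth_srid INTEGER,
            srtext    TEXT
        );
        CREATE TABLE geometry_columns (
            f_table_schema    TEXT    NOT NULL,
            f_table_name      TEXT    NOT NULL,
            f_geometry_column TEXT    NOT NULL,
            srid              INTEGER NOT NULL REFERENCES spatial_ref_sys (srid)
        );
    """)
    conn.execute(
        "INSERT INTO spatial_ref_sys VALUES (?, ?, ?, ?)",
        (1, "EPSG", 4326, "GEOGCS[...]"),  # srtext shortened to a placeholder
    )
    conn.execute(
        "INSERT INTO geometry_columns VALUES (?, ?, ?, ?)",
        ("gis", "parcels", "shape", 1),  # hypothetical schema, table, column
    )


def spatial_reference_for(conn, table, column):
    """Return (auth_name, auth_srid) for a geometry column, or None if unknown."""
    sql = """
        SELECT s.auth_name, s.auth_srid
        FROM geometry_columns AS g
        JOIN spatial_ref_sys AS s ON s.srid = g.srid
        WHERE g.f_table_name = ? AND g.f_geometry_column = ?
    """
    return conn.execute(sql, (table, column)).fetchone()


conn = sqlite3.connect(":memory:")
build_metadata(conn)
print(spatial_reference_for(conn, "parcels", "shape"))
```

Keeping the spatial reference in its own table means many geometry columns can share one SRID row, which is the normalization mentioned earlier.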
LAYERS Table Schema:
CREATE TABLE layers (
  layer_id            INTEGER NOT NULL,
  description         VARCHAR2(65) NULL,
  table_name          VARCHAR2(160) NOT NULL,
  owner               VARCHAR2(32) NOT NULL,
  spatial_column      VARCHAR2(32) NOT NULL,
  srid                INTEGER NOT NULL,
  storage_type        INTEGER NOT NULL,
  eflags              INTEGER NOT NULL,
  gsize               FLOAT(64) NOT NULL,
  gsize2              FLOAT(64) NOT NULL,
  gsize3              FLOAT(64) NOT NULL,
  minx                FLOAT(64) NOT NULL,
  miny                FLOAT(64) NOT NULL,
  maxx                FLOAT(64) NOT NULL,
  maxy                FLOAT(64) NOT NULL,
  cdate               INTEGER NOT NULL,
  layer_config        VARCHAR2(32) NULL,
  optimal_array_size  INTEGER NULL,
  stats_date          INTEGER NULL,
  minimum_id          INTEGER NULL,
  CONSTRAINT layers_uk1 UNIQUE (layer_id),
  CONSTRAINT layers_uk2 UNIQUE (table_name, owner)
);
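The two uniqueness constraints and the "discover available spatial data sources" use of the LAYERS table can be tried out with a small sketch. It assumes Python's sqlite3 module as a stand-in DBMS and keeps only a handful of the columns; the helper names register_layer and list_spatial_sources are hypothetical and not part of any ArcSDE API.

```python
import sqlite3

# Reduced version of the LAYERS table: only the columns needed for the demo.
LAYERS_DDL = """
CREATE TABLE layers (
    layer_id       INTEGER NOT NULL,
    table_name     TEXT    NOT NULL,
    owner          TEXT    NOT NULL,
    spatial_column TEXT    NOT NULL,
    srid           INTEGER NOT NULL,
    CONSTRAINT layers_uk1 UNIQUE (layer_id),
    CONSTRAINT layers_uk2 UNIQUE (table_name, owner)
)
"""


def register_layer(conn, layer_id, table_name, owner, spatial_column, srid):
    """Insert one layer row; return False if a uniqueness constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO layers VALUES (?, ?, ?, ?, ?)",
                (layer_id, table_name, owner, spatial_column, srid),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def list_spatial_sources(conn):
    """List registered spatial columns as owner.table.column strings."""
    rows = conn.execute(
        "SELECT owner, table_name, spatial_column FROM layers ORDER BY layer_id"
    )
    return [f"{owner}.{table}.{column}" for owner, table, column in rows]


conn = sqlite3.connect(":memory:")
conn.execute(LAYERS_DDL)
register_layer(conn, 1, "parcels", "gis", "shape", 1)
register_layer(conn, 2, "parcels", "gis", "shape2", 1)  # rejected by layers_uk2
print(list_spatial_sources(conn))
```

With layers_uk2 in place, a second registration for the same owner and table is refused, so each table appears in LAYERS at most once.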
GEOMETRY STORAGE
The specification for the SQL Spatial Type has been extended to support ArcSDE functionality. This is
possible because the actual geometry type structure is hidden behind the SQL function interfaces.
ARCSDE COMPRESSED BINARY REPRESENTATION
The OpenGIS Specification for Simple Features in SQL defines a Binary geometry schema as one of three schemas
for managing spatial data.
In the Binary geometry schema, the coordinate values for the shape are stored in a binary object (called a BLOB).
The BLOB is managed in a separate Geometry Table. Access to the geometry from the Business Table is through a
foreign key called the Geometry ID, or GID. The following tables illustrate the relationship between the Business
Table and the Geometry Table in the Binary geometry schema:
101 1 1 (x,y,...,x,y) 1
102 2 2 (x,y,...,x,y) 2
103 3 3 (x,y,...,x,y) 3
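The GID relationship between the Business Table and the Geometry Table can be sketched with Python's sqlite3 module standing in for the DBMS. The table names, the parcel_id column, and the placeholder BLOB bytes are all assumptions made for the example; real ArcSDE tables carry more columns.

```python
import sqlite3


def make_tables(conn):
    """Create a business table that points at a geometry table through GID."""
    conn.executescript("""
        CREATE TABLE geometry_table (
            gid    INTEGER PRIMARY KEY,
            points BLOB NOT NULL
        );
        CREATE TABLE business_table (
            parcel_id INTEGER PRIMARY KEY,
            gid       INTEGER NOT NULL REFERENCES geometry_table (gid)
        );
    """)
    # Placeholder bytes only; they are not a real compressed geometry.
    conn.executemany(
        "INSERT INTO geometry_table VALUES (?, ?)",
        [(1, b"\x01\x02"), (2, b"\x03\x04"), (3, b"\x05\x06")],
    )
    conn.executemany(
        "INSERT INTO business_table VALUES (?, ?)",
        [(101, 1), (102, 2), (103, 3)],
    )


def geometry_for(conn, parcel_id):
    """Follow the GID foreign key and return the stored BLOB, or None."""
    row = conn.execute(
        """
        SELECT g.points
        FROM business_table AS b
        JOIN geometry_table AS g ON g.gid = b.gid
        WHERE b.parcel_id = ?
        """,
        (parcel_id,),
    ).fetchone()
    return None if row is None else bytes(row[0])


conn = sqlite3.connect(":memory:")
make_tables(conn)
print(geometry_for(conn, 102))
```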
The ArcSDE binary schema implements the OpenGIS Binary schema, with additional functionality,
and a compressed binary representation in place of the OpenGIS Well-Known Binary Representation
for Geometry.
The compressed binary representation provides support for geometry properties not defined by the
OpenGIS specification, including: Elevations, Measures, Annotation, and CAD data.
Functions are available in the C, Java, and SQL type APIs to convert this compressed binary
representation into OpenGIS Well-Known Binary and Well-Known Text representations of Geometry.
The ArcSDE compressed binary representation of geometry is used to store binary geometry.
This binary representation requires that an offset and scale be applied to the coordinates of a
geometric object; the resulting integer coordinates are then encoded using the delta from the
previous coordinate. An optional CAD and ANNO object is appended to the geometric object.
Coordinate values
Internally, all ArcSDE coordinates are non-negative 32-bit integers between 0 and 2147483647.
This format provides better data accuracy, data integrity, and processing speed than real numbers.
Developers should be aware of the internal integer representation because a coordinate can be too
large to store in a layer, in which case the ArcSDE software returns the error
SE_COORD_OUT_OF_BOUNDS. Otherwise, developers never need to work directly with the integer values.
Because real-world coordinates are often neither positive nor integer, ArcSDE data requires an offset
distance (a false origin) to ensure numbers are positive and a minimum resolution multiplier (called
the scale) to convert real numbers to integers.
Offset distances are specified in the same units as the data. The scale can be any positive value up to
2147483645.
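To make the offset and scale idea concrete, here is a minimal Python sketch. It assumes the conversion is internal = (value - offset) * scale, rounded to the nearest integer; the rounding rule, the function names, and the exception class are assumptions for illustration and are not the ArcSDE API.

```python
MAX_COORD = 2147483647  # upper bound of the internal integer range


class CoordOutOfBounds(ValueError):
    """Stand-in for the SE_COORD_OUT_OF_BOUNDS error (not the real API)."""


def to_internal(value, offset, scale):
    """Shift by the false origin, multiply by the scale, and round to an integer."""
    internal = round((value - offset) * scale)  # rounding rule is an assumption
    if not 0 <= internal <= MAX_COORD:
        raise CoordOutOfBounds(f"{value} maps to {internal}, outside 0..{MAX_COORD}")
    return internal


def to_real(internal, offset, scale):
    """Invert to_internal, up to the resolution lost by rounding."""
    return internal / scale + offset


# A longitude of -122.5 with a false origin of -400 and a scale of 1000
# (that is, a resolution of 0.001 data units).
stored = to_internal(-122.5, -400.0, 1000.0)
print(stored, to_real(stored, -400.0, 1000.0))
```

A larger scale gives finer resolution but shrinks the range of real-world values that fit below the integer limit, which is why an overly large coordinate triggers the out-of-bounds error.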
Logical representation of ArcSDE feature geometry
This section describes the logical view of how an ArcSDE feature’s geometry is represented in a binary
stream. There are three issues to present: coordinate ordering, multipart delineation, and point
compression.
Coordinate ordering
An ArcSDE feature’s geometry is represented by one or more coordinates. The coordinates consist of, at a
minimum, an x,y pair. A feature might also have z (zed) or measure (m) values associated with each x,y pair.
Each of these values, x, y, z, and m, is represented internally as a 32-bit integer. The order in which these
coordinates are stored in a binary stream is x/y, x/y, ..., x/y, z, z, ..., z, m, m, ..., m (again, with the z- and m-
values being optional). A one-to-one correspondence exists between the z- or m-values and the x,y pairs. In
other words, for each z-coordinate or measure value present in the feature geometry, an x,y pair exists.
Multipart delineation
An ArcSDE feature may have one or more geometric parts (a single-part or multipart feature). Each part is
delineated by a separator coordinate within the binary stream which represents a feature’s geometry. The
separator coordinate has a predefined value. Beginning with a feature’s second part, the separator is the first
coordinate of the part’s ordered coordinate list. The coordinate list of a multipart feature is stored in a binary
stream as x/y, x/y, ..., x/y, <separator>, x/y, x/y, ..., x/y, z, z, ..., z, <separator>, z, z, ..., z, m, m, ..., m,
<separator>, m, m, ..., m (again, the z- and m-values are optional).
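The ordering rules for coordinates and parts can be sketched in a few lines of Python. The example works on already-converted integer values, represents each point as an (x, y, z, m) tuple, and uses a symbolic separator token; the function name and the data layout are assumptions for illustration only.

```python
SEP = "<separator>"  # symbolic token for the separator coordinate


def logical_stream(parts, has_z=False, has_m=False):
    """Flatten parts into x/y pairs, then z-values, then m-values.

    parts is a list of parts; each part is a list of (x, y, z, m) tuples.
    """
    def section(pick):
        out = []
        for index, part in enumerate(parts):
            if index > 0:
                out.append(SEP)  # the separator leads every part after the first
            out.extend(pick(point) for point in part)
        return out

    stream = section(lambda p: (p[0], p[1]))
    if has_z:
        stream += section(lambda p: p[2])
    if has_m:
        stream += section(lambda p: p[3])
    return stream


two_parts = [
    [(0, 0, 5, None), (1, 1, 6, None)],  # first part
    [(2, 2, 7, None)],                   # second part
]
print(logical_stream(two_parts, has_z=True))
```

Because the separators appear at the same positions in the x/y, z, and m runs, the one-to-one correspondence between x,y pairs and their z- or m-values is preserved across parts.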
Point compression
Within the binary stream, each of the x/y, z, and measure values is compressed in a byte-order independent
manner. The compression of feature coordinates is done in two steps. First, all values are converted to a
relative-offset scheme, then each relative-offset value is packed into the minimum number of bytes required
to represent the value.
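The packing step can be illustrated with a generic variable-length encoding. The byte layout used by ArcSDE is not given here, so the sketch below substitutes a common zigzag plus 7-bit varint scheme purely as a stand-in: it shows how small relative-offset values end up in fewer bytes, and why such a layout is independent of machine byte order. All function names are hypothetical.

```python
def zigzag(n):
    """Map a signed integer to an unsigned one so small magnitudes stay small."""
    return n * 2 if n >= 0 else -n * 2 - 1


def unzigzag(u):
    """Invert zigzag."""
    return u // 2 if u % 2 == 0 else -(u + 1) // 2


def pack(values):
    """Pack signed integers, 7 data bits per byte, high bit set on all but the last byte."""
    out = bytearray()
    for value in values:
        u = zigzag(value)
        while True:
            low7 = u & 0x7F
            u >>= 7
            if u:
                out.append(low7 | 0x80)  # more bytes follow for this value
            else:
                out.append(low7)         # last byte of this value
                break
    return bytes(out)


def unpack(data):
    """Invert pack and return the list of signed integers."""
    values, u, shift = [], 0, 0
    for byte in data:
        u |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7
        else:
            values.append(unzigzag(u))
            u, shift = 0, 0
    return values


relative = [1000000, 3, -2, 0, -70]
packed = pack(relative)
print(len(packed), unpack(packed) == relative)
```

Here five values that would take 20 bytes as fixed 32-bit integers fit in 8 bytes, because only the first, absolute value is large.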
Physical representation of ArcSDE feature geometry
This section describes the physical view of how an ArcSDE feature’s geometry is stored in a binary
stream.
There are three issues to present: separators, point compression, and the binary layout.
Part separators
The separators which delineate the parts of a feature are physically represented by an x-value of
negative one (-1) and a y-value of zero (0); the z- and m-values are undefined. Separators do not
require any special logic when being compressed.
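Since valid internal coordinates are never negative, an x-value of -1 cannot be confused with a real point. A short Python sketch of splitting a decompressed list of x/y pairs back into parts, with a hypothetical function name, could look like this:

```python
SEPARATOR = (-1, 0)  # x = -1, y = 0 marks the start of the next part


def split_parts(xy_pairs):
    """Split a flat list of absolute (x, y) pairs into a list of parts."""
    parts = [[]]
    for pair in xy_pairs:
        if pair == SEPARATOR:
            parts.append([])  # start collecting the next part
        else:
            parts[-1].append(pair)
    return parts


flat = [(0, 0), (5, 5), (-1, 0), (9, 9), (12, 9)]
print(split_parts(flat))
```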
Point compression
The compression or decompression of the coordinates stored in the binary stream is a two-step
process: the conversion to/from the relative-offset scheme and the packing/unpacking of bytes. To
compress coordinates, the values are converted to relative-offsets, then packed into a byte array. To
decompress coordinates, the byte array is unpacked, then the values are converted to absolute
values. Each step is described below.
Relative-offset value calculation
The goal of converting coordinate values to a relative offset scheme is to make the values as small as
possible so that they require fewer bits to represent them. In an array of relative-offset values, the
first value is an absolute value (stored as a 32-bit integer) while each subsequent value is the offset,
or difference, from the previous absolute value. Therefore, given N absolute values, the relative-offset
values are calculated by:
relative_value[0] = absolute_value[0]
relative_value[1] = absolute_value[1] - absolute_value[0]
[...]
relative_value[N-2] = absolute_value[N-2] - absolute_value[N-3]
relative_value[N-1] = absolute_value[N-1] - absolute_value[N-2]
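The same calculation, together with its inverse for decompression, can be written as a short Python sketch. The function names are chosen for the example only.

```python
from itertools import accumulate


def to_relative(absolute):
    """Keep the first value; replace each later value by its difference from the previous one."""
    return [
        value if i == 0 else value - absolute[i - 1]
        for i, value in enumerate(absolute)
    ]


def to_absolute(relative):
    """Rebuild absolute values as a running sum of the relative offsets."""
    return list(accumulate(relative))


xs = [1000, 1003, 1001, 1001]
print(to_relative(xs), to_absolute(to_relative(xs)) == xs)
```

Neighbouring coordinates of a feature tend to be close together, so after the first value the offsets are usually small numbers that need few bits.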
CREATE TABLE F<layer#> (
  fid          INTEGER NOT NULL,
  numofpts     INTEGER NOT NULL,
  entity       SMALLINT NOT NULL,
  eminx        FLOAT(64) NOT NULL,
  eminy        FLOAT(64) NOT NULL,
  emaxx        FLOAT(64) NOT NULL,
  emaxy        FLOAT(64) NOT NULL,
  eminz        FLOAT(64) NOT NULL,
  emaxz        FLOAT(64) NOT NULL,
  min_measure  FLOAT(64) NOT NULL,
  max_measure  FLOAT(64) NOT NULL,
  area         FLOAT(64) NOT NULL,
  len          FLOAT(64) NOT NULL,
  points       <BLOB DATATYPE>,
  PRIMARY KEY (fid)
);
FID—The ID of the geometry.
NUMOFPTS—The number of coordinate values in the geometry.
ENTITY—An accumulated mask of geometry and coordinate type flags.
EMINX—Minimum X ordinate value.
EMINY—Minimum Y ordinate value.
EMAXX—Maximum X ordinate value.
EMAXY—Maximum Y ordinate value.
EMINZ—Minimum Z ordinate value (Initializes to NULL).
EMAXZ—Maximum Z ordinate value (Initializes to NULL).
MIN_MEASURE—Minimum M ordinate value (Initializes to NULL).
MAX_MEASURE—Maximum M ordinate value (Initializes to NULL).
AREA—Calculated area for the geometry.
LEN—Calculated length or perimeter for the geometry.
Points—Geometry coordinate values.
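As a rough illustration of how the envelope and length columns relate to the coordinate values, the Python sketch below derives them for a single-part line. It is only meant to show what the fields contain; how ArcSDE actually computes and populates them is not described here, and the function names are hypothetical.

```python
import math


def envelope(points):
    """Return the bounding box of (x, y) points keyed by the column names."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return {"eminx": min(xs), "eminy": min(ys), "emaxx": max(xs), "emaxy": max(ys)}


def path_length(points):
    """Sum the straight-line distances between consecutive points."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


line = [(0, 0), (3, 4), (3, 10)]
print(envelope(line), path_length(line))
```

Storing the envelope next to the BLOB lets a query reject features by bounding box without unpacking the compressed points.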
SQL SPATIAL TYPE
The OpenGIS Specification for Simple Features in SQL defines a SQL Spatial type with associated SQL
functions as one of three schemas for managing spatial data.
In the SQL Spatial Type implementation, the coordinate values for the shape are managed as an SQL
object, accessible only through SQL function invocations.
The SQL Spatial Type is just another column in the DBMS table, with well-defined properties and
behaviour.
The actual representation of the geometry and storage of the coordinate data is DBMS specific and
never exposed to the application.
This hidden implementation allows the extension of the OpenGIS Specification for Simple Features in
SQL with Annotation, CAD, elevations, and measures.
The following example illustrates how a geometry type looks in the table. The index is not shown,
because it is now managed by the DBMS as an index on the column:
Business Table
101 (x,y,...,x,y)
102 (x,y,...,x,y)
103 (x,y,...,x,y)
... (x,y,...,x,y)
SQL GEOMETRY FUNCTIONS
Functions may be associated with abstract data types; for example, a geometry type could have
functions to return the area or length.
Functions that operate on the type are also possible, for example functions that compute the
intersection or the union of two geometries and return a new geometry as the result. Spatial
predicates can be used in queries, such as returning all parcels that intersect a specified polygon (named MyPolygon):
SELECT Parcels.Name, Parcels.Id FROM Parcels WHERE Intersects(Parcels.Geometry,
MyPolygon)
The spatial functions are fully integrated into the SQL language allowing them to be used in SELECT
statements (as above) and any other place a SQL expression would be used, such as the following
INSERT statement:
INSERT INTO Countries (Name, Geometry)
VALUES ('Kenya', PolygonFromText('POLYGON ((x y, x y, x y, ..., x y))', 14))
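The idea of a function that is callable from inside SQL can be tried out with Python's sqlite3 module, which lets an application register its own SQL functions. The sketch below registers a hypothetical bbox_intersects function that only compares bounding boxes; it is a stand-in for illustration and not the OpenGIS Intersects function, and the table layout is made up.

```python
import sqlite3


def bbox_intersects(aminx, aminy, amaxx, amaxy, bminx, bminy, bmaxx, bmaxy):
    """Return 1 if two axis-aligned boxes overlap or touch, else 0."""
    overlap = (aminx <= bmaxx and bminx <= amaxx
               and aminy <= bmaxy and bminy <= amaxy)
    return int(overlap)


def open_db():
    """Create an in-memory database with the function registered and two sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.create_function("bbox_intersects", 8, bbox_intersects)
    conn.execute(
        "CREATE TABLE parcels (name TEXT, minx REAL, miny REAL, maxx REAL, maxy REAL)"
    )
    conn.executemany(
        "INSERT INTO parcels VALUES (?, ?, ?, ?, ?)",
        [("A", 0, 0, 2, 2), ("B", 5, 5, 6, 6)],
    )
    return conn


def parcels_in_window(conn, minx, miny, maxx, maxy):
    """Use the registered function in a WHERE clause like any built-in."""
    rows = conn.execute(
        """
        SELECT name FROM parcels
        WHERE bbox_intersects(minx, miny, maxx, maxy, ?, ?, ?, ?)
        ORDER BY name
        """,
        (minx, miny, maxx, maxy),
    )
    return [name for (name,) in rows]


conn = open_db()
print(parcels_in_window(conn, 1, 1, 3, 3))
```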
SQL Spatial Types and Functions are only available on extensible DBMS products.
At this time, support is limited to Informix Dynamic Server, Oracle Spatial, Microsoft SQL Server with
geometry and geography support, and DB2 Data Joiner.
As other extensible DBMS products become available, they will be evaluated to determine if the
necessary functionality exists to provide spatial type support.
*All the spatial functions supported by ArcGIS are listed in the embedded PDF document.
WHAT IS A GEODATABASE?
At its most basic level, an ArcGIS geodatabase is a collection of geographic datasets of various types
held in a common file system folder, a Microsoft Access database, or a multiuser relational DBMS
(such as Oracle, Microsoft SQL Server, PostgreSQL, Informix, or IBM DB2). Geodatabases come in
many sizes, have varying numbers of users and can scale from small, single-user databases built on
files up to larger workgroup, department, and enterprise geodatabases accessed by many users.
The geodatabase is the native data structure for ArcGIS and is the primary data format used for
editing and data management. While ArcGIS works with geographic information in numerous
geographic information system (GIS) file formats, it is designed to work with and leverage the
capabilities of the geodatabase.
It is the physical store of geographic information, primarily using a database management system
(DBMS) or file system. You can access and work with this physical instance of your collection of
datasets either through ArcGIS or through a database management system using SQL.
Geodatabases have a comprehensive information model for representing and managing geographic
information. This comprehensive information model is implemented as a series of tables holding
feature classes, raster datasets, and attributes. In addition, advanced GIS data objects add GIS
behaviour; rules for managing spatial integrity; and tools for working with numerous spatial
relationships of the core features, raster, and attributes.
Geodatabase software logic provides the common application logic used throughout ArcGIS for
accessing and working with all geographic data in a variety of files and formats. This supports
working with the geodatabase, and it includes working with shapefiles, computer-aided drafting (CAD)
files, triangulated irregular networks (TINs), grids, imagery, Geography Markup Language
(GML) files, and numerous other GIS data sources.
Geodatabases have a transaction model for managing GIS data workflows.
GEODATABASE STORAGE
The geodatabase storage model is based on DBMS principles, leveraging a series of simple yet
essential relational database concepts. The DBMS (and the file system for file geodatabases) provides
a simple, formal data model for storing and working with information in tables. Key concepts include
the following:
Data is organized into tables.
Tables contain rows.
All rows in a table have the same columns.
Each column has a type, such as integer, decimal number, character, date, and so on.
Relationships are used to associate rows from one table with another table. This is based on a
common column in each table.
Relational integrity rules exist for tables. For example, each row always shares the same columns, a
domain lists the valid values or value ranges for a column, and so on.
For ArcSDE geodatabases that are stored in relational databases, a number of additional DBMS
capabilities also apply:
Structured query language (SQL), a series of relational functions and operators, is available to
operate on the tables and their data elements.
The SQL operators are designed to work with the generic relational data types, such as integers,
decimal numbers, dates, and characters.
For example, a feature class is stored as a DBMS table. Each row represents a feature.
At the core of the geodatabase is a standard (that is, not exotic) relational database schema (a series of standard database
tables, column types, indexes, and other database objects).
Geodatabase storage includes both the schema and the rule base for each geographic dataset plus simple, tabular storage of
the spatial and attribute data.
The geodatabase schema includes the definitions, integrity rules, and behavior for each geographic dataset.
These include properties for feature classes, topologies, networks, raster catalogs, relationships, domains, and so
forth.
The schema is persisted in a collection of geodatabase meta tables in the DBMS that defines the integrity and
behavior of the geographic information.
The spatial representations are most commonly stored as either vector features or raster datasets along with
traditional tabular attributes.
For example, a DBMS table can be used to store a feature class where each row in the table represents a feature.
A shape column in each row is used to hold the geometry or shape of the feature.
The shape column holding the geometry is typically one of two column types:
A binary large object (BLOB) column type
A spatial column type, if the DBMS supports it
A homogeneous collection of common features, each having the same spatial representation, such as a point,
line, or polygon, and a common set of attribute columns, is referred to as a feature class and is managed in a
single table.
Raster and imagery data types can be managed and stored in relational tables as well. Raster data is typically
much larger in size and requires a side table for storage. For DBMS storage and access, each raster is cut into
smaller pieces, or blocks, and stored in individual rows in the separate block table.
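The cutting of a raster into blocks can be sketched in plain Python. The raster is modelled as a list of pixel rows, and each returned tuple stands for one row of a block table; the block_row and block_col positions and the function name are assumptions for the example, not the actual block table schema.

```python
def cut_into_blocks(raster, block_size):
    """Split a 2D raster (list of rows) into square blocks.

    Returns a list of (block_row, block_col, pixels) tuples; blocks on the
    right and bottom edges are smaller when the raster size is not a
    multiple of block_size.
    """
    blocks = []
    width = len(raster[0]) if raster else 0
    for r0 in range(0, len(raster), block_size):
        for c0 in range(0, width, block_size):
            pixels = [row[c0:c0 + block_size] for row in raster[r0:r0 + block_size]]
            blocks.append((r0 // block_size, c0 // block_size, pixels))
    return blocks


# A 4 x 4 raster with pixel values 0..15, cut into 2 x 2 blocks.
raster = [[r * 4 + c for c in range(4)] for r in range(4)]
for block in cut_into_blocks(raster, 2):
    print(block)
```

Storing one block per row lets a client fetch only the blocks that cover the area being displayed rather than the whole raster.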
The column types that hold the vector and raster geometry can vary from database to database. Recently, most
DBMSs have added support for spatial type extensions, and the geodatabase can readily use them to hold the
spatial geometry.
Esri was closely involved in efforts to extend Structured Query Language (SQL) for spatial types as the
primary author of the SQL/MM Part 3 Spatial and the Open Geospatial Consortium, Inc. (OGC) Simple
Features SQL specifications.
Esri has focused on support for these types, as well as the independent Oracle Spatial, PostGIS (in
PostgreSQL), and Microsoft SQL Server spatial types, in the persistence of geodatabases using DBMS
standards.
Presently, all DBMSs include spatial type support for geodatabases using ArcGIS as follows:
Oracle using the ST_Geometry or, optionally, the Oracle Spatial type
IBM DB2 using the Spatial Extender Geometry Object
Informix using the Spatial DataBlade Geometry Object
PostgreSQL using the ST_Geometry or PostGIS geometries
Microsoft SQL Server using Microsoft spatial types, geometry and geography
PHYSICAL STORAGE OF GEODATABASES
Geodatabase storage in a DBMS contains two sets of tables—dataset tables (user-defined tables) and system tables.
Dataset tables—Each dataset in the geodatabase is stored in one or more tables. The dataset tables work
with the system tables to manage data.
System tables—The geodatabase system tables keep track of the contents of each geodatabase. They
essentially describe the geodatabase schema that specifies all dataset definitions, rules, and relationships.
These system tables contain and manage all the metadata required to implement geodatabase properties,
data validation rules, and behaviours.
The internal structure of these tables was restructured beginning with the ArcGIS 10 release. The
information related to the schema in the geodatabase, which prior to ArcGIS 10 was stored in over 35
geodatabase system tables, was consolidated into four main tables:
GDB_Items: Contains a listing of all items contained within a geodatabase such as feature classes,
topologies and domains
GDB_ItemTypes: Contains a predefined list of recognized item types, such as Table
GDB_ItemRelationships: Contains schema associations between items such as which feature classes are
contained within a feature dataset
GDB_ItemRelationshipTypes: Contains a predefined list of recognized relationship types such as
DatasetInFeatureDataset
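How the four tables work together can be sketched with Python's sqlite3 module as a stand-in DBMS. The column names below (item_id, type_id, origin_id, dest_id, rel_type_id) are simplified placeholders and the sample items are made up; the real system tables have a different and larger set of columns.

```python
import sqlite3

# Simplified placeholder schema, not the actual geodatabase system tables.
SCHEMA = """
CREATE TABLE gdb_itemtypes (type_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE gdb_items (
    item_id INTEGER PRIMARY KEY,
    name    TEXT    NOT NULL,
    type_id INTEGER NOT NULL REFERENCES gdb_itemtypes (type_id)
);
CREATE TABLE gdb_itemrelationshiptypes (rel_type_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE gdb_itemrelationships (
    origin_id   INTEGER NOT NULL REFERENCES gdb_items (item_id),
    dest_id     INTEGER NOT NULL REFERENCES gdb_items (item_id),
    rel_type_id INTEGER NOT NULL REFERENCES gdb_itemrelationshiptypes (rel_type_id)
);
"""


def load_sample(conn):
    """Create the tables and add one feature dataset holding two feature classes."""
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO gdb_itemtypes VALUES (?, ?)",
        [(1, "Feature Dataset"), (2, "Feature Class")],
    )
    conn.executemany(
        "INSERT INTO gdb_items VALUES (?, ?, ?)",
        [(10, "Landbase", 1), (11, "Parcels", 2), (12, "Roads", 2), (13, "Wells", 2)],
    )
    conn.execute(
        "INSERT INTO gdb_itemrelationshiptypes VALUES (1, 'DatasetInFeatureDataset')"
    )
    conn.executemany(
        "INSERT INTO gdb_itemrelationships VALUES (?, ?, ?)",
        [(10, 11, 1), (10, 12, 1)],
    )


def members_of(conn, dataset_name):
    """List the items contained in the named feature dataset."""
    rows = conn.execute(
        """
        SELECT child.name
        FROM gdb_itemrelationships AS rel
        JOIN gdb_itemrelationshiptypes AS rt ON rt.rel_type_id = rel.rel_type_id
        JOIN gdb_items AS parent ON parent.item_id = rel.origin_id
        JOIN gdb_items AS child ON child.item_id = rel.dest_id
        WHERE rt.name = 'DatasetInFeatureDataset' AND parent.name = ?
        ORDER BY child.name
        """,
        (dataset_name,),
    )
    return [name for (name,) in rows]


conn = sqlite3.connect(":memory:")
load_sample(conn)
print(members_of(conn, "Landbase"))
```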
For example, a simple feature class stored in a geodatabase within a SQL Server DBMS using binary
geometry storage is made up of a business table, an associated feature (f) table, and a spatial index
(s) table.
These tables work with a set of system tables: the i table, the gdb_items, sde_table_registry,
sde_layers, and sde_spatial_references system tables. These are used to track information about the
feature class.
Each DBMS has a slight variation in the set of tables and columns used to store and manage a
geodatabase. The type of DBMS you are using to store your geodatabase impacts the physical storage
schema.
Additional files
Geodatabases also use triggers, functions, stored procedures, and user-defined types in the DBMS to
implement functionality and maintain consistency. A detailed discussion of these is not necessary,
since you would not need to interact with most of these database objects.
CREATING TABLES WITH SQL AND REGISTERING WITH THE GEODATABASE
You can use SQL to create tables. If the table contains a spatial column, the table is considered to be a spatial table.
You can use SQL to populate both nonspatial and spatial tables with data. Then, to use ArcGIS and geodatabase
functionality, you can register the table with the geodatabase.