
8-9 Spatial Data Maintenance

This document discusses spatial data management and geodatabases. It covers topics like spatial data indexing, data migration, transformation, enhancement, integration and conflation. Data migration involves moving data between storage systems while preserving format and content. Data transformation converts data between formats, often cleaning and validating it. Data conflation combines overlapping geospatial datasets to create a composite with better quality than the originals. A geodatabase is a collection of geographic datasets in a common file system or database that implements a comprehensive data model for geographic information.

Uploaded by

kevin.kipchoge18

UNIT CODE: GGI 4202

UNIT NAME: SPATIAL BUSINESS INTELLIGENCE

Lecture 06-07
Spatial Data Management
Overview of Spatial Data Management
 Spatial database management deals with the storage, indexing, and
querying of data with spatial features, such as location and geometric
extent.
 Many applications require the efficient management of spatial data,
including Geographic Information Systems, Computer Aided Design, and
Location Based Services.
 Spatial indices are used by spatial databases (databases which store
information related to objects in space) to optimize spatial queries.
 Conventional index types do not efficiently handle spatial queries such as
how far apart two points are, or whether points fall within a spatial area of
interest.
 Data management means more than simply handling updates. Here are
five scenarios that we often see.
 Data migration
 Data transformation
 Data enhancement
 Data integration
 Data conflation
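
To illustrate why a spatial index helps with the distance and area-of-interest queries mentioned above, here is a minimal sketch of a uniform grid index for 2D points. The class `GridIndex` and its method names are hypothetical and written only for this example; production spatial databases typically use more sophisticated structures such as R-trees.

```python
from collections import defaultdict
from math import floor, hypot


class GridIndex:
    """Toy uniform-grid spatial index for 2D points (illustrative only)."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(list)  # (col, row) -> list of (id, x, y)

    def _cell(self, x, y):
        # Map a coordinate to the grid cell that contains it.
        return (floor(x / self.cell_size), floor(y / self.cell_size))

    def insert(self, point_id, x, y):
        self.cells[self._cell(x, y)].append((point_id, x, y))

    def _candidates(self, min_x, min_y, max_x, max_y):
        # Visit only the cells overlapping the search box,
        # instead of scanning every stored point.
        c0, r0 = self._cell(min_x, min_y)
        c1, r1 = self._cell(max_x, max_y)
        for col in range(c0, c1 + 1):
            for row in range(r0, r1 + 1):
                yield from self.cells.get((col, row), [])

    def query_box(self, min_x, min_y, max_x, max_y):
        """Ids of points inside the axis-aligned box (edges inclusive)."""
        return sorted(
            pid
            for pid, x, y in self._candidates(min_x, min_y, max_x, max_y)
            if min_x <= x <= max_x and min_y <= y <= max_y
        )

    def query_radius(self, cx, cy, radius):
        """Ids of points within `radius` of (cx, cy)."""
        box = (cx - radius, cy - radius, cx + radius, cy + radius)
        return sorted(
            pid
            for pid, x, y in self._candidates(*box)
            if hypot(x - cx, y - cy) <= radius
        )


index = GridIndex(cell_size=10.0)
for pid, x, y in [("a", 1, 1), ("b", 5, 5), ("c", 25, 25), ("d", 12, 3)]:
    index.insert(pid, x, y)

in_box = index.query_box(0, 0, 10, 10)
near_b = index.query_radius(5, 5, 8)
```

Both queries inspect only the grid cells that overlap the search area, which is the basic idea behind spatial indexing: prune most of the data before running the exact geometric test.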
Data Migration
 The process of selecting, preparing, extracting, and transforming data
and permanently transferring it from one computer storage system to
another.
 It is a key consideration for any system implementation, upgrade, or
consolidation, and it is typically performed in such a way as to be as
automated as possible, freeing up human resources from tedious tasks.
 It occurs for a variety of reasons, including server or storage equipment
replacements, maintenance or upgrades, application migration, website
consolidation, disaster recovery, and data center relocation.
 Categories
 Storage migration: Moving physical blocks of data from one disk to another, often
using virtualization techniques. The data format and content are not usually
changed in the process.
 Database migration: Moving from one database vendor to another, or upgrading
the version of the database software being used.
 Application migration: Changing application vendor, for example moving to a new
CRM or ERP platform.
 Business process migration: Business processes operate through a combination of
human and application system actions. When these change, they can require the
movement of data from one store, database, or application to another to reflect the
changes to the organization.
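
As a small illustration of a database migration, the sketch below copies a table between two SQLite databases and checks that no rows were lost. It is a simplified example under stated assumptions: the function `migrate_table`, the table `parcels`, and its columns are hypothetical, both databases are in memory, and the table and column names are assumed to come from trusted code rather than user input.

```python
import sqlite3


def migrate_table(source, target, table, columns):
    """Copy all rows of `table` from source to target, then verify row counts."""
    col_list = ", ".join(columns)
    placeholders = ", ".join("?" for _ in columns)
    rows = source.execute(f"SELECT {col_list} FROM {table}").fetchall()
    with target:  # commits on success, rolls back if an insert fails
        target.executemany(
            f"INSERT INTO {table} ({col_list}) VALUES ({placeholders})", rows
        )
    src_count = source.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    dst_count = target.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    if src_count != dst_count:
        raise RuntimeError(f"row count mismatch: {src_count} != {dst_count}")
    return dst_count


schema = "CREATE TABLE parcels (id INTEGER PRIMARY KEY, owner TEXT, area_ha REAL)"
columns = ["id", "owner", "area_ha"]

source = sqlite3.connect(":memory:")  # stands in for the old system
source.execute(schema)
source.executemany(
    "INSERT INTO parcels VALUES (?, ?, ?)",
    [(1, "Owner A", 2.5), (2, "Owner B", 0.8), (3, "Owner C", 12.0)],
)

target = sqlite3.connect(":memory:")  # stands in for the new system
target.execute(schema)

migrated = migrate_table(source, target, "parcels", columns)
```

The row count comparison is only a basic check. A real migration would usually also compare content, for example with per-row checksums.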
Disadvantages of Data Migration
 Migration addresses the possible obsolescence of the data carrier but
does not address the fact that certain technologies which run the data
may be abandoned altogether, leaving migration useless.
 Time-consuming: migration is a continual process that must be repeated
every time a medium reaches obsolescence, for all data objects stored on
that medium.
 Costly: an institution must purchase additional data storage media at
each migration.
 Poor data quality carries over: if data quality is poor in the old data
management system and that data is migrated, the new system will most
likely inherit the same challenges, headaches, and poor data quality.

Data Transformation
 Data transformation is the process of converting data from one format,
such as a database file, XML document or Excel spreadsheet, into
another.
 Transformations often involve converting a raw data source into a
cleansed, validated and ready-to-use format.
 Data transformation can be simple or complex, depending on the changes
required between the source (initial) data and the target (final) data.
 Data transformation can be divided into the following steps, each
applicable as needed based on the complexity of the transformation
required.
 Data discovery: Typically the data is profiled using profiling tools or sometimes using
manually written profiling scripts to better understand the structure and characteristics
of the data and decide how it needs to be transformed.
 Data mapping: The process of defining how individual fields are mapped, modified,
joined, filtered, aggregated etc. to produce the final desired output
 Code generation: The process of generating executable code (e.g. SQL, Python, R, or
other executable instructions) that will transform the data based on the desired and
defined data mapping rules
 Code execution: The generated code is executed against the data to create the
desired output. The executed code may be tightly integrated into the transformation
tool, or the developer may need to run the generated code manually as a separate step.
 Data review: The final step in the process, which focuses on ensuring the output data
meets the transformation requirements. Any anomalies or errors found in the data are
communicated back to the developer or data analyst as new requirements to be
implemented in the transformation process.
 Types of Data Transformation
 Batch Data Transformation
 Developers write code or implement transformation rules in a data integration
tool, and then execute that code or those rules on large volumes of data.
 Batch data transformation is the cornerstone of virtually all data integration
technologies such as data warehousing, data migration and application integration.
 Interactive Data Transformation
 This is an emerging capability that lets business analysts and business users
interact directly with large datasets through a visual interface, understand the
characteristics of the data (via automated data profiling or visualization), and change or
correct the data through simple interactions such as clicking or selecting certain
elements of the data.
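
The transformation steps described earlier (discovery, mapping, execution, and review) can be sketched as a small batch transformation. Everything here is illustrative: the records in `RAW`, the field names, and the helper functions are assumptions made for the example. The mapping rules are applied directly instead of being turned into generated code, so the code generation step is skipped.

```python
RAW = [  # made-up source records; every value arrives as text
    {"Name": " Site A ", "Lat": "-1.25", "Lon": "36.80"},
    {"Name": "Site B", "Lat": "-4.05", "Lon": "39.65"},
    {"Name": "Site C", "Lat": "", "Lon": "395.0"},
]


def to_float(text):
    """Convert text to float, returning None when it cannot be parsed."""
    try:
        return float(text)
    except ValueError:
        return None


def profile(records):
    """Data discovery: count blank values per source field."""
    blanks = {}
    for rec in records:
        for field, value in rec.items():
            blanks.setdefault(field, 0)
            if not value.strip():
                blanks[field] += 1
    return blanks


# Data mapping: target field -> (source field, conversion function)
MAPPING = {
    "name": ("Name", str.strip),
    "lat": ("Lat", to_float),
    "lon": ("Lon", to_float),
}


def transform(records, mapping):
    """Code execution: apply the mapping rules to every record."""
    return [
        {target: convert(rec[source]) for target, (source, convert) in mapping.items()}
        for rec in records
    ]


def review(rows):
    """Data review: report (row index, field) for missing or out-of-range values."""
    problems = []
    for i, row in enumerate(rows):
        if row["lat"] is None or not -90 <= row["lat"] <= 90:
            problems.append((i, "lat"))
        if row["lon"] is None or not -180 <= row["lon"] <= 180:
            problems.append((i, "lon"))
    return problems


blank_counts = profile(RAW)
clean = transform(RAW, MAPPING)
problems = review(clean)
```

In this sketch the problems reported by `review` would go back to the analyst as new requirements, for example a rule for how to handle records with a missing latitude.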

Data Conflation
 Geospatial data conflation is the compilation or reconciliation of two
different geospatial datasets covering overlapping regions (Saalfeld
1988).
 In general, the goal of conflation is to combine the best quality elements
of both datasets to create a composite dataset that is better than either
of them.
 The consolidated dataset can then provide additional information that
cannot be gathered from any single dataset.
 Based on the types of geospatial datasets dealt with, the conflation
technologies can be categorized into the following three groups.
 Vector to vector data conflation: A typical example is the conflation of two road
networks of different accuracy levels.
 Raster to raster data conflation
 Raster to vector data conflation
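
A heavily simplified sketch of vector to vector conflation is shown below. Instead of full road networks it matches point features (for example, road junctions) from two datasets: one assumed to have more accurate geometry, the other assumed to carry richer attributes. The function `conflate_points`, the greedy nearest-neighbour matching, and the sample data are all assumptions made for illustration; real conflation tools use more robust matching.

```python
from math import hypot


def conflate_points(primary, secondary, tolerance):
    """Merge two point datasets.

    primary:   features whose geometry is trusted (kept as-is)
    secondary: features whose attributes fill gaps in the primary ones
    tolerance: maximum distance at which two points count as the same feature
    """
    merged = []
    used = set()  # indices of secondary features already matched
    for p in primary:
        best, best_d = None, tolerance
        for j, s in enumerate(secondary):
            if j in used:
                continue
            d = hypot(p["x"] - s["x"], p["y"] - s["y"])
            if d <= best_d:
                best, best_d = j, d
        attrs = dict(p["attrs"])
        if best is not None:
            used.add(best)
            # Secondary attributes fill gaps; primary values win on conflict.
            attrs = {**secondary[best]["attrs"], **attrs}
        merged.append({"x": p["x"], "y": p["y"], "attrs": attrs})
    # Keep unmatched secondary features so no information is dropped.
    for j, s in enumerate(secondary):
        if j not in used:
            merged.append({"x": s["x"], "y": s["y"], "attrs": dict(s["attrs"])})
    return merged


surveyed = [  # made-up data: accurate positions, few attributes
    {"x": 100.0, "y": 200.0, "attrs": {"id": "J1"}},
    {"x": 500.0, "y": 500.0, "attrs": {"id": "J2"}},
]
digitized = [  # made-up data: rougher positions, descriptive attributes
    {"x": 103.0, "y": 204.0, "attrs": {"name": "Junction North"}},
    {"x": 900.0, "y": 900.0, "attrs": {"name": "Junction East"}},
]

composite = conflate_points(surveyed, digitized, tolerance=10.0)
```

The composite keeps the surveyed coordinates for the matched junction while gaining its name from the digitized dataset, which mirrors the goal of combining the best elements of both inputs.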

Introduction: Geodatabase
 At its most basic level, a geodatabase is a collection of geographic
datasets of various types held in a common file system folder, a Microsoft
Access database, or a multiuser relational DBMS (such as Oracle,
Microsoft SQL Server, PostgreSQL, Informix, or IBM DB2).
 Geodatabases come in many sizes, have varying numbers of users and
can scale from small, single-user databases built on files up to larger
workgroup, department, and enterprise geodatabases accessed by many
users.
 It is the physical store of geographic information, primarily using a
database management system or file system.
 Geodatabases have a comprehensive information model for representing
and managing geographic information.
 This comprehensive information model is implemented as a series of tables holding
feature classes, raster datasets, and attributes.
 In addition, advanced GIS data objects add GIS behavior; rules for managing spatial
integrity; and tools for working with numerous spatial relationships of the core
features, rasters, and attributes
 Geodatabases have a transaction model for managing GIS data workflows.

Types: Personal geodatabases
 Personal geodatabases—All datasets are stored within a Microsoft Access
data file, which is limited in size to 2 GB.
 This is the original data format for ArcGIS geodatabases, stored and managed
in Microsoft Access data files. (It is limited in size and tied to the Windows
operating system.)
 Single user and small workgroups with smaller datasets: some readers and one writer.
Concurrent use eventually degrades for large numbers of readers.
 All the contents in each personal geodatabase are held in a single Microsoft Access file
(.mdb).
 The size limit is 2 GB per Access database. The effective limit before performance
degrades is typically between 250 and 500 MB per Access database file.
 Often used as an attribute table manager (via Microsoft Access). Users like the string
handling for text attributes.

File Geodatabases
 A collection of various types of GIS datasets held in a file system folder
 Stored as folders in a file system.
 Each dataset is held as a file that can scale up to 1 TB in size. The file
geodatabase is recommended over personal geodatabases.
 Features:
 Provide a widely available, simple, and scalable geodatabase solution for all users.
 Provide a portable geodatabase that works across operating systems.
 Scale up to handle very large datasets.
 Use an efficient data structure that is optimized for performance and storage.
 File geodatabases also allow users to compress vector data to a read-only format to
reduce storage requirements even further.
 Outperform shapefiles for operations involving attributes and support data sizes
well beyond shapefile limits.

Enterprise geodatabases
 Enterprise geodatabases—Also known as multiuser geodatabases, they can
be unlimited in size and numbers of users.
 A collection of various types of GIS datasets held as tables in a relational
database
 Stored in a relational database using Oracle, Microsoft SQL Server, IBM DB2,
IBM Informix, or PostgreSQL.
 Features:
 Extremely large, continuous GIS databases
 Many simultaneous users
 Long transactions and versioned workflows
 Relational database support for GIS data management (providing the benefits of a relational
database for scalability, reliability, security, backup, integrity, and so forth)
 SQL types for Spatial in all supported DBMSs (Oracle, SQL Server, PostgreSQL, Informix, and
DB2)
 High performance that can scale to a very large number of users
The architecture of a geodatabase
 The geodatabase is object-relational.
 It is based on a series of simple yet essential relational database concepts and
leverages the strengths of the underlying database management system.
 Simple tables and well-defined attribute types are used to store the schema,
rule, base, and spatial attribute data for each geographic dataset.
 Through this approach, structured query language (SQL)—a series of
relational functions and operators—can be used to create, modify, and query
tables and their data elements
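
To make the "simple tables plus SQL" idea concrete, here is a minimal sketch using Python's built-in `sqlite3` module. It is not the actual geodatabase schema: the tables `dataset_catalog` and `roads`, their columns, and the choice to store geometry as WKT text are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical catalog table describing each geographic dataset.
    CREATE TABLE dataset_catalog (
        name TEXT PRIMARY KEY,
        geometry_type TEXT NOT NULL,
        srid INTEGER NOT NULL
    );
    -- Hypothetical feature table: typed attribute columns plus a geometry column.
    CREATE TABLE roads (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        lanes INTEGER CHECK (lanes > 0),  -- a simple integrity rule
        geom_wkt TEXT NOT NULL
    );
    """
)

# Create: register the dataset and add features.
conn.execute(
    "INSERT INTO dataset_catalog VALUES (?, ?, ?)", ("roads", "LINESTRING", 4326)
)
conn.executemany(
    "INSERT INTO roads (name, lanes, geom_wkt) VALUES (?, ?, ?)",
    [
        ("Ring Road", 2, "LINESTRING (0 0, 10 0)"),
        ("Market Lane", 1, "LINESTRING (0 0, 0 5)"),
        ("Harbour Highway", 6, "LINESTRING (5 5, 50 5)"),
    ],
)

# Modify: widen one road.
conn.execute("UPDATE roads SET lanes = 4 WHERE name = ?", ("Ring Road",))

# Query: names of roads with at least four lanes.
wide_roads = [
    row[0]
    for row in conn.execute("SELECT name FROM roads WHERE lanes >= 4 ORDER BY name")
]
```

Because the features live in ordinary tables, standard SQL statements are enough to create, modify, and query them, and constraints such as the `CHECK` on `lanes` are enforced by the database itself.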

