OGI352

Notes


Strictly as per Revised Syllabus of

Anna University
Choice Based Credit System (CBCS)
Semester - V (CSE / IT / ECE / Mechanical) Open Elective-I

Geographic Information
System
T. Graceshalini
M.E., M.B.A., Ph.D. (Pursuing in VANETS)
Assistant Professor,
Velammal College of Engineering and Technology,
Madurai

S. Kavitha
M.E. (CSE), B.Tech. (IT)
Assistant Professor,
Velammal College of Engineering and Technology,
Madurai

Price : ₹ 190/-
ISBN 978-93-89420-11-1

Technical Publications
Since 1993 - An Up-Thrust for Knowledge

Geographic Information
System
Subject Code : OCE552

Semester - V (CSE / IT / ECE / Mechanical) Open Elective - I

First Edition : September 2019

© Copyright with Authors


All publishing rights (printed and ebook version) reserved with Technical Publications. No part of this book
should be reproduced in any form, Electronic, Mechanical, Photocopy or any information storage and
retrieval system without prior permission in writing, from Technical Publications, Pune.

Published by :
Technical Publications (Since 1993 - An Up-Thrust for Knowledge)
Amit Residency, Office No.1, 412, Shaniwar Peth, Pune - 411030, M.S. INDIA
Ph. : +91-020-24495496/97, Telefax : +91-020-24495497
Email : sales@technicalpublications.org Website : www.technicalpublications.org

Printer :
Yogiraj Printers & Binders
Sr.No. 10\1A,
Ghule Industrial Estate, Nanded Village Road,
Tal-Haveli, Dist-Pune - 411041.



Geographic Information System 1-2 Fundamentals of GIS

1.1 Introduction to GIS


A Geographic Information System (GIS) is a computer-based tool for mapping and analyzing
geographic phenomena that exist, and events that occur, on Earth. GIS technology integrates
common database operations such as query and statistical analysis with the unique visualization
and geographic analysis benefits offered by maps. These abilities distinguish GIS from other
information systems and make it valuable to a wide range of public and private enterprises for
explaining events, predicting outcomes, and planning strategies. Map making and geographic
analysis are not new, but a GIS performs these tasks faster and with more sophistication than do
traditional manual methods.
Today, GIS is a multi-billion-dollar industry employing hundreds of thousands of people
worldwide. GIS is taught in schools, colleges, and universities throughout the world. Professionals
and domain specialists in every discipline are becoming increasingly aware of the advantages of
using GIS technology for addressing their unique spatial problems.
We commonly think of a GIS as a single, well-defined, integrated computer system. However,
this is not always the case. A GIS can be made up of a variety of software and hardware tools. The
important factor is the level of integration of these tools to provide a smoothly operating, fully
functional geographic data processing environment.
Overall, GIS should be viewed as a technology, not simply as a computer system.
In general, a GIS provides facilities for data capture, data management, data manipulation and
analysis, and the presentation of results in both graphic and report form, with a particular emphasis
upon preserving and utilizing inherent characteristics of spatial data.
The ability to incorporate spatial data, manage it, analyze it, and answer spatial questions is the
distinctive characteristic of geographic information systems.
A geographic information system, commonly referred to as a GIS, is an integrated set of
hardware and software tools used for the manipulation and management of digital spatial
(geographic) and related attribute data.
There are three integrated parts in a GIS :
Geographic : The spatial realities of the real world
Information : The meaning and use of data
Systems : The computer technology and support infrastructure

TM
Technical Publications - An up thrust for knowledge

1.2 Basic Spatial Concepts


Spatial concepts are the driving force for spatial thinking and for the selection and use of spatial
tools. Eight concepts are the focus of spatial reasoning in the use of geographical information.
These concepts are demonstrable at all levels of space and time.
• Location - Understanding formal and informal methods of specifying “where”
• Distance - The ability to reason from knowledge of relative position
• Network - Understanding the importance of connections
• Neighborhood and Region - Drawing inferences from spatial context
• Scale - Understanding spatial scale and its significance
• Spatial Heterogeneity - The implications of spatial variability
• Spatial Dependence - Understanding relationships across space
• Objects and Fields - Viewing phenomena as continuous in space-time or as discrete

1.3 Coordinate System


A coordinate system is a reference system used for locating objects in a two- or
three-dimensional space.

Geographic Coordinate System


A geographic coordinate system, also known as a global or spherical coordinate system, is a
reference system that uses a three-dimensional spherical surface to determine locations on the
earth. Any location on earth can be referenced by a point with longitude and latitude.
We must familiarize ourselves with the geographic terms with respect to the Earth coordinate
system in order to use the GIS technologies effectively.
Pole : The geographic pole of earth is defined as either of the two points where the axis of
rotation of the earth meets its surface. The North Pole lies 90º north of the equator and the South
Pole lies 90º south of the equator.
Latitude : Imaginary lines that run horizontally around the globe and are measured from 90º
north to 90º south. Also known as parallels, latitudes are equidistant from each other.
Equator : An imaginary line on the earth at zero degrees latitude that divides the earth into two
halves, the Northern and Southern Hemispheres. This parallel has the largest circumference.


Fig. 1.3.1 Division of earth into hemispheres

Longitude : Imaginary lines that run vertically around the globe. Also known as meridians,
longitudes are measured from 180º east to 180º west. Longitudes meet at the poles and are widest
apart at the equator.
Prime meridian : The line of zero degrees longitude, which divides the earth into two halves, the
Eastern and Western Hemispheres. As it runs through the Royal Greenwich Observatory in Greenwich,
England, it is also known as the Greenwich meridian.

Fig. 1.3.2 Latitude and longitude measurements

Equator (0º) is the reference for the measurement of latitude. Latitude is measured north or
south of the equator. For measurement of longitude, prime meridian (0º) is used as a reference.
Longitude is measured east or west of prime meridian. The grid of latitude and longitude over the
globe is known as graticule. The intersection point of the equator and the prime meridian is the
origin (0, 0) of the graticule.

Coordinate measurement
The geographic coordinates are measured in angles. The angle measurement can be understood
as follows :

A full circle has 360 degrees : 1 circle = 360º
A degree is further divided into 60 minutes : 1º = 60′
A minute is further divided into 60 seconds : 1′ = 60″


An angle is expressed in Degree Minute Second.


While writing coordinates of a location, latitude is followed by longitude. For example, the
coordinates of Delhi are written as 28° 36′ 50″ N, 77° 12′ 32″ E.
Decimal Degree is another format of expressing the coordinates of a location. To convert a
coordinate from degree minute second to decimal degree, the following method is adopted :
28° 36′ 50″ = 28 + (36 × 1/60) + (50 × 1/60 × 1/60)
= 28 + 0.6 + 0.0139
= 28.6139
We have 28 full degrees, 36 minutes - each 1/60 of a degree, and 50 seconds - each 1/60 of 1/60
of a degree.
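
The conversion described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name dms_to_decimal and the convention of returning negative values for the southern and western hemispheres are assumptions made for this example, not something the text prescribes.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere="N"):
    """Convert degree-minute-second to decimal degrees.

    Each minute is 1/60 of a degree and each second is 1/3600 of a degree.
    Assumed convention: S and W hemispheres are returned as negative values.
    """
    value = degrees + minutes / 60 + seconds / 3600
    return -value if hemisphere in ("S", "W") else value


# Delhi: 28° 36' 50" N, 77° 12' 32" E
delhi_lat = dms_to_decimal(28, 36, 50, "N")
delhi_lon = dms_to_decimal(77, 12, 32, "E")
print(round(delhi_lat, 4), round(delhi_lon, 4))
```

Signed decimal degrees are convenient because a single number then carries both the magnitude and the hemisphere.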

Local Time and Time Zones


With the rotation of the earth on its axis, at any moment one of the longitudes faces the Sun (the
noon meridian), and at that moment it is noon everywhere on it. After 24 hours the earth completes
one full rotation with respect to the Sun, and the same meridian again faces the Sun. Thus each
hour the Earth rotates by 360/24 = 15 degrees.
This implies that with every 15º change of longitude a new time zone is created, which differs by
one hour from the neighboring zones on either side. The earth's time zones are measured from the
prime meridian (0º), and the time at the prime meridian is called Greenwich Mean Time. Thus,
there are 24 time zones around the globe.
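
The 15º-per-hour rule can be illustrated with a short Python sketch. It computes only the idealized offset from Greenwich Mean Time; real time zones follow political boundaries (India, for example, uses a single offset of 5 hours 30 minutes), so the function name and the rounding rule below are assumptions for illustration only.

```python
import math


def nominal_gmt_offset_hours(longitude_deg):
    """Idealized whole-hour offset from GMT for a longitude in degrees.

    East longitudes are positive, west longitudes negative. Each 15-degree
    band centred on a multiple of 15 degrees is treated as one time zone.
    """
    if not -180 <= longitude_deg <= 180:
        raise ValueError("longitude must lie between -180 and 180 degrees")
    # Round to the nearest band; adding 0.5 and flooring avoids the
    # round-half-to-even behaviour of Python's built-in round().
    return math.floor(longitude_deg / 15 + 0.5)


print(nominal_gmt_offset_hours(0.0))      # Greenwich
print(nominal_gmt_offset_hours(77.2089))  # Delhi
print(nominal_gmt_offset_hours(-74.0))    # a longitude in the Western Hemisphere
```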

Date
The International Date Line is the imaginary line on the Earth that separates two consecutive
calendar days. Generally, it is said to lie exactly opposite the prime meridian, along the 180º
meridian, but this is not strictly so. It zigzags around the 180º meridian, following the
political jurisdiction of the states, but for the sake of simplicity it is taken as the 180º
meridian. Starting at midnight and going east to the International Date Line, the date is one day
ahead of the date on the rest of the Earth.

Projected Coordinate system


A projected coordinate system is defined as a two-dimensional representation of the Earth. It is
based on a spheroid geographic coordinate system, but it uses linear units of measure for
coordinates. It is also known as a Cartesian coordinate system.
In such a coordinate system the location of a point on the grid is identified by an (x, y)
coordinate pair and the origin lies at the centre of the grid. The x coordinate determines the
horizontal position and the y coordinate determines the vertical position of the point.
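
As a rough illustration of how angular coordinates become linear (x, y) coordinates, the sketch below applies the simplest map projection, the equirectangular (plate carrée) projection. It is not the method used by any particular GIS package. It assumes a spherical earth with a mean radius of 6,371 km rather than a spheroid, and the function and constant names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6371000.0  # assumed mean radius of a spherical earth, in metres


def equirectangular(lon_deg, lat_deg, lon0_deg=0.0, lat0_deg=0.0):
    """Project longitude/latitude (degrees) to planar x, y (metres).

    (lon0_deg, lat0_deg) is the point placed at the grid origin (0, 0).
    """
    x = EARTH_RADIUS_M * math.radians(lon_deg - lon0_deg)
    y = EARTH_RADIUS_M * math.radians(lat_deg - lat0_deg)
    return x, y


# The graticule origin maps to the grid origin.
print(equirectangular(0.0, 0.0))
# Delhi, in metres east and north of the origin.
print(equirectangular(77.2089, 28.6139))
```

Distances measured on such a grid are distorted away from the equator, which is why practical projected coordinate systems use more elaborate projections.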


Fig. 1.3.3 Cartesian coordinate system


1.4 GIS and Information Systems

1.4.1 Definitions of GIS


“A geographic information system is a special case of information systems where the database
consists of observations on spatially distributed features, activities or events, which are definable in
space as points, lines, or areas. A geographic information system manipulates data about these
points, lines, and areas to retrieve data for ad hoc queries and analyses” (Kenneth Dueker,
Portland State University, 1979).
“A powerful set of tools for collecting, storing, retrieving at will, transforming and displaying
spatial data from the real world” (Burrough, 1986).
“A system for capturing, storing, checking, integrating, manipulating, analyzing and displaying
data which are spatially referenced on the earth” (Chorley, 1987).
“GIS is a configuration of computer hardware and software specifically designed for the
acquisition, maintenance and use of cartographic data” (Tomlin, 1990).
“A Geographic Information System (GIS) is a computer-based tool for mapping and analyzing
things that exist and events that happen on earth. GIS technology integrates common database
operations such as query and statistical analysis with the unique visualization and geographic
analysis benefits offered by maps.” ESRI


“GIS is an integrated system of computer hardware, software, and trained personnel linking
topographic, demographic, utility, facility, image and other resource data that is geographically
referenced.” NASA.

1.4.2 Objectives of GIS


Some of the major objectives of GIS are :
• Maximizing the efficiency of planning and decision making
• Integrating information from multiple sources
• Facilitating complex querying and analysis
• Eliminating redundant data and minimizing duplication

1.5 History of GIS


One of the first applications of spatial analysis in epidemiology is the 1832 "Rapport sur la
marche et les effets du choléra dans Paris et le département de la Seine". The French geographer
Charles Picquet represented the 48 districts of the city of Paris by halftone color gradient
according to the number of deaths by cholera per 1,000 inhabitants.
In 1854 John Snow depicted a cholera outbreak in London using points to represent the
locations of some individual cases, possibly the earliest use of a geographic methodology in
epidemiology. His study of the distribution of cholera led to the source of the disease, a
contaminated water pump (the Broad Street Pump, whose handle he disconnected, thus terminating
the outbreak).
The early 20th century saw the development of photozincography, which allowed maps to be
split into layers, for example one layer for vegetation and another for water. This was particularly
used for printing contours – drawing these was a labor-intensive task but having them on a separate
layer meant they could be worked on without the other layers to confuse the draughtsman.
The year 1960 saw the development of the world's first true operational GIS in Ottawa, Ontario,
Canada by the federal Department of Forestry and Rural Development. Developed by Dr. Roger
Tomlinson, it was called the Canada Geographic Information System (CGIS) and was used to
store, analyze, and manipulate data collected for the Canada Land Inventory – an effort to
determine the land capability for rural Canada by mapping information about soils, agriculture,
recreation, wildlife, waterfowl, forestry and land use at a scale of 1:50,000.
In 1986, Mapping Display and Analysis System (MIDAS), the first desktop GIS product
emerged for the DOS operating system. This was renamed in 1990 to MapInfo for Windows when
it was ported to the Microsoft Windows platform. This began the process of moving GIS from the
research department into the business environment.


The first known use of the term "Geographic Information System" was by Dr. Roger
Tomlinson in the year 1968 in his paper "A Geographic Information System for Regional
Planning". Tomlinson is also acknowledged as the "father of GIS".

1.6 Components of a GIS

1.6.1 Hardware
Hardware consists of the equipment and support devices that are required to capture, store,
process and visualize the geographic information. These include computers with hard disks,
digitizers, scanners, printers, plotters etc.

1.6.2 Software
Software is at the heart of a GIS. The GIS software must have the basic capabilities of
data input, storage, transformation, analysis and providing desired outputs. The interfaces can
differ from one software package to another.
Key software components are :
• Tools for the input and manipulation of geographic information
• A database management system (DBMS)
• Tools that support geographic query, analysis, and visualization
• A graphical user interface (GUI) for easy access to tools
GIS software in use today falls into one of two categories : proprietary or open source.
ArcGIS by ESRI is a widely used proprietary GIS software. Others in the same category
are MapInfo, Microstation, Geomedia etc. The development of open source GIS has provided us
with freely available desktop GIS software such as Quantum GIS, uDig, GRASS, MapWindow GIS etc.

1.6.3 Data
The data is captured or collected from various sources (such as maps, field observations,
photography, satellite imagery etc) and is processed for analysis and presentation.

1.6.4 Methods/Procedures
These include the methods or ways by which data has to be input in the system, retrieved,
processed, transformed and presented.


1.6.5 People
This component of GIS includes all those individuals (such as programmers, database managers,
GIS researchers etc.) who make the GIS work, and also the individuals at the user end who use
the GIS services, applications and tools.

Fig. 1.6.1 Components of GIS

1.7 GIS Subsystems / Software Functional Elements


A GIS has four main functional subsystems. These are :
• a data input subsystem
• a data storage and retrieval subsystem
• a data manipulation and analysis subsystem
• a data output and display subsystem

1.7.1 Data Input


Data input is the operation of encoding the data and writing them to the database; it creates the
foundation for a useful GIS. However, creating a good database is a very time consuming and
complex operation, and the usefulness of the GIS depends upon it.
A data input subsystem allows the user to capture, collect, and transform spatial and thematic
data into digital form. Data input involves data acquisition, including identification and
collection of the data required for applications. It covers all aspects of transforming data
captured from existing maps, field observations, and sensors into a compatible digital form. A
wide range of computer tools is available for this purpose, including the digitizer, lists of data
in text files, scanners and the devices necessary for recording data already written on magnetic
media such as tapes, drums and disks.
Various sources for data input may be :
• text files
• existing maps
• aerial photographs
• satellite imagery
• airborne scanners
• field measurements

Fig 1.7.1 Data input

1.7.2 Data Storage and Retrieval


The data storage and retrieval subsystem organizes the data, spatial and attribute, in a form
which permits it to be quickly retrieved by the user for analysis, and permits rapid and accurate
updates to be made to the database. This component usually involves use of a Database
Management System (DBMS) for maintaining attribute data. Spatial data is usually encoded and
maintained in a proprietary file format.


Fig. 1.7.2 Data Storage

1.7.3 Data Manipulation and Analysis


The data manipulation and analysis subsystem allows the user to define and execute spatial and
attribute procedures to generate derived information. This subsystem is commonly thought of as
the heart of a GIS, and usually distinguishes it from other database information systems and
Computer-Aided Drafting (CAD) systems.

1.7.4 Data Output


The data output subsystem allows the user to generate graphic displays, normally maps, and
tabular reports representing derived information products.
The critical function for a GIS is, by design, the analysis of spatial data.

Fig. 1.7.3 Data Output



It is important to understand that the GIS is not a new invention. In fact, geographic
information processing has a rich history in a variety of disciplines. In particular, natural resource
specialists and environmental scientists have been actively processing geographic data and
promoting their techniques since the 1960s.

1.8 Applications of GIS


GIS is used in various areas. These include topographical mapping, socioeconomic and
environmental modeling, and education. The role of GIS is best illustrated with respect to some
representative application areas, listed below :
• Tax Mapping
• Business
• Logistics
• Emergency evacuation
• Environment

1.8.1 Tax Mapping


Raising revenue from property taxes is one of the important functions of the government
agencies. The amount of tax payable depends on the value of the land and the property. The correct
assessment of value of land and property determines the equitable distribution of the community
tax. A tax assessor has to evaluate new properties and respond to the existing property valuation.
To evaluate taxes the assessor uses details on current market rents, sale, maintenance, insurance
and other expenses. Managing and analyzing all this information simultaneously is time
consuming, hence the need for GIS. Information about a property, with its geographical
location and boundary, is managed by the GIS. Land units stored in a parcel database can be linked
to their properties. Querying the GIS database can locate similar types of properties in an area.
The characteristics of these properties can then be compared and the valuation can be done easily.

1.8.2 Business
It is often stated that approximately 80 percent of all business data are related to location.
Businesses manage a world of information about sales, customers, inventory, demographic profiles
etc. Demographic analysis is the basis for many other business functions : customer service, site
analysis, and marketing. Understanding your customers and their socioeconomic and purchasing
behavior is essential for making good business decisions. A GIS with relevant data, such as the
number of consumers, the brands they buy and the sites where they shop, can give a business a fair
idea of whether a unit set up at a particular location will perform the way it is intended to.


1.8.3 Logistics
Logistics is a field that takes care of transporting goods from one place to another and finally
delivering them to their destinations. Shipping companies need to know where their warehouses
should be located and which routes the transport should follow to deliver parcels to their
destinations in minimum time and at minimum cost. All such logistics decisions need
GIS support.

1.8.4 Emergency Evacuation


The occurrence of disasters is unpredictable. We as humans are unable to tell when, where and
what magnitude of disaster is going to emerge and therefore solely depend on disaster preparedness
as safety measures. It is important to know in which area the risk is higher, the number of
individuals inhabiting that place, the routes by which the vehicles would move to help in
evacuating the individuals. Thus preparing an evacuation plan needs GIS implementation.

1.8.5 Environment
GIS is increasingly used for mapping habitat loss, urban sprawl, land-use change etc.
Mapping such phenomena needs historical land-use data; the anthropogenic effects that greatly
affect these phenomena are also brought into the GIS domain. GIS models are then run to make
predictions for the future.

1.9 Proprietary and Open Source Software

1.9.1 Open-Source GIS Software


Many GIS tasks can be accomplished with open-source GIS software, which is freely
available for download over the Internet. With the broad use of non-proprietary and open data
formats such as the Shapefile format for vector data and the GeoTIFF format for raster data, as
well as the adoption of OGC standards for networked servers, development of open source software
continues to evolve, especially for web and web service oriented applications. The most widely
used open source applications are :
• GRASS - A complete open source GIS, originally developed by the U.S. Army Corps of
Engineers.
• MapServer - Web-based mapping server, developed by the University of Minnesota.
• Chameleon - Environments for building applications with MapServer.
• GeoNetwork opensource - A catalog application to manage spatially referenced resources.
• GeoTools - Open source GIS toolkit written in Java, using Open Geospatial Consortium
specifications.
• gvSIG - Open source GIS written in Java.

1.9.2 Proprietary Software


With proprietary software there is a single point of contact for support, whereas with open
source software the support comes from the community. License cost versus capacity development :
the money saved on software licenses can potentially be enough to save jobs. Licenses on tools are
hard to sell; more profit comes from participation in projects that need improvements to the tool
or capacity development using the tool. Proprietary tools tend to lock the user in : it is hard to
make improvements and the user is stuck with the vendor's formats. With open source software,
bugs are more easily removed by the community and new developments are implemented more quickly.
Notable, widely used proprietary software applications and providers :
• ESRI - Products include ArcView 3.x, ArcGIS, ArcSDE, ArcIMS, and ArcWeb services.
• GRAM++ GIS - Low-cost GIS software product developed by CSRE, IIT Bombay.
• Autodesk - Products include MapGuide and other products that interface with its flagship
AutoCAD software package.
• Cadcorp - Developers of GIS software and OpenGIS standard.
• Intergraph - Products include GeoMedia, GeoMedia Professional, GeoMedia WebMap.
• ERDAS IMAGINE - A proprietary GIS, Remote Sensing, and Photogrammetry software
developed by Leica Geosystems Geospatial Imaging.
• SuperGeo - Products include SuperGIS Desktop & extensions, SuperPad Suite,
SuperWebGIS & extensions, SuperGIS Engine & extensions, SuperGIS Network Server and
GIS services.

1.10 Types of Data - Spatial, Attribute Data

1.10.1 Spatial Data


Spatial data, also called geospatial data, spatial information or geographic information,
describes the absolute and relative location of geographic features. It is represented by vector
and raster forms (including imagery). Spatial data are generally multi-dimensional and
autocorrelated.
Spatial data also refers to all types of data objects or elements that are present in a geographical
space or horizon. It is the data or information that identifies the geographic location of features and
boundaries on Earth, such as natural or constructed features, oceans, and more.
Spatial data is usually stored as coordinates and topology, and is data that can be mapped. It
enables the global finding and locating of individuals or devices anywhere in the world.


1.10.1.1 Representation of Space


Burrough & McDonnell (1998) described two ways to represent space (an area, landscape
or some bigger unit), which are as follows :
a. Discrete entities : The space is seen as occupied by entities that are described by
their properties and can be located on earth using coordinate systems. The entities have a
clear boundary. Buildings, roads, land parcels etc. are examples of discrete entities.
b. Continuous fields : The space is seen as the variation of an attribute over it, forming a
continuous field. No physical boundary can be observed in such a case. Temperature, pressure,
elevation etc. across an area are examples of continuous fields.

1.10.1.2 Types of Data Representation


The data can be represented in any of the following formats :
• Numeric data
• Vector data
• Raster data

Numeric data
Numeric data is statistical data which includes a geographical component or field that can be
joined with vector files so the data can be queried and displayed as a layer on a map in a GIS.
The most common type of numeric data is demographic data from the US Census.

Vector data
Vector data is data that has a spatial component, or X,Y coordinates assigned to it. Vector
files can contain sets of points, lines, or polygons that are referenced in a geographic space.
There are three types of features :
• Point (vertex, node) is a 0-dimensional object and has the property of location (x,y).
• Line (edge, link, chain, arc) is a one-dimensional object that has the property of length. An
arc starts with a node, has zero or more vertices, and ends with a node.
• Polygon is a two-dimensional object with properties of area and perimeter.

Fig. 1.10.1 Vector entities
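
The three vector feature types and the properties named above (location, length, area and perimeter) can be sketched in Python as follows. This is a minimal illustration for planar (x, y) coordinates; the class and method names are hypothetical and do not correspond to any particular GIS library.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

Coord = Tuple[float, float]


@dataclass
class Point:
    """0-dimensional feature: only a location."""
    x: float
    y: float


@dataclass
class Line:
    """1-dimensional feature: start node, zero or more vertices, end node."""
    coords: List[Coord]

    def length(self) -> float:
        return sum(math.dist(a, b) for a, b in zip(self.coords, self.coords[1:]))


@dataclass
class Polygon:
    """2-dimensional feature: ring of vertices, first vertex not repeated."""
    ring: List[Coord]

    def _closed(self) -> List[Coord]:
        return self.ring + self.ring[:1]

    def perimeter(self) -> float:
        closed = self._closed()
        return sum(math.dist(a, b) for a, b in zip(closed, closed[1:]))

    def area(self) -> float:
        # Shoelace formula for a simple (non self-intersecting) polygon.
        closed = self._closed()
        twice_area = sum(x1 * y2 - x2 * y1
                         for (x1, y1), (x2, y2) in zip(closed, closed[1:]))
        return abs(twice_area) / 2


well = Point(2.0, 1.0)
road = Line([(0.0, 0.0), (3.0, 4.0)])
parcel = Polygon([(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)])
print(road.length(), parcel.area(), parcel.perimeter())
```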


Raster data
Raster data is data stored as a grid of cells (pixels), typically in a .JPG, .TIF, .GIF or
similar image format.
Items scanned using a flatbed scanner, like the map given below, are examples of raster files.
Images taken with a digital camera produce these same types of files.

Fig. 1.10.2 Raster data format

1.10.1.3 Data Content


Based on data content, spatial data can be classified as :
Temporal - Constantly changing data, i.e. data that represents the dynamic variables at
different time frames (t1, t2). Example of temporal data layers would be rainfall, stream discharge,
and land use.
Thematic - Contains information on some unique aspect or attribute class, e.g. watershed
boundaries, soils, geology.

1.10.1.4 Data Structure


Discrete Data - Discrete Data is object-based, categorical or discontinuous data. It represents
objects defined as points, lines, or areas. Examples are weather stations, rivers, lakes.
In addition, we can find exact spatial objects (discrete features with well-defined boundaries)
and inexact spatial objects (discrete features also called “fuzzy entities” with no precise
boundaries, i.e. the boundaries are transitional).
Continuous Data - Continuous Data is field-based or surface data. It covers a continuous
space, represented by a large number of discrete units. An example is an elevation map
(represented as a raster).
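
To make the idea of a continuous surface represented by discrete units concrete, the sketch below stores a tiny elevation raster as a grid of cells and looks up the cell that contains a given map coordinate. All numbers (elevations, origin, cell size) are made-up illustrative values, and the function name is hypothetical.

```python
# Elevation values in metres; row 0 is the northernmost row (illustrative data).
elevation = [
    [120, 125, 130],
    [118, 122, 128],
    [115, 119, 124],
]

# Assumed georeferencing: map coordinates of the top-left corner and the
# width/height of one square cell, both in metres.
origin_x, origin_y = 500000.0, 4200030.0
cell_size = 10.0


def cell_value(x, y):
    """Return the elevation of the cell containing map point (x, y), or None."""
    col = int((x - origin_x) // cell_size)
    row = int((origin_y - y) // cell_size)  # rows increase southwards
    if 0 <= row < len(elevation) and 0 <= col < len(elevation[0]):
        return elevation[row][col]
    return None  # the point lies outside the raster extent


print(cell_value(500015.0, 4200025.0))
```

Every location inside a cell returns the same value, which is how a raster approximates a continuous field with a finite number of units.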

1.10.2 Attribute Data


Attribute data describes characteristics of the spatial features. It is also called non-spatial
data or characteristic data.
These characteristics can be quantitative and/or qualitative in nature. Attribute data is often
referred to as tabular (numeric) data. Non-spatial data are generally one-dimensional and
independent. A separate data model is used to store and maintain attribute data for GIS
software. These data models may exist internally within the GIS software, or may be reflected in
external commercial Database Management Software (DBMS).
A variety of different data models exist for the storage and management of attribute data.
The most common are :
• Tabular Model
• Hierarchical Model
• Network Model
• Relational Model
• Object Oriented Model
Most early GIS software packages stored their attribute data in the tabular model. The next three
models are those most commonly implemented in Database Management Systems (DBMS).
The object oriented model is newer but rapidly gaining in popularity for some applications.

1.10.2.1 Tabular Model


The tabular model stores attribute data as sequential data files with fixed formats (or comma
delimited for ASCII data), so that attribute values are located within a predefined record
structure.
This type of data model is outdated in the GIS arena.
It lacks any method of checking data integrity and is inefficient with respect to data
storage, e.g. it offers limited indexing capability for attributes or records.

Fig. 1.10.3 Example showing tabular model

1.10.2.2 Hierarchical Model


The hierarchical model is the earliest database model. It evolved from the file system, with
records arranged in a hierarchy or tree structure. Records are connected through pointers that
store the address of the related record. Each pointer establishes a parent-child relationship
where a parent can have more than one child but a child can only have one parent. There is no
connection between the elements at the same level. To locate a particular record, you have to
start at the top of the tree with a parent record and trace down the tree to the child.

Fig. 1.10.4 Example showing hierarchical data structure

The figure above describes electronic gadgets in day-to-day use. We can see that flash is a
child of mp3 players, which is a child of portable electronics, which is a child of electronics. The
topmost element, electronics, has no parent. Tube, LCD, plasma, CD players and 2 way radios are
leaf nodes (they do not have any children).
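
The parent-child structure described above can be sketched in Python by storing, for every record, a pointer to its single parent. The intermediate node "televisions" and the placement of CD players and 2 way radios under portable electronics are assumptions made for illustration, since the text names only some of the links.

```python
# Each record stores a pointer to its single parent (None for the root).
# "televisions" is an assumed intermediate level, not named in the text.
PARENT = {
    "electronics": None,
    "televisions": "electronics",
    "portable electronics": "electronics",
    "tube": "televisions",
    "LCD": "televisions",
    "plasma": "televisions",
    "mp3 players": "portable electronics",
    "CD players": "portable electronics",
    "2 way radios": "portable electronics",
    "flash": "mp3 players",
}


def children(node):
    """All records whose parent pointer refers to the given node."""
    return [name for name, parent in PARENT.items() if parent == node]


def path_from_root(target, node="electronics"):
    """Start at the top of the tree and trace down to the target record."""
    if node == target:
        return [node]
    for child in children(node):
        path = path_from_root(target, child)
        if path:
            return [node] + path
    return []


leaves = [name for name in PARENT if not children(name)]
print(path_from_root("flash"))
print(leaves)
```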

1.10.2.3 Network Data Structure Model


A network is a generalized graph that captures relationships between objects using connectivity.
A network database consists of a collection of records that are connected to each other through
links. A link is an association between two records. It allows each record to have many parents
and many children, thus allowing a natural model of relationships between entities.

Fig. 1.10.5 Example showing network data structure model
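
A minimal sketch of the network model in Python follows. Each link associates two records, and a record may appear on either side of many links, so it can have many parents and many children. The supplier and part names are hypothetical.

```python
from collections import defaultdict

# Hypothetical links: each pair associates an owner record with a member record.
LINKS = [
    ("Supplier A", "Bolt"),
    ("Supplier A", "Nut"),
    ("Supplier B", "Bolt"),
    ("Supplier B", "Washer"),
]

members = defaultdict(set)  # owner  -> its children
owners = defaultdict(set)   # member -> its parents

for owner, member in LINKS:
    members[owner].add(member)
    owners[member].add(owner)

# "Bolt" has two parents, which a strict hierarchy would not allow.
print(sorted(owners["Bolt"]))
print(sorted(members["Supplier A"]))
```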

1.10.2.4 Relational Data Structure Model


The relational data model was introduced by Codd in 1970. The relational database relates or
connects data in different files through the use of a common field. A flat file structure is used with
a relational database model. In this arrangement, data is stored in different tables made up of rows
and columns. The columns of a table are named by attributes. Each row in the table is called a
tuple and represents a basic fact. No two rows of the same table may have identical values in all
columns.
There are two crucial data integrity constraints, viz. primary key and foreign key. A primary key
is an attribute whose value is unique across all tuples (rows) in a relation (table). When the primary
key of one table appears as an attribute of another table, it is known as a foreign key in that table.

Fig. 1.10.6 Example showing relational data structure model
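
The way a common field connects two tables can be sketched with Python's built-in sqlite3 module. The table names (owner, parcel), their columns and the sample rows are hypothetical; owner_id is the primary key of the owner table and appears as a foreign key in the parcel table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE owner (
    owner_id INTEGER PRIMARY KEY,
    name     TEXT
);
CREATE TABLE parcel (
    parcel_id INTEGER PRIMARY KEY,
    land_use  TEXT,
    owner_id  INTEGER REFERENCES owner(owner_id)  -- foreign key
);
INSERT INTO owner VALUES (1, 'Asha'), (2, 'Ravi');
INSERT INTO parcel VALUES (101, 'Residential', 1),
                          (102, 'Commercial', 2),
                          (103, 'Residential', 1);
""")

# Relate the two tables through the common field owner_id.
rows = conn.execute("""
    SELECT parcel.parcel_id, parcel.land_use, owner.name
    FROM parcel JOIN owner ON parcel.owner_id = owner.owner_id
    ORDER BY parcel.parcel_id
""").fetchall()

for row in rows:
    print(row)
```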

1.10.2.5 Object Oriented Model


 The object-oriented database model manages data through objects.
 An object is a collection of data elements and operations that together are considered a
single entity.
 The object-oriented database is a relatively new model.
 This approach has the attraction that querying is very natural, as features can be bundled
together with attributes at the database administrator's discretion.
 To date, only a few GIS packages are promoting the use of this attribute data model.

Fig. 1.10.7 Example showing object oriented model

1.11 Scales/ Levels of Measurements


Measurement is defined as the application of rules to assign numbers to objects (or their attributes).
Measurement rules are the procedures used to transform the qualities of attributes into
numbers (e.g., the type of scale used).

Three important properties :


 Magnitude is the property of “moreness”: a higher score refers to more of something.
 Equal intervals means that the difference between any two adjacent numbers refers to the same
amount of difference on the attribute.
 Absolute zero means that the scale has a zero point that refers to having none of the attribute.

Measurement Scales
 The scale determines the amount of information contained in the data.
 The scale indicates the data summarization and statistical analyses that are most appropriate.
 The attributes shown in a thematic map can be recorded by four different scales.
 Numerical values may be defined with respect to nominal, ordinal, interval or ratio scales of
measurement.
 It is important to recognize the scales of measurement used in GIS data as this determines
the kinds of mathematical operations that can be performed on the data.

o Nominal scales - Qualitative, not quantitative distinction (no absolute zero, not equal
intervals, not magnitude)
o Ordinal scales - Ranking individuals (magnitude, but not equal intervals or absolute
zero)
o Interval scales - Scales that have magnitude and equal intervals but not absolute zero
o Ratio scales - Have magnitude, equal intervals, and absolute zero (so can compute
ratios)

Nominal Scale
Nominal Scales - There must be distinct classes but these classes have no quantitative
properties. Therefore, no comparison can be made in terms of one category being higher than the
other.
For example - There are two classes for the variable gender - males and females.
There are no quantitative properties for this variable or these classes and, therefore, gender is a
nominal variable.

Other Examples :
 Country of origin
 Biological sex (male or female)
 Animal or non-animal
 Married vs. Single
Sometimes numbers are used to designate category membership.

Example : Country of Origin


 1 = United States
 2 = Mexico
 3 = Canada
 4 = Other
However, in this case, it is important to keep in mind that the numbers do not have intrinsic
meaning.

Ordinal Scale
Ordinal Scales - There are distinct classes but these classes have a natural ordering or ranking.
The differences can be ordered on the basis of magnitude.
For example - Final position of horses in a race is an ordinal variable. The horses finish first,
second, third, fourth, and so on.
The difference between first and second is not necessarily equivalent to the difference between
second and third, or between third and fourth.
An ordinal scale does not assume that the intervals between numbers are equal.

Fig. 1.11.1 Example : finishing place in a race (first place, second place)

Interval Scale
The data have the properties of ordinal data, and the interval between observations is expressed
in terms of a fixed unit of measure.
Designates an equal-interval ordering - The distance between, for example, a1 and a2 is the
same as the distance between a4 and a5.
Example - Celsius temperature is an interval variable. It is meaningful to say that 25 degrees
Celsius is 3 degrees hotter than 22 degrees Celsius, and that 17 degrees Celsius is the same amount
hotter (3 degrees) than 14 degrees Celsius. Notice, however, that 0 degrees Celsius does not have a
natural meaning. That is, 0 degrees Celsius does not mean the absence of heat!

Ratio Scale

Fig. 1.11.2 Levels of measurement

Ratio Scales - A ratio scale captures the properties of the other types of scales, but also contains
a true zero, which represents the absence of the quality being measured. Because this absolute zero
is meaningful, a meaningful ratio (fraction) can be constructed. For example, for the number of
clients in the past six months, it is meaningful to say that “...we had twice as many clients in this
period as we did in the previous six months.”
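
The link between the three properties and the operations each scale supports can be sketched in Python as below. The operation labels are illustrative wording rather than standard terminology, and the temperature values are arbitrary. The second part shows why a ratio of Celsius values is not meaningful: the ratio changes when the same temperatures are expressed in kelvin, while the difference does not.

```python
# (magnitude, equal intervals, absolute zero) for each scale of measurement.
SCALES = {
    "nominal":  (False, False, False),
    "ordinal":  (True,  False, False),
    "interval": (True,  True,  False),
    "ratio":    (True,  True,  True),
}


def allowed_operations(scale):
    """List the comparisons that are meaningful on the given scale."""
    magnitude, equal_intervals, absolute_zero = SCALES[scale]
    operations = ["equal / not equal"]
    if magnitude:
        operations.append("greater / less than")
    if equal_intervals:
        operations.append("add / subtract")
    if absolute_zero:
        operations.append("multiply / divide")
    return operations


for name in SCALES:
    print(name, allowed_operations(name))

# Interval data: differences survive a change of zero point, ratios do not.
t1_c, t2_c = 20.0, 10.0
t1_k, t2_k = t1_c + 273.15, t2_c + 273.15
print(t1_c - t2_c, t1_k - t2_k)   # same difference in both units
print(t1_c / t2_c, t1_k / t2_k)   # different ratios, so "twice as hot" is not valid
```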

1.12 Two Marks Questions with Answers


Q.1 Define GIS
Ans. : GIS stands for Geographical Information System. According to Burrough (1986),
GIS is defined as an integrated tool, capable of mapping, analyzing, manipulating and storing
geographical data in order to provide solutions to real world problems and help in planning for
the future.
GIS deals with what and where components of occurrences. For example, to regulate rapid
transportation, government decides to build fly-over (what component) in those areas of the
city where traffic jams are common (where component).
Q.2 List the components of GIS
Ans. : Hardware, Software, Data, Method, People.
Q.3 List out the three basic kinds of vector entities in GIS.
Ans. : The Vector Model uses points and their coordinates (X, Y) to represent spatial features
(ESRI). There are three types of features:
Point (vertex, node) is a 0-dimensional object and has the property of location (x,y).
Line (edge, link, chain, arc) is a one-dimensional object that has the property of length. An
arc starts with a node, has zero or more vertices, and ends with a node.
A polygon is a two-dimensional object with properties of area and perimeter.

Q.4 What are the applications of GIS?


Ans. :
 Tax Mapping
 Business
 Logistics

 Emergency evacuation
 Environment
Q.5 Differentiate Spatial and Attribute data.
Ans. :
Sr. No. | Spatial data | Attribute data
1. | Describes the absolute and relative location of geographic features. | Describes characteristics of the spatial features.
2. | It is represented by vector and raster forms (including imagery). | These characteristics can be quantitative and/or qualitative in nature.
3. | Spatial data are generally multi-dimensional and autocorrelated. | Non-spatial data are generally one-dimensional and independent.

Q.6 What is the 4M analysis in GIS?


Ans. : In a GIS, we measure environmental parameters, develop maps portraying earth
characteristics, monitor changes in surrounding space and time, and also model alternatives of
actions and processes operating in the environment. These are called the four Ms of GIS :
measuring, mapping, monitoring and modeling.
Q.7 Write any 4 advantages of GIS.
Ans. :
 Geospatial data is better maintained in a standard format
 Revision and updating are easier
 Geospatial data and information are easier to search, analyze and represent
 Value added products can be generated
 Geospatial data can be shared and exchanged freely
 Productivity and efficiency of staff are improved
 Time and money are saved
 Better decision making
Q.8 Which data is called characteristic data? Why?
Ans. : Attribute data is known as characteristic data because it describes the characteristics of
the spatial features.
These characteristics can be quantitative and/or qualitative in nature.
Attribute data is often referred to as tabular(Numeric) data.
Non-spatial data are generally one-dimensional and independent.


Q.9 What are the levels of measurement?
Ans. : The scale determines the amount of information contained in the data.
The scale indicates the data summarization and statistical analyses that are most appropriate.
Nominal scales - Qualitative, not quantitative distinction (no absolute zero, not equal
intervals, not magnitude)
Ordinal scales - Ranking individuals (magnitude, but not equal intervals or absolute zero)
Interval scales - Scales that have magnitude and equal intervals but not absolute zero
Ratio scales - Have magnitude, equal intervals, and absolute zero (so can compute ratios)
Q.10 Write the types of Coordinate Systems
Ans. : Cartesian Coordinate system - can be represented by a grid with a numbering system that
can locate information on a horizontal and vertical axis.
Polar coordinate system - information is located using only an angle and a distance (radius).
Global coordinate system - is where two numbers (latitude and longitude) are used to
reference a specific location on the earth.
Q.11 What is Geographical co-ordinate system?
Ans. : This is one of the true coordinate systems. The location of any point on the earth's surface
can be defined by a reference using latitude and longitude.
Q.12 What are the hardware components of a GIS?
Ans. : Hardware is the computer on which a GIS operates. Today, GIS software runs on a wide
range of hardware types, from centralized computer servers to desktop computers used in
stand-alone or networked configurations.
Q.13 What are the software components of a GIS?
Ans. :
 Data acquisition/Input
 Data processing and preprocessing
 Database management (storage and retrieval)
 Spatial data manipulation and analysis
 Product generation: output and visualization
Q.14 What are the data input devices used in a GIS?
Ans. : The different methods of input into a GIS are by
 Keyboard entry
 Manual digitizing
 Scanning and automatic digitizing.

Q.15 What are the data output devices used in a GIS?


Ans. : The important data output devices used in a GIS are
1. Plotter : Used to plot the graphical information after analysis on paper.
2. Printer : Used to print the information after analysis on paper.
3. VDU : Visual display unit, used to display the results after analysis.
4. Tape drive : Used to store the results after analysis and take them to other systems.
Q.16 List the important GIS software packages.
Ans. : Standard GIS software packages
 ARCGIS
 ARCVIEW
 ARCINFO
 MAPINFO
 ERDAS
 ENVI
 AUTOCADMAP
 IDRISI
Q.17 What is DBMS?
Ans. : A Database Management System (DBMS) is a software package with computer programs
that control the creation, maintenance, and use of a database. It allows organizations to
conveniently develop databases for various applications by Database Administrators (DBAs) and
other specialists. A database is an integrated collection of data records, files, and other objects. A
DBMS allows different user application programs to concurrently access the same database.
Q.18 Define attribute values
Ans. : An attribute-value system is a basic knowledge representation framework comprising a
table with columns designating "attributes" (also known as "properties", "predicates," "features,"
"dimensions," "characteristics" or "independent variables" depending on the context) and rows
designating "objects" (also known as "entities," "instances," "exemplars," "elements" or
"dependent variables"). Each table cell therefore designates the value (also known as "state") of
a particular attribute of a particular object.
Q.19 What are the input formats of GIS software?
Ans. :
 text files

 existing maps
 aerial photographs
 satellite imagery
 airborne scanners
 field measurements
 other GIS databases
Q.20 Differentiate Latitude and longitude.
Ans. : Latitude : Imaginary lines that run horizontally around the globe and are measured from
90° north to 90° south. Also known as parallels, latitudes are equidistant from each other.

Longitude : Imaginary lines that run vertically around the globe. Also known as meridians,
longitudes are measured from 180° east to 180° west. Longitudes meet at the poles and are
widest apart at the equator.

1.13 Long Answered Questions with Answers


Q.1 Write about the components of GIS in detail. (Refer section 1.6)
Q.2 Illustrate the concept of Coordinate System. (Refer section 1.3)
Q.3 Explain the workflow of Geographic Information System in detail. (Refer section 1.7)
Q.4 Discuss briefly about Spatial and Attribute data. (Refer section 1.10)
Q.5 Discuss in detail about Software Functional elements. (Refer section 1.7)
Q.6 While writing coordinates of a location, latitude is followed by longitude. For example, coordinates of
Delhi is written as 28° 36′ 50″ N, 77° 12′ 32″ E. Convert a coordinate pair from degree minute second
to decimal degree. (Refer page No. 1 - 5)
Q.7 Write in detail about Proprietary and open source Software of GIS. (Refer section 1.9)
Q.8 Explain the types of data models in Attribute data. (Refer section 1.10.2)
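
For Q.6, the conversion from degree minute second to decimal degree can be sketched in Python as below, using the standard relation decimal degree = degrees + minutes/60 + seconds/3600. The function name and the treatment of southern and western hemispheres as negative values are assumptions of this sketch and are not taken from the referenced page.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degree minute second to decimal degree.

    South and west are returned as negative values (a common convention,
    assumed here).
    """
    value = degrees + minutes / 60 + seconds / 3600
    return -value if hemisphere in ("S", "W") else value


# Coordinates of Delhi as given in Q.6: 28° 36′ 50″ N, 77° 12′ 32″ E
latitude = dms_to_decimal(28, 36, 50, "N")
longitude = dms_to_decimal(77, 12, 32, "E")
print(round(latitude, 4), round(longitude, 4))  # 28.6139 77.2089
```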




TM
Technical Publications - An up thrust for knowledge
Geographic Information System 2-2 Spatial Data Models

2.1 Database Structures

Database in GIS:
A database is a collection of related information that permits the entry, storage, input, output
and organization of data. A database management system (DBMS) serves as an interface between
users and their database.
A spatial database includes location. It has geometry as points, lines and polygons.
A GIS combines spatial data from many sources and is used by many different people. Databases
connect these users to the GIS data.
For example, a city might have the waste water division, land records, transportation and fire
departments connected and using datasets from common spatial databases.
The database structure is the collection of record type and field type definitions that comprise
your database:
 Record Types: These define the type of entities or research objects you wish to capture
(e.g. Person).
 Fields: These are the properties or attributes that describe your record types
(e.g. Gender, Age, Height etc.).
Collectively, these define the information or data that can be stored in any record of that type.
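
As a small sketch, the Person record type and its fields from the example above can be written as a Python dataclass. The field types and the unit suffix in height_cm are assumptions; the text names only the fields themselves.

```python
from dataclasses import dataclass, fields


@dataclass
class Person:
    """Record type: the kind of entity to capture."""
    gender: str       # field
    age: int          # field
    height_cm: float  # field (unit assumed for illustration)


# One record of that type holds one value per field.
record = Person(gender="Female", age=34, height_cm=162.0)

# The structure (field definitions) exists independently of any record.
field_names = [f.name for f in fields(Person)]
print(field_names)
print(record)
```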

2.2 Data Structure Models (Structures of Databases)


 Data models are the conceptual models that describe the structures of databases.
 Structure of a database is defined by the data types, the constraints and the relationships for
the description or storage of data.
Following are the most often used data models:
 Relational Model
 Object oriented Model
 ER Diagram

2.2.1 Relational Model


The relational data model was introduced by Codd in 1970. The relational database relates or
connects data in different files through the use of a common field. A flat file structure is used with
a relational database model. In this arrangement, data is stored in different tables made up of rows
and columns. The columns of a table are named by attributes. Each row in the table is called a
tuple and represents a basic fact. No two rows of the same table may have identical values in all
columns.
There are two crucial data integrity constraints viz. primary key and foreign key. A primary key
is an attribute whose value is unique across all tuples (rows) in a relation (table). The primary key
of one table appearing as an attribute of another table is known as a foreign key in that table.

Fig. 2.2.1 Relational Data Model

2.2.1.1 Relational Model Concepts


 Attribute: Each column in a table. Attributes are the properties which define a relation,
e.g., Student_Rollno, NAME, etc.
 Tables: In the relational model, relations are saved in table format, each stored along with
its entities. A table has two properties, rows and columns. Rows represent records and
columns represent attributes.
 Tuple: A single row of a table, which contains a single record.
 Relation schema: A relation schema represents the name of the relation with its attributes.
 Degree: The total number of attributes in the relation is called the degree of the relation.
 Cardinality: The total number of rows present in the table.
 Column: The column represents the set of values for a specific attribute.
 Relation instance: A relation instance is a finite set of tuples in the RDBMS. Relation
instances never have duplicate tuples.
 Relation key: One or more attributes that together uniquely identify each row are called the
relation key.
 Attribute domain: Every attribute has a pre-defined set of permitted values and scope, which
is known as the attribute domain.

2.2.1.2 Operations in Relational Model


Four basic operations performed on the relational database model are
insert, delete, modify (update) and select.
 Insert is used to insert data into the relation.
 Delete is used to delete tuples from the table.
 Modify allows you to change the values of some attributes in existing tuples.
 Select allows you to choose a specific range of data.
Whenever one of these operations is applied, the integrity constraints specified on the relational
database schema must never be violated.
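
The four operations can be sketched with Python's built-in sqlite3 module, where the modify operation corresponds to the SQL UPDATE statement. The well table, its columns and the sample values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE well (well_id INTEGER PRIMARY KEY, village TEXT, depth_m REAL)"
)

# Insert: add tuples to the relation.
conn.executemany(
    "INSERT INTO well VALUES (?, ?, ?)",
    [(1, "Village A", 30.0), (2, "Village A", 45.5), (3, "Village B", 25.0)],
)

# Modify: change the value of an attribute in an existing tuple.
conn.execute("UPDATE well SET depth_m = ? WHERE well_id = ?", (32.0, 1))

# Delete: remove a tuple from the table.
conn.execute("DELETE FROM well WHERE well_id = ?", (3,))

# Select: choose a specific range of data.
deep_wells = conn.execute(
    "SELECT well_id, depth_m FROM well WHERE depth_m > 31 ORDER BY well_id"
).fetchall()
print(deep_wells)
```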

2.2.1.3 Relational Integrity Constraints


Relational integrity constraints are conditions which must hold for a relation to be valid.
These integrity constraints are derived from the rules in the mini-world that the database
represents.
There are many types of integrity constraints. Constraints on a relational database
management system are mostly divided into three main categories:
 Domain constraints - Domain constraints specify that, within each tuple, the value of
each attribute must be drawn from the domain of that attribute. The domain is specified as a
data type, which includes standard data types such as integers and real numbers.
 Key constraints - An attribute that can uniquely identify a tuple in a relation is called the
key of the table. The value of the attribute for different tuples in the relation has to be
unique.
 Referential integrity constraints - These are based on the concept of foreign keys. A foreign
key is an attribute of one relation that refers to the key of another relation, so every foreign
key value must match an existing key value in the referenced relation.

Advantages of using Relational model


 Simplicity: A relational data model is simpler than the hierarchical and network models.
 Structural independence: The relational database is only concerned with data and not with
a structure. This can improve the performance of the model.
 Easy to use: The relational model is easy to use, as tables consisting of rows and columns
are quite natural and simple to understand.
 Query capability: It makes it possible for a high-level query language like SQL to avoid
complex database navigation.
 Data independence: The structure of a database can be changed without having to change
any application.
 Scalable: A database can be enlarged, in both the number of records (rows) and the number
of fields, to enhance its usability.

Disadvantages of using Relational model


 Some relational databases have limits on field lengths which cannot be exceeded.
 Relational databases can sometimes become complex as the amount of data grows and the
relations between pieces of data become more complicated.
 Complex relational database systems may lead to isolated databases where the information
cannot be shared from one system to another.

2.2.2 Object Oriented Model


This data model is another method of representing real world objects. It treats each real-world
entity as an object and isolates the objects from one another. It groups related functionality
together and allows that functionality to be inherited by related sub-groups.

Elements of Object oriented data model


 Objects
The real world entities and situations are represented as objects in the Object oriented
database model.
 Attributes and Method
Every object has certain characteristics. These are represented using Attributes. The
behaviour of the objects is represented using Methods.
 Class
Similar attributes and methods are grouped together using a class. An object can be called as
an instance of the class.
 Inheritance
A new class can be derived from the original class.
The derived class contains attributes and methods of the original class as well as its own.

For example, consider an Employee database. In this database we have different types of
employees : Engineer, Accountant, Manager and Clerk. But all these employees belong to the
Person group. A Person can have different attributes like name, address, age and phone.

Fig. 2.2.2 Object oriented model
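
The elements listed above (objects, attributes, methods, classes and inheritance) can be sketched in Python using the Person and Engineer classes from the example. The discipline attribute, the method names and the sample values are hypothetical additions for illustration.

```python
class Person:
    """Base class: attributes and methods shared by every employee type."""

    def __init__(self, name, address, age, phone):
        self.name = name
        self.address = address
        self.age = age
        self.phone = phone

    def contact_card(self):
        return f"{self.name} ({self.phone})"


class Engineer(Person):
    """Derived class: inherits from Person and adds its own attribute."""

    def __init__(self, name, address, age, phone, discipline):
        super().__init__(name, address, age, phone)
        self.discipline = discipline  # hypothetical extra attribute

    def describe(self):
        return f"{self.name}, {self.discipline} engineer"


# An object is an instance of a class.
engineer = Engineer("Kumar", "Madurai", 29, "0000000000", "civil")
print(engineer.contact_card())  # method inherited from Person
print(engineer.describe())      # method defined by Engineer
```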

Advantages
 Because of inheritance, attributes and functionality can be reused, which reduces the cost
of maintaining the same data multiple times. The information is also encapsulated, so there
is no fear of it being misused by other objects. If a new feature is needed, we can easily add
a new class that inherits from the parent class and adds the new feature. Hence it reduces
overhead and maintenance costs.
 Because of the above feature, it is more flexible when changes are needed.
 Code is reused because of inheritance.
 Since each class binds its attributes and its functionality together, it mirrors the real world
object. We can see each object as a real entity. Hence it is more understandable.

Disadvantages
 It is not yet widely developed or complete enough for use in database systems. Hence it is
not widely accepted by users.
 It is an approach for solving a requirement rather than a technology. Hence it is difficult to
implement in database management systems.

2.2.3 Entity-Relationship Model


Entity-Relationship (ER) Model is based on the notion of real-world entities and relationships
among them. While formulating real-world scenario into the database model, the ER Model creates
entity set, relationship set, general attributes and constraints.
ER Model is best used for the conceptual design of a database.
ER Model is based on −
Entities and their attributes.
Relationships among entities.
These concepts are explained below.

Fig. 2.2.3 ER Diagram

 Entity − An entity in an ER Model is a real-world entity having properties called attributes.
Every attribute is defined by its set of values, called its domain. For example, in a school
database, a student is considered as an entity. A student has various attributes like name, age,
class, etc.
 Relationship − The logical association among entities is called a relationship. Relationships
are mapped with entities in various ways. Mapping cardinalities define the number of
associations between two entities.

Mapping cardinalities
1. one to one 2. one to many
3. many to one 4. many to many

Attributes
Entities are represented by means of their properties, called attributes. All attributes have
values. For example, a student entity may have name, class, and age as attributes.

There exists a domain or range of values that can be assigned to attributes. For example, a
student's name cannot be a numeric value. It has to be alphabetic. A student's age cannot be
negative, etc.

Types of Attributes
 Simple attribute − Simple attributes are atomic values, which cannot be divided further. For
example, a student's phone number is an atomic value of 10 digits.
 Composite attribute − Composite attributes are made of more than one simple attribute.
For example, a student's complete name may have first_name and last_name.
 Derived attribute − Derived attributes are attributes that do not exist in the physical
database; their values are derived from other attributes present in the database. For
example, average_salary in a department should not be saved directly in the database;
instead it can be derived. As another example, age can be derived from date_of_birth.
 Single-value attribute − Single-value attributes contain a single value. For example −
Social_Security_Number.
 Multi-value attribute − Multi-value attributes may contain more than one value. For
example, a person can have more than one phone number, email_address, etc.
In the diagram below, entities (real world objects) are represented by rectangular boxes.
Their attributes are represented by ovals. Primary keys of entities are underlined. The
entities are related to one another using diamonds. This is one of the methods of representing
an ER model; there are many different forms of representation.

Fig. 2.2.4 Example - ER diagram
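
The attribute types can be sketched for a Student entity in Python as below. The class layout, the sample values and the age_on helper are hypothetical. The point of the sketch is that age is computed from date_of_birth on request rather than stored.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Student:
    first_name: str      # simple attributes that together form
    last_name: str       # the composite attribute "complete name"
    date_of_birth: date  # single-value attribute
    phone_numbers: list = field(default_factory=list)  # multi-value attribute

    @property
    def full_name(self):
        """Composite attribute assembled from its simple parts."""
        return f"{self.first_name} {self.last_name}"

    def age_on(self, today):
        """Derived attribute: computed from date_of_birth, never stored."""
        had_birthday = (today.month, today.day) >= (
            self.date_of_birth.month,
            self.date_of_birth.day,
        )
        return today.year - self.date_of_birth.year - (0 if had_birthday else 1)


student = Student("Meena", "Kumar", date(2000, 6, 15), ["0000000001", "0000000002"])
print(student.full_name, student.age_on(date(2019, 9, 1)))
```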

Basically, ER model is a graphical representation of real world objects with their attributes and
relationship. It makes the system easily understandable. This model is considered as a top down
approach of designing a requirement.

Advantages
 It makes the requirement simple and easily understandable by representing it in simple diagrams.
 One can convert ER diagrams into a record-based data model easily.
 ER diagrams are easy to understand.

Disadvantages
 No standard notations are available for ER diagrams. There is great flexibility in the notation.
It all depends on how the designer draws it.
 It is meant for high level designs. It cannot be carried down to low level design such as coding.

2.3 Spatial Data Models


Spatial data refers to the data or information that describes the absolute or relative location of
geographic features on the earth. The non spatial data or the attribute data on the other hand
describes the characteristics of the spatial features. These characteristics can be quantitative or
qualitative.

2.3.1 Representation of Space


Burrough & McDonnell (1998) described two ways to represent the space (an area, landscape
or some bigger unit), which are as follows:
a) Discrete Entities: The space could be seen as occupied with entities that are described by
their properties and can be located on earth using coordinate systems. The entities have a
clear boundary.
Buildings, roads, land parcels etc. are examples of discrete entities.
b) Continuous fields: The space could be seen as a continuous field over which an attribute
varies. No physical boundary can be observed in such a case.
Temperature, pressure, elevation etc. across an area are examples of continuous fields.
The term spatial data model is used to describe, how geographical data are organized within a
GIS in order to represent real world phenomena.

2.3.2 Data Models


 A data model is a description or view of the real world.
 Data modeling is a process that formalizes the description or view at different levels of data
abstraction.

 Since, the real world is made up of complex spatial objects and phenomena, it is practically
impossible for a single data model to represent everything that is present.
 This means that different users may have different data models when they attempt to collect
data in the same location.

GIS uses one of two spatial data models

a) Raster data model


 It divides the study area into cells, usually rectangular grid cells.
 It is location based because emphasis is placed upon the location of each cell relative to
other cells.
 It is frequently used to model field data.
 The cells correspond to regularly spaced points on a continuous surface.

b) Vector data model


 It is used to represent discrete phenomena by means of geometric primitives (point, line
and polygon).
 It is object-based.
 A Triangular Irregular Network (TIN) is a vector-based structure used to represent 3D surfaces.
Data models are conceptual models of the real world. They describe the representation and
storage of geographic data. The data models used in GIS are described below:

2.3.2.1 Vector Data Model


The vector data model is closely linked with the discrete object view. In the vector data model,
geographical phenomena are represented in three different forms : point, line and polygon. The
shape of a spatial entity is stored using a two-dimensional (x, y) coordinate system.
Point : A location depicted by a single set of (x, y) coordinates at the scale of abstraction.
The wells in a village, electricity poles in a town and cities on a world map are examples
of spatial features described by points.
Note : A city can be marked as a single point on a world map but would be marked as a
polygon on a state map. The scale plays an important role in deciding the geometry of a
geographical feature.

Fig. 2.3.1 Vector data model

Line/Arc : An ordered set of (x, y) coordinate pairs arranged to form a linear feature. Curves
in a linear feature are approximated by increasing the density of points/vertices.
Roads, railways and telephone cables are examples of spatial features described by lines.
Polygon : A set of (x, y) coordinate pairs enclosing a homogeneous area.
Land parcels, agricultural farms and water bodies are examples of spatial features
described by polygons.
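
As a minimal sketch of the three vector primitives, the Python snippet below stores a point, a line and a polygon as plain (x, y) coordinate tuples and derives a length and an area from them. The coordinates and the names `well`, `road` and `parcel` are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical coordinates, for illustration only.
well = (3.0, 4.0)                                            # point: one (x, y) pair
road = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]                  # line: ordered vertices
parcel = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]    # polygon: ring of vertices


def line_length(vertices):
    """Sum the straight-line distances between consecutive vertices."""
    return sum(math.dist(p, q) for p, q in zip(vertices, vertices[1:]))


def polygon_area(ring):
    """Shoelace formula; the ring is closed implicitly (last vertex joins first)."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0
```

Adding more vertices to `road` would let it follow a curve more closely, which is the point made above about vertex density.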

2.3.2.2 Raster Data Model


The raster data model is commonly associated with the field conceptual model. Here,
geographic space is represented by an array of cells or pixels (picture elements) arranged in
rows and columns. Each pixel has a value that represents information. The value can be an
integer, a floating-point number or an alphanumeric code.
A point can be represented by a single pixel in the raster model. A line is a chain of spatially
connected cells with the same value. Similarly, a water body is represented as a set of
contiguous pixels with the same value, which together form a homogeneous area.

Fig. 2.3.2 Raster data model
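
The raster representation of a point, a line and an area can be sketched with a small grid of integer codes. The grid and the meaning of the codes (1 for a well, 2 for a road, 3 for a water body) are assumptions made only for this example.

```python
# A small hypothetical raster: 0 = background, 1 = well (point),
# 2 = road (line), 3 = water body (area). Codes are illustrative only.
grid = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 3, 3],
    [0, 0, 0, 0, 3, 3],
    [2, 2, 2, 2, 0, 0],
    [0, 0, 0, 0, 0, 0],
]


def cells_with_value(raster, value):
    """Return the (row, column) positions of every cell holding `value`."""
    return [
        (r, c)
        for r, row in enumerate(raster)
        for c, cell in enumerate(row)
        if cell == value
    ]
```

Here the well occupies a single cell, the road is a chain of connected cells in one row, and the water body is a block of contiguous cells sharing the same value.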

Comparison between Vector and Raster Data Models

| Data Model | Advantages | Disadvantages |
|---|---|---|
| Raster | Simple data structure | Cell size determines the resolution at which the data is represented |
| Raster | Compatible with remote sensing or scanned data | Requires a lot of storage space |
| Raster | Spatial analysis is easier | Projection transformations are time consuming |
| Raster | Simulation is easy because each unit has the same size and shape | Network linkages are difficult to establish |
| Vector | Data is represented at its original resolution and form without generalization | The location of each vertex is to be stored explicitly |
| Vector | Require less storage space | Overlay based on criteria is difficult |
| Vector | Editing is faster and convenient | Spatial analysis is cumbersome |
| Vector | Network analysis is fast | Simulation is difficult because each unit has a different topological form |
| Vector | Projection transformations are easier | |

2.4 Raster Data Structure


In a simple raster data structure, geographical entities are stored in a matrix of rectangular
cells. Each cell is given a code that tells users which entity is present in that cell.
The simplest way of encoding raster data in a computer can be understood in three steps:
 Entity model
 Pixel values
 File structure

a) Entity model :

Fig. 2.4.1 Entity model

 It represents the whole raster data.


 Let us assume that the raster data belongs to an area where land is surrounded by water.
 Here the entity of interest (land) is shown shaded and the area where land is not
present is shown in white.

b) Pixel values :
 The pixel value for the full image is shown.
 Cells having a part of the land are encoded as 1 and others where land is not present are
encoded as 0.

Fig. 2.4.2 Pixel values

c) File structure :


 It demonstrates the method of coding raster data.
 The first row of the file states that the image has 5 rows and 5 columns,
and that 1 is the maximum pixel value.
 Each subsequent row lists the cell values of one image row, each either 0 or 1 (as in the pixel values).

Fig. 2.4.3 File structure
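
The file structure described here (a header giving rows, columns and maximum value, followed by the cell values) can be sketched in Python as follows. The comma-separated text layout, the function names and the 5 × 5 `land` grid are assumptions for illustration; the figure's actual values are not reproduced.

```python
def encode_raster(grid):
    """Serialise a grid as text: a header line 'rows,cols,max', then one line per row."""
    rows, cols = len(grid), len(grid[0])
    max_value = max(max(row) for row in grid)
    lines = [f"{rows},{cols},{max_value}"]
    lines += [",".join(str(v) for v in row) for row in grid]
    return "\n".join(lines)


def decode_raster(text):
    """Rebuild the grid and check it against the header."""
    lines = text.splitlines()
    rows, cols, _max_value = (int(v) for v in lines[0].split(","))
    grid = [[int(v) for v in line.split(",")] for line in lines[1:]]
    if len(grid) != rows or any(len(row) != cols for row in grid):
        raise ValueError("grid does not match the header")
    return grid


# Hypothetical 5 x 5 raster: 1 = land, 0 = water.
land = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
```

Every cell is written out, so the file is the same size whether the raster holds one class or twenty.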

 The huge size of the data is a major problem with raster data.
 An image consisting of twenty different land-use classes takes the same storage space as a
similar raster map showing the location of a single forest.


 To address this problem many data compaction (Compression) methods have been
developed.

2.5 Raster Data Compression


 Data compression is the process of modifying, encoding or converting the bit structure of
data in such a way that it consumes less space on disk.
 It enables reducing the storage size of one or more data instances or elements. Data
compression is also known as source coding or bit-rate reduction.
 Data compression enables sending a data object or file quickly over a network or the Internet
and in optimizing physical storage resources.
 Data compression has wide implementation in computing services and solutions, specifically
data communications. Data compression works through several compressing techniques and
software solutions that utilize data compression algorithms to reduce the data size.
A common data compression technique replaces repetitive data elements and symbols with a
shorter representation to reduce the data size. Data compression for graphical data can be
lossless or lossy: lossless compression preserves all of the original information, so the data
can be reconstructed exactly, whereas lossy compression discards some information to achieve a
smaller size.

Compression techniques
 Run length encoding
 Block encoding
 Chain encoding
 Quadtree

2.5.1 Run Length Coding (Lossless)


 Geographical data tends to be "spatially autocorrelated", meaning that objects which are
close to each other tend to have similar attributes:
"Everything is related to everything else, but near things are more related than distant
things" (Tobler 1970).
 Because of this principle, we expect neighboring pixels to have similar values.
Therefore, instead of repeating pixel values, we can code the raster as pairs of numbers:
(value, run length).
 Run length coding is a widely used compression technique for raster data. The primary
data elements are pairs of values or tuples, consisting of a pixel value and a repetition count
which specifies the number of pixels in the run. The data are built by reading the raster
row by row, creating a new tuple every time the pixel value changes or the end
of the row is reached. Run length coding therefore describes the interior of an area by
run lengths, rather than by its boundary.


 In the example, the first row is blank and is stored as (0,8). This means there are 8 cells and
they are all zeros. In the second row, there are 4 consecutive zeros, which are stored as (0,4).
After this, there are three consecutive cells with the value 1, which are stored as (1,3). This
continues until the bottom-right cell is reached.

Fig. 2.5.1 Example for Run length coding
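
Row-by-row run length coding can be sketched as below, using (value, run length) tuples as in the example. The two sample rows follow the description above; the final cell of the second row is not given in the text and is assumed to be 0.

```python
def rle_encode_row(row):
    """Encode one raster row as (value, run_length) tuples."""
    runs = []
    for value in row:
        if runs and runs[-1][0] == value:
            runs[-1] = (value, runs[-1][1] + 1)   # extend the current run
        else:
            runs.append((value, 1))               # value changed: start a new run
    return runs


def rle_encode(grid):
    """Encode every row separately, so a run never crosses the end of a row."""
    return [rle_encode_row(row) for row in grid]


def rle_decode(encoded):
    """Expand the tuples back into the original grid (lossless)."""
    return [[value for value, count in row for _ in range(count)] for row in encoded]


rows = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1, 1, 0],   # last cell assumed for illustration
]
```

The first row collapses to the single tuple (0, 8), and decoding the result returns the original rows unchanged.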

2.5.2 Block Coding-Grouping Blocks of Data


 The block coding raster storage technique reduces redundancy by storing areas of identical
cells as square blocks.
 The block coding raster image compression method subdivides an entire raster image
into hierarchical blocks. It extends the run length encoding technique to two dimensions.
In the example :
Instead of storing 64 grid cells individually, just 7 blocks are stored. Using block coding, the
raster image is encoded with one 3×3 block, two 2×2 blocks and four 1×1 blocks.
In this block coding example, the top-left corner is used as the reference for each block.

Fig. 2.5.2 Example for Block coding
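
One simple way to produce such blocks is a greedy scan that, at each uncovered cell, grows the largest square that still fits. This is only an illustrative sketch: the greedy strategy, the function names and the 4 × 4 `sample` grid are assumptions, and the result is not guaranteed to use the fewest possible blocks.

```python
def block_encode(grid, target=1):
    """Greedily cover cells equal to `target` with square blocks.

    Returns (row, col, size) tuples, where (row, col) is the top-left corner.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    covered = [[False] * n_cols for _ in range(n_rows)]

    def fits(r, c, size):
        if r + size > n_rows or c + size > n_cols:
            return False
        return all(
            grid[i][j] == target and not covered[i][j]
            for i in range(r, r + size)
            for j in range(c, c + size)
        )

    blocks = []
    for r in range(n_rows):
        for c in range(n_cols):
            if grid[r][c] != target or covered[r][c]:
                continue
            size = 1
            while fits(r, c, size + 1):   # grow the square while it stays valid
                size += 1
            for i in range(r, r + size):
                for j in range(c, c + size):
                    covered[i][j] = True
            blocks.append((r, c, size))
    return blocks


def block_decode(blocks, n_rows, n_cols, target=1, background=0):
    """Paint each block back onto an empty grid."""
    grid = [[background] * n_cols for _ in range(n_rows)]
    for r, c, size in blocks:
        for i in range(r, r + size):
            for j in range(c, c + size):
                grid[i][j] = target
    return grid


sample = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]
```

For `sample`, five foreground cells are stored as two blocks, each identified by its top-left corner and side length.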


2.5.3 Chain Coding (Freeman Coding)-Defining the Exterior Boundary


 Chain coding defines the outer boundary using relative positions from a start point. The
sequence of moves along the exterior is stored, finishing where the endpoint meets the start point.
 During the encoding, each direction is stored as an integer: for example, 0 is north, 1 is east,
2 is south and 3 is west. In this example, however, we name the cardinal directions for simplicity.
 In the example, we start at position (5,2). From here we define the border using cardinal
directions and number of movements. We move east 3 positions until we hit the edge. At
this location, we move south 4 positions. This process continues until the end point hits the
start point.
Note : Only for the purpose of this exercise, we used north, east, south and west as alphabetical
values. When encoded, it is a numerical value.

Fig 2.5.3 Example for chain coding
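
A chain code can be decoded by replaying its moves from the start point. In the sketch below the direction codes follow the text (0 north, 1 east, 2 south, 3 west); treating positions as (column, row) with rows increasing southwards is an assumption, and only the first two moves of `chain` (east 3, south 4) come from the example, the remaining two being invented to close the boundary.

```python
# Direction codes as in the text: 0 = north, 1 = east, 2 = south, 3 = west.
# Positions are (column, row) with rows increasing southwards (an assumption).
MOVES = {0: (0, -1), 1: (1, 0), 2: (0, 1), 3: (-1, 0)}


def trace_chain(start, chain):
    """Follow (direction, steps) pairs from `start`; return every visited position."""
    col, row = start
    path = [start]
    for direction, steps in chain:
        d_col, d_row = MOVES[direction]
        for _ in range(steps):
            col, row = col + d_col, row + d_row
            path.append((col, row))
    return path


# East 3 and south 4 follow the example; west 3 and north 4 are hypothetical.
chain = [(1, 3), (2, 4), (3, 3), (0, 4)]
boundary = trace_chain((5, 2), chain)
```

A valid chain code for a closed boundary ends on the position where it began, which gives a quick consistency check.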

2.5.4 Quadtree Encoding - Subdividing Data into Quarters


 Quadtrees are raster data structures based on the successive subdivision of a raster into
homogeneous blocks. A quadtree recursively subdivides a raster image into quarters. The
subdivision process continues until every block contains a single class.
 It reduces raster storage requirements. The saving depends on the complexity of the features
and the resolution of the smallest grid cell.
 In the example, the top-left and bottom-right 8×8 grids do not need to be subdivided further
because they are homogeneous. The top-right 8×8 grid is subdivided into 4×4 grids, three of
which are homogeneous. The remaining 4×4 grid is subdivided further into its individual classes.


Fig. 2.5.4 Example for quadtree coding
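
The recursive subdivision can be sketched as a short function that returns a single value for a homogeneous block and a list of four subtrees otherwise. The nested-list representation, the quadrant order and the 4 × 4 `image` are assumptions for illustration, and the grid is assumed to be square with a side length that is a power of two.

```python
def build_quadtree(grid, row=0, col=0, size=None):
    """Return a cell value for a homogeneous block, else a list of four subtrees.

    Subtrees are ordered NW, NE, SW, SE.
    """
    if size is None:
        size = len(grid)
    first = grid[row][col]
    homogeneous = all(
        grid[r][c] == first
        for r in range(row, row + size)
        for c in range(col, col + size)
    )
    if homogeneous:
        return first
    half = size // 2
    return [
        build_quadtree(grid, row, col, half),
        build_quadtree(grid, row, col + half, half),
        build_quadtree(grid, row + half, col, half),
        build_quadtree(grid, row + half, col + half, half),
    ]


def count_leaves(tree):
    """Number of stored blocks in the quadtree."""
    if isinstance(tree, list):
        return sum(count_leaves(child) for child in tree)
    return 1


image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 0],
    [0, 0, 1, 1],
]
```

Three quadrants of `image` are homogeneous and stay whole; only the south-east quadrant is split into single cells, so 7 leaves replace 16 cells.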

2.5.5 Image Compression Reduces File size


 GIS data is abundant. With satellites acquiring images each day, raster data is the spatial
model of choice.
 Deploying efficient raster image compression techniques means reducing storage space. This
is the primary benefit of compressing your data.
 It can save money and time. You can also improve your network performance because you
are working with a reduced amount of data.

2.6 Vector Data Structure


 Geographic entities encoded using the vector data model are called features. The features
can be divided into two classes:


a. Simple features
 These are easy to create, store and are rendered on screen very quickly.
 They lack connectivity relationships and so are inefficient for modeling phenomena
conceptualized as fields.
Simple feature data are also called feature data. Shapefiles are ArcView's native file format for
geographic features and attribute data. ArcView can also display Arc/Info coverages, which are a
more complex representation of vector data. Vector data comprise the following:
 Point - a pair of x and y coordinates.
 Line - a sequence of points
 Polygon - a closed set of lines
Attribute information is stored in Feature Tables.
Point entities : These represent all geographical entities that are positioned by a single XY
coordinate pair. Along with the XY coordinates, the point must store other information, such as
what the point represents.
Line entities : Linear features made by tracing two or more XY coordinate pairs.
 Simple line: It requires a start and an end point.
 Arc: A set of XY coordinate pairs describing a continuous complex line. The shorter the line
segments and the higher the number of coordinate pairs, the more closely the chain approximates a
complex curve.
Simple Polygons : Enclosed structures formed by joining a set of XY coordinate pairs. The
structure is simple but it has a few disadvantages, which are listed below:
 Lines between adjacent polygons must be digitized and stored twice; improper digitization
gives rise to slivers and gaps.
 They convey no information about neighbors.
 Creating islands is not possible.

b. Topological features
 A topology is a mathematical procedure that describes how features are spatially related and
ensures data quality of the spatial relationships.
 Topological relationships include following three basic elements:
I. Connectivity : Information about linkages among spatial objects
II. Contiguity : Information about neighbouring spatial object
III. Containment : Information about inclusion of one spatial object within another spatial
object


Connectivity- Information about linkages among spatial objects


 Arc-node topology defines connectivity.
 Arcs are connected to each other if they share a common node.
 This is the basis for many network tracing and path finding operations.
 Arcs represent linear features and the borders of area features.
 Every arc has a from-node which is the first vertex in the arc and a to-node which is the last
vertex.
 These two nodes define the direction of the arc.
 Nodes indicate the endpoints and intersections of arcs.
 They do not exist independently and therefore cannot be added or deleted except by adding
and deleting arcs.

Fig. 2.6.1 Arc-node Topology

 Nodes can be used to represent point features which connect segments of a linear feature
(e.g., intersections connecting street segments, valves connecting pipe segments).
 Arc-node topology is supported through an arc-node list.
 For each arc in the list there is a from node and a to node.
 Connected arcs are determined by common node numbers.

Fig. 2.6.2 Node showing intersection

Fig. 2.6.3 Arc-Node Topology with list
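
Finding connected arcs from an arc-node list amounts to grouping arcs by the node numbers they share. The sketch below assumes a small hypothetical arc-node list; the arc and node numbers are not those of the figure.

```python
from collections import defaultdict

# Hypothetical arc-node list: arc id -> (from_node, to_node).
arc_nodes = {
    1: (10, 11),
    2: (11, 12),
    3: (11, 13),
    4: (14, 15),
}


def connected_arcs(arc_nodes):
    """Map each arc to the set of other arcs sharing at least one node with it."""
    arcs_at_node = defaultdict(set)
    for arc, (from_node, to_node) in arc_nodes.items():
        arcs_at_node[from_node].add(arc)
        arcs_at_node[to_node].add(arc)
    neighbours = {arc: set() for arc in arc_nodes}
    for arcs in arcs_at_node.values():
        for arc in arcs:
            neighbours[arc] |= arcs - {arc}
    return neighbours
```

Arcs 1, 2 and 3 all meet at node 11 and are therefore mutually connected, while arc 4 shares no node with the others.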

Contiguity- Information about neighbouring spatial object

 Polygon topology defines contiguity.
 The polygons are said to be contiguous if they share a common arc.
 Contiguity allows the vector data model to determine adjacency.

Fig. 2.6.4 Polygon Topology

 The from-node and to-node of an arc indicate its direction, which helps determine the
polygons on its left and right sides.
 Left-right topology refers to the polygons on the left and right sides of an arc.
 In the illustration above, polygon B is on the left and polygon C is on the right of the arc 4.
 Polygon A is outside the boundary of the area covered by polygons B, C and D.
 It is called the external or universe polygon, and represents the world outside the study area.
 The universe polygon ensures that each arc always has a left and right side defined.
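
Adjacency can be derived directly from a left-right list, since two polygons are neighbours whenever they appear on opposite sides of the same arc. In the sketch below only arc 4 (B on the left, C on the right) is taken from the text; the other arcs are hypothetical, and "A" stands for the universe polygon.

```python
# Hypothetical left-right list: arc id -> (left polygon, right polygon).
# "A" plays the role of the universe polygon; arc 4 follows the example in the text.
left_right = {
    1: ("A", "B"),
    2: ("A", "C"),
    3: ("A", "D"),
    4: ("B", "C"),
    5: ("C", "D"),
}


def adjacency(left_right):
    """Two polygons are adjacent if some arc has one on its left and the other on its right."""
    adjacent = {}
    for left, right in left_right.values():
        if left == right:
            continue   # same polygon on both sides: no neighbour relation
        adjacent.setdefault(left, set()).add(right)
        adjacent.setdefault(right, set()).add(left)
    return adjacent
```

No coordinate comparison is needed: the neighbours of a polygon are read straight from the table.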

Containment
Geographic features cover a distinguishable area on the surface of the earth. An area is
represented by one or more boundaries defining a polygon. Polygons can be simple, or they can
be complex, with a hole or island in the middle. In the illustration given below, assume a lake with
an island in the middle. The lake has two boundaries: one which defines its outer edge and
another (the island) which defines its inner edge. An island defines the inner boundary of a polygon.
Polygon D is made up of arcs 5, 6 and 7. The 0 before the 7 indicates that arc 7 creates an
island in the polygon.

Fig. 2.6.5 Polygon-Arc Topology


Polygons are represented as an ordered list of arcs and not in terms of X, Y coordinates. This is
called Polygon-Arc topology. Since arcs define the boundary of polygon, arc coordinates are stored
only once, thereby reducing the amount of data and ensuring no overlap of boundaries of the
adjacent polygons.
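
The convention of using 0 to separate an island from the outer boundary can be sketched as follows. Storing polygon D as the list `[5, 6, 0, 7]` is an assumption about the layout, based on the description of arcs 5, 6 and 7 above; the function name is hypothetical.

```python
def split_rings(arc_list):
    """Split a polygon-arc list at each 0 into an outer ring and island rings."""
    rings = [[]]
    for arc in arc_list:
        if arc == 0:
            rings.append([])      # a 0 marks the start of an island boundary
        else:
            rings[-1].append(arc)
    return rings[0], rings[1:]


# Polygon D from the text: arcs 5 and 6 form the outer edge, arc 7 the island.
outer, islands = split_rings([5, 6, 0, 7])
```

Because the polygon refers to arcs by number, the coordinates of each arc are held once and shared by every polygon that uses it.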

Topologic Features
Networks : A network is a topologic feature model which is defined as a line graph composed
of links representing linear channels of flow and nodes representing their connections. The
topologic relationship between the features is maintained in a connectivity table. By consulting
connectivity table, it is possible to trace the information flowing in the network.
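
Tracing flow through a network by consulting a connectivity table can be sketched with a breadth-first search. The link and node names are hypothetical, links are assumed to be traversable in both directions, and breadth-first search is simply one convenient choice of tracing method.

```python
from collections import deque

# Hypothetical connectivity table: link id -> (from_node, to_node).
links = {
    "L1": ("n1", "n2"),
    "L2": ("n2", "n3"),
    "L3": ("n2", "n4"),
    "L4": ("n4", "n5"),
}


def trace_route(links, start, goal):
    """Breadth-first search; returns the link ids from start to goal, or None."""
    outgoing = {}
    for link, (a, b) in links.items():
        outgoing.setdefault(a, []).append((link, b))
        outgoing.setdefault(b, []).append((link, a))   # links treated as two-way
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        node, route = queue.popleft()
        if node == goal:
            return route
        for link, nxt in outgoing.get(node, []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append((nxt, route + [link]))
    return None
```

The search never looks at coordinates; the route is found from the topological table alone.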
Polygons with explicit topological structures : Introducing explicit topological relationships
takes care of islands as well as neighbors. The topological structures are built either by creating
topological links during data input or using software. Dual Independent Map Encoding (DIME)
system of US Bureau of the Census is one of the first attempts to create topology in geographic
data.

Fig. 2.6.6 Polygons with explicit topological structures

 Polygons are formed using the lines and their nodes.


 Once formed, polygons are individually identified by a unique identification number.
 The topological information among the polygons is computed and stored using the
adjacency information (the nodes of a line, and identifiers of the polygons to the left and
right of the line) stored with the lines.
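Polygon arc topology

| Poly ID | Arcs |
|---|---|
| A | 1, 2 |
| B | 2, 3, 4 |
| C | 3 |

Arc node topology

| Arc ID | From | To |
|---|---|---|
| 1 | x | y |
| 2 | x | y |
| 3 | z | z |
| 4 | x | y |

Polygon topology

| Arc ID | Left Poly | Right Poly |
|---|---|---|
| 1 | A | U |
| 2 | B | A |
| 3 | B | B |
| 4 | U | B |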

Fig. 2.6.7 Arc node Topology, Polygon Topology, Polygon Arc Topology


Fully topological polygon network structure


A fully topological polygon network structure is built using boundary chains that are digitized
in any direction. It takes care of islands and lakes and allows automatic checks for improper
polygons. Neighborhood searches are fully supported. These structures are edited by moving the
coordinates of individual points and nodes, by changing polygon attributes and by cutting out or
adding sections of lines or whole polygons. Changing coordinates requires no modification to the
topology, but cutting out or adding lines and polygons requires recalculation of topology and
rebuilding the database.

2.7 Vector and Raster - Advantages and Disadvantages


There are several advantages and disadvantages for using either the vector or raster data model
to store spatial data. These are summarized below.

Vector : Advantages
 Data can be represented at its original resolution and form without generalization.
 Graphic output is usually more aesthetically pleasing (traditional cartographic
representation).
 Since most data, e.g. hard copy maps, is in vector form, no data conversion is required.
 Accurate geographic location of data is maintained.
 Allows for efficient encoding of topology, and as a result more efficient operations that
require topological information, e.g. proximity, network analysis.

Vector : Disadvantages
 The location of each vertex needs to be stored explicitly.
 For effective analysis, vector data must be converted into a topological structure. This is
often processing intensive and usually requires extensive data cleaning. As well, topology is
static, and any updating or editing of the vector data requires re-building of the topology.
 Algorithms for manipulative and analysis functions are complex and may be processing
intensive. Often, this inherently limits the functionality for large data sets, e.g. a large
number of features.
 Continuous data, such as elevation data, is not effectively represented in vector form.
Usually substantial data generalization or interpolation is required for these data layers.
 Spatial analysis and filtering within polygons is impossible.

Raster : Advantages
 The geographic location of each cell is implied by its position in the cell matrix.
Accordingly, other than an origin point, e.g. bottom left corner, no geographic coordinates
are stored.
 Due to the nature of the data storage technique data analysis is usually easy to program and
quick to perform.
 The inherent nature of raster maps, e.g. one attribute maps, is ideally suited for mathematical
modeling and quantitative analysis.
 Discrete data, e.g. forestry stands, is accommodated equally well as continuous data, e.g.
elevation data, and facilitates the integrating of the two data types.
 Grid-cell systems are very compatible with raster-based output devices, e.g. electrostatic
plotters, graphic terminals.

Raster : Disadvantages
 The cell size determines the resolution at which the data is represented.
 It is especially difficult to adequately represent linear features depending on the cell
resolution. Accordingly, network linkages are difficult to establish.
 Processing of associated attribute data may be cumbersome if large amounts of data exist.
Raster maps inherently reflect only one attribute or characteristic for an area.
 Since most input data is in vector form, data must undergo vector-to-raster conversion.
Besides increased processing requirements this may introduce data integrity concerns due to
generalization and choice of inappropriate cell size.
 Most output maps from grid-cell systems do not conform to high-quality cartographic needs.
It is often difficult to compare or rate GIS software that use different data models. Some
personal computer (PC) packages utilize vector structures for data input, editing, and display but
convert to raster structures for any analysis. Other more comprehensive GIS offerings provide both
integrated raster and vector analysis techniques. They allow users to select the data structure
appropriate for the analysis requirements. Integrated raster and vector processing capabilities are
most desirable and provide the greatest flexibility for data manipulation and analysis.

2.8 Triangular Irregular Network (TIN)


 A TIN is a data structure that defines geographic space as a set of contiguous, non-
overlapping triangles, which vary in size and angular proportion. Like grids, TINs are used
to represent surfaces such as elevation, and can be created directly from files of sample
points.


Fig. 2.8.1 Anatomy of a TIN

 The TIN data structure is defined by two elements: a set of input points with x, y and z
values, and a series of edges connecting these points to form triangles. Each input point
becomes the node of a triangle in the TIN structure, and the output is a continuous faceted
surface of triangles.
 The triangles are constructed according to a mathematical technique called Delaunay
triangulation. The technique guarantees that a circle drawn through the three nodes of any
triangle will contain no other input point.

Fig. 2.8.2 Delaunay Triangulation
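
The empty-circle property can be checked with the standard in-circle determinant. The sketch below only tests whether a given triangle satisfies the Delaunay condition against a set of points; it does not construct a triangulation. The function names and sample coordinates are assumptions for illustration.

```python
def in_circumcircle(a, b, c, p):
    """True if p lies strictly inside the circle through a, b and c."""
    ax, ay = a[0] - p[0], a[1] - p[1]
    bx, by = b[0] - p[0], b[1] - p[1]
    cx, cy = c[0] - p[0], c[1] - p[1]
    det = (
        (ax * ax + ay * ay) * (bx * cy - cx * by)
        - (bx * bx + by * by) * (ax * cy - cx * ay)
        + (cx * cx + cy * cy) * (ax * by - bx * ay)
    )
    # The sign of det depends on whether a, b, c run clockwise or counter-clockwise,
    # so it is multiplied by the orientation of the triangle.
    orientation = (b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])
    return det * orientation > 0


def is_delaunay(triangle, points):
    """Check the empty-circumcircle property of one triangle against all points."""
    a, b, c = triangle
    return not any(in_circumcircle(a, b, c, p) for p in points if p not in triangle)


triangle = ((0, 0), (4, 0), (0, 4))
```

For `triangle`, the circumcircle is centred on (2, 2); a point such as (3, 3) falls inside it and violates the condition, while (5, 5) lies outside and does not.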

 Because points can be placed irregularly over a surface, a TIN can have higher resolution in
areas where the surface is highly variable. The model incorporates the original sample points,
providing a check on the accuracy of the model. The information related to a TIN is stored in a
file or a database table. Calculation of elevation, slope and aspect is easy with a TIN, but TINs
are less widely available than raster surface models and are more time consuming in terms of
construction and processing.


Node attribute table


Vertex ID X Y Z
a 780,034 33,020 90
b 780,017 33,035 102
c 780,007 33,052 115
d 780,023 33,070 135

Fig. 2.8.3 TIN model

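Polygon attribute table

| Triangle ID | Area | Edge1 | Edge2 | Edge3 | Neighbors |
|---|---|---|---|---|---|
| A | 8200 | 1 | 2 | 12 | B, F |
| B | 7040 | 3 | 4 | 2 | C, A |
| C | 6000 | 5 | 6 | 4 | D, B |
| D | 5440 | 7 | 8 | 6 | E, C |
| … | | | | | |

Arc attribute table

| Edge ID | Length | From Node | To Node |
|---|---|---|---|
| 1 | 160 | f | a |
| 2 | 140 | a | g |
| 3 | 130 | a | b |
| 4 | 140 | b | g |
| … | | | |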

The TIN model is a vector data model which is stored using relational attribute tables. A
TIN dataset contains three basic attribute tables:
 Arc attribute table that contains the length, from-node and to-node of all the edges of all the
triangles.
 Node attribute table that contains the x, y coordinates and z (elevation) of the vertices.
 Polygon attribute table that contains the areas of the triangles, the identification numbers of
the edges and the identifiers of the adjacent polygons.
Storing data in this manner eliminates redundancy, as all the vertices and edges are stored only
once even if they are used for more than one triangle. As a TIN stores topological relationships, the
datasets can be applied to vector-based geoprocessing such as automatic contouring, 3D landscape
visualization, volumetric design and surface characterization.
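
Because each triangle is a flat facet, the elevation at any point inside it can be estimated by linear (barycentric) interpolation from its three nodes. In the sketch below the node values are read from the node attribute table, with X values such as 780,034 taken as 780034; treating a, b and d as the corners of one triangle is an assumption made only for this example.

```python
# Vertices taken from the node attribute table; treating a, b and d as one
# triangle is an assumption made only for this example.
nodes = {
    "a": (780034.0, 33020.0, 90.0),
    "b": (780017.0, 33035.0, 102.0),
    "d": (780023.0, 33070.0, 135.0),
}


def interpolate_z(p, v1, v2, v3):
    """Linear (barycentric) interpolation of z at planar point p inside a triangle."""
    (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) = v1, v2, v3
    x, y = p
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    w3 = 1.0 - w1 - w2
    return w1 * z1 + w2 * z2 + w3 * z3


# Centroid of the triangle in the x, y plane.
centroid = (
    sum(v[0] for v in nodes.values()) / 3.0,
    sum(v[1] for v in nodes.values()) / 3.0,
)
```

At a node the function returns that node's own elevation, and at the centroid it returns the mean of the three node elevations.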
| | TIN | GRID |
|---|---|---|
| Advantages | Ability to describe the surface at different levels of resolution; efficiency in storing data | Easy to store and manipulate; easy integration with raster database; smoother, more natural appearance of derived terrain features |
| Disadvantages | In many cases require visual inspection and manual control of the network | Inability to use various grid sizes to reflect areas of different complexity of relief |


2.9 The Open Geospatial Consortium


The Open Geospatial Consortium (OGC) is an international not for profit organization
committed to making quality open standards for the global geospatial community. These standards
are made through a consensus process and are freely available for anyone to use to improve sharing
of the world's geospatial data.

Description of the OGC


The OGC provides open standard specifications and encourages organisations to use them
when they develop their own geospatial software or online geoportals
offering data and software services. The collection of geoportals and various other
complementary services creates a Spatial Data Infrastructure (SDI).

Benefits of the OGC


Interoperability of geospatial data and reduced fragmentation in data delivery.
Consensus based approach. Participation of organisations from the public sector, private sector,
academia and research when developing standards assures the interests and needs of the geospatial
community are considered.
OGC helps to bring together geospatial data and services from multiple sectors.

2.9.1 Common OGC standards


OGC Web Services (OWS) are the OGC standards that use the internet or alternatively the
World Wide Web to view, edit, manage and share geospatial data. To understand OGC web
services you must be able to understand how web services work.
How do web services work?
A fully operable web service requires the service itself, a server, a client and the internet. The
client is an application that will use the web service (e.g. an internet browser) and the server is
where the data and information is stored. The web service will process a request made by the client
and then collect the appropriate service from the server and return this to the client via the internet
for the client to retrieve. Facebook is a good example where the SERVER is Facebook's servers,
the CLIENT is either the website you access on a computer or the app you have on your phone or
tablet, while the WEB SERVICE is the image you requested or the status update you posted.


Fig. 2.9.1 Working of web service

The requests being made by the client are commonly known as Hypertext Transfer Protocol
(HTTP) requests. HTTP is the protocol used for data communication on the World Wide Web and
there are two types of HTTP requests that need to be defined for this course:

Fig. 2.9.2 Working of HTTP

So to appropriately define the use of web services: A web service is used to provide access to
data and information from a server via the internet to a client.

2.9.2 OGC Standards

● Web Map Service (WMS)


This standard visualises geographic data that can be displayed across the web and multiple
platforms. This standard does not provide the actual geospatial data; instead it just provides a
georeferenced image (e.g. PNG, JPEG or GIF files) of the data.
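
A WMS map image is typically requested with an HTTP GET whose query string carries the GetMap parameters. The sketch below only builds such a URL and sends nothing. The endpoint `https://example.org/wms`, the layer name `landuse` and the bounding box are hypothetical; the parameter names are those of WMS version 1.3.0.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; only the parameter names come from WMS 1.3.0.
BASE_URL = "https://example.org/wms"


def getmap_url(layer, bbox, width, height, image_format="image/png"):
    """Build a WMS 1.3.0 GetMap request URL for one layer."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",                 # empty value asks for the default style
        "CRS": "CRS:84",              # longitude, latitude axis order
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": image_format,
    }
    return f"{BASE_URL}?{urlencode(params)}"


# Hypothetical layer and extent (min lon, min lat, max lon, max lat).
url = getmap_url("landuse", (77.0, 9.0, 79.0, 11.0), 512, 512)
```

The server answers such a request with a rendered picture in the requested format, not with the underlying features.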


● Web Feature Service (WFS)


This standard allows the sharing of geographic data at a feature level (i.e. sharing vector-based
data, for example ESRI Shapefiles (.shp)). It allows the user to request specified
geographic data through a client and to receive the requested data via the web.

● Web Coverage Service (WCS)


This standard allows the visualisation and provision of geospatial data according to temporal
and spatial characteristics from a web server and the data can be in multiple raster based data
formats (e.g. GeoTIFF, .img, ENVI (.hdr) file types).

● Web Processing Service (WPS)


This standard allows the potential for geospatial processing tools and applications to be used on
geographic data within an interface via the web. For example this standard could use digitising,
spatial analysis and network analysis tools to edit geospatial data.

● Web Map Context (WMC)


This standard allows for the creation of an XML document that will save the layers and
parameters of a web map project so that it can be used at a different time and/or location. The OGC
service layers being used in the map (e.g. WMS layers) and the map parameters (e.g. map extent,
projection scheme information) are saved within the XML context document.

● Web Map Tile Service (WMTS)


This standard allows the pre-rendering of image tiles for a web map application. The pre-
rendering allows the web map to add or remove data layers between different map extents. This
tiling can provide higher or lower detailed maps depending on the map extent being viewed and the
specific web map properties being used.

● Catalogue Service for the Web (CS-W)


This standard allows a user to find, request and modify geospatial metadata that is stored within
a spatial database via the web. The metadata can be utilised from multiple sources and is an
important part of understanding and applying data that is being shared to different users, locations
and to varying platforms.

● Styled Layer Descriptor (SLD)


This standard allows for the application of style properties to the geographic features of a web
map and also allows the retrieval of the web map style legend.


● KML
This standard is an XML-based language for the visualisation of geographic data, and
is used within Google Earth and Google Maps. Geographic data that this standard commonly
represents include placemarks, image overlays, polygon features and paths.
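
A minimal KML document holding one point placemark can be built with Python's standard XML library, as sketched below. The function name is hypothetical and the placemark name and coordinates are approximate values used only for illustration.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def placemark_kml(name, lon, lat):
    """Build a minimal KML document holding one point placemark."""
    ET.register_namespace("", KML_NS)
    kml = ET.Element(f"{{{KML_NS}}}kml")
    placemark = ET.SubElement(kml, f"{{{KML_NS}}}Placemark")
    ET.SubElement(placemark, f"{{{KML_NS}}}name").text = name
    point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
    # KML orders coordinates as longitude,latitude[,altitude].
    ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = f"{lon},{lat}"
    return ET.tostring(kml, encoding="unicode")


# Approximate, illustrative coordinates.
document = placemark_kml("Madurai", 78.12, 9.93)
```

Note that KML writes longitude before latitude, which is the reverse of the order in which coordinates are often spoken.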

● Geography Markup Language (GML)


This standard is an XML-based encoding standard used for the description and
representation of points, lines and polygons as specified geographic features. For
example, lines could represent roads and polygons could represent specific
buildings.

● Sensor Model Language (SensorML)


This standard allows for the encoding of models and XML for sensor and observation
processing. This standard was established for the OGC Sensor Web Enablement (SWE) initiative, which aims
to enable applications and services to gain access to sensors of all types over the web.

● OpenGIS® Open Location Service (OpenLS)


This standard allows for the sharing of Location Based Services (LBS) through various
interfaces. LBS include emergency response (E-911), personal navigation, traffic information
service, and travel directions.

● Open GeoSMS
This standard provides an encoding and interface that offers the potential for the
communication of location content between different Location Based Service (LBS) devices or
applications via a Short Message Service (SMS).

● GeoAPI
This standard defines a Java language application programming interface (API) that can be used
to manipulate geographic information via the use of a Java based standard library that contains the
types and methods that can be implemented.

2.10 Data Quality


Data quality is the degree to which data satisfy a given objective. In other words, the
completeness of the attributes needed to achieve a given task can be termed data quality. Data
created through different channels with different techniques can have discrepancies in terms of
resolution, orientation and displacement.


Data quality is a pillar of any GIS implementation and application, as reliable data are
indispensable for the user to obtain meaningful results. The following review of data quality
focuses on three distinct components: data accuracy, quality and error.

2.10.1 Accuracy
The fundamental issue with respect to data is accuracy. Accuracy is the closeness of the results of
observations to the true values or values accepted as being true. This implies that observations
of most spatial phenomena are usually only considered to be estimates of the true value. The
difference between observed and true (or accepted as being true) values indicates the accuracy of
the observations.
Basically, two types of accuracy exist. These are positional and attribute accuracy.
Positional accuracy is the expected deviance in the geographic location of an object from its
true ground position. There are two components to positional accuracy. These
are relative and absolute accuracy.
Absolute accuracy concerns the accuracy of data elements with respect to a coordinate scheme.
Relative accuracy concerns the positioning of map features relative to one another.
Attribute accuracy is equally as important as positional accuracy. It also reflects estimates of
the truth. Interpreting and depicting boundaries and characteristics for forest stands or soil
polygons can be exceedingly difficult and subjective.
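
Positional accuracy is often summarized as a root-mean-square error (RMSE) between observed coordinates and the values accepted as true. The text does not prescribe this measure, so the sketch below is only an illustration. It assumes planar coordinates in the same units, and the function name and sample check points are hypothetical.

```python
import math


def positional_rmse(observed, reference):
    """Root-mean-square positional error between matching (x, y) pairs."""
    if len(observed) != len(reference) or not observed:
        raise ValueError("need two non-empty lists of equal length")
    squared_errors = [
        (ox - rx) ** 2 + (oy - ry) ** 2
        for (ox, oy), (rx, ry) in zip(observed, reference)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical check points: digitized positions versus surveyed positions
observed = [(3.0, 4.0), (10.0, 10.0)]
reference = [(0.0, 0.0), (10.0, 10.0)]
rmse = positional_rmse(observed, reference)
print(f"Positional RMSE: {rmse:.3f} map units")
```

Comparing both lists against a ground survey gives a figure for absolute accuracy. Comparing distances between pairs of features in the same way would indicate relative accuracy.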

2.10.2 Quality
Quality can simply be defined as the fitness for use for a specific data set. Data that is
appropriate for use with one application may not be fit for use with another. It is fully dependent
on the scale, accuracy, and extent of the data set, as well as the quality of other data sets to be used.
The recent U.S. Spatial Data Transfer Standard (SDTS) identifies five components to data quality
definitions. These are :
• Lineage
• Positional Accuracy
• Attribute Accuracy
• Logical Consistency
• Completeness

● Lineage
The lineage of data is concerned with historical and compilation aspects of the data such as the:
• source of the data;
• content of the data;
• data capture specifications;
• geographic coverage of the data;
• compilation method of the data, e.g. digitizing versus scanning;
• transformation methods applied to the data; and
• the use of any pertinent algorithms during compilation, e.g. linear simplification, feature
generalization.

● Positional Accuracy
The identification of positional accuracy is important. This includes consideration of inherent
error (source error) and operational error (introduced error).

● Attribute Accuracy
Consideration of the accuracy of attributes also helps to define the quality of the data. This
quality component concerns the identification of the reliability, or level of purity (homogeneity), in
a data set.

● Logical Consistency
This component is concerned with determining the faithfulness of the data structure for a data
set. This typically involves spatial data inconsistencies such as incorrect line intersections,
duplicate lines or boundaries, or gaps in lines. These are referred to as spatial or topological errors.

● Completeness
The final quality component involves a statement about the completeness of the data set. This
includes consideration of holes in the data, unclassified areas, and any compilation procedures that
may have caused data to be eliminated.
The ease with which geographic data in a GIS can be used at any scale highlights the
importance of detailed data quality information. Although a data set may not have a specific scale
once it is loaded into the GIS database, it was produced with levels of accuracy and resolution that
make it appropriate for use only at certain scales, and in combination with data of similar scales.
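
The logical consistency component described above (duplicate lines and gaps in lines) lends itself to automated checking. The sketch below is a simplified illustration, not a method taken from the text. It assumes line work is stored as pairs of exactly matching endpoint coordinates, and the function names and sample segments are hypothetical.

```python
from collections import Counter


def normalize(segment):
    """Order the two endpoints so that direction of digitizing is ignored."""
    return tuple(sorted(segment))


def find_duplicate_segments(segments):
    """Return each segment that was stored more than once."""
    counts = Counter(normalize(seg) for seg in segments)
    return [seg for seg, n in counts.items() if n > 1]


def find_dangling_nodes(segments):
    """Return endpoints touched by only one distinct segment (possible gaps)."""
    unique_segments = {normalize(seg) for seg in segments}
    usage = Counter(pt for seg in unique_segments for pt in seg)
    return sorted(pt for pt, n in usage.items() if n == 1)


# Hypothetical boundary that should form a closed square
segments = [
    ((0, 0), (1, 0)),
    ((1, 0), (1, 1)),
    ((1, 0), (0, 0)),  # same line as the first, digitized in reverse
    ((1, 1), (0, 1)),  # the closing side back to (0, 0) is missing
]
duplicates = find_duplicate_segments(segments)
dangles = find_dangling_nodes(segments)
print("Duplicate lines:", duplicates)
print("Dangling nodes:", dangles)
```

Real data would normally need a snapping tolerance instead of exact coordinate matches. That is omitted here to keep the example short.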

2.10.3 Error
Two sources of error, inherent and operational, contribute to the reduction in quality of the
products that are generated by geographic information systems.
Inherent error is the error present in source documents and data.
Operational error is the amount of error produced through the data capture and manipulation
functions of a GIS.

Possible sources of operational errors include:

• mislabelling of areas on thematic maps;
• misplacement of horizontal (positional) boundaries;
• human error in digitizing;
• classification error;
• GIS algorithm inaccuracies; and
• human bias.
An awareness of the error status of different data sets will allow the user to make a subjective
statement on the quality and reliability of a product derived from GIS processing.
The validity of any decisions based on a GIS product is directly related to the quality
and reliability rating of the product.
Depending upon the level of error inherent in the source data, and the error operationally
produced through data capture and manipulation, GIS products may possess significant amounts of
error.

2.10.4 Sources of Spatial Data Discrepancy

● Data Information Exchange:
Data information exchange is basically the information about the data provided by the client to
the organization. The degree of information provided by the client defines the accuracy and
completeness of the data.

● Type and Source:
Data type and source must be evaluated in order to get appropriate data values. There are many
spatial data formats, and each of them has some benefits as well as some drawbacks.

● Data Capture:
Many tools rely on manual skill to capture data using software such as ArcGIS. Such software
allows the user to capture information from the base data. During data capture, the user may
misinterpret features in the base data and capture them with errors. Data capture must be
performed at a suitable scale, at which the features can be viewed distinctly.

● Cartographic Effects:
After capturing the data, some cartographic effects like symbology, pattern, colors, orientation
and size are assigned to the features. This is required for a better representation of reality. These
effects must be assigned according to the domain of the features.

● Data Transfer:
Some discrepancies may occur while transferring the data from one place to another. “There is
no bad or good data. There are only data which are suitable for a specific purpose.” So, data must
be evaluated according to the domain for which it is supposed to be used.

● Metadata:
Sometimes metadata is not updated according to the original features. So, metadata must be
updated with the original data.

2.10.5 Data Quality Improvement Techniques

• Choosing relevant data from a relevant source.
• Ensuring precision at the point of origin.
• Testing data quality in each phase of data capture.
• Using automated software tools for spatial and non-spatial data validation.
• Assessing how the data will be used and by whom.
• Determining map elements such as scale, visualization and feature orientation.

2.11 Two Marks Questions with Answers


Q.1 What is data model?
Ans. : Spatial data in GIS has two primary data formats: raster and vector.
Raster uses a grid cell structure, whereas vector is more like a drawn map. Vector format
has points, lines and polygons that appear normal, much like a map. Raster
format generalizes the scene into a grid of cells, each with a code to indicate the feature being
depicted. The cell is the minimum mapping unit. Raster generalizes reality: all of the
features in the cell area are reduced to a single cell identity.

Q.2 What is vector data?


Ans. : Vector data uses two-dimensional Cartesian coordinates to store the shape of a spatial
entity. Vector-based features are treated as discrete geometric objects in space. In a
vector database, the point is the basic building block from which all spatial entities are
constructed. The simplest vector spatial entity, the point, is represented by a single (x, y)
coordinate pair. Line and area entities are constructed by connecting a series of points into chains.

Q.3 Define Raster data.


Ans. : Raster is a method for the storage, processing and display of spatial data. Each area is
divided into rows and columns, which form a regular grid structure. Each cell must be
rectangular in shape, but not necessarily square. Each cell within this matrix contains location
coordinates as well as an attribute value.
The origin of rows and columns is at the upper left corner of the grid. Rows function as the
“y” coordinate and columns as the “x” coordinate in a two-dimensional system. A cell is defined
by its location in terms of rows and columns.
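
The row and column addressing described above can be sketched as a small conversion from a cell address to map coordinates. This is a minimal illustration, assuming a north-up grid whose origin is the upper left corner. The function name, parameter names and grid values are hypothetical.

```python
def cell_center(row, col, x_origin, y_origin, cell_width, cell_height):
    """Map coordinates of the centre of cell (row, col).

    Rows are counted downward from the upper left corner, so y decreases
    as the row index grows; columns are counted to the right.
    """
    x = x_origin + (col + 0.5) * cell_width
    y = y_origin - (row + 0.5) * cell_height
    return x, y


# Hypothetical grid: upper left corner at (1000, 2000), cells 10 wide and 20 high
center = cell_center(row=2, col=3, x_origin=1000.0, y_origin=2000.0,
                     cell_width=10.0, cell_height=20.0)
print(center)
```

The cell here is rectangular but not square, which the definition above permits.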

Q.4 Why is compression needed for remote sensing data?


Ans. : Data compression provides a compact raster representation using a variable-sized
grid. Large cells are used in areas of low detail, while small cells are used in areas of high detail.

Q.5 What is Vectorization?


Ans. : Vectorization is the conversion of raster lines into vector lines, a process also known as
tracing. Vectors are data elements describing position and direction. In GIS, vector is the
map-like drawing of features, without the generalizing effect of a raster grid. Therefore, shape is
better retained. Vector is much more spatially accurate than the raster format.

Q.6 What is raster coding?


Ans. : In the data entry process, maps can be digitized or scanned at a selected cell size and each
cell assigned a code or value. The cell size can be adjusted according to the grid structure or by
ground units, also termed resolution. There are three basic schemes and one advanced scheme for
assigning cell codes.
Presence/Absence is the most basic method: a feature is recorded if some of it occurs in the
cell space.
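
The presence/absence scheme can be sketched for point features as follows. This is only an illustration under stated assumptions: the grid origin is at the upper left corner, x grows to the right and y grows downward, and the function name and sample points are hypothetical.

```python
def presence_absence(points, rows, cols, cell_size):
    """Code a cell 1 if any point falls inside it, otherwise 0."""
    grid = [[0] * cols for _ in range(rows)]
    for x, y in points:
        r = int(y // cell_size)
        c = int(x // cell_size)
        if 0 <= r < rows and 0 <= c < cols:  # ignore points outside the grid
            grid[r][c] = 1
    return grid


# Hypothetical point features on a 2 x 3 grid with 10-unit cells
grid = presence_absence([(5, 5), (25, 5), (27, 8), (100, 100)],
                        rows=2, cols=3, cell_size=10)
for row in grid:
    print(row)
```

Two points fall in the same cell and still yield a single code of 1, which shows how a raster reduces everything in a cell to one identity.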

Q.7 Compare vector and raster data structure.


Ans. : Vectors are data elements describing position and direction. In GIS, vector is the map-like
drawing of features, without the generalizing effect of a raster grid. Therefore, shape is better
retained. Vector is much more spatially accurate than the raster format.
In the raster data entry process, maps can be digitized or scanned at a selected cell size and each
cell assigned a code or value. The cell size can be adjusted according to the grid structure or by
ground units, also termed resolution. There are three basic schemes and one advanced scheme for
assigning cell codes. Presence/Absence is the most basic method: a feature is recorded if some
of it occurs in the cell space.

Q.8 What do you understand about data compression?


Ans. : Data compression provides a compact raster representation using a variable-sized
grid. Large cells are used in areas of low detail, while small cells are used in areas of high detail.

Q.9 What is buffering?


Ans. : Buffering is the creation of polygons that surround other points, lines or polygons.
Buffers may be created either to exclude a certain amount of area around a point, line or polygon
or to include only the buffer area in a study.
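
For a point feature, the buffer is a circle, so selecting what falls inside it reduces to a distance test. The sketch below is a minimal illustration assuming planar coordinates. The function name, feature names and the 500-unit radius are hypothetical.

```python
import math


def in_point_buffer(location, center, radius):
    """True if location lies inside the circular buffer around center."""
    return math.dist(location, center) <= radius


# Hypothetical wells, tested against a 500-unit buffer around a factory
wells = {"A": (100.0, 0.0), "B": (300.0, 300.0), "C": (600.0, 0.0)}
factory = (0.0, 0.0)
inside = sorted(name for name, xy in wells.items()
                if in_point_buffer(xy, factory, 500.0))
print("Wells inside the buffer:", inside)
```

Inverting the test gives the other use mentioned above, excluding the area around a feature from a study.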

Q.10 List the advantages of raster data.


Ans. : Raster data, also known as a grid-based or cellular system, consists of rectangular
cells. A raster system needs little processing time and is easy to program.
Advantages : It is a simple data structure; overlay operations are easily and efficiently
implemented; high spatial variability is efficiently represented in a raster format.
Disadvantages : 1. Less compact. 2. Topological relationships are more difficult to represent.

Q.11 What you mean by data compression?


Ans. : Data compression means reducing the 'electronic space' (data bits) used in representing a
piece of information by eliminating the repetition of identical sets of data bits (redundancy) in an
audio/video, graphic, or text data file. White space in text and graphics, large blocks of the same
color in pictures, and other continuously recurring data are reduced or eliminated by coding
(encoding) with a program that uses a particular type of compression algorithm. The same program
is used to decompress (decode) the data so that it can be heard, read, or seen as the original data.
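
Run-length encoding is one simple algorithm that removes this kind of repetition. It is used here only as an example, since the answer above does not name a specific algorithm. The sketch assumes one raster row is held as a plain list of cell values, and the function names are hypothetical.

```python
def rle_encode(row):
    """Collapse runs of identical cell values into [value, count] pairs."""
    runs = []
    for value in row:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1
        else:
            runs.append([value, 1])
    return runs


def rle_decode(runs):
    """Expand [value, count] pairs back into the original row."""
    row = []
    for value, count in runs:
        row.extend([value] * count)
    return row


row = [0, 0, 0, 0, 5, 5, 0, 0, 0]
encoded = rle_encode(row)
print(encoded)
print(rle_decode(encoded) == row)
```

Nine cell values are stored as three pairs, and decoding restores the row exactly, so the compression is lossless.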

Q.12 List out the basics elements of GIS modeling.


Ans. : Geographic information systems have five important components. They are
1. Computer hardware,
2. Set of application software modules
3. Spatial data
4. Data management and analysis procedures
5. Personnel to operate the GIS

Q.13 Write short notes on topographical overlay.


Ans. : Map overlay is the process of taking two or more different topographical layers of the
same area and placing them one on top of the other to form a new composite layer. This
technique is also used to overlay vector data on a raster image. In vector-based systems, map
overlay is time-consuming, complex and computationally expensive. In raster-based systems it is
quick, straightforward and efficient.

Q.14 Discuss about network model.


Ans. : A network is a generalized graph that captures relationships between objects using
connectivity. A network database consists of a collection of records that are connected to each
other through links. A link is an association between two records. It allows each record to have
many parents and many children thus allowing a natural model of relationships between entities.

Q.15 Describe about object oriented model.


Ans. : The aim of the object-oriented model is to allow data modeling that is closer to the real world.
An object-oriented database uses objects as elements within database files.
An object is a logical grouping of related data that represents a real-world entity.
Each object is a distinct entity which is identified using a key attribute called ObjectID.
Objects can be grouped together to form a class.
Objects of the same class have the same attributes, behavior and relationships with other objects.

Q.16 Write about relational model.


Ans. : The relational data model was introduced by Codd in 1970.
The relational database relates or connects data in different files through the use of a common
field. A flat file structure is used with a relational database model. In this arrangement, data is
stored in different tables made up of rows and columns. The columns of a table are named by
attributes. Each row in the table is called a tuple and represents a basic fact. No two rows of the
same table may have identical values in all columns. There are two crucial data integrity
constraints viz. primary key and foreign key. A primary key is an attribute whose value is
unique across all tuples (rows) in a relation (table). The primary key of one table appearing as
an attribute of another table is known as a foreign key in that table.
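
The primary key and foreign key constraints described above can be tried out with Python's built-in sqlite3 module. This is a minimal sketch only: the table names, column names and sample rows are hypothetical, and the foreign key check relies on SQLite's `PRAGMA foreign_keys` setting being switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this check off by default

# district_id is the primary key of district and a foreign key in parcel
conn.execute("CREATE TABLE district (district_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE parcel ("
    " parcel_id INTEGER PRIMARY KEY,"
    " land_use TEXT,"
    " district_id INTEGER REFERENCES district(district_id))"
)
conn.execute("INSERT INTO district VALUES (1, 'North')")
conn.executemany("INSERT INTO parcel VALUES (?, ?, ?)",
                 [(10, "forest", 1), (11, "urban", 1)])

# Relate the two tables through the common field
rows = conn.execute(
    "SELECT parcel.parcel_id, district.name FROM parcel"
    " JOIN district ON parcel.district_id = district.district_id"
    " ORDER BY parcel.parcel_id"
).fetchall()
print(rows)

# A parcel pointing at a district that does not exist violates the foreign key
try:
    conn.execute("INSERT INTO parcel VALUES (12, 'water', 99)")
    fk_enforced = False
except sqlite3.IntegrityError:
    fk_enforced = True
print("Foreign key enforced:", fk_enforced)
```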

Q.17 What are the three basic elements of topological relationships?


Ans. : A topology is a mathematical procedure that describes how features are spatially related and
ensures data quality of the spatial relationships. Topological relationships include the following
three basic elements:
I. Connectivity: Information about linkages among spatial objects
II. Contiguity: Information about neighbouring spatial objects
III. Containment: Information about inclusion of one spatial object within another spatial object

Q.18 List the three models of encoding raster data.


Ans. : The simplest way of encoding a raster data into computers can be understood as follows:
 Entity model
 Pixel values
 File structure

Q.19 Define raster data model.


Ans. : The raster data model is commonly associated with the field conceptual model. Here,
geographic space is represented by array of cells or pixels (aka picture elements) which are
arranged in rows and columns. Each pixel has a value that represents information. The value can
be in the form of integer, floating points or alphanumeric.
A point can be represented by a single pixel in raster model. A line is a chain of spatially
connected cells with the same value. Similarly, a water body in raster data is represented as a set
of contiguous pixels having same value that represents a homogeneous area.

Q.20 What are the classifications of vector structures?


Ans. : Geographic entities encoded using the vector data model, are often called features. The
features can be divided into two classes:
a. Simple features : These are easy to create, store and are rendered on screen very quickly.
They lack connectivity relationships and so are inefficient for modelling phenomena
conceptualized as fields.
b. Topological features : A topology is a mathematical procedure that describes how
features are spatially related and ensures data quality of the spatial relationships.

2.12 Long Answered Questions with Answers


Q.1 Explain about Relational and Object oriented model in detail. (Refer section 2.2)

Q.2 Compare raster and vector data representation with suitable examples. (Refer section 2.7)

Q.3 With neat sketch explain briefly about Raster Data structures. (Refer section 2.4)

Q.4 Explain briefly about E-R diagrams. (Refer section 2.2.3)


Q.5 Discuss in detail about the data compression techniques. (Refer section 2.5)

Q.6 Describe TIN. Give the difference between TIN and GRID. (Refer section 2.8)

Q.7 Explain the concepts of Data quality. (Refer section 2.10)


Q.8 Write in detail about OGC standards for GIS. (Refer section 2.9)


Geographic Information System 3-2 Data Input and Topology

3.1 Data Input


Data input is the procedure of encoding data into a computer-readable form and writing the data
to the GIS database.
There are two types of data to be entered into a GIS :
1. Spatial data
2. Associated non-spatial attribute data.
• The spatial data represent the geographical location of the features.
• The non-spatial attribute data provide descriptive information such as the name of a street, the
salinity of a lake or the type of tree stand. They must be logically attached to the features they
describe.
• The data input and output functions are the means by which a GIS communicates with the
world outside.
• The objective in defining GIS input and output requirements is to identify the mix of
equipment and methods needed to meet the required level of performance and quality. No
one device or approach is optimum for all situations.
• Data entry is usually the major bottleneck in implementing a GIS. The initial cost of building
the database is commonly 5 to 10 times the cost of the GIS hardware and software.
• The creation of an accurate and well-documented database is critical to the operation of the
GIS.
• Accurate information can only be generated if the data on which it is based were accurate to
begin with.
• Data quality information includes the date of collection, the positional accuracy,
completeness, and the method used to collect and encode the data.
There are five types of data entry systems commonly used in a GIS :
• keyboard entry
• coordinate geometry
• manual digitizing
• scanning
• input of existing digital files

3.1.1 Keyboard Entry

• It involves manually entering the data at a computer terminal. Attribute data are commonly
input by keyboard, whereas spatial data are rarely input this way.
• Keyboard entry may also be used during manual digitizing to enter the attribute information.
However, this is usually more efficiently handled as a separate operation.
• The kind of attribute entered depends on the file. For example, a roads file will use codes for
the various road types, while a census file uses exact numbers for things like total population,
age range, etc.

3.1.2 Coordinate Geometry (COGO)

• This technique is also called the COGO method.
• In this method, survey measurements such as bearings and lengths are taken as input and
entered into the GIS using the keyboard.
• Coordinates of objects and features are calculated by the GIS.
• This input technique produces highly accurate results and is useful in preparing cadastral
maps.
• However, it takes a lot of time, manpower and money to produce the maps compared with the
normal digitizing process.
• Surveyors and engineers want the higher accuracy of COGO for their applications. Planners
and most others are happy with the lower accuracy provided by manual digitizing.
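
The COGO calculation described above, in which coordinates are derived from bearings and lengths, can be sketched as a simple traverse. This is a minimal illustration assuming whole-circle bearings measured in degrees clockwise from north and planar coordinates. The function name and survey legs are hypothetical.

```python
import math


def cogo_traverse(start, legs):
    """Compute point coordinates from a start point and (bearing, length) legs.

    Bearings are assumed to be in degrees, clockwise from north.
    """
    x, y = start
    points = [(x, y)]
    for bearing_deg, length in legs:
        bearing = math.radians(bearing_deg)
        x += length * math.sin(bearing)  # change in easting
        y += length * math.cos(bearing)  # change in northing
        points.append((x, y))
    return points


# Hypothetical rectangular parcel: east 100, north 50, west 100, south 50
legs = [(90.0, 100.0), (0.0, 50.0), (270.0, 100.0), (180.0, 50.0)]
corners = cogo_traverse((1000.0, 2000.0), legs)
for x, y in corners:
    print(f"{x:.3f}, {y:.3f}")
```

The last computed point should coincide with the start point. The size of any misclosure is a useful check on the survey measurements that were keyed in.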

3.1.3 Manual Digitizing

• Digitizing is the process of interpreting and converting paper map or image data to vector
digital data.
• Digitizing is the process by which coordinates from a map, image, or other sources of data
are converted into a digital format in a GIS.
• This process becomes necessary when available data is gathered in formats that cannot be
immediately integrated with other GIS data.
• Digitization results in shapefiles, which contain vector features.
• Manual digitizing is a tedious job, and an inefficient operator may introduce several
digitizing errors. Hence, it has to be done with the utmost skill and caution.
• Operator fatigue (eye strain, back soreness, etc.) can seriously degrade the data quality.
• Managers must limit the number of hours an operator works at one time.
• A commonly used quality check is to produce a verification plot of the digitized data that is
visually compared with the map from which the data were originally digitized.

3.1.4 Scanning
• Scanning provides a faster means of data entry compared to manual digitizing.
• The process of converting paper maps into a digital format usable by a computer is known as
scanning.
• It is used to convert an analog map into a scanned file, which is then converted to vector
format through tracing.
• Scanning automatically captures map features, text and symbols as individual cells, or pixels,
and produces an automated image.
• The scanned file shows map features as raster lines (a series of connected pixels), which must
be vectorized to complete the process of digitizing.
• Vectorization is the conversion of raster lines into vector lines in a process known as tracing.

3.1.5 Inputting Existing Digital Files


 There are many companies and organizations on the market that provide or sell digital data
files often in a format that can be read directly into a GIS.
 These digital data sets are priced at a fraction of the cost of digitizing existing maps.
 Over the next decade, the increased availability of data should reduce the current high cost
and lengthy production times needed to develop digital geographic data bases.

3.2 Scanner - Raster Data Input

• The process of converting paper maps into a digital format usable by a computer is known as
scanning.
• It is used to convert an analog map into a scanned file, which is then converted to vector
format through tracing. Scanning automatically captures map features, text and symbols as
individual cells, or pixels, and produces an automated image.
• The scanned file shows map features as raster lines (a series of connected pixels), which must
be vectorized to complete the process of digitizing.
• Vectorization is the conversion of raster lines into vector lines in a process known as tracing.
• A variety of scanning devices exist for the automatic capture of spatial data. While several
different technical approaches exist in scanning technology, all have the advantage of being
able to capture spatial features from a map rapidly.
• Scanners are generally expensive to acquire and operate. In addition, most scanning devices
have limitations with respect to the capture of selected features, e.g. text and symbol
recognition.

3.2.1 Operation of Scanner


 The primary function of any scanner is to convert measured quantities of light to electrical
analogs. The light that is measured may be light that has been transmitted through the
material, as would be the case when film transparencies are scanned, or the light that is
measured could be that which is reflected from the surface of a map or photograph.
 For GIS and other computer applications, the electrical analogs are subsequently converted
to a binary form suitable for computer processing. If the output of the scanner is to be used
as input to a GIS, care must be taken to preserve the spatial integrity of the item being
scanned.
 Preservation of the spatial integrity is normally accomplished by describing the scanned
document as an orthogonal array of grid cells (raster array). Each grid cell represents an
instantaneous field of view within which the scanner makes a measurement. The manner in
which the grid cell is defined depends upon the particular scanner being used.
 The following four types of scanner are commonly used in GIS and remote sensing.

Fig. 3.2.1 Major types of scanner

a. Mechanical scanner
It is called drum scanner since a map or an image placed on a drum is digitized mechanically
with rotation of the drum and shift of the sensor as shown in Fig. 3.2.1(a). It is accurate but slow.

b. Video Camera
A video camera with a CRT (cathode ray tube) is often used to digitize a small part of a map or
film. This is not very accurate but cheap. (See Fig. 3.2.1(b))

c. CCD Camera
An area CCD camera (also called a digital still camera) can conveniently be used instead of a video
camera to acquire digital image data (See Fig. 3.2.1 (c)). It is more stable and accurate than a
video camera.

d. CCD Scanner
Flat bed type or roll feed type scanner with linear CCD (charge coupled device) is now
commonly used to digitize analog maps in raster format, either in mono-tone or color mode. It is
accurate but expensive. (See Fig. 3.2.1 (d)).

Type                 Resolution                   Accuracy    Cost

Mechanical scanner   Very high (25 - 100 μm)      Very high   High

Video camera         Low (500 x 500 pixels)       Low         Cheap

CCD camera           Medium (500 x 500 pixels)    Medium      Cheap (low resolution)
                     High (4000 x 4000 pixels)    High        High (high resolution)

CCD scanner          Very high                    High        High

Table 3.2.1 Performance of major scanners

3.2.2 Types of Scanners

• There are several different types of scanners that perform the same job but handle it
differently, using different technologies and producing results that depend on their varying
capabilities.
• Hand-held scanners, although portable, can only scan images up to about four inches wide.
They require a very steady hand for moving the scan head over the document. They are useful
for scanning small logos or signatures and are of virtually no use for scanning maps and
photographs.
Fig. 3.2.2 Hand held scanner


Flatbed Scanner
 The most commonly used scanner is a flatbed scanner also
known as desktop scanner. It has a glass plate on which the
picture or the document is placed. The scanner head placed
beneath the glass plate moves across the picture and the result is
a good quality scanned image. For scanning large maps or
toposheets wide format flatbed scanners can be used.

Fig.3.2.3 Flatbed scanner

Drum Scanner
• Drum scanners are mostly used by printing professionals. In this type of scanner, the image
or the document is placed on a glass cylinder that rotates at very high speed around a centrally
located sensor, which uses a photomultiplier tube instead of a CCD to scan. Prior to the
advances in the field of sheet-fed scanners, drum scanners were extensively used for scanning
maps and other documents.

Fig. 3.2.4 Rotating drum scanner

3.2.3 Methods of Scanning

• Scanning captures map features, text, and symbols as individual cells, or pixels, and produces
an automated image.
• Different scanning procedures are followed depending on the document to be scanned.

Black and White Raster Scanning :

(Figure: image scanned in B&W)
• Black and white or “binary” scanning is the simplest method of converting any document and
can be performed on line drawings, reduced media, text or any one-colour document.
• This is the appropriate solution for archiving and storage projects, in which the documents
will be viewed and printed but never changed.
• It is, therefore, an ideal solution as the first stage in a planned document conversion project.
Grey Scale and Colour Raster Scanning :

(Figure: image scanned in greyscale)
• Grey scale and (especially) colour images can be quite large.
• It must be made sure that the system is capable of handling files whose size is often
measured in tens of megabytes.
• Because virtually every pixel is populated with a value, an attempt to compress the file
results in little or no reduction in file size.
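
The file sizes mentioned above follow directly from the number of pixels and the number of bits stored per pixel. The sketch below is a rough estimate only. It ignores file headers and row padding, and the sheet size and the 300 dpi resolution are hypothetical values chosen for illustration.

```python
def raw_scan_size_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Approximate uncompressed size of a scanned image in bytes."""
    cols = round(width_in * dpi)
    rows = round(height_in * dpi)
    return rows * cols * bits_per_pixel // 8


# Hypothetical 20 x 30 inch map sheet scanned at 300 dpi
binary = raw_scan_size_bytes(20, 30, 300, 1)    # black and white, 1 bit
grey = raw_scan_size_bytes(20, 30, 300, 8)      # grey scale, 8 bits
colour = raw_scan_size_bytes(20, 30, 300, 24)   # colour, 24 bits

for label, size in [("B&W", binary), ("Grey", grey), ("Colour", colour)]:
    print(f"{label}: {size / 1_000_000:.2f} million bytes")
```

Under these assumptions the grey scale scan is eight times the size of the binary scan, and the colour scan is three times larger again.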

3.2.4 Limitations in use of Scanners

• Hard copy maps often cannot be moved to where a scanning device is available, e.g.
most companies or agencies cannot afford their own scanning device and therefore must
send their maps to a private firm for scanning;
• Hard copy data may not be in a form that is viable for effective scanning, e.g. maps are of
poor quality, or are in poor condition;
• Geographic features may be too few on a single map to make it practical, or cost-justifiable,
to scan;
• Often on busy maps a scanner may be unable to distinguish the features to be captured from
the surrounding graphic information, e.g. dense contours with labels;
• With raster scanning it is difficult to read unique labels (text) for a geographic feature
effectively; and
• Scanning is much more expensive than manual digitizing, considering all the
cost/performance issues.

3.3 Raster Data File Formats

• Raster data represents the world as a surface divided into a regular grid of cells. Raster data
models are useful for storing data that varies continuously, as in an aerial photograph, a
satellite image or an elevation surface.
• There are two types of raster data : continuous and discrete. A raster stores data as a
digital image made up of a grid that can be reduced or enlarged, and each cell of the grid
contains a value representing information. Continuous data represents phenomena such as
temperature, whereas discrete data represents features such as land-use or soils data.
• Raster data provides a matrix of cells whose values are tied to a coordinate and are sometimes
linked to an attribute table, which makes combining many layers much simpler. Raster data
is very easy to modify or program because of its simple data structure.
• Rasters are in part defined by their pixel depth. Pixel depth defines the range of distinct
values the raster can store. For example, a 1-bit raster can only store 2 distinct values :
0 and 1.

 There is a wide range of raster file formats used in the GIS world. Some of the most popular
ones are listed below.

Tagged Image File Format (TIFF)

• This format is associated with scanners. It saves the scanned images and reads them. TIFF
can use run length and other image compression schemes. It is not limited to 256 colors like
a GIF.

GeoTIFF
• GeoTIFF stores georeferencing information, such as the latitude/longitude at the edges of the
pixels, as part of the header of a TIFF file.

Graphics Interchange Format (GIF)

• GIF is a file format for image files, commonly used on the Internet. It
is well-suited for images with sharp edges and relatively few gradations of color.

Joint Photographic Experts Group (JPEG)

• JPEG is a common picture format. It uses a variable-resolution compression system offering
both partial and full resolution recovery.

DEM
 Digital Elevation Models or DEM have two types of displays. The first is 30-meter elevation
data from 1:24,000 seven-and-a-half minute quadrangle map. The second is the 1:250,000 3
arc-second digital terrain data. DEMs are produced by the National Mapping Division of
USGS.

Band Interleaved by Pixel (BIP), Band Interleaved by Line (BIL)
 BIP and BIL are formats produced by remote sensing systems. The primary difference between them is the technique used to store the brightness values captured simultaneously in each of several colors or spectral bands.
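The difference between the two storage orders can be illustrated with a small sketch. It assumes each band is held as a list of rows and simply flattens the bands in BIL or BIP order; the band names and values are made up for illustration, and real BIL/BIP files also need a header describing rows, columns and bands.

```python
def interleave_by_line(bands):
    """BIL order: for each row, write that row of band 1, then of band 2, ..."""
    out = []
    for r in range(len(bands[0])):
        for band in bands:
            out.extend(band[r])
    return out


def interleave_by_pixel(bands):
    """BIP order: for each pixel, write its band 1 value, then band 2, ..."""
    rows, cols = len(bands[0]), len(bands[0][0])
    out = []
    for r in range(rows):
        for c in range(cols):
            for band in bands:
                out.append(band[r][c])
    return out


# Two hypothetical 2 x 2 bands of brightness values.
red = [[1, 2], [3, 4]]
nir = [[10, 20], [30, 40]]
print(interleave_by_line([red, nir]))
print(interleave_by_pixel([red, nir]))
```

In BIL the values of one image row stay together band by band, whereas in BIP all band values of one pixel are stored next to each other.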

RS Landsat
 Landsat satellite imagery and BIL information are used in RS Landsat. In one format, using
BIL, pixel values from each band are pulled out and combined. Programs that use this kind
of information include IDRISI, GRASS, and MapFactory. It is fairly easy to exchange
information from within these raster formats.

Portable Network Graphics (PNG)
 PNG provides well-compressed, lossless storage for raster files. It supports a large range of bit depths, from monochrome to 64-bit color. Its features include indexed color images of up to 256 colors and effectively 100 percent lossless images of up to 16 bits per pixel.
Single file - extension *.png

JPEG File Interchange Format (JFIF)
A standard compression technique for storing full-color and grayscale images. Support for
JPEG compression is provided through the JFIF file format.

Single file - extension *.jpg, *.jpeg, *.jpc, or *.jpe

World file - extension *.jgw
ArcCatalog only recognizes the .jpg file extension by default. To add .jpeg or .jpe files to
ArcMap without renaming them, add those file extensions to ArcCatalog or drag those files from
Windows Explorer into your map.

3.4 Digitizer - Vector Data Input
 Digitizing is the process of interpreting and converting paper map or image data to vector
digital data.
 Digitizing is the process by which coordinates from a map, image, or other sources of data
are converted into a digital format in a GIS. This process becomes necessary when available
data is gathered in formats that cannot be immediately integrated with other GIS data.
 Digitization results in shapefiles, which store vector features.
 Manual digitizing is a tedious job, and an inefficient operator may introduce several digitizing errors. Hence, it has to be done with the utmost skill and caution.
 Operator fatigue (eye strain, back soreness, etc.) can seriously degrade the data quality, so managers must limit the number of hours an operator works at one time. A commonly used quality check is to produce a verification plot of the digitized data that is visually compared with the map from which the data were originally digitized.
 Tablet digitizers with a free cursor, connected to a personal computer, are the most common devices for digitizing spatial features with planimetric coordinates from analog maps. The analog map is placed on the surface of the digitizing tablet as shown in Fig. 3.4.1. The size of the digitizer usually ranges from A3 to A0.

Fig. 3.4.1 Tablet digitizer


The digitizing operation is as follows :

Step 1 : A map is affixed to a digitizing table.

Step 2 : Control points or tics at the four corners of the map sheet are digitized and input to the PC together with the map coordinates of the four corners.

Step 3 : Map contents are digitized according to the map layers and map code system, in either point mode or stream mode at short time intervals.

Step 4 : Errors such as small gaps at line junctions, overshoots and duplicates are edited out to obtain a clean dataset.

Step 5 : Digitizer coordinates are converted to map coordinates and stored in a spatial database.
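The conversion from digitizer coordinates to map coordinates can be sketched as follows. This is a minimal illustration, assuming the conversion is modelled as an affine transformation fitted by least squares to the four corner tics; the function names and the sample coordinates are hypothetical, and commercial packages may use other models.

```python
def det3(m):
    """Determinant of a 3 x 3 matrix."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(m, b):
    """Solve the 3 x 3 system m x = b with Cramer's rule."""
    d = det3(m)
    solution = []
    for col in range(3):
        mc = [row[:] for row in m]
        for r in range(3):
            mc[r][col] = b[r]
        solution.append(det3(mc) / d)
    return solution


def fit_affine(digitizer_pts, map_pts):
    """Least-squares fit of x = a*u + b*v + c and y = d*u + e*v + f."""
    rows = [(u, v, 1.0) for u, v in digitizer_pts]
    # Normal equations: (A^T A) p = A^T b, solved once per map axis.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    coeffs = []
    for axis in (0, 1):
        atb = [sum(r[i] * m[axis] for r, m in zip(rows, map_pts)) for i in range(3)]
        coeffs.append(solve3(ata, atb))
    return coeffs


def to_map(coeffs, u, v):
    """Convert one digitizer coordinate pair to map coordinates."""
    (a, b, c), (d, e, f) = coeffs
    return a * u + b * v + c, d * u + e * v + f


# Four corner tics: digitizer coordinates and their known map coordinates.
tics_digitizer = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
tics_map = [(1000.0, 2000.0), (1100.0, 2000.0), (1100.0, 2100.0), (1000.0, 2100.0)]
coeffs = fit_affine(tics_digitizer, tics_map)
print(to_map(coeffs, 5.0, 5.0))
```

With more than three tics the fit is over-determined, so the residuals at the tics give a rough indication of how much the map sheet has stretched or shrunk.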

Major problems of map digitization are :
- The paper map stretches or shrinks from day to day, so newly digitized points are slightly offset from previously digitized points.
- The map itself has errors.
- Discrepancies across neighbouring map sheets produce disconnected features at the sheet edges.
Manual digitizing has many advantages. These include :
 Low capital cost, e.g. digitizing tables are cheap;
 Low cost of labour;
 Flexibility and adaptability to different data types and sources;
 Easily taught in a short amount of time - an easily mastered skill
 Generally the quality of data is high;
 Digitizing devices are very reliable and most often offer a greater precision than the data
warrants; and
 Ability to easily register and update existing data.

Heads-up digitization
This method uses a scanned copy of the map or image, and digitization is done on the screen of the computer monitor. The scanned map is displayed upright on the screen, so it can be viewed without bending the head down; the method is therefore called heads-up digitization. Semi-automatic and automatic methods of digitizing require post-processing but save a lot of time and resources compared to the manual method.


Heads-down digitization
Digitizers are used to capture data from hardcopy maps. Heads-down digitization is done on a digitizing table using a magnetic cursor known as a puck. The position of the puck is detected when it is passed over a table inlaid with a fine mesh of wires. The function of a digitizer is to input the coordinates of points and lines correctly. Digitization can be done in two modes.
Point mode : In this mode, digitization is started by placing a point that marks the beginning of the feature to be digitized, and more points are then added to trace the particular feature (a line or a polygon). The number of points added to trace the feature and the spacing between two consecutive points are decided by the operator.
Stream mode : In stream digitizing, the cursor is placed at the beginning of the feature and a command is sent to the computer to record points automatically, at either equal or unequal intervals, as the cursor moves over the image of the feature.
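Stream mode can be imitated in a short sketch. It assumes a distance-based rule, in which a new point is recorded whenever the cursor has moved at least a chosen spacing from the last recorded point; real digitizers may instead sample at fixed time intervals, and the function name and sample trail are hypothetical.

```python
import math


def stream_digitize(cursor_trail, min_spacing):
    """Record a vertex each time the cursor is at least min_spacing from the last one."""
    recorded = [cursor_trail[0]]  # the first cursor position starts the feature
    for point in cursor_trail[1:]:
        if math.dist(point, recorded[-1]) >= min_spacing:
            recorded.append(point)
    return recorded


# Cursor positions reported while tracing a straight line, one unit apart.
trail = [(float(x), 0.0) for x in range(11)]
print(stream_digitize(trail, 3.0))
```

A larger spacing gives fewer vertices and a smaller file, at the cost of a coarser representation of curved features.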

3.5 Topology
 Topology expresses explicitly the spatial relationships between connecting or adjacent vector features (points, polylines and polygons) in a GIS, such as two lines meeting perfectly at a point, or a directed line having an explicit left and right side.
 Topological or topology-based data are useful for detecting and correcting digitizing errors in geographic data sets and are necessary for some GIS analyses.
 Topological data structures help ensure that information is not unnecessarily repeated. The database stores only one line to represent a boundary (as opposed to two lines, one for each polygon). The database tells us that the line is the “left side” of one polygon and the “right side” of the adjacent polygon.
 Topology is the study of those properties of geometric objects that remain invariant under
certain transformations such as bending or stretching.
 Topology is often explained through graph theory.
 Topology has at least two main advantages :
i) Assurance of data quality
ii) Enhancement of GIS analysis
 Topological relationships are built from simple elements into complex elements: points
(simplest elements), arcs (sets of connected points), areas (sets of connected arcs), and routes
(sets of sections, which are arcs or portions of arcs).


3.5.1 Components of Topology
Topology has three basic components :

1. Connectivity (Arc - Node Topology) :
 Points along an arc that define its shape are called vertices.
 Endpoints of the arc are called nodes.
 Arcs join only at the nodes.

Fig. 3.5.1 Arc-Node topology with list

2. Area Definition / Containment (Polygon - Arc Topology) :
 An enclosed polygon has a measurable area.
 Lists of the arcs that define polygon boundaries and closed areas are maintained.
 Polygons are represented as a series of (x, y) coordinates that connect to define an area.

Fig. 3.5.2 Polygon arc topology

3. Contiguity (Adjacency) :
 Every arc has a direction.
 A GIS maintains a list of the polygons on the left and right side of each arc.
 The computer then uses this information to determine which features are next to one another.


Fig. 3.5.3 Polygon topology
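The three components can be brought together in a small data-structure sketch. It assumes an arc table in which every arc stores its from-node, to-node, left polygon and right polygon; the arc, node and polygon names describe a hypothetical pair of adjacent polygons A and B, and None stands for the area outside the map.

```python
from collections import defaultdict

# arc id: (from_node, to_node, left_polygon, right_polygon)
ARCS = {
    "a1": ("n2", "n1", "A", None),  # outer boundary of A
    "a2": ("n1", "n2", "B", None),  # outer boundary of B
    "a3": ("n1", "n2", "A", "B"),   # edge shared by A and B, stored once
}


def arcs_at_nodes(arcs):
    """Connectivity: which arcs meet at each node."""
    nodes = defaultdict(list)
    for arc_id, (start, end, _, _) in sorted(arcs.items()):
        nodes[start].append(arc_id)
        if end != start:
            nodes[end].append(arc_id)
    return dict(nodes)


def polygon_arcs(arcs):
    """Area definition: the list of arcs bounding each polygon."""
    polygons = defaultdict(list)
    for arc_id, (_, _, left, right) in sorted(arcs.items()):
        for poly in (left, right):
            if poly is not None:
                polygons[poly].append(arc_id)
    return dict(polygons)


def adjacent_polygons(arcs):
    """Contiguity: pairs of polygons lying on either side of the same arc."""
    pairs = set()
    for _, _, left, right in arcs.values():
        if left is not None and right is not None:
            pairs.add(tuple(sorted((left, right))))
    return pairs


print(arcs_at_nodes(ARCS))
print(polygon_arcs(ARCS))
print(adjacent_polygons(ARCS))
```

Because the shared edge a3 is stored only once, editing it updates the boundary of both polygons at the same time.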

Generally, topology is employed to do the following :
 Manage coincident geometry (constrain how features share geometry). For example,
adjacent polygons, such as parcels, have shared edges; street centerlines and the boundaries
of census blocks have coincident geometry; adjacent soil polygons share edges; etc.
 Define and enforce data integrity rules (such as no gaps should exist between parcel features,
parcels should not overlap, road centerlines should connect at their endpoints).
 Support topological relationship queries and navigation (for example, to provide the ability
to identify adjacent and connected features, find the shared edges, and navigate along a
series of connected edges).
 Support sophisticated editing tools that enforce the topological constraints of the data model
(such as the ability to edit a shared edge and update all the features that share the common
edge).
 Construct features from unstructured geometry (e.g., the ability to construct polygons from
lines sometimes referred to as "spaghetti").

3.5.2 Topology in Different GIS Formats

1. Coverage
 Coverage is a topology based vector data format. Coverage can be a point coverage, line
coverage, or polygon coverage.
 The coverage model supports three basic topological relationships.
Connectivity : Arcs connect to each other at nodes.
Area definition : An area is defined by a series of connected arcs.
Contiguity : Arcs have directions and left and right polygons.

2. Shapefile
 The shapefile is a standard non-topological data format. Shapefiles were a first attempt at
representing spatial features as objects.


 Shapefiles store very simple features whose geometry is held as floating-point coordinates. A shapefile is a digital vector storage format for storing geometric location and associated attribute information.
A shapefile is actually a set of several files :
 .shp - shape format; the feature geometry itself
 .shx - shape index format; a positional index of the feature geometry to allow seeking forwards and backwards quickly
 .dbf - attribute format; columnar attributes for each shape, in dBase III format
The geometry of a shapefile is stored in two basic files, .shp and .shx.
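Because a shapefile is a set of files sharing one base name, a simple check for missing parts is often useful before loading data. The sketch below uses only the Python standard library; the helper names and the file path are hypothetical.

```python
from pathlib import Path

# The three component files listed above.
MANDATORY = (".shp", ".shx", ".dbf")


def shapefile_parts(shp_path):
    """Map each mandatory extension to the path expected beside the .shp file."""
    base = Path(shp_path)
    return {ext: base.with_suffix(ext) for ext in MANDATORY}


def missing_parts(shp_path):
    """List the mandatory component files that do not exist on disk."""
    return [ext for ext, path in shapefile_parts(shp_path).items() if not path.exists()]


print(shapefile_parts("data/roads.shp"))
print(missing_parts("data/roads.shp"))
```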

3. DXF (Drawing exchange format)
It is an AutoCAD format that maintains data in separate layers, but it does not support topology.

4. Geodatabase
A geodatabase is a relational database that stores geographic information.
 It is an object-oriented model, not a georelational one.
 A relational database is a collection of tables logically associated with each other by
common key attribute field.
 A geodatabase can store geographic information because, besides storing a number or a string in an attribute field, tables in a geodatabase can also store geometric coordinates to define the shapes and locations of points, lines or polygons.
                   Georelational    Object based
Topological        Coverage         Geodatabase
Non-Topological    Shapefile        Geodatabase

3.6 Topological Consistency Rules
 Geodatabase topology rules allow you to define relationships between features in the same
feature class or subtype or between two feature classes or subtypes. The status of a topology,
including errors and exceptions, is saved to the source geodatabase.
Points
Must be Coincident With
Points in one feature class or subtype must be coincident
with points in another feature class or subtype. Use this rule
when points from one feature class or subtype should be
aligned with points from another feature class or subtype,
for example, when service meters must be coincident with
service points in an electric utility network.

Must Be Disjoint
Points cannot overlap within the same feature class or
subtype. Use this rule when points within one feature class
or subtype should never occupy the same space, for
example, when fittings in a water distribution network
should not overlap.

Must Be Covered By Boundary Of
Points in one feature class or subtype must touch
boundaries of polygons from another feature class or subtype.
Use this rule when you want points to lie on the
boundaries of polygons, for example, when utility service
points are required to be on the boundary of a parcel.

Must Be Properly Inside Polygons
Points in one feature class or subtype must be inside
polygons of another feature class or subtype. Use this rule
when you want points to be completely within the boundaries
of polygons, for example, when state capitals must be inside
each state.
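A rule such as Must Be Properly Inside Polygons can be checked with a point-in-polygon test. The sketch below uses the classic ray-casting method as an illustration; it is not the algorithm of any particular GIS package, the function name and coordinates are hypothetical, and points lying exactly on the boundary are not classified reliably by this simple version.

```python
def point_in_polygon(point, polygon):
    """Ray casting: count how often a ray towards +x crosses the polygon edges."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point matter.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # an odd number of crossings means inside
    return inside


state = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
print(point_in_polygon((2.0, 2.0), state))
print(point_in_polygon((5.0, 2.0), state))
```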

Must Be Covered By Endpoint Of
Points in one feature class or subtype must be covered by the
ends of lines in another feature class or subtype. Use this rule
when you want to model points that are coincident with the ends
of lines, for example, when street intersections must be covered
by the endpoints of street centerlines.


Point Must be Covered by Line
Points in one feature class or subtype must be covered by lines in
another feature class or subtype. Use this rule when you want to model
points that are coincident with lines, for example, when monitoring
stations must fall along streams.

POLYLINE

Must be Larger than Cluster Tolerance
The cluster tolerance is the minimum distance between the vertices that make up a feature.
Vertices that fall within the cluster tolerance are determined to be
coincident. This rule is mandatory for a topology and applies to all
polyline feature classes.
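The effect of a cluster tolerance can be illustrated with a simplified snapping routine. This is only a sketch under a greedy assumption, in which each vertex is snapped to the first earlier vertex found within the tolerance; it is not the clustering procedure used by any particular GIS package, and the names and values are hypothetical.

```python
import math


def snap_vertices(vertices, tolerance):
    """Treat vertices closer than the tolerance as coincident."""
    representatives = []
    snapped = []
    for v in vertices:
        for rep in representatives:
            if math.dist(v, rep) <= tolerance:
                snapped.append(rep)  # within tolerance: reuse the earlier vertex
                break
        else:
            representatives.append(v)
            snapped.append(v)
    return snapped


vertices = [(0.0, 0.0), (0.0005, 0.0), (5.0, 5.0)]
print(snap_vertices(vertices, 0.001))
```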

Must Not Overlap
Lines must not overlap any part of another line within a feature class or
subtype. Lines can touch, intersect, and overlap themselves. Use this rule with
lines that should never occupy the same space with other lines, for example,
when lot lines cannot overlap one another.

Must Not Intersect
Lines must not cross or overlap any part of another line within the same
feature class or subtype. Use this rule with lines whose segments should never
cross or occupy the same space with other lines, for example, when lot lines
cannot intersect or overlap, but the endpoint of one feature can touch the
interior of another feature.

Must Not Have Dangles
The end of a line must touch any part of one other line or any part of itself within
a feature class or subtype. Use this rule when you want lines in a feature class or
subtype to connect to one another, for example, when a street network has line
segments that connect. In this example, you can set exceptions to this rule for road
segments that terminate at dead ends.
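A dangle check can be sketched by counting how many line ends meet at each location. The version below is deliberately simplified: it assumes lines connect only at their endpoints, so an endpoint that touches the interior of another line (which the rule allows) would still be reported. The function name and the street coordinates are hypothetical.

```python
from collections import Counter


def find_dangles(lines):
    """Return endpoints that are used by exactly one line end."""
    counts = Counter()
    for line in lines:
        counts[line[0]] += 1   # start of the line
        counts[line[-1]] += 1  # end of the line
    return sorted(pt for pt, n in counts.items() if n == 1)


streets = [
    [(0, 0), (1, 0)],
    [(1, 0), (2, 0)],
    [(1, 0), (1, 1)],
]
print(find_dangles(streets))
```

Endpoints reported here would either be corrected or, for genuine dead ends, marked as exceptions to the rule.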


POLYGON

Must be Larger than Cluster Tolerance
The cluster tolerance is the minimum distance between the
vertices that make up a feature. Vertices that fall within the cluster
tolerance are determined to be coincident. This rule is mandatory
for a topology and applies to all polygon feature classes.

Must Not Overlap
Requires that polygons must not overlap within a feature class or subtype. Polygons can be
disconnected, touch at a point, or touch along an edge. Use this rule to
make sure that no polygon feature overlaps another polygon feature in
the same feature class or subtype, for example, when administrative
boundaries such as ZIP Codes or voting districts, or mutually exclusive
area classifications such as land form types cannot have any overlaps.

Must Not Have Gaps
Requires that polygons must not have a void between them within a
feature class or subtype. Use this rule when all of your polygons should
form a continuous surface with no voids or gaps, for example, when soil
polygons cannot include gaps or form voids and must form a continuous
fabric.

Must Not Overlap with
Polygons of the first feature class or subtype must not overlap
polygons of the second feature class or subtype. Use this rule when
polygons from one feature class or subtype should not overlap
polygons of another feature class or subtype, for example, when
lakes and land parcels from two different feature classes must not
overlap.

Must be Covered by Feature Class of
The polygons in the first feature class or subtype must be covered
by the polygons of the second feature class or subtype. Use this rule
when each polygon in one feature class or subtype should be covered
by all the polygons of another feature class or subtype, for example,
when states are covered by counties.


Must Cover Each other
All polygons in the first feature class and all polygons in the second feature class must cover
each other. This means that feature class one (1) must be
covered by feature class two (2), and feature class two (2)
must be covered by feature class one (1). Use
this rule when you want the polygons from two feature classes
or subtypes to cover the same area, for example, when
vegetation and soils must cover each other.

Must be Covered by
Polygons in one feature class or subtype must be covered by a
single polygon from another feature class or subtype. Use this rule
when you want one set of polygons to be covered by some part of
another single polygon in another feature class or subtype, for
example, when counties must be covered by states.

Boundary Must be Covered by
Polygon boundaries in one feature class or subtype must be covered
by the lines of another feature class or subtype. Use this rule when
polygon boundaries should be coincident with another line feature class
or subtype, for example, when major road lines form part of outlines
for census blocks.

Area Boundary Must be Covered by Boundary of
The boundaries of polygons in one feature class or subtype must be
covered by the boundaries of polygons in another feature class or
subtype. Use this rule when the boundaries of polygons in one feature
class or subtype should align with the boundaries of polygons in another
feature class or subtype, for example, when subdivision boundaries are
coincident with parcel boundaries but do not cover all parcels.
Contains Point
Each polygon of the first feature class or subtype must contain
within its boundaries at least one point of the second feature class or
subtype. Use this rule to make sure that all polygons have at least one
point within their boundaries. Overlapping polygons can share a
point in that overlapping area, for example, when school district
boundaries must contain at least one school.

3.7 Attribute Data Input and Management
 Attribute data describe the characteristics of the map features.
 Attribute data are stored in tables.
 Each row of a table represents a map feature.
 Each column represents a characteristic.
 The object-oriented data model stores both spatial data and attribute data in a single database, but can distinguish spatial data from attribute data.

Linking Attribute Data and Spatial Data
 The georelational data model stores spatial data and attribute data in separate files.
 Each map feature has a unique label ID (Fig. 3.7.1).
 Linked by feature ID, the two sets of data files can be queried, analyzed and displayed.
 Attribute data are stored in a table called feature attribute table (Fig 3.7.2).
 A row is called a record.
 A column is called a field.

Fig. 3.7.1 Attribute data linked to spatial data

Fig. 3.7.2 Attribute data table
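The link between spatial data and attribute data through the feature ID can be sketched with plain Python dictionaries. This is only an illustration of the idea; the field names, geometries and values are hypothetical, and a real GIS keeps the two data sets in separate files or tables managed by the software.

```python
# Spatial data: one entry per map feature, identified by its ID.
spatial = [
    {"id": 1, "geometry": [(0, 0), (4, 0), (4, 3), (0, 3)]},
    {"id": 2, "geometry": [(4, 0), (8, 0), (8, 3), (4, 3)]},
]

# Feature attribute table: one record (row) per feature, one field per column.
attribute_table = [
    {"id": 1, "land_use": "forest", "owner": "state"},
    {"id": 2, "land_use": "urban", "owner": "private"},
]


def link(features, table, key="id"):
    """Join each spatial feature to its attribute record through the shared ID."""
    records = {row[key]: row for row in table}
    return [{**feature, **records.get(feature[key], {})} for feature in features]


def select(linked, field, value):
    """Query the linked data: IDs of features whose field equals the value."""
    return [feature["id"] for feature in linked if feature.get(field) == value]


linked = link(spatial, attribute_table)
print(select(linked, "land_use", "urban"))
```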


 Most GIS projects have many attributes.
 Data from both structures are linked together for use through unique identification numbers,
e.g. feature labels and DBMS primary keys.
 This coupling of spatial features with an attribute record is usually maintained by an internal
number assigned by the GIS software.
 A label is required so the user can load the appropriate attribute record for a given
geographic feature.
 Most often a single attribute record is automatically created by the GIS software once a
clean topological structure is properly generated.
 This attribute record normally contains the internal number for the feature, the user's label
identifier, the area of the feature, and the perimeter of the feature.
 Linear features have the length of the feature defined instead of the area.
 Storing all attributes in a single table is inefficient in both time and computer space, and makes the table difficult to use and update. Most GIS packages therefore include a DBMS :
o INFO for Arc/Info
o MS Access for IDRISI, ArcView and ArcGIS

3.8 Open Database Connectivity
ODBC (Open Database Connectivity) is a standard software API specification for using database management systems (DBMS). It is a component of the Windows Open Services Architecture and is independent of any programming language, database system and operating system.
Open Database Connectivity, or ODBC, is an application programming interface (API) that lets software connect with database management systems while remaining independent of them. This is important because it allows applications to interact with multiple databases simultaneously using SQL (Structured Query Language).

Goals of ODBC
 Access any data from any application, regardless of which DBMS is handling the data.
 Insert a middle layer, called a database driver, between an application and the database management system. This layer translates the application's data queries into commands that the DBMS understands.
 Allow application programs to use SQL to access data from any kind of source.


Some of the advantages of ODBC are :
 ODBC provides a consistent interface regardless of the kind of database server used.
 You can have more than one concurrent connection.
 Applications do not have to be bound to each database on which they will run.
 Although COBOL for AIX does this bind for you automatically, it binds automatically to
only one database. If you want to choose which database to connect to dynamically at run
time, you must take extra steps to bind to a different database.

The ODBC architecture has four components :
 Application
 Driver Manager
 Driver
 Data Source

Application
Performs processing and calls ODBC functions to submit SQL statements and retrieve results.
A number of tasks are common to all applications, no matter how they use ODBC.
Taken together, they largely define the flow of any ODBC application. The tasks are :
 Selecting a data source and connecting to it.
 Submitting an SQL statement for execution.
 Retrieving results (if any).
 Processing errors.
 Committing or rolling back the transaction enclosing the SQL statement.
 Disconnecting from the data source.
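The flow of tasks listed above can be sketched in Python. Note the assumption: Python's built-in sqlite3 module is not an ODBC driver; it is used here only as a stand-in because it follows the same connect, execute, fetch, commit or roll back, and close pattern without any driver setup. With an ODBC binding such as pyodbc, the connection step would name a data source instead. The table and column names are hypothetical.

```python
import sqlite3


def query_urban_parcels():
    # 1. Select a data source and connect to it (an in-memory database here).
    conn = sqlite3.connect(":memory:")
    try:
        cur = conn.cursor()
        cur.execute("CREATE TABLE parcels (id INTEGER PRIMARY KEY, land_use TEXT)")
        cur.executemany("INSERT INTO parcels VALUES (?, ?)",
                        [(1, "forest"), (2, "urban")])
        # 2. Submit an SQL statement for execution.
        cur.execute("SELECT id FROM parcels WHERE land_use = ?", ("urban",))
        # 3. Retrieve results (if any).
        rows = cur.fetchall()
        # 5. Commit the transaction enclosing the SQL statements.
        conn.commit()
        return rows
    except sqlite3.Error:
        # 4. Process errors, rolling the transaction back.
        conn.rollback()
        raise
    finally:
        # 6. Disconnect from the data source.
        conn.close()


print(query_urban_parcels())
```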

Driver Manager
 Loads and unloads drivers on behalf of an application. Processes ODBC function calls or
passes them to a driver.
 The Driver Manager exists mainly as a convenience to application writers and solves a number of problems common to all applications. These include determining which driver to load based on a data source name, loading and unloading drivers, and calling functions in drivers.

Driver
 Processes ODBC function calls, submits SQL requests to a specific data source, and returns results to the application. If necessary, the driver modifies an application's request so that the request conforms to syntax supported by the associated DBMS.


Drivers are libraries that implement the functions in the ODBC API. Each is specific to a particular DBMS. Drivers expose the capabilities of the underlying DBMSs; they are not required to implement capabilities not supported by the DBMS. The only major exception to this is that drivers for DBMSs that do not have stand-alone database engines, such as Xbase, must implement a database engine that at least supports a minimal amount of SQL.

Data Source
Consists of the data the user wants to access and its associated operating system, DBMS, and network platform (if any) used to access the DBMS.
A data source is simply the source of the data. It can be a file, a particular database on a
DBMS, or even a live data feed. The data might be located on the same computer as the program,
or on another computer somewhere on a network.
The purpose of a data source is to gather all of the technical information needed to access the
data - the driver name, network address, network software, and so on - into a single place and hide
it from the user.

Fig. 3.8.1 Four Components of Open Database Connectivity (ODBC)

3.9 GPS or Global Positioning System

What is GPS ?
GPS or Global Positioning System is a satellite navigation system that furnishes location and time information to the user in all weather conditions. GPS is used for navigation in planes, ships, cars and trucks. The system gives critical capabilities to military and civilian users around the globe. GPS provides continuous, real-time, 3-dimensional positioning, navigation and timing worldwide.

How does GPS System Work ?
The GPS system consists of three segments :
1) The space segment : the GPS satellites.
2) The control segment, operated by the U.S. military.
3) The user segment, which includes both military and civilian users and their GPS equipment.

Fig 3.9.1 Three elements of GPS

Space Segment :
The space segment is the constellation of GPS satellites. It comprises 29 satellites circling the Earth every 12 hours at an altitude of about 12,000 miles. The function of the space segment is to transmit navigation signals and to store and retransmit the navigation message sent by the control segment. These transmissions are controlled by highly stable atomic clocks on the satellites. The constellation has enough satellites to ensure that users have at least 4 satellites simultaneously in view from any point on the Earth's surface at any time.

Control Segment :
The control segment comprises a master control station and five monitor stations, outfitted with atomic clocks, that are spread around the globe. The five monitor stations monitor the GPS satellite signals and send that information to the master control station, where anomalies are corrected and the corrections are sent back to the GPS satellites through ground antennas. The control segment is also referred to as the monitor station.

User Segment :
The user segment comprises the GPS receivers, which receive the signals from the GPS satellites and determine how far away they are from each satellite. This segment is used mainly by the U.S. military, for example in missile guidance systems, and by civilian applications in almost every field. Civilian uses range from surveying and transportation to natural resources, agriculture and mapping.

3.9.2 How does the GPS Work ?
 Principle : GPS works on the principle of trilateration i.e. determining absolute or relative
locations of points based on the distances to at least three known positions.
 Determining the location of a receiver
 A GPS receiver calculates its distance from a satellite by measuring how long a signal from the satellite takes to reach it. This implies that the receiver is located somewhere on the surface of an imaginary sphere of radius R1 centred at the satellite A (satellite 1).
 The receiver also calculates its distance to a second satellite. Similarly, a sphere centred at B (satellite 2) with a radius R2 can be imagined, on whose surface the receiver lies. Since the receiver is at distance R1 from A (satellite 1) and at distance R2 from B (satellite 2), the receiver must be at one of the two points of intersection of the two spheres when they are drawn in two dimensions (shown by red dots).
 The distance calculated from the third satellite adds one more sphere on whose surface the receiver lies. This leaves only one valid intersection, i.e. the point where the three spheres intersect is the position of the receiver in a two dimensional space.
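The distance measurement described above is a simple product of travel time and signal speed. The sketch below assumes the signal travels at the speed of light in vacuum and ignores atmospheric delays; the travel time used is an illustrative value, not a measured one.

```python
SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def range_from_travel_time(seconds):
    """Distance to the satellite implied by the signal travel time."""
    return SPEED_OF_LIGHT * seconds


# A travel time of about 67 milliseconds corresponds to roughly 20,000 km.
print(range_from_travel_time(0.067) / 1000.0, "km")

# A timing error of one microsecond already shifts the range by about 300 m.
print(range_from_travel_time(1e-6), "m")
```

The second line of output shows why even small clock errors in the receiver must be accounted for when positions are computed.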


A GPS receiver determines its position by using the signals that it receives from different
satellites. Since the receiver must solve for its position (X, Y, Z) and the clock error (d), four
satellites are required to solve for the receiver's position using the following four equations :

Fig. 3.9.3 Determining the location of receiver

$(R_1 - d)^2 = (X - X_1)^2 + (Y - Y_1)^2 + (Z - Z_1)^2$
$(R_2 - d)^2 = (X - X_2)^2 + (Y - Y_2)^2 + (Z - Z_2)^2$
$(R_3 - d)^2 = (X - X_3)^2 + (Y - Y_3)^2 + (Z - Z_3)^2$
$(R_4 - d)^2 = (X - X_4)^2 + (Y - Y_4)^2 + (Z - Z_4)^2$
Where (X1, Y1, Z1), (X2, Y2, Z2), (X3, Y3, Z3) and (X4, Y4, Z4) are the locations of the satellites, R1, R2, R3, R4 are the measured distances of the satellites from the receiver, and d is the receiver clock error expressed as a distance, so that each measured distance minus d is the true geometric distance. Hence, by solving the four equations for the four unknowns X, Y, Z and d, the location of the receiver is calculated.
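Solving for the four unknowns can be sketched numerically. The example below assumes the usual pseudorange model, in which each measured range equals the geometric distance plus a clock term d expressed in distance units, and solves the four equations by Newton iteration starting from the Earth's centre. The satellite coordinates (in km), the test position and all function names are hypothetical, and real receivers use more satellites and more refined models.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def locate_receiver(sats, ranges, iterations=20):
    """Newton iteration for X, Y, Z and the clock term d from four ranges."""
    x = y = z = d = 0.0
    for _ in range(iterations):
        jacobian, residual = [], []
        for (sx, sy, sz), measured in zip(sats, ranges):
            dist = math.sqrt((x - sx) ** 2 + (y - sy) ** 2 + (z - sz) ** 2)
            # Partial derivatives of (dist + d) with respect to X, Y, Z and d.
            jacobian.append([(x - sx) / dist, (y - sy) / dist, (z - sz) / dist, 1.0])
            residual.append(measured - (dist + d))
        dx, dy, dz, dd = solve_linear(jacobian, residual)
        x, y, z, d = x + dx, y + dy, z + dz, d + dd
    return x, y, z, d


# Simulated measurement: ranges computed from a known position and clock term.
sats = [(26000.0, 0.0, 0.0), (0.0, 26000.0, 0.0),
        (0.0, 0.0, 26000.0), (15000.0, 15000.0, 15000.0)]
true_position, true_clock_term = (1000.0, 2000.0, 6000.0), 30.0
ranges = [math.dist(true_position, s) + true_clock_term for s in sats]
print(locate_receiver(sats, ranges))
```

Because the ranges were generated from a known position, the printed solution can be compared directly with the values used to simulate the measurement.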

3.9.3 Sources of Error
Following are the possible sources of errors that may affect a GPS reading :
 Ionosphere and troposphere delays : When a satellite signal passes through the
atmosphere it slows down. This slow down is taken care of by the built-in model of the GPS
system which calculates the average delay to correct this type of error.
 Signal multipath error : This error arises when the GPS signal is reflected off objects such as tall buildings, mountains or other obstacles before it reaches the receiver. The reflection increases the distance that the signal travels to reach the receiver, so the receiver assumes that the satellite is more distant than it actually is.
 Receiver clock errors : The built-in clock of a receiver may not be as accurate as the atomic clocks onboard the GPS satellites and this may lead to errors in calculating the time.
 Orbital errors : These are inaccuracies in the satellite's reported location. Although the satellites remain in nominally fixed orbits, gravitational forces can cause a slight shift in the orbit.


 Satellite visibility : The more satellites a GPS receiver can observe, the better the accuracy. Buildings, terrain, or dense foliage can block signals, which can cause inaccurate estimation of the position or no position reading at all.
 Satellite geometry : The satellite geometry refers to the relative position of the satellites at any given time. Better GPS positions are obtained when the satellites are widely separated from each other than when they are in a tight grouping.

3.9.4 Applications of GPS
GPS is an essential element of the global information infrastructure. It is free, open and so dependable that it is present in everything from wrist watches to shipping containers. One may find GPS in sectors such as farming, construction, mining, surveying, and logistics.
The benefits arising from the use of GPS in various fields are mentioned below :

Agriculture
 Allows accurate field navigation, and maximum ground coverage in the shortest possible
time.
 Enhancement of crop productivity by having precision soil sampling, correct estimation of
variation in chemical applications and planting density.

Environment
 Environmental disasters such as fires and oil spills can be tracked accurately.
 GPS tracking and mapping to facilitate monitoring and preservation of endangered species.

Aviation
 Free, continuous and accurate positioning information of flights on a global basis.
 Safe and fuel-efficient routes for airspace service providers.

Public Safety and Disaster Relief


 Helps in mapping the disaster affected regions.
 Can provide positional information about individuals with mobile phones in case of
emergency.

Surveying and Mapping


 Provides significant productivity gains over traditional surveying by eliminating many of its
inherent limitations
 Allows surveyors to work uninterrupted in periods of poor weather conditions
Advantages of GPS :
 The GPS satellite based navigation system is an important tool for military, civil and
commercial users.
 GPS-based vehicle tracking and navigation systems can provide turn by turn directions.
 Very high speed of position fixing.

Disadvantages of GPS :
 GPS satellite signals are weak compared to phone signals, so GPS does not work as well
indoors, underwater, under trees, etc.
 The highest accuracy requires line-of-sight from the receiver to the satellites. This is why
GPS does not work very well in an urban environment.

3.10 Two Marks Questions with Answers


Q.1 What are the data input devices used in a GIS ?
Ans. : The different methods of input into a GIS are by
 Keyboard entry
 Manual digitizing
 Scanning and
 Automatic digitizing.
Q.2 What are the data output devices used in a GIS ?
Ans. : The important data output devices used in a GIS are
 Plotter : Used to plot the graphical information on paper after analysis.
 Printer : Used to print the information on paper after analysis.
 VDU : Visual display unit, used to display the results after analysis.
 Tape drive : Used to store the results after analysis and transfer them to other systems.
Q.3 What is buffering ?
Ans. : Buffering is the creation of polygons that surround other points, lines or polygons.
Buffers may be created either to exclude a certain amount of area around a point, line or
polygon or to include only the buffer area in a study.
Q.4 Write short notes on digitizing.
Ans. : Digitizing is the process of converting the data from maps and other documents into
digital form. The digital form may be vector or raster data. A digitizer is used to convert the
data from maps into digital form. The two methods are manual digitizing and automatic
digitizing.
Q.5 List the various errors in digitizing.


Ans. : The various errors in digitizing are related to the scale and resolution of the
source/base map, the quality of the equipment and the software used, incorrect registration, a
shaky hand, line thickness, overshoots, undershoots, spikes, displacement, polygonal knots and
psychological errors.
Q.6 What is scanning ?
Ans. : Scanning is the most commonly used method of converting an analogue source document
into digital raster format. A scanner is a piece of hardware with a light sensitive device that
performs this conversion. When raster data are to be encoded, scanning is the most appropriate
option. There are three different types of scanners in use :
 Flat-bed scanners (a PC peripheral).
 Rotating drum scanners.
 Large format feed scanners
Q.7 What is overlaying ?
Ans. : Map overlay is the process of taking two or more different thematic map layers of the
same area and placing them on top of one another to form a new composite layer. The technique
is also used to overlay vector data on a raster image. In vector based systems map overlay is
time consuming, complex and computationally expensive. In raster based systems it is quick,
straightforward and efficient.
Q.8 What are the scanners available for GIS software ?
Ans. :
 Mechanical scanner
 Video camera
 CCD camera
 CCD scanner (CCD - Charge Coupled Device)
Q.9 Name any five raster data file formats.
Ans. :
 JPEG File Interchange Format (JFIF)
 Portable Network Graphics (PNG)
 Tagged Image File Format (TIFF)
 Graphics Interchange Format (GIF)
 Joint Photographic Experts Group (JPEG)
Q.10 What are the two modes of digitization ?


Ans. :
Point mode : In this mode, digitization is started by placing a point that marks the beginning
of the feature to be digitized, and more points are then added one at a time to trace the
feature (a line or a polygon).
Stream mode : In stream digitizing, the cursor is placed at the beginning of the feature and a
command is sent to the computer to record points at either equal or unequal intervals as the
cursor moves over the image of the feature.
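
The difference between the two modes can be sketched in a few lines of code. This is a minimal
sketch that assumes stream mode records a vertex whenever the cursor has moved a minimum
distance from the last recorded vertex; actual digitizing software may sample by time interval
instead. The function name and the sample cursor trace are hypothetical.

```python
import math


def stream_mode_sample(cursor_trace, min_spacing):
    """Keep a vertex each time the cursor has moved at least min_spacing
    away from the last recorded vertex (distance-based stream mode)."""
    if not cursor_trace:
        return []
    recorded = [cursor_trace[0]]          # the start of the feature is always kept
    for x, y in cursor_trace[1:]:
        last_x, last_y = recorded[-1]
        if math.hypot(x - last_x, y - last_y) >= min_spacing:
            recorded.append((x, y))
    return recorded


# Hypothetical cursor positions reported while tracing a straight line.
trace = [(0.0, 0.0), (0.4, 0.0), (1.1, 0.0), (1.5, 0.0), (2.3, 0.0), (2.4, 0.0)]

# Point mode: the operator chooses every vertex, so all clicked points are kept.
point_mode_vertices = list(trace)

# Stream mode: the computer thins the trace automatically.
stream_mode_vertices = stream_mode_sample(trace, min_spacing=1.0)
print(stream_mode_vertices)
```

In point mode the operator controls the number of vertices, whereas in stream mode the spacing
parameter does.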
Q.11 What are the three types of topological feature ?
Ans. :
 Connectivity : Information about linkages among spatial objects
 Contiguity : Information about neighbouring spatial object
 Containment : Information about inclusion of one spatial object within another spatial
object
Q.12 Give any five topological consistency rules for Polygon.
Ans. :
 Must Be Larger Than Cluster Tolerance
 Must Not Overlap
 Must Not Have Gaps
 Must Not Overlap With
 Must Be Covered By Feature Class Of
 Must Cover Each Other
Q.13 What is GPS ?
Ans. : GPS or Global Positioning System is a satellite navigation system that furnishes location
and time information to the user in all weather conditions. GPS is used for navigation in
planes, ships, cars and trucks. The system provides critical capabilities to military and
civilian users around the globe. GPS provides continuous, real time, 3-dimensional positioning,
navigation and timing worldwide.
Q.14 What are the advantages and disadvantages of GPS ?
Ans. : Advantages of GPS :
 The GPS satellite based navigation system is an important tool for military, civil and
commercial users.
 GPS-based vehicle tracking and navigation systems can provide turn by turn directions.
 Very high speed of position fixing.
Disadvantages of GPS :
 GPS satellite signals are weak compared to phone signals, so GPS does not work as well
indoors, underwater, under trees, etc.
 The highest accuracy requires line-of-sight from the receiver to the satellites. This is why
GPS does not work very well in an urban environment.
Q.15 Define ODBC.
Ans. : Open Database Connectivity, or ODBC, is an application programming interface (API)
that lets software connect with database management systems while remaining independent of
them. This is important because it allows applications to interact with multiple databases
simultaneously using SQL.
Q.16 What are the four components of ODBC ?
Ans. :
 Application
 Driver Manager
 Driver
 Data Source
Q.17 Name the three segments in GPS.
Ans. :
The GPS system consists of three segments :
 The space segment : the GPS satellites.
 The control segment : operated by the U.S. military.
 The user segment : both military and civilian users and their GPS equipment.
Q.18 Write few advantages of ODBC.
Ans. :
Some of the advantages of ODBC are :
 ODBC provides a consistent interface regardless of the kind of database server used.
 You can have more than one concurrent connection.
 Applications do not have to be bound to each database on which they will run.
Q.19 Name some topological consistency rules for Line.
Ans. :
 Must Coincide With
 Must Be Disjoint
 Must Be Covered By Boundary Of
 Must Be Properly Inside
 Must Be Covered By Endpoint Of
 Must Be Covered By Line

Q.20 Give the working operation of Tablet Digitizer.
Ans. : The digitizing operation is as follows :
Step 1 : A map is affixed to a digitizing table.
Step 2 : Control points or tics at the four corners of the map sheet are digitized with the
digitizer and input to the PC together with the map coordinates of the four corners.
Step 3 : Map contents are digitized according to the map layers and map code system in either
point mode or stream mode at a short time interval.
Step 4 : Errors such as small gaps at line junctions, overshoots, duplicates etc. are edited
to produce a clean dataset without errors.
Step 5 : Digitizer coordinates are converted to map coordinates for storage in a spatial
database.
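
The conversion in Step 5 is often done with an affine transformation fitted to the control
points digitized in Step 2. The following is a minimal sketch under that assumption, using a
least-squares fit in plain Python. The tic coordinates and function names are hypothetical, and
production software would also report the fitting error of the control points.

```python
def solve_3x3(matrix, rhs):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    a = [row[:] + [value] for row, value in zip(matrix, rhs)]
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("control points are collinear or repeated")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= factor * a[col][c]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        partial = sum(a[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (a[r][n] - partial) / a[r][r]
    return solution


def fit_affine(digitizer_pts, map_pts):
    """Least-squares fit of X = a*x + b*y + c and Y = d*x + e*y + f."""
    rows = [(x, y, 1.0) for x, y in digitizer_pts]
    normal = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    coeffs = []
    for k in (0, 1):  # k = 0 fits map X, k = 1 fits map Y
        rhs = [sum(r[i] * m[k] for r, m in zip(rows, map_pts)) for i in range(3)]
        coeffs.append(solve_3x3(normal, rhs))
    return coeffs


def to_map(coeffs, point):
    """Apply the fitted transformation to one digitizer coordinate pair."""
    (a, b, c), (d, e, f) = coeffs
    x, y = point
    return (a * x + b * y + c, d * x + e * y + f)


# Hypothetical tics: digitizer table units and matching map coordinates (metres).
tics_digitizer = [(0.0, 0.0), (10.0, 0.0), (10.0, 8.0), (0.0, 8.0)]
tics_map = [(500000.0, 4000000.0), (505000.0, 4000000.0),
            (505000.0, 4004000.0), (500000.0, 4004000.0)]

coeffs = fit_affine(tics_digitizer, tics_map)
print(to_map(coeffs, (5.0, 4.0)))  # centre of the map sheet
```

Four tics give more equations than the six unknown coefficients require, which is why a
least-squares fit is used here rather than an exact solution.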

3.11 Long Answered Questions with Answers


Q.1 Explain in detail about GPS and its working principle. (Refer section 3.9)
Q.2 Discuss in detail about the topological consistency rules. (Refer section 3.6)
Q.3 What is Scanner ? Give its types and its operation procedures. (Refer section 3.2)
Q.4 What is Digitizer ? Explain the working process. (Refer section 3.4)
Q.5 Define ODBC. What are the components of ODBC ? (Refer section 3.8)
Q.6 Illustrate the concept of topological Features. (Refer section 3.5)
Q.7 List out the raster data file formats. (Refer section 3.3)



OCE552 – GEOGRAPHIC INFORMATION SYSTEM UNIT- I

systems. The process of projection transforms the Earth’s surface to a plane, and the outcome
is a map projection, ready to be used for a projected coordinate system.

The map shows the interstate highways in Idaho and Montana based on different coordinate systems.
The map shows the connected interstate networks based on the same coordinate system.

Geographic Coordinate System:


The geographic coordinate system is the reference system for locating spatial features
on the Earth’s surface. The geographic coordinate system is defined by longitude and
latitude. Both longitude and latitude are angular measures: longitude measures the angle east
or west from the prime meridian, and latitude measures the angle north or south of the
equatorial plane. For example, the longitude at point X is the angle a west of the prime
meridian, and the latitude at point Y is the angle b north of the equator.

The geographic coordinate system.


Meridians are lines of equal longitude. The prime meridian passes through
Greenwich, England, and has the reading of 0°. Using the prime meridian as a reference, we
can measure the longitude value of a point on the Earth’s surface as 0° to 180° east or west of
the prime meridian. Meridians are therefore used for measuring location in the E–W
direction. Parallels are lines of equal latitude.


The flattening is based on the difference between the semimajor axis a and the semiminor axis b.
The angular measures of longitude and latitude may be expressed in degrees-minutes-seconds
(DMS), decimal degrees (DD), or radians (rad). Given that 1 degree equals 60
minutes and 1 minute equals 60 seconds, we can convert between DMS and DD. For
example, a latitude value of 45°52'30" would be equal to 45.875° (45 + 52/60 + 30/3600).
Radians are typically used in computer programs. One radian equals 57.2958°, and one
degree equals 0.01745 rad.

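The conversions described above can be checked with a short program. This is a minimal
sketch; the function name dms_to_dd and its hemisphere argument are assumptions made for the
example, while math.radians and math.degrees are standard Python library functions.

```python
import math


def dms_to_dd(degrees, minutes, seconds, hemisphere="N"):
    """Convert degrees-minutes-seconds to decimal degrees.

    Values in the southern (S) or western (W) hemisphere are returned
    as negative decimal degrees.
    """
    dd = degrees + minutes / 60.0 + seconds / 3600.0
    return -dd if hemisphere in ("S", "W") else dd


latitude_dd = dms_to_dd(45, 52, 30)            # the example from the text
latitude_rad = math.radians(latitude_dd)       # decimal degrees to radians

print(latitude_dd)                             # 45.875
print(round(math.degrees(1.0), 4))             # degrees in one radian
print(round(math.radians(1.0), 5))             # radians in one degree
```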
Map Projections:
A map projection transforms the geographic coordinates on an ellipsoid into locations
on a plane. The outcome of this transformation process is a systematic arrangement of
parallels and meridians on a flat surface representing the geographic coordinate system. A
map projection provides a couple of distinctive advantages. First, a map projection allows us
to use two-dimensional maps, either paper or digital. Second, a map projection allows us to
work with plane coordinates rather than longitude and latitude values.
Map projections can be grouped by either the preserved property or the projection
surface. Cartographers group map projections by the preserved property into the following
four classes: conformal, equal area or equivalent, equidistant, and azimuthal or true direction.
A conformal projection preserves local angles and shapes. An equivalent projection
represents areas in correct relative size. An equidistant projection maintains consistency of
scale along certain lines. And an azimuthal projection retains certain accurate directions. The
preserved property of a map projection is often included in its name, such as the Lambert
conformal conic projection or the Albers equal-area conic projection.


Case and projection

A map projection is defined by its parameters. Typically, a map projection has five or more
parameters. A standard line refers to the line of tangency between the projection surface and
the reference globe. The standard line is called the standard parallel if it follows a parallel,
and the standard meridian if it follows a meridian. The principal scale, or the scale of the
reference globe, can be derived from the ratio of the globe’s radius to the Earth’s radius
(3963 miles or 6378 kilometers). The scale factor is the normalized local scale, defined as the
ratio of the local scale to the principal scale. The false easting is the assigned x-coordinate
value and the false northing is the assigned y-coordinate value. Essentially, the false easting
and false northing create a false origin so that all points fall within the NE quadrant and have
positive coordinates. The following are the commonly used map projections: Transverse
Mercator, Lambert Conformal Conic, Albers Equal-Area Conic, Equidistant Conic, Web
Mercator.
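
The relationships among the principal scale, the scale factor and the false origin can be
written out as simple arithmetic. The sketch below is illustrative only: the function names,
the globe radius, the local scale and the projected coordinates are assumptions chosen for the
example, and the 500,000 m false easting is used here simply as a typical value.

```python
EARTH_RADIUS_KM = 6378.0  # Earth's radius as quoted in the text


def principal_scale(globe_radius_km):
    """Scale of the reference globe: globe radius divided by Earth radius."""
    return globe_radius_km / EARTH_RADIUS_KM


def scale_factor(local_scale, principal):
    """Normalized local scale: local scale divided by principal scale."""
    return local_scale / principal


def apply_false_origin(x, y, false_easting, false_northing):
    """Shift projected coordinates so that they become positive."""
    return (x + false_easting, y + false_northing)


# A hypothetical reference globe with a radius of 0.3189 m (0.0003189 km).
ps = principal_scale(0.0003189)
print(f"principal scale = 1:{1 / ps:,.0f}")

# A hypothetical local scale slightly smaller than the principal scale.
print(round(scale_factor(4.998e-8, ps), 4))

# A point west of the central meridian has a negative x before the shift.
print(apply_false_origin(-120000.0, 35000.0, 500000.0, 0.0))
```

A scale factor of 1 means the local scale equals the principal scale, as it does along a
standard line.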

Projected Coordinate Systems:


A projected coordinate system is built on a map projection. Projected coordinate
systems and map projections are often used interchangeably. For example, the Lambert
conformal conic is a map projection but it can also refer to a coordinate system. In practice,
however, projected coordinate systems are designed for detailed calculations and positioning,
and are typically used in large-scale mapping such as at a scale of 1:24,000 or larger.
Accuracy in a feature’s location and its position relative to other features is therefore a key
consideration in the design of a projected coordinate system. To maintain the level of
accuracy desired for measurements, a projected coordinate system is often divided into
different zones, with each zone defined by a different projection center.


The Projected Coordinate System. (a) Representation of points in a geographic coordinate system;
(b) equivalent representation in a projected coordinate system.

Three coordinate systems are commonly used in the United States: the Universal
Transverse Mercator (UTM) grid system, the Universal Polar Stereographic (UPS) grid
system, and the State Plane Coordinate (SPC) system.

Example Coordinate Systems

World Geodetic System (WGS-84) is familiar to many non-geographers because it is
used by GPS devices to describe locations all over the Earth. A different geographic coordinate
system (GCS), called OSGB-36, which is more accurate for describing locations in Britain but
not as good for other countries, is used specifically for British data. Web Mercator is a
projected coordinate system (PCS) based on WGS-84 used for global maps, and British National
Grid (BNG) is a PCS based on OSGB-36 used for British maps. Converting between coordinate
systems that are based on the same GCS is relatively straightforward, but when converting, for
example, GPS (WGS-84) coordinates to BNG, a mathematical transformation between the two GCSs is
required. The "Petroleum" transformation is an accurate transformation from WGS-84 to OSGB-36.


COMPONENTS OF GIS:
A GIS is an organized collection of computer hardware, software, geographic data,
and personnel designed to efficiently capture, store, update, manipulate, analyze, and display
all forms of geographically referenced information. GIS technology integrates common
database operations, such as query and statistical analysis, with the unique visualization and
geographic analysis benefits offered by maps. A working GIS integrates the following key
components: hardware, software, data, people, and methods.
o Hardware - GIS hardware includes computers for data processing, data storage,
and input/output; printers and plotters for reports and hard-copy maps; digitizers
and scanners for digitization of spatial data; and GPS (Global Positioning System)
and mobile devices for fieldwork.
o Software - GIS software, either commercial or open source, includes programs
and applications to be executed by a computer for data management, data analysis,
data display, and other tasks. Additional applications, written in Python,
JavaScript, VB.NET, or C++, may be used in GIS for specific data analyses.
o Method - A successful GIS operates according to a well-designed plan and
business rules, which are the models and operating practices unique to each
organization. An organization should document its process plan for GIS
operation. Such a document addresses a number of questions about the GIS methods:
the number of GIS experts required, the GIS software and hardware, the process for
storing the data, the type of DBMS (database management system) to use, and more. A
well-designed plan will address all these questions.
o People - GIS technology is of limited value without the people who manage the
system and develop plans for applying it. GIS users range from technical
specialists who design and maintain the system to those who use it to help them
do their everyday work.

o Data - Perhaps the most important component of a GIS is the data. Geographic
data and related tabular data can be collected in-house or bought from a
commercial data provider. Most GIS employ a DBMS to create and maintain a
database to help organize and manage data. The data that a GIS operates on
consists of any data bearing a definable relationship to space, including any data
about things and events that occur in nature. At one time this consisted of hard-copy
data, like traditional cartographic maps, surveyor’s logs, demographic
statistics, geographic reports, and descriptions from the field. Advances in spatial
data collection, classification, and accuracy have allowed more and more standard
digital base-maps to become available at different scales.
o Organization - GIS operations exist within an organizational environment;
therefore, they must be integrated into the culture and decision-making processes
of the organization for such matters as the role and value of GIS, GIS training,
data collection and dissemination, and data standards.

WORKING OF GIS:
The working of a GIS involves the following elements: geospatial data, data acquisition, data
management, data display, data exploration, and data analysis.

 Geospatial Data: By definition, geospatial data cover the location of spatial features.
To locate spatial features on the Earth’s surface, we can use either a geographic or a
projected coordinate system. A geographic coordinate system is expressed in
longitude and latitude and a projected coordinate system in x, y coordinates. Many
projected coordinate systems are available for use in GIS. A GIS represents
geospatial data as either vector data or raster data.

The vector data model uses x, y coordinates to represent point features

The vector data model uses points, lines, and polygons to represent spatial features
with a clear spatial location and boundary such as streams, land parcels, and
vegetation stands. Each feature is assigned an ID so that it can be associated with its
attributes.
The raster data model uses a grid and grid cells to represent spatial features: point
features are represented by single cells, line features by sequences of neighbouring
cells, and polygon features by collections of contiguous cells. The cell value
corresponds to the attribute of the spatial feature at the cell location. Raster data are
ideal for continuous features such as elevation and precipitation.

The raster data model uses cells in a grid to represent point features

A vector data model can be georelational or object-based, with or without topology,
and simple or composite. The georelational model stores geometries and attributes of
spatial features in separate systems, whereas the object-based model stores them in a
single system. Topology explicitly expresses the spatial relationships between
features, such as two lines meeting perfectly at a point.
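
The two representations of a point feature can be contrasted in a short sketch. The code
below assumes a north-up grid with square cells whose origin is the upper-left corner; the
function names, the grid extent and the sample points are hypothetical.

```python
def point_to_cell(x, y, x_min, y_max, cell_size):
    """Return (row, col) of the cell containing (x, y); row 0 is the top row."""
    col = int((x - x_min) // cell_size)
    row = int((y_max - y) // cell_size)
    return row, col


def rasterize_points(points, x_min, y_max, cell_size, n_rows, n_cols, nodata=0):
    """Store each point's feature ID in the single cell that contains it."""
    grid = [[nodata] * n_cols for _ in range(n_rows)]
    for feature_id, (x, y) in points.items():
        row, col = point_to_cell(x, y, x_min, y_max, cell_size)
        if 0 <= row < n_rows and 0 <= col < n_cols:   # ignore points off the grid
            grid[row][col] = feature_id
    return grid


# Vector representation: feature ID mapped to x, y coordinates.
vector_points = {1: (2.5, 7.5), 2: (6.0, 1.0)}

# Raster representation: a 5 x 5 grid of 2-unit cells covering x 0..10, y 0..10.
raster = rasterize_points(vector_points, x_min=0.0, y_max=10.0,
                          cell_size=2.0, n_rows=5, n_cols=5)
for row in raster:
    print(row)
```

The vector form keeps the exact coordinates, while the raster form keeps only the cell in
which each point falls, so its positional precision is limited by the cell size.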

 Data Acquisition: Data acquisition is usually the first step in conducting a GIS
project. The need for geospatial data by GIS users has been linked to the development
of data clearinghouses and geoportals. Since the early 1990s, government agencies at
different levels in the United States as well as many other countries have set up
websites for sharing public data and for directing users to various data sources.

Data acquisition involves compilation of existing and new data. To be used in a GIS,
a newly digitized map or a map created from satellite images requires geometric
transformation (i.e., geo-referencing). Additionally, both existing and new spatial data
must be edited if they contain digitizing and/or topological errors.

 Attribute Data Management: A GIS usually employs a database management system
(DBMS) to handle attribute data, which can be large in size in the case of vector data.
Each polygon in a soil map, for example, can be associated with dozens of attributes
on the physical and chemical soil properties and soil interpretations. Attribute data are
stored in a relational database as a collection of tables. These tables can be prepared,
maintained, and edited separately, but they can also be linked for data search and
retrieval.
 Data Display: A routine GIS operation is mapmaking because maps are an interface
to GIS. Mapmaking can be informal or formal in GIS. It is informal when we view
geospatial data on maps, and formal when we produce maps for professional
presentations and reports. A professional map combines the title, map body, legend,
scale bar, and other elements together to convey geographic information to the map
reader.

To make a “good” map, we must have a basic understanding of map symbols, colors,
and typography, and their relationship to the mapped data. Additionally, we must be
familiar with map design principles such as layout and visual hierarchy. After a map
is composed in a GIS, it can be printed or saved as a graphic file for presentation. It
can also be converted to a KML file, imported into Google Earth, and shared publicly
on a web server.

 Data Exploration: Data exploration refers to the activities of visualizing,
manipulating, and querying data using maps, tables, and graphs. These activities offer
a close look at the data and function as a precursor to formal data analysis. Data
exploration in GIS can be map-based or feature-based. Map-based exploration includes data
classification, data aggregation, and map comparison.

Feature-based query can involve either attribute or spatial data. Attribute data query is
basically the same as database query using a DBMS. In contrast, spatial data query
allows GIS users to select features based on their spatial relationships such as
containment, intersect, and proximity. A combination of attribute and spatial data
queries provides a powerful tool for data exploration.
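
A combined attribute and spatial query can be sketched as follows. This is a simplified
illustration in which point features are stored as dictionaries and proximity is measured as
straight-line distance; the feature records, field names and function names are hypothetical.

```python
import math

# Hypothetical point features with attributes and x, y coordinates.
wells = [
    {"id": 1, "depth_m": 40, "xy": (2.0, 3.0)},
    {"id": 2, "depth_m": 120, "xy": (8.0, 1.0)},
    {"id": 3, "depth_m": 95, "xy": (3.0, 4.0)},
    {"id": 4, "depth_m": 150, "xy": (9.0, 9.0)},
]


def attribute_query(features, predicate):
    """Select features whose attributes satisfy the predicate."""
    return [f for f in features if predicate(f)]


def proximity_query(features, center, radius):
    """Select features located within radius of the center point."""
    cx, cy = center
    return [f for f in features
            if math.hypot(f["xy"][0] - cx, f["xy"][1] - cy) <= radius]


# Attribute query: wells at least 90 m deep.
deep = attribute_query(wells, lambda f: f["depth_m"] >= 90)

# Spatial query: wells within 3 units of the point (2, 2).
near = proximity_query(wells, center=(2.0, 2.0), radius=3.0)

# Combined query: deep wells that are also near the point.
deep_and_near = proximity_query(deep, center=(2.0, 2.0), radius=3.0)

print([f["id"] for f in deep])
print([f["id"] for f in near])
print([f["id"] for f in deep_and_near])
```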

 Data Analysis: A GIS has a large number of tools for data analysis. Some are basic
tools, meaning that they are regularly used by GIS users. Other tools tend to be
discipline or application specific. Two basic tools for vector data are buffering and
overlay: buffering creates buffer zones from select features, and overlay combines the
geometries and attributes of the input layers.

A vector-based overlay operation combines geometries and attributes from different layers to
create the output.

Four basic tools for raster data are local, neighbourhood, zonal, and global operations,
depending on whether the operation is performed at the level of individual cells, or
groups of cells, or cells within an entire raster.
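
The four kinds of raster operation can be illustrated on a small grid. The sketch below uses
plain Python lists; the function names, the 3 x 3 neighbourhood and the sample rasters are
assumptions made for the example and do not correspond to the tools of any particular GIS
package.

```python
def local_sum(raster_a, raster_b):
    """Local operation: combine two rasters cell by cell."""
    return [[a + b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(raster_a, raster_b)]


def focal_mean(raster, row, col):
    """Neighbourhood operation: mean of the 3 x 3 window around a cell,
    clipped at the raster edge."""
    values = [raster[r][c]
              for r in range(max(0, row - 1), min(len(raster), row + 2))
              for c in range(max(0, col - 1), min(len(raster[0]), col + 2))]
    return sum(values) / len(values)


def zonal_sum(raster, zones):
    """Zonal operation: sum of cell values within each zone ID."""
    totals = {}
    for value_row, zone_row in zip(raster, zones):
        for value, zone in zip(value_row, zone_row):
            totals[zone] = totals.get(zone, 0) + value
    return totals


def global_max(raster):
    """Global operation: a result computed from every cell in the raster."""
    return max(value for row in raster for value in row)


values = [[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]
zones = [[1, 1, 2],
         [1, 2, 2],
         [3, 3, 3]]

print(local_sum(values, values))
print(focal_mean(values, 1, 1), focal_mean(values, 0, 0))
print(zonal_sum(values, zones))
print(global_max(values))
```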

Basic operation of Raster Data
GIS SOFTWARE PRODUCTS:
The below table shows a select list of commercial GIS software in the left column and
free and open source software (FOSS) for GIS in the right column.

ArcGIS is composed of applications and extensions at three license levels. The
applications include ArcMap, ArcGIS Pro, ArcCatalog, ArcScene, and ArcGlobe, and the
extensions include 3D Analyst, Network Analyst, Spatial Analyst, Geostatistical Analyst, and
others.

GRASS GIS (Geographic Resources Analysis Support System), the first FOSS for
GIS, was originally developed by the U.S. Army Construction Engineering Research
Laboratories in the 1980s. Well known for its analysis tools, GRASS GIS is currently
maintained and developed by a worldwide network of users. Academicians, government
agencies (NASA, NOAA, USDA and USGS) and GIS practitioners use this open source
software because its code can be inspected and tailored to their needs.
SAGA GIS (System for Automated Geoscientific Analyses) is one of the classics in
the world of free GIS software. It started out primarily for terrain analysis such as
hillshading, watershed extraction and visibility analysis. Now, SAGA GIS is a powerhouse
because it delivers a fast growing set of geoscientific methods to the geoscientific
community.
GeoDa is a free GIS software program primarily used to introduce new users to
spatial data analysis. Its main functionality is exploratory statistical analysis of data. It
comes with sample data so that new users can try it out. From simple box-plots all the way to
regression statistics, GeoDa has a complete arsenal of statistics for spatial analysis.

APPLICATION OF GIS:
GIS is a useful tool because a high percentage of information we routinely encounter
has a spatial component. An often cited figure among GIS users is that 80 percent of data is
geographic. Since its beginning, GIS has been important for land use planning, natural hazard
assessment, wildlife habitat analysis, riparian zone monitoring, timber management, and
urban planning. The list of fields that have benefited from the use of GIS has expanded
significantly over the past three decades.
In the United States, the U.S. Geological Survey (USGS) is a leading agency in the
development and promotion of GIS. The USGS website provides case studies as well as
geospatial data for applications in climate and land use change, ecosystem analysis, geologic
mapping, petroleum resource assessment, watershed management, coastal zone management,
natural hazards (volcano, flood, and landslide), aquifer depletion, and ground water
management.
In the private sector, most GIS applications are integrated with the Internet, GPS,
wireless technology, and Web services. The following shows some of these applications:
 Online mapping websites offer locators for finding real estate listings, vacation
rentals, banks, restaurants, coffee shops, and hotels.
 Location-based services allow mobile phone users to search for nearby banks,
restaurants, and taxis; and to track friends, dates, children, and the elderly.
 Mobile GIS allows field workers to collect and access geospatial data in the field.
 Mobile resource management tools track and manage the location of field crews and
mobile assets in real time.
 Automotive navigation systems provide turn-by-turn guidance and optimal routes
based on precise road mapping using GPS and camera.
 Augmented reality lets a smart phone user look through the phone’s camera with
superimposed data or images (e.g., 3-D terrain from a GIS, monsters in Pokemon Go)
about the current location.

SCALES/LEVELS OF MEASUREMENTS:
Scales of measurement, or levels of measurement, form a system for classifying attribute
data into four categories, namely nominal, ordinal, interval and ratio.
 Nominal: In this level of measurement, the values of the variable are used only to
classify the data. Words, letters, and alpha-numeric symbols can be used. Suppose there
are data about people belonging to three different gender categories. In this case, a
person belonging to the female gender could be classified as F, a person belonging to the
male gender could be classified as M, and a transgender person as T. This type of
classification is the nominal level of measurement.
 Ordinal: This level of measurement depicts some ordered relationship among the
variable’s observations. Suppose a student scores the highest grade of 100 in the
class. In this case, he would be assigned the first rank. Then, another classmate
scores the second highest grade of 92; she would be assigned the second rank. A
third student scores 81 and he would be assigned the third rank, and so on. The
ordinal level of measurement indicates an ordering of the measurements.

Levels of Measurements
 Interval: The interval level of measurement not only classifies and orders the
measurements, but also specifies that the distances between adjacent values on the
scale are equal along the whole scale, from low to high. For example, on an interval
scale of anxiety, the difference between scores of 10 and 11 is the same as the
difference between scores of 40 and 41. A popular example of this level of measurement
is temperature in centigrade, where, for example, the distance between 94°C and
96°C is the same as the distance between 100°C and 102°C.
 Ratio: In this level of measurement, the observations, in addition to having equal
intervals, have a meaningful zero. A common geographic example of ratio
data is density (e.g. population density). Any percent value from 0 to 100 will
have a meaningful zero.
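
The level of measurement determines which summary operations are meaningful. The sketch
below pairs each level with one operation that is commonly considered valid for it; the sample
values and variable names are hypothetical, and the statistics module is part of the Python
standard library.

```python
import statistics

# Nominal: categories only, so counting and the mode are meaningful.
gender = ["F", "M", "T", "F", "F"]
most_common_gender = statistics.mode(gender)

# Ordinal: order is meaningful, so the median rank can be reported.
ranks = [1, 2, 3]
median_rank = statistics.median(ranks)

# Interval: differences are meaningful (temperatures in degrees Celsius).
difference_low = 96.0 - 94.0
difference_high = 102.0 - 100.0

# Ratio: a true zero makes ratios meaningful (population density per sq. km).
density_a, density_b = 50.0, 100.0
density_ratio = density_b / density_a

print(most_common_gender)
print(median_rank)
print(difference_low == difference_high)
print(density_ratio)
```

Each level also supports the operations of the levels before it, so ratio data can be
counted, ranked and differenced as well.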

*****

OCE552 – GEOGRAPHIC INFORMATION SYSTEM UNIT- II

UNIT II SPATIAL DATA MODELS

Database Structures – Relational, Object Oriented – ER diagram - spatial data models –
Raster Data Structures – Raster Data Compression - Vector Data Structures - Raster vs
Vector Models - TIN and GRID data models - OGC standards - Data Quality.

DATABASE MODEL:

A data model defines the logical structure of a database. Data models are the fundamental
means of introducing abstraction in a DBMS: they define how data items are connected to
each other and how they are processed and stored inside the system. There are a number of
different database data models. Amongst those that have been used for attribute data in GIS
are the hierarchical, network, relational, object-relational and object-oriented data models. Of
these, the relational data model has become the most widely used.

Relational Data Model:


Data are organized in a series of two-dimensional tables, each of which contains
records for one entity. These tables are linked by common data known as keys. Queries are
possible on individual tables or on groups of tables. For the Happy Valley data, the figure
below illustrates an example of one such table.

Relational database table data for Happy Valley

The data in a relational database are stored as a set of base tables with the
characteristics described above. Other tables are created as the database is queried and these
represent virtual views. The table structure is extremely flexible and allows a wide variety of
queries on the data. Queries are possible on one table at a time (for example, you might ask
‘which hotels have more than 14 rooms?’ or ‘which hotels are luxury standard?’), or on more
than one table by linking through key fields (for instance, ‘which passengers originating from
the UK are staying in luxury hotels?’ or ‘which ski lessons have pupils who are over 50 years
of age?’). Queries generate further tables, but these new tables are not usually stored. There
are few restrictions on the types of query possible.

Database terminology applied to Happy Valley table


With many relational databases querying is facilitated by menu systems and icons, or
‘query by example’ systems. Frequently, queries are built up of expressions based on
relational algebra, using commands such as SELECT (to select a subset of rows), PROJECT
(to select a subset of columns) or JOIN (to join tables based on key fields). SQL (Structured
Query Language) has been developed to facilitate the querying of relational databases. The
advantages of SQL for database users are its completeness, simplicity, pseudo
English-language style and wide application. However, SQL was not really developed to handle
geographical concepts such as ‘near to’, ‘far from’ or ‘connected to’.
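
The single-table and multi-table queries described above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table names, column names and records below are assumptions made up for the example and are not the actual Happy Valley data.

```python
import sqlite3

# Hypothetical Happy Valley tables; every name and value here is illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE hotel (hotel_id INTEGER PRIMARY KEY, name TEXT,
                        standard TEXT, rooms INTEGER);
    CREATE TABLE guest (guest_id INTEGER PRIMARY KEY, name TEXT, origin TEXT,
                        hotel_id INTEGER REFERENCES hotel(hotel_id));
    INSERT INTO hotel VALUES (1, 'Mountain View', 'luxury', 15),
                             (2, 'Palace Lodge', 'budget', 12),
                             (3, 'Ski Lodge', 'luxury', 10);
    INSERT INTO guest VALUES (1, 'Smith', 'UK', 1),
                             (2, 'Schmidt', 'Germany', 1),
                             (3, 'Jones', 'UK', 2);
""")

# Query on one table: WHERE selects rows, the column list projects columns.
big_hotels = [row[0] for row in
              conn.execute("SELECT name FROM hotel WHERE rooms > 14")]

# Query on two tables, linked through the key field hotel_id.
uk_luxury = [row[0] for row in conn.execute(
    "SELECT guest.name FROM guest "
    "JOIN hotel ON guest.hotel_id = hotel.hotel_id "
    "WHERE guest.origin = 'UK' AND hotel.standard = 'luxury'")]

print(big_hotels)  # hotels with more than 14 rooms
print(uk_luxury)   # UK guests staying in luxury hotels
```

The result of each query is itself a small table of rows, which is used here and then discarded rather than stored.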

ER Diagram:
An entity–relationship model (ER model) describes the structure of a database with
the help of a diagram, known as an entity–relationship diagram (ER diagram). An ER
model is a design or blueprint that can later be implemented as a database. The ER
model is best used for the conceptual design of a database. The main components of the
ER model are:

• Entity − An entity in an ER model is a real-world object having properties
called attributes. Every attribute is defined by its set of permitted values, called its
domain. For example, in a school database, a student is considered an entity. A student
has various attributes such as name, age and class.
• Relationship − The logical association among entities is called a relationship.
Relationships are mapped to entities in various ways. Mapping cardinalities define
the number of associations between two entities. The mapping cardinalities are
one to one, one to many, many to one and many to many.

The following are the various symbols used in an ER diagram:


The figure shows the ER diagram for the GPS tracking system. The design has three
entities namely User-generated Data, Publisher and Subscriber.

ER Diagram of GPS System

SPATIAL DATA MODEL:

Raster Data Model:


The raster spatial data model is one of a family of spatial data models described as
tessellations. In the raster world the basic building block is the individual grid cell: cells
are used to create images of point, line, area, network and surface entities, and the shape
and character of an entity is created by the grouping of cells. The size of the grid cell is
very important as it influences how an entity appears.

Representation of Spatial Features:


The vector data model uses the geometric objects of point, line, and polygon to
represent spatial features. A point has zero dimension and has only the property of location.
A point feature is made of a point or a set of points. Wells, benchmarks, and gravel pits on a
topographic map are examples of point features. A line is one-dimensional and has the
property of length, in addition to location. A line has two end points and may have additional
points in between to mark the shape of the line. A polygon is two-dimensional and has the
properties of area (size) and perimeter, in addition to location. Its perimeter, or boundary,
is made of connected, closed, nonintersecting lines and defines the area of the polygon.
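
The properties listed above (location only for a point, length for a line, area and perimeter for a polygon) can be computed directly from the stored co-ordinates. The following is a minimal sketch; the feature names and co-ordinates are invented for illustration, and the polygon ring is assumed to be closed (its first and last points are the same).

```python
import math

def line_length(points):
    """Length of a line given as an ordered list of (x, y) points."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))

def ring_area(ring):
    """Area of a closed, non-self-intersecting ring (shoelace formula)."""
    twice_area = sum(x1 * y2 - x2 * y1
                     for (x1, y1), (x2, y2) in zip(ring, ring[1:]))
    return abs(twice_area) / 2

# Hypothetical features.
well = (3.0, 4.0)                                # point: location only
road = [(0, 0), (3, 4), (6, 4)]                  # line: two end points plus one vertex
field = [(0, 0), (4, 0), (4, 3), (0, 3), (0, 0)]  # polygon: closed ring

road_length = line_length(road)
field_perimeter = line_length(field)
field_area = ring_area(field)
print(road_length, field_perimeter, field_area)
```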

Topology:
Topology refers to the study of those properties of geometric objects that remain
invariant under certain transformations such as bending or stretching. An example of a
topological map is a subway map.
A subway map depicts correctly the connectivity between the subway lines and
stations on each line but has distortions in distance and direction. In GIS, vector data can be
topological or non-topological, depending on whether topology is built into the data or not.
Topology can be explained through directed graphs (digraphs), which show the arrangements
of geometric objects and the relationships among objects. An edge or arc is a directed line
with a starting point and an ending point. The end points of an arc are nodes, and
intermediate points, if any, are vertices. A face is a polygon bounded by arcs. If an
arc joins two nodes, the nodes are said to be adjacent and incident with the arc.
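
As a rough sketch of how nodes, arcs and faces can be stored, the table below records for each arc its start node, end node, and the faces on its left and right. The layout (a square split into a left face A and a right face B) and the left/right convention are assumptions made for this example only.

```python
# arc id -> (from_node, to_node, left_face, right_face); hypothetical data.
# n1 is the top-middle node, n2 the bottom-middle node of a split square.
arcs = {
    "a1": ("n1", "n2", "A", "outside"),   # around the left side
    "a2": ("n1", "n2", "outside", "B"),   # around the right side
    "a3": ("n1", "n2", "B", "A"),         # shared middle boundary
}

def adjacent_faces(arcs):
    """Pairs of faces that share at least one arc."""
    pairs = set()
    for _from, _to, left, right in arcs.values():
        if left != right:
            pairs.add(frozenset((left, right)))
    return pairs

def arcs_bounding(arcs, face):
    """Arcs that form the boundary of a face."""
    return sorted(arc_id for arc_id, (_f, _t, left, right) in arcs.items()
                  if face in (left, right))

def adjacent_nodes(arcs):
    """Pairs of nodes joined by an arc."""
    return {frozenset((f, t)) for f, t, _l, _r in arcs.values()}

print(adjacent_faces(arcs))
print(arcs_bounding(arcs, "A"))
```

Because the shared boundary a3 is stored once and names both faces, the adjacency of A and B can be read from the table without comparing any co-ordinates.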

TIGER:
An early application example of topology is the Topologically Integrated Geographic
Encoding and Referencing (TIGER) database from the U.S. Census Bureau. The TIGER
database links statistical area boundaries such as counties, census tracts, and block groups to
roads, railroads, rivers, and other features by topology.

Topology in the TIGER database involves nodes, arcs and faces.

Topology has three main advantages. First, it ensures data quality and integrity.
Second, topology can enhance GIS analysis. Third, topological relationships between spatial
features allow GIS users to perform spatial data queries.

Vector Data Model:


A vector spatial data model uses two-dimensional Cartesian (x,y) co-ordinates to store
the shape of a spatial entity. In the vector world the point is the basic building block from
which all spatial entities are constructed. The simplest spatial entity, the point, is represented
by a single (x,y) co-ordinate pair. Line and area entities are constructed by connecting a
series of points into chains and polygons.

The more complex the shape of a line or area feature the greater the number of points
required to represent it. Selecting the appropriate number of points to construct an entity is
one of the major dilemmas when using the vector approach.

Raster and vector spatial data


If too few points are chosen the character, shape and spatial properties of the entity
(for example, area, length, perimeter) will be compromised. If too many points are used,
unnecessary duplicate information will be stored and this will be costly in terms of data
capture and computer storage.

Effect of changing resolution in the vector and raster worlds

SPATIAL DATA STRUCTURES:
Data structures provide the information that the computer requires to reconstruct the
spatial data model in digital form. There are many different data structures in use in GIS. This
diversity is one of the reasons why exchanging spatial data between different GIS software
packages can be problematic. However, despite this diversity, data structures can be classified
according to whether they are used to structure raster or vector data.

Raster data structures:


In the raster world a range of different methods is used to encode a spatial entity for
storage and representation in the computer. The figure below shows the most straightforward
method of coding raster data. The cells in each line of the image (Figure: a) are mirrored by
an equivalent row of numbers in the file structure (Figure: c). The first line of the file tells the
computer that the image consists of 10 rows and 10 columns and that the maximum cell value
is 1. In this example, a value of 0 has been used to record cells where the entity is not present
and a value of 1 for cells where the entity is present (Figure: b).
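
A file of this kind can be read with a few lines of code. The sketch below assumes a comma-separated layout (a header line giving rows, columns and maximum value, followed by one line per row of cells); the exact file layout in the figure may differ, and a smaller 4 x 5 grid is used here to keep the example short.

```python
def read_simple_raster(text):
    """Parse a header line 'rows,cols,max_value' followed by rows of cell values."""
    lines = [line.strip() for line in text.strip().splitlines()]
    n_rows, n_cols, max_value = (int(v) for v in lines[0].split(","))
    grid = [[int(v) for v in line.split(",")] for line in lines[1:]]
    # Check the body of the file against what the header promised.
    if len(grid) != n_rows or any(len(row) != n_cols for row in grid):
        raise ValueError("grid size does not match the header")
    if max(max(row) for row in grid) > max_value:
        raise ValueError("cell value exceeds the maximum given in the header")
    return grid

# Hypothetical file contents: 0 = entity absent, 1 = entity present.
raster_text = """
4,5,1
0,0,0,0,0
0,1,1,0,0
0,1,1,1,0
0,0,0,0,0
"""

grid = read_simple_raster(raster_text)
entity_cells = sum(sum(row) for row in grid)  # number of cells holding the entity
print(entity_cells)
```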

A simple raster data structure

In a simple raster data structure, such as illustrated in the above figure, different
spatial features must be stored as separate data layers. Thus, to store more raster entities,
separate data files would be required, each representing a different layer of spatial data.
However, if the entities do not occupy the same geographic location (or cells in the raster
model), then it is possible to store them all in a single layer, with an entity code given to each
cell. This code informs the user which entity is present in which cell.

Feature coding of cells in the raster world
The figure above shows how different land uses can be coded in a single raster layer. The values
1, 2 and 3 have been used to classify the raster cells according to the land use present at a
given location. The value 1 represents residential area; 2, forest; and 3, farmland.
One of the major problems with raster data sets is their size, because a value must be
recorded and stored for each cell in an image. Thus, a complex image made up of a mosaic of
different features (such as a soil map with 20 distinct classes) requires the same amount of
storage space as a similar raster map showing the location of a single forest. To address this
problem a range of data compaction methods have been developed.

Vector Data Structure:


There are many potential vector data structures that can be used to store the geometric
representation of entities in the computer. The simplest vector data structure that can be used
to reproduce a geographical image in the computer is a file containing (x,y) co-ordinate pairs
that represent the location of individual point features (or the points used to construct lines or
areas).

Data structures in the vector world: (a) simple data structure; (b) point dictionary

The figure above (Figure: a) shows such a vector data structure for the Happy Valley car park.
Note how a closed ring of co-ordinate pairs defines the boundary of the polygon. The
limitations of simple vector data structures start to emerge when more complex spatial
entities are considered. For example, consider the Happy Valley car park divided into
different parking zones (Figure: b). The car park consists of a number of adjacent polygons.
If the simple data structure, illustrated in Figure: a, were used to capture this entity then the
boundary line shared between adjacent polygons would be stored twice. This may not appear
too much of a problem in the case of this example, but consider the implications for a map of
the 50 states in the USA.


The amount of duplicate data would be considerable. This method can be improved
by letting adjacent polygons share common co-ordinate pairs (points). To do this, all points in the
data structure must be numbered sequentially, and the structure must contain an explicit
reference which records which points are associated with which polygon. This is known as a
point dictionary. The data structure in Figure: b shows how such an approach has been used to
store data for the different zones in the Happy Valley car park.
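
A point dictionary can be sketched with two lookup tables: one holding each numbered point once, and one listing the point numbers that make up each polygon. The zone names and co-ordinates below are invented for illustration and are not taken from the figure.

```python
# Point dictionary: every numbered point is stored exactly once.
points = {
    1: (0, 0), 2: (10, 0), 3: (20, 0),
    4: (20, 10), 5: (10, 10), 6: (0, 10),
}

# Each polygon refers to its boundary points by number (hypothetical zones).
zones = {
    "staff": [1, 2, 5, 6],
    "visitors": [2, 3, 4, 5],
}

def ring_coordinates(zone):
    """Rebuild the closed ring of co-ordinate pairs for a zone."""
    ids = zones[zone]
    return [points[i] for i in ids] + [points[ids[0]]]

def shared_points(zone_a, zone_b):
    """Point numbers used by both zones, i.e. their common boundary."""
    return sorted(set(zones[zone_a]) & set(zones[zone_b]))

# Co-ordinate pairs needed if every zone kept its own copy of its corners.
simple_count = sum(len(ids) for ids in zones.values())

print(len(points), simple_count)
print(shared_points("staff", "visitors"))
```

Here six stored points replace the eight that the simple structure would need, and the shared boundary is found by comparing point numbers rather than co-ordinates.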


There is a considerable range of topological data structures in use by GIS. All the
structures available try to ensure that:
• no node or line segment is duplicated;
• line segments and nodes can be referenced to more than one polygon;
• all polygons have unique identifiers; and
• island and hole polygons can be adequately represented.

RASTER DATA COMPRESSION:


Data compression refers to the reduction of data volume, a topic particularly
important for data delivery and Web mapping. Data compression is related to how raster data
are encoded. Quadtree and RLE, because of their efficiency in data encoding, can also be
considered as data compression methods.
A variety of techniques are available for data compression. They can be lossless or
lossy. A lossless compression preserves the cell or pixel values and allows the original raster
or image to be precisely reconstructed. Therefore, lossless compression is desirable for raster
data that are used for analysis or deriving new data. RLE is an example of lossless
compression. Other lossless methods include LZW (Lempel-Ziv-Welch) and related members
of the Lempel-Ziv family (e.g., LZ77, LZMA).
A lossy compression cannot reconstruct fully the original image but can achieve
higher compression ratios than a lossless compression. Lossy compression is therefore useful
for raster data that are used as background images rather than for analysis. Image degradation
through lossy compression can affect GIS-related tasks such as extracting ground control
points from aerial photographs or satellite images for the purpose of georeferencing.
Run length encoding:
Run length encoding stores cells on a row-by-row basis. Instead of recording each
individual cell’s value, it records each run of adjacent cells in a row that share the same
value as a single value together with the length of the run.
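
A minimal sketch of run length encoding for one raster row is given below. Storing each run as a (value, run length) pair is one common convention and is assumed here; other implementations store start and end columns instead.

```python
from itertools import groupby

def rle_encode_row(row):
    """Encode a row of cell values as (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(row)]

def rle_decode_row(pairs):
    """Rebuild the original row from (value, run_length) pairs."""
    return [value for value, count in pairs for _ in range(count)]

row = [0, 0, 0, 1, 1, 1, 1, 0, 0, 0]
encoded = rle_encode_row(row)
decoded = rle_decode_row(encoded)
print(encoded)
```

Decoding returns exactly the original row, which is why RLE counts as a lossless method.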

Block coding:
Block coding extends the run length encoding technique to two dimensions. To reduce
redundancy, it subdivides the raster image into hierarchical square blocks of cells that share
the same value and stores each block once rather than cell by cell.

Chain Coding:
Chain coding defines the outer boundary of a region using relative positions from a start point.
The sequence of moves around the exterior is stored, ending where the boundary returns to the
start point. During encoding, each direction is stored as an integer. Using the cardinal
directions for simplicity, for example, the value 0 is north and 1 is east.
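
The idea can be sketched as follows. The codes 0 (north) and 1 (east) come from the text; the codes 2 (south) and 3 (west), and the storage of each move as a (direction, number of cells) pair, are assumptions made for this example.

```python
# Direction code -> (dx, dy); y is taken to increase northwards.
MOVES = {0: (0, 1), 1: (1, 0), 2: (0, -1), 3: (-1, 0)}  # N, E, S, W

def decode_chain(start, chain):
    """Turn a start point and (direction, steps) pairs into boundary corner points."""
    x, y = start
    path = [(x, y)]
    for direction, steps in chain:
        dx, dy = MOVES[direction]
        x, y = x + dx * steps, y + dy * steps
        path.append((x, y))
    return path

# Hypothetical boundary: 2 cells north, 3 east, 2 south, 3 west.
chain = [(0, 2), (1, 3), (2, 2), (3, 3)]
path = decode_chain((4, 3), chain)
is_closed = path[0] == path[-1]
print(path, is_closed)
```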

Quadtree encoding:
Quadtrees are raster data structures based on the successive reduction of
homogeneous cells. A quadtree recursively subdivides a raster image into quarters. The
subdivision process continues until every quadrant is homogeneous, that is, until each cell
is classed.
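
The recursive subdivision can be sketched as below. The example assumes a square grid whose side is a power of two, and represents a homogeneous quadrant by its single value and a mixed quadrant by a list of four children in the order NW, NE, SW, SE; both choices are assumptions for illustration.

```python
def build_quadtree(grid, row=0, col=0, size=None):
    """Return a value for a homogeneous block, else a list of four sub-quadrants."""
    if size is None:
        size = len(grid)
    first = grid[row][col]
    if all(grid[r][c] == first
           for r in range(row, row + size)
           for c in range(col, col + size)):
        return first                                          # leaf: one value
    half = size // 2
    return [build_quadtree(grid, row, col, half),                # NW
            build_quadtree(grid, row, col + half, half),         # NE
            build_quadtree(grid, row + half, col, half),         # SW
            build_quadtree(grid, row + half, col + half, half)]  # SE

def count_leaves(node):
    """Number of stored values in the tree."""
    if not isinstance(node, list):
        return 1
    return sum(count_leaves(child) for child in node)

grid = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [0, 0, 1, 1],
]
tree = build_quadtree(grid)
leaves = count_leaves(tree)
print(tree, leaves)
```

In this small example the 16 cells are reduced to 7 stored values, because three of the four top-level quadrants are homogeneous.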

MrSID uses the wavelet transform for data compression. Wavelet-based
compression is also used by JPEG 2000 and ECW (Enhanced Compressed Wavelet). The
wavelet transform treats an image as a wave and progressively decomposes the wave into
simpler wavelets (Addison 2002). Using a wavelet (mathematical) function, the transform
repetitively averages groups of adjacent pixels (e.g., 2, 4, 6, 8, or more) and, at the same time,
records the differences between the original pixel values and the average. The differences,
also called wavelet coefficients, can be 0, greater than 0, or less than 0. In parts of an image
that have few significant variations, most pixels will have coefficients of 0 or very close to 0.
To save data storage, these parts of the image can be stored at lower resolutions by rounding
off low coefficients to 0, but storage at higher resolutions is required for parts of the same
image that have significant variations (i.e., more details). Box 4.4 shows a simple example of
using the Haar function for the wavelet transform.
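
The averaging and differencing described above can be sketched for one level of the Haar transform on a single row of pixels. The pixel values and the rounding threshold are invented for illustration, and this is not the actual MrSID algorithm, only the basic idea behind it.

```python
def haar_step(values):
    """One Haar level: average each pair and keep the first value's difference from it."""
    pairs = list(zip(values[0::2], values[1::2]))
    averages = [(a + b) / 2 for a, b in pairs]
    details = [(a - b) / 2 for a, b in pairs]   # equals a minus the pair average
    return averages, details

def haar_inverse(averages, details):
    """Rebuild the pixel values from averages and wavelet coefficients."""
    values = []
    for mean, diff in zip(averages, details):
        values += [mean + diff, mean - diff]
    return values

pixels = [10, 10, 11, 9, 2, 20, 5, 5]
averages, details = haar_step(pixels)

# Lossless: all coefficients kept, so the row is rebuilt exactly.
restored = haar_inverse(averages, details)

# Lossy: coefficients of small magnitude (threshold assumed to be 1) are rounded to 0.
rounded = [d if abs(d) > 1 else 0 for d in details]
approximate = haar_inverse(averages, rounded)

print(averages, details)
print(approximate)
```

Only the pair (11, 9), which differs slightly, is smoothed to (10, 10); the pair with the large variation (2, 20) keeps its coefficient and is reproduced exactly.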

The Haar wavelet and the wavelet transform.


(a) Three Haar wavelets at three scales (resolutions). (b) A simple example of the wavelet transform.

VECTOR vs RASTER:

Vector                                   | Raster
Usually complex.                         | Usually simple.
Difficult for overlay operation.         | Efficient for overlay operation.
High spatial variability is              | High spatial variability is
inefficiently represented.               | efficiently represented.
Small file size.                         | Large file size.
Often used for representing discrete     | Widely used for representing
features with definable boundaries.      | continuous spatial features.

DIGITAL TERRAIN MODELLING:


The abbreviation DTM is used to describe a digital data set which is used to model a
topographic surface (a surface representing height data). To model a surface accurately it
would be necessary to store an almost infinite number of observations. Since this is
impossible, a surface model approximates a continuous surface using a finite number of
observations. Thus, an appropriate number of observations must be selected, along with their
geographical location.
The ‘resolution’ of a DTM is determined by the frequency of observations used.
DTMs are created from a series of either regularly or irregularly spaced (x,y,z) data points
(where x and y are the horizontal co-ordinates and z is the vertical or height co-ordinate).
DTMs may be derived from a number of data sources. These include contour and spot height
information found on topographic maps, stereoscopic aerial photography, satellite images and
field surveys.
Triangulated Irregular Networks:
A commonly used data structure in GIS software is the triangulated irregular network
(TIN). It is one of the standard implementation techniques for digital terrain models, but it can be
used to represent any continuous field. The principles behind a TIN are simple. It is built
from a set of locations for which we have a measurement, for instance an elevation. The
locations can be arbitrarily scattered in space and are usually not on a regular grid. Any
location together with its elevation value can be viewed as a point in three-dimensional space.
This is illustrated in the figure below. From these 3D points, we can construct an irregular
tessellation made of triangles.

Input locations and their (elevation) values for a TIN construction.


In three-dimensional space, three points uniquely determine a plane, as long as they
are not collinear, i.e. they must not be positioned on the same line. A plane fitted through these
points has a fixed aspect and gradient and can be used to compute an approximation of the
elevation of other locations. Since we can pick many triples of points, we can construct many
such planes and therefore obtain many elevation approximations for a single location
such as P. So, it is wise to restrict the use of a plane to the triangular area between the three
points.
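
The plane through three anchor points, restricted to their triangle, can be evaluated with barycentric weights. The sketch below uses invented anchor co-ordinates and elevations, and returns None for a location outside the triangle; both choices are assumptions made for illustration.

```python
def interpolate_in_triangle(p, a, b, c):
    """Elevation at p = (x, y) on the plane through a, b, c, each given as (x, y, z).

    Returns None when p lies outside the triangle a-b-c.
    """
    x, y = p
    (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) = a, b, c
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    if det == 0:
        raise ValueError("anchor points are collinear: no unique plane")
    # Barycentric weights of p with respect to the three anchors.
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    w3 = 1 - w1 - w2
    if min(w1, w2, w3) < 0:
        return None          # outside the triangle: this plane should not be used
    return w1 * z1 + w2 * z2 + w3 * z3

# Hypothetical anchors (x, y, elevation).
a, b, c = (0, 0, 100), (10, 0, 200), (0, 10, 300)
inside = interpolate_in_triangle((2, 2), a, b, c)
outside = interpolate_in_triangle((8, 8), a, b, c)
print(inside, outside)
```

For these anchors the plane is z = 100 + 10x + 20y, so the value at (2, 2) should be close to 160.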
If we restrict the use of a plane to the area between its three anchor points, we obtain a
triangular tessellation of the complete study space. Unfortunately, there are many different
tessellations for a given input set of anchor points. Some tessellations are better than others,
in the sense that they make smaller errors of elevation approximation. For instance, if we base
our elevation computation for location P on the left-hand shaded triangle, we will get
a different value than from the right-hand shaded triangle. The second will provide a better
approximation because the average distance from P to the three triangle anchors is smaller.
The triangulation shown in the figure below (b) happens to be a Delaunay triangulation, which in a
sense is an optimal triangulation. There are multiple ways of defining what such a
triangulation is, but it suffices here to state two important properties. The first is that the
triangles are as equilateral (‘equal-sided’) as they can be, given the set of anchor points. The
second property is that for each triangle, the circumcircle through its three anchor points does
not contain any other anchor point. One such circumcircle is depicted on the right of Figure
(b).
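
The second property (the empty circumcircle) can be checked with a standard determinant test. The sketch below uses four invented anchor points and compares the two possible triangulations of the quadrilateral they form; it only tests a given triangulation and does not construct one.

```python
def orientation(a, b, c):
    """Positive if a, b, c are in counter-clockwise order."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def in_circumcircle(a, b, c, d):
    """True if point d lies strictly inside the circumcircle of triangle a-b-c."""
    rows = []
    for px, py in (a, b, c):
        dx, dy = px - d[0], py - d[1]
        rows.append((dx, dy, dx * dx + dy * dy))
    (a1, a2, a3), (b1, b2, b3), (c1, c2, c3) = rows
    det = (a1 * (b2 * c3 - b3 * c2)
           - a2 * (b1 * c3 - b3 * c1)
           + a3 * (b1 * c2 - b2 * c1))
    if orientation(a, b, c) < 0:   # make the test independent of vertex order
        det = -det
    return det > 0

def is_delaunay(points, triangles):
    """Check that no anchor point lies inside any triangle's circumcircle."""
    for tri in triangles:
        a, b, c = (points[i] for i in tri)
        for i, p in enumerate(points):
            if i not in tri and in_circumcircle(a, b, c, p):
                return False
    return True

# Hypothetical anchor points forming a convex quadrilateral.
points = [(0, 0), (4, 0), (0, 4), (3, 3)]
stretched = [(0, 1, 2), (1, 3, 2)]   # split along the diagonal from (4, 0) to (0, 4)
better = [(0, 1, 3), (0, 3, 2)]      # split along the diagonal from (0, 0) to (3, 3)
print(is_delaunay(points, stretched), is_delaunay(points, better))
```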

Two triangulations based on the input locations (a) one with many ‘stretched’ triangles;
(b) the triangles are more equilateral – Delaunay triangulation.

A TIN clearly is a vector representation: each anchor point has a stored georeference.
Yet, we might also call it an irregular tessellation, as the chosen triangulation provides a
partitioning of the entire study space. However, in this case, the cells do not have an
associated stored value as is typical of tessellations, but rather a simple interpolation function
that uses the elevation values of its three anchor points.



GIS DATA STANDARDS:


The number of formats available for GIS data is almost as large as the number of GIS
packages on the market. This makes the sharing of data difficult and means that data created
on one system is not always easily read by another system. This problem has been addressed
in the past by including data conversion functions in GIS software. These conversion
functions adopt commonly used exchange formats such as DXF and E00.

Open Geospatial Consortium (OGC):


There is still no universally accepted GIS data standard, although the Open
Geospatial Consortium (OGC), formed in 1994 by a group of leading GIS software and data
vendors, is working to deliver spatial interface specifications that are available for global use
(OGC, 2001). The OGC has proposed the Geography Markup Language (GML) as a new
GIS data standard.
The Geography Markup Language (GML) is a non-proprietary computer language
designed specifically for the transfer of spatial data over the Internet. GML is based on XML
(eXtensible Markup Language), the standard language of the Internet, and allows the
exchange of spatial information and the construction of distributed spatial relationships.
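
Because GML is XML, a GML fragment can be read with any XML parser. The sketch below parses a single point with Python's standard library; the fragment and its co-ordinate values are made up for the example, and the namespace shown is the one used by GML 3.1 (GML 3.2 uses a different namespace URI).

```python
import xml.etree.ElementTree as ET

GML_NS = {"gml": "http://www.opengis.net/gml"}

# Hypothetical GML fragment describing one point feature geometry.
gml_fragment = """
<gml:Point xmlns:gml="http://www.opengis.net/gml" srsName="EPSG:4326">
  <gml:pos>9.92 78.12</gml:pos>
</gml:Point>
"""

point = ET.fromstring(gml_fragment.strip())
srs = point.get("srsName")                      # spatial reference system label
pos_text = point.find("gml:pos", GML_NS).text   # co-ordinates as plain text
first, second = (float(v) for v in pos_text.split())
print(srs, first, second)
```

The co-ordinates travel as ordinary text inside tags, so any system that can read XML can extract them; the meaning and order of the two numbers depend on the reference system named in srsName.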
GML has been proposed by the Open Geospatial Consortium as a universal spatial
data standard. GML is likely to become very widely used because it is:
• Internet friendly;
• not tied to any proprietary GIS;
• specifically designed for feature-based spatial data;
• open to use by anyone;
• compatible with industry-wide IT standards.
It is also likely to set the standard for the delivery of spatial information content to
PDA and WAP devices, and so form an important component of mobile and location-based
service (LBS) GIS technologies. The collection of geoportals and various other complementary
services creates a Spatial Data Infrastructure (SDI).

Spatial Data Infrastructure (SDI):


An SDI is used to represent all the components that enable access to spatial data,
including relevant technologies, policies and institutional arrangements. Using electronic
media, SDIs connect nationally distributed repositories of geospatial information and make
them available on a device through a single entry point often referred to as a 'geoportal'. They
enable data providers and users to participate in the digital spatial community at a national
scale and provide a basis for spatial data discovery, evaluation and application for users
within government, the commercial and non-profit sectors, and academia, and for citizens in
general. The Global Spatial Data Infrastructure (GSDI) Association links national SDIs to
establish a connection for all users in the world to share and reuse the available datasets.

Data Accuracy:
In GIS, data quality is used to give an indication of how good data are. It describes
the overall fitness or suitability of data for a specific purpose or is used to indicate data free
from errors and other problems. Examining issues such as error, accuracy, precision and bias
can help to assess the quality of individual data sets. In addition, the resolution and
generalization of source data, and the data model used, may influence the portrayal of
features of interest. Data sets used for analysis need to be complete, compatible and
consistent, and applicable for the analysis being performed.
Accuracy is the extent to which an estimated data value approaches its true value
(Aronoff, 1989). If a GIS database is accurate, it is a true representation of reality. It is
impossible for a GIS database to be 100 per cent accurate, though it is possible to have data
that are accurate to within specified tolerances. For example, a ski lift station co-ordinate may
be accurate to within plus or minus 10 metres.
Several types of error can arise when accuracy and/or precision requirements are not
met during data capture and creation. The main types of error in a geospatial dataset
relate to:
Positional Accuracy:
The identification of positional accuracy is important. This includes consideration of
inherent error (source error) and operational error (introduced error). A more detailed review
is provided in the next section.
Attribute Accuracy:
Consideration of the accuracy of attributes also helps to define the quality of the data.
This quality component concerns the identification of the reliability, or level of purity
(homogeneity), in a data set.
Logical Consistency:
This component is concerned with determining the faithfulness of the data structure
for a data set. This typically involves spatial data inconsistencies such as incorrect line
intersections, duplicate lines or boundaries, or gaps in lines. These are referred to as spatial
or topological errors.
Completeness:
The final quality component involves a statement about the completeness of the data
set. This includes consideration of holes in the data, unclassified areas, and any compilation
procedures that may have caused data to be eliminated.

*****

OCE552 – GEOGRAPHIC INFORMATION SYSTEM UNIT- III

UNIT III DATA INPUT AND TOPOLOGY

Scanner - Raster Data Input – Raster Data File Formats – Vector Data Input –Digitiser –
Topology - Adjacency, connectivity and containment – Topological Consistency rules –
Attribute Data linking – ODBC – GPS - Concept GPS based mapping.

Introduction:
Data encoding is the process of getting data into the computer. It is a process that is
fundamental to almost every GIS project. For example:

• An archaeologist may encode aerial photographs of ancient remains to integrate with
newly collected field data.
• A planner may digitize outlines of new buildings and plot these on existing
topographical data.
• An ecologist may add new remotely sensed data to a GIS to examine changes in
habitats.
• A historian may scan historical maps to create a virtual city from the past.
• A utility company may encode changes in pipeline data to record changes and
upgrades to their pipe network.

Once in a GIS, data almost always need to be corrected and manipulated to ensure
that they can be structured according to the required data model. Problems that may have to
be addressed at this stage of a GIS project include:
• the re-projection of data from different map sources to a common projection;
• the generalization of complex data to provide a simpler data set; or
• the matching and joining of adjacent map sheets once the data are in digital form.

This unit looks in detail at the range of methods available to get data into a GIS.
These include keyboard entry, digitizing, scanning and electronic data transfer. Then,
methods of data editing and manipulation are reviewed, including re-projection,
transformation and edge matching. The whole process of data encoding and editing is often
called the ‘data stream’.
Analogue data are normally in paper form, and include paper maps, tables of statistics
and hard-copy (printed) aerial photographs. These data all need to be converted to digital
form before use in a GIS, thus the data encoding and correction procedures are longer than
those for digital data. Digital data are already in computer-readable formats and are supplied
on CD-ROM or across a computer network. Map data, aerial photographs, satellite imagery,
data from databases and automatic data collection devices (such as data loggers and GPS) are
all available in digital form.


Figure: The Data Stream



SCANNER:
Scanning converts paper maps into digital format by capturing features as individual
cells, or pixels, producing a digital image automatically. Maps are generally considered the
backbone of any GIS activity, but paper maps are often not available in a form that can be
readily used by computers. Most paper maps were prepared on the basis of old
conventional surveys. New maps can be produced using improved technologies, but this
takes time and increases the volume of work. Thus, we have to resort to the available
maps. These paper maps first have to be converted into a digital format usable by the
computer. This is a critical step as the quality of the analog document must be preserved in
the transition to the computer domain.

The technology used for this kind of conversions is known as scanning and the
instrument used for this kind of operation is known as a scanner. A scanner can be thought of
as an electronic input device that converts analog information of a document like a map,
photograph or an overlay into a digital format that can be used by the computer. Scanning
automatically captures map features, text, and symbols as individual cells, or pixels, and
produces an automated image.

Working of a Scanner:
The most important component inside a scanner is the scanner head, which can move
along the length of the scanner. The scanner head contains either a charge-coupled device
(CCD) sensor or a contact image sensor (CIS). A CCD consists of a number of photosensitive
cells or pixels packed together on a chip. The most advanced large format scanners use
CCDs with 8000 pixels per chip to provide very good image quality.
While scanning, a bright white light from the scanner strikes the image to be scanned
and is reflected onto the photosensitive surface of the sensor placed on the scanner head.
Each pixel transfers a gray tone value (a value given to the different shades of black in the
image, ranging from 0 (black) to 255 (white), i.e. 256 values) to the scan board (software).
The software interprets the value in terms of 0 (black) or 1 (white), thereby forming a
monochrome image of the scanned portion. As the head moves ahead, it scans the image in
tiny strips and the sensor continues to store the information in a sequential fashion. The
software running the scanner pieces together the information from the sensor into a digital
form of the image. This type of scanning is known as one-pass scanning.
Scanning a colour image is slightly different, because the scanner head has to scan the
same image for three different colours: red, green and blue. In older colour scanners, this was
accomplished by scanning the same area three times over, once for each colour. This
type of scanner is known as a three-pass scanner. However, most colour scanners now
capture all three colours in a single pass by using colour filters. In principle, a
colour CCD works in the same way as a monochrome CCD, except that each colour is
constructed by mixing red, green and blue. Thus, a 24-bit RGB CCD represents each pixel by
24 bits of information (8 bits per colour). A scanner using these three colours in full 24-bit
RGB mode can create up to 16.8 million colours.
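
The arithmetic behind these figures can be sketched in a few lines of Python. This is only an illustration: the gray values, the threshold of 128 and the function names are assumptions, not part of any scanner software.

```python
# Hypothetical 8-bit gray values (0 = black, 255 = white) for one scanned strip.
strip = [12, 200, 130, 90, 255, 0]

def to_monochrome(gray_values, threshold=128):
    """Map each gray value to 0 (black) or 1 (white); the threshold is an assumption."""
    return [1 if g >= threshold else 0 for g in gray_values]

def colour_count(bits_per_channel=8, channels=3):
    """Number of distinct colours a pixel can take at the given bit depth."""
    return 2 ** (bits_per_channel * channels)

mono = to_monochrome(strip)
total_colours = colour_count()
print(mono)
print(total_colours)
```

With 8 bits for each of the three channels, the count is 2 to the power 24, which is the "16.8 million colours" quoted above.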


Types of Scanners:
Hand-held scanners, although portable, can only scan images up to about four inches
wide. They require a very steady hand for moving the scan head over the document. They are
useful for scanning small logos or signatures and are of virtually no use for scanning maps
and photographs.
The most commonly used scanner is the flatbed scanner, also known as a desktop scanner.
It has a glass plate on which the picture or document is placed. The scanner head, placed
beneath the glass plate, moves across the picture and the result is a good quality scanned
image. For scanning large maps or toposheets, wide format flatbed scanners can be used.


Figure: Types of Scanners

Then there are drum scanners, which are mostly used by printing professionals.
In this type of scanner, the image or document is placed on a glass cylinder that rotates at
very high speed around a centrally located sensor, which uses a photo-multiplier tube instead
of a CCD to scan. Prior to the advances in the field of sheet-fed scanners, drum scanners
were extensively used for scanning maps and other documents.

RASTER GIS FILE FORMATS:


Raster data is made up of pixels (also referred to as grid cells). They are usually
regularly spaced and square, but they don't have to be. Each pixel in a raster is associated
with a value (continuous) or a class (discrete).


File Type: ERDAS Imagine (IMG)
Extension: .IMG
Description: The ERDAS Imagine IMG file is a proprietary file format developed by Hexagon
Geospatial. IMG files are commonly used to store single and multiple bands of satellite
raster data.
IMG files use a hierarchical file format (HFA) that can optionally store basic information
about the file, for example file information, ground control points and sensor type.
Each raster layer in an IMG file contains information about its data values, for example
projection, statistics, attributes, pyramids and whether it is a continuous or discrete
type of raster.

File Type: American Standard Code for Information Interchange (ASCII) Grid
Extension: .ASC
Description: ASCII encodes each text character as a number (0 to 127 in standard ASCII, up
to 255 in extended 8-bit variants). An ASCII grid stores cell values, including
floating-point values, as plain text, together with header information given as a set of
keywords.
In their native form, ASCII text files store GIS data in a delimited format. This could be
comma-, space- or tab-delimited. To go from the non-spatial text file to spatial data, you
can run a conversion tool such as ASCII to Raster.

File Type: GeoTIFF
Extension: .TIF, .TIFF, .OVR
Description: GeoTIFF has become an industry-standard image file format for GIS and satellite
remote sensing applications. GeoTIFFs may be accompanied by other files:
• TFW is the world file that gives the raster its geolocation (it is needed when the
georeferencing is not embedded in the TIFF file itself).
• XML files optionally accompany GeoTIFFs and hold the metadata.
• AUX auxiliary files store projections and other information.
• OVR pyramid files improve performance for raster display.


File Type: IDRISI Raster
Extension: .RST, .RDC
Description: IDRISI assigns the RST extension to all raster layers. They consist of numeric
grid cell values stored as integers, real numbers, bytes or RGB24.
The raster documentation file (RDC) is a companion text file for an RST file. It records the
number of columns and rows of the RST file, as well as the file type, coordinate system,
reference units and positional error.

File Type: Envi RAW Raster
Extension: .BIL, .BIP, .BSQ
Description: Band interleaved files are a raster storage format for single-band and
multi-band aerial and satellite imagery.
• Band interleaved by line (BIL) stores the image row by row: the first row of every band,
then the second row of every band, and so on.
• Band interleaved by pixel (BIP) stores the image pixel by pixel: all band values of the
first pixel, then all band values of the second pixel, and so on.
• Band sequential (BSQ) stores each band in full, one band after another.
These files are accompanied by a header file (HDR) that describes the number of columns,
rows and bands, the bit depth and the layout of the image.

File Type: Esri Grid
Extension: (none)
Description: Grid files are a proprietary format developed by Esri. Grids have no extension
and are unique because they can hold attribute data in a raster file. The catch is that you
can only add attributes to integer grids.
Attributes are stored in a value attribute table (VAT), with one record for each unique
value in the grid and a count of the number of cells that have that value.
The two types of Esri Grid are integer grids and floating point grids. Land cover is an
example of a discrete (integer) grid: each class has a unique integer cell value. Elevation
data is an example of a floating point grid: each cell holds a floating point elevation
value.
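
The difference between the three band interleaving layouts is easiest to see on a tiny image. The sketch below is illustrative only: the 2-band, 2-row, 3-column image and the function names are assumptions, and real BIL/BIP/BSQ files hold binary values described by the header file rather than Python lists.

```python
# A hypothetical 2-band image with 2 rows and 3 columns; bands[b][r][c] is one cell value.
bands = [
    [[1, 2, 3], [4, 5, 6]],        # band 1
    [[10, 20, 30], [40, 50, 60]],  # band 2
]

def to_bsq(bands):
    """Band sequential: all of band 1, then all of band 2, and so on."""
    return [v for band in bands for row in band for v in row]

def to_bil(bands):
    """Band interleaved by line: row 1 of every band, then row 2 of every band."""
    n_rows = len(bands[0])
    return [v for r in range(n_rows) for band in bands for v in band[r]]

def to_bip(bands):
    """Band interleaved by pixel: all band values of one pixel are stored together."""
    n_rows, n_cols = len(bands[0]), len(bands[0][0])
    return [band[r][c] for r in range(n_rows) for c in range(n_cols) for band in bands]

print(to_bsq(bands))
print(to_bil(bands))
print(to_bip(bands))
```

All three layouts hold the same twelve values; only the storage order differs, which is why the header file has to state the layout.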


VECTOR GIS FILE FORMATS:

Vector data is not made up of grids of pixels. Instead, vector graphics are composed
of vertices and paths. The three basic symbol types for vector data are points, lines and
polygons (areas).


File Type: Esri Shapefile
Extension: .SHP, .DBF, .SHX
Description: The shapefile is by far the most common geospatial file type you will
encounter. Virtually all commercial and open source GIS packages accept the shapefile as a
GIS format. It is so ubiquitous that it has become the industry standard.
A shapefile is made up of a set of three mandatory files:
• SHP is the feature geometry.
• SHX is the shape index position.
• DBF is the attribute data.

File Type: Geographic JavaScript Object Notation (GeoJSON)
Extension: .GEOJSON, .JSON
Description: The GeoJSON format is mostly used for web-based mapping. GeoJSON stores
coordinates as text in JavaScript Object Notation (JSON) form. This includes vector points,
lines and polygons as well as tabular information.
GeoJSON stores objects within curly braces {} and in general has less markup overhead than
GML. GeoJSON has a straightforward syntax that you can modify in any text editor.
Web browsers run JavaScript, so GeoJSON is a natural format for web maps: JavaScript parses
the JSON text into native objects that a web map can work with directly.

File Type: Geography Markup Language (GML)
Extension: .GML
Description: GML is an extension of XML for encoding geographic coordinates and features.
eXtensible Markup Language (XML) is both human-readable and machine-readable.
GML stores geographic entities (features) in the form of text. Similar to GeoJSON, GML can
be updated in any text editor. Each feature has a list of properties, a geometry (points,
lines, curves, surfaces and polygons) and a spatial reference system.
GML generally carries more overhead than GeoJSON, because GML needs more data to express
the same amount of information.

File Type: Google Keyhole Markup Language (KML/KMZ)
Extension: .KML, .KMZ
Description: KML stands for Keyhole Markup Language. This GIS format is XML-based and is
primarily used for Google Earth. KML was developed by Keyhole Inc, which was later acquired
by Google.
KMZ (KML-Zipped) replaced KML as the default Google Earth geospatial format because it is a
compressed version of the file. KML/KMZ became an international standard of the Open
Geospatial Consortium in 2008.
The longitude and latitude components (decimal degrees) are as defined by the World Geodetic
System of 1984 (WGS84). The vertical component (altitude) is measured in meters from the
WGS84 EGM96 geoid vertical datum.

File Type: GPS eXchange Format (GPX)
Extension: .GPX
Description: GPS eXchange Format is an XML schema that describes waypoints, tracks and
routes captured from a GPS receiver. Because GPX is an exchange format, you can openly
transfer GPS data from one program to another based on its description properties.
The minimum requirement for GPX is latitude and longitude coordinates. In addition, GPX
files can optionally store location properties including time, elevation and geoid height
as tags.

File Type: IDRISI Vector
Extension: .VCT, .VDC
Description: IDRISI vector data files have a VCT extension along with an associated vector
documentation file with a VDC extension.
The VCT format is limited to points, lines, polygons, text and photos. When an IDRISI vector
file is created, a documentation file for building metadata is created automatically.
Attributes are stored directly in the vector files, but you can optionally use independent
data tables and value files.
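
Because GeoJSON is plain JSON text, it can be read and written with any JSON library. The sketch below uses Python's standard json module; the coordinates and the property name are made up for illustration.

```python
import json

# A hypothetical point feature; the coordinates and properties are made up for illustration.
feature = {
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [78.1198, 9.9252]},  # [longitude, latitude]
    "properties": {"name": "Sample station"},
}
collection = {"type": "FeatureCollection", "features": [feature]}

text = json.dumps(collection)   # GeoJSON is plain JSON text
parsed = json.loads(text)       # parse the text back into Python objects
lon, lat = parsed["features"][0]["geometry"]["coordinates"]
print(text)
print(lon, lat)
```

Note that GeoJSON lists a position as longitude first, then latitude.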


DIGITIZING:
Digitizing in GIS is the process of converting geographic data, either from a hardcopy
or from a scanned image, into vector data by tracing the features. During the digitizing
process, features from the traced map or image are captured as coordinates in point, line,
or polygon format.

Types of Digitizing in GIS:


The most common method of encoding spatial features from paper maps is manual
digitizing. It is an appropriate technique when selected features are required from a paper
map. Manual digitizing requires a digitizing table that is linked to a computer workstation.
The digitizing table is essentially a large flat tablet, the surface of which is underlain by a
very fine mesh of wires. Attached to the digitizer via a cable is a cursor (puck) that can be
moved freely over the surface of the table. Buttons on the cursor allow the user to send
instructions to the computer. The position of the cursor on the table is registered by reference
to its position above the wire mesh.

Figure: Manual Digitizer

Heads-up digitizing (also referred to as on-screen digitizing) is the method of tracing
geographic features from another dataset (usually an aerial photograph, a satellite image, or
a scanned image of a map) directly on the computer screen. Automated digitizing involves using
image processing software that contains pattern recognition technology to generate vectors.


The procedure followed when digitizing a paper map using a manual digitizer has the
following five stages:
• Registration: The map to be digitized is fixed firmly to the table top with sticky tape.
Five or more control points are identified (usually the four corners of the map sheet
and one or more grid intersections in the middle). The geographic co-ordinates of the
control points are noted and their locations digitized by positioning the cross-hairs on
the cursor exactly over them and pressing the 'digitize' button on the cursor. This
sends the co-ordinates of a point on the table to the computer and stores them in a file
as 'digitizer co-ordinates'.
• Digitizing point features: Point features, for example spot heights, hotel locations or
meteorological stations, are recorded as a single digitized point. A unique code
number or identifier is added so that attribute information may be attached later. For
instance, the hotel with ID number '1' would later be identified as 'Mountain View'.
• Digitizing line features: Line features (such as roads or rivers) are digitized as a series
of points that the software will join with straight line segments. In some GIS packages
lines are referred to as arcs, and their start and end points as nodes. This gives rise to
the term arc-node topology, used to describe a method of structuring line features.
• Digitizing area (polygon) features: Area features or polygons, for example forested
areas or administrative boundaries, are digitized as a series of points linked together
by line segments in the same way as line features. Here it is important that the start
and end points join to form a complete area. Polygons can be digitized as a series of
individual lines, which are later joined to form areas. In this case it is important that
each line segment is digitized only once.
• Adding attribute information: Attribute data may be added to digitized polygon
features by linking them to a centroid (or seed point) in each polygon. These are either
digitized manually (after digitizing the polygon boundaries) or created automatically
once the polygons have been encoded. Using a unique identifier or code number,
attribute data can then be linked to the polygon centroids of appropriate polygons. In
this way, a forest stand may have data relating to tree species, tree ages, tree
numbers and timber volume attached to a point within the polygon.
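
The outcome of these stages can be pictured as coordinate lists keyed by a unique ID, with attributes attached through the same ID. The following is a minimal sketch, assuming made-up coordinates and attribute values; the dictionary layout and function names are hypothetical and do not correspond to any particular GIS package.

```python
# Hypothetical digitized features: each has an ID and (x, y) digitizer co-ordinates.
points = {1: (120.0, 340.0)}                         # a hotel recorded as a single point
lines = {10: [(0.0, 0.0), (5.0, 2.0), (9.0, 2.5)]}   # a road as a series of points
polygons = {100: [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0), (0.0, 0.0)]}

# Attribute table keyed by the same IDs (the values are illustrative).
attributes = {1: {"name": "Mountain View"}, 100: {"species": "pine", "age": 35}}

def is_closed(ring):
    """A polygon boundary is complete when its start and end points join."""
    return len(ring) >= 4 and ring[0] == ring[-1]

def attributes_for(feature_id):
    """Look up the attribute record attached to a digitized feature, if any."""
    return attributes.get(feature_id, {})

print(is_closed(polygons[100]))
print(attributes_for(1))
```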

Manual digitizers may be used in one of two modes: point mode or stream mode. In
point mode the user begins digitizing each line segment with a start node, records each
change in direction of the line with a digitized point and finishes the segment with an end
node. Thus, a straight line can be digitized with just two points, the start and end nodes. For
more complex lines, a greater number of points are required between the start and end nodes.
Smooth curves are problematic since they require an infinite number of points to record their
true shape.

(a) Point mode: the person digitizing decides where to place each individual point so as to
represent the line most accurately within the accepted tolerances of the digitizer.
(b) Stream mode: the person digitizing decides on the time or distance interval at which the
digitizing hardware registers each point as the cursor is moved along the line.

In stream mode the digitizer is set up to record points according to a stated time
interval or on a distance basis. Once the user has recorded the start of a line, the digitizer
might be set to record a point automatically every 0.5 seconds, and the user must move the
cursor along the line to record its shape. An end node is required to stop the digitizer
recording further points. The speed at which the cursor is moved along the line determines
the number of points recorded. Thus, where the line is more complex and the cursor needs to
be moved more slowly and with more care, a greater number of points will be recorded.
Conversely, where the line is straight, the cursor can be moved more quickly and fewer
points are recorded.
The choice between point mode and stream mode digitizing is largely a matter of
personal preference. Stream mode digitizing requires more skill than point mode digitizing,
and for an experienced user may be a faster method. Stream mode will usually generate more
points, and hence larger files, than point mode.
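
The link between cursor speed and the number of recorded points can be sketched with simple arithmetic. The function below is a rough model under stated assumptions (constant cursor speed, one start point plus one point per elapsed interval); the line length and speeds are made-up values.

```python
def stream_mode_points(line_length_mm, cursor_speed_mm_s, interval_s=0.5):
    """Approximate number of points recorded along a line in stream mode.

    Assumes a constant cursor speed and counts the start point plus one point
    per elapsed time interval.
    """
    duration_s = line_length_mm / cursor_speed_mm_s
    return 1 + int(duration_s / interval_s)

slow = stream_mode_points(100, cursor_speed_mm_s=5)    # careful tracing of a complex line
fast = stream_mode_points(100, cursor_speed_mm_s=20)   # quick tracing of a straight line
print(slow, fast)
```

For the same line length, the slower trace records roughly four times as many points, which is why stream mode files tend to be larger.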


TOPOLOGY:
Topology is the mathematical representation of the physical relationships that exist
between geographical elements. Topology has long been a key GIS requirement for data
management and integrity. In general, a topological data model manages spatial relationships
by representing spatial objects (point, line, and area features) as an underlying graph of
topological primitives: nodes, faces, and edges. These primitives, together with their
relationships to one another and to the features whose boundaries they represent, are defined
by representing the feature geometries in a planar graph of topological elements.
Topology is useful in GIS because many spatial modeling operations don't require
coordinates, only topological information. For example, finding an optimal path between two
points requires only a list of the arcs that connect to each other and the cost to traverse
each arc in each direction. Coordinates are only needed for drawing the path after it is
calculated.
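
To illustrate that path finding needs only connections and costs, the sketch below runs Dijkstra's algorithm (one common least-cost path method, chosen here for illustration) over a hypothetical arc list. The node names and costs are made up, and no coordinates appear anywhere.

```python
import heapq

# Hypothetical arcs: (from_node, to_node, cost_forward, cost_backward). No coordinates needed.
arcs = [
    ("A", "B", 4, 4),
    ("A", "C", 1, 1),
    ("C", "B", 2, 5),
    ("B", "D", 1, 1),
]

def build_graph(arcs):
    """Turn the arc list into node -> [(neighbour, cost)] using the cost for each direction."""
    graph = {}
    for u, v, forward, backward in arcs:
        graph.setdefault(u, []).append((v, forward))
        graph.setdefault(v, []).append((u, backward))
    return graph

def least_cost(graph, start, goal):
    """Dijkstra's algorithm: returns (total_cost, node_path), or (None, []) if unreachable."""
    queue = [(0, start, [start])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, step in graph.get(node, []):
            if neighbour not in visited:
                heapq.heappush(queue, (cost + step, neighbour, path + [neighbour]))
    return None, []

graph = build_graph(arcs)
print(least_cost(graph, "A", "D"))
print(least_cost(graph, "D", "A"))
```

Because arc C to B costs more in the reverse direction, the best route from D back to A differs from the best route from A to D.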

The topological structure supports three major topological concepts:

• Connectivity: Arcs connect to each other at nodes.
• Area definition: Arcs that connect to surround an area define a polygon.
• Contiguity: Arcs have direction and left and right sides.

Connectivity
Connectivity is defined through arc-node topology. This is the basis for many network
tracing and path finding operations. Connectivity allows you to identify a route to the airport,
connect streams to rivers, or follow a path from the water treatment plant to a house.
In the arc-node data structure, an arc is defined by two endpoints: the from-node
indicating where the arc begins and the to-node indicating where it ends. This is called
arc-node topology.

Figure: Arc-Node topology example


Arc-node topology is supported through an arc-node list. The list identifies the from-
and to-nodes for each arc. Connected arcs are determined by searching through the list for
common node numbers. In the above example, it is possible to determine that arcs 1, 2, and 3
all intersect because they share node 11. The computer can determine that it is possible to
travel along arc 1 and turn onto arc 3 because they share a common node (11), but it's not
possible to turn directly from arc 1 onto arc 5 because they don't share a common node.
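
The search through an arc-node list can be sketched as follows. The list is hypothetical: it is built only to agree with the statements above (arcs 1, 2 and 3 share node 11; arcs 1 and 5 share no node), and the remaining node numbers are made up rather than read from the figure.

```python
# Hypothetical arc-node list: arc number -> (from_node, to_node).
# Node 11 is shared by arcs 1, 2 and 3, as in the text; the other numbers are made up.
arc_nodes = {
    1: (10, 11),
    2: (11, 12),
    3: (11, 13),
    4: (13, 15),
    5: (14, 15),
}

def shared_nodes(arc_a, arc_b):
    """Nodes common to two arcs; an empty set means no direct turn is possible."""
    return set(arc_nodes[arc_a]) & set(arc_nodes[arc_b])

def arcs_at_node(node):
    """All arcs that start or end at the given node."""
    return sorted(arc for arc, ends in arc_nodes.items() if node in ends)

print(shared_nodes(1, 3))
print(shared_nodes(1, 5))
print(arcs_at_node(11))
```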

Containment:
Many of the geographic features that may be represented cover a distinguishable area
on the surface of the earth, such as lakes, parcels of land, and census tracts. An area is
represented in the vector model by one or more boundaries defining a polygon. Although having
more than one boundary sounds counterintuitive, consider a lake with an island in the middle.
The lake actually has two boundaries: one that defines its outer edge and one, around the
island, that defines its inner edge. In the terminology of the vector model, an island
defines an inner boundary (or hole) of a polygon.
The arc-node structure represents polygons as an ordered list of arcs rather than a
closed loop of x,y coordinates. This is called polygon-arc topology. In the illustration below,
polygon F is made up of arcs 8, 9, 10, and 7, listed as 8, 9, 10, 0, 7 (the 0 before the 7
indicates that this arc creates an island in the polygon).



Figure: Polygon-Arc topology example


Each arc appears in two polygons (in the above example, arc 6 appears in the list for
polygons B and C). Since the polygon is simply the list of arcs defining its boundary, arc
coordinates are stored only once, thereby reducing the amount of data and ensuring that the
boundaries of adjacent polygons don't overlap.

Contiguity:
Two geographic features that share a boundary are called adjacent. Contiguity is the
topological concept that allows the vector data model to determine adjacency. Polygon
topology defines contiguity. Polygons are contiguous to each other if they share a common
arc. This is the basis for many neighbor and overlay operations.
Recall that the from-node and to-node define an arc. This indicates an arc's direction,
so the polygons on its left and right sides can be determined. Left-right topology refers to
the polygons on the left and right sides of an arc. In the example below, polygon B is on the
left of arc 6, and polygon C is on the right. Thus we know that polygons B and C are adjacent.

Figure: Left-Right topology example
Notice that the label for polygon A is outside the boundary of the area. This polygon
is called the external, or universe, polygon and represents the world outside the study area.
The universe polygon ensures that each arc always has a left and right side defined.
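
Adjacency can be read straight from a left-right list without touching any coordinates. The sketch below is hypothetical: only the row for arc 6 (B on the left, C on the right) follows the text, and the other rows, which use A as the universe polygon, are made up for illustration.

```python
# Hypothetical left-right list: arc number -> (left_polygon, right_polygon).
# Arc 6 follows the text (B on the left, C on the right); the other rows are made up.
left_right = {
    4: ("A", "B"),
    5: ("A", "C"),
    6: ("B", "C"),
}

def adjacent_polygons(polygon):
    """Polygons that share at least one arc with the given polygon."""
    neighbours = set()
    for left, right in left_right.values():
        if left == polygon:
            neighbours.add(right)
        elif right == polygon:
            neighbours.add(left)
    return sorted(neighbours)

print(adjacent_polygons("B"))
print(adjacent_polygons("C"))
```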

Topology Rules:
There are many topology rules you can implement in your geodatabase, depending on
the spatial relationships that are most important for your organization to maintain. You
should carefully plan the spatial relationships you will enforce on your features. Some
topology rules govern the relationships of features within a given feature class, while others
govern the relationships between features in two different feature classes or subtypes.
Topology rules can be defined between subtypes of features in one or another feature class.
This could be used, for example, to require street features to be connected to other street
features at both ends, except in the case of streets belonging to the cul-de-sac or dead-end
subtypes.
Many topology rules can be imposed on features in a geodatabase. A well-designed
geodatabase will have only those topology rules that define key spatial relationships needed
by an organization. Most topology violations have fixes that you can use to correct errors.


Topology rules based on points:


Must Coincide With:
Requires that points in one feature class (or subtype) be coincident with points in
another feature class (or subtype). This is useful for cases where points must be covered by
other points, such as transformers that must coincide with power poles in electric
distribution networks and observation points that must coincide with stations.

Must Be Disjoint:
Requires that points be separated spatially from other points in the same feature class
(or subtype). Any points that overlap are errors. This is useful for ensuring that points are not
coincident or duplicated within the same feature class, such as in layers of cities, parcel lot ID
points, wells, or streetlamp poles.
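
A check of this kind can be sketched by counting how often each coordinate pair occurs. This is a simplified illustration with made-up well locations: it compares coordinates exactly, whereas a real topology would treat points within the cluster tolerance as coincident.

```python
from collections import Counter

# Hypothetical well locations as (x, y) coordinates; the third repeats the first.
wells = [(10.0, 20.0), (15.5, 8.0), (10.0, 20.0), (3.0, 4.0)]

def disjoint_errors(points):
    """Return the coordinates that occur more than once (exact match only)."""
    counts = Counter(points)
    return sorted(p for p, n in counts.items() if n > 1)

print(disjoint_errors(wells))
```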

Must Be Covered By Endpoint of:


Requires that points in one feature class must be covered by the endpoints of lines in
another feature class. This rule is similar to the line rule Endpoint Must Be Covered By
except that, in cases where the rule is violated, it is the point feature that is marked as an error
rather than the line. Boundary corner markers might be constrained to be covered by the
endpoints of boundary lines.


Point Must Be Covered By Line:


Requires that points in one feature class be covered by lines in another feature class. It
does not constrain the covering portion of the line to be an endpoint. This rule is useful for
points that fall along a set of lines, such as highway signs along highways.

Must Be Properly Inside Polygons:
Requires that points fall within area features. This is useful when the point features
are related to polygons, such as wells and well pads or address points and parcels.

Must Be Covered By Boundary of:


Requires that points fall on the boundaries of area features. This is useful when the
point features help support the boundary system, such as boundary markers, which must be
found on the edges of certain areas.

Topology rules based on Lines:


Must Not Have Dangles:
Requires that a line feature must touch lines from the same feature class (or subtype)
at both endpoints. An endpoint that is not connected to another line is called a dangle. This
rule is used when line features must form closed loops, such as when they are defining the
boundaries of polygon features. It may also be used in cases where lines typically connect to
other lines, as with streets. In this case, exceptions can be used where the rule is occasionally
violated, as with cul-de-sac or dead-end street segments.
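
Dangles can be found by counting how many line ends meet at each endpoint. The sketch below is a simplification with made-up street segments: it only considers endpoint-to-endpoint connections, whereas in practice an endpoint that touches the interior of another line is also connected.

```python
from collections import Counter

# Hypothetical street segments, each given by its two endpoints.
lines = {
    "main": ((0, 0), (5, 0)),
    "side": ((5, 0), (5, 4)),
    "back": ((5, 4), (0, 0)),
    "spur": ((5, 0), (8, 0)),   # the far end touches nothing: a dangle
}

def find_dangles(lines):
    """Endpoints used by only one line end, reported as (line_name, endpoint)."""
    usage = Counter(pt for ends in lines.values() for pt in ends)
    return sorted((name, pt) for name, ends in lines.items() for pt in ends if usage[pt] == 1)

print(find_dangles(lines))
```

Here the spur would either be fixed or marked as an exception, as with a dead-end street.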

Must Not Overlap:
Requires that lines not overlap with lines in the same feature class (or subtype). This
rule is used where line segments should not be duplicated, for example, in a stream feature
class. Lines can cross or intersect but cannot share segments.

Must Not Self-Overlap:


Requires that line features not overlap themselves. They can cross or touch
themselves but must not have coincident segments. This rule is useful for features, such as
streets, where segments might touch in a loop but where the same street should not follow the
same course twice.

Must Not Self-Intersect:


Requires that line features not cross or overlap themselves. This rule is useful for
lines, such as contour lines, that cannot cross themselves.


Must Not Intersect Or Touch Interior:

Requires that a line in one feature class (or subtype) must only touch other lines of the
same feature class (or subtype) at endpoints. Any line segment in which features overlap or
any intersection not at an endpoint is an error. This rule is useful where lines must only be
connected at endpoints, such as in the case of lot lines, which must split (only connect to the
endpoints of) back lot lines and cannot overlap each other.

Must Not Have Pseudo Nodes:
Requires that a line connect to at least two other lines at each endpoint. Lines that
connect to one other line (or to themselves) are said to have pseudo nodes. This rule is used
where line features must form closed loops, such as when they define the boundaries of
polygons or when line features logically must connect to two other line features at each end,
as with segments in a stream network, with exceptions being marked for the originating ends
of first-order streams.

Must Be Larger Than Cluster Tolerance:


Requires that a feature does not collapse during a validate process. This rule is
mandatory for a topology and applies to all line and polygon feature classes. In instances
where this rule is violated, the original geometry is left unchanged.


Topology rules based on Polygons:


Must Not Overlap:
Requires that the interior of polygons not overlap. The polygons can share edges or
vertices. This rule is used when an area cannot belong to two or more polygons. It is useful
for modeling administrative boundaries, such as ZIP Codes or voting districts, and mutually
exclusive area classifications, such as land cover or landform type.

Must Not Have Gaps:


This rule requires that there are no voids within a single polygon or between adjacent
polygons. All polygons must form a continuous surface. An error will always exist on the
perimeter of the surface. You can either ignore this error or mark it as an exception. Use this
rule on data that must completely cover an area. For example, soil polygons cannot include
gaps or form voids; they must cover an entire area.

Contains Point:
Requires that a polygon in one feature class contain at least one point from another
feature class. Points must be within the polygon, not on the boundary. This is useful when
every polygon should have at least one associated point, such as when parcels must have an
address point.
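
A rule like this can be checked with a point-in-polygon test. The sketch below uses the ray-casting method, which is one common choice and an assumption here; the parcel outlines and address points are made up, and points lying exactly on a boundary are not given any special handling.

```python
def point_in_polygon(point, ring):
    """Ray-casting test for a point against a ring given as a list of (x, y) vertices."""
    x, y = point
    inside = False
    # Pair each vertex with the next one, wrapping around to close the ring.
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        if (y1 > y) != (y2 > y):                       # edge crosses the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

# Hypothetical parcels and address points.
parcels = {
    "P1": [(0, 0), (4, 0), (4, 4), (0, 4)],
    "P2": [(5, 0), (9, 0), (9, 4), (5, 4)],
}
address_points = [(1, 1), (3, 2)]

def parcels_without_points(parcels, points):
    """Parcels that contain none of the points (violations of Contains Point)."""
    return sorted(name for name, ring in parcels.items()
                  if not any(point_in_polygon(p, ring) for p in points))

print(parcels_without_points(parcels, address_points))
```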


Contains One Point:


Requires that each polygon contains one point feature and that each point feature falls
within a single polygon. This is used when there must be a one-to-one correspondence
between features of a polygon feature class and features of a point feature class, such as
administrative boundaries and their capital cities. Each point must be properly inside exactly
one polygon and each polygon must properly contain exactly one point. Points must be
within the polygon, not on the boundary.

Must Not Overlap With:


Requires that the interior of polygons in one feature class (or subtype) must not
overlap with the interior of polygons in another feature class (or subtype). Polygons of the
two feature classes can share edges or vertices or be completely disjoint. This rule is used
when an area cannot belong to two separate feature classes. It is useful for combining two
mutually exclusive systems of area classification, such as zoning and water body type, where
areas defined within the zoning class cannot also be defined in the water body class and vice
versa.

Must Cover Each Other:


Requires that the polygons of one feature class (or subtype) must share all of their
area with the polygons of another feature class (or subtype). Polygons may share edges or
vertices. Any area defined in either feature class that is not shared with the other is an error.
This rule is used when two systems of classification are used for the same geographic area,
and any given point defined in one system must also be defined in the other. One such case
occurs with nested hierarchical datasets, such as census blocks and block groups or small
watersheds and large drainage basins. The rule can also be applied to non-hierarchically
related polygon feature classes, such as soil type and slope class.

Area Boundary Must Be Covered By Boundary of:

Requires that boundaries of polygon features in one feature class (or subtype) be
covered by boundaries of polygon features in another feature class (or subtype). This is useful
when polygon features in one feature class, such as subdivisions, are composed of multiple
polygons in another class, such as parcels, and the shared boundaries must be aligned.

ATTRIBUTE DATA LINKING:



There are two types of GIS data: spatial data (coordinate and projection information
for spatial features) and attribute data. Attribute data is additional information, stored in
tabular format, that describes the characteristics of spatial features. It is linked to the
spatial data through a unique identifier (i.e. feature ID). Spatial data records where a
feature is, while attribute data can record what it is, where it is, and why it is there.
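
A minimal sketch of this linking is given here, assuming both kinds of data are held as Python dictionaries keyed by feature ID. The table contents, the approximate coordinates and the function name describe are illustrative assumptions, not taken from any particular GIS package.

```python
# Spatial table: feature ID -> geometry (here a point as (longitude, latitude)).
spatial = {
    101: (78.12, 9.93),
    102: (80.27, 13.08),
    103: (76.96, 11.02),
}

# Attribute table: feature ID -> descriptive attributes.
attributes = {
    101: {"name": "Madurai", "type": "city"},
    102: {"name": "Chennai", "type": "city"},
    103: {"name": "Coimbatore", "type": "city"},
}

def describe(feature_id):
    """Look up both halves of a feature through the shared feature ID."""
    lon, lat = spatial[feature_id]
    return {"fid": feature_id, "lon": lon, "lat": lat, **attributes[feature_id]}

print(describe(102))
```

Because the two tables share only the feature ID, either one can be edited or replaced without touching the other.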


Figure: Attribute data and Spatial data linking

Joins:
When all of our data is in a single table, we can easily retrieve a particular row from
that table. But if the data we are looking for is spread across two or more tables, joins can
be used to retrieve it. A join fetches data from two or more tables and presents it as a
single set of data. It combines columns from two or more tables by using values common to
the tables.
There are several types of JOIN, including INNER, LEFT OUTER and RIGHT OUTER; they
all do slightly different things, but the basic theory behind them all is the same.

Inner Join:
An INNER JOIN returns a result set that contains the common elements of the tables,
i.e. the intersection where they match on the join condition. An INNER JOIN focuses on
the commonality between two tables. For an INNER JOIN to return any rows, there must be at
least some matching data between the two (or more) tables being compared. INNER JOINs
are the most frequently used JOIN operation.
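
The following is a small sketch of an INNER JOIN, run through Python's built-in sqlite3 module on an in-memory database. The tables parcels and owners, their columns and their rows are hypothetical sample data chosen for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parcels (fid INTEGER PRIMARY KEY, land_use TEXT);
    CREATE TABLE owners  (fid INTEGER, owner TEXT);
    INSERT INTO parcels VALUES (1, 'residential'), (2, 'commercial'), (3, 'park');
    INSERT INTO owners  VALUES (1, 'Asha'), (2, 'Ravi'), (4, 'Meena');
""")

# Only feature IDs present in BOTH tables survive the inner join.
inner_rows = conn.execute("""
    SELECT p.fid, p.land_use, o.owner
    FROM parcels AS p
    INNER JOIN owners AS o ON p.fid = o.fid
    ORDER BY p.fid
""").fetchall()
conn.close()

print(inner_rows)
```

Parcel 3 has no owner record and owner record 4 has no parcel, so neither appears in the result.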

Left Outer Join:


A LEFT JOIN or LEFT OUTER JOIN takes all the rows from one table, defined as
the left table, and joins them with a second table. A LEFT JOIN will always include the rows
from the left table, even if there are no matching rows in the table it is joined with.
Columns from the second table are filled with NULL for those unmatched rows.
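
A short sketch of a LEFT OUTER JOIN is given here, again using Python's built-in sqlite3 module. The parcels and owners tables and their rows are hypothetical sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parcels (fid INTEGER PRIMARY KEY, land_use TEXT);
    CREATE TABLE owners  (fid INTEGER, owner TEXT);
    INSERT INTO parcels VALUES (1, 'residential'), (2, 'commercial'), (3, 'park');
    INSERT INTO owners  VALUES (1, 'Asha'), (2, 'Ravi'), (4, 'Meena');
""")

# parcels is the left table, so every parcel appears in the result.
left_rows = conn.execute("""
    SELECT p.fid, p.land_use, o.owner
    FROM parcels AS p
    LEFT OUTER JOIN owners AS o ON p.fid = o.fid
    ORDER BY p.fid
""").fetchall()
conn.close()

print(left_rows)
```

Parcel 3 is kept even though it has no owner record; its owner column comes back as None, which is how sqlite3 reports NULL.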


Figure: Left outer join



Right Outer Join:


A RIGHT OUTER JOIN is similar to a LEFT OUTER JOIN except that the roles of the
two tables are reversed: all the rows of the second table are included along with any
matching rows from the first table, i.e. a RIGHT JOIN will always include the rows
from the right table, even if there are no matching rows in the table it is joined with.
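
The sketch below produces the result of parcels RIGHT OUTER JOIN owners using Python's built-in sqlite3 module. Older SQLite versions do not accept the RIGHT JOIN keyword, so the query is written as the equivalent LEFT OUTER JOIN with the table order swapped; that substitution, and the hypothetical parcels and owners sample data, are assumptions of this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parcels (fid INTEGER PRIMARY KEY, land_use TEXT);
    CREATE TABLE owners  (fid INTEGER, owner TEXT);
    INSERT INTO parcels VALUES (1, 'residential'), (2, 'commercial'), (3, 'park');
    INSERT INTO owners  VALUES (1, 'Asha'), (2, 'Ravi'), (4, 'Meena');
""")

# "parcels RIGHT OUTER JOIN owners" keeps every row of owners.
# Swapping the tables lets a LEFT OUTER JOIN return the same rows.
right_rows = conn.execute("""
    SELECT o.fid, p.land_use, o.owner
    FROM owners AS o
    LEFT OUTER JOIN parcels AS p ON p.fid = o.fid
    ORDER BY o.fid
""").fetchall()
conn.close()

print(right_rows)
```

Owner record 4 is kept although no parcel has that feature ID, and its land_use column comes back as None.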

Relates:
Relates can help us to discover specific information within our data. A relate (also
called a table relate) is a property of a layer. We can create a table relate so that we can query
and select features in one layer and see all the related features in another layer or
table. Unlike joining tables, relating tables simply defines a relationship between two tables.
The associated data isn't appended to the layer's attribute table as it is with a join. Instead,
we can access the related data through selected features or records in our layer or table.
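
The difference from a join can be sketched in plain Python as follows. The parcels and permits tables, the key field parcel_fid and the function name related_records are hypothetical; the point is only that a relate is a lookup between two tables, and neither table gains extra columns.

```python
from collections import defaultdict

# Layer attribute table (one row per feature) and a separate related table.
parcels = [
    {"fid": 1, "land_use": "residential"},
    {"fid": 2, "land_use": "commercial"},
]
permits = [
    {"permit_id": "P-10", "parcel_fid": 1},
    {"permit_id": "P-11", "parcel_fid": 1},
    {"permit_id": "P-12", "parcel_fid": 2},
]

# The relate is only an index from the key field to the related records;
# neither table is modified or widened.
relate = defaultdict(list)
for record in permits:
    relate[record["parcel_fid"]].append(record)

def related_records(selected_fids):
    """Return the permit records related to the selected parcel features."""
    return [rec for fid in selected_fids for rec in relate.get(fid, [])]

selected = [1]
print([rec["permit_id"] for rec in related_records(selected)])
print(parcels[0])  # the layer row is unchanged: no permit columns were appended
```

Selecting parcel 1 reaches both of its permits, a one-to-many case that a join could only show by repeating the parcel row.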

Relationship Class:
A relationship class is an object in a geo-database that stores information about a
relationship between two feature classes, between a feature class and a non-spatial table, or
between two non-spatial tables. Both participants in a relationship class must be stored in the
same geo-database.
A relationship class stores information about associations among features and records
in a geo-database and can help ensure our data's integrity. Relates that are added to a layer
or table in a map are essentially the same as simple relationship classes defined in a
geo-database, except that they are saved with the map instead of in a geo-database.


Open Database Connectivity (ODBC):


Open Database Connectivity (ODBC) is an interface that allows applications to
access data in database management systems (DBMS) using SQL as a standard for accessing
the data. ODBC permits maximum interoperability, which means a single application can
access different DBMS. Application end users can then add ODBC database drivers to link
the application to their choice of DBMS.


Figure: Architecture of ODBC


The ODBC solution for accessing data led to ODBC database drivers, which are
dynamic-link libraries on Windows and shared objects on Linux/UNIX. These drivers allow
an application to gain access to one or more data sources. ODBC provides a standard
interface to allow application developers and vendors of database drivers to exchange data
between applications and data sources.


ODBC Driver Manager:
The ODBC Driver Manager loads and unloads ODBC drivers on behalf of an
application. The Windows platform comes with a default Driver Manager, while non-Windows
platforms can use an open-source ODBC Driver Manager such as unixODBC or iODBC.
The ODBC Driver Manager processes ODBC function calls or passes them to an ODBC
driver, and resolves ODBC version conflicts.
ODBC Driver:
The ODBC driver processes ODBC function calls, submits SQL requests to a specific
data source and returns results to the application. The ODBC driver may also modify an
application’s request so that the request conforms to syntax supported by the associated
database. A framework for building ODBC drivers is available from Simba
Technologies, as are ODBC drivers for many data sources, such as Salesforce, MongoDB,
Spark and more.
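
The layering described above (application, driver manager, driver) can be mimicked with a toy model. None of the classes below are real ODBC APIs; the names DriverManager, UpperCaseDbDriver and PlainDbDriver and the two pretend databases are assumptions invented purely to show the roles.

```python
# Toy model of the ODBC layering: application -> driver manager -> driver.
# These classes only mirror the roles; they are not real ODBC interfaces.

class UpperCaseDbDriver:
    """Pretend driver for a database that only accepts upper-case SQL."""
    def execute(self, sql):
        translated = sql.upper()          # adapt the request to the target syntax
        return f"upper-db ran: {translated}"

class PlainDbDriver:
    """Pretend driver for a database that accepts the SQL unchanged."""
    def execute(self, sql):
        return f"plain-db ran: {sql}"

class DriverManager:
    """Loads drivers by name on demand and forwards calls to them."""
    def __init__(self, available):
        self._available = available       # name -> driver class
        self._loaded = {}                 # name -> driver instance

    def execute(self, data_source, sql):
        if data_source not in self._loaded:
            self._loaded[data_source] = self._available[data_source]()  # load
        return self._loaded[data_source].execute(sql)                   # dispatch

    def unload(self, data_source):
        self._loaded.pop(data_source, None)

manager = DriverManager({"upper": UpperCaseDbDriver, "plain": PlainDbDriver})
query = "select name from parcels"
print(manager.execute("upper", query))
print(manager.execute("plain", query))
```

The application code issues the same query both times; only the driver chosen by the manager differs, which is the interoperability that ODBC aims for.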
The following are the steps involved in connecting a Java application program with a
database. The classes and methods named here (Class.forName(), DriverManager, Connection,
ResultSet) belong to Java's JDBC API, which plays the role for Java programs that ODBC
plays for native applications:
• Load the driver: The forName() method of the Class class is used to register the driver
class. This method dynamically loads the driver class.
• Establish connection: The getConnection() method of the DriverManager class is used to
establish a connection with the database.
• Prepare and execute SQL statement: The createStatement() method of the Connection
interface is used to create a statement. The executeQuery() and execute() methods of the
Statement interface are used to execute queries against the database.
• Process the result: The executeQuery() method returns a ResultSet object that can
be used to get all the records of a table.
• Close connection: The close() method is used to close the connection in order to free
the resources allocated to it.
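
The same five-step pattern can be tried without installing a database server. The sketch below uses Python's built-in sqlite3 module with an in-memory database; it is neither ODBC nor the Java API named in the steps, only an analogue of the same flow, and the stations table is hypothetical.

```python
import sqlite3  # step 1: load the database module (the analogue of loading a driver)

# Step 2: establish a connection (in-memory, so nothing needs to be installed).
conn = sqlite3.connect(":memory:")
try:
    # Step 3: create a statement object (a cursor) and execute SQL through it.
    cur = conn.cursor()
    cur.execute("CREATE TABLE stations (id INTEGER PRIMARY KEY, name TEXT)")
    cur.executemany("INSERT INTO stations (name) VALUES (?)",
                    [("Base",), ("Rover",)])
    cur.execute("SELECT id, name FROM stations ORDER BY id")

    # Step 4: process the result set row by row.
    stations = [(row[0], row[1]) for row in cur.fetchall()]
finally:
    # Step 5: close the connection to release its resources.
    conn.close()

print(stations)
```

Placing the close call in a finally block means the connection is released even if one of the statements fails.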

Figure: Java code for connecting to a MySQL database

Global Positioning System:


The Global Positioning System (GPS) is a U.S.-owned utility that provides users with
positioning, navigation, and timing (PNT) services. This system consists of three segments:
the space segment, the control segment, and the user segment. The U.S. Air Force develops,
maintains, and operates the space and control segments. Satellite navigation was first used by
the United States military in the 1960s; GPS itself, whose first satellite was launched in 1978,
expanded into civilian use over the following decades. Today, GPS receivers are included in
many commercial products, such as automobiles, smart phones, exercise watches, and GIS
devices.

Figure: Three Segments of GPS
Space Segment:

The GPS space segment consists of a constellation of satellites transmitting radio
signals to users. The United States is committed to maintaining the availability of at least 24
operational GPS satellites, 95% of the time. To meet this commitment, the Air Force has
been flying 31 operational GPS satellites for the past few years. GPS satellites fly in medium
Earth orbit (MEO) at an altitude of approximately 20,200 km (12,550 miles). Each satellite
circles the Earth twice a day. The satellites in the GPS constellation are arranged into six
equally-spaced orbital planes surrounding the Earth. Each plane contains four "slots"
occupied by baseline satellites. This 24-slot arrangement ensures users can view at least four
satellites from virtually any point on the planet.
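
The statement that each satellite circles the Earth twice a day can be checked from the quoted altitude with Kepler's third law, T = 2π·sqrt(a³/μ). The sketch below assumes a circular orbit, a mean Earth radius of 6,371 km and a gravitational parameter μ of 398,600.4418 km³/s²; these constants and the function name are not from the text above.

```python
import math

MU_EARTH_KM3_S2 = 398_600.4418   # Earth's gravitational parameter (assumed value)
EARTH_RADIUS_KM = 6_371.0        # mean Earth radius (assumed value)
GPS_ALTITUDE_KM = 20_200.0       # altitude quoted in the text

def circular_orbit_period_s(altitude_km):
    """Period of a circular orbit from Kepler's third law: T = 2*pi*sqrt(a^3/mu)."""
    a = EARTH_RADIUS_KM + altitude_km          # orbital radius from Earth's centre
    return 2 * math.pi * math.sqrt(a ** 3 / MU_EARTH_KM3_S2)

period_hours = circular_orbit_period_s(GPS_ALTITUDE_KM) / 3600
orbits_per_day = 24 / period_hours
print(f"period = {period_hours:.2f} h, orbits per day = {orbits_per_day:.2f}")
```

The period comes out at roughly 12 hours, which agrees with two orbits per day.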

Figure: Constellation of satellites


The Air Force normally flies more than 24 GPS satellites to maintain coverage
whenever the baseline satellites are serviced or decommissioned. The extra satellites may
increase GPS performance but are not considered part of the core constellation. In June 2011,
the Air Force successfully completed a GPS constellation expansion known as the
"Expandable 24" configuration. Three of the 24 slots were expanded, and six satellites were
repositioned, so that three of the extra satellites became part of the constellation baseline. As
a result, GPS now effectively operates as a 27-slot constellation with improved coverage in
most parts of the world.

Control Segments:
The GPS control segment consists of a global network of ground facilities that track
the GPS satellites, monitor their transmissions, perform analyses, and send commands and
data to the constellation. The current Operational Control Segment (OCS) includes a master
control station, an alternate master control station, 11 command and control antennas, and 16
monitoring sites. The locations of these facilities are shown in the map below.


Figure: GPS Control Segments


The GPS constellation delivers consistently high performance thanks to the dedicated
efforts of its operators: the men and women of the U.S. Air Force's 2nd Space Operations
Squadron (2SOPS) and the Air Force Reserve's 19th Space Operations Squadron (19SOPS)
at Schriever Air Force Base, Colorado.

User Segments:
Like the Internet, GPS is an essential element of the global information infrastructure.
The free, open, and dependable nature of GPS has led to the development of hundreds of
applications affecting every aspect of modern life. GPS technology is now in everything from
cell phones and wristwatches to bulldozers, shipping containers, and ATMs.


GPS BASED MAPPING:


The surveying and mapping community was one of the first to take advantage of GPS
because it dramatically increased productivity and resulted in more accurate and reliable data.
Today, GPS is a vital part of surveying and mapping activities around the world. When used
by skilled professionals, GPS provides surveying and mapping data of the highest accuracy.
GPS-based data collection is much faster than conventional surveying and mapping
techniques, reducing the amount of equipment and labor required. A single surveyor can now
accomplish in one day what once took an entire team weeks to do. GPS supports the accurate
mapping and modeling of the physical world, from mountains and rivers to streets and
buildings to utility lines and other resources. Features measured with GPS can be displayed
on maps and in geographic information systems (GIS) that store, manipulate, and display
geographically referenced data.
Governments, scientific organizations, and commercial operations throughout the
world use GPS and GIS technology to facilitate timely decisions and wise use of resources.
Any organization or agency that requires accurate location information about its assets can
benefit from the efficiency and productivity provided by GPS positioning. Unlike
conventional techniques, GPS surveying is not bound by constraints such as line-of-sight
visibility between survey stations. The stations can be deployed at greater distances from
each other and can operate anywhere with a good view of the sky, rather than being confined
to remote hilltops as previously required.
GPS is especially useful in surveying coasts and waterways, where there are few
land-based reference points. Survey vessels combine GPS positions with sonar depth soundings to
make the nautical charts that alert mariners to changing water depths and underwater hazards.
Bridge builders and offshore oil rigs also depend on GPS for accurate hydrographic surveys.
Land surveyors and mappers can carry GPS systems in backpacks or mount them on vehicles
to allow rapid, accurate data collection. Some of these systems communicate wirelessly with
reference receivers to deliver continuous, real-time, centimeter-level accuracy and
unprecedented productivity gains. To achieve the highest level of accuracy, most
survey-grade receivers use two GPS radio frequencies: L1 and L2. Currently, there is no fully
functional civilian signal at L2, so these receivers leverage a military L2 signal using
"codeless" techniques.

*****
