Chapter 12

XML: Extensible
Markup
Language

Copyright © 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley


Chapter 12 Outline
 Structured, Semistructured,
and Unstructured Data
 XML Hierarchical (Tree) Data Model
 XML Documents, DTD, and XML Schema
 Storing and Extracting XML Documents
from Databases
 XML Languages
 Extracting XML Documents from Relational
Databases
Copyright © 2011 Ramez Elmasri and Shamkant Navathe
XML: Extensible
Markup Language
 Data sources
 Database storing data for Internet applications
 Hypertext documents
 Common method of specifying contents and
formatting of Web pages
 XML data model

Structured, Semistructured,
and Unstructured Data
 Structured data
 Represented in a strict format
 Example: information stored in databases

 Semistructured data
 Has a certain structure
 Not all information collected will have identical
structure

Structured, Semistructured,
and Unstructured Data (cont’d.)
 Schema information mixed in with data values
 Self-describing data
 May be displayed as a directed graph
• Labels or tags on directed edges represent:
• Schema names
• Names of attributes
• Object types (or entity types or classes)
• Relationships
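The directed-graph view of self-describing data can be sketched in a few lines of Python. This is only an illustration: the project and worker values are made-up sample data, and `labeled_edges` is a hypothetical helper, not something defined in the chapter.

```python
# Self-describing (semistructured) data: the schema names travel with the values.
# All names and values here are illustrative assumptions.
company = {
    "project": [
        {"name": "Product X", "number": 1,
         "worker": [{"ssn": "123456789", "hours": 32.5}]},
        # A second project with a different shape: no number, an extra field.
        {"name": "Product Y", "location": "Sugarland"},
    ]
}

def labeled_edges(node, source="company"):
    """Yield (source, label, target) triples for the directed-graph view."""
    if isinstance(node, dict):
        for label, child in node.items():
            children = child if isinstance(child, list) else [child]
            for item in children:
                if isinstance(item, dict):
                    # Internal node: the edge label names an object type.
                    yield (source, label, "<object>")
                    yield from labeled_edges(item, source=label)
                else:
                    # Leaf node: the edge label names an attribute.
                    yield (source, label, item)

edges = list(labeled_edges(company))
for edge in edges:
    print(edge)
```

The two projects carry different sets of labels, which is the sense in which not all collected information has an identical structure.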

Structured, Semistructured,
and Unstructured Data (cont’d.)
 Unstructured data
 Limited indication of the type of data
 Example: a document that contains information
embedded within it
 HTML tag
 Text that appears between angled brackets:
<...>
 End tag
 Tag with a slash: </...>

Structured, Semistructured,
and Unstructured Data (cont’d.)
 HTML uses a large number of predefined
tags
 HTML documents
 Do not include schema information about type
of data
 Static HTML page
 All information to be displayed explicitly spelled
out as fixed text in HTML file

XML Hierarchical (Tree) Data
Model
 Elements and attributes
 Main structuring concepts used to construct an
XML document
 Complex elements
 Constructed from other elements hierarchically
 Simple elements
 Contain data values
 XML tag names
 Describe the meaning of the data elements in
the document
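As a rough illustration of complex versus simple elements, the sketch below parses a small document with Python's standard `xml.etree.ElementTree` module. The document content and the `classify` helper are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Illustrative document; element names and values are assumptions.
doc = """<project>
  <name>ProductX</name>
  <number>1</number>
  <worker>
    <ssn>123456789</ssn>
    <hours>32.5</hours>
  </worker>
</project>"""

root = ET.fromstring(doc)

def classify(element):
    """Return 'complex' if the element contains child elements, else 'simple'."""
    return "complex" if len(element) > 0 else "simple"

# Walk the whole tree and record the kind of each element by tag name.
kinds = {el.tag: classify(el) for el in root.iter()}
print(kinds)
```

Here `project` and `worker` are built from other elements, while `name`, `number`, `ssn`, and `hours` hold data values.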
XML Hierarchical (Tree) Data
Model (cont’d.)
 Tree model or hierarchical model
 Main types of XML documents
 Data-centric XML documents
 Document-centric XML documents
 Hybrid XML documents
 Schemaless XML documents
 Do not follow a predefined schema of element
names and corresponding tree structure

XML Hierarchical (Tree) Data
Model (cont’d.)
 XML attributes
 Describe properties and characteristics of the
elements (tags) within which they appear
 May reference another element in another
part of the XML document
 Common to use attribute values in one element
as the references
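One way to picture an attribute acting as a reference is sketched below. The `id` and `dept` attribute names are hypothetical, and the lookup is done by hand in Python rather than by any schema-level mechanism.

```python
import xml.etree.ElementTree as ET

# Illustrative document: the employee's dept attribute holds the value of a
# department's id attribute elsewhere in the same document.
doc = """<company>
  <department id="d5"><name>Research</name></department>
  <employee ssn="123456789" dept="d5"><name>Smith</name></employee>
</company>"""

root = ET.fromstring(doc)

# Index the referenced elements by their identifying attribute value.
departments = {d.get("id"): d for d in root.findall("department")}

# Follow the reference stored in the employee element.
employee = root.find("employee")
referenced = departments[employee.get("dept")]
dept_name = referenced.findtext("name")
print(dept_name)
```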

XML Documents, DTD, and XML
Schema
 Well formed
 Has XML declaration
• Indicates version of XML being used as well as any
other relevant attributes
 Every element must have a matching pair of start and
end tags
• Nested within the start and end tags of its parent element
 DOM (Document Object Model)
 Manipulate resulting tree representation
corresponding to a well-formed XML document
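A minimal sketch of both ideas, using Python's standard `xml.dom.minidom`: the parser rejects a document that is not well formed, and the DOM tree of a well-formed one can be read and modified. The sample documents and the `parse_if_well_formed` helper are assumptions for this example.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

def parse_if_well_formed(text):
    """Return a DOM document, or None if the text is not well formed."""
    try:
        return minidom.parseString(text)
    except ExpatError:
        return None

good = '<?xml version="1.0" standalone="yes"?><project><name>ProductX</name></project>'
bad = "<project><name>ProductX</project>"  # <name> has no matching end tag

dom = parse_if_well_formed(good)
rejected = parse_if_well_formed(bad)

# Read a value from the tree.
name_text = dom.getElementsByTagName("name")[0].firstChild.data

# Manipulate the tree: add a <number> element under the root.
number = dom.createElement("number")
number.appendChild(dom.createTextNode("1"))
dom.documentElement.appendChild(number)

print(dom.documentElement.toxml())
```

Note that DOM builds the whole tree in memory before the program touches it.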
XML Documents, DTD, and XML
Schema (cont’d.)
 SAX (Simple API for XML)
 Processing of XML documents on the fly
• Notifies processing program through callbacks
whenever a start or end tag is encountered
 Makes it easier to process large documents
 Allows for streaming
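The callback style can be sketched with Python's standard `xml.sax` module. The handler class name `TagRecorder` and the sample document are assumptions for this example.

```python
import xml.sax

class TagRecorder(xml.sax.ContentHandler):
    """Record every start-tag and end-tag callback in arrival order."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        # Called by the parser each time a start tag is encountered.
        self.events.append(("start", name))

    def endElement(self, name):
        # Called by the parser each time an end tag is encountered.
        self.events.append(("end", name))

handler = TagRecorder()
xml.sax.parseString(
    b"<projects><project><name>ProductX</name></project></projects>",
    handler,
)
print(handler.events)
```

The handler sees the tags one at a time and never holds the full tree, which is what makes this style workable for large or streamed documents.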

XML Documents, DTD, and XML
Schema (cont’d.)
 Valid
 Document must be well formed
 Document must follow a particular schema
 Start and end tag pairs must follow structure
specified in separate XML DTD (Document
Type Definition) file or XML schema file

XML Documents, DTD, and XML
Schema (cont’d.)
 Notation for specifying elements
 XML DTD
 Data types in DTD are not very general
 Special syntax
• Requires specialized processors
 All DTD elements are always forced to follow the
specified ordering of the document
• Unordered elements are not permitted

Copyright © 2011 Ramez Elmasri and Shamkant Navathe


XML Schema
 XML schema language
 Standard for specifying the structure of XML
documents
 Uses same syntax rules as regular XML
documents
• Same processors can be used on both

XML Schema (cont’d.)
 Identify specific set of XML schema
language elements (tags) being used
 Specify a file stored at a Web site location
 XML namespace
 Defines the set of commands (names) that can
be used
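To make the namespace idea concrete, the sketch below parses a tiny schema fragment with `xml.etree.ElementTree`. The `xsd` prefix is bound to the W3C XML Schema namespace; the fragment itself is a made-up minimal example, not a complete schema.

```python
import xml.etree.ElementTree as ET

# The W3C XML Schema namespace; the xsd prefix is only a local shorthand for it.
XSD_NS = "http://www.w3.org/2001/XMLSchema"

schema_text = f"""<xsd:schema xmlns:xsd="{XSD_NS}">
  <xsd:element name="company"/>
</xsd:schema>"""

root = ET.fromstring(schema_text)

# ElementTree expands the prefix, so the tag carries the full namespace name.
print(root.tag)

# Searches must name the namespace too; a prefix map keeps the path readable.
declared = root.findall("xsd:element", {"xsd": XSD_NS})
print([el.get("name") for el in declared])
```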

XML Schema (cont’d.)
 XML schema concepts:
 Description and XML namespace
 Annotations, documentation, language
 Elements and types
 First level element
 Element types, minOccurs, and maxOccurs
 Keys
 Structures of complex elements
 Composite attributes
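Because a schema is itself an XML document, an ordinary XML parser can read it. The sketch below pulls the first-level element and the `minOccurs` and `maxOccurs` settings out of a small fragment. The element and type names are assumptions, and the fragment is only parsed here, not used to validate anything.

```python
import xml.etree.ElementTree as ET

NS = {"xsd": "http://www.w3.org/2001/XMLSchema"}

# Illustrative fragment; the Department and Employee types are not defined here.
schema_text = """<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="company">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="department" type="Department"
                     minOccurs="0" maxOccurs="unbounded"/>
        <xsd:element name="employee" type="Employee"
                     minOccurs="0" maxOccurs="unbounded"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>"""

root = ET.fromstring(schema_text)

# The first-level (root) element declaration of the described documents.
first_level = root.find("xsd:element", NS)

# Occurrence constraints of the nested declarations; both default to 1 in XML Schema.
occurrences = {
    el.get("name"): (el.get("minOccurs", "1"), el.get("maxOccurs", "1"))
    for el in first_level.iterfind(".//xsd:element", NS)
}
print(first_level.get("name"), occurrences)
```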
Storing and Extracting XML
Documents from Databases
 Most common approaches
 Using a DBMS to store the documents as text
• Can be used if DBMS has a special module for
document processing
 Using a DBMS to store document contents as
data elements
• Requires mapping algorithms to design a database
schema that is compatible with XML document
structure
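The two approaches above can be contrasted in a small sketch using Python's standard `sqlite3` module as a stand-in DBMS. The table and column names are assumptions, and the hand-written mapping takes the place of a real mapping algorithm.

```python
import sqlite3
import xml.etree.ElementTree as ET

doc = "<project><name>ProductX</name><number>1</number></project>"
conn = sqlite3.connect(":memory:")

# Approach 1: store the whole document as text in a single column.
conn.execute("CREATE TABLE xml_docs (id INTEGER PRIMARY KEY, body TEXT)")
conn.execute("INSERT INTO xml_docs (body) VALUES (?)", (doc,))

# Approach 2: store the document contents as data elements, using a
# schema designed to match the document structure.
conn.execute("CREATE TABLE project (name TEXT, number INTEGER)")
root = ET.fromstring(doc)
conn.execute(
    "INSERT INTO project VALUES (?, ?)",
    (root.findtext("name"), int(root.findtext("number"))),
)

stored_text = conn.execute("SELECT body FROM xml_docs").fetchone()[0]
stored_row = conn.execute("SELECT name, number FROM project").fetchone()
print(stored_text)
print(stored_row)
```

With the first approach the database cannot query inside the document without extra document-processing support; with the second, ordinary SQL works but the document must be reassembled on extraction.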

Storing and Extracting XML
Documents from Databases
(cont’d.)
 Designing a specialized system for storing
native XML data
• Called Native XML DBMSs
 Creating or publishing customized XML
documents from preexisting relational
databases
• Use a separate middleware software layer to handle
conversions

XML Languages
 Two query language standards
 XPath
• Specify path expressions to identify certain nodes
(elements) or attributes within an XML document
that match specific patterns
 XQuery
• Uses XPath expressions but has additional
constructs

XPath: Specifying Path
Expressions in XML
 XPath expression
 Returns a sequence of items that satisfy a
certain pattern as specified by the expression
 Either values (from leaf nodes) or elements or
attributes
 Qualifier conditions
• Further restrict nodes that satisfy pattern
 Separators used when specifying a path:
 Single slash (/) and double slash (//)
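The separators and qualifier conditions can be tried out with `xml.etree.ElementTree`, with one caveat: that module implements only a limited subset of XPath, and paths are written relative to an element (starting with `.`) rather than from the document root. The company document below is made-up sample data.

```python
import xml.etree.ElementTree as ET

# Illustrative document; element names and values are assumptions.
doc = """<company>
  <department><departmentName>Research</departmentName></department>
  <employee>
    <employeeName>Smith</employeeName>
    <employeeSalary>80000</employeeSalary>
  </employee>
  <employee>
    <employeeName>Wong</employeeName>
    <employeeSalary>40000</employeeSalary>
  </employee>
</company>"""
company = ET.fromstring(doc)

# Single slash: employee elements that are direct children of company.
direct = company.findall("./employee")

# Double slash: employeeName elements at any depth below company.
names = [e.text for e in company.findall(".//employeeName")]

# Qualifier condition in square brackets restricts the matching nodes.
smith_salary = [
    e.text for e in company.findall("./employee[employeeName='Smith']/employeeSalary")
]

# Numeric comparisons are outside ElementTree's subset, so this condition
# is applied in Python instead of in the path expression.
high_paid = [
    e.findtext("employeeName")
    for e in company.findall(".//employee")
    if int(e.findtext("employeeSalary")) > 70000
]
print(len(direct), names, smith_salary, high_paid)
```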

XPath: Specifying Path
Expressions in XML (cont’d.)
 Attribute name prefixed by the @ symbol
 Wildcard symbol *
 Stands for any element
 Example: /company/*
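A short sketch of the `@` and `*` notations, again using the XPath subset in `xml.etree.ElementTree`. The attribute names are assumptions. ElementTree can test attributes inside a qualifier but cannot return attribute nodes directly, so values are read with `get`.

```python
import xml.etree.ElementTree as ET

# Illustrative document; attribute names and values are assumptions.
doc = """<company>
  <department id="d5"/>
  <employee ssn="123456789"/>
  <employee/>
</company>"""
company = ET.fromstring(doc)

# Wildcard: every child element of company, whatever its name (/company/*).
children = [e.tag for e in company.findall("./*")]

# @ prefix: employees that have an ssn attribute at all.
with_ssn = company.findall("./employee[@ssn]")

# @ prefix with a value: employees whose ssn attribute equals the given string.
matched = company.findall("./employee[@ssn='123456789']")

print(children, len(with_ssn), [e.get("ssn") for e in matched])
```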

XPath: Specifying Path
Expressions in XML (cont’d.)
 Axes
 Move in multiple directions from current node
in path expression
 Include self, child, descendant, attribute,
parent, ancestor, previous sibling, and next
sibling

XPath: Specifying Path
Expressions in XML (cont’d.)
 Main restriction of XPath path expressions
 Path that specifies the pattern also specifies
the items to be retrieved
 Difficult to specify certain conditions on the
pattern while separately specifying which result
items should be retrieved

XQuery: Specifying Queries in
XML
 XQuery FLWR expression
 Four main clauses of XQuery
 Form:
FOR <variable bindings to individual
nodes (elements)>
LET <variable bindings to collections of
nodes (elements)>
WHERE <qualifier conditions>
RETURN <query result specification>
 Zero or more instances of FOR and LET
clauses
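Python's standard library has no XQuery engine, so the following is only an analogy: a plain loop whose steps mirror the four clauses. The sample data, the salary threshold, and the `<res>` wrapper element are assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET

# Illustrative document; element names and values are assumptions.
doc = """<company>
  <employee>
    <employeeName>Smith</employeeName>
    <employeeSalary>80000</employeeSalary>
  </employee>
  <employee>
    <employeeName>Wong</employeeName>
    <employeeSalary>40000</employeeSalary>
  </employee>
</company>"""
company = ET.fromstring(doc)

results = []
# FOR: bind x to each individual employee node in turn.
for x in company.findall("./employee"):
    # LET: bind a further variable derived from x.
    salary = int(x.findtext("employeeSalary"))
    # WHERE: qualifier condition on the bound variables.
    if salary > 70000:
        # RETURN: build the result item for this binding.
        res = ET.Element("res")
        res.text = x.findtext("employeeName")
        results.append(ET.tostring(res, encoding="unicode"))

print(results)
```

Unlike a bare path expression, the condition (salary) and the returned item (name) are specified separately.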
XQuery: Specifying Queries in
XML (cont’d.)
 XQuery contains powerful constructs to
specify complex queries
 www.w3.org
 Contains documents describing the latest
standards related to XML and XQuery

Other Languages and Protocols
Related to XML
 Extensible Stylesheet Language (XSL)
 Define how a document should be rendered for
display by a Web browser
 Extensible Stylesheet Language for
Transformations (XSLT)
 Transform one structure into a different structure
 Web Services Description Language
(WSDL)
 Description of Web Services in XML
Other Languages and Protocols
Related to XML (cont’d.)
 Simple Object Access Protocol (SOAP)
 Platform-independent and programming
language-independent protocol for messaging
and remote procedure calls
 Resource Description Framework (RDF)
 Languages and tools for exchanging and
processing of meta-data (schema) descriptions
and specifications over the Web

Extracting XML Documents from
Relational Databases
 Creating hierarchical XML views over flat or
graph-based data
 Representational issues arise when converting
data from a database system into XML
documents
 UNIVERSITY database example

Breaking Cycles to Convert
Graphs into Trees
 Complex subset with one or more cycles
 Indicate multiple relationships among the
entities
 Difficult to decide how to create the document
hierarchies
 Can replicate the entity types involved to
break the cycles
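One possible way to picture replication is sketched below. It is not the chapter's own procedure: the relationship graph is a hypothetical subset of entity types, and `unroll` is an illustrative helper that walks outward from a chosen root and copies an entity type each time it is reached along a different path.

```python
def unroll(graph, root, path=()):
    """Build a (name, children) tree from root, skipping any edge that would
    return to an entity type already on the current path."""
    path = path + (root,)
    children = [unroll(graph, nxt, path) for nxt in graph[root] if nxt not in path]
    return (root, children)

def count(tree, name):
    """Count how many copies of an entity type appear in the tree."""
    node, children = tree
    return (node == name) + sum(count(child, name) for child in children)

# Hypothetical relationship graph; SECTION, STUDENT, DEPARTMENT, and
# INSTRUCTOR form a cycle.
graph = {
    "COURSE": ["SECTION"],
    "SECTION": ["COURSE", "STUDENT", "INSTRUCTOR"],
    "STUDENT": ["SECTION", "DEPARTMENT"],
    "INSTRUCTOR": ["SECTION", "DEPARTMENT"],
    "DEPARTMENT": ["STUDENT", "INSTRUCTOR"],
}

tree = unroll(graph, "COURSE")
print(tree)
print(count(tree, "DEPARTMENT"))
```

DEPARTMENT ends up under both STUDENT and INSTRUCTOR, so the result is a tree at the cost of duplicated entity types. A real design would normally prune this to the hierarchy the application actually needs.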

Other Steps for Extracting XML
Documents from Databases
 Create correct query in SQL to extract
desired information for XML document
 Restructure query result from flat relational
form to XML tree structure
 Customize query to select either a single
object or multiple objects into document
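These steps can be sketched end to end with `sqlite3` and `xml.etree.ElementTree`. The tables, columns, and rows are assumptions standing in for part of a university database, and the grouping loop is one simple way to do the restructuring.

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE course (course_no TEXT, name TEXT);
CREATE TABLE section (course_no TEXT, section_no INTEGER);
INSERT INTO course VALUES ('CS101', 'Intro to CS');
INSERT INTO section VALUES ('CS101', 1);
INSERT INTO section VALUES ('CS101', 2);
""")

# Step 1: an SQL query that extracts the desired information as flat rows.
rows = conn.execute(
    "SELECT c.course_no, c.name, s.section_no "
    "FROM course c JOIN section s ON s.course_no = c.course_no "
    "ORDER BY c.course_no, s.section_no"
).fetchall()

# Step 2: restructure the flat result into a tree, one <course> per course
# with its <section> elements nested underneath.
root = ET.Element("courses")
courses = {}
for course_no, name, section_no in rows:
    if course_no not in courses:
        course = ET.SubElement(root, "course", number=course_no)
        ET.SubElement(course, "name").text = name
        courses[course_no] = course
    ET.SubElement(courses[course_no], "section").text = str(section_no)

xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

Adding a `WHERE c.course_no = ?` condition to the query would select a single object into the document instead of all of them.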

Summary
 Three main types of data: structured,
semistructured, and unstructured
 XML standard
 Tree-structured (hierarchical) data model
 XML documents and the languages for
specifying the structure of these documents
 XPath and XQuery languages
 Query XML data
