
4020 Week 3

The document outlines a class on SGML, HTML, XML and the World Wide Web framework. It discusses these topics and provides examples of XML and HTML code. It also covers XML goals and comparisons between HTML and XML.

Uploaded by

khanpmsg26

Outline of Today’s Class

◆ SGML, HTML and XHTML

◆ XML and DTD

◆ XML Examples

◆ The Framework of WWW


SGML

◆ Standard Generalized Markup Language


◆ Developed by a committee!
◆ Led by Charles Goldfarb, 1978-1986
◆ A grammar to define the structure of documents

◆ Rules define the construct or structure


◆ Terminals are <tags> and strings
HTML & XML
➢ HTML is an application of SGML with a single shared DTD

➢ HTMLDOC::=(<html> HEAD BODY </html>)

➢ XML is a subset of SGML with many DTDs allowed
XML
Uses tags to identify semantics of data
◆ looks like HTML, but isn’t
<slide><title>Introduction</title>
<author><first>Jimmy</first>
<last>Huang</last>
</author>
<content>XML this and that</content>
</slide>
◆ is license free, platform-independent and
well-supported
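Because the tags name the meaning of each piece of data, a program can pull values out by name. The following is a minimal sketch using Python's standard xml.etree.ElementTree module on the slide's example; the function name describe_slide and the variable names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# The slide's example document, copied into a string.
SLIDE_XML = """<slide><title>Introduction</title>
<author><first>Jimmy</first>
<last>Huang</last>
</author>
<content>XML this and that</content>
</slide>"""


def describe_slide(xml_text):
    """Return the slide's title, author name and content as a dict."""
    root = ET.fromstring(xml_text)
    first = root.findtext("author/first")
    last = root.findtext("author/last")
    return {
        "title": root.findtext("title"),
        "author": first + " " + last,
        "content": root.findtext("content"),
    }


slide_info = describe_slide(SLIDE_XML)
print(slide_info)
```

The lookups use tag names only; nothing in the code depends on how the slide would be displayed.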
HTML

◆ Hypertext Markup Language
◆ Presents documents via WWW browsers
◆ Specifies document layout and hyperlinks
◆ Predefines a set of tags (i.e. a common DTD)
HTML - Advantages
◆ Simple - fixed set of tags
◆ Portable - used with all browsers
◆ Linking - within and to external documents

HTML - Disadvantages
◆ Limited tag set
◆ Can’t separate the presentation from content
◆ Can’t define structure of contents
XHTML

Extensible HyperText Markup Language
XHTML Basics
◆ Very few real changes from HTML
◆ But more strict

◆ All tags are in lowercase


◆ All tags must be closed
➢ Empty tags
➢ Paired tags
XHTML Document Structure
Overlap versus Nesting
XHTML tags
◆ Start tags and end tags
◆ Start tags - delimited by < and >
◆ End tags - delimited by </ and >
➢ <h1>This is a Large Heading</h1>
➢ <br />This text starts on a new line.

◆ Some start tags also include attributes, which further define information about the element.
!DOCTYPE
◆ HTML 3.2
➢ <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Draft//EN">
◆ Netscape's HTML standard
➢ <!DOCTYPE HTML PUBLIC "-//WebTechs//DTD Mozilla HTML 2.0//EN">
◆ Not strictly necessary for HTML, but highly recommended
◆ Future browsers can still attempt to display your older documents (written to previous HTML standards) in the way that was originally intended, even though the HTML language may have evolved
◆ XHTML
➢ <?xml version = "1.0"?>
➢ <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
!DOCTYPE

Parts labelled on the slide: !DOCTYPE, Title tags, Body tags

<?xml version = "1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<!-- Comments: name_of_webpage.html -->
<html xmlns = "http://www.w3.org/1999/xhtml">
<head>
<title> Web Engineering: XHTML I </title>
</head>
<body>
<p>Welcome to XHTML!</p>
</body>
</html>
Images

<?xml version = "1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<!-- Pictures with XHTML -->
<html xmlns = "http://www.w3.org/1999/xhtml">
<head>
<title>Web Engineering - pictures</title>
</head>
<body>
<p><img src = "angelheart.jpg" height = "251" width = "367"
alt = "An angel" />
<img src = "grail.jpg" height = "180" width = "130"
alt = "A chalice" /></p>
</body>
</html>

◆ The value of the src attribute of the image element is the location of the image file.
◆ The height and width attributes of the image element give the height and width of the image.
◆ The value of the alt attribute gives a description of the image. This description is displayed if the image cannot be displayed.
Colours
◆ <BODY TEXT="aqua">
➢ Named colours: aqua, black, blue, fuchsia, gray, green, lime, maroon, navy, olive, purple, red, silver, teal, white, yellow
◆ <BODY TEXT="#00FF00">
◆ <FONT COLOR = "#rrggbb" | "colour name">text</FONT>
➢ 000000 = BLACK
➢ 00FF00 = BRIGHT-GREEN
➢ FFFFFF = WHITE
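The #rrggbb notation packs three two-digit hexadecimal numbers (red, green, blue) into one string. As a small sketch of how the three examples above decode, assuming a helper named parse_hex_colour that is introduced here only for illustration:

```python
def parse_hex_colour(value):
    """Split a '#rrggbb' string into (red, green, blue) integers from 0 to 255."""
    if len(value) != 7 or not value.startswith("#"):
        raise ValueError("expected '#rrggbb', got %r" % value)
    # Characters 1-2 are red, 3-4 green, 5-6 blue, each a base-16 number.
    return tuple(int(value[i:i + 2], 16) for i in (1, 3, 5))


black = parse_hex_colour("#000000")
bright_green = parse_hex_colour("#00FF00")
white = parse_hex_colour("#FFFFFF")
print(black, bright_green, white)
```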
Inline Styles
<h1 style="color:blue; font-style: italic">First Stylesheet Example</h1>
<p>The first example of stylesheets uses an inline style.</p>

<h1>Second Stylesheet Example</h1>
<p>The second example of stylesheets uses a document-level style.</p>

<h1>Third Stylesheet Example</h1>
<p>The third example of stylesheets uses an external stylesheet.</p>
Demonstration:
inline_css.html
XML

EXtensible Markup Language

XML Introduction
◆ The Extensible Markup Language (XML) is a document
processing standard proposed by the World Wide Web
Consortium (W3C), which is related to Standard
Generalised Markup Language (SGML).
◆ Possible to search, sort, manipulate and render XML using the Extensible Stylesheet Language (XSL).
◆ Highly portable
◆ Files end in the .xml extension.
XML & W3C
• XML has been in development since the 1960s through its parent, SGML (Standard Generalized Markup Language), which is also the parent of HTML

• XML is a streamlined version of SGML, designed in 1996 by a working group of the World Wide Web Consortium (W3C) for transmitting structured data over the Web

• Became a W3C Recommendation in February 1998

- www.w3.org/xml
- www.xml.com/axml/axml.html (annotated version)
XML-related Technologies
◆ DTD (Document Type Definition) and XML Schemas are used to define legal XML tags and their attributes for particular purposes

◆ CSS (Cascading Style Sheets) describe how to display HTML or XML in a browser

◆ XSLT (eXtensible Stylesheet Language Transformations) and XPath are used to translate from one form of XML to another

◆ DOM (Document Object Model), SAX (Simple API for XML), and JAXP (Java API for XML Processing) are all APIs for XML parsing
From HTML to XML..
• HTML major drawback – information loses its
structure when translated into HTML
• HTML is a presentation-oriented markup language,
so information embodied in it is difficult to process
• Information and knowledge servers are overloaded
since we have to search information and perform
format processing
• Servers often answer the same request many times
if users request several views on the same data
From HTML to XML..
• HTML:
- Lacks extensibility – can’t create tags or attributes
to parameterise or semantically qualify data
- Lacks structure – does not support the
specification of deep structures needed to represent
database schemas or object-oriented hierarchies
- Lacks validation – does not support language
specification that lets applications check imported
data’s structural validity
XML Goals
XML was designed as a portable, platform-independent data storage format. Its design goals:

• support a wide variety of applications,
• be easy to use across the Internet,
• be compatible with SGML,
• make it easy to create programs that process XML,
• be clear and legible (self-describing),
• XML documents should be easy to create,
• XML designs should be quickly prepared, formal and concise, etc.
XML
• XML is not for displaying information but for managing
information.
• A working group of the World Wide Web Consortium (W3C) created XML as a standard for creating markup languages.
• It was designed for distributing structured documents over the web
• A kind of "light" SGML (Standard Generalized Markup Language), simplified to meet Web requirements
• Unlike HTML, XML lets users:
- Extract data from a document
- Define their own tags and attributes
- Define data structures and nest document structures to any complexity level
- Make applications that validate a document's structure. Any XML document can contain an optional description of its grammar for use by applications that perform structural validation
XML
◆ The problem that XML helps us to solve is how to transfer data
between servers, or between the client and the server.
◆ It is a Markup language for describing structured data – content is
separated from presentation.
◆ XML documents contain only data, not formatting instructions
➢ Applications that process XML documents must decide how to display the document's data
◆ Language for creating markup languages
➢ Can create new tags
◆ For example, a PDA (personal digital assistant) may render an XML document differently than a wireless phone or desktop computer would render that document.
HTML and XML
XML stands for eXtensible Markup Language
◆ HTML is used to mark up text so it can be displayed to users
➢ XML is used to mark up data so it can be processed by computers
◆ HTML describes both structure (e.g. <p>, <h2>, <em>) and appearance (e.g. <br>, <font>, <i>)
➢ XML describes only content, or "meaning"
◆ HTML uses a fixed, unchangeable set of tags
➢ In XML, you make up your own tags
XML
◆ XML is a meta-language
◆ With HTML, existing markup is static: <HEAD> and <BODY>, for example, are tightly integrated into the HTML standard and cannot be changed, and the tag set is extremely difficult to extend.
◆ XML, on the other hand, allows you to create your own markup tags and configure each to your liking, for example:
➢ <WebEngHeading>
➢ <WebEngSummary>
➢ <WebEngReallyWildFont>
◆ Each of these elements can be defined through user-defined document type definitions (DTDs), and stylesheets can be applied to one or more XML documents.
◆ There are no 'correct' tags for an XML document, except those defined by the author
Some Code
◆ Schema
◆ Entity
➢ Passport Details
◆ SubEntities
➢ Last Name
➢ First Name
➢ Address
◆ Entity
➢ Address
◆ SubEntities
➢ Street
➢ City
➢ Town
➢ State
➢ Province
➢ ……..
DTD
<!ELEMENT passport_details (last_name,first_name+,address)>
<!ELEMENT last_name (#PCDATA)>
<!ELEMENT first_name (#PCDATA)>
<!ELEMENT address (street,(city|town),(state|province),(ZIP|postal_code),country,contact_no?,email*)>
<!ELEMENT street (#PCDATA)>
<!ELEMENT city (#PCDATA)>
<!ELEMENT town (#PCDATA)>
<!ELEMENT state (#PCDATA)>
<!ELEMENT province (#PCDATA)>
<!ELEMENT ZIP (#PCDATA)>
<!ELEMENT postal_code (#PCDATA)>
<!ELEMENT country (#PCDATA)>
<!ELEMENT contact_no (#PCDATA)>
<!ELEMENT email (#PCDATA)>
Internal DTD and Instance
<?xml version='1.0'?>
<!DOCTYPE passport_details [
<!ELEMENT passport_details (last_name,first_name+,address)>
<!ELEMENT last_name (#PCDATA)>
<!ELEMENT first_name (#PCDATA)>
<!ELEMENT address (street,(city|town),(state|province),(ZIP|postal_code),country,contact_no?,email*)>
<!ELEMENT street (#PCDATA)>
<!ELEMENT city (#PCDATA)>
<!ELEMENT town (#PCDATA)>
<!ELEMENT state (#PCDATA)>
<!ELEMENT province (#PCDATA)>
<!ELEMENT ZIP (#PCDATA)>
<!ELEMENT postal_code (#PCDATA)>
<!ELEMENT country (#PCDATA)>
<!ELEMENT contact_no (#PCDATA)>
<!ELEMENT email (#PCDATA)>
]>
<passport_details>
<last_name>Smith</last_name>
<first_name>Jo</first_name>
<first_name>Stephen</first_name>
<address>
<street>1 Great Street</street>
<city>GreatCity</city>
<state>GreatState</state>
<postal_code>1234</postal_code>
<country>GreatLand</country>
<email>jhuang@yorku.ca</email>
</address>
</passport_details>
Shared DTD
XML Document specifies the DTD
<?xml version='1.0'?>

<!DOCTYPE passport_details SYSTEM "PassportExt.dtd">

<passport_details>
<last_name>Smith</last_name>
<first_name>Jo</first_name>
<first_name>Stephen</first_name>
<address>
<street>1 Great Street</street>
<city>GreatCity</city>
<state>GreatState</state>
<postal_code>1234</postal_code>
<country>GreatLand</country>
<email>jo@theworldaccordingtojo.com</email>
</address>
</passport_details>
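To make the content-model notation concrete (comma for sequence, | for choice, ? for optional, + for one or more, * for zero or more), here is a rough sketch that checks a document against the two content models from the passport DTD. It is not a real DTD validator: Python's standard library does not validate against DTDs, so this hand-rolled version translates each content model into a regular expression over the child element names. The names CONTENT_MODELS, model_to_regex and is_valid are assumptions made for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Content models copied from the DTD; any element not listed is treated as (#PCDATA).
CONTENT_MODELS = {
    "passport_details": "(last_name,first_name+,address)",
    "address": "(street,(city|town),(state|province),"
               "(ZIP|postal_code),country,contact_no?,email*)",
}


def model_to_regex(model):
    """Translate a DTD content model into a regex over 'name ' tokens."""
    # Wrap each element name in a group so ?, + and * apply to the whole name.
    pattern = re.sub(r"[A-Za-z_][\w.-]*", lambda m: "(?:%s )" % m.group(0), model)
    # A comma means "followed by", which is plain concatenation in a regex.
    return re.compile(pattern.replace(",", ""))


def is_valid(element):
    """Check an element and all its descendants against CONTENT_MODELS."""
    model = CONTENT_MODELS.get(element.tag)
    if model is None:
        return len(element) == 0  # (#PCDATA): no child elements allowed
    children = "".join(child.tag + " " for child in element)
    if not model_to_regex(model).fullmatch(children):
        return False
    return all(is_valid(child) for child in element)


valid_doc = ET.fromstring(
    "<passport_details><last_name>Smith</last_name>"
    "<first_name>Jo</first_name><first_name>Stephen</first_name>"
    "<address><street>1 Great Street</street><city>GreatCity</city>"
    "<state>GreatState</state><postal_code>1234</postal_code>"
    "<country>GreatLand</country><email>jo@theworldaccordingtojo.com</email>"
    "</address></passport_details>"
)
# Same data, but first_name comes before last_name, which the DTD forbids.
invalid_doc = ET.fromstring(
    "<passport_details><first_name>Jo</first_name>"
    "<last_name>Smith</last_name><address/></passport_details>"
)
print(is_valid(valid_doc), is_valid(invalid_doc))
```

A validating parser does the equivalent work for every declaration in the DTD, including attribute lists, which this sketch ignores.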
XML Examples
◆ XML Source File
➢ http://www.yorku.ca/jhuang/xml/04.adhoc.topics.xml

◆ XML Style language


➢ http://www.yorku.ca/jhuang/xml/04.adhoc.topics.xsl

◆ Parsing and rendering XML with IE5+


➢ http://www.yorku.ca/jhuang/xml/04.adhoc.topics_xsl.xml
XML Applications
◆ XML permits document authors to create markup for
virtually any type of information.
◆ Authors can create entirely new markup languages for
describing specific types of data, including mathematical
formulas, chemical molecular structures, music, recipes etc.
- XHTML
- VoiceXML (for speech)
- MathML (for mathematics)
- SMIL (the Synchronized Multimedia Integration Language, for multimedia presentations)
- CML (Chemical Markup Language, for chemistry)
- XBRL (Extensible Business Reporting Language, for financial
data exchange)
XML Parsers
◆ Processing an XML document requires a software program called an XML parser (or processor). These are available at no charge in many languages (Java, Python, C++ etc.).

https://www.w3schools.com/xml/xml_parser.asp

◆ Parsers check an XML document's syntax and enable software programs to process marked-up data. XML parsers can support the Document Object Model (DOM) or the Simple API for XML (SAX).
➢ DOM: Build a tree structure containing the XML
document’s data
➢ SAX: Process the document and generate events
In Brief .. XML is for Data Exchange
• Very frequently companies need to exchange data among dissimilar
systems, locations, software, hardware, data formats etc.

• Data is stored in different formats. Data that is not stored in databases (unstructured data) is difficult to exchange and often requires custom software

• Data can be interchanged in various ways
- agree on a totally custom format
- agree on a proprietary system
- use a standard data format

• XML provides a standardised format for data and techniques for generating, validating, formatting, transforming and extracting it
When Do You Use It?

• XML is good for exchanging data between dissimilar systems

• If data exchange only occurs between similar systems, XML may not be the right choice!


B2B
• XML is frequently used in B2B applications
- B2B means that two companies are exchanging data
- also one company exchanging data between different locations
- agreement on the format (through DTD, XML Schema) of messages

B2C
• Business-to-Consumer involves sending XML directly to the client
• Data sent directly to the client needs a style (XSL) applied
• Applying style is best accomplished on the server side
Document Structure
• Three distinct parts
- Prolog <?xml version="1.0" encoding="UTF-8"?>
- Root Element
- Miscellaneous Section

• Prolog contains instructions that apply to the entire document (such as XML declaration, DTD)
• Root element is a single element that encloses all of the data
• Miscellaneous is not recommended but still included in the
standard
XML Element Structure

[Figure: tree diagram of an XML document, with a root element that contains child elements, which in turn contain further child elements]
XML Elements
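- have the same overall structure
- can contain sub-elements

<Student Sex = "Male"> Some Data </Student>

- START TAG: <Student Sex = "Male">, made up of the ELEMENT NAME (Student) and an ATTRIBUTE (Sex = "Male")
- CONTENTS: Some Data, which is PCDATA (Parsed Character Data)
- END TAG: </Student>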
Element vs. Attribute based XML
1
<student>
<id> 9906789 </id>
<name>Adam</name>
<email>adam@unl.ac.uk</email>
</student>

2
<student id = "9906789">
<name>Adam</name>
<email>adam@unl.ac.uk</email>
</student>

3
<student id = "9906789" name="Adam" email="adam@yorku.ca"> </student>

Which is better? NO RIGHT ANSWER!

Some issues to consider
- elements can have substructures; attributes cannot
- ID attributes can be easily located and processed
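The two styles are read differently by a program. As a brief sketch using Python's standard xml.etree.ElementTree (the variable names are illustrative), the same student id is fetched from options 1 and 2:

```python
import xml.etree.ElementTree as ET

# Option 1: the id is a child element.
element_based = ET.fromstring(
    "<student><id> 9906789 </id><name>Adam</name>"
    "<email>adam@unl.ac.uk</email></student>"
)
# Option 2: the id is an attribute of <student>.
attribute_based = ET.fromstring(
    '<student id="9906789"><name>Adam</name>'
    "<email>adam@unl.ac.uk</email></student>"
)

# Element content is text, so surrounding whitespace has to be stripped.
id_from_element = element_based.findtext("id").strip()
# An attribute value is looked up by name on the element itself.
id_from_attribute = attribute_based.get("id")
print(id_from_element, id_from_attribute)
```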
XML Document
(another sample .xml file)

<?xml version = "1.0"?>
<!-- article.xml -->
<books>
<author>
<title> Introduction to Computer Graphics </title>
<date>1995</date>
<fname>James</fname>
<lname>Foley</lname>
</author>
<author>
<title> Principles of Database Systems </title>
<date month="February">2000</date>
<fname>Greg</fname>
<lname>Riccardi</lname>
</author>
</books>
<!-- This is a list of students -->

Parts labelled on the slide: prolog (the XML declaration), root element (<books>), attribute (month="February"), Miscellaneous (the comment after </books>)

• The document structures data with the 'books' element as the root node.
• The root node contains elements (e.g. author)
• Each element further contains child nodes that describe data
• <books>, <author>, <title> etc. are customised tags describing data.
XML Syntax
• XML elements must be enclosed within start and end tags
<title> Introduction to Computer Graphics </title>
If there is no data inside the element, the tag can end with '/>'
<title/> which is the same as <title></title>

• Attribute values must be enclosed in quotes (single or double):
<date month="February">2000</date>

• Element tags are case sensitive: <author> Adam </Author> is incorrect

• XML tags must be nested in the correct order:
<books> <author> … </books> </author> Bad
<books> <author> … </author> </books> Good

XML is therefore very rigid in enforcing syntax compared to HTML (which is very forgiving)

• A "well-formed" document follows all these rules
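These rules can be checked mechanically, because any XML parser refuses a document that breaks them. A minimal sketch, assuming Python's standard xml.etree.ElementTree parser and a helper named is_well_formed introduced here for illustration:

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts the text, False on a syntax error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


good_nesting = is_well_formed("<books><author>Adam</author></books>")
bad_nesting = is_well_formed("<books><author>Adam</books></author>")
bad_case = is_well_formed("<author>Adam</Author>")
unquoted_attribute = is_well_formed("<date month=February>2000</date>")
empty_tag = is_well_formed("<title/>")
print(good_nesting, bad_nesting, bad_case, unquoted_attribute, empty_tag)
```

This only tests well-formedness; it says nothing about whether the document is valid against a DTD.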


DTD: Document Type Definition
• The XML sample document shown earlier follows syntax rules only. It is
therefore called a well-formed document
• It can also be made to follow strict grammar rules for enforcing the
structure
• DTD specifies grammar rules for an XML document
- several XML documents prepared from various sources can be
validated using a single set of grammar rules
• An XML document that adheres to a DTD is called valid. A valid document has a stronger structure than a well-formed document
• A DTD specifies rules for elements and how they can be expanded into sub-elements (child nodes)
• A DTD consists of element declarations, attribute lists, data types etc.
• DTDs are based on SGML; difficult to create!
DTD Nested Elements
Define the list
<!ELEMENT author (date, title, fname, lname)>

• The author grammar indicates that it is made up of four elements defined as below:
<!ELEMENT date (#PCDATA)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT fname (#PCDATA)>
<!ELEMENT lname (#PCDATA)>

• Each element may have attributes that contain information about its content
e.g. <date month="February">2000</date>

• An element's attribute list can be defined using the ATTLIST tag:
syntax: <!ATTLIST element_name attribute_name type default_value>

<!ATTLIST date month CDATA #IMPLIED>

Specifies the month attribute of the element date. CDATA means that it is a character string that is not parsed as markup. #IMPLIED means that the attribute is optional; if it is not specified, no default value is supplied and the application decides how to handle its absence. Other options:
#REQUIRED: the XML author must provide the attribute value
#FIXED: the attribute value is fixed and cannot be modified by the user
External DTD in XML document
• Any external DTD specification can be used by several XML documents
Example: <!DOCTYPE books SYSTEM "author.dtd">
books is the root element of the document. SYSTEM specifies the
DTD file.

author.dtd (the external subset, specified in XML using the SYSTEM or PUBLIC keywords):

<!-- DTD for books: author.dtd -->
<!ELEMENT books (author+)>
<!ELEMENT author (date, title, fname, lname)>
<!ELEMENT date (#PCDATA)>
<!ATTLIST date month CDATA #IMPLIED>
<!ELEMENT title (#PCDATA)>
<!ELEMENT fname (#PCDATA)>
<!ELEMENT lname (#PCDATA)>

authorDtd.xml (an XML document using the DTD):

<!DOCTYPE books SYSTEM "author.dtd">
<books>
<author>
<date>1995</date>
<title> Introduction to Computer Graphics </title>
<fname>James</fname>
<lname>Foley</lname>
</author>
…..
</books>
Including DTD in XML document -
Internal/Inline
• Introduced into XML using the document type declaration (DOCTYPE); the declarations between [ and ] form the internal subset

inLineDtdExample.xml:

<!DOCTYPE books [
<!ELEMENT books (author+)>
<!ELEMENT author (date, title, fname, lname)>
<!ELEMENT date (#PCDATA)>
<!ATTLIST date month CDATA #IMPLIED>
<!ELEMENT title (#PCDATA)>
<!ELEMENT fname (#PCDATA)>
<!ELEMENT lname (#PCDATA)>
]>

<books>
<author>
<date>1995</date>
<title> Introduction to Computer Graphics </title>
<fname>James</fname>
<lname>Foley</lname>
</author>
…..
</books>
DTDs - Disadvantages
• Notoriously hard to read
• Difficult to create (written in non-XML syntax; uses EBNF - Extended Backus-Naur Form - grammar)
• No support for namespaces etc.
• Limited data types (PCDATA, CDATA)
• Also study ANY, EMPTY and Mixed Content

Alternative to DTDs - XML Schemas (if time permits, covered towards the end)

• also referred to as XSchema
• Easy to create and read (well-formed XML syntax)
• can be edited using XML tools
• Support for namespaces
• More data types (byte, float, long; time, date; binary ..)
• User-defined data types (facets are properties used to specify a data type, setting limits and boundaries on data values)
Developing XML data

[Figure: a program that processes XML documents]

• First, create an XML document that contains the content (character data) marked up with XML tags.
• Second, build a Document Type Definition (DTD). The DTD specifies rules such as ordering of elements, default values, and so on.
• Third, use an XML parser that checks the XML document against the DTD and then splits the document up into markup regions and character-data regions.
• After processing with the XML parser, the data is in a structured format and can be processed by any XML application.
XML Parsers (or Processors)
• one of the most important layers of an XML-aware application (e.g. Firefox, IE 5+)
• input - raw XML document
• parses to ensure that the document is well formed and/or valid (if a DTD exists), reports errors and allows programmatic access to the document contents
• output - a data structure (the XML document is transformed)

[Figure: an XML document, plus an optional DTD, is fed to the XML parser, which produces a tree structure]

<books>
<author>
<date>1995</date>
<title> Web IR </title>
<fname>Jimmy</fname>
<lname>Huang</lname>
</author>
</books>

Tree structure: books → author → 1995, Web IR, Jimmy, Huang
Parsing XML Documents
• Parsers can support the Document Object Model (DOM) and Simple API for XML (SAX) for accessing a document's content programmatically using languages such as Java, C, C++, Python etc.

• A DOM-based parser builds a tree structure containing the XML document's data in memory.
(used to create and modify XML documents)

• A SAX-based parser processes the document and generates events (i.e. notifications to the application) when tags, comments etc. are encountered. These events return data from the XML document.
(used to read XML documents only; SAX is attractive for handling large documents because it is not required to load the entire document)
DOM (Document Object Model)
• A DOM-based parser exposes a programmatic library called the DOM
API that allows data in an XML document to be accessed and modified by
manipulating the nodes in a DOM tree. DOM API is available in many
languages e.g. JavaScript.
• Data can be accessed quickly as all the document’s data is in memory.
• The DOM interfaces for creating and manipulating XML documents are platform- and language-independent. DOM parsers exist for Java, C, C++, Python and Perl.
• JDOM provides a higher-level API than the W3C DOM for working with
XML documents in Java. See www.jdom.org
- provides full tree representation of the XML document
- allows random access to any node
- provides a variety of output formats
- less memory intensive than DOM API
• In order to use DOM API, programming experience is required.
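As an illustration of the DOM style of access, the sketch below uses Python's standard xml.dom.minidom module (one DOM implementation among many) to load the books example into memory, read a node, and then modify the tree. The new title text and the id attribute value are made up for the example.

```python
from xml.dom import minidom

BOOKS_XML = (
    "<books><author><date>1995</date><title>Web IR</title>"
    "<fname>Jimmy</fname><lname>Huang</lname></author></books>"
)

# The whole document is parsed into an in-memory tree of nodes.
dom = minidom.parseString(BOOKS_XML)

# Any node can be reached directly; here, the text inside the first <title>.
title_node = dom.getElementsByTagName("title")[0]
original_title = title_node.firstChild.data

# The tree can also be modified and serialised again.
title_node.firstChild.data = "Web Information Retrieval"  # hypothetical new title
author_node = dom.getElementsByTagName("author")[0]
author_node.setAttribute("id", "a1")  # hypothetical attribute
updated_xml = dom.documentElement.toxml()
print(original_title)
print(updated_xml)
```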
SAX (Simple API for XML)
• Developed by the members of the XML-DEV mailing list
• Released in May 1998
• SAX and DOM are totally different APIs for accessing information in
XML documents.
• SAX based parsers invoke methods when markup (e.g. a start tag,
end tag etc.) is encountered. With this event based model, no tree
structure is created to store data. Instead, data is passed to the
application from the XML document as it is found.
=> greater performance and less memory overhead than with DOM
• Many DOM parsers use a SAX parser to retrieve data for building the
DOM tree.
• SAX parsers are typically used for reading documents that will not be
modified.
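The event-based model can be sketched with Python's standard xml.sax module. The parser calls the handler's methods as it meets start tags, character data and end tags; no tree is built. The class name TitleCollector is an assumption made for illustration, and the input reuses the two book titles from the earlier sample document.

```python
import xml.sax


class TitleCollector(xml.sax.ContentHandler):
    """Collect the text of every <title> element as events arrive."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False
        self._buffer = []

    def startElement(self, name, attrs):
        # Called when a start tag is encountered.
        if name == "title":
            self._in_title = True
            self._buffer = []

    def characters(self, content):
        # Called for character data; may arrive in several chunks.
        if self._in_title:
            self._buffer.append(content)

    def endElement(self, name):
        # Called when an end tag is encountered.
        if name == "title":
            self.titles.append("".join(self._buffer).strip())
            self._in_title = False


BOOKS_XML = (
    b"<books>"
    b"<author><title> Introduction to Computer Graphics </title></author>"
    b"<author><title> Principles of Database Systems </title></author>"
    b"</books>"
)

handler = TitleCollector()
xml.sax.parseString(BOOKS_XML, handler)
print(handler.titles)
```

The handler keeps only what it needs (the titles), which is why this style suits large documents that are read but not modified.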
Parsing (msxml) and rendering XML
with IE
• XML document contains data, NOT formatting information.
• When XML document is loaded into IE5+, the document is
parsed by msxml.
• If the document is well-formed, the parser makes the document's data available to the application (i.e. IE5).
• The application can format and render the data and also
perform other processing.
• IE5 renders data by applying a stylesheet that formats and
colours the markup identically to the original document.
• Notice the - sign. It indicates that child elements are visible.
When clicked, it becomes + hiding the children.
• This behaviour is similar to viewing disk directory structure
using a program such as Windows Explorer.
Using XML:
How does browser read XML ?
◆ XML parser: A tool for reading XML documents.
◆ To manipulate an XML document, you need an XML
parser. The parser loads the document into your
computer's memory. Once the document is loaded,
its data can be manipulated using the DOM. The
DOM treats the XML document as a tree.
◆ Once you have installed Internet Explorer 5.0, the
Microsoft XML parser is available.
◆ http://www.w3schools.com/xml/xml_parser.asp
◆ https://developer.mozilla.org/en-US/docs/Archive/Mozilla/XML_in_Mozilla (XML in Mozilla)
Using XML: Presenting Data

➢ Need to convert XML tags into appropriate HTML tags for use in a browser!!

➢ <lastname>Smith</lastname>

➢ <b>Smith</b> (rendered in the browser as bold text: Smith)
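The conversion above can be sketched in a few lines. This is not XSLT, only a plain Python illustration of the idea that a data tag is swapped for a presentation tag; the mapping table TAG_MAP and the function name to_html are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from data tags to HTML presentation tags.
TAG_MAP = {"lastname": "b"}


def to_html(xml_text):
    """Rename the root element according to TAG_MAP and return it as text."""
    element = ET.fromstring(xml_text)
    # Unknown tags fall back to a neutral <span>.
    element.tag = TAG_MAP.get(element.tag, "span")
    return ET.tostring(element, encoding="unicode")


html_fragment = to_html("<lastname>Smith</lastname>")
print(html_fragment)
```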
Extensible Stylesheet Language (XSL)
• XML is just data - no presentation information
• To present the data on the screen or paper or any media - apply appropriate style
• Style sheets contain rules that instruct the processor how to present elements
• Two style languages: CSS (Cascading Style Sheets) and XSL
• XSL is more powerful than CSS and an excellent solution for controlling the presentation of data, but it is
- resource intensive: memory and processing power
- complex to write
• transforms and translates XML data from one format into another, e.g. when the same document needs to be displayed in HTML, PDF and PostScript form
CSS and XSL
◆ CSS - Cascading Style Sheets
➢ can predefine HTML display (font etc.)
➢ these are shared and reused

◆ XSL - Extensible Stylesheet Language
➢ predefines display characteristics for XML entities
➢ transforms into CSS for browsers to use
Cascading Style Sheets
CSS

last_name
{
font-family: verdana, arial;
font-size: 15pt;
font-weight:bold;
display: block;
margin-bottom: 5pt;
}
first_name
{
font-family: verdana, arial;
font-size: 15pt;
font-weight:bold;
display: block;
margin-bottom: 5pt;
}
street, city, town, state, province,
ZIP, postal_code {
font-family: verdana, arial;
font-size: 12pt;
font-weight:bold;
color:green;
display:block;
margin-bottom: 20pt;
margin-top: 40pt;
}
email {
font-family: verdana, arial;
font-size: 12pt;
font-weight:bold;
color:blue;
display:block;
margin-top: 5pt;
}
Extensible Stylesheet Language
(XSL)
• XSL provides a complete separation of data (content) and presentation, and provides a method to translate data into a PDF or HTML document.

• XSL is a combination of two languages:

* XSLT (Extensible Stylesheet Language Transformations): defines rules for transforming an XML document into another format

* XSL-FO (XSL Formatting Objects): specific XSL instructions that describe how content should be rendered; a sophisticated version of CSS; formatting of <h1>, <table> tags can be set

• Details can be found at http://www.w3.org/Style/XSL


XSLT
• XSLT is a declarative language for transforming XML documents into other formats

• once an XML document is parsed, it is transformed through XSLT; that is, a textual XSL stylesheet and a textual XML document are merged together to produce data formatted according to the stylesheet.

[Figure: an XML document and an XML stylesheet (XSLT) are fed to an XSLT processor, which outputs an XML document (or another text-based format)]

• XSLT processors - selection criteria: speed of transformation and conformity to the XSL and XSLT specifications

• some widely used processors:
* Apache Xalan, Oracle XSL Processor, Lotus XSL Processor, James Clark's XT, Keith Visco's XSL:P, Michael Kay's SAXON, Microsoft XSL processor (built into IE 5)
XSL (Style Language)
<?xml version='1.0'?>
<xsl:stylesheet
xmlns:xsl="http://www.w3.org/TR/WD-xsl"
xmlns="http://www.w3.org/TR/REC-html40"
result-ns="">
<xsl:template><xsl:apply-templates/></xsl:template>
<xsl:template match="/">
<html>
<head>
<title><xsl:value-of select="/passport/last_name"/></title>
</head>
<body>
<H1><xsl:value-of select="/passport/last_name, first_name"/></H1>
<H2>Address</H2>
<BLOCKQUOTE>
<xsl:apply-templates select="/passport/address"/>
</BLOCKQUOTE>
</body>
</html>
</xsl:template>
</xsl:stylesheet>
XSL: Examples
<xsl:template match="EmployeeRecord/Name">
<Bold>
<xsl:apply-templates/>
</Bold>
</xsl:template>

All the children of the "Name" element contained in "EmployeeRecord" are processed with the template.

<xsl:template match="EmployeeRecord/Name">
<Bold>
<xsl:apply-templates select="FirstName"/>
</Bold>
</xsl:template>

The template is applied only to the 'FirstName' element of the 'Name' element contained in 'EmployeeRecord'.
Options for Displaying XML
[Figure: two options for displaying an XML document, labelled example1 and example2. Boxes on the slide: CSS Stylesheet, XML Document, XSL Stylesheet, XSL Transformation spec, XSL Transformation, HTML Document, Web Browser, XML enabled Web Browser, XML Display Engine]
An Example

Boeing

◆ Boeing places a DTD on its site


◆ part purchasers use this DTD
◆ Boeing can use multiple XSL stylesheets
Example
Boeing (cont’d)
◆ customer creates an order document,
they can verify the validity of that
document against the DTD.
◆ this ensures they are transmitting only
type-valid orders.
◆ in turn, Boeing can ensure they are
receiving only type-valid documents.
Summary
XML - Advantages
◆ Platform and system independent
◆ User-defined tags
◆ Doesn’t require explicit DTD
◆ Display format and content are separate
XML - Disadvantages
◆ Requires a processing application
◆ “More difficult” than HTML
◆ Must be converted to HTML to view in
browser
Importance of XML
◆ Coordinating Heterogeneous Databases

◆ Separation of Structure / Content / Display

◆ Document Validity Checking

◆ Potential Use in Standards


HTML Document
(good for formatting)
<html><body>
<h2>Student List</h2>

<ul>
<li> 9906789 </li>
<li>Adam</li>
<li>adam@unl.ac.uk</li>
<li>yes - final </li>
</ul>
<ul>
<li> 9806791 </li>
<li>Adrian</li>
<li>adrian@unl.ac.uk</li>
<li>no</li>
</ul>
</body></html>

• Data and presentation logic are mixed
• What is "yes"? What is "no"? The markup does not say what these values mean
XML Document
(good for describing data)
<?xml version = "1.0"?>

<student_list>
<student>
<id> 9906789 </id>
<name>Adam</name>
<email>adam@unl.ac.uk</email>
<bsc level="final">yes</bsc>
</student>
<student>
<id> 9806791 </id>
<name>Adrian</name>
<email>adrian@unl.ac.uk</email>
<bsc>no</bsc>
</student>

</student_list>

• Only data
• Data is self-describing
• custom tags describe content (define your own tags)
• easy to locate data (e.g. all BSC students)
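To illustrate how self-describing tags make data easy to locate, here is a short sketch that finds the BSc students in the list above. It uses Python's standard xml.etree.ElementTree and its limited XPath support; the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

STUDENT_LIST = """<student_list>
<student>
<id> 9906789 </id>
<name>Adam</name>
<email>adam@unl.ac.uk</email>
<bsc level="final">yes</bsc>
</student>
<student>
<id> 9806791 </id>
<name>Adrian</name>
<email>adrian@unl.ac.uk</email>
<bsc>no</bsc>
</student>
</student_list>"""

root = ET.fromstring(STUDENT_LIST)

# Students whose <bsc> child has the text "yes".
bsc_students = [s.findtext("name") for s in root.findall("student[bsc='yes']")]

# Students whose <bsc> child carries level="final".
final_year = [
    s.findtext("name")
    for s in root.findall("student")
    if s.find("bsc[@level='final']") is not None
]
print(bsc_students, final_year)
```

The same query against the HTML version would have to guess which list item holds the answer, since every value sits in an identical li tag.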
The Framework of WWW
[Figure: the framework of the WWW. Labels on the slide: Web Designer & Publisher (HTML authoring tools/editors); Web Programmer (external applications, non-HTTP objects); Client / End User (Web browser); Internet (global reach, broad range); Web Server (Web Master)]

Technologies listed on the slide:
• Java Servlet
• CGI (Perl)
• ASP & ASP.NET
• Java Server Pages
• Java Applet
• JavaScript
Why Build Pages Dynamically?
◆ The Web page is based on data submitted by the user
➢ E.g., results pages from search engines and order-confirmation pages at on-line stores
◆ The Web page is derived from data that changes
frequently
➢ E.g., a weather report or news headlines page
◆ The Web page uses information from databases or
other server-side sources
➢ E.g., an e-commerce site could use a servlet to build a
Web page that lists the current price and availability of
each item that is for sale
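The e-commerce case can be sketched without any web framework: the page is just text assembled from current data at request time. The example below is a minimal illustration in Python rather than a servlet; the function name render_catalogue and the item data are made up.

```python
from html import escape


def render_catalogue(items):
    """Build an HTML page listing name, price and availability of each item."""
    rows = []
    for name, price, in_stock in items:
        status = "in stock" if in_stock else "sold out"
        # escape() keeps characters such as < and & in the data from breaking the page.
        rows.append("<li>%s: $%.2f (%s)</li>" % (escape(name), price, status))
    return "<html><body><ul>" + "".join(rows) + "</ul></body></html>"


# Hypothetical data; in practice this would come from a database query.
page = render_catalogue([("Widget <A>", 9.5, True), ("Gadget", 120.0, False)])
print(page)
```

Each request can call the function with fresh data, so the page always reflects the current price and availability.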
