CM2040: Databases, Networks and The Web: Arjun Muralidharan 6th September 2020
CM2040: Databases, Networks and The Web: Arjun Muralidharan 6th September 2020
Summary
Arjun Muralidharan
1
Contents
1 Three-tier web applications 4
1.1 Static vs. Dynamic Web Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.1.1 Lifetime of a web request . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.1.2 Accessing remote resources with HTTP . . . . . . . . . . . . . . . . . . . . . . . 5
1.2 Three-tier application architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2
9.3 Nested SQL Queries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
10 Networking concepts 18
10.1 The TCP/IP model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
List of Figures
1 Two-tier vs. three-tier architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2 Express Routing Mechanism . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
3 ER Diagram Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
4 OSI reference model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
List of Algorithms
List of Listings
3
1 Three-tier web applications
Key Concepts
X Intro to module and lab
Learning Outcomes
X Recognize the tools available for this module to edit a Node file and run it
A web application is a client-server software application in which the user interface runs in a web
browser. It could be a computer program which allows a user to submit and retrieve data to or from
a database over the Internet using their preferred browser. It is an application in which all or some
parts of the software are downloaded from the web each time it is run.
Static web applications are static internet resources with little to no interaction with the user.
Dynamic web applications have more user interactions, and users can input, change and manipulate
data.
Differences between web applications and desktop applications are shown in Table 1.
Desktop Web
When a (static) web application is accessed, the following steps occur in order.
4
1. A URL is typed into the browser
2. Browser cache is checked; if the application is already available, the procedure skips to the last
step
HTTP is a client-server application protocol. The client sends a request and a server provides a
response.
HTTP requests have a fixed format, which consists of three parts: a request line , some optional
message headers , and the request body (also optional). The request line is a formatted string,
which consists of three parts; the HTTP method (e.g. GET or POST), the URL of the requested resource,
and the protocol version. For example, the following request line:
A server responds with an HTTP response message, which is structured in three parts: a status
line , a set of optional headers , and a message body.
The status line consists of the protocol version followed by a numeric status code and its associated
message, e.g.
HTTP is stateless. Each call is treated independently and there is no information maintained
between two successive requests. HTTP cannot maintain a user session.
HTTP is pull-based. Only the client can initiate a request-response action, the server cannot
contact the client.
1. General headers: These apply to both the request and response, e.g. Date
2. Request headers: Specific to requests, e.g. Accept-Language, which sends which response
languages are acceptable
5
3. Response headers: Specific to responses, e.g. WWW-Authenticate which returns if a user is
authorized or not
4. Entity headers: Apply to the content of the request or response body, e.g. Content-Length
specifies the length (in bytes) of the body
However, these notions are valid as of HTTP1.1 and may not be valid in HTTP/2 and newer
versions.
The core principle of three-tier architectures is to provide an intermediate layer between the client
and the data tier, which centralizes middleware services and the business logic of the application.
Three-tier architectures offer a higher degree of scalability than two-tier con- figurations, thanks to
better network utilization and to the virtually unlimited replication and load distribution capabilities
of the middle tier. A comparison is shown in Figure 1.
6
Key Concepts
X Web server and web hosting
Learning Outcomes
A web server is a program that uses HTTP to serve the files that form web pages to users, in response
to their requests, which are forwarded by their computers’ HTTP clients.
A basic web server architecture includes:
• Hardware
• Operating system
• HTTP server
• Database (optional)
The HTTP server compiles the results from databases and scripting language, and controls what
options are available to the end user via configuration files .
2.2 Node.js
7
Node.js PHP
Asynchronous Synchronous
Single-threaded Multi-threaded
Non-blocking Blocking
Highly scalable Less scalable
Memory-efficient Less memory-efficient
Able to share code between browser and server —
Key Concepts
X Routing in dynamic web applications
Learning Outcomes
X Describe and apply Separation of Concern principle of programming in your web application
development
3.1 Routes
Routes are URL schema, which describe the interfaces for making requests to your web app. Com-
bining an HTTP request method (a.k.a. HTTP verb) and a path pattern, you define URLs in your
app.
Each route has an associated route handler, which does the job of performing any action in the
app and sending the HTTP response.
8
Figure 2. Express Routing Mechanism
Separation of Concerns is a design principle for an application such that each module or layer in
an application should only be responsible for one thing and should not contain code that deals with
other things.
SoC is achieved by
2. layering functionality in the application, e.g. such as a three-tier architecture which is loosely
coupled
The objective is to remove as many dependencies as possible between the individual parts.
Benefits of SoC include:
1. Reduces complexity
3. Improves portability
3.3 Templating
A template engine can help produce a dynamic HTML page by inserting variables into the final
output, and running programming logic at run-time before sending the final HTML output to the
browser.
This helps keep the logic in the HTML file at a minimum. Popular template engines include EJS,
Pug and Mustache.
EJS template files use HTML and EJS tags:
9
<%= %>
<% %>
used to evaluate an expression, but not add the return value to the document. An example:
<h1><%=title%></h1>
<ul>
<%for (var i=0; i < supplies.length; i++) {%>
<li>
<a href="supplies/<%=supplies[i]%>"></a>
</li>
<%}%>
</ul>
The parameters, such as title, can be set in a JavaScript file that define the routes. These are usually
located in the main.js file in routes.
Key Concepts
X Form handling and HTTP GET and POST methods
Learning Outcomes
X Describe and apply GET and POST request methods in dynamic web application
The GET HTTP method is used to request a resource on a server, while POST sends data to a resource
on the server.
GET may never manipulate any data. POST is commonly used with web forms to send data to a
server or to retrieve sensitive data (by providing credentials).
10
4.2 GET
Routes for an HTTP GET method can be invoked with the following Express command. The method
requests the resources from the server but can also pass form input fields via the URL. This is useful
for simple data that doesn’t need to be stored and is not sensitive.
Key Concepts
X Introduction to databases
X Relational databases
X MySQL shell
Learning Outcomes
The database approach to managing and accessing data has key characteristics:
Self-describing nature A database system contains a complete definition of the database structure
and constraints, stored in the DBMS catalog and called meta data. NOSQL systems are designed
to not requires a catalog.
Insulation between programs and data, and data abstraction With a DBMS, the structure
of the data can be extended or even changed without programs having to change (e.g., adding a
11
new attribute). Some databases provide an object-oriented system , allowing to define an interface
describing the data names and types, and the implementation contains the actual operation, which
can change without breaking the interface.
Support for multiple views of the data Data can be served in subsets tailored for specific users,
and data can be virtual data, which is derived from the stored data but not actually stored (e.g. a
calculated field).
Sharing of data and multiuser transaction processing A DBMS includes concurrency con-
trol software, ensuring multiple users can interact with the data at the same time. A transaction is
a process with one or multiple database accesses. The isolation property ensures that a transaction
appears to execute in isolation from other transactions. Atomicity ensures that either all database
operations in a transaction are executed, or none are.
The DBMS approach further reduced redundancy by normalizing data, i.e. storing a logical
data item in only a single place.
# Invoke the mysql shell with with user root and ask for password
mysql -u root -p
12
5.3 Creating Tables
SQL tables are created using the CREATE TABLE command specifying the data types and constraints
for each field. The most common constraints to use are NOT NULL to specify that an attribute may not
be NULL, DEFAULT i to specify a default value of i and AUTO INCREMENT to specify integer values that
automatically increment as rows are added.
Inserting data is done using the INSERT command and specifying values with the VALUES keyword.
INSERT INTO
dishes (name, price, is_vegetarian, is_vegan)
VALUES
('pizza margherita', 10.99, 1, 0),
('soya burger', 9.50, 1, 1);
13
--Delete rows
DELETE FROM
dishes
WHERE
price > 1;
--Update rows
UPDATE
dishes
SET
price = 2
WHERE
price = 1;
Key Concepts
X Basic SQL, review create, select and insert into and introduce update and delete
X Primary keys
Learning Outcomes
X Describe and apply primary key and foreign keys in relational databases
X Recognize and apply access to databases from middleware code in Node to build dynamic
web applications
A primary key is a field or combination of fields in a table that uniquely identifies each row or record.
It is declared when creating a table using the PRIMARY KEY directive.
Mostly, a primary key is a numerical field and is set to automatically increment (AUTO INCREMENT).
14
6.1.1 Integrity Constraints
Table fields can be defined with constraints such as UNIQUE, NOT NULL and DEFAULT to support the
structural integrity of the database. Primary keys have some built-in constraints:
Data from an application can be passed to the back end and stored in the database by constructing an
SQL query in code and using the appropriate function to invoke that query with the variable passed
in. In Express, placeholder question marks ()?) are used in the SQL statement, which can then be
filled up by a variable. This variable can be a single value or an array of values.
The input to the query can be sanitized using a sanitizer to clean up data and prevent malicious
code injection.
Key Concepts
X Passing variables from middleware to front- end (templates)
15
X Passing variables from middleware to backend (database)
X Review what you have learned so far for development of a dynamic web application
Learning Outcomes
The web applications built with Express, Node and SQL Server are an example of event-driven
programming as opposed to sequential programming. The flow of the program is determined based
on events, such as user inputs or sensor outputs.
Key Concepts
X Database schema
X SQL Join
Learning Outcomes
X Explain the context behind the relational model and describe the process of schema devel-
opment
X Explain the context behind the junction (bridge) tables in relational model and describe the
process of schema development
To develop a conceptual schema of a database, we first identify the entities, attributes, and data
types in the database.
Next, decide the primary key for each entity, relationships between entities and the respective
foreign keys and, finally, identify the relationship cardinalities to proceed to refining the databse
design.
Entity relationship diagrams are denoted using the notation shown in Figure 3.
16
Figure 3. ER Diagram Notation
Alternative notations use the concept of total relationships, denoting that an entity’s existence
depends on the presence of one or more relationships, or partial relationships, stating that entities
can exist with or without the relationship. These denoted with single or double lines between the
entity and the relationship.
A foreign key is an attribute of an entity that is also the primary key of another entity that is related
to the original entity.
Foreign keys are declared with the respective constraint and ensure the referential integrity of the
data by limiting the content of the foreign key to the values of the referenced primary key.
To combine attributes from two tables, we can use a JOIN statement.
In the example above, the attribute from table 1 is the foreign key, while the attribute from table
2 is usually the primary key, but not restricted to it.
17
9 Querying a database (advanced)
Key Concepts
X Aggregate functions in SQL
Learning Outcomes
X Perform advanced queries on databases using multiple related tables by LEFT and RIGHT
JOIN queries
X Perform advanced queries on databases using multiple related tables by nested SELECT
queries
Aggregate functions allow summarizing tuples of data into a single-tuple summary. In other words,
aggregate functions query a database and summarize the data into a single row.
Common aggregate functions are SUM, COUNT, MIN, MAX and AVG, and they do exactly what one
would expect them to do.
A normal JOIN statement returns only records at the intersection A ∩ B of two tables A and B. This
is also sometimes called INNER JOIN.
A LEFT JOIN returns all the records from A, completed with matching attributes from B. A RIGHT
JOIN does the opposite.
The argument to a WHERE clause can itself be an SQL subquery returning a value for comparison.
SELECT name, price FROM food WHERE price=(SELECT MAX(price) FROM food)
10 Networking concepts
18
Key Concepts
X Basic concepts of computer networking
Learning Outcomes
In packet switching , unlike in circuit switching, each data unit to be transferred by the network
doesn’t know the entire path to the destination. It only knows the destination address.
Intermediate devices in the network locally decide the path and forward to the next known node
that is closer to the destination.
The application layer is the highest layer in the OSI stack and closest to the user. Example protocols
are HTTP, FTP, SMTP, SSH, DNS, DHCP and Telnet.
This layer defines the type of messages exchanged, such as request/response, defines the fields in
the messages and the meaning of fields.
SSL (older) and TLS (newer) are additional layers used to encrypt data at the application layer.
The main protocol at the transport layer is TCP, which handles establishing of connections, checking
their reliability, error checking and a three-way handshake process:
19
2. Server host receives SYN, replies with SYNACK.
3. Client receives SYNACK, replies with ACK, which may also contain data.
An alternative to TCP is UDP, or user datagram protocol which does not provide error-checking
or handshaking, making it suitable for high-speed applications such as live broadcasting, gaming or
video conferencing. TCP is more suited to transfering documents and data reliably.
In addition to an IP address, a port defines which application receives specific data.
Multiplexing allows collecting multiple data segments from various ports and sending them out.
De-multiplexing does the opposite, checking each received datagram and delivering it to the correct
port.
The main protocol at the network layer is the Internet Protocol or IP. IPv4 addresses are 32-bit
numbers, divided into four 8-bit sections ranging from 0 to 255.
IPv6 addresses are 128-bit numbers divided into eight 16-bit sections, each represented by 2 hexa-
decimal octets.
Routing is a mechanism carrying packers from source to destination, using the IP address, by a
hop-by-hop process.
To prevent packets from being passed on infinitely, a TTL (time to live) value is initialised in the
packet, and each time the packet is forwarded, the value is reduced by 1. When it reaches 0, the packet
has lived too long and gets destroyed.
The link layer gets data across one hop, using protocols based on the medium at the physical layer,
such as wire, Wi-Fi, fibre-optic and ethernet.
Frames use hardware-specific addressing such as media access control MAC addresses.
20
Figure 4. OSI reference model
21