CN Mid Syllabus B.tech
Lecture Notes
UNIT-I
Introduction to Computer Networks
1.1 Data Communication: When we communicate, we are sharing information. This sharing can
be local or remote. Between individuals, local communication usually occurs face to face,
while remote communication takes place over distance.
Computer Network: A computer network is a set of computers connected together for the
purpose of sharing resources. The most common resource shared today is a connection to the
Internet. Other shared resources can include a printer or a file server. The Internet itself can be
considered a computer network.
1.1.1 Components:
A data communications system has five components: the message, the sender, the receiver, the
transmission medium, and the protocol.
Information today comes in different forms such as text, numbers, images, audio, and video.
Text:
In data communications, text is represented as a bit pattern, a sequence of bits (0s or 1s).
Different sets of bit patterns have been designed to represent text symbols. Each set is called a
code, and the process of representing symbols is called coding. Today, the prevalent coding system
is called Unicode, which uses 32 bits to represent a symbol or character used in any language in
the world. The American Standard Code for Information Interchange (ASCII), developed some
decades ago in the United States, now constitutes the first 128 characters in Unicode and is also
referred to as Basic Latin.
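The idea of coding can be illustrated with a minimal Python sketch. The helper name to_bits is hypothetical; it relies on the built-in ord to obtain a character's Unicode code point and, as an assumption matching the 32-bit description above, pads each pattern to a fixed width.

```python
def to_bits(text: str, width: int = 32) -> list[str]:
    """Return one fixed-width bit pattern per character (its Unicode code point)."""
    return [format(ord(ch), f"0{width}b") for ch in text]

# ASCII characters keep the same code point in Unicode (the Basic Latin block).
print(to_bits("A", width=8))               # 8-bit view of 'A'
print(to_bits("A"))                        # 32-bit view of the same code point
print(len("A".encode("utf-32-be")) * 8)    # bits used by the UTF-32 encoding of 'A'
```

The same code point is shown in an 8-bit and a 32-bit view, which is why ASCII text carries over to Unicode unchanged.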
Numbers:
Numbers are also represented by bit patterns. However, a code such as ASCII is not used
to represent numbers; the number is directly converted to a binary number to simplify
mathematical operations. Appendix B discusses several different numbering systems.
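As a small illustration of the difference, the sketch below converts a number directly to binary and, for comparison, shows what storing its decimal digits as 8-bit ASCII characters would look like. Both function names are hypothetical.

```python
def number_to_bits(n: int) -> str:
    """Binary representation of a non-negative integer."""
    return format(n, "b")

def digits_as_ascii_bits(n: int) -> str:
    """Bit patterns if each decimal digit were stored as an 8-bit ASCII character."""
    return " ".join(format(ord(d), "08b") for d in str(n))

print(number_to_bits(65))         # the number itself in binary
print(digits_as_ascii_bits(65))   # the characters '6' and '5' in ASCII
```

The direct binary form is shorter and can be used in arithmetic without first decoding characters.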
Images:
Images are also represented by bit patterns. In its simplest form, an image is composed of
a matrix of pixels (picture elements), where each pixel is a small dot. The size of the pixel depends
on the resolution. For example, an image can be divided into 1000 pixels or 10,000 pixels. In the
second case, there is a better representation of the image (better resolution), but more memory is
needed to store the image. After an image is divided into pixels, each pixel is assigned a bit pattern.
The size and the value of the pattern depend on the image. For an image made of only
black-and-white dots (e.g., a chessboard), a 1-bit pattern is enough to represent a pixel. If an image is not
made of pure white and pure black pixels, you can increase the size of the bit pattern to include
gray scale. For example, to show four levels of gray scale, you can use 2-bit patterns. A black pixel
can be represented by 00, a dark gray pixel by 01, a light gray pixel by 10, and a white pixel by 11.
There are several methods to represent color images. One method is called RGB, so called because
each color is made of a combination of three primary colors: red, green, and blue. The intensity of
each color is measured, and a bit pattern is assigned to it. Another method is called YCM, in which
a color is made of a combination of three other primary colors: yellow, cyan, and magenta.
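The relation between pixels, gray levels, and memory can be sketched as follows. This is only an illustration: the function names are hypothetical, and the calculation assumes an uncompressed image in which every pixel uses the same number of bits.

```python
# 2-bit gray-scale patterns from the text above
GRAY_CODES = {"black": "00", "dark gray": "01", "light gray": "10", "white": "11"}

def bits_per_pixel(levels: int) -> int:
    """Smallest number of bits that can distinguish `levels` shades."""
    return max(1, (levels - 1).bit_length())

def image_size_bits(pixels: int, levels: int) -> int:
    """Memory (in bits) for an uncompressed image with the given number of pixels."""
    return pixels * bits_per_pixel(levels)

print(bits_per_pixel(2))            # black-and-white image
print(image_size_bits(1000, 4))     # 1000 pixels, four gray levels
print(image_size_bits(10_000, 4))   # 10,000 pixels, four gray levels
```

Increasing either the number of pixels or the number of levels per pixel increases the memory needed.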
Audio:
Audio refers to the recording or broadcasting of sound or music. Audio is by nature
different from text, numbers, or images. It is continuous, not discrete. Even when we use a
microphone to change voice or music to an electric signal, we create a continuous signal. In
Chapters 4 and 5, we learn how to change sound or music to a digital or an analog signal.
Video:
Video refers to the recording or broadcasting of a picture or movie. Video can either be
produced as a continuous entity (e.g., by a TV camera), or it can be a combination of images, each
a discrete entity, arranged to convey the idea of motion. Again we can change video to a digital or
an analog signal.
Simplex:
In simplex mode, the communication is unidirectional, as on a one-way street. Only one
of the two devices on a link can transmit; the other can only receive (see Figure a). Keyboards and
traditional monitors are examples of simplex devices. The keyboard can only introduce input; the
monitor can only accept output. The simplex mode can use the entire capacity of the channel to
send data in one direction.
Half-Duplex:
In half-duplex mode, each station can both transmit and receive, but not at the same time.
When one device is sending, the other can only receive, and vice versa. The half-duplex mode is
like a one-lane road with traffic allowed in both directions.
When cars are traveling in one direction, cars going the other way must wait. In a half-duplex
transmission, the entire capacity of a channel is taken over by whichever of the two devices is
transmitting at the time. Walkie-talkies and CB (citizens band) radios are both half-duplex systems.
The half-duplex mode is used in cases where there is no need for communication in both
directions at the same time; the entire capacity of the channel can be utilized for each direction.
Full-Duplex:
In full-duplex mode, both stations can transmit and receive simultaneously (see Figure c). The
full-duplex mode is like a two-way street with traffic flowing in both directions at the same time.
In full-duplex mode, signals going in one direction share the capacity of the link with signals
going in the other direction. This sharing can occur in two ways: either the link must contain two
physically separate transmission paths, one for sending and the other for receiving, or the capacity
of the channel is divided between signals traveling in both directions. One common example of
full-duplex communication is the telephone network. When two people are communicating by a
telephone line, both can talk and listen at the same time. The full-duplex mode is used when
communication in both directions is required all the time. The capacity of the channel, however,
must be divided between the two directions.
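The three modes can be summarized as a rule about which stations may transmit at the same moment. The sketch below is a simplified model, not a description of any real link: the names Mode and allowed are hypothetical, and in simplex mode station A is assumed to be the only transmitter.

```python
from enum import Enum

class Mode(Enum):
    SIMPLEX = "simplex"
    HALF_DUPLEX = "half-duplex"
    FULL_DUPLEX = "full-duplex"

def allowed(mode: Mode, a_sends: bool, b_sends: bool) -> bool:
    """Is this combination of transmissions permitted on a link between A and B?"""
    if mode is Mode.SIMPLEX:
        return not b_sends                    # only A may ever transmit
    if mode is Mode.HALF_DUPLEX:
        return not (a_sends and b_sends)      # either side, but not both at once
    return True                               # full-duplex: both at the same time

print(allowed(Mode.HALF_DUPLEX, a_sends=True, b_sends=True))
print(allowed(Mode.FULL_DUPLEX, a_sends=True, b_sends=True))
```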
Types of Computer Networks:
A network is a set of devices (often referred to as nodes) connected by communication links. A
node can be a computer, printer, or any other device capable of sending and/or receiving data
generated by other nodes on the network.
* A metropolitan area network (MAN) is a network that covers a larger geographic area by interconnecting
different LANs to form a larger network.
* Government agencies use MANs to connect to citizens and private industries.
* In a MAN, various LANs are connected to each other through a telephone exchange line.
* The most widely used protocols in MANs are RS-232, Frame Relay, ATM, ISDN, etc.
* It has a greater range than a Local Area Network (LAN).
A wide area network, or WAN, spans a large geographical area, often a country or continent. It contains a collection
of machines intended for running user (i.e., application) programs. These machines are called hosts. In most WANs,
the network contains numerous transmission lines, each one connecting a pair of routers. If two routers that do not
share a transmission line wish to communicate, they must do this indirectly, via other routers. When a packet is sent from
one router to another via one or more intermediate routers, the packet is received at each intermediate router in its
entirety, stored there until the required output line is free, and then forwarded. A subnet organized according to this
principle is called a store-and-forward or packet-switched subnet.
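A rough feel for the cost of store-and-forward operation can be had from the sketch below. It rests on simplifying assumptions that are not in the text above: every link has the same data rate, and propagation and queuing delays are ignored, so each link adds one full packet transmission time. The function name is hypothetical.

```python
def store_and_forward_delay(packet_bits: int, link_rate_bps: float, num_links: int) -> float:
    """Seconds needed when the whole packet must arrive before it is forwarded on the next link."""
    return num_links * packet_bits / link_rate_bps

# An 8000-bit packet crossing 3 links (2 intermediate routers) at 1 Mbps
print(store_and_forward_delay(8000, 1_000_000, 3))
```

Because each router waits for the entire packet, the delay grows with the number of links on the path.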
• A Personal Area Network (PAN) is a network arranged around an individual person, typically within a range of 10 meters.
• A network used for connecting computer devices for personal use is known as a Personal Area
Network.
• Thomas Zimmerman was the first research scientist to propose the idea of the Personal Area Network.
• A Personal Area Network covers an area of about 30 feet.
• Personal devices used to build a personal area network include laptops, mobile phones,
media players, and play stations.
Topologies
What is Topology? :
• Topology defines the structure of the network, that is, how all the components are interconnected with each other.
• There are two types of topology: physical and logical topology.
• Physical topology is the geometric representation of all the nodes in a network.
1. Bus Topology:
• The bus topology is designed in such a way that all the stations are connected through a single cable known
as a backbone cable.
• Each node is either connected to the backbone cable by drop cable or directly connected to the backbone cable.
• When a node wants to send a message, it places the message on the backbone cable. All the stations
in the network receive the message, whether or not it is addressed to them.
• The bus topology is mainly used in 802.3 (Ethernet) and 802.4 standard networks.
• The configuration of a bus topology is simpler than that of other topologies.
• The backbone cable is considered as a "single lane" through which the message is broadcast to all the stations.
• The most common access method of the bus topologies is CSMA (Carrier Sense Multiple Access).
2. Ring Topology :
• Ring topology is like a bus topology, but with connected ends.
• The node that receives the message from the previous computer will retransmit to the next node.
• The data flows in one direction, i.e., it is unidirectional.
• The data flows continuously in a single loop, known as an endless loop.
• It has no terminated ends, i.e., each node is connected to another node and there is no termination point.
• The data in a ring topology flow in a clockwise direction.
• The most common access method of the ring topology is token passing.
Token passing: It is a network access method in which a token is passed from one node to another node.
Token: It is a frame that circulates around the network.
Advantages of Ring topology :
• Product availability: Many hardware and software tools for network operation and monitoring are
available.
• Cost: Twisted pair cabling is inexpensive and easily available. Therefore, the installation cost is very low.
• Reliable: It is a more reliable network because the communication system is not dependent on the single
host computer.
Disadvantages of Ring topology :
• Difficult troubleshooting: It requires specialized test equipment to determine the cable faults. If any fault
occurs in the cable, then it would disrupt the communication for all the nodes.
• Failure: The breakdown in one station leads to the failure of the overall network.
• Reconfiguration difficult: Adding new devices to the network would slow down the network.
• Delay: Communication delay is directly proportional to the number of nodes. Adding new devices
increases the communication delay.
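The token-passing access method used in a ring can be sketched as a single trip of the token around the nodes. This is a simplified model under stated assumptions: each node sends at most one message while it holds the token, and the function name and message format are hypothetical.

```python
def token_ring_round(nodes: list[str], pending: dict[str, str], start: int = 0) -> list[tuple[str, str]]:
    """Pass the token once around the ring; a node transmits only while holding the token."""
    sent = []
    n = len(nodes)
    for step in range(n):
        holder = nodes[(start + step) % n]   # the token moves in one direction only
        if holder in pending:
            sent.append((holder, pending[holder]))
    return sent

ring = ["A", "B", "C", "D"]
print(token_ring_round(ring, {"C": "hello", "A": "report"}, start=1))
```

A node that has data must wait until the token reaches it, which is one reason the delay grows as nodes are added to the ring.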
3. Star Topology:
• Star topology is an arrangement of the network in which every node is connected to the central hub, switch or a
central computer.
• The central computer is known as a server, and the peripheral devices attached to the server are known
as clients.
• Coaxial cable or RJ-45 cables are used to connect the computers.
• Hubs or Switches are mainly used as connection devices in a physical star topology.
• Star topology is the most popular topology in network implementation.
4. Tree topology
• Tree topology combines the characteristics of bus topology and star topology.
• A tree topology is a type of structure in which all the computers are connected with each other in hierarchical
fashion.
• The top-most node in tree topology is known as a root node, and all other nodes are the descendants of the root
node.
• Only one path exists between any two nodes for data transmission. Thus, it forms a parent-child
hierarchy.
5. Mesh topology
• Mesh topology is an arrangement of the network in which computers are interconnected with each other
through various redundant connections.
• There are multiple paths from one computer to another computer.
• It does not contain the switch, hub or any central computer which acts as a central point of communication.
• The Internet is an example of the mesh topology.
• Mesh topology is mainly used for WAN implementations where communication failures are a critical concern.
• Mesh topology is mainly used for wireless networks.
• The number of links in a fully connected mesh of n nodes is given by the formula: n(n-1)/2.
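In a fully connected mesh every pair of nodes needs its own link, so n nodes need n(n-1)/2 links. The short sketch below compares this with a star, where each node needs only one link to the central device. The function names are hypothetical, and the hub is assumed not to be counted as one of the n nodes.

```python
def mesh_links(n: int) -> int:
    """Links in a fully connected mesh of n nodes: one per pair of nodes."""
    return n * (n - 1) // 2

def star_links(n: int) -> int:
    """Links in a star with n nodes attached to a central hub or switch."""
    return n

for n in (5, 10, 100):
    print(n, mesh_links(n), star_links(n))
```

The quadratic growth of the mesh count is why full meshes are reserved for cases where redundancy matters more than cabling cost.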
6. Hybrid Topology
• A combination of different topologies is known as a hybrid topology.
• A hybrid topology is a connection between different links and nodes to transfer data.
• When two or more different topologies are combined, the result is termed a hybrid topology; connecting similar
topologies with each other does not result in a hybrid topology. For example, if there is a ring
topology in one branch of ICICI bank and a bus topology in another branch of ICICI bank, connecting these two
topologies results in a hybrid topology.
THE INTERNET
The Internet has revolutionized many aspects of our daily lives. It has affected the way we do
business as well as the way we spend our leisure time. Count the ways you've used the Internet
recently. Perhaps you've sent electronic mail (e-mail) to a business associate, paid a utility bill,
read a newspaper from a distant city, or looked up a local movie schedule, all by using the Internet.
The Internet is a communication system that has brought a wealth of information to our fingertips
and organized it for our use.
A Brief History :
A network is a group of connected communicating devices such as computers and printers. An
internet (note the lowercase letter i) is two or more networks that can communicate with each other.
The most notable internet is called the Internet (uppercase letter I), a collaboration of more than
hundreds of thousands of interconnected networks. The Advanced Research Projects Agency
(ARPA) in the Department of Defense (DoD) was interested in finding a way to connect computers
so that the researchers they funded could share their findings, thereby reducing costs and
eliminating duplication of effort. In 1967, at an Association for Computing Machinery (ACM)
meeting, ARPA presented its ideas for ARPANET, a small network of connected
computers.
The original TCP was later split into two protocols: Transmission Control Protocol (TCP) and
Internetworking Protocol (IP). IP would handle datagram routing while TCP would be responsible for
higher-level functions such as segmentation, reassembly, and error detection. The internetworking
protocol became known as TCP/IP.
The Internet Today
The Internet has come a long way since the 1960s. The Internet today is not a simple hierarchical
structure. It is made up of many wide- and local-area networks joined by connecting devices and
switching stations. It is difficult to give an accurate representation of the Internet because it is
continually changing: new networks are being added, existing networks are adding addresses, and
networks of defunct companies are being removed.
An Internet service provider (ISP) is a company that provides Internet connections and services to
individuals and organizations. In addition to providing access to the Internet, ISPs may also provide
software packages (such as browsers), e-mail accounts, and a personal Web site or home page.
POP (Post Office Protocol) is an application-layer protocol that lets users fetch and
receive their e-mail.
International Internet Service Providers:
At the top of the hierarchy are the international service providers that connect nations
together.
National Internet Service Providers:
The national Internet service providers are backbone networks created and
maintained by specialized companies. There are many national ISPs operating in North
America; some of the most well known are SprintLink, PSINet, UUNet Technology, AGIS,
and internetMCI. To provide connectivity between the end users, these backbone networks
are connected by complex switching stations (normally run by a third party) called network
access points (NAPs). Some national ISP networks are also connected to one another by
private switching stations called peering points. These normally operate at a high data rate
(up to 600 Mbps).
Regional Internet Service Providers:
Regional internet service providers or regional ISPs are smaller ISPs that are
connected to one or more national ISPs. They are at the third level of the hierarchy with a
smaller data rate.
Local Internet Service Providers:
Local Internet service providers provide direct service to the end users. The local
ISPs can be connected to regional ISPs or directly to national ISPs. Most end users are
connected to the local ISPs. Note that in this sense, a local ISP can be a company that just
provides Internet services, a corporation with a network that supplies services to its own
employees, or a nonprofit organization, such as a college or a university, that runs its own
network. Each of these local ISPs can be connected to a regional or national service provider.
SWITCHING
A network is a set of connected devices. Whenever we have multiple devices, we have the problem
of how to connect them to make one-to-one communication possible. One solution is to make a
point-to-point connection between each pair of devices (a mesh topology) or between a central
device and every other device (a star topology). These methods, however, are impractical and
wasteful when applied to very large networks. The number and length of the links require too
much infrastructure to be cost-efficient, and the majority of those links would be idle most of the
time. Other topologies employing multipoint connections, such as a bus, are ruled out because the
distances between devices and the total number of devices increase beyond the capacities of the
media and equipment.
A better solution is switching. A switched network consists of a series of interlinked nodes, called
switches. Switches are devices capable of creating temporary connections between two or more
devices linked to the switch. In a switched network, some of these nodes are connected to the end
systems (computers or telephones, for example). Others are used only for routing. Figure
8.1 shows a switched network.
The end systems (communicating devices) are labeled A, B, C, D, and so on, and the switches are
labeled I, II, III, IV, and V. Each switch is connected to multiple links. Traditionally, three
methods of switching have been important: circuit switching, packet switching, and message
switching. The first two are commonly used today. The third has been phased out in general
communications but still has networking applications. We can then divide today's networks into
three broad categories: circuit-switched networks, packet-switched networks, and message-switched
networks.
CIRCUIT-SWITCHED NETWORKS
A circuit-switched network consists of a set of switches connected by physical links. A connection
between two stations is a dedicated path made of one or more links. However, each connection
uses only one dedicated channel on each link. Each link is normally divided into n channels by
using FDM or TDM. Figure 8.3 shows a trivial circuit-switched network with four switches and
four links. Each link is divided into n (n is 3 in the figure) channels by using FDM or TDM.
The end systems, such as computers or telephones, are directly connected to a switch. We have
shown only two end systems for simplicity. When end system A needs to communicate with end
system M, system A needs to request a connection to M that must be accepted by all switches as
well as by M itself. This is called the setup phase; a circuit (channel) is reserved on each link, and
the combination of circuits or channels defines the dedicated path. After the dedicated path made
of connected circuits (channels) is established, data transfer can take place. After all data have been
transferred, the circuits are torn down.
Three Phases
The actual communication in a circuit-switched network requires three phases: connection setup,
data transfer, and connection teardown.
Setup Phase
Before the two parties (or multiple parties in a conference call) can communicate, a dedicated
circuit (combination of channels in links) needs to be established. The end systems are normally
connected through dedicated lines to the switches, so connection setup means creating dedicated
channels between the switches. For example, in Figure 8.3, when system A needs to connect to
system M, it sends a setup request that includes the address of system M, to switch I. Switch I finds
a channel between itself and switch IV that can be dedicated for this purpose. Switch I then sends
the request to switch IV, which finds a dedicated channel between itself and switch III. Switch III
informs system M of system A's intention at this time. In the next step to making a connection, an
acknowledgment from system M needs to be sent in the opposite direction to system A. Only after
system A receives this acknowledgment is the connection established.
Data Transfer Phase
After the establishment of the dedicated circuit (channels), the two parties can transfer data.
Teardown Phase
When one of the parties needs to disconnect, a signal is sent to each switch to release the resources.
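The three phases can be sketched with a toy model in which each link has a fixed number of channels and a connection reserves one channel on every link of its path. This is an illustration under assumptions, not a real signaling protocol: the class name, the method names, and the idea of passing the path in explicitly are all hypothetical, and the acknowledgment step is reduced to a True/False return value.

```python
class CircuitNetwork:
    def __init__(self, links: list[tuple[str, str]], channels_per_link: int):
        # number of free channels on each (undirected) link between switches
        self.free = {frozenset(link): channels_per_link for link in links}

    def _hops(self, path: list[str]) -> list[frozenset]:
        return [frozenset(pair) for pair in zip(path, path[1:])]

    def setup(self, path: list[str]) -> bool:
        """Setup phase: reserve one channel on every link, or refuse the connection."""
        hops = self._hops(path)
        if any(self.free.get(hop, 0) == 0 for hop in hops):
            return False                  # some link has no channel left to dedicate
        for hop in hops:
            self.free[hop] -= 1
        return True

    def teardown(self, path: list[str]) -> None:
        """Teardown phase: release the channels so other connections can use them."""
        for hop in self._hops(path):
            self.free[hop] += 1

net = CircuitNetwork([("I", "IV"), ("IV", "III")], channels_per_link=3)
path = ["I", "IV", "III"]        # the path used in the setup example above
print(net.setup(path))           # data transfer can start only if this succeeds
net.teardown(path)
```

Between setup and teardown the reserved channels stay dedicated to the connection even when no data is flowing.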
1. PACKET SWITCHED NETWORK
In a Computer Network, the communication between two ends is done in blocks of data called
packets. So instead of continuous communication the exchange takes place in the form of
individual packets between the two computers. This allows us to make the switches function for
both storing and forwarding because a packet is an independent entity that can be stored and
sent later.
a. DATAGRAM NETWORKS
In data communications, we need to send messages from one end system to another. If the message
is going to pass through a packet-switched network, it needs to be divided into packets of fixed or
variable size. The size of the packet is determined by the network and the governing protocol. In
packet switching, there is no resource allocation for a packet. This means that there is no reserved
bandwidth on the links, and there is no scheduled processing time for each packet. Resources are
allocated on demand. The allocation is done on a first come, first-served basis. When a switch
receives a packet, no matter what the source or destination is, the packet must wait if there are
other packets being processed. As with other systems in our daily life, this lack of reservation may
create delay. For example, if we do not have a reservation at a restaurant, we
might have to wait. In a datagram network, each packet is treated independently of all others. Even
if a packet is part of a multipacket transmission, the network treats it as though it existed alone.
Packets in this approach are referred to as datagrams. Datagram switching is normally done at the
network layer. The switches in a datagram network are traditionally referred to as routers.
In this example, all four packets (or datagrams) belong to the same message, but may travel
different paths to reach their destination. This is so because the links may be involved in carrying
packets from other sources and do not have the necessary bandwidth available to carry all the
packets from A to X. This approach can cause the datagrams of a transmission to arrive at their
destination out of order, with different delays between the packets. Packets may also be lost or dropped
because of a lack of resources. In most protocols, it is the responsibility of an upper-layer protocol to
reorder the datagrams or ask for lost datagrams before passing them on to the application. The datagram
networks are sometimes referred to as connectionless networks. The term connectionless here means
that the switch (packet switch) does not keep information about the connection state. There are no setup
or teardown phases. Each packet is treated the same by a switch regardless of its source or destination.
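The effect of independent delivery can be sketched as follows. The model is an assumption made for illustration: each datagram carries a sequence number and receives a random delay standing in for its path, and an upper-layer step restores the order and reports gaps. All names are hypothetical.

```python
import random

def deliver(datagrams: list[tuple[int, str]], seed: int = 0) -> list[tuple[int, str]]:
    """Give every datagram an independent random delay and return them in arrival order."""
    rng = random.Random(seed)
    arrivals = [(rng.random(), d) for d in datagrams]
    return [d for _, d in sorted(arrivals, key=lambda a: a[0])]

def reassemble(received: list[tuple[int, str]]) -> str:
    """Upper-layer step: put the datagrams back in sequence order."""
    return "".join(payload for _, payload in sorted(received))

def missing(received: list[tuple[int, str]], total: int) -> list[int]:
    """Sequence numbers that never arrived and would have to be requested again."""
    seen = {seq for seq, _ in received}
    return [seq for seq in range(total) if seq not in seen]

message = [(0, "NET"), (1, "WOR"), (2, "KS")]
arrived = deliver(message, seed=1)
print(arrived)                 # arrival order depends on the random delays
print(reassemble(arrived))     # original order restored by sequence number
```

The switches themselves keep no state about the message; only the end systems know which datagrams belong together.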
b. VIRTUAL-CIRCUIT NETWORKS
A virtual-circuit network is a cross between a circuit-switched network and a datagram network. It has
some characteristics of both.
1. As in a circuit-switched network, there are setup and teardown phases in addition to the data transfer phase.
2. Resources can be allocated during the setup phase, as in a circuit-switched network, or on demand,
as in a datagram network.
3. As in a datagram network, data are packetized and each packet carries an address in the header.
However, the address in the header has local jurisdiction, not end-to-end jurisdiction. The reader may
ask how the intermediate switches know where to send the packet if there is no final destination address
carried by a packet.
4. As in a circuit-switched network, all packets follow the same path established during the connection.
5. A virtual-circuit network is normally implemented in the data link layer, while a circuit-switched
network is implemented in the physical layer and a datagram network in the network layer. But this may
change in the future.
Figure 8.10 is an example of a virtual-circuit network. The network has switches that allow traffic from
sources to destinations. A source or destination can be a computer, packet switch, bridge, or any other
device that connects other networks.
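One common way to realize an address with only local jurisdiction is a per-switch table that maps an incoming port and label to an outgoing port and a new label, filled in during the setup phase. The sketch below assumes that scheme; the table contents, port numbers, and label values (called VCIs here) are hypothetical and are not taken from Figure 8.10.

```python
# Hypothetical per-switch tables: (incoming port, incoming VCI) -> (outgoing port, outgoing VCI)
SWITCH_1 = {(1, 14): (3, 66)}
SWITCH_2 = {(2, 66): (4, 22)}

def forward(table: dict, in_port: int, in_vci: int) -> tuple[int, int]:
    """Look up the local label and return the outgoing port with the rewritten label."""
    try:
        return table[(in_port, in_vci)]
    except KeyError:
        raise LookupError(f"no circuit set up for port {in_port}, VCI {in_vci}") from None

# A frame enters switch 1 on port 1 with label 14 and is relabeled at each hop.
out_port, vci = forward(SWITCH_1, 1, 14)
print(out_port, vci)
out_port, vci = forward(SWITCH_2, 2, vci)   # assumes switch 1 port 3 is wired to switch 2 port 2
print(out_port, vci)
```

Because each switch consults only its own table, no packet needs to carry the final destination address once the circuit is set up.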
LAYERED TASKS
On the Way: The letter is then on its way to the recipient. On the way to
the recipient's local post office, the letter may actually go through a central
office. In addition, it may be transported by truck, train, airplane, boat, or
a combination of these.
At the Receiver Site
• Lower layer. The carrier transports the letter to the post office.
• Middle layer. The letter is sorted and delivered to the recipient's mailbox.
• Higher layer. The receiver picks up the letter, opens the envelope, and reads it.
Fig.4: The OSI reference model
The Physical Layer:
The physical layer is concerned with transmitting raw bits over a communication channel. The
design issues have to do with making sure that when one side sends a 1 bit, it is received by the
other side as a 1 bit, not as a 0 bit.
The Data Link Layer:
The main task of the data link layer is to transform a raw transmission facility into a line that
appears free of undetected transmission errors to the network layer. It accomplishes this task by
having the sender break up the input data into data frames (typically a few hundred or a few
thousand bytes) and transmit the frames sequentially. If the service is reliable, the receiver
confirms correct receipt of each frame by sending back an acknowledgement frame.
Another issue that arises in the data link layer (and most of the higher layers as well) is how to
keep a fast transmitter from drowning a slow receiver in data. Some traffic regulation mechanism
is often needed to let the transmitter know how much buffer space the receiver has at the moment.
Frequently, this flow regulation and the error handling are integrated.
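Framing and acknowledgement can be sketched as below. The frame layout is an assumption chosen for illustration (a 1-byte sequence number, the payload, and a 4-byte CRC-32 trailer computed with zlib.crc32); real data link protocols define their own formats, and the function names are hypothetical.

```python
import zlib

def make_frames(data: bytes, payload_size: int) -> list[bytes]:
    """Split data into frames: sequence byte + payload + CRC-32 trailer."""
    frames = []
    for seq, start in enumerate(range(0, len(data), payload_size)):
        body = bytes([seq % 256]) + data[start:start + payload_size]
        frames.append(body + zlib.crc32(body).to_bytes(4, "big"))
    return frames

def receive(frame: bytes) -> tuple:
    """Return ('ACK', seq, payload) if the checksum matches, otherwise ('NAK', None, b'')."""
    body, trailer = frame[:-4], frame[-4:]
    if zlib.crc32(body).to_bytes(4, "big") != trailer:
        return ("NAK", None, b"")
    return ("ACK", body[0], body[1:])

frames = make_frames(b"hello world", payload_size=4)
print([receive(f) for f in frames])
damaged = bytes([frames[1][0] ^ 1]) + frames[1][1:]   # flip one bit in transit
print(receive(damaged))
```

The receiver acknowledges only frames whose checksum matches, so a damaged frame is detected rather than passed up to the network layer.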
The Network Layer:
The network layer controls the operation of the subnet. A key design issue is determining how
packets are routed from source to destination. Routes can be based on static tables that are ''wired
into'' the network and rarely changed. They can also be determined at the start of each conversation,
for example, a terminal session (e.g., a login to a remote machine). Finally, they can be highly
dynamic, being determined anew for each packet, to reflect the current network load.
If too many packets are present in the subnet at the same time, they will get in one another's way,
forming bottlenecks. The control of such congestion also belongs to the network layer. More
generally, the quality of service provided (delay, transit time, jitter, etc.) is also a network layer
issue.
When a packet has to travel from one network to another to get to its destination, many problems
can arise. The addressing used by the second network may be different from the first one. The
second one may not accept the packet at all because it is too large. The protocols may differ, and
so on. It is up to the network layer to overcome all these problems to allow heterogeneous networks
to be interconnected. In broadcast networks, the routing problem is simple, so the network layer is
often thin or even nonexistent.
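Routing with static tables can be sketched as a next-hop lookup repeated at every router. The router names, the host name, and the table contents below are hypothetical; the hop limit is an added safeguard against loops in a badly configured table.

```python
# Hypothetical static tables: router -> {destination: next hop}
ROUTES = {
    "R1": {"H2": "R2"},
    "R2": {"H2": "R3"},
    "R3": {"H2": "H2"},
}

def route(tables: dict, first_router: str, dest: str, max_hops: int = 16) -> list[str]:
    """Follow next-hop entries from the first router until the destination is reached."""
    path = [first_router]
    node = first_router
    while node != dest:
        if len(path) > max_hops or dest not in tables.get(node, {}):
            raise LookupError(f"no route from {node} to {dest}")
        node = tables[node][dest]
        path.append(node)
    return path

print(route(ROUTES, "R1", "H2"))
```

A dynamic scheme would recompute the table entries as the network load changes; the forwarding step itself would look the same.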
The Transport Layer:
The basic function of the transport layer is to accept data from above, split it up into smaller units
if need be, pass these to the network layer, and ensure that the pieces all arrive correctly at the
other end. Furthermore, all this must be done efficiently and in a way that isolates the upper layers
from the inevitable changes in the hardware technology. The transport layer also determines what
type of service to provide to the session layer, and, ultimately, to the users of the network. The
most popular type of transport connection is an error-free point-to-point channel that delivers
messages or bytes in the order in which they were sent. However, other possible kinds of transport
service are the transporting of isolated messages, with no guarantee about the order of delivery,
and the broadcasting of messages to multiple destinations. The type of service is determined when
the connection is established.
The transport layer is a true end-to-end layer, all the way from the source to the destination. In
other words, a program on the source machine carries on a conversation with a similar program on
the destination machine, using the message headers and control messages.
The Session Layer:
The session layer allows users on different machines to establish sessions between them. Sessions
offer various services, including dialog control (keeping track of whose turn it is to transmit), token
management (preventing two parties from attempting the same critical operation at the same time),
and synchronization (checkpointing long transmissions to allow them to continue from where they
were after a crash).
The Presentation Layer:
The presentation layer is concerned with the syntax and semantics of the information transmitted.
In order to make it possible for computers with different data representations to communicate, the
data structures to be exchanged can be defined in an abstract way, along with a standard encoding
to be used ''on the wire.'' The presentation layer manages these abstract data structures and allows
higher-level data structures (e.g., banking records), to be defined and exchanged.
The Application Layer:
The application layer contains a variety of protocols that are commonly needed by users. One
widely-used application protocol is HTTP (Hypertext Transfer Protocol), which is the basis for the
World Wide Web. When a browser wants a Web page, it sends the name of the page it wants to
the server using HTTP. The server then sends the page back. Other application protocols are used
for file transfer, electronic mail, and network news.
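What the browser sends can be sketched by building the request text by hand. This is a minimal illustration of an HTTP/1.1 GET request, not a full client: the host example.com and the page name are placeholders, the function name is hypothetical, and nothing is actually sent over a network.

```python
def build_get_request(host: str, page: str) -> bytes:
    """Build a minimal HTTP/1.1 GET request naming the page the browser wants."""
    lines = [
        f"GET {page} HTTP/1.1",   # request line: method, page name, protocol version
        f"Host: {host}",          # which server the request is meant for
        "Connection: close",      # ask the server to close the connection afterwards
        "",                       # blank line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")

print(build_get_request("example.com", "/index.html").decode("ascii"))
```

The server's reply has a similar text header followed by the page itself.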
3. Transport Layer
4. Application Layer
[Figure: The TCP/IP reference model, top to bottom: Application Layer, Transport Layer, Internet Layer, Host-to-Network Layer]
Host-to-Network Layer:
The TCP/IP reference model does not really say much about what happens here, except to point
out that the host has to connect to the network using some protocol so it can send IP packets to it.
This protocol is not defined and varies from host to host and network to network.
Internet Layer:
This layer, called the internet layer, is the linchpin that holds the whole architecture together. Its
job is to permit hosts to inject packets into any network and have them travel independently to the
destination (potentially on a different network). They may even arrive in a different order than they
were sent, in which case it is the job of higher layers to rearrange them, if in-order delivery is
desired. Note that ''internet'' is used here in a generic sense, even though this layer is present in the
Internet.
The internet layer defines an official packet format and protocol called IP (Internet Protocol). The
job of the internet layer is to deliver IP packets where they are supposed to go. Packet routing is
clearly the major issue here, as is avoiding congestion. For these reasons, it is reasonable to say
that the TCP/IP internet layer is similar in functionality to the OSI network layer. Fig. shows this
correspondence.
The Transport Layer:
The layer above the internet layer in the TCP/IP model is now usually called the transport layer.
It is designed to allow peer entities on the source and destination hosts to carry on a conversation,
just as in the OSI transport layer. Two end-to-end transport protocols have been defined here. The
first one, TCP (Transmission Control Protocol), is a reliable connection-oriented protocol that
allows a byte stream originating on one machine to be delivered without error on any other machine
in the internet. It fragments the incoming byte stream into discrete messages and passes each one
on to the internet layer. At the destination, the receiving TCP process reassembles the received
messages into the output stream. TCP also handles flow control
to make sure a fast sender cannot swamp a slow receiver with more messages than it can handle.
The Application Layer:
The TCP/IP model does not have session or presentation layers. On top of the transport layer is the
application layer. It contains all the higher-level protocols. The early ones included virtual terminal
(TELNET), file transfer (FTP), and electronic mail (SMTP), as shown in Fig.6.2. The virtual
terminal protocol allows a user on one machine to log onto a distant machine and work there. The
file transfer protocol provides a way to move data efficiently from one machine to another.
Electronic mail was originally just a kind of file transfer, but later a specialized protocol (SMTP)
was developed for it. Many other protocols have been added to these over the years: the Domain
Name System (DNS) for mapping host names onto their network addresses, NNTP, the protocol
for moving USENET news articles around, and HTTP, the protocol for fetching pages on the World
Wide Web, and many others.
Finally, the peer protocols used in a layer are the layer's own business. It can use any protocols it
wants to, as long as it gets the job done (i.e., provides the offered services). It can also change
them at will without affecting software in higher layers.
The TCP/IP model did not originally clearly distinguish between service, interface, and protocol,
although people have tried to retrofit it after the fact to make it more OSI-like. For example, the
only real services offered by the internet layer are SEND IP PACKET and RECEIVE IP PACKET.
Comparison of the OSI and TCP/IP models:
2. OSI model: The transport layer guarantees the delivery of packets.
   TCP/IP model: The transport layer does not guarantee the delivery of packets. Still, the TCP/IP model is more reliable.
4. OSI model: Has a separate Presentation layer and Session layer.
   TCP/IP model: Does not have a separate Presentation layer or Session layer.
8. OSI model: The Network layer provides both connection-oriented and connectionless service.
   TCP/IP model: The Network layer provides connectionless service.
9. OSI model: Has a problem of fitting the protocols into the model.
   TCP/IP model: Does not fit any protocol.
10. OSI model: Protocols are hidden and are easily replaced as the technology changes.
    TCP/IP model: Replacing a protocol is not easy.
Introduction to Sockets
A socket is one endpoint of a two-way communication link between two
programs running on the network. The socket mechanism provides a means of
inter-process communication (IPC) by establishing named contact points between which the
communication takes place.
Just as the ‘pipe’ system call creates a pipe, the ‘socket’ system call creates a
socket. A socket provides a bidirectional FIFO communication facility over the network.
A socket connecting to the network is created at each end of the communication. Each
socket has a specific address, composed of an IP address and a port
number.
Sockets are generally employed in client-server applications. The server creates a
socket, attaches it to a network port address, and then waits for the client to contact it. The
client creates a socket and then attempts to connect to the server socket. When the
connection is established, the transfer of data takes place.
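The server and client steps described above can be sketched with Python's standard socket module. This is a minimal illustration, not a production server: the function name run_echo_exchange is hypothetical, the server runs in a background thread on the loopback address, and it serves exactly one client before closing.

```python
import socket
import threading


def run_echo_exchange(message: bytes) -> bytes:
    """Start a one-shot echo server on localhost, connect a client, return the reply."""
    # Server side: create a socket, attach it to a port, then wait for a client.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 asks the OS to pick a free port
    server.listen(1)
    host, port = server.getsockname()

    def serve_one_client() -> None:
        conn, _client_address = server.accept()  # blocks until a client connects
        with conn:
            data = conn.recv(1024)
            conn.sendall(data)  # echo the received bytes back to the client

    worker = threading.Thread(target=serve_one_client)
    worker.start()

    # Client side: create a socket and connect it to the server's socket address.
    with socket.create_connection((host, port), timeout=5) as client:
        client.sendall(message)
        reply = client.recv(1024)

    worker.join()
    server.close()
    return reply
```

Calling run_echo_exchange(b"hello") sends the bytes from the client socket to the server socket and returns what the server echoes back.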
Socket Addresses
The interaction between a client and a server is two-way communication. In a two-way
communication, we need a pair of addresses:
local (sender) and remote (receiver).
The local address in one direction is the remote address in the other direction, and vice
versa. Because communication in the client/server paradigm is between two sockets,
we need a pair of socket addresses for communication:
a local socket address and a remote socket address.
A socket address should first define the computer on which a client or a server is
running. A computer in the Internet is uniquely defined by its IP address, a 32-bit integer
in the current Internet version (IPv4). An application program can be defined by a port number,
a 16-bit integer. This means that a socket address should be a combination of an IP
address and a port number as shown in Figure 10.7.
Because a socket defines the end-point of the communication, we can say that a socket
is identified by a pair of socket addresses, a local and a remote.
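To make the sizes concrete, the following sketch packs an IPv4 address (32 bits) and a port number (16 bits) into six bytes and unpacks them again. The function names and the sample address 192.0.2.10 are illustrative assumptions; the sketch only shows the arithmetic and is not how the operating system stores socket addresses internally.

```python
import ipaddress
import struct
from typing import Tuple


def pack_socket_address(ip: str, port: int) -> bytes:
    """Pack an IPv4 address (32 bits) and a port number (16 bits) into 6 bytes."""
    ip_as_int = int(ipaddress.IPv4Address(ip))  # the address as a 32-bit integer
    return struct.pack("!IH", ip_as_int, port)  # "!" selects network byte order


def unpack_socket_address(raw: bytes) -> Tuple[str, int]:
    """Recover the dotted-decimal IP address and the port number."""
    ip_as_int, port = struct.unpack("!IH", raw)
    return str(ipaddress.IPv4Address(ip_as_int)), port
```

For example, pack_socket_address("192.0.2.10", 80) yields six bytes: four for the address and two for the port.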
Server Site
The server needs a local (server) and a remote (client) socket address for communication.
Local Socket Address The local (server) socket address is provided by the operating
system. The operating system knows the IP address of the computer on which the server
process is running. The port number of a server process, however, needs to be assigned.
If the server process is a standard one defined by the Internet authority, a port number
is already assigned to it. When a server starts running, it knows the local socket address.
Remote Socket Address The remote socket address for a server is the socket address
of the client that makes the connection. Because the server can serve many clients, it
does not know beforehand the remote socket address for communication. The server
can find this socket address when a client tries to connect to the server. The client
socket address, which is contained in the request packet sent to the server, becomes
the remote socket address that is used for responding to the client.
Client Site
The client also needs a local (client) and a remote (server) socket address for communication.
Local Socket Address The local (client) socket address is also provided by the operating
system. The operating system knows the IP address of the computer on which the client
is running. The port number, however, is a 16-bit temporary integer that is assigned to
a client process each time the process needs to start the communication. It is
chosen from a set of integers defined by the Internet
authority and called the ephemeral (temporary) port numbers. The operating system
must also guarantee that the new port number is not used by any other running
client process.
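The assignment of a temporary port can be observed with a short sketch. It assumes the common convention that binding to port 0 asks the operating system to choose an unused port; the function name is hypothetical, and the exact range of ephemeral ports depends on the operating system.

```python
import socket
from typing import Tuple


def ephemeral_local_address() -> Tuple[str, int]:
    """Let the operating system choose a free temporary port on the loopback address."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(("127.0.0.1", 0))  # port 0: the OS picks an unused port
        return sock.getsockname()    # (IP address, port number actually assigned)
```

A client normally does not bind explicitly; the operating system performs the same assignment when the client connects.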
Remote Socket Address Finding the remote (server) socket address for a client, however,
needs more work. When a client process starts, it should know the socket address of the
server it wants to connect to. There are two situations in this case.
Sometimes, the user who starts the client process knows both the server port number and the IP
address of the computer on which the server is running. This usually occurs when we have
written client and server applications and want to test them.
Although each standard application has a well-known port number, most of the time we do not
know the IP address. This happens when we need to contact a web page, send an
e-mail to a friend, or copy a file from a remote site. In these situations, the server has a name, an
identifier that uniquely defines the server process. Examples of these identifiers are URLs, such as
www.xxx.yyy, or e-mail addresses, such as xxxx@yyyy.com. The client process should now change
this identifier (name) to the corresponding server socket address.
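Changing a name into a server socket address can be sketched with socket.getaddrinfo from the Python standard library. The function name is hypothetical, and the example uses the name localhost so that it works without Internet access; a real client would pass the server's host name and its well-known port.

```python
import socket
from typing import List, Tuple


def server_socket_addresses(host_name: str, port: int) -> List[Tuple[str, int]]:
    """Map a host name and a well-known port to candidate (IP address, port) pairs."""
    infos = socket.getaddrinfo(host_name, port, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr starts with the IP address and the port number.
    return [info[4][:2] for info in infos]
```

The result may contain more than one pair, for instance one IPv4 and one IPv6 address for the same name.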
Application-Layer Paradigms
Two paradigms have been developed for the application layer:
1. Traditional Paradigm : Client-Server
2. New Paradigm : Peer-to-Peer
Client-Server Paradigm
o The traditional paradigm is called the client-server paradigm.
o It was the most popular Paradigm.
o In this paradigm, the service provider is an application program, called the server process; it
runs continuously, waiting for another application program, called the client process, to make
a connection through the Internet and ask for service.
o The server process must be running all the time; the client process is started when the client
needs to receive service.
o There are normally some server processes that can provide a specific type of service, but
there are many clients that request service from any of these server processes.
Peer-to-Peer(P2P) Paradigm
o A new paradigm, called the peer-to-peer paradigm has emerged to respond to the needs of
some new applications.
o In this paradigm, there is no need for a server process to be running all the time and waiting
for the client processes to connect.
o The responsibility is shared between peers.
o A computer connected to the Internet can provide service at one time and receive service at
another time.
o A computer can even provide and receive services at the same time.
Mixed Paradigm
o An application may choose to use a mixture of the two paradigms by combining the
advantages of both.
o For example, a light-load client-server communication can be used to find the address of
the peer that can offer a service.
o When the address of the peer is found, the actual service can be received from the peer by
using the peer-to-peer paradigm.
Types of Application Protocols:
Standard and Nonstandard Protocols
o Each standard protocol is a pair of computer programs that interact with the
user and the transport layer to provide a specific service to the user.
An HTTP message has the following general format:
START_LINE <CRLF>
MESSAGE_HEADER <CRLF>
<CRLF>
MESSAGE_BODY <CRLF>
where <CRLF> stands for carriage-return-line-feed.
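The layout above can be assembled as a string. The sketch below is illustrative only: build_message is a hypothetical helper, the host name www.example.com is a placeholder, and the body is appended unchanged after the blank line that ends the headers.

```python
CRLF = "\r\n"  # carriage return followed by line feed


def build_message(start_line: str, headers: dict, body: str = "") -> str:
    """Join a start line, header lines, a blank line, and an optional body."""
    header_lines = "".join(f"{name}: {value}{CRLF}" for name, value in headers.items())
    # The empty line (a lone CRLF) separates the headers from the body.
    return start_line + CRLF + header_lines + CRLF + body
```

For example, build_message("GET /index.html HTTP/1.1", {"Host": "www.example.com"}) produces a start line, one header line, and the blank line that closes the header section.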
Features of HTTP
o Connectionless protocol:
HTTP is described as connectionless in the sense that the client and server are connected only for
the duration of an exchange (HTTP itself runs over TCP). The HTTP client initiates a request and
waits for a response from the server. When the server receives the request, it processes the request
and sends back the response to the HTTP client, after which the client closes the connection. With
non-persistent connections, the connection between client and server exists only during the current
request and response.
o Media independent:
HTTP is media independent: any type of data can be sent as long as both the client and the server
know how to handle the data content. Both the client and the server are required to specify the
content type using the appropriate MIME type in the Content-Type header.
o Stateless:
HTTP is a stateless protocol: the client and server know about each other only during the current
request. Because of this, neither the client nor the server retains information
between different requests for web pages.
HTTP Request And Response Messages
• The HTTP protocol defines the format of the request and response messages.
• Request Message: The request message is sent by the client that consists of a request line,
headers, and sometimes a body.
• Response Message: The response message is sent by the server to the client that consists of
a status line, headers, and sometimes a body.
Request Line
• There are three fields in this request line - Method, URL and Version.
• The Method field defines the request types.
• The URL field defines the address and name of the corresponding web page.
• The Version field gives the version of the protocol, for example HTTP/1.1 (newer versions
such as HTTP/2 also exist).
• Some of the Method types are:
Request Header
• Each request header line sends additional information from the client to the server.
• Each header line has a header name, a colon, a space, and a header value.
• The value field defines the values associated with each header name.
• Headers defined for request message include:
Body
• The body can be present in a request message. It is optional.
• Usually, it contains the comment to be sent or the file to be published on the website when
the method is PUT or POST.
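The parts of a request message described above (request line, header lines, optional body) can be separated with a few string operations. This is a simplified sketch: parse_request is a hypothetical helper, the sample request is invented, and real HTTP parsers handle many cases (repeated headers, malformed input, binary bodies) that are ignored here.

```python
def parse_request(raw: str):
    """Split a raw HTTP request into method, URL, version, headers, and body."""
    head, _, body = raw.partition("\r\n\r\n")  # a blank line ends the headers
    lines = head.split("\r\n")
    method, url, version = lines[0].split(" ", 2)  # the three request-line fields
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")  # header name, colon, header value
        headers[name.strip()] = value.strip()
    return method, url, version, headers, body
```

For a POST request the returned body holds the data sent to the server; for a GET request it is usually empty.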
Conditional Request
• A client can add a condition in its request.
• In this case, the server will send the requested web page if the condition is met or inform
the client otherwise.
• One of the most common conditions imposed by the client is the time and date the web
page is modified.
• The client can send the header line If-Modified-Since with the request to tell the server that
it needs the page only if it is modified after a certain point in time.
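The server-side decision for an If-Modified-Since request can be sketched as follows. The function name is hypothetical; the status codes 200 (OK, page sent) and 304 (Not Modified) are the standard HTTP codes for the two outcomes, and the date is parsed with email.utils.parsedate_to_datetime from the standard library.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def conditional_status(if_modified_since: str, last_modified: datetime) -> int:
    """Return 200 if the page changed after the client's date, otherwise 304."""
    threshold = parsedate_to_datetime(if_modified_since)  # e.g. "Sat, 01 Jan 2022 00:00:00 GMT"
    return 200 if last_modified > threshold else 304
```

Here last_modified is expected to be a timezone-aware datetime, for example one created with tzinfo=timezone.utc.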
Response Header
• Each header provides additional information to the client.
• Each header line has a header name, a colon, a space, and a header value.
• Some of the response headers are:
Body
• The body contains the document to be sent from the server to the client.
• The body is present unless the response is an error message.
HTTP CONNECTIONS
• HTTP Clients and Servers exchange multiple messages over the same TCP connection.
• If some of the objects are located on the same server, we have two choices: to retrieve each
object using a new TCP connection or to make a TCP connection and retrieve them all.
• The first method is referred to as a non-persistent connection, the second as a persistent
connection.
• HTTP 1.0 uses non-persistent connections by default and HTTP 1.1 uses persistent connections by default.
Non-Persistent Connections
• In a non-persistent connection, one TCP connection is made for each request/response.
• Only one object can be sent over a single TCP connection.
• The client opens a TCP connection and sends a request.
• The server sends the response and closes the connection.
• The client reads the data until it encounters an end-of-file marker.
• It then closes the connection.
Persistent Connections
• HTTP version 1.1 specifies a persistent connection by default.
• Multiple objects can be sent over a single TCP connection.
• In a persistent connection, the server leaves the connection open for more requests after
sending a response.
• The server can close the connection at the request of a client or if a time-out has been
reached.
• Time and resources are saved using persistent connections. Only one set of buffers and
variables needs to be set for the connection at each site.
• The round trip time for connection establishment and connection termination is saved.
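The saving can be estimated with a deliberately simple model. The assumptions are not from the notes: one round trip to set up a TCP connection, one round trip per request/response, no pipelining or parallel connections, and connection termination ignored. The function name is hypothetical.

```python
def round_trips(num_objects: int, persistent: bool) -> int:
    """Estimate the round trips needed to fetch num_objects from one server."""
    if persistent:
        # One connection setup, then one request/response per object.
        return 1 + num_objects
    # Non-persistent: every object pays for its own connection setup.
    return 2 * num_objects
```

Under these assumptions, fetching 5 objects costs 10 round trips with non-persistent connections and 6 with a persistent connection.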
Http Cookies
• An HTTP cookie (also called web cookie, Internet cookie, browser cookie, or simply cookie)
is a small piece of data sent from a website and stored on the user's computer by the user's web
browser while the user is browsing.
• They can also be used to remember arbitrary pieces of information that the user previously
entered into form fields such as names, addresses, passwords, and credit card numbers.
Components of Cookie
A cookie consists of the following components:
1. Name
2. Value
3. Zero or more attributes (name/value pairs). Attributes store information such as
the cookie's expiration, domain, and flags.
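The three components can be seen by parsing a cookie string with http.cookies.SimpleCookie from the Python standard library. The helper name and the sample cookie (name sid, domain example.com) are illustrative assumptions.

```python
from http.cookies import SimpleCookie


def parse_cookie(header_value: str) -> dict:
    """Return the name, value, and non-empty attributes of the first cookie."""
    jar = SimpleCookie()
    jar.load(header_value)
    name, morsel = next(iter(jar.items()))
    # A Morsel stores attributes (domain, path, max-age, secure, ...) as a dict;
    # attributes that were not set are left as empty strings and are skipped here.
    attributes = {key: value for key, value in morsel.items() if value}
    return {"name": name, "value": morsel.value, "attributes": attributes}
```

For the string "sid=abc123; Domain=example.com; Max-Age=3600; Secure" the name is sid, the value is abc123, and the attributes hold the domain, the lifetime, and the Secure flag.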
Using Cookies
• When a client sends a request to a server, the browser looks in the cookie directory to see if
it can find a cookie sent by that server.
• If found, the cookie is included in the request.
• When the server receives the request, it knows that this is an old client, not a new one.
• The contents of the cookie are never read by the browser or disclosed to the user. It is a
cookie made by the server and eaten by the server.
Types of Cookies
1. Authentication cookies
These are the most common method used by web servers to know whether the user is logged in or
not, and which account they are logged in with. Without such a mechanism, the site would not know
whether to send a page containing sensitive information, or require the user to authenticate
themselves by logging in.
2. Tracking cookies
These are commonly used as a way to compile individuals' browsing histories.
3. Session cookie
A session cookie exists only in temporary memory while the user navigates the website. Web
browsers normally delete session cookies when the user closes the browser.
4. Persistent cookie
Instead of expiring when the web browser is closed as session cookies do, a persistent cookie expires
at a specific date or after a specific length of time. This means that, for the cookie's entire lifespan,
its information will be transmitted to the server every time the user visits the website that it belongs
to, or every time the user views a resource belonging to that website from another website.
Http Caching
HTTP caching enables the client to retrieve documents faster and reduces the load on the server.
HTTP caching is implemented at proxy servers, ISP routers, and browsers.
The server sets an expiration date (Expires header) for each page, beyond which the cached copy is no longer considered fresh.
A cached document is returned to the client only if it is still up to date, which is checked with the
server using the If-Modified-Since header.
If the cached document is out of date, the request is forwarded to the server and the response is cached
along the way.
A web page will not be cached if the no-cache directive is specified.
HTTP SECURITY
HTTP does not provide security.
However, HTTP can be run over the Secure Socket Layer (SSL), whose modern successor is Transport Layer Security (TLS).
In this case, HTTP is referred to as HTTPS.
HTTPS provides confidentiality, client and server authentication, and data integrity.
FTP OBJECTIVES
It provides the sharing of files.
It is used to encourage the use of remote computers.
It transfers the data more reliably and efficiently.
FTP MECHANISM
FTP CONNECTIONS
There are two types of connections in FTP - Control Connection and Data Connection.
The control connection remains connected during the entire interactive FTP session.
The data connection is opened and then closed for each file transfer activity. When a user starts an
FTP session, the control connection opens.
While the control connection is open, the data connection can be opened and closed multiple times
if several files are transferred.
FTP COMMUNICATION
FTP Communication is achieved through commands and responses.
FTP commands are sent from the client to the server.
FTP responses are sent from the server to the client.
FTP Commands are in the form of ASCII uppercase, which may or may not be followed by an
argument.
Some of the most common commands are:
Every FTP command generates at least one response.
A response has two parts: a three-digit number followed by text.
The numeric part defines the code; the text part defines any needed parameters or a further explanation.
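Splitting a response into its two parts can be sketched as below. The function name is hypothetical and the sample reply text is only an example; multi-line FTP replies are not handled.

```python
from typing import Tuple


def parse_ftp_response(line: str) -> Tuple[int, str]:
    """Split a single-line FTP response into its three-digit code and its text."""
    code_text, _, text = line.strip().partition(" ")
    if len(code_text) != 3 or not code_text.isdigit():
        raise ValueError(f"not a valid FTP response: {line!r}")
    return int(code_text), text
```

For example, the line "226 Transfer complete" yields the code 226 and the text "Transfer complete".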
FTP SECURITY
FTP requires a password, but the password is sent in plaintext (unencrypted). This means it can
be intercepted and used by an attacker.
The data transfer connection also transfers data in plaintext, which is insecure.
To be secure, one can add a Secure Socket Layer between the FTP application layer and the TCP
layer.
In this case FTP is called SSL-FTP.
When the sender and the receiver of an e-mail are on the same system, we need only two User
Agents (UAs) and no Message Transfer Agent (MTA).
When the sender and the receiver of an e-mail are on different systems, we need two UAs, two pairs
of MTAs (client and server), and a pair of Message Access Agents (MAAs; client and server).
WORKING OF EMAIL
When Alice needs to send a message to Bob, she runs a UA program to prepare the message
and send it to her mail server.
The mail server at her site uses a queue (spool) to store messages waiting to be sent. The message,
however, needs to be sent through the Internet from Alice’s site to Bob’s site using an MTA.
Here two message transfer agents are needed: one client and one server.
The server needs to run all the time because it does not know when a client will ask for a
connection.
The client can be triggered by the system when there is a message in the queue to be sent.
The user agent at Bob’s site allows Bob to read the received message.
Bob later uses an MAA client to retrieve the message from an MAA server running on the second
server.
GUI-based
o Modern user agents are GUI-based.
o They allow the user to interact with the software by using both the keyboard and the mouse.
o They have graphical components such as icons, menu bars, and windows that make the
services easy to access.
o Some examples of GUI-based user agents are Eudora and Outlook.
Email was extended in 1993 to carry many different types of data: audio, video, images, Word
documents, and so on.
This extended version is known as MIME (Multipurpose Internet Mail Extensions).
SMTP also allows the use of Relays allowing other MTAs to relay the mail.
SMTP MAIL FLOW
SMTP Commands
Commands are sent from the client to the server. Each command consists of a keyword followed by zero or more
arguments. SMTP defines 14 commands.
SMTP Responses
Responses are sent from the server to the client.
A response is a three digit code that may be followed by additional textual information.
SMTP OPERATIONS
Basic SMTP operation occurs in three phases:
1. Connection Setup
2. Mail Transfer
3. Connection Termination
Connection Setup
An SMTP sender will attempt to set up a TCP connection with a target host when it has
one or more mail messages to deliver to that host.
The sequence is quite simple:
1. The sender opens a TCP connection with the receiver.
2. Once the connection is established, the receiver identifies itself with "Service Ready".
3. The sender identifies itself with the HELO command.
4. The receiver accepts the sender's identification with "OK".
5. If the mail service on the destination is unavailable, the destination host returns a "Service
Not Available" reply in step 2, and the process is terminated.
Mail Transfer
Once a connection has been established, the SMTP sender may send one or more messages to the
SMTP receiver.
There are three logical phases to the transfer of a message:
1. A MAIL command identifies the originator of the message.
2. One or more RCPT commands identify the recipients for this message.
3. A DATA command transfers the message text.
Connection Termination
The SMTP sender closes the connection in two steps.
First, the sender sends a QUIT command and waits for a reply.
The second step is to initiate a TCP close operation for the TCP connection.
The receiver initiates its TCP close after sending its reply to the QUIT command.
Limitations Of Smtp
SMTP cannot transmit executable files or other binary objects.
SMTP cannot transmit text data that includes national language characters, as these are represented
by 8-bit codes with values of 128 decimal or higher, and SMTP is limited to 7-bit ASCII.
SMTP servers may reject mail message over a certain size.
SMTP gateways that translate between ASCII and the character code EBCDIC do not use a
consistent set of mappings, resulting in translation problems.
Some SMTP implementations do not adhere completely to the SMTP standards defined.
Common problems include the following:
1. Deletion, addition, or reordering of carriage return and linefeed.
2. Truncating or wrapping lines longer than 76 characters.
3. Removal of trailing white space (tab and space characters).
4. Padding of lines in a message to the same length.
5. Conversion of tab characters into multiple-space characters.
MIME HEADERS
Using headers, MIME describes the type of message content and the encoding used.
Headers defined in MIME are:
• MIME-Version - current version, i.e., 1.0
• Content-Type - message type (text/html, image/jpeg, application/pdf)
• Content-Transfer-Encoding - message encoding scheme (e.g., base64).
• Content-Id - unique identifier for the message.
• Content-Description - describes type of the message body.
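These headers can be inspected on a message built with the email package from the Python standard library. The addresses, subject, and attachment bytes below are placeholders, and build_mime_message is a hypothetical helper; the library chooses the header values itself.

```python
from email.message import EmailMessage


def build_mime_message() -> EmailMessage:
    """Build a small message with a text body and a binary attachment."""
    msg = EmailMessage()
    msg["From"] = "alice@example.com"  # placeholder addresses
    msg["To"] = "bob@example.com"
    msg["Subject"] = "Report"
    msg.set_content("Plain-text body.")  # adds the MIME-Version and Content-Type headers
    # Binary data cannot travel as 7-bit text, so the library encodes it in base64.
    msg.add_attachment(b"\x89PNG\r\n", maintype="image", subtype="png", filename="logo.png")
    return msg
```

Printing str(build_mime_message()) shows the MIME-Version, Content-Type, and Content-Transfer-Encoding header lines in place.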
The MTA is a mail daemon (sendmail) active on hosts that have mailboxes; it is used to send e-mail.
Mail passes through a sequence of gateways before it reaches the recipient mail server.
Each gateway stores and forwards the mail using the Simple Mail Transfer Protocol (SMTP).
SMTP defines communication between MTAs over TCP on port 25.
In an SMTP session, the sending MTA is the client and the receiving MTA is the server. In each exchange:
The client posts a command (HELO, MAIL, RCPT, DATA, QUIT, VRFY, etc.).
The server responds with a code (250, 550, 354, 221, 251, etc.) and an explanation.
The client is identified using the HELO command and verified by the server.
The client forwards the message to the server, if the server is willing to accept it.
The message is terminated by a line containing only a single period (.).
Eventually the client terminates the connection.
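The command/response exchange can be simulated without a network. The sketch below is a toy receiver, not a real SMTP server: smtp_receiver is a hypothetical function, the reply texts are abbreviated, command order is not enforced, and the addresses are placeholders. The reply codes 220, 250, 354, 221, and 500 are standard SMTP codes.

```python
def smtp_receiver(client_lines):
    """Return the replies a toy SMTP receiver gives, plus the message lines it stored."""
    replies = ["220 Service Ready"]  # greeting sent once the TCP connection is open
    message_lines = []
    in_data = False
    for line in client_lines:
        if in_data:
            if line == ".":  # a line with only a period ends the message
                in_data = False
                replies.append("250 OK")
            else:
                message_lines.append(line)
            continue
        keyword = line.split(" ", 1)[0].upper()
        if keyword in ("HELO", "MAIL", "RCPT"):
            replies.append("250 OK")
        elif keyword == "DATA":
            in_data = True
            replies.append("354 Start mail input")
        elif keyword == "QUIT":
            replies.append("221 Closing connection")
            break
        else:
            replies.append("500 Command unrecognized")
    return replies, message_lines
```

Feeding it HELO, MAIL, RCPT, DATA, the message text, a lone period, and QUIT reproduces the sequence of codes 220, 250, 250, 250, 354, 250, 221.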
OPERATION OF IMAP
The mail transfer begins with the client authenticating the user and identifying the mailbox they
want to access.
Client Commands
LOGIN, AUTHENTICATE, SELECT, EXAMINE, CLOSE, and LOGOUT
Server Responses
OK, NO (no permission), BAD (incorrect command)
When the user wishes to FETCH a message, the server responds in MIME format.
Message attributes such as size are also exchanged.
Flags are used by the client to report user actions: SEEN, ANSWERED, DELETED, RECENT.
IMAP4
The latest version is IMAP4. IMAP4 is more powerful and more complex.
IMAP4 provides the following extra functions:
Advantages Of IMAP
With IMAP, the primary storage is on the server, not on the local machine.
E-mail being put away for storage can be filed in folders on the local disk or on the IMAP
server.
The protocol allows full use of remote folders, including a remote folder hierarchy and multiple
inboxes.
It keeps track of explicit status of messages, and allows for user-defined status.
Supports new mail notification explicitly.
Extensible for non-email data, like netnews, document storage, etc.
Selective fetching of individual MIME body parts.
Server-based search to minimize data transfer.
Servers may have extensions that can be negotiated.
1.8.4.4 POST OFFICE PROTOCOL (POP3)
Post Office Protocol (POP3) is an application-layer Internet standard protocol used by local e-mail
clients to retrieve e-mail from a remote server over a TCP/IP connection.
There are two versions of POP.
• The first, called POP2, became a standard in the mid-80's and requires SMTP to send
messages.
• The current version, POP3, can be used with or without SMTP. POP3 uses TCP/IP port 110.
POP is a much simpler protocol, making implementation easier.
POP supports offline access to the messages and thus requires less Internet usage time.
POP does not provide a search facility.
In order to access the messages, it is necessary to download them.
It allows only one mailbox to be created on server.
It is not suitable for accessing non mail data.
POP mail moves the message from the email server onto the local computer, although there is
usually an option to leave the messages on the email server as well.
POP treats the mailbox as one store, and has no concept of folders.
POP works in two modes namely, delete and keep mode.
• In delete mode, mail is deleted from the mailbox after retrieval. The delete mode is normally
used when the user is working at their permanent computer and can save and organize the
received mail after reading or replying.
• In keep mode, mail is kept in the mailbox after reading for later retrieval. The keep mode is
normally used when the user accesses their mail away from their primary computer.
POP3 client is installed on the recipient computer and POP server on the mail server.
Client opens a connection to the server using TCP on port 110.
Client sends username and password to access mailbox and to retrieve messages.
POP3 Commands
POP commands are generally abbreviated into codes of three or four letters. The
following describes some of the POP commands:
1. USER and PASS - These commands identify the user and open the mailbox
2. STAT - It is used to display the number of messages currently in the mailbox
3. LIST - It is used to get a summary of the messages (message numbers and sizes)
4. RETR - It is used to retrieve a message by its message number
5. DELE - It is used to delete a message
6. RSET - It is used to reset the session to its initial state
7. QUIT - It is used to log off the session
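The effect of these commands on a mailbox can be modelled in memory. The class below is a hypothetical toy, not a POP3 implementation: it follows the POP3 standard's meaning of each command (RETR returns a message by number, DELE only marks a message, and marked messages are removed when QUIT ends the session), and it omits login and networking.

```python
class ToyPopMailbox:
    """In-memory model of a POP3 mailbox; message numbers start at 1."""

    def __init__(self, messages):
        self.messages = list(messages)
        self.deleted = set()  # numbers of messages marked for deletion

    def _live(self):
        return [(i, m) for i, m in enumerate(self.messages, 1) if i not in self.deleted]

    def stat(self):
        """STAT: number of messages and their total size."""
        live = self._live()
        return len(live), sum(len(m) for _, m in live)

    def list_messages(self):
        """LIST: (message number, size) for every message not marked deleted."""
        return [(i, len(m)) for i, m in self._live()]

    def retr(self, number):
        """RETR: return the text of one message."""
        if number in self.deleted or not 1 <= number <= len(self.messages):
            raise KeyError(f"no such message: {number}")
        return self.messages[number - 1]

    def dele(self, number):
        """DELE: mark a message for deletion."""
        self.retr(number)  # raises KeyError if the number is invalid
        self.deleted.add(number)

    def rset(self):
        """RSET: unmark all messages, restoring the initial state."""
        self.deleted.clear()

    def quit(self):
        """QUIT: remove the marked messages and report how many remain."""
        self.messages = [m for _, m in self._live()]
        self.deleted.clear()
        return len(self.messages)
```

In delete mode a client would call retr and then dele for each message before quit; in keep mode it would call retr only.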
WORKING OF DNS
The following six steps show the working of DNS, which maps a host name to an IP address:
1. The user passes the host name to the file transfer client.
2. The file transfer client passes the host name to the DNS client.
3. Each computer, after being booted, knows the address of one DNS server. The DNS client
sends a message to a DNS server with a query that gives the file transfer server name using
the known IP address of the DNS server.
4. The DNS server responds with the IP address of the desired file transfer server.
5. The DNS client passes the IP address to the file transfer client.
6. The file transfer client now uses the received IP address to access the file transfer server.
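The six steps can be traced with a toy model in which a dictionary stands in for the DNS server's data. Everything here is hypothetical: the host name ftp.example.com, the address 192.0.2.21, and the function names are placeholders, and no real DNS messages are exchanged. Port 21 is the standard FTP control port.

```python
# Hypothetical name-to-address table held by the DNS server.
HYPOTHETICAL_DNS_TABLE = {"ftp.example.com": "192.0.2.21"}


def dns_server(query_name: str):
    """Step 4: the DNS server answers with the IP address, or None if unknown."""
    return HYPOTHETICAL_DNS_TABLE.get(query_name)


def dns_client(host_name: str) -> str:
    """Steps 3 and 5: query the known DNS server and hand the answer back."""
    address = dns_server(host_name)
    if address is None:
        raise LookupError(f"no address found for {host_name}")
    return address


def file_transfer_client(host_name: str):
    """Steps 1, 2, and 6: resolve the name, then return the server socket address."""
    address = dns_client(host_name)
    return address, 21
```

Calling file_transfer_client("ftp.example.com") returns the socket address the file transfer client would connect to.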
NAME SPACE
To be unambiguous, the names assigned to machines must be carefully selected from a name space
with complete control over the binding between the names and IP addresses.
The names must be unique because the addresses are unique.
A name space that maps each address to a unique name can be organized in two ways: flat or
hierarchical.
Each node in the tree has a label, which is a string with a maximum of 63 characters.
The root label is a null string (empty string). DNS requires that children of a node (nodes that
branch from the same node) have different labels, which guarantees the uniqueness of the domain
names.
Domain Name
• Each node in the tree has a domain name.
• A full domain name is a sequence of labels separated by dots (.)
• The domain names are always read from the node up to the root.
• The last label is the label of the root (null).
• This means that a full domain name always ends in a null label, which means the last
character is a dot because the null string is nothing.
• If a domain name is terminated by a null string (that is, it ends with a dot), it is called a fully
qualified domain name (FQDN).
• If a domain name is not terminated by a null string, it is called a partially qualified domain name
(PQDN).
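The rules above (labels separated by dots, at most 63 characters per label, a trailing dot for a fully qualified name) can be checked with a short sketch. The function name and the sample names are illustrative assumptions.

```python
from typing import List, Tuple


def split_domain_name(domain_name: str) -> Tuple[List[str], bool]:
    """Return the labels of a domain name and whether the name is fully qualified."""
    fully_qualified = domain_name.endswith(".")  # the null root label leaves a final dot
    name = domain_name[:-1] if fully_qualified else domain_name
    labels = name.split(".")
    for label in labels:
        if not 1 <= len(label) <= 63:
            raise ValueError(f"label must have 1 to 63 characters: {label!r}")
    return labels, fully_qualified
```

The labels are returned in the order they are written, from the node up toward the root.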
Domain
• A domain is a subtree of the domain name space.
• The name of the domain is the domain name of the node at the top of the subtree.
• A domain may itself be divided into domains.
ZONE
What a server is responsible for, or has authority over, is called a zone.
The server makes a database called a zone file and keeps all the information for every node under
that domain.
If a server accepts responsibility for a domain and does not divide the domains into smaller
domains, the domain and zone refer to the same thing.
But if a server divides its domain into sub domains and delegates parts of its authority to other
servers, domain and zone refer to different things.
The information about the nodes in the sub domains is stored in the servers at the lower levels,
with the original server keeping some sort of references to these lower level servers.
But still, the original server does not free itself from responsibility totally.
It still has a zone, but the detailed information is kept by the lower level servers.
ROOT SERVER
A root server is a server whose zone consists of the whole tree.
A root server usually does not store any information about domains but delegates its authority to
other servers, keeping references to those servers.
Currently there are more than 13 root servers, each covering the whole domain name space.
The servers are distributed all around the world.
Country Domains
The country domains section follows the same format as the generic domains but uses two-character
country abbreviations (e.g., in for India, us for the United States) in place of the
three-character organizational abbreviations at the first level.
Second-level labels can be organizational, or they can be more specific national designations.
The United States, for example, uses state abbreviations as a subdivision of the country domain us
(e.g., ca.us.).
Inverse Domains
The inverse domain is used to map an address to a name.
The client can send an IP address to a server to be mapped to a domain name; this is called a
PTR (pointer) query.
To answer queries of this kind, DNS uses the inverse domain.
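As a sketch of how the name in a PTR query is formed for an IPv4 address: the octets are reversed and the suffix in-addr.arpa is appended. That suffix comes from the DNS standard rather than from these notes, and the function name ptr_name is hypothetical. Python's standard ipaddress module exposes the same result through its reverse_pointer attribute, which serves here as a cross-check.

```python
import ipaddress


def ptr_name(ipv4):
    """Build the inverse-domain name used in a PTR query for an IPv4 address."""
    octets = ipv4.split(".")
    # Labels are read from the node up to the root, so the octets are reversed.
    return ".".join(reversed(octets)) + ".in-addr.arpa."


name = ptr_name("200.200.200.5")
# name is '5.200.200.200.in-addr.arpa.'

# Cross-check against the standard library (it omits the trailing dot).
same = ipaddress.ip_address("200.200.200.5").reverse_pointer + "." == name
```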
DNS RESOLUTION
Mapping a name to an address or an address to a name is called name address resolution.
DNS is designed as a client server application.
A host that needs to map an address to a name or a name to an address calls a DNS client called a
resolver.
The resolver accesses the closest DNS server with a mapping request.
If the server has the information, it satisfies the resolver; otherwise, it either refers the resolver to
other servers or asks other servers to provide the information.
After the resolver receives the mapping, it interprets the response to see if it is a real resolution or
an error and finally delivers the result to the process that requested it.
A resolution can be either recursive or iterative.
Recursive Resolution
• The application program on the source host calls the DNS resolver (client) to find the IP
address of the destination host. The resolver, which does not know this address, sends the
query to the local DNS server of the source (Event 1).
• The local server sends the query to a root DNS server (Event 2).
• The root server sends the query to the top-level DNS server (Event 3).
• The top-level DNS server knows only the IP address of the local DNS server at the
destination, so it forwards the query to that local server, which knows the IP address of the
destination host (Event 4).
• The IP address of the destination host is then sent back to the top-level DNS server (Event 5),
then back to the root server (Event 6), then back to the source DNS server, which may cache it
for future queries (Event 7), and finally back to the source host (Event 8).
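The chain of events above can be mimicked with a toy simulation. This is only a sketch under stated assumptions: the ToyServer class, the server names, and the single record are hypothetical, and no real network traffic is involved. Each server that lacks the answer forwards the query to the next server and relays the reply back, and only the source local server caches it.

```python
class ToyServer:
    """A pretend DNS server that answers, or forwards the query onward."""

    def __init__(self, name, records=None, next_server=None, caching=False):
        self.name = name
        self.records = dict(records or {})
        self.next_server = next_server
        self.cache = {} if caching else None

    def resolve(self, host, trace):
        trace.append(self.name)  # record which servers handled the query
        if host in self.records:
            return self.records[host]
        if self.cache is not None and host in self.cache:
            return self.cache[host]
        if self.next_server is None:
            raise LookupError("no mapping for " + host)
        # Recursive step: ask the next server and wait for its answer.
        address = self.next_server.resolve(host, trace)
        if self.cache is not None:
            self.cache[host] = address  # kept for future queries
        return address


def build_chain():
    """Source local -> root -> top-level -> destination local."""
    destination_local = ToyServer(
        "destination local", {"ws.wonderful.com.": "200.200.200.5"})
    top_level = ToyServer("top-level", next_server=destination_local)
    root = ToyServer("root", next_server=top_level)
    return ToyServer("source local", next_server=root, caching=True)
```

Resolving the same name twice through the server returned by build_chain() touches all four servers the first time and only the source local server the second time, which is the effect of the cache mentioned in Event 7.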
Iterative Resolution
• In iterative resolution, each server that does not know the mapping sends the IP address of
the next server back to the one that requested it.
• The iterative resolution takes place between two local servers.
• The original resolver gets the final answer from the source local server.
• The messages shown by Events 2, 4, and 6 contain the same query.
• However, the message shown by Event 3 contains the IP address of the top-level domain
server.
• The message shown by Event 5 contains the IP address of the destination local DNS server.
• The message shown by Event 7 contains the IP address of the destination.
• When the source local DNS server receives the IP address of the destination, it sends it to
the resolver (Event 8).
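For comparison with the recursive case, here is a toy sketch of iterative resolution. The table SERVERS and the functions ask and iterative_resolve are hypothetical; server names stand in for the IP addresses that real referrals carry. The loop plays the role of the source local DNS server, which repeats the same query to each server it is referred to.

```python
# Each pretend server either holds answers or refers the asker to another server.
SERVERS = {
    "root": {"referral": "top-level"},
    "top-level": {"referral": "destination local"},
    "destination local": {"answers": {"ws.wonderful.com.": "200.200.200.5"}},
}


def ask(server, host):
    """Send one query to one server and return (kind, value)."""
    data = SERVERS[server]
    if host in data.get("answers", {}):
        return ("answer", data["answers"][host])
    if "referral" in data:
        return ("referral", data["referral"])
    return ("error", None)


def iterative_resolve(host, start="root"):
    """Follow referrals until some server returns the mapping."""
    server = start
    contacted = []
    while True:
        contacted.append(server)
        kind, value = ask(server, host)  # the same query is sent every time
        if kind == "answer":
            return value, contacted
        if kind == "error":
            raise LookupError("no mapping for " + host)
        server = value  # referral: the next server to ask
```

Unlike the recursive case, the root and top-level servers never relay the answer here; they only hand back a referral.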
DNS CACHING
Each time a server receives a query for a name that is not in its domain, it needs to search its
database for a server IP address.
Reducing this search time increases efficiency, and DNS does so with a mechanism called caching.
When a server asks for a mapping from another server and receives the response, it stores this
information in its cache memory before sending it to the client.
If the same or another client asks for the same mapping, the server can check its cache memory
and answer the query.
However, to inform the client that the response is coming from the cache memory and not from an
authoritative source, the server marks the response as unauthoritative.
Caching speeds up resolution, but it can also be problematic.
If a server caches a mapping for a long time, it may send an outdated mapping to the client.
To counter this, two techniques are used.
First, the authoritative server always adds a piece of information called time to live (TTL) to
the mapping. It defines the time in seconds that the receiving server can cache the information.
After that time, the mapping is invalid and any query must be sent again to the authoritative server.
Second, DNS requires that each server keep a TTL counter for each mapping it caches. The cache
memory must be searched periodically and those mappings with an expired TTL must be purged.
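The two techniques can be sketched as a small cache keyed by name. This is an illustrative sketch, not how any particular DNS server implements it: the class name DnsCache and its methods are assumptions, and the clock is injectable so the expiry logic can be exercised without waiting.

```python
import time


class DnsCache:
    """Cache of name -> address mappings, each with its own TTL."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # name -> (address, expiry time)

    def put(self, name, address, ttl_seconds):
        """Store a mapping together with the time at which it expires."""
        self._entries[name] = (address, self._clock() + ttl_seconds)

    def get(self, name):
        """Return the cached address, or None if absent or expired."""
        entry = self._entries.get(name)
        if entry is None:
            return None
        address, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[name]  # stale: force a fresh query
            return None
        return address

    def purge_expired(self):
        """Periodic sweep: drop every mapping whose TTL has run out."""
        now = self._clock()
        expired = [n for n, (_, exp) in self._entries.items() if now >= exp]
        for name in expired:
            del self._entries[name]
        return len(expired)
```

A server would call put() when a response arrives, get() before forwarding a query, and purge_expired() on a timer.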
DNS MESSAGES
DNS has two types of messages: query and response.
Both types have the same format.
The query message consists of a header and question section.
The response message consists of a header, question section, answer section, authoritative
section, and additional section.
Header
• Both query and response messages have the same header format, with some fields set to zero
for the query messages.
• The header fields are as follows:
• The identification field is used by the client to match the response with the query.
• The flags field defines whether the message is a query or a response. It also includes the
error status.
• The next four fields in the header define the number of each record type in the message.
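As a sketch of the header layout: the six fields listed above are each 16 bits wide, giving a 12-byte header. The field widths and the flag bit positions used below (0x8000 for query/response, 0x0100 for "recursion desired", low four bits for the error code) come from the DNS standard, RFC 1035, not from these notes, and the function names are hypothetical.

```python
import struct

# Six unsigned 16-bit fields in network byte order:
# identification, flags, and the four record counts.
HEADER = struct.Struct("!6H")


def build_query_header(ident, recursion_desired=True, questions=1):
    """Pack a query header; the three non-question counts are zero."""
    flags = 0x0100 if recursion_desired else 0x0000
    return HEADER.pack(ident, flags, questions, 0, 0, 0)


def parse_header(data):
    """Unpack the first 12 bytes of a DNS message into named fields."""
    ident, flags, qd, an, ns, ar = HEADER.unpack(data[:12])
    return {
        "id": ident,
        "is_response": bool(flags & 0x8000),
        "error_status": flags & 0x000F,
        "questions": qd,
        "answers": an,
        "authoritative": ns,
        "additional": ar,
    }
```

A client would compare the "id" of a parsed response with the identification it put in the query to match the two.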
Question Section
• The question section consists of one or more question records. It is present in both query and
response messages.
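A question record can be sketched as the queried name followed by a type and a class. The wire encoding shown here (each label preceded by its length, a zero byte for the null root label, then two 16-bit fields) follows RFC 1035 rather than these notes. The function names are hypothetical, and the defaults of 1 and 1 stand for an address query in the Internet class.

```python
import struct


def encode_name(name):
    """Encode a domain name as length-prefixed labels ending in the null label."""
    out = bytearray()
    stripped = name.rstrip(".")
    for label in (stripped.split(".") if stripped else []):
        raw = label.encode("ascii")
        if not 0 < len(raw) <= 63:
            raise ValueError("label must be 1 to 63 bytes: %r" % label)
        out.append(len(raw))
        out += raw
    out.append(0)  # null label of the root
    return bytes(out)


def encode_question(name, qtype=1, qclass=1):
    """Build one question record: name, query type, query class."""
    return encode_name(name) + struct.pack("!HH", qtype, qclass)
```

Appending such a record to a 12-byte header with a question count of 1 gives a complete query message.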
Answer Section
• The answer section consists of one or more resource records. It is present only in response
messages.
Authoritative Section
• The authoritative section gives information (domain name) about one or more authoritative
servers for the query.
Additional Information Section
• The additional information section provides additional information that may help the
resolver.
DNS CONNECTIONS
DNS can use either UDP or TCP.
In both cases the well-known port used by the server is port 53.
UDP is used when the size of the response message does not exceed 512 bytes, because DNS limits
a message carried over UDP to 512 bytes.
If the size of the response message is more than 512 bytes, a TCP connection is used.
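The size rule can be written as a one-line decision. In this sketch the function names are hypothetical, and a message of exactly 512 bytes is treated as fitting in UDP, which follows RFC 1035. The two-byte length prefix used when a message travels over TCP also comes from that standard, not from these notes.

```python
import struct

DNS_PORT = 53               # well-known server port for both transports
MAX_UDP_DNS_MESSAGE = 512   # classic limit for a DNS message over UDP


def choose_transport(message_length):
    """Pick the transport for a response of the given size in bytes."""
    return "UDP" if message_length <= MAX_UDP_DNS_MESSAGE else "TCP"


def frame_for_tcp(message):
    """Over TCP, each DNS message is preceded by its length as 16 bits."""
    return struct.pack("!H", len(message)) + message
```

The length prefix is needed because TCP is a byte stream with no message boundaries, whereas each UDP datagram carries exactly one message.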
DNS REGISTRARS
New domains are added to DNS through a registrar. A fee is charged.
A registrar first verifies that the requested domain name is unique and then enters it into the DNS
database.
Today, there are many registrars; their names and addresses can be found at http://www.internic.net
To register, the organization needs to give the name of its server and the IP address of the server.
For example, a new commercial organization named wonderful with a server named ws and IP address
200.200.200.5 needs to give the following information to one of the registrars:
Domain name: ws.wonderful.com
IP address: 200.200.200.5
DNS SECURITY
To protect DNS, the IETF has devised a technology named DNS Security (DNSSEC) that provides
message origin authentication and message integrity using digital signatures.
DNSSEC, however, does not provide confidentiality for the DNS messages.
There is no specific protection against denial-of-service attacks in the DNSSEC specification.
However, the caching system protects the upper-level servers against this attack to some
extent.