1467004418ComputerNetworks Mod6DNS Q1 Etext
Quadrant 1 – e-text
In the earlier modules, we discussed a number of common network applications and their
associated protocols. In this module, we examine a very important application protocol
that other applications depend on: the Domain Name System (DNS) protocol. Every
application has to tell the network the IP address of the node that it needs to connect to.
Usually the application has only a name (for example, the host name inside a URL) and
does not know the corresponding IP address. It makes use of a special application, DNS,
which provides the mapping from the name to the IP address. It is this application and the
associated protocol that we discuss below.
The objectives for this module are as follows.
Learning Objectives
To understand the details of the Domain Name System (DNS) and the protocol including
• The domain name-space
• The hierarchy of DNS servers
• DNS look-up
• The DNS resource-records
• DNS message formats
6.1 Introduction
Just as we humans have different identities for different purposes and for convenience, so
too, the hosts and devices in the network have multiple identities. While IP addresses are
used in the network for the hosts and routers to send data to each other, when human
beings have to access them, they need an easy to remember alphabetic or alphanumeric
name. These are referred to as domain names.
6.1.1 Domain names
Domain names are alphanumeric names for IP addresses e.g., www.google.com,
epgp.inflibnet.ac.in, www.wikipedia.org, and so on. The reason for having such names to
refer to systems is obvious – it is easy for us human beings to remember names rather
than large numbers (as in IP addresses). So we prefer to interact with the network nodes
using names and leave it to the network to figure out the corresponding IP address. The
network's answer to this challenge is the Domain Name System (DNS).
6.1.2 Domain name system
The domain name system (DNS) is an Internet-wide distributed database that translates
between domain names and IP addresses. This task is also referred to as name
resolution. The distributed database is implemented in a hierarchy of many name servers,
which have the mapping of names and addresses. It also includes an application-layer
protocol that hosts, routers, and name servers use to communicate and resolve names, i.e.,
to do this address/name translation.
It is intriguing to note that a core Internet function is implemented as an application-layer
protocol. The reason for this decision is that this is a complex task and it is easier to
handle this complexity at the network's “edge” rather than at the core.
Thus any application that needs to use the network typically first contacts the DNS to
get the name resolved to a network (IP) address. On getting the IP address, it can proceed
with its work. This tells us the importance of DNS.
The importance of DNS also becomes obvious in practice when the local DNS server is
down. Most of us have faced this situation more than once: we cannot get through to the
system we want to connect to by its name!
A little bit of history …
How did things work before there was DNS?
Before DNS (until 1985), name-to-IP address resolution was done by downloading a
single file, HOSTS.TXT, from a central server using FTP. This worked because the number
of entries was limited. The names in the hosts.txt file are not structured.
A hosts file still works on most operating systems! It can be used to define local
names.
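The flat, unstructured nature of a hosts file is easy to see in code. The following is a
minimal Python sketch, assuming the common "address name [alias ...]" line layout with
"#" comments. The sample entries and the function name parse_hosts are made up for
illustration and are not taken from any real system.

```python
def parse_hosts(text):
    """Build a flat name -> address table from hosts-file style text."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and surrounding spaces
        if not line:
            continue  # skip blank lines
        address, *names = line.split()
        for name in names:
            table[name.lower()] = address  # host names are case-insensitive
    return table


# Illustrative entries only (assumed, not from a real hosts file).
SAMPLE = """
# address      name         alias
127.0.0.1      localhost
202.1.2.3      www.xyz.com  web
"""

hosts = parse_hosts(SAMPLE)
print(hosts["www.xyz.com"])
```

Every name is a key in one flat table, so the file has to list every host explicitly.
This is exactly what stops scaling once the number of hosts grows.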
Coming back to the current DNS, it is a distributed set of servers that hold the
name-to-IP-address mapping. The question that comes up is: why distributed? Why not a
centralized DNS server? The answer is simple; just remember the usual debate between
centralized and distributed systems. A centralized system has a single point of failure:
if the central server fails, all systems get stuck. Also, in this case, the amount of traffic
that the centralized server has to handle is enormous, and can overwhelm the server.
Maintenance of information in the server can also become complex. The distance to the
centralized server can also be a cause of concern. Basically, it doesn't scale!
For all these reasons, and more, the DNS has been designed to be a distributed database
spread out over a vast number of servers. Let us look at how this is done.
6.2 Design principle of DNS
The domain names are arranged in a hierarchical and logical tree structure called the
domain namespace. Fig. 6.1 shows the distribution of domain name servers starting at
the root. Below the root, we have the top-level-domain (TLD) servers, catering to
different domains. These domains are divided based on the purpose for which they are
used, as in edu (for education), org (for organization), com (for commercial), net (for
network-related), mil (military), gov (government), or based on countries, as in in (for
India), au (for Australia), uk (for the United Kingdom), and so on.
Under these domains, sub-categories are assigned as and when required. The names at
lower layers of the hierarchy can be assigned without regard to the location on a link layer
network, IP network or autonomous system. However, in practice, allocation of the domain
names generally follows the allocation of IP addresses. For example, all hosts with
network prefix 178.143/16 have the same domain name suffix, say e-IndianUniv.edu; all
hosts on network 178.143.136/24 are in the Computer Science Department of the
University, and so on.
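The prefix example can be checked with a short Python sketch using the standard
ipaddress module. The prefixes are the ones from the text, written out in full
(178.143.0.0/16 and 178.143.136.0/24); the individual host addresses and the function
name describe are assumptions chosen for illustration.

```python
import ipaddress

# Prefixes from the example above, written in full dotted form.
university = ipaddress.ip_network("178.143.0.0/16")
cs_department = ipaddress.ip_network("178.143.136.0/24")


def describe(host):
    """Report which of the two example networks a host address falls in."""
    address = ipaddress.ip_address(host)
    return {
        "in_university": address in university,
        "in_cs_department": address in cs_department,
    }


# Hypothetical host addresses.
print(describe("178.143.136.7"))  # inside both prefixes
print(describe("178.143.5.1"))    # inside the /16 only
```

The /24 is contained in the /16, which mirrors how a department's names sit under the
university's domain name suffix.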
The full domain name can be traced by starting at its location in the hierarchy, and going
up in the tree to the root. The servers responsible for the hosts under a particular domain
are called the authoritative DNS servers. For instance, in Fig. 6.1, the server shown as
bk.edu would be the authoritative server for the domain bk.edu, and have the IP addresses
for machines in that domain, say cs.bk.edu. In that sense, to locate an authoritative server,
we just need to start at the root, and keep traversing down the tree in accordance with the
domain name parsed from right to left. For example, to get the IP address for cs.bk.edu,
we start at the root, then come down to the .edu server, and then to the bk.edu server
which would provide the IP address for the cs.bk.edu server.
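The right-to-left traversal described above can be sketched in a few lines of Python.
This only computes the sequence of zones that would be visited; it does not contact any
server. The function name zones_from_root is an assumption made for illustration.

```python
def zones_from_root(domain_name):
    """List the zones visited when walking from the root down to the name."""
    labels = domain_name.rstrip(".").split(".")
    path = ["."]  # "." stands for the root of the tree
    # Take the labels from right to left, one more label at each step.
    for i in range(len(labels) - 1, -1, -1):
        path.append(".".join(labels[i:]))
    return path


print(zones_from_root("cs.bk.edu"))
# ['.', 'edu', 'bk.edu', 'cs.bk.edu']
```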
6.2.1 DNS Servers
As we can see from the above discussion, the information required for name-address
resolution is held in a number of distributed servers: root servers, top-level domain
servers, and authoritative servers.
Although, theoretically speaking, there is one root server, practically there are many root
servers with the same information. This is done to avoid overloading a single server, and
to avoid single-point failures.
There are more than 250 top-level domains. There are three types of top-level
domains:
(i) Organizational: A 3-character code indicates the function of the organization.
This is primarily used within the USA. Examples of this category are gov, mil,
edu, org, com, and net.
(ii) Geographical: A 2-character country or region code. Examples include in, va,
jp, de, uk, etc.
(iii) Reverse domains: This is a special domain (in-addr.arpa) used for IP
address-to-name mapping, that is, the reverse mapping. Given the IP address,
we can get the corresponding domain name from these servers.
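To make the reverse domain concrete, the sketch below builds the in-addr.arpa name that
would be queried for an IPv4 address: the four octets are written in reverse order and
the suffix in-addr.arpa is appended. The function name reverse_name is an assumption;
the result is cross-checked against the reverse_pointer attribute of Python's standard
ipaddress module.

```python
import ipaddress


def reverse_name(ipv4):
    """Return the in-addr.arpa name used for a reverse lookup of an IPv4 address."""
    octets = ipv4.split(".")
    return ".".join(reversed(octets)) + ".in-addr.arpa"


print(reverse_name("202.1.2.3"))
# 3.2.1.202.in-addr.arpa

# Cross-check with the standard library.
assert reverse_name("202.1.2.3") == ipaddress.ip_address("202.1.2.3").reverse_pointer
```

The octets are reversed so that the most significant part of the address ends up nearest
the root, which matches the way domain names are read from right to left.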
Root and top-level domains are administered by the Internet's central name registration
authority, ICANN (Internet Corporation for Assigned Names and Numbers).
Authoritative DNS servers are an organization's DNS servers, which provide authoritative
hostname-to-IP mappings for the organization's servers (e.g., Web and mail). These can
be maintained by the organization or the service provider.
In addition to these, there is the local DNS server. Strictly speaking, it is not part of the
hierarchy, but each ISP (residential ISP, company, university) has a server of this kind. It
is also called the “default name server”. When a host makes a DNS query, the query is
first sent to its local DNS server. This server acts as a proxy, and forwards the query into
the hierarchy. On receiving the answer (mapping) to the query, it caches the mapping, so
that subsequent queries asking for the same information can be served from this server
itself. It is for this reason that it is called the “local DNS server”.
In an iterative query, when the local name server of a host cannot resolve a query, it
issues a query to the root server. If the root server cannot answer the query, it sends the
local server a referral to another server that may have the answer. The local server
now contacts this server. If it has the mapping, it gives the answer; otherwise, it gives a
referral to another server (typically the next in the hierarchy). The local server contacts
this server, and the process continues until the authoritative server is contacted, which
gives the authoritative answer. An example of this is shown in Fig. 6.3.
Figure 6.3 Iterative querying
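The iterative process can be simulated with a toy model in Python. This is only a sketch
under stated assumptions: the three "servers" are in-memory dictionaries, the server
labels (root, edu-tld, bk-auth) and the address 192.0.2.10 are invented for illustration,
and no real DNS messages are exchanged.

```python
# Toy servers: each one either knows an answer or refers to another server.
SERVERS = {
    "root": {"referrals": {"edu": "edu-tld"}, "answers": {}},
    "edu-tld": {"referrals": {"bk.edu": "bk-auth"}, "answers": {}},
    "bk-auth": {"referrals": {}, "answers": {"cs.bk.edu": "192.0.2.10"}},
}


def ask(server, name):
    """One query to one server: ("answer", ip), ("referral", server) or ("unknown", None)."""
    data = SERVERS[server]
    if name in data["answers"]:
        return "answer", data["answers"][name]
    for suffix, next_server in data["referrals"].items():
        if name == suffix or name.endswith("." + suffix):
            return "referral", next_server
    return "unknown", None


def resolve_iteratively(name, max_steps=10):
    """Local server's loop: start at the root and follow referrals itself."""
    server, contacted = "root", []
    for _ in range(max_steps):
        contacted.append(server)
        kind, value = ask(server, name)
        if kind == "answer":
            return value, contacted
        if kind != "referral":
            break
        server = value  # contact the referred server next
    return None, contacted


print(resolve_iteratively("cs.bk.edu"))
```

Note that every query is sent by the local server itself; the contacted servers only
answer or refer, and never forward the query on the local server's behalf.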
When recursive querying is used, there is a lot of load on the contacted name server. It
has to keep track of the request, and forward the answer back. This could lead to high
overheads, especially at the top of the hierarchy (root and TLDs). On the other hand, in
iterative querying the responsibility is more on the local server. But, the local server
collects more information about the hierarchy (at every step of the iterative process), which
it can use to successfully answer subsequent queries. We can also use a combination of
the iterative and recursive processes and use a hybrid process. This is possible because
the choice of the type of query is determined by a bit in the DNS query, in the DNS
message. Hence, it is possible to alternate between the two modes of querying.
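The bit in question is the "recursion desired" (RD) flag in the 12-byte header of a DNS
message, as laid out in RFC 1035. The sketch below packs only the header, not a complete
query; the function names build_header and wants_recursion and the query ID 0x1234 are
assumptions made for illustration.

```python
import struct

RD_FLAG = 0x0100  # "recursion desired" bit within the 16-bit flags field


def build_header(query_id, recursion_desired):
    """Pack a DNS query header: ID, flags, then the four section counts."""
    flags = RD_FLAG if recursion_desired else 0
    # QDCOUNT=1 (one question); ANCOUNT, NSCOUNT and ARCOUNT are 0 in a query.
    return struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)


def wants_recursion(header):
    """Read the RD bit back out of a packed header."""
    _, flags = struct.unpack("!HH", header[:4])
    return bool(flags & RD_FLAG)


recursive_query = build_header(0x1234, recursion_desired=True)
iterative_query = build_header(0x1234, recursion_desired=False)
print(recursive_query.hex(), wants_recursion(recursive_query))
print(iterative_query.hex(), wants_recursion(iterative_query))
```

Since the flag is set per message, a resolver can ask one server recursively and another
iteratively, which is what makes the hybrid process possible.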
6.3.1 DNS Caching
To reduce the DNS traffic, name servers cache information on domain name/IP address
mappings. When an entry for a query is in the cache, the server does not contact other
servers. If an entry is sent from a cache, the reply from the server is marked as
“non-authoritative”. The addresses of TLD servers are typically cached in local name
servers. Thus root name servers are not often visited. This reduces the load on the top of
the tree.
Cache entries time out (disappear) after some time. The time duration for which the
mapping is valid is specified in a “TTL – time to live” field in the DNS record. This
information is used to identify the expiry time and drop entries from the cache.
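A TTL-based cache can be sketched as follows. This is a minimal illustration, not how any
particular name server implements its cache: the class name TtlCache is hypothetical, and
the clock is passed in as a parameter so that expiry can be demonstrated without waiting.

```python
import time


class TtlCache:
    """Name -> address cache whose entries expire after their TTL (in seconds)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # name -> (address, expiry time)

    def put(self, name, address, ttl):
        self._entries[name] = (address, self._clock() + ttl)

    def get(self, name):
        entry = self._entries.get(name)
        if entry is None:
            return None  # never cached: the server must query the hierarchy
        address, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[name]  # TTL has run out: drop the stale entry
            return None
        return address


# Demonstration with a fake clock that we advance by hand.
now = [0.0]
cache = TtlCache(clock=lambda: now[0])
cache.put("www.xyz.com", "202.1.2.3", ttl=300)
now[0] = 299.0
print(cache.get("www.xyz.com"))  # still valid
now[0] = 300.0
print(cache.get("www.xyz.com"))  # expired
```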
But there is a catch here. Cache entries may be out of date. If a named host changes its IP
address, it would not be known until all the cache entries for that host are removed (i.e.
until the TTL expires). To take care of such situations, an update mechanism (RFC 2136)
has been added to the DNS protocol using which latest updates can be conveyed to the
name servers.
6.4 DNS records
We have so far been talking about the mapping and the transactions that take place
between the DNS servers at a high-level. Now let us look at the format in which the
mappings are stored and then at the message formats.
All information at the servers is stored as "Resource Records (RR)" in what are called
zone files. Thus, the DNS is a distributed database of RRs.
There are many types of RRs that are used for various functions related to the
management of names and addresses. We will consider 4 primary types of RRs that are
essential for the functioning of the DNS. The RR is essentially a 5-tuple with the following
fields :
(name, value, type, class, TTL).
The type field specifies the type of RR, and the name and value fields are interpreted
based on the value of this type field. The class field indicates the protocol family (IN - for
internet protocol). TTL is the Time-To-Live field that gives the validity of this entry when it
is cached (as mentioned earlier).
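The 5-tuple can be written down directly as a small Python type. This is a sketch only:
the class name ResourceRecord, the field names rtype and rclass, and the TTL of 3600
seconds are assumptions. The to_zone_line method prints the fields in the order commonly
seen in zone files (name, TTL, class, type, value).

```python
from typing import NamedTuple


class ResourceRecord(NamedTuple):
    """One RR as the 5-tuple (name, value, type, class, TTL)."""
    name: str
    value: str
    rtype: str   # record type, e.g. "A", "NS", "CNAME", "MX"
    rclass: str  # protocol family, "IN" for Internet
    ttl: int     # validity of a cached copy, in seconds

    def to_zone_line(self):
        # Order commonly used in zone files: name, TTL, class, type, value.
        return f"{self.name} {self.ttl} {self.rclass} {self.rtype} {self.value}"


record = ResourceRecord("www.xyz.com", "202.1.2.3", "A", "IN", 3600)
print(record.to_zone_line())
# www.xyz.com 3600 IN A 202.1.2.3
```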
Table 6.1 gives the type and corresponding interpretation of the name and value fields,
along with an example.
Table 6.1 Types of DNS Resource Records
Type    Record                   Name                        Value
A       Address record           Hostname                    IP address
                                 (www.xyz.com)               (202.1.2.3)
NS      Name Server record       Domain                      Hostname of authoritative server for this domain
                                 (xyz.com)                   (dnserver.xyz.com)
CNAME   Canonical Name record    Alias name                  Canonical name for the alias
                                 (www.xyz.com)               (www.actualNameXYZ.com)
MX      Mail Server record       Domain name                 Mail server associated with the domain name
                                 (xyz.com)                   (mail1.xyz.com)
The A type record provides the actual mapping, an IPv4 address, for a given domain
name. There is also an AAAA (quad-A) record used for IPv6 addresses (128-bit addresses).
The NS record is used to point to the next server (authoritative server for the domain) if a
queried server does not have the mapping. Along with the NS record, an A type record for
that server is also returned so that the server can be contacted.
The CNAME record is used to specify aliases. For instance, you may look for
cs.myuniv.edu, whereas the actual name could be computerScience.mail.edu. So when a
query is sent for cs.myuniv.edu, the system automatically has to search for
computerScience.mail.edu. To enable this functionality, the CNAME type allows us to
specify aliases and their associated canonical (real) names.
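How an alias is followed can be sketched in Python over a small in-memory record list.
The two names come from the example above; the address 192.0.2.25, the function names
lookup and resolve, and the hop limit are assumptions made for illustration.

```python
# (name, value, type) entries; the address is a made-up illustration value.
RECORDS = [
    ("cs.myuniv.edu", "computerScience.mail.edu", "CNAME"),
    ("computerScience.mail.edu", "192.0.2.25", "A"),
]


def lookup(name, rtype):
    """Return the value of the first record matching name and type, or None."""
    for rec_name, value, rec_type in RECORDS:
        if rec_name.lower() == name.lower() and rec_type == rtype:
            return value
    return None


def resolve(name, max_hops=8):
    """Follow CNAME records until an A record is found."""
    for _ in range(max_hops):  # bound the chain in case of an alias loop
        address = lookup(name, "A")
        if address is not None:
            return address
        canonical = lookup(name, "CNAME")
        if canonical is None:
            return None  # neither an address nor an alias is known
        name = canonical  # restart the search with the canonical name
    return None


print(resolve("cs.myuniv.edu"))
```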
The MX record is used to point to the mail server at a given domain name. Often, we need
not know the actual name of the mail server of a given domain name. We may say, we
want to send mail to aaa@xyz.com. The mail transfer agent now has to find the IP address
of the mail server of xyz.com. How would it do that? It is here that the MX record is useful.
The mail transfer agent would query for the MX record, get the name of the mail server,
and then proceed to get the IP address of that mail server.
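The two-step lookup performed by the mail transfer agent can be sketched in Python over
an in-memory record list. The names follow the xyz.com example; the address 192.0.2.50
and the function names are assumptions, and the preference value that real MX records
carry is left out to keep the sketch short.

```python
# (name, value, type) entries; the address is a made-up illustration value.
RECORDS = [
    ("xyz.com", "mail1.xyz.com", "MX"),
    ("mail1.xyz.com", "192.0.2.50", "A"),
]


def lookup(name, rtype):
    """Return the value of the first record matching name and type, or None."""
    for rec_name, value, rec_type in RECORDS:
        if rec_name == name and rec_type == rtype:
            return value
    return None


def mail_server_address(email):
    """Step 1: MX query for the domain. Step 2: A query for the mail server."""
    domain = email.rsplit("@", 1)[1]
    mail_host = lookup(domain, "MX")
    if mail_host is None:
        return None
    return mail_host, lookup(mail_host, "A")


print(mail_server_address("aaa@xyz.com"))
```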
There are many other records as well, which have been added as the DNS system
continues to evolve. It would be an interesting exercise for you to check them out.
A practical question -
What do we do if we want to register a new domain name, say "NetworkPathshala", and
associate it with our IP addresses?
We will have to register the name networkPathshala.com at a DNS registrar (there are
many registrars who provide this service). We will have to provide the names and IP
addresses of the authoritative name servers for our domain. Actually, a primary and a
secondary server are to be specified, so that if one fails, the other acts as a backup. For
each of these servers, the registrar will then insert two RRs into the .com TLD server.
For the primary server, for example:
(networkPathshala.com, dns1.networkPathshala.com, NS) &
(dns1.networkPathshala.com, 212.212.212.1, A).
This will direct any requests for networkPathshala.com to our domain's DNS server.
Now, at our authoritative server, we create a type A record for
www.networkPathshala.com, and a type MX record for networkPathshala.com for our mail
server. And we are done.
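The effect of registration can be traced with a small Python sketch for the
NetworkPathshala example. The two TLD entries and the address 212.212.212.1 are the ones
given above; the addresses 192.0.2.80 and 192.0.2.25, the host name
mail.networkPathshala.com, and the function names are assumptions made for illustration.

```python
# RRs inserted by the registrar into the .com TLD server.
TLD_COM = [
    ("networkPathshala.com", "dns1.networkPathshala.com", "NS"),
    ("dns1.networkPathshala.com", "212.212.212.1", "A"),
]

# RRs we create on our own authoritative server (addresses are made up).
AUTHORITATIVE = {
    "212.212.212.1": [
        ("www.networkPathshala.com", "192.0.2.80", "A"),
        ("networkPathshala.com", "mail.networkPathshala.com", "MX"),
        ("mail.networkPathshala.com", "192.0.2.25", "A"),
    ],
}


def find(records, name, rtype):
    """Return the value of the first record matching name and type, or None."""
    for rec_name, value, rec_type in records:
        if rec_name == name and rec_type == rtype:
            return value
    return None


def resolve_via_tld(host, domain):
    """Follow the TLD's referral to our server, then ask it for the host."""
    ns_name = find(TLD_COM, domain, "NS")     # which server is authoritative?
    ns_address = find(TLD_COM, ns_name, "A")  # the A record returned with the NS record
    return find(AUTHORITATIVE[ns_address], host, "A")


print(resolve_via_tld("www.networkPathshala.com", "networkPathshala.com"))
```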