Unit IV CN R18
(Transport Layer: Transport Services, Elements of Transport protocols, Connection management, TCP and
UDP protocols)
The main difference is that the network service is intended to model the service offered by
real networks. Real networks can lose packets, so the network service is generally unreliable.
The connection-oriented transport service, in contrast, is reliable. Of course, real networks are
not error-free, but that is precisely the purpose of the transport layer: to provide a reliable
service on top of an unreliable network. The transport layer can also provide an unreliable
(datagram) service, because there are some applications, such as client-server
computing and streaming multimedia, that build on a connectionless transport service.
A second difference between the network service and the transport service is for whom the services
are intended. The network service is used only by the transport entities. Few users write
their own transport entities, and thus few users or programs ever see the bare network service.
In contrast, many programs see the transport primitives. Consequently, the transport service
must be convenient and easy to use.
To get an idea of what a transport service might be like, consider the five primitives (LISTEN, CONNECT, SEND, RECEIVE, and DISCONNECT) listed in the figure below.
This transport interface is truly bare bones, but it gives the essential flavor of what a connection-
oriented transport interface has to do. It allows application programs to establish, use, and then release
connections, which is sufficient for many applications.
Client-Server Example:
To start with, the server executes a LISTEN primitive, typically by calling a library procedure that makes a
system call that blocks the server until a client turns up. When a client wants to talk to the server, it executes
a CONNECT primitive. The transport entity carries out this primitive by blocking the caller and sending a
packet to the server. Encapsulated in the payload of this packet is a transport layer message for the server’s
transport entity. A quick note on terminology, we will use the term segment for messages sent from transport
entity to transport entity. TCP, UDP and other Internet protocols use this term. Some older protocols used the
ungainly name TPDU (Transport Protocol Data Unit).
Getting back to our client-server example, the client’s CONNECT call causes a CONNECTION REQUEST
segment to be sent to the server. When it arrives, the transport entity checks to see that the server is blocked
on a LISTEN. If so, it then unblocks the server and sends a CONNECTION ACCEPTED segment back to
the client. When this segment arrives, the client is unblocked and the connection is established. Data can now
be exchanged using the SEND and RECEIVE primitives. In the simplest form, either party can do a
(blocking) RECEIVE to wait for the other party to do a SEND. When the segment arrives, the receiver is
unblocked. It can then process the segment and send a reply. As long as both sides can keep track of whose
turn it is to send, this scheme works fine. To the transport users, a connection is a reliable bit pipe: one user
stuffs bits in and they magically appear in the same order at the other end. This ability to hide complexity is
the reason that layered protocols are such a powerful tool.

When a connection is no longer needed, it must be released to free up table space within the two transport
entities. Disconnection has two variants: asymmetric
and symmetric. In the asymmetric variant, either transport user can issue a DISCONNECT primitive, which
results in a DISCONNECT segment being sent to the remote transport entity. Upon its arrival, the connection
is released. In the symmetric variant, each direction is closed separately, independently of the other one.
When one side does a DISCONNECT, that means it has no more data to send but it is still willing to accept
data from its partner. In this model, a connection is released when both sides have done a DISCONNECT.
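A minimal sketch of the client's side of this exchange, using Python's standard socket module as a stand-in for the primitives described above (CONNECT maps to connect(), SEND to sendall(), RECEIVE to recv(), and DISCONNECT to close()); the server name and port are illustrative assumptions, not part of the original text.

import socket

# Client side: CONNECT, then alternate SEND and RECEIVE, then DISCONNECT.
# The server address and port (example.org, 6000) are illustrative only.
SERVER = ("example.org", 6000)

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.connect(SERVER)                 # CONNECT: blocks until the connection is established
sock.sendall(b"request")             # SEND: push a message into the reliable bit pipe
reply = sock.recv(4096)              # RECEIVE: block until the server's reply arrives
print("reply:", reply)
sock.close()                         # DISCONNECT: release our side of the connection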
A state diagram for connection establishment and release for these simple primitives is given in Fig below.
Each transition is triggered by some event, either a primitive executed by the local transport user or an
incoming packet. For simplicity, we assume here that each segment is separately acknowledged. We also
assume that a symmetric disconnection model is used, with the client going first.
Thus, segments (exchanged by the transport layer) are contained in packets (exchanged by the network
layer). In turn, these packets are contained in frames (exchanged by the data link layer). When a frame
arrives, the data link layer processes the frame header and, if the destination address matches for local
delivery, passes the contents of the frame payload field up to the network entity. The network entity
similarly processes the packet header and then passes the contents of the packet payload up to the
transport entity. This nesting is illustrated in Fig below (Nesting of segments, packets, and frames).
c) Berkeley Sockets:
Another set of transport primitives, the socket primitives as they are used for TCP. Sockets were first
released as part of the Berkeley UNIX 4.2BSD software distribution in 1983. They quickly became
popular. The primitives are now widely used for Internet programming on many operating systems,
especially UNIX-based systems, and there is a socket-style API for Windows called ‘‘winsock.’’ The
primitives are listed in Fig. 6-5 below.
The first four primitives in the list are executed in that order by servers. The SOCKET primitive creates a
new endpoint and allocates table space for it within the transport entity. The parameters of the call specify the
addressing format to be used, the type of service desired, and the protocol. A successful SOCKET call
returns an ordinary file descriptor for use in succeeding calls, the same way an OPEN call on a file does.
Newly created sockets do not have network addresses. These are assigned using the BIND primitive. Once a
server has bound an address to a socket, remote clients can connect to it. The reason for not having the
SOCKET call create an address directly is that some processes care about their addresses, whereas others do
not. Next comes the LISTEN call, which allocates space to queue incoming calls for the case that several
clients try to connect at the same time. In contrast to LISTEN in our first example, in the socket model
LISTEN is not a blocking call. To block waiting for an incoming connection, the server executes an
ACCEPT primitive. When a segment asking for a connection arrives, the transport entity creates a new
socket with the same properties as the original one and returns a file descriptor for it. The server can then
fork off a process or thread to handle the connection on the new socket and go back to waiting for the next
connection on the original socket. ACCEPT returns a file descriptor, which can be used for reading and
writing in the standard way, the same as for files.

Now let us look at the client side. Here, too, a socket must
first be created using the SOCKET primitive, but BIND is not required since the address used does not matter
to the server. The CONNECT primitive blocks the caller and actively starts the connection process. When it
completes, the client process is unblocked and the connection is established. Both sides can now use SEND
and RECEIVE to transmit and receive data over the full-duplex connection. The standard UNIX READ and
WRITE system calls can also be used if none of the special options of SEND and RECEIVE are required.
Connection release with sockets is symmetric. When both sides have executed a CLOSE primitive, the
connection is released. Sockets have proved tremendously popular and are the de-facto standard for
abstracting transport services to applications.
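A minimal server-side sketch of this sequence in Python, whose socket module wraps the Berkeley socket primitives; the port number and the thread-per-connection structure are illustrative assumptions used only to show the SOCKET, BIND, LISTEN, ACCEPT order described above.

import socket
import threading

def handle(conn, addr):
    # The new socket returned by accept() is used for all I/O with this client.
    data = conn.recv(4096)          # RECEIVE on the per-connection socket
    conn.sendall(data)              # SEND an echo back
    conn.close()                    # CLOSE our side; release is symmetric

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)   # SOCKET: create an endpoint
srv.bind(("", 6000))                # BIND: assign a local address (port 6000 is illustrative)
srv.listen(5)                       # LISTEN: queue up to 5 pending connection requests
while True:
    conn, addr = srv.accept()       # ACCEPT: block until a connection request arrives
    # Fork off a thread to handle the new socket; go back to waiting on the original one.
    threading.Thread(target=handle, args=(conn, addr)).start()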
The socket API is often used with the TCP protocol to provide a connection-oriented service called a
reliable byte stream, which is simply the reliable bit pipe that we described. However, other protocols
could be used to implement this service using the same API. It should all be the same to the transport
service users.
A strength of the socket API is that it can be used by an application for other transport services. For
instance, sockets can be used with a connectionless transport service. In this case, CONNECT sets the
address of the remote transport peer and SEND and RECEIVE send and receive datagrams to and from
the remote peer.
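A short sketch of this connectionless use of the socket API in Python; the peer address is an illustrative assumption. Here connect() merely records the remote peer's address, and send()/recv() move individual datagrams with no reliability guarantee.

import socket

# Datagram (connectionless) socket: SOCK_DGRAM selects UDP underneath.
peer = ("198.51.100.7", 5300)       # illustrative remote transport address (address, port)

s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s.connect(peer)          # no handshake: just remembers the remote peer's address
s.send(b"ping")          # one datagram to the remembered peer
s.settimeout(2.0)        # datagrams may be lost, so do not block forever
try:
    print(s.recv(512))   # one datagram back from the peer, if any
except socket.timeout:
    print("no reply (datagram lost or peer silent)")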
Sockets can also be used with transport protocols that provide a message stream rather than a byte
stream and that do or do not have congestion control. For example, DCCP (Datagram Congestion
Control Protocol) is a version of UDP with congestion control. It is up to the transport users to
understand what service they are getting. However, sockets are not likely to be the final word on
transport interfaces. For example, applications often work with a group of related streams, such as a
Web browser that requests several objects from the same server. With sockets, the most natural fit is
for application programs to use one stream per object. This structure means that congestion control is
applied separately for each stream, not across the group, which is suboptimal. It punts to the application
the burden of managing the set.
Newer protocols and interfaces have been devised that support groups of related streams more
effectively and simply for the application. Two examples are SCTP (Stream Control Transmission
Protocol) defined in RFC 4960 and SST (Structured Stream Transport). These protocols must
change the socket API slightly to get the benefits of groups of related streams, and they also support
features such as a mix of connection-oriented and connectionless traffic and even multiple network
paths. Time will tell if they are successful.
ELEMENTS OF TRANSPORT PROTOCOLS
The transport service is implemented by a transport protocol used between the two transport entities.
Transport protocols, like data link protocols, have to deal with error control, sequencing, and flow
control. The differences between these protocols are due to major dissimilarities between the
environments in which the two protocols operate.
(a) Environment of the data link layer. (b) Environment of the transport layer.
Differences between data link layer and transport layer:
1. At data link layer, two routers communicate directly via a physical channel, whereas at the transport
layer, this physical channel is replaced by the entire subnet.
2. In the data link layer, it is not necessary for a router to specify which router it wants to talk to – each
outgoing line uniquely specifies a particular router. In the transport layer, explicit addressing of
destinations is required.
3. In the data link layer, the process of establishing a connection over the wire is simple: the other end is
always there (unless it has crashed, in which case it is not there). In the transport layer, initial
connection establishment is more complicated.
4. In the data link layer, when a router sends a frame, it may arrive or be lost, but it cannot bounce around for
a while. In the transport layer, if the subnet uses datagrams and adaptive routing inside, there is a
non-negligible probability that a packet may be stored for a number of seconds and then delivered later.
5. A final difference between the data link and transport layers is the following: buffering and flow
control are needed in both layers, but the presence of a large and dynamically varying number of
connections in the transport layer may require a different approach than the one used in the data link layer.
Addressing
Connection Establishment
Connection Release
Error Control
Flow Control and Buffering
Multiplexing
Crash Recovery
Addressing:
When an application process wishes to set up a connection to a remote application process, it must specify
which one to connect to. The method normally used is to define transport addresses to which processes can
listen for connection requests. In the Internet, these end points are called ports. We will use the generic term TSAP
(Transport Service Access Point). The analogous end points in the network layer are then called NSAPs. IP
addresses are examples of NSAPs.
Application processes, both clients and servers, can attach themselves to a TSAP to establish a connection
to a remote TSAP. These connections run through NSAPs on each host. The purpose of having TSAPs is
that in some networks, each computer has a single NSAP, so some way is needed to distinguish multiple
transport end points that share that NSAP.
1) A time-of-day server process on host 2 attaches itself to TSAP 1522 to wait for an incoming call.
2) An application process on host 1 wants to find out the time of day, so it issues a CONNECT request
specifying TSAP 1208 as the source and TSAP 1522 as the destination.
This action ultimately results in a transport connection being established between the application process on
host 1 and server 1 on host 2.
3) The application process then sends over a request for the time.
4) The time server process responds with the current time.
5) The transport connection is then released.
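A small sketch of a time-of-day exchange of this kind in Python. Port 13 is the well-known daytime service port, used here as a concrete TSAP; the server host name is an illustrative assumption. In the real daytime protocol the server sends the time as soon as the connection is established, so no explicit request segment is needed.

import socket

# The TSAP is (IP address, port): the NSAP plus a port to pick out the process.
TIME_SERVER = ("time.example.edu", 13)   # host is illustrative; 13 is the daytime TSAP

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(TIME_SERVER)        # CONNECT from our local TSAP to the server's TSAP
reply = s.recv(256)           # the daytime server sends the current time...
s.close()                     # ...and the connection is then released
print(reply.decode().strip())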
Connection Establishment
At first glance, it would seem sufficient for one transport entity to just send a CONNECTION REQUEST
TPDU to the destination and wait for a CONNECTION ACCEPTED reply. The problem occurs when the
network can lose, store, and duplicate packets. This behaviour causes serious complications. Imagine a
subnet that is so congested that acknowledgements hardly ever get back in time and each packet times out and is
retransmitted two or more times. Suppose that the subnet uses datagrams inside and that every packet follows
a different route. Some of the packets might get stuck in a traffic jam inside the subnet and take a long time
to arrive, that is, they are stored in the subnet and pop out much later. The crux of the problem is the
existence of delayed duplicates; it can be attacked in various ways.
To get around the problem of a machine losing all memory of where it was after a crash, Tomlinson proposed
equipping each host with a time-of-day clock. The basic idea is to ensure that two identically numbered
TPDUs are never outstanding at the same time.
Fig: How a user process in host 1 establishes a connection with a time-of-day server in host 2.
Fig: (a) TPDUs may not enter the forbidden region. (b) The resynchronization problem.
Fig: Three protocol scenarios for establishing a connection using a three-way handshake. CR denotes
CONNECTION REQUEST.
Connection Release
Releasing a connection is easier than establishing one.
There are two styles of terminating a connection: asymmetric release and symmetric release.
Asymmetric release is abrupt and may result in data loss
One way to avoid data loss is to use symmetric release, in which each direction is released
independently of the other one.
One can envision a protocol in which host 1 says: ‘‘I am done. Are you done too?’’ If host 2 responds: ‘‘I
am done too. Goodbye,’’ the connection can be safely released.
Unfortunately, this protocol does not always work. There is a famous problem that illustrates this
issue. It is called the two-army problem.
Fig: Abrupt disconnection with loss of data.
The white army is in a valley, with a blue army in the hills on either side of the valley. The white army can
defeat either blue army in isolation, but the blue armies together can defeat the white army. How do the blue
armies coordinate an attack on the white army? Their only communication is by sending messengers through
the valley, where messengers may be lost (i.e., an unreliable channel).
Blue army #1 sends a message: attack at time X. Blue army #2 receives this message and sends an
acknowledgement. Does the attack happen at time X? No, since blue army #2 cannot know that its ack was
received. Adding an ack to the ack (a three-way handshake) does not help, since now blue army #1 does not know
whether its ack to the ack got through, and if it did not, blue army #2 will not attack, so blue army #1 should not
attack either.
You can prove that no protocol can solve this problem. Suppose such a protocol existed. The last message
sent is either essential or it is not. If it is not, it can be lost or dropped with no adverse effect, so drop all
non-essential messages. Now all the messages that remain are essential. What if the last message is lost? Since it is
essential, the protocol fails, so we have a contradiction. In practice, timers are used to decide when it is safe
to drop a connection.
Fig: Four protocol scenarios for releasing a connection.
RD (DISCONNECTION REQUEST)
Error Control
Error control is ensuring that the data is delivered with the desired level of reliability, usually that all of the
data is delivered without any errors. Flow control is keeping a fast transmitter from overrunning a slow
receiver. The main mechanisms, familiar from the data link layer, are the following:
1. A frame carries an error-detecting code (e.g., a CRC or checksum) that is used to check if the information
was correctly received.
2. A frame carries a sequence number to identify itself and is retransmitted by the sender until it receives an
acknowledgement of successful receipt from the receiver. This is called ARQ (Automatic Repeat
reQuest).
3. There is a maximum number of frames that the sender will allow to be outstanding at any time, pausing if
the receiver is not acknowledging frames quickly enough. If this maximum is one packet the protocol is
called stop-and-wait. Larger windows enable pipelining and improve performance on long, fast links.
4. The sliding window protocol combines these features and is also used to support bidirectional data
transfer.
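A minimal sketch of the stop-and-wait variant of ARQ described in points 2 and 3, written in Python over a UDP socket; the frame format (a one-byte sequence number prepended to the data), the timeout value, and the peer address are all illustrative assumptions.

import socket

PEER = ("198.51.100.7", 5400)        # illustrative receiver address
TIMEOUT = 1.0                        # illustrative retransmission timeout (seconds)

def send_stop_and_wait(sock, data, seq):
    """Send one frame and retransmit it until the matching ACK arrives."""
    frame = bytes([seq]) + data      # frame = 1-byte sequence number + payload
    sock.settimeout(TIMEOUT)
    while True:
        sock.sendto(frame, PEER)
        try:
            ack, _ = sock.recvfrom(16)
            if ack and ack[0] == seq:          # ACK carries the sequence number it confirms
                return
        except socket.timeout:
            pass                     # no ACK in time: fall through and retransmit

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
seq = 0
for chunk in (b"first", b"second", b"third"):
    send_stop_and_wait(sock, chunk, seq)
    seq ^= 1                         # stop-and-wait only needs a 1-bit sequence number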
Flow Control and Buffering
In some ways the flow control problem in the transport layer is the same as in the data link layer, but in other
ways it is different. The main difference is that a router usually has relatively few lines, whereas a host may
have numerous connections. This difference makes it impractical to implement the data link buffering
strategy in the transport layer. If the network service is unreliable, the sender must buffer all TPDUs sent.
However, with reliable network service, other trade-offs become possible. If the sender knows that the
receiver always has buffer space available, it need not retain copies of the TPDUs it sends. However, if the receiver
cannot guarantee that every incoming TPDU will be accepted, the sender will have to buffer anyway. Even
if the receiver has agreed to do the buffering, there still remains the question of the buffer size.
Fig: (a) Chained fixed-size buffers. (b) Chained variable-sized buffers. (c) One large circular buffer per connection.
For low-bandwidth bursty traffic, it is better to buffer at the sender, and for high bandwidth smooth traffic, it
is better to buffer at the receiver.
Dynamic buffer management: The sender requests a certain number of buffers, based on its perceived
needs. The receiver then grants as many of these as it can afford.
Fig: Dynamic buffer allocation. The arrows show the direction of transmission. An ellipsis (…) indicates a lost
TPDU.
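A tiny sketch of the credit idea behind dynamic buffer management: the sender may only transmit while it holds credit (buffers granted by the receiver), and each acknowledgement can carry a fresh grant. The class and the numbers below are illustrative assumptions, not part of the original text.

# Illustrative bookkeeping for credit-based (dynamic buffer) flow control.
class Sender:
    def __init__(self, initial_credit):
        self.credit = initial_credit      # buffers the receiver has granted so far

    def can_send(self):
        return self.credit > 0

    def on_send(self):
        self.credit -= 1                  # each TPDU sent consumes one granted buffer

    def on_ack(self, new_grant):
        self.credit += new_grant          # ACKs piggyback additional buffer grants

s = Sender(initial_credit=4)
sent = 0
while s.can_send():
    s.on_send(); sent += 1                # stop when the credit is exhausted
print(sent, "TPDUs sent before blocking for more credit")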
Multiplexing
Multiplexing several conversations onto connections, virtual circuits, and physical links plays a role in
several layers of the network architecture. In the transport layer the need for multiplexing can arise in a
number of ways. For example, if only one network address is available on a host, all transport connections
on that machine have to use it.
For multiplexing, the following two main strategies are used:
1. Upward multiplexing
2. Downward multiplexing
Upward Multiplexing
In upward multiplexing, several transport connections are multiplexed onto the same network connection.
These transport connections are grouped by the transport layer as per their destinations.
It then maps the groups onto the minimum number of network connections possible.
Upward multiplexing is quite useful where network connections are very expensive.
Downward Multiplexing
Downward multiplexing is used only when connections with high bandwidth are required.
In downward multiplexing, multiple network connections are opened by the transport layer for a single
transport connection, and the traffic is divided among them.
Crash Recovery
If hosts and routers are subject to crashes, recovery from these crashes becomes an issue. If the transport
entity is entirely within the hosts, recovery from network and router crashes is straightforward. If the network
layer provides datagram service, the transport entities expect lost TPDUs all the time and know how to cope
with them. If the network layer provides connection oriented service, then loss of a virtual circuit is handled
by establishing a new one and then probing the remote transport entity to ask it which TPDUs it has received
and which ones it has not received. The latter ones can be retransmitted. A more troublesome problem is how
to recover from host crashes.
Three events are possible at the server: sending an acknowledgement (A), writing to the output process (W),
and crashing (C). The three events can occur in six different orderings: AC(W), AWC, C(AW), C(WA), WAC,
and WC(A), where the parentheses are used to indicate that neither A nor W can follow C (i.e., once it has
crashed, it has crashed). Figure 6-18 shows all eight combinations of client and server strategies and the valid
event sequences for each one. Notice that for each strategy there is some sequence of events that causes the
protocol to fail. For example, if the client always retransmits, the AWC event will generate an undetected
duplicate, even though the other two events work properly.
Fig: Different combinations of client and server strategy.
CONGESTION CONTROL
If the transport entities on many machines send too many packets into the network too quickly, the network
will become congested, with performance degraded as packets are delayed and lost. Controlling congestion
to avoid this problem is the combined responsibility of the network and transport layers. Congestion occurs
at routers, so it is detected at the network layer. However, congestion is ultimately caused by traffic sent into
the network by the transport layer. The only effective way to control congestion is for the transport protocols
to send packets into the network more slowly. The Internet relies heavily on the transport layer for
congestion control, and specific algorithms are built into TCP and other protocols.
1. Desirable Bandwidth Allocation: It is to find a good allocation of bandwidth to the transport entities that
are using the network. A good allocation will deliver good performance because it uses all the available
bandwidth but avoids congestion, it will be fair across competing transport entities, and it will quickly
track changes in traffic demands.
a) Efficiency and Power: An efficient allocation of bandwidth across transport entities will use all of
the network capacity that is available. However, it is not quite right to think that if there is a 100-
Mbps link, five transport entities should get 20 Mbps each. They should usually get less than 20 Mbps
for good performance. The reason is that the traffic is often bursty. The goodput (the rate of useful packets
arriving at the receiver) can be plotted as a function of the offered load; this curve and a matching curve
for the delay as a function of the offered load are given in Fig. 6-19 below.
Figure 6-19. (a) Goodput and (b) delay as a function of offered load.
As the load increases in Fig. 6-19(a) goodput initially increases at the same rate, but as the load
approaches the capacity, goodput rises more gradually. This falloff is because bursts of traffic can
occasionally mount up and cause some losses at buffers inside the network. If the transport protocol is
poorly designed and retransmits packets that have been delayed but not lost, the network can enter
congestion collapse. In this state, senders are furiously sending packets, but increasingly little useful
work is being accomplished.
The corresponding delay is given in Fig. 6-19(b) Initially the delay is fixed, representing the
propagation delay across the network. As the load approaches the capacity, the delay rises, slowly at
first and then much more rapidly. This is again because of bursts of traffic that tend to mound up at
high load. The delay cannot
really go to infinity, except in a model in which the routers have infinite buffers. Instead, packets will
be lost after experiencing the maximum buffering delay. For both goodput and delay, performance
begins to degrade at the onset of congestion. Intuitively, we will obtain the best performance from the
network if we allocate bandwidth up until the delay starts to climb rapidly. This point is below the
capacity. To identify it, Kleinrock (1979) proposed the metric of power, where
power = load / delay
Power will initially rise with offered load, as delay remains small and roughly constant, but will reach
a maximum and fall as delay grows rapidly. The load with the highest power represents an efficient
load for the transport entity to place on the network.
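A short worked example of Kleinrock's power metric, assuming a simple M/M/1-style delay model in which delay = 1/(capacity − load); under that assumption power = load × (capacity − load), which peaks at half the capacity. The delay model is only an illustrative assumption used to show the shape of the curve.

# power = load / delay; with delay = 1/(capacity - load) (simple M/M/1-style model)
capacity = 100.0                              # e.g., 100 Mbps of usable capacity

def delay(load):
    return 1.0 / (capacity - load)            # grows without bound as load -> capacity

def power(load):
    return load / delay(load)                 # = load * (capacity - load)

best = max(range(1, 100), key=lambda l: power(float(l)))
print("power is highest near load =", best, "Mbps")   # about capacity / 2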
b) Max-Min Fairness: The form of fairness that is often desired for network usage is max-min
fairness. An allocation is max-min fair if the bandwidth given to one flow cannot be increased
without decreasing the bandwidth given to another flow with an allocation that is no larger. That is,
increasing the bandwidth of a flow will only make the situation worse for flows that are less well off.
Let us see an example. A max-min fair allocation is shown for a network with four flows, A, B, C, and
D, in Fig. 6-20. Each of the links between routers has the same capacity, taken to be 1 unit, though in
the general case the links will have different capacities. Three flows compete for the bottom-left link
between routers R4 and R5. Each of these flows therefore gets 1/3 of the link. The remaining flow, A,
competes with B on the link from R2 to R3. Since B has an allocation of 1/3, A gets the remaining 2/3
of the link. Notice that all of the other links have spare capacity. However, this capacity cannot be
given to any of the flows without decreasing the capacity of another, lower flow. For example, if
more of the bandwidth on the link between R2 and R3 is given to flow B, there will be less for flow A.
This is reasonable as flow A already has more bandwidth. However, the capacity of flow C or D (or
both) must be decreased to give more bandwidth to B, and these flows will have less bandwidth than
B. Thus, the allocation is max-min fair.
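A sketch of one way to compute a max-min fair allocation (the progressive-filling procedure: raise all unfrozen flows at the same rate and freeze every flow crossing a link the moment that link saturates), applied to the four-flow example above. The per-flow routes reflect the description of Fig. 6-20; the code itself is an illustrative assumption, not taken from the text.

def max_min_fair(capacity, routes):
    """capacity: link -> capacity; routes: flow -> set of links it crosses."""
    rate = {f: 0.0 for f in routes}
    frozen = set()
    left = dict(capacity)                        # capacity still unallocated on each link
    while len(frozen) < len(routes):
        active = {l: sum(1 for f, r in routes.items() if l in r and f not in frozen)
                  for l in capacity}
        # smallest increase that saturates some link carrying active flows
        inc = min(left[l] / n for l, n in active.items() if n > 0)
        for f in routes:
            if f not in frozen:
                rate[f] += inc
        for l, n in active.items():
            left[l] -= inc * n
            if n > 0 and left[l] < 1e-9:         # link saturated: freeze its flows
                frozen |= {f for f, r in routes.items() if l in r}
    return rate

# Bottleneck links from the example: R2-R3 (flows A and B) and R4-R5 (flows B, C, D).
links = {"R2-R3": 1.0, "R4-R5": 1.0}
flows = {"A": {"R2-R3"}, "B": {"R2-R3", "R4-R5"}, "C": {"R4-R5"}, "D": {"R4-R5"}}
print(max_min_fair(links, flows))                # A gets 2/3; B, C, D get 1/3 each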
c) Convergence: A good congestion control algorithm should rapidly converge to the ideal operating
point, and it should track that point as it changes over time. If the convergence is too slow, the
algorithm will never be close to the changing operating point. If the algorithm is not stable, it may fail
to converge to the right point in some cases, or even oscillate around the right point. An example of a
bandwidth allocation that changes over time and converges quickly is shown in Fig. 6-21. Initially,
flow 1 has all of the bandwidth. One second later, flow 2 starts. It needs bandwidth as well. The
allocation quickly changes to give each of these flows half the bandwidth. At 4 seconds, a third flow
joins. However, this flow uses only 20% of the bandwidth, which is less than its fair share (which is a
third). Flows 1 and 2 quickly adjust, dividing the available bandwidth to each have 40% of the
bandwidth. At 9 seconds, the second flow leaves, and the third flow remains unchanged. The first
flow quickly captures 80% of the bandwidth. At all times, the total allocated bandwidth is
approximately 100%, so that the network is fully used, and competing flows get equal treatment.
2. Regulating the Sending Rate: Now it is time for the main course. How do we regulate the sending rates
to obtain a desirable bandwidth allocation? The sending rate may be limited by two factors. The first is flow
control, in the case that there is insufficient buffering at the receiver. The second is congestion, in the case
that there is insufficient capacity in the network.
In Fig. 6-22, we see this problem illustrated hydraulically. In Fig. 6-22(a), we see a thick pipe leading to a
small-capacity receiver. This is a flow-control limited situation. As long as the sender does not send more
water than the bucket can contain, no water will be lost. In Fig. 6-22(b), the limiting factor is not the bucket
capacity, but the internal carrying capacity of the network. If too much water comes in too fast, it will back
up and some will be lost (in this case, by overflowing the funnel). These cases may appear similar to the
sender, as transmitting too fast causes packets to be lost. However, they have different causes and call for
different solutions. We have already talked about a flow-control solution with a variable-sized window. Now
we will consider a congestion control solution. Since either of these problems can occur, the transport
protocol will in general need to run both solutions and slow down if either problem occurs. The way that a
transport protocol should regulate the sending rate depends on the form of the feedback returned by the
network. Different network layers may return different kinds of feedback. The feedback may be explicit or
implicit, and it may be precise or imprecise. An example of an explicit, precise design is when routers tell
the sources the rate at which they may send. Designs in the literature such as XCP (eXplicit Congestion
Protocol) operate in this manner. An explicit, imprecise design is the use of ECN (Explicit Congestion
Notification) with TCP. In this design, routers set bits on packets that experience congestion to warn the
senders to slow down, but they do not tell them how much to slow down.
Figure 6-22. (a) A fast network feeding a low-capacity receiver. (b) A slow network feeding a high-capacity receiver.
In other designs, there is no explicit signal. FAST TCP measures the roundtrip delay and uses that metric as a
signal to avoid congestion (Wei et al., 2006). Finally, in the form of congestion control most prevalent in the
Internet today, TCP with drop-tail or RED routers, packet loss is inferred and used to signal that the network
has become congested. There are many variants of this form of TCP, including CUBIC TCP, which is used in
Linux (Ha et al., 2008). Combinations are also possible. For example, Windows includes Compound TCP
that uses both packet loss and delay as feedback signals (Tan et al., 2006). These designs are summarized in
Fig. 6-23. If an explicit and precise signal is given, the transport entity can use that signal to adjust its rate to
the new operating point. For example, if XCP tells senders the rate to use, the senders may simply use that
rate. In the other cases, however, some guesswork is involved. In the absence of a congestion signal, the
senders should increase their rates. When a congestion signal is given, the senders should decrease their
rates. The way in which the rates are increased or decreased is given by a control law.
AIMD (Additive Increase Multiplicative Decrease) is the appropriate control law to arrive at the efficient
and fair operating point. To argue this case, Chiu and Jain (1989) constructed a graphical argument for the simple case of two
connections competing for the bandwidth of a single link. The graph in Fig. 6-24 shows the bandwidth
allocated to user 1 on the x-axis and to user 2 on the y-axis. When the allocation is fair, both users will
receive the same amount of bandwidth. This is shown by the dotted fairness line. When the allocations sum
to 100%, the capacity of the link, the allocation is efficient. This is shown by the dotted efficiency line. A
congestion signal is given by the network to both users when the sum of their allocations crosses this line.
The intersection of these lines is the desired operating point, when both users have the same bandwidth and
all of the network bandwidth is used.
Consider what happens from some starting allocation if both user 1 and user 2 additively increase their
respective bandwidths over time. For example, the users may each increase their sending rate by 1 Mbps
every second. Eventually, the operating point crosses the efficiency line and both users receive a congestion
signal from the network. At this stage, they must reduce their allocations. However, an additive decrease
would simply cause them to oscillate along an additive line. This situation is shown in Fig. 6-24. The
behavior will keep the operating point close to efficient, but it will not necessarily be fair. Similarly, consider
the case when both users multiplicatively increase their bandwidth over time until they receive a congestion
signal. For example, the users may increase their sending rate by 10% every second. If they then
multiplicatively decrease their sending rates, the operating point of the users will simply oscillate along a
multiplicative line. This behavior is also shown in Fig. 6-24. The multiplicative line has a different slope than
the additive line. (It points to the origin, while the additive line has an angle of 45 degrees.) But it is
otherwise no better. In neither case will the users converge to the optimal sending rates that are both fair and
efficient. Now consider the case that the users additively increase their bandwidth allocations and then
multiplicatively decrease them when congestion is signaled. This behavior is the AIMD control law, and it is
shown in Fig. 6-25. It can be seen that the path traced by this behavior does converge to the optimal point
that is both fair and efficient. This convergence happens no matter what the starting point, making AIMD
broadly useful. By the same argument, the only other combination, multiplicative increase and additive
decrease, would diverge from the optimal point.
Figure 6-25. Additive Increase Multiplicative Decrease (AIMD) control law.
AIMD is the control law that is used by TCP, based on this argument and another stability argument (that it is
easy to drive the network into congestion and difficult to recover, so the increase policy should be gentle and
the decrease policy aggressive). It is not quite fair, since TCP connections adjust their window size by a
given amount every round-trip time. Different connections will have different round-trip times. This leads to
a bias in which connections to closer hosts receive more bandwidth than connections to distant hosts, all else
being equal. In Sec. 6.5, we will describe in detail how TCP implements an AIMD control law to adjust the
sending rate and provide congestion control. This task is more difficult than it sounds because rates are
measured over some interval and traffic is bursty. Instead of adjusting the rate directly, a strategy that is often
used in practice is to adjust the size of a sliding window. TCP uses this strategy. If the window size is W and
the round-trip time is RTT, the equivalent rate is W/RTT. This strategy is easy to combine with flow control,
which already uses a window, and has the advantage that the sender paces packets using acknowledgements
and hence slows down in one RTT if it stops receiving reports that packets are leaving the network.
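A small sketch of an AIMD window update of the kind described above: the window grows by one segment per round-trip time and is halved when a congestion signal (here, a simulated loss) is seen. The loss pattern, starting window, and RTT are illustrative assumptions, and this is a simplification rather than TCP's full algorithm.

RTT = 0.1                        # illustrative round-trip time in seconds

def aimd_step(cwnd, loss_detected):
    """One round trip of the AIMD control law (window measured in segments)."""
    if loss_detected:
        return max(1.0, cwnd / 2.0)   # multiplicative decrease on congestion
    return cwnd + 1.0                 # additive increase otherwise

cwnd = 1.0
losses = {8, 16}                      # pretend congestion is signalled on these rounds
for rtt_round in range(20):
    cwnd = aimd_step(cwnd, rtt_round in losses)
    print(f"round {rtt_round:2d}: cwnd = {cwnd:4.1f} segments, "
          f"rate ~ {cwnd / RTT:5.1f} segments/s")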
Wireless Issues: Transport protocols such as TCP that implement congestion control should be
independent of the underlying network and link layer technologies. That is a good theory, but in practice
there are issues with wireless networks. The main issue is that packet loss is often used as a congestion
signal, including by TCP as we have just discussed. Wireless networks lose packets all the time due to
transmission errors. With the AIMD control law, high throughput requires very small levels of packet loss.
Analyses by Padhye et al. (1998) show that the throughput goes up as the inverse square-root of the packet
loss rate. What this means in practice is that the loss rate for fast TCP connections is very small; 1% is a
moderate loss rate, and by the time the loss rate reaches 10% the connection has effectively stopped
working. However, for wireless networks such as 802.11 LANs, frame loss rates of at least 10% are
common. This difference means that, absent protective measures, congestion control schemes that use
packet loss as a signal will unnecessarily
throttle connections that run over wireless links to very low rates. To function well, the only packet losses
that the congestion control algorithm should observe are losses due to insufficient bandwidth, not losses
due to transmission errors. One solution to this problem is to mask the wireless losses by using
retransmissions over the wireless link. For example, 802.11 uses a stop-and-wait protocol to deliver each
frame, retrying transmissions multiple times if need be before reporting a packet loss to the higher layer.
In the normal case, each packet is delivered despite transient transmission errors that are not visible to the
higher layers. Fig. 6-26 shows a path with a wired and wireless link for which the masking strategy is
used. There are two aspects to note. First, the sender does not necessarily know that the path includes a
wireless link, since all it sees is the wired link to which it is attached. Internet paths are heterogeneous and
there is no general method for the sender to tell what kind of links comprise the path. This complicates the
congestion control problem, as there is no easy way to use one protocol for wireless links and another
protocol for wired links.
The figure shows two mechanisms that are driven by loss: link layer frame retransmissions, and transport
layer congestion control.
Introduction to UDP
The User Datagram Protocol (UDP) is the simplest transport layer communication protocol of the
TCP/IP protocol suite. It involves a minimal amount of communication mechanism. UDP is said to be an
unreliable transport protocol, but it uses IP services, which provide a best-effort delivery mechanism.
In UDP, the receiver does not generate an acknowledgement for a packet received and, in turn, the sender does
not wait for any acknowledgement of a packet sent. This shortcoming makes the protocol unreliable as well
as easier on processing.
Requirement of UDP
A question may arise: why do we need an unreliable protocol to transport data? We deploy UDP where
the acknowledgement packets would share a significant amount of bandwidth along with the actual data. For
example, in the case of video streaming, thousands of packets are forwarded towards the users. Acknowledging
all the packets is troublesome and may waste a huge amount of bandwidth. The best-effort delivery
mechanism of the underlying IP protocol ensures best efforts to deliver the packets, and even if some packets in
a video stream get lost, the impact is not calamitous and can easily be ignored. The loss of a few packets in
video and voice traffic sometimes goes unnoticed.
Features
UDP is used when acknowledgement of data does not hold any significance.
UDP is a good protocol for data flowing in one direction.
UDP is simple and suitable for query-based communications.
UDP is not connection oriented.
UDP does not provide a congestion control mechanism.
UDP does not guarantee ordered delivery of data.
UDP is stateless.
UDP is a suitable protocol for streaming applications such as VoIP and multimedia streaming.
UDP Header
UDP header is as simple as its function. UDP header contains four main parameters:
Source Port - This 16-bit field identifies the source port of the packet.
Destination Port - This 16-bit field identifies the application-level service on the destination
machine.
Length - The length field specifies the entire length of the UDP packet (including the header). It is a 16-bit
field and its minimum value is 8 bytes, i.e., the size of the UDP header itself.
Checksum - This field stores the checksum value generated by the sender before sending. In IPv4 this
field is optional, so when the checksum is not used the field is set to 0 and all its bits
are set to zero.
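A short sketch of how these four 16-bit fields can be packed and unpacked with Python's struct module; the port numbers and payload are illustrative, and the checksum is left at 0 (the "not used" value permitted in IPv4).

import struct

# UDP header: source port, destination port, length, checksum (each 16 bits, network order)
payload = b"hello"
src_port, dst_port = 40000, 53          # illustrative ports (53 = DNS)
length = 8 + len(payload)               # 8-byte header plus payload
checksum = 0                            # 0 means "checksum not used" in IPv4

header = struct.pack("!HHHH", src_port, dst_port, length, checksum)
datagram = header + payload

# Unpacking the same 8 bytes recovers the four fields.
s, d, ln, ck = struct.unpack("!HHHH", datagram[:8])
print(s, d, ln, ck)                     # 40000 53 13 0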
UDP application
Here are a few applications where UDP is used to transmit data:
Domain Name Services
Simple Network Management Protocol
Trivial File Transfer Protocol
Routing Information Protocol
Kerberos
Introduction to TCP
The Transmission Control Protocol (TCP) is one of the most important protocols of the Internet protocol suite. It
is the most widely used protocol for data transmission in communication networks such as the Internet.
Features
TCP is a reliable protocol. That is, the receiver always sends either a positive or a negative
acknowledgement about the data packet to the sender, so that the sender always has a clear indication of
whether the data packet reached the destination or needs to be resent.
TCP ensures that the data reaches the intended destination in the same order it was sent.
TCP is connection oriented. TCP requires that a connection between two remote points be established
before sending actual data.
TCP provides error-checking and recovery mechanisms.
TCP provides end-to-end communication.
TCP provides flow control and quality of service.
TCP operates in Client/Server point-to-point mode.
TCP provides full-duplex communication, i.e., each end can act as both sender and receiver.
Header
The TCP header is a minimum of 20 bytes and a maximum of 60 bytes long.
Source Port (16-bits) - It identifies source port of the application process on the sending device.
Destination Port (16-bits) - It identifies destination port of the application process on the receiving
device.
Sequence Number (32-bits) - Sequence number of data bytes of a segment in a session.
Acknowledgement Number (32-bits) - When ACK flag is set, this number contains the next
sequence number of the data byte expected and works as acknowledgement of the previous data
received.
Data Offset (4-bits) - This field gives the size of the TCP header (in 32-bit words) and thus the offset at
which the data begins in the current TCP segment.
Reserved (3-bits) - Reserved for future use; all bits are set to zero by default.
Flags (1-bit each)
o NS - Nonce Sum bit is used by Explicit Congestion Notification signaling process.
o CWR - When a host receives a packet with the ECE bit set, it sets Congestion Window Reduced
to acknowledge that the ECE was received.
o ECE - It has two meanings:
If the SYN bit is 0, ECE means that the IP packet has its CE (congestion
experienced) bit set.
If the SYN bit is 1, ECE means that the device is ECT (ECN-capable transport) capable.
o URG - It indicates that Urgent Pointer field has significant data and should be processed.
o ACK - It indicates that the Acknowledgement field has significance. If ACK is cleared to 0, the
packet does not contain any acknowledgement.
o PSH - When set, it is a request to the receiving station to PUSH data (as soon as it comes) to
the receiving application without buffering it.
o RST - Reset flag has the following features:
It is used to refuse an incoming connection.
It is used to reject a segment.
It is used to restart a connection.
o SYN - This flag is used to set up a connection between hosts.
o FIN - This flag is used to release a connection and no more data is exchanged thereafter.
Because packets with the SYN and FIN flags carry sequence numbers, they are processed in the
correct order.
Window Size - This field is used for flow control between the two stations and indicates the amount of
buffer space (in bytes) the receiver has allocated for a segment, i.e., how much data the receiver is
expecting.
Checksum - This field contains the checksum computed over the header, the data, and a pseudo-header.
Urgent Pointer - It points to the urgent data byte if URG flag is set to 1.
Options - It facilitates additional options that are not covered by the regular header. The options field
is always described in 32-bit words. If this field contains less than 32 bits of data, padding is used to
fill the remaining bits to reach a 32-bit boundary.
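A sketch of how the fixed 20-byte part of this header maps onto bytes, using Python's struct module; the field values below are illustrative (a bare SYN segment with checksum 0), and TCP options are omitted.

import struct

# Fixed TCP header: ports, sequence and ack numbers, offset/flags, window, checksum, urgent ptr.
FLAGS = {"CWR": 0x80, "ECE": 0x40, "URG": 0x20, "ACK": 0x10,
         "PSH": 0x08, "RST": 0x04, "SYN": 0x02, "FIN": 0x01}

def build_syn(src, dst, seq, window):
    offset_ns = (5 << 4)                 # data offset = 5 words (20 bytes), NS bit = 0
    return struct.pack("!HHIIBBHHH", src, dst, seq, 0,
                       offset_ns, FLAGS["SYN"], window, 0, 0)

def parse(header):
    src, dst, seq, ack, off_ns, flags, window, csum, urg = struct.unpack("!HHIIBBHHH", header[:20])
    set_flags = [name for name, bit in FLAGS.items() if flags & bit]
    return {"src": src, "dst": dst, "seq": seq, "ack": ack,
            "header_len": (off_ns >> 4) * 4, "flags": set_flags, "window": window}

segment = build_syn(src=44321, dst=80, seq=1000, window=65535)   # illustrative values
print(parse(segment))    # header_len 20, flags ['SYN']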
TCP Connection Establishment (Three-way handshake)
Connections are established in TCP by means of the three-way handshake.
To establish a connection, one side, say, the server passively waits for an incoming connection by
executing the LISTEN and ACCEPT primitives in that order, either specifying a specific source or
nobody in particular.
The other side, say, the client, executes a CONNECT primitive, specifying the IP address and port to
which it wants to connect, the maximum TCP segment size it is willing to accept, and optionally
some user data (e.g., a password).
The CONNECT primitive sends a TCP segment with the SYN bit on and ACK bit off and waits for a
response. When this segment arrives at the destination, the TCP entity there checks to see if there is a
process that has done a LISTEN on the port given in the Destination port field. If not, it sends a reply
with the RST bit on to reject the connection.
If some process is listening to the port, that process is given the incoming TCP segment. It can either
accept or reject the connection. If it accepts, an acknowledgement segment is sent back. The sequence
of TCP segments sent in the normal case is shown in Fig. 6-37(a).
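A small worked illustration of the segments in this normal case, with purely illustrative initial sequence numbers; note that the SYN consumes one sequence number, so each acknowledgement is one greater than the sequence number it confirms.

# Illustrative initial sequence numbers (real TCP picks them from a clock/random source)
client_isn, server_isn = 1000, 5000

print(f"client -> server: SYN,      seq={client_isn}")
print(f"server -> client: SYN+ACK,  seq={server_isn}, ack={client_isn + 1}")
print(f"client -> server: ACK,      seq={client_isn + 1}, ack={server_isn + 1}")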
In the event that two hosts simultaneously attempt to establish a connection between the same two sockets,
the sequence of events is as illustrated in Fig. 6- 37(b). The result of these events is that just one connection
is established, not two, because connections are identified by their end points. If the first setup results in a
connection identified by (x, y) and the second one does too, only one table entry is made, namely, for (x, y).
The steps required to establish and release connections can be represented in a finite state machine with the
11 states listed in Fig. 6-38. In each state, certain events are legal. When a legal event happens, some action
may be taken. If some other event happens, an error is reported. Each connection starts in the CLOSED state.
It leaves that state when it does either a passive open (LISTEN) or an active open (CONNECT). If the other
side does the opposite one, a connection is established and the state becomes ESTABLISHED. Connection
release can be initiated by either side. When it is complete, the state returns to CLOSED. The finite state
machine itself is shown in Fig. 6-39. The common case of a client actively connecting to a passive server is
shown with heavy lines—solid for the client, dotted for the server. The lightface lines are unusual event
sequences.
Each line in Fig. 6-39 is marked by an event/action pair. The event can either be a user-initiated system call
(CONNECT, LISTEN, SEND, or CLOSE), a segment arrival (SYN, FIN, ACK, or RST), or, in one case, a
timeout of twice the maximum packet lifetime. The action is the sending of a control segment ( SYN, FIN, or
RST) or nothing, indicated by —. Comments are shown in parentheses. One can best understand the diagram
by first following the path of a client (the heavy solid line), then later following the path of a server (the
heavy dashed line). When an application program on the client machine issues a CONNECT request, the
local TCP entity creates a connection record, marks it as being in the SYN SENT state, and shoots off a SYN
segment. Note that many connections may be open (or being opened) at the same time on behalf of multiple
applications, so the state is per connection and recorded in the connection record. When the SYN+ACK
arrives, TCP sends the final ACK of the three-way handshake and switches into the ESTABLISHED state.
Figure 6-39. TCP connection management finite state machine. The heavy solid line is the normal path for a client.
The heavy dashed line is the normal path for a server. The light lines are unusual events. Each transition is labeled with
the event causing it and the action resulting from it, separated by a slash.
Data can now be sent and received. When an application is finished, it executes a CLOSE primitive, which
causes the local TCP entity to send a FIN segment and wait for the corresponding ACK (dashed box marked
‘‘active close’’). When the ACK arrives, a transition is made to the state FIN WAIT 2 and one direction of the
connection is closed. When the other side closes, too, a FIN comes in, which is acknowledged. Now both
sides are closed, but TCP waits a time equal to twice the maximum packet lifetime to guarantee that all
packets from the connection have died off, just in case the acknowledgement was lost. When the timer goes
off, TCP deletes the connection record. Now let us examine connection management from the server’s
viewpoint. The server does a LISTEN and settles down to see who turns up. When a SYN comes in, it is
acknowledged and the server goes to the SYN RCVD state. When the server’s SYN is itself acknowledged, the
three-way handshake is complete and the server goes to the ESTABLISHED state. Data transfer can now
occur. When the client is done transmitting its data, it does a CLOSE, which causes a FIN to arrive at the
server (dashed box marked ‘‘passive close’’). The server is then signaled. When it, too, does a CLOSE, a FIN
is sent to the client. When the client’s acknowledgement shows up, the server releases the connection and
deletes the connection record.
THE END