Networks Chapter 2
TCP/IP Fundamentals
CHAPTER OBJECTIVES
After completing this chapter, the reader should be able to:
Most of the performance issues in TCP/IP networks arise from various interactions
between the TCP engine and the surrounding communication environment. To
understand these performance issues and the techniques to address them, the
reader must be familiar with some of the basic details of TCP/IP protocols. This
chapter reviews the TCP/IP protocol fundamentals necessary for understanding
the subsequent chapters in the book. Many details, not directly referenced in the
rest of the book, are deliberately left out. For more comprehensive coverage of
TCP/IP, readers should consult books dedicated to TCP/IP protocols, such as that
by Comer [113].
2.1 TCP
TCP provides several useful services to its applications. These services are briefly
described in this section.
Streaming Service. TCP provides a streaming service to its applications.
Once a TCP connection is established between two application processes (one is
a sending process, the other is a receiving process), the sender writes a stream of
bytes (or characters) into the connection and the receiver reads these bytes out of
the connection. The stream-oriented abstraction is visible only to the applications;
the TCP layer itself operates in packet mode. The sending TCP accumulates a
certain amount of application bytes, forms a packet called a TCP segment, and sends
the segment to the receiving TCP. The receiving TCP extracts application bytes
from the segment, orders them if necessary, and delivers them as a stream of bytes
to the appropriate receiving application process. The format of a TCP segment is
explained later in the section.
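The segmentation and reordering just described can be sketched in a few lines of Python. This is an illustrative sketch only: the function names, the 10-byte segment size, and the starting sequence number 2001 are assumptions chosen for the example, and a real TCP implementation does considerably more.

```python
import random

def segment_stream(data: bytes, first_seq: int, max_segment_bytes: int):
    """Split an application byte stream into (sequence_number, payload) segments."""
    segments = []
    for offset in range(0, len(data), max_segment_bytes):
        payload = data[offset:offset + max_segment_bytes]
        # The sequence number identifies the first byte carried in the segment.
        segments.append((first_seq + offset, payload))
    return segments

def reassemble(segments):
    """Order segments by sequence number and return the original byte stream."""
    return b"".join(payload for _, payload in sorted(segments))

# Hypothetical example values: a short stream cut into 10-byte segments.
stream = b"The quick brown fox jumps over the lazy dog"
segments = segment_stream(stream, first_seq=2001, max_segment_bytes=10)
random.Random(0).shuffle(segments)   # simulate out-of-order arrival
restored = reassemble(segments)      # the receiver sees the original stream again
```

The applications on both sides only ever handle `stream` and `restored`; the segments exist purely inside the transport layer.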
[Figure: TCP header format, bits 0 to 31. Recoverable fields: acknowledgment number; header length; unused bits; flags U, A, P, R, S, F; receiver window size; options (variable).]
Source port number (16 bits). Each TCP application at the source host is uniquely
identified by the source port number. The port identification allows multiplexing
and demultiplexing multiple TCP connections over the same TCP protocol
process.
Destination port number (16 bits). It identifies a TCP application at the destination
host. When a TCP segment is received at the destination host, this port number
is used to deliver the segment data to the correct application.
Sequence number (32 bits). The 32-bit sequence number field contains the sequence
number of the first byte of data carried in the TCP segment. As an example, if
the preceding segment started with a sequence number of 2001 and contained
1460 bytes of data, then the sequence number of the next TCP segment is set
to 3461.
Acknowledgment number (32 bits). The destination uses this field to acknowledge
the correctly received data.
Header length (4 bits). This field is used to indicate the length of the TCP header in
multiples of 32-bit words. In most cases the header length of a TCP segment
is 20 bytes; however, this may vary if the options field is used. Because the
header can be of variable length, the length field also helps to identify the start
of the payload.
Reserved (6 bits). These six bits are reserved for future or experimental use.
Flags (6 bits). A TCP segment may carry several different types of protocol messages,
such as ACK, start signal of a connection, end signal of a connection,
and so on. Each bit in the flag field is used to identify a given type. Table 2.1
shows the purpose of each of the six flag bits. Multiple flag bits may be set
at the same time. For example, if an end signal is carried along with an ACK,
both the ACK and final (FIN) flags must be set in that segment.

TABLE 2.1
Flag      Description
ACK (A)   Acknowledgment field valid
FIN (F)   Final segment from sender
PSH (P)   Push operation invoked. Receiving process needs notification
RST (R)   Connection to be reset
SYN (S)   Start of a new connection
URG (U)   Urgent pointer field valid

Receiver window size (16 bits). The receiver advertises its window (available buffer
space) to the sender using this field. The receiver window is used by the sender
for the purposes of flow control.
Checksum (16 bits). The checksum field is computed over the TCP header, the TCP
payload, and the pseudoheader consisting of the source and destination IP
addresses as well as the length field of the IP header. The checksum field
protects the header and the payload of the TCP segment.
Urgent pointer (16 bits). A TCP segment may carry data that need priority treatment
(the urgent [URG] flag would be set for this segment). For example, an
URG pointer may be used to pass escape characters to cancel an operation on
a remote computer. The URG data is processed before any other data waiting
in the buffer. The 16-bit URG pointer points to the last byte of URG data
in the segment, so that the receiving TCP can easily locate the URG data for
immediate processing.
Options (variable). Options are specified in multiples of bytes. Two extra bytes
precede each option: the first byte indicates the option type, and the second
byte indicates the length of the option in bytes (including these two
preceding bytes). Examples of options are:
o Maximum Segment Size (MSS) (16 bits). This option is used by the
originating TCP during connection establishment (in the start-of-a-new-connection
[SYN] segment) to negotiate the MSS to be used for the
connection. The 16 bits used for this field limit the MSS to 64 KB.
o Timestamp (8 bytes). The timestamp option is used for more
accurate round-trip time (RTT) calculations. Two four-byte timestamp
fields are used for this option. The sending TCP fills the first field with the
current time. The receiver echoes back the timestamp value received in
the second field in an ACK segment. This allows the sender to calculate
the RTT more accurately.
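The fixed 20-byte part of the header described above can be packed and parsed with Python's standard `struct` module. The sketch below is only an illustration: the function names and example values are hypothetical, the checksum and urgent pointer are left at zero, and the bit positions (header length in the top four bits of the fifth 16-bit word, flags in its low six bits) are taken from the standard TCP header layout rather than from this chapter.

```python
import struct

TCP_HEADER = struct.Struct("!HHIIHHHH")  # 20 bytes, network byte order
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
             "PSH": 0x08, "ACK": 0x10, "URG": 0x20}

def build_header(src_port, dst_port, seq, ack, flags, window, header_words=5):
    # Header length (in 32-bit words) sits in the top 4 bits; flags in the low 6.
    offset_and_flags = (header_words << 12) | sum(FLAG_BITS[f] for f in flags)
    # Checksum and urgent pointer are left at zero in this sketch.
    return TCP_HEADER.pack(src_port, dst_port, seq, ack,
                           offset_and_flags, window, 0, 0)

def parse_header(raw: bytes):
    src, dst, seq, ack, off_flags, window, checksum, urgent = \
        TCP_HEADER.unpack(raw[:TCP_HEADER.size])
    return {
        "src_port": src, "dst_port": dst, "seq": seq, "ack": ack,
        "header_bytes": (off_flags >> 12) * 4,   # 32-bit words -> bytes
        "flags": {name for name, bit in FLAG_BITS.items() if off_flags & bit},
        "window": window, "checksum": checksum, "urgent": urgent,
    }

def next_sequence_number(seq: int, payload_bytes: int) -> int:
    # Sequence numbers are 32-bit values, so they wrap around at 2**32.
    return (seq + payload_bytes) % 2**32

# Hypothetical segment: an ACK that also carries the end-of-connection signal.
header = build_header(1234, 80, seq=2001, ack=0,
                      flags=["ACK", "FIN"], window=8192)
fields = parse_header(header)
following = next_sequence_number(2001, 1460)   # the example from the text: 3461
```

Because `header_bytes` is derived from the header length field, the same parser can locate the start of the payload even when options are present.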
[Figure 2.2: Encapsulation of a TCP segment (TCP header and TCP payload) in the payload of an IP datagram (IP header and IP payload).]
2.1.3 Encapsulation in IP
Once a TCP segment is ready for transmission, it is passed on to the IP layer. The IP
layer encapsulates the entire TCP segment (the TCP header and the TCP payload)
into the IP datagram payload. Figure 2.2 illustrates the encapsulation of a TCP
segment in an IP datagram. Given this encapsulation method, the first 20 bytes of an
IP datagram payload contain all fields of a standard (no options used) TCP header.
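As a rough illustration of this layering, the following Python sketch peels a TCP segment out of a raw IP datagram. It assumes the standard IPv4 layout, in which the low four bits of the first byte give the IP header length in 32-bit words; the function name and the toy datagram (an otherwise zeroed IP header) are hypothetical.

```python
import struct

def split_ip_datagram(datagram: bytes):
    """Return (ip_header, tcp_header, tcp_payload) from a raw IPv4 datagram."""
    ip_header_len = (datagram[0] & 0x0F) * 4     # IP header length in bytes
    ip_payload = datagram[ip_header_len:]        # the encapsulated TCP segment
    tcp_header_len = (ip_payload[12] >> 4) * 4   # TCP header length in bytes
    return (datagram[:ip_header_len],
            ip_payload[:tcp_header_len],
            ip_payload[tcp_header_len:])

# Toy datagram: 20-byte IP header (version 4, header length 5 words, rest zero),
# followed by a 20-byte TCP header with no options and five bytes of data.
ip_header = bytes([0x45]) + bytes(19)
tcp_header = struct.pack("!HHIIHHHH", 1234, 80, 2001, 0,
                         (5 << 12) | 0x10, 8192, 0, 0)
datagram = ip_header + tcp_header + b"hello"

ip_part, tcp_part, data = split_ip_datagram(datagram)
```

With no TCP options in use, `tcp_part` is exactly the first 20 bytes of the IP payload, as stated above.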
Delayed ACK. The receiving TCP has the choice of either generating an
ACK as soon as it receives a segment or delaying the ACK for a while. By delaying
the ACK, the receiver may be able to acknowledge two segments at a time and reduce
ACK traffic; however, delaying an ACK for too long may cause a timeout and
retransmission at the sender. A TCP receiver should not delay ACKs more than 500 ms.
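One way to picture the delayed ACK policy is the small simulation below. It is a simplified sketch under stated assumptions: segments arrive in order and have equal size, the receiver acknowledges every second segment immediately, and a lone segment is acknowledged when a 500 ms timer expires. The function and variable names are hypothetical.

```python
MAX_ACK_DELAY = 0.5  # seconds; the 500 ms bound mentioned above

def delayed_acks(arrivals, segment_bytes, first_seq=1):
    """arrivals: sorted arrival times (s) of in-order, equal-sized segments.
    Returns a list of (time, ack_number) pairs generated by the receiver."""
    acks = []
    next_expected = first_seq
    pending_since = None   # arrival time of a segment not yet acknowledged
    for t in arrivals:
        # If an earlier segment has waited too long, its timer fires first.
        if pending_since is not None and t - pending_since > MAX_ACK_DELAY:
            acks.append((pending_since + MAX_ACK_DELAY, next_expected))
            pending_since = None
        next_expected += segment_bytes
        if pending_since is None:
            pending_since = t                 # hold the ACK, hoping for a second segment
        else:
            acks.append((t, next_expected))   # second segment: acknowledge both at once
            pending_since = None
    if pending_since is not None:
        acks.append((pending_since + MAX_ACK_DELAY, next_expected))
    return acks

# Three 100-byte segments: two close together, one arriving much later.
schedule = delayed_acks([0.0, 0.1, 1.0], segment_bytes=100)
```

Here the first two segments share a single ACK, while the third is acknowledged only when the delay timer runs out.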
Duplicate ACK. If a segment gets lost in the network, but the following
segment arrives safely at the receiver, it is possible for a receiving TCP to receive
data with a sequence number beyond the expected range. In that case, the receiving
TCP buffers the incoming bytes and regenerates the ACK for the bytes received so
far in sequence. The regeneration of the same ACK number causes the duplicate
ACK phenomenon at the sender, that is, the sender can receive the same ACK more
than once. In the original TCP, the sender simply ignores the duplicate ACK. As we
will see in a later chapter, some later variants of TCP take special actions based on
duplicate ACKs.
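The receiver behavior that produces duplicate ACKs can be sketched as follows. This is a minimal model, assuming that each ACK carries the sequence number of the next byte expected in sequence; the function name and the 100-byte segments are hypothetical.

```python
def receive(segments, first_seq):
    """segments: (seq, length) pairs in arrival order.
    Returns the ACK number sent in response to each arriving segment."""
    next_expected = first_seq
    buffered = {}   # out-of-order segments held by the receiver: seq -> length
    acks = []
    for seq, length in segments:
        if seq == next_expected:
            next_expected += length
            # Deliver any buffered segments that are now in sequence.
            while next_expected in buffered:
                next_expected += buffered.pop(next_expected)
        elif seq > next_expected:
            buffered[seq] = length   # a gap exists: hold the data for later
        # Segments below next_expected are old data; they are only re-ACKed.
        acks.append(next_expected)   # ACK for the bytes received so far in sequence
    return acks

# The segment starting at byte 101 is delayed and arrives last.
arrival_order = [(1, 100), (201, 100), (301, 100), (101, 100)]
ack_numbers = receive(arrival_order, first_seq=1)
```

The ACK number 101 is repeated for every segment that arrives while the gap is open, and it jumps to 401 as soon as the missing segment fills the gap.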
1. The client sends a SYN segment (SYN-bit set in the header) to the server with
an initial sequence number (e.g., SeqNo = 88) that it is going to use for this
connection.
2. The server sends a segment that has both SYN and ACK bits set in the
flag (SYN + ACK, AckNo = 89, SeqNo = 155). The ACK number (AckNo)
indicates that the server has received bytes up to 88 correctly and the next
byte it expects has sequence number 89. The sequence number (SeqNo)
tells the client that the server will use 155 as the starting sequence number
for its data. The client and the server may use different initial sequence
numbers.
[Figure: connection establishment between client and server. Recoverable labels: SYN, SeqNo = 88; ACK, AckNo = 156.]
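The numbers exchanged during connection establishment follow a simple pattern, sketched below in Python. The function name and the dictionary representation are hypothetical, sequence number wrap-around is ignored, and the final ACK (AckNo = 156) is taken from the figure labels; the sequence number of that final ACK follows standard TCP behavior rather than this chapter.

```python
def three_way_handshake(client_isn: int, server_isn: int):
    """Return the three handshake segments as dictionaries."""
    syn = {"from": "client", "flags": {"SYN"}, "seq": client_isn}
    # The SYN consumes one sequence number, so the server expects client_isn + 1.
    syn_ack = {"from": "server", "flags": {"SYN", "ACK"},
               "seq": server_isn, "ack": client_isn + 1}
    # Likewise, the client acknowledges the server's SYN with server_isn + 1.
    ack = {"from": "client", "flags": {"ACK"},
           "seq": client_isn + 1, "ack": server_isn + 1}
    return [syn, syn_ack, ack]

# The initial sequence numbers used in the example above.
handshake = three_way_handshake(client_isn=88, server_isn=155)
```

Each side's acknowledgment number is the other side's initial sequence number plus one, which is why different initial values on the two sides cause no confusion.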
1. The client sends a FIN segment (FIN-bit set in the header) to the server to
indicate that it wishes to terminate the connection.
2. The server sends an ACK to confirm the receipt of the FIN segment. At this
stage the TCP client stops communication in the client-server direction. The
server, however, may need to continue the communication in the server-client
direction (e.g., part of a file is yet to be transmitted).
3. When the server is ready to close the connection, it sends a FIN segment to the
client. Because the server is not necessarily ready to terminate the server-client
communication when it receives a FIN segment from the client, steps 2 and 3
may not be combined.
4. The client acknowledges the receipt of the FIN segment with an ACK segment.
Now the connection is terminated from both ends.
Each handshake introduces some delays (the SYN or FIN segments need
to travel to the other ends). The handshaking is the major source of delay in
establishing and terminating TCP connections for long-distance communications
(e.g., in satellite TCP/IP networks).
[Figure: segment exchange between client and server.]
o Step 2. TCP has sent bytes 6, 7, and 8, and it is waiting for ACK for all segments
in its current window. A window full of data is in transit; no more data can be
sent at this stage.
o Step 3. ACK for bytes 3 and 4 has been received. At this stage, the sliding
window slides by two to the right, making bytes 9 and 10 eligible to be sent.
o Step 4. TCP sends bytes 9 and 10 and again starts waiting for ACK.
In summary, the right-hand side of the window slides when a byte is sent,
whereas the left-hand side of the window slides when an ACK is received. The
maximum number of bytes waiting for ACK is determined by the window size.
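The steps above can be replayed with a small Python model of the sender's window. This is a sketch under assumptions: the class and method names are hypothetical, and the window of six bytes starting at byte 3 is inferred from the step descriptions (bytes 3 through 8 outstanding before the ACK arrives) rather than stated in the text.

```python
class SlidingWindow:
    def __init__(self, first_byte: int, window_size: int):
        self.left = first_byte           # oldest byte not yet acknowledged
        self.next_to_send = first_byte   # first byte that has not been sent
        self.size = window_size          # maximum bytes waiting for ACK

    def sendable(self):
        """Bytes that may be sent now without exceeding the window."""
        return list(range(self.next_to_send, self.left + self.size))

    def send(self, count: int):
        count = min(count, len(self.sendable()))
        sent = list(range(self.next_to_send, self.next_to_send + count))
        self.next_to_send += count       # sending advances the sent-data edge
        return sent

    def ack(self, acked_through: int):
        """ACK covering all bytes up to and including acked_through."""
        self.left = max(self.left, acked_through + 1)   # the window slides right

window = SlidingWindow(first_byte=3, window_size=6)
window.send(3)                       # bytes 3, 4, 5
step2 = window.send(3)               # Step 2: bytes 6, 7, 8; window is now full
blocked = window.sendable()          # nothing more can be sent
window.ack(4)                        # Step 3: ACK for bytes 3 and 4
step3_eligible = window.sendable()   # bytes 9 and 10 become eligible
step4 = window.send(2)               # Step 4: bytes 9 and 10 are sent
```

At no point does the number of sent but unacknowledged bytes exceed `window_size`, which is the flow-control property the window enforces.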
[Figure: sliding window positions at Step 2, Step 3, and Step 4 along a time axis. Legend: waiting for acknowledgment; cannot be sent.]
Slow Start. The principle behind the slow-start mechanism is to start with a
small window size and increase it "slowly" (we will later see that it is not so slow)
when ACKs arrive. This has the effect of probing the available buffer space in the
network. The actual window increase mechanism is as follows.
[Figure: segment exchange between sender and receiver.]
Congestion Avoidance. We have seen in the example of Figure 2.6 that after
each RTT, the window size practically gets doubled, allowing twice as many segments [...]
[Figure: plot against round-trip time (RTT), marking the slow start threshold and the congestion avoidance region.]
the algorithm that controls this variable. With multiplicative decrease, TCP sets
ssthresh to half of the current CongestionWindow each time a timeout occurs, down
to a minimum of two segments (at timeout, CongestionWindow itself is set to one
segment to force a slow start). Therefore, if there are consecutive timeouts (severe
network congestion), multiplicative decrease reduces the sending rate exponentially.
The additive increase of the CongestionWindow during the congestion avoidance
phase and the multiplicative decrease of ssthresh are together often referred to as
the additive increase, multiplicative decrease (AIMD) algorithm.
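The interplay of slow start, congestion avoidance, and multiplicative decrease can be traced with a short per-RTT simulation. The sketch below rests on several assumptions that are not spelled out in the text: the window is counted in whole segments, it doubles once per RTT during slow start, growth is capped at ssthresh when the threshold is reached, and congestion avoidance adds one segment per RTT. The function name and the initial threshold are hypothetical.

```python
def simulate_cwnd(events, initial_ssthresh=16):
    """events: one entry per RTT, either "ack" (window acknowledged) or "timeout".
    Returns the (congestion_window, ssthresh) history in segments."""
    cwnd, ssthresh = 1, initial_ssthresh
    history = [(cwnd, ssthresh)]
    for event in events:
        if event == "timeout":
            ssthresh = max(cwnd // 2, 2)     # multiplicative decrease, floor of two
            cwnd = 1                         # force a new slow start
        elif cwnd < ssthresh:
            cwnd = min(cwnd * 2, ssthresh)   # slow start: doubles each RTT
        else:
            cwnd += 1                        # congestion avoidance: additive increase
        history.append((cwnd, ssthresh))
    return history

# Five good RTTs, one timeout, then four more good RTTs.
trace = simulate_cwnd(["ack"] * 5 + ["timeout"] + ["ack"] * 4,
                      initial_ssthresh=8)
```

In this trace the window climbs 1, 2, 4, 8, then 9 and 10 in congestion avoidance; the timeout halves the threshold to 5 and restarts the window at one segment.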
2.2 UDP
In addition to TCP, the TCP/IP protocol stack provides another transport protocol
called User Datagram Protocol (UDP). In this section, we present an overview
of UDP.
2.2.1 UDP Services
Unlike TCP, UDP provides a much simpler, bare minimum service to the applica-
tions. All UDP provides is a mechanism for the application to send a short message
to a given destination. UDP is connectionless, unreliable, and not stream-oriented
(it is datagram-oriented). With the datagram-oriented service, UDP cannot accept
a stream of data from the application and segment them for transmission. The
application is supposed to supply segmented data to UDP for transportation as an
independent datagram.
Because UDP is connectionless, it does not implement connection establishment
and connection termination. Lack of reliability means that there are no ACK
and retransmission mechanisms and no sequence numbers to identify each datagram;
therefore, a UDP sender will not know if a datagram was lost on the way.
There is no flow control either, meaning that a UDP receiver may experience buffer
overflow. Table 2.2 summarizes the key differences between TCP and UDP.
One might be wondering about the practical uses of UDP given its simplicity.
The simplicity of UDP actually turns out to be its strength for many applications
that do not require the heavyweight services of TCP. Some of the traditional and
emerging uses of UDP are:
[Figure: UDP header format. Recoverable fields: length, checksum.]
o Source and destination port numbers (16 bits each). UDP provides port
numbers to let multiple application processes share the same UDP services on
the same host. With 16 bits, there are 65,536 possible port numbers (0 through 65,535).
o Length (16 bits). The length field represents the total length of the UDP
datagram (including header) in bytes.
o Checksum (16 bits). UDP provides a checksum field to check the integrity of
its data. A packet with incorrect checksum is simply discarded at the receiver,
with no further actions taken.
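Because the UDP header consists of just four 16-bit fields, it is easy to build and parse with Python's `struct` module. The following is a minimal sketch: the function names and port values are hypothetical, the field order (source port, destination port, length, checksum) follows the standard UDP header, and the checksum is simply left at zero rather than computed.

```python
import struct

UDP_HEADER = struct.Struct("!HHHH")  # source port, destination port, length, checksum

def build_udp_datagram(src_port: int, dst_port: int, payload: bytes) -> bytes:
    length = UDP_HEADER.size + len(payload)   # header (8 bytes) plus data
    # The checksum is left at zero here; computing it is outside this sketch.
    return UDP_HEADER.pack(src_port, dst_port, length, 0) + payload

def parse_udp_datagram(datagram: bytes):
    src_port, dst_port, length, checksum = UDP_HEADER.unpack(datagram[:8])
    # The length field covers header and data, so it marks the end of the payload.
    return src_port, dst_port, length, checksum, datagram[8:length]

datagram = build_udp_datagram(5000, 53, b"query")
parsed = parse_udp_datagram(datagram)
```

The eight header bytes produced here are what appears at the start of the IP payload when the datagram is handed to the IP layer.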
2.2.3 Encapsulation in IP
Like TCP, UDP datagrams also travel in the payload of IP datagrams. The entire
UDP datagram, the header and the payload, is inserted in the IP payload; thus, the
first eight bytes of the IP payload contain the UDP header.
2.3 IP
IP is the network layer protocol used by both TCP and UDP. In this section, we
provide an overview of the IP protocol.
2.3.1 IP Services
The IP protocol provides a connectionless unreliable datagram model of communication.
IP encapsulates the higher layer protocol units, such as TCP segments
and UDP datagrams, within the IP datagram payload, creates the IP header, and
forwards the complete IP datagram to the next hop router toward the destination.
Each intermediate router processes the IP datagram header and forwards it to the
next router along the path until it reaches the destination.
The connectionless model used by IP in the Internet has several advantages.
First of all, there is no need for explicit connection establishment and termination.
This simplifies the router design as the routers do not need to maintain any
connection-related information; therefore, the connectionless model scales well for
a large number of hosts in the Internet. Routers also have the flexibility of choosing
an appropriate path for each IP datagram based on the congestion level or link
availability in the Internet.
The connectionless model of IP, however, has its price. IP cannot guarantee
the delivery of data to the destination. The service IP provides is often referred to
as the "best-effort" service. This means that routers will try their best to deliver
a datagram, but if there is congestion and routers cannot process datagrams fast
enough, they may drop them. IP does not implement any retransmission of lost
datagrams. There is also the possibility of datagrams arriving out of order at the
destination, as routers may send different datagrams via different routes. Higher
layer protocols, such as TCP, are used to build a reliable service on top of the
unreliable IP.
[Figure: an IP datagram (IP header and IP payload) carried as the frame payload between a frame header and a frame trailer.]
[Figure: IP header format, bits 0 to 31. Recoverable fields: version, header length, TOS, length, 16-bit identifier, flags, fragment offset, TTL, protocol, header checksum, 32-bit source IP address.]
Flags and fragment offset. These two fields are used for fragmentation and
reassembly. The flags field consists of three bits. The Don't Fragment (DF) bit is
set by a source to indicate that this datagram should not be fragmented. The More
Fragments (MF) bit is set in every fragment except the last one, so it identifies
the last fragment of the datagram and facilitates reassembly at the destination.
The third bit is currently unused. The Fragment Offset field indicates the exact
position of the fragment in the original datagram.
Options. The option field can extend the IP header. As the name suggests, this field
is not compulsory. This field can be used to support options such as security,
source routing, route recording, and timestamping. This field is of variable
length as the number of options used in a datagram is not fixed.
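To make the roles of the MF bit and the fragment offset concrete, the sketch below splits a payload into fragments and puts them back together. It is a simplified illustration: headers are represented as dictionaries rather than real IP headers, the function names are hypothetical, and it relies on the standard IPv4 convention (not stated above) that the offset is expressed in units of 8 bytes.

```python
def fragment(payload: bytes, max_fragment_payload: int):
    """Split an IP payload into fragments of at most max_fragment_payload bytes.
    Each fragment records its MF bit, its offset (in 8-byte units), and its data."""
    # Every fragment except the last must carry a multiple of 8 bytes.
    step = (max_fragment_payload // 8) * 8
    if step == 0:
        raise ValueError("fragments must be able to carry at least 8 bytes")
    fragments = []
    for start in range(0, len(payload), step):
        more = start + step < len(payload)   # MF = 1 on all but the last fragment
        fragments.append({"offset": start // 8, "MF": int(more),
                          "data": payload[start:start + step]})
    return fragments

def reassemble(fragments):
    """Rebuild the payload at the destination, whatever the arrival order."""
    ordered = sorted(fragments, key=lambda f: f["offset"])
    if ordered[-1]["MF"] != 0:
        raise ValueError("last fragment has not arrived")
    return b"".join(f["data"] for f in ordered)

payload = bytes(range(100))
pieces = fragment(payload, max_fragment_payload=44)   # 40 usable bytes per fragment
rebuilt = reassemble(list(reversed(pieces)))          # fragments arrive out of order
```

The offsets (0, 5, and 10 here) tell the destination where each piece belongs, and the fragment with MF = 0 tells it where the datagram ends.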
5. What is the purpose of TCP timeouts, and why is the timeout duration
important?
6. When a timeout occurs, TCP sets its slow-start threshold to half its current
congestion window size (multiplicative decrease). Can you think of the conse-
quences if multiplicative decrease were replaced by additive decrease (say, for
example, that current congestion window is decreased by one)?
7. In most cases, TCP retransmission timer expires whenever a router drops a
packet due to buffer overflow (the packet never reaches receiver). Can you
think of situations when RTO occurs even though packets reach receiver?
8. Fragmentation can be done at the source or any intermediate routers, but
reassembly is done only at the destination. Why do intermediate routers not
reassemble IP fragments?
9. Can you think of any disadvantages of IP fragmentation?
10. What role can TCP play to avoid IP fragmentation?
Increased load on memory. As each PC loads four protocol stacks, very little memory
is left for running applications.
Reduced performance. Multiple protocols draw more CPU cycles, causing adverse
effect on performance.
Multiple address management. Different protocols use different addresses for identifying
and communicating between computers. When multiple protocol stacks
are loaded on a PC, multiple addresses have to be assigned and managed for
each PC, making the address management much harder. Communication
errors caused by incorrect address assignment become difficult to isolate and
correct.
Multiple routing systems. Different stacks use different routing protocols and systems.
With multiple stacks, routers must maintain multiple routing systems.
These multiprotocol routers are very costly to purchase and maintain.
Because of the above costs associated with multiple stacks, WCORP has
decided to adopt a single protocol strategy to meet its interconnecting requirements.
As discussed in the previous case study (see Chapter 1), the four major
stacks currently in use are SNA, IPX/SPX, DECnet, and TCP/IP. To adopt a single
protocol strategy, WCORP must select one of these four stacks. As a first step
toward making this selection, the network administrator identifies six important
communication requirements to be fulfilled by the single protocol stack: native
connectivity to the public Internet, nonproprietary ownership, reliable communication,
connectionless communication, client-server communication, and routing between
different subnets. Table 2.3 shows the comparison of different stacks against these
six requirements. After careful consideration, WCORP has finally decided to adopt
TCP/IP as the single stack to support interconnectivity. The driving factors for this
selection were the open standard of TCP/IP (not owned by any specific vendor) and
seamless connectivity to the public Internet.

TABLE 2.3: Comparison of different protocol stacks.
Protocol   Internet                                                   Client-
Stack      Connectivity   Ownership   Reliable   Connectionless   Server    Routing
SNA        Difficult      IBM         Yes        Yes              Yes       Yes
IPX/SPX    Difficult      Novell      Yes        Yes              Yes       Yes
DECnet     Difficult      Digital     Yes        Yes              Yes       Yes
TCP/IP     Easy           Open        Yes        Yes              Yes       Yes