
6Transport-Part2

The document outlines the concepts and functionalities of the TCP transport layer, including reliability, flow control, connection management, and congestion control. It details TCP's mechanisms such as checksums, sequence numbers, acknowledgments, and the structure of TCP headers. Additionally, it discusses the importance of round trip time and timeout settings for efficient data transmission.

COMP 3331/9331: Computer Networks and Applications
Week 5
Transport Layer (Continued)
Reading Guide: Chapter 3, Sections: 3.5 – 3.7

1. TCP Reliability
2. TCP Flow Control
3. TCP Connection management
4. TCP Congestion Control
Transport Layer Outline
3.1 transport-layer services
3.2 multiplexing and demultiplexing
3.3 connectionless transport: UDP
3.4 principles of reliable data transfer
3.5 connection-oriented transport: TCP
§ segment structure
§ reliable data transfer
§ flow control
§ connection management
3.6 principles of congestion control
3.7 TCP congestion control
Recall: Components of a solution for reliable
transport
v Checksums (for error detection)
v Timers (for loss detection)
v Acknowledgments
§ Cumulative
§ Selective
v Sequence numbers (duplicates, windows)
v Sliding Windows (for efficiency)
§ Go-Back-N (GBN)
§ Selective Repeat (SR)
3
What does TCP do?
Many of our previous ideas, but some key differences
vChecksum

4
TCP Header

Source port | Destination port
Sequence number
Acknowledgment
HdrLen | 0 | Flags | Receive window
Checksum | Urgent pointer
Options (variable)
Data

Checksum: computed over header and data (SAME AS UDP)

5
What does TCP do?

Many of our previous ideas, but some key


differences
vChecksum
vSequence numbers are byte offsets

6
TCP “Stream of Bytes” Service ..
[Figure: the application at Host A writes Byte 0, Byte 1, Byte 2, Byte 3, …, Byte 80; the application at Host B reads the same Byte 0, Byte 1, Byte 2, Byte 3, …, Byte 80.]
7
.. Provided Using TCP “Segments”
[Figure: Host A's Byte 0 … Byte 80 are carried as TCP Data in segments to Host B, which delivers Byte 0 … Byte 80.]

Segment sent when:
1. Segment full (Max Segment Size),
2. Not full, but instructed by the Application, e.g., 1 byte in Telnet

8
TCP Maximum Segment Size
[Figure: an IP packet of at most MTU bytes consists of IP Hdr + IP Data; the IP Data consists of TCP Hdr + TCP Data (segment).]

v IP packet
§ No bigger than Maximum Transmission Unit (MTU) of link layer
§ E.g., up to 1500 bytes with Ethernet
v TCP packet
§ IP packet with a TCP header and data inside
§ TCP header ≥ 20 bytes long
v TCP segment
§ No more than Maximum Segment Size (MSS) bytes
§ E.g., up to 1460 consecutive bytes from the stream
§ MSS = MTU – 20 (minimum IP header) – 20 (minimum TCP header)
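
The MSS arithmetic above can be sketched in a few lines of Python. This is a minimal illustration only; the names `max_segment_size` and `split_into_segments` are hypothetical helpers, and both headers are assumed to carry no options.

```python
IP_HEADER_MIN = 20   # bytes, IP header without options (assumption)
TCP_HEADER_MIN = 20  # bytes, TCP header without options (assumption)


def max_segment_size(mtu: int) -> int:
    """Largest TCP payload that fits in one IP packet of `mtu` bytes."""
    return mtu - IP_HEADER_MIN - TCP_HEADER_MIN


def split_into_segments(stream: bytes, mss: int) -> list[bytes]:
    """Cut a byte stream into consecutive chunks of at most `mss` bytes."""
    return [stream[i:i + mss] for i in range(0, len(stream), mss)]


if __name__ == "__main__":
    mss = max_segment_size(1500)  # Ethernet MTU from the slide
    sizes = [len(s) for s in split_into_segments(b"x" * 3000, mss)]
    print(mss, sizes)
```

With the Ethernet MTU of 1500 bytes this gives the 1460-byte MSS quoted on the slide; a 3000-byte stream would then need three segments.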
9
Sequence Numbers
[Figure: Host A's byte stream starts at the ISN (initial sequence number); the segment shown begins k bytes into the stream.]

Sequence number = 1st byte in segment = ISN + k

Sequence numbers:
• byte stream “number” of first byte in segment’s data

10
Sequence & Ack Numbers
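[Figure: Host A sends a segment (TCP HDR + TCP Data) that begins k bytes after the ISN (initial sequence number); Host B replies with its own segment (TCP HDR + TCP Data) carrying the ACK.]

Sequence number = 1st byte in segment = ISN + k
ACK sequence number = next expected byte = seqno + length(data)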

11
TCP Header

Source port | Destination port
Sequence number
Acknowledgment
HdrLen | 0 | Flags | Receive window
Checksum | Urgent pointer
Options (variable)
Data

Acknowledgment gives seqno just beyond highest seqno received in order (“What Byte is Next”)

12
What does TCP do?

Most of our previous tricks, but a few differences


vChecksum
vSequence numbers are byte offsets
vReceiver sends cumulative acknowledgements (like GBN)

14
ACKing and Sequence Numbers

15
An Example
Host A (ISN=100) sends:
Seq=100, Data=50 (Seq 100 to Seq 149)
Seq=150, Data=50
Seq=200, Data=50
Seq=250, Data=50

Host B replies:
ACK=150, Received 100-149
ACK=200, Received 150-199
ACK=250, Received 200-249
ACK=300, Received 250-299

Host A's next segment: Seq=???, Data=50
Answer: Seq = 300 (new segment)

Since TCP uses cumulative ACKs, the receipt of ACK 300 before a timeout (for seg with sequence number 200) implies the receiver has received all 4 segments sent above
Another Example
Host A (ISN=100) sends:
Seq=100, Data=50
Seq=150, Data=50
Seq=200, Data=50
Seq=250, Data=50

Host B replies:
ACK=150, Received 100-149
ACK=???, Received 200-249
ACK=???, Received 250-299

Both ACKs will be 150 as the receiver has received everything up to 149 in correct sequence. The two out of order segments are buffered.
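
The cumulative-ACK behaviour in the two examples above can be reproduced with a small receiver model. This is only a sketch under simplifying assumptions: the class name `CumulativeAckReceiver` is hypothetical, segments are assumed never to overlap partially, and no data is actually stored (only lengths).

```python
class CumulativeAckReceiver:
    """Toy receiver that buffers out-of-order segments and ACKs cumulatively."""

    def __init__(self, isn: int):
        self.next_expected = isn  # sequence number of the next in-order byte
        self.buffered = {}        # seq -> length of out-of-order segments

    def receive(self, seq: int, length: int) -> int:
        """Process one segment and return the ACK number sent in reply."""
        if seq == self.next_expected:
            self.next_expected += length
            # Drain any buffered segments that have now become in-order.
            while self.next_expected in self.buffered:
                self.next_expected += self.buffered.pop(self.next_expected)
        elif seq > self.next_expected:
            self.buffered[seq] = length  # gap detected: hold the segment
        # seq < next_expected is a duplicate; it is simply re-ACKed.
        return self.next_expected


if __name__ == "__main__":
    rx = CumulativeAckReceiver(isn=100)
    # Segment with Seq=150 is missing, as in the second example.
    print([rx.receive(s, 50) for s in (100, 200, 250)])
    # The missing segment finally arrives and fills the gap.
    print(rx.receive(150, 50))
```

Feeding the receiver segments 100, 200 and 250 yields the repeated ACK of 150 from the slide; once the missing segment arrives, the ACK jumps past all buffered data.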

17
Piggybacking
v So far, we’ve assumed distinct “sender” and “receiver” roles
v Usually both sides of a connection (i.e. the application processes) send some data
– request/response is a common pattern
[Figure: Client/Server timelines, Without Piggybacking vs. With Piggybacking]

18
Example

Note: Connection establishment not shown. Alice’s end point selects the initial
sequence number as 0 while Bob’s end point selects the initial sequence number as 10

19
Another Example

Note: Connection establishment not shown. Alice’s end point selects the initial
sequence number as 0 while Bob’s end point selects the initial sequence number as 10

HTTP response split into 3 segments (MSS = 1500 bytes)


Quiz
1. First segment: Seq = 101, 2 KBytes of data
2. Reply: Seq = 1024, ACK = ?, 1 KByte of data
   Answer: ACK = 101 + 2048 = 2149
3. Next segment: Seq = ?, ACK = ?, 2 KBytes of data
   Answer: Seq = 2149, ACK = 1024 + 1024 = 2048
TCP seq. numbers, ACKs
Host A Host B

Seq=42, ACK = 79, Data=571 bytes

Seq=c, ACK =d, Data=435 bytes


Seq = a, ACK=b, data = 0 byte

Seq = e, ACK=f, data = 0 byte

Seq=g, ACK =h

22
What does TCP do?

Most of our previous tricks, but a few differences


vChecksum
vSequence numbers are byte offsets
vReceiver sends cumulative acknowledgements (like GBN)
vReceivers can buffer out-of-sequence packets (like SR)

23
Loss with cumulative ACKs

v Sender sends packets with 100 bytes and sequence numbers:


§ 100, 200, 300, 400, 500, 600, 700, 800, 900, …

v Assume the fifth packet (seq. no. 500) is lost, but no others

v 6th packet onwards are buffered

v Stream of ACKs will be:


§ 200, 300, 400, 500, 500, 500, 500,…
24
What does TCP do?

Most of our previous tricks, but a few differences


vChecksum
vSequence numbers are byte offsets
vReceiver sends cumulative acknowledgements (like GBN)
vReceivers do not drop out-of-sequence packets (like SR)
vSender maintains a single retransmission timer (like GBN) and retransmits
on timeout (how much?)

25
TCP round trip time, timeout

26
TCP round trip time, timeout
Q: how to set TCP timeout value?
§ longer than RTT, but RTT varies!
§ too short: premature timeout, unnecessary retransmissions
§ too long: slow reaction to segment loss

Q: how to estimate RTT?
§ SampleRTT: measured time from segment transmission until ACK receipt
• ignore retransmissions
§ SampleRTT will vary, want estimated RTT “smoother”
• average several recent measurements, not just current SampleRTT
27
TCP round trip time, timeout
EstimatedRTT = (1-α)*EstimatedRTT + α*SampleRTT
§ exponential weighted moving average (EWMA)
§ influence of past sample decreases exponentially fast
§ typical value: α = 0.125

[Figure: RTT from gaia.cs.umass.edu to fantasia.eurecom.fr; RTT (milliseconds) vs. time (seconds), showing SampleRTT and EstimatedRTT]

28
TCP round trip time, timeout
§ timeout interval: EstimatedRTT plus “safety margin”
• large variation in EstimatedRTT: want a larger safety margin

TimeoutInterval = EstimatedRTT + 4*DevRTT
(EstimatedRTT is the estimated RTT; 4*DevRTT is the “safety margin”)

§ DevRTT: EWMA of SampleRTT deviation from EstimatedRTT:

DevRTT = (1-β)*DevRTT + β*|SampleRTT-EstimatedRTT|
(typically, β = 0.25)
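
The two EWMA updates and the timeout formula fit naturally into one small class. The sketch below is illustrative rather than definitive: `RttEstimator` is a hypothetical name, the starting values are supplied by the caller, and the deviation is assumed to be measured against the EstimatedRTT held before the new sample is folded in (the order RFC 6298 uses; the slides do not fix an order).

```python
class RttEstimator:
    """EWMA-based RTT estimator following the slide formulas."""

    def __init__(self, estimated_rtt: float, dev_rtt: float,
                 alpha: float = 0.125, beta: float = 0.25):
        self.estimated_rtt = estimated_rtt
        self.dev_rtt = dev_rtt
        self.alpha = alpha  # weight of the newest sample in EstimatedRTT
        self.beta = beta    # weight of the newest deviation in DevRTT

    def update(self, sample_rtt: float) -> float:
        """Fold in one SampleRTT and return the new TimeoutInterval."""
        # Assumption: deviation uses the EstimatedRTT from before this sample.
        deviation = abs(sample_rtt - self.estimated_rtt)
        self.dev_rtt = (1 - self.beta) * self.dev_rtt + self.beta * deviation
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)
        return self.timeout_interval()

    def timeout_interval(self) -> float:
        """EstimatedRTT plus the 4*DevRTT safety margin."""
        return self.estimated_rtt + 4 * self.dev_rtt


if __name__ == "__main__":
    est = RttEstimator(estimated_rtt=100.0, dev_rtt=5.0)  # values in ms
    print(est.timeout_interval())
    print(est.update(108.0), est.estimated_rtt, est.dev_rtt)
```

Samples taken from retransmitted segments should not be passed to `update`, in line with the "ignore retransmissions" rule.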

Practice Problem:
http://wps.pearsoned.com/ecs_kurose_compnetw_6/216/55463/14198700.cw/index.html
29
TCP round trip time, timeout
[Figure: RTT over time together with EstimatedRTT, DevRTT, and TimeoutInterval = EstimatedRTT + 4*DevRTT (estimated RTT plus “safety margin”). Credits: Prof David Wetherall, UoW]


Why exclude retransmissions in RTT computation?

v How do we differentiate between the real ACK, and ACK of the retransmitted packet?
v Sender cannot differentiate between the two scenarios shown below

[Figure: two Sender/Receiver timelines, each with an Original Transmission, a Retransmission, and a single ACK; the SampleRTT interval is drawn from a different transmission in each scenario.]

31
TCP Sender (simplified)
PUTTING IT TOGETHER

event: data received from application
§ create segment with seq #
§ seq # is byte-stream number of first data byte in segment
§ start timer if not already running
• think of timer as for oldest unACKed segment
• expiration interval: TimeOutInterval

event: timeout
§ retransmit segment that caused timeout
§ restart timer

event: ACK received
§ if ACK acknowledges previously unACKed segments
• update what is known to be ACKed
• start timer if there are still unACKed segments

32
TCP ACK generation [RFC 1122, RFC 2581]
Note: You may neglect delayed ACKs in the exam unless explicitly told to consider it

event at receiver: arrival of in-order segment with expected seq #. All data up to expected seq # already ACKed
TCP receiver action: delayed ACK. Wait up to 500ms for next segment. If no next segment, send ACK

event at receiver: arrival of in-order segment with expected seq #. One other segment has ACK pending
TCP receiver action: immediately send single cumulative ACK, ACKing both in-order segments

event at receiver: arrival of out-of-order segment with higher-than-expected seq #. Gap detected
TCP receiver action: immediately send duplicate ACK, indicating seq # of next expected byte

event at receiver: arrival of segment that partially or completely fills gap
TCP receiver action: immediately send ACK, provided that segment starts at lower end of gap

33
TCP: retransmission scenarios
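lost ACK scenario (Host A, Host B; SendBase=92):
1. A sends Seq=92, 8 bytes of data
2. B replies ACK=100, which is lost (X)
3. A's timeout expires; A retransmits Seq=92, 8 bytes of data
4. B replies ACK=100; SendBase=100

premature timeout scenario (Host A, Host B; SendBase=92):
1. A sends Seq=92, 8 bytes of data, then Seq=100, 20 bytes of data
2. B replies ACK=100 and ACK=120
3. A's timeout expires before the ACKs arrive; A retransmits Seq=92, 8 bytes of data
4. The ACKs arrive: SendBase=100, then SendBase=120
5. B replies to the retransmission with ACK=120; SendBase=120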


34
TCP: retransmission scenarios
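cumulative ACK, first scenario (Host A, Host B):
1. A sends Seq=92, 8 bytes of data, then Seq=100, 20 bytes of data
2. B replies ACK=100, which is lost (X), and ACK=120, which arrives before the timeout
3. A sends Seq=120, 15 bytes of data (no retransmission needed)

cumulative ACK, second scenario (Host A, Host B):
1. A sends Seq=92, 8 bytes of data, which is lost (X), then Seq=100, 20 bytes of data
2. B replies ACK=?
   Answer: ACK = 92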


35
What does TCP do?

Most of our previous tricks, but a few differences


vChecksum
vSequence numbers are byte offsets
vReceiver sends cumulative acknowledgements (like GBN)
vReceivers may not drop out-of-sequence packets (like SR)
vSender maintains a single retransmission timer (like GBN) and retransmits
on timeout
vIntroduces fast retransmit: optimisation that uses duplicate
ACKs to trigger early retransmission

36
TCP fast retransmit
if sender receives 3 additional ACKs for same data (“triple duplicate ACKs”), resend unACKed segment with smallest seq #
§ likely that unACKed segment lost, so don’t wait for timeout

Receipt of three duplicate ACKs indicates 3 segments received after a missing segment – lost segment is likely. So retransmit!

[Figure: Host A sends Seq=92, 8 bytes of data, then Seq=100, 20 bytes of data, which is lost (X); Host B returns ACK=100 four times; Host A retransmits Seq=100, 20 bytes of data before the timeout expires.]
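
The duplicate-ACK rule can be expressed as a simple counter over the incoming ACK stream. The function below is a hedged sketch; `fast_retransmit_points` is a hypothetical name, and it only reports where a fast retransmit would fire rather than sending anything.

```python
def fast_retransmit_points(acks, threshold=3):
    """Return the ACK numbers at which fast retransmit would be triggered.

    A retransmit fires when `threshold` duplicates follow the first ACK
    carrying the same number (i.e. the fourth identical ACK by default).
    """
    retransmits = []
    last_ack = None
    dup_count = 0
    for ack in acks:
        if ack == last_ack:
            dup_count += 1
            if dup_count == threshold:
                retransmits.append(ack)  # resend the segment starting here
        else:
            last_ack = ack  # new data acknowledged: reset the counter
            dup_count = 0
    return retransmits


if __name__ == "__main__":
    # ACK stream from the slide's figure: ACK=100 arrives four times.
    print(fast_retransmit_points([100, 100, 100, 100]))
```

The returned ACK number is also the sequence number of the segment to resend, since a cumulative ACK names the next byte the receiver expects.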

37
Quiz: TCP Sequence Numbers?
A TCP Sender is about to send a segment of size 100 bytes with
sequence number 1234 and ack number 436. What is the highest
sequence number up to (and including) which this sender has received
from the receiver?
A.1233
B.436
C.435
D.1334
E.536

38
Quiz: TCP Sequence Numbers?

A TCP Sender is about to send a segment of size 100 bytes with


sequence number 1234 and ack number 436. Is it possible that the
receiver has received byte number 1335?
A.Yes
B.No

39
Quiz: TCP Sequence Numbers?

The following statement is true about the TCP sliding window


protocol for implementing reliable data transfer
A.It exclusively uses the ideas of Go-Back-N
B.It exclusively uses the ideas of Selective Repeat
C.It uses a combination of ideas of Go-Back-N and Selective-Repeat
D.It uses none of the ideas of Go-Back-N and Selective-Repeat

40
TCP flow control
Q: What happens if network layer delivers data faster than application layer removes data from socket buffers?

[Figure: receiver protocol stack. Segments arrive from sender; the IP code (network layer) delivers IP datagram payload into TCP socket receiver buffers; the TCP code manages those buffers; the application process removes data from the TCP socket buffers.]

TCP flow control
Q: What happens if network layer delivers data faster than application layer removes data from socket buffers?

receive window
flow control: # bytes receiver willing to accept

[Figure: receiver protocol stack, as before]
44
TCP flow control
Q: What happens if network layer delivers data faster than application layer removes data from socket buffers?

flow control
receiver controls sender, so sender won’t overflow receiver’s buffer by transmitting too much, too fast

[Figure: receiver protocol stack, as before]

45
TCP flow control
§ TCP receiver “advertises” free buffer space in rwnd field in TCP header
• RcvBuffer size set via socket options (typical default is 4096 bytes)
• many operating systems autoadjust RcvBuffer
§ sender limits amount of unACKed (“in-flight”) data to received rwnd
§ guarantees receive buffer will not overflow

[Figure: TCP receiver-side buffering. TCP segment payloads enter RcvBuffer, which holds buffered data on its way to the application process plus rwnd bytes of free buffer space.]


46
TCP flow control
flow control: # bytes receiver willing to accept

§ TCP receiver “advertises” free buffer space in rwnd field in TCP header
• RcvBuffer size set via socket options (typical default is 4096 bytes)
• many operating systems autoadjust RcvBuffer
§ sender limits amount of unACKed (“in-flight”) data to received rwnd
§ guarantees receive buffer will not overflow

[Figure: TCP segment format, showing the receive window field]
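
The receiver's advertisement and the sender's limit can be written as two small functions. This is a minimal sketch under assumptions: the function and parameter names (`advertised_window`, `sendable_bytes`, `last_byte_sent`, `last_byte_acked`) are hypothetical, and byte counters are treated as plain integers without wraparound.

```python
def advertised_window(rcv_buffer: int, buffered: int) -> int:
    """Free buffer space the receiver places in the rwnd header field."""
    return max(rcv_buffer - buffered, 0)


def sendable_bytes(rwnd: int, last_byte_sent: int, last_byte_acked: int) -> int:
    """How many more bytes the sender may transmit without exceeding rwnd."""
    in_flight = last_byte_sent - last_byte_acked  # unACKed data
    return max(rwnd - in_flight, 0)


if __name__ == "__main__":
    # RcvBuffer of 4096 bytes (the slide's typical default), 1000 bytes unread.
    rwnd = advertised_window(rcv_buffer=4096, buffered=1000)
    print(rwnd, sendable_bytes(rwnd, last_byte_sent=2000, last_byte_acked=1000))
```

Keeping in-flight data at or below rwnd is what ensures the receive buffer cannot overflow.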

47
TCP flow control
v What if rwnd = 0?
§ Sender would stop sending data
§ Eventually the receive buffer would have space when the application process
reads some bytes
§ But how does the receiver advertise the new rwnd to the sender?
v Sender keeps sending TCP segments with one data byte to the
receiver
v These segments are dropped but acknowledged by the receiver
with a zero-window size
v Eventually when the buffer empties, non-zero window is advertised

48
TCP connection management
before exchanging data, sender/receiver “handshake”:
§agree to establish connection (each knowing the other willing to establish connection)
§agree on connection parameters (e.g., starting seq #s)

[Figure: client and server, each with an application above the network, holding the same per-connection information:]
connection state: ESTAB
connection variables:
seq # client-to-server, server-to-client
rcvBuffer size at server, client

Client: Socket clientSocket = new Socket("hostname","port number");
Server: Socket connectionSocket = welcomeSocket.accept();
Agreeing to establish a connection
2-way handshake:
[Figure: human analogy: “Let’s talk” (ESTAB), answered by “OK” (ESTAB). Protocol version: client chooses x and sends req_conn(x); server enters ESTAB and replies acc_conn(x); client enters ESTAB.]

Q: will 2-way handshake always work in network?
§ variable delays
§ retransmitted messages (e.g. req_conn(x)) due to message loss
§ message reordering
§ can’t “see” other side
2-way handshake scenarios
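1. Client chooses x and sends req_conn(x)
2. Server enters ESTAB and replies acc_conn(x)
3. Client enters ESTAB and sends data(x+1)
4. Server accepts data(x+1) and replies ACK(x+1)
5. Connection x completes

No problem!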
52
2-way handshake scenarios

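1. Client chooses x and sends req_conn(x)
2. Server enters ESTAB and replies acc_conn(x)
3. Client retransmits req_conn(x) before acc_conn(x) arrives, then enters ESTAB
4. Connection x completes: client terminates, server forgets x
5. The retransmitted req_conn(x) arrives; server enters ESTAB again and replies acc_conn(x)

Problem: half open connection! (no client)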
53
2-way handshake scenarios
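1. Client chooses x and sends req_conn(x)
2. Server enters ESTAB and replies acc_conn(x)
3. Client retransmits req_conn(x) before acc_conn(x) arrives, then enters ESTAB
4. Client sends data(x+1); server accepts data(x+1)
5. Client retransmits data(x+1)
6. Connection x completes: client terminates, server forgets x
7. The retransmitted req_conn(x) arrives; server enters ESTAB again
8. The retransmitted data(x+1) arrives; server accepts data(x+1)

Problem: dup data accepted!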
TCP 3-way handshake
SYN Consumes 1 Sequence No

Server side (server state: LISTEN):
serverSocket = socket(AF_INET,SOCK_STREAM)
serverSocket.bind(('',serverPort))
serverSocket.listen(1)
connectionSocket, addr = serverSocket.accept()

Client side (client state: LISTEN):
clientSocket = socket(AF_INET, SOCK_STREAM)
clientSocket.connect((serverName,serverPort))

1. Client (SYNSENT): choose init seq num, x; send TCP SYN msg: SYNbit=1, Seq=x
2. Server (SYN RCVD): choose init seq num, y; send TCP SYNACK msg, acking SYN: SYNbit=1, Seq=y, ACKbit=1; ACKnum=x+1
3. Client (ESTAB): received SYNACK(x) indicates server is live; send ACK for SYNACK: ACKbit=1, ACKnum=y+1, Seq=x+1; this segment may contain client-to-server data
4. Server (ESTAB): received ACK(y) indicates client is live
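
The sequence and acknowledgment numbers exchanged in the three-way handshake can be tabulated with a short function. This is an illustrative sketch only: `three_way_handshake` and the dictionary layout are hypothetical, and no real sockets or packets are involved.

```python
def three_way_handshake(x: int, y: int) -> list[dict]:
    """Return the header fields of the SYN, SYNACK and ACK segments.

    x is the client's initial sequence number, y the server's.
    """
    syn = {"from": "client", "SYN": 1, "ACK": 0, "seq": x}
    # The SYN consumes one sequence number, so the server ACKs x + 1.
    synack = {"from": "server", "SYN": 1, "ACK": 1, "seq": y, "acknum": x + 1}
    # The server's SYN also consumes one sequence number.
    ack = {"from": "client", "SYN": 0, "ACK": 1, "seq": x + 1, "acknum": y + 1}
    return [syn, synack, ack]


if __name__ == "__main__":
    for segment in three_way_handshake(x=0, y=10):
        print(segment)
```

Because each SYN consumes one sequence number, the first data byte from either side is numbered one above that side's initial sequence number.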

55
What if the SYN Packet Gets Lost?
v Suppose the SYN packet gets lost
§ Packet is lost inside the network, or:
§ Server discards the packet (e.g., it’s too busy)

v Eventually, no SYN-ACK arrives


§ Sender sets a timer and waits for the SYN-ACK
§ … and retransmits the SYN if needed

v How should the TCP sender set the timer?


§ Sender has no idea how far away the receiver is
§ Hard to guess a reasonable length of time to wait
§ SHOULD (RFCs 1122, 2988) use a default of 3 seconds; RFC 6298 uses a default of 1 second

58
TCP: closing a connection
v client, server each close their side of connection
§ send TCP segment with FIN bit = 1
v respond to received FIN with ACK
§ on receiving FIN, ACK can be combined with own FIN
v simultaneous FIN exchanges can be handled

59
Normal Termination, One at a Time
FIN Consumes 1 Sequence No

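Client state and server state both start at ESTAB.
1. Client calls clientSocket.close(), sends FINbit=1, seq=x and enters FIN_WAIT_1 (can no longer send but can receive data)
2. Server enters CLOSE_WAIT (can still send data) and replies ACKbit=1; ACKnum=x+1
3. Client enters FIN_WAIT_2 (wait for server close)
4. Server sends FINbit=1, seq=y and enters LAST_ACK (can no longer send data)
5. Client enters TIMED_WAIT and replies ACKbit=1; ACKnum=y+1; server enters CLOSED
6. Client stays in timed wait for 2*max segment lifetime, then enters CLOSED

TIMED_WAIT: Can retransmit ACK if last ACK is lost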
Normal Termination, Both Together
Client state and server state both start at ESTAB.
1. Client calls clientSocket.close(), sends FINbit=1, seq=x and enters FIN_WAIT_1 (can no longer send but can receive data)
2. Server enters CLOSE_WAIT, then sends FIN + ACK together (ACKbit=1; ACKnum=x+1, FINbit=1, seq=y) and enters LAST_ACK (can no longer send data)
3. Client enters TIMED_WAIT and replies ACKbit=1; ACKnum=y+1; server enters CLOSED
4. Client stays in timed wait for 2*max segment lifetime, then enters CLOSED
Simultaneous Closure
Client state and server state both start at ESTAB.
1. Both sides close at about the same time: client sends FINbit=1, seq=x and server sends FINbit=1, seq=y; both enter FIN_WAIT_1 (can no longer send data but can receive)
2. Each side receives the other's FIN, sends an ACK (client: ACKbit=1, ACKnum=y+1; server: ACKbit=1, ACKnum=x+1) and enters CLOSING
3. Each side receives the ACK for its own FIN and enters TIMED_WAIT
4. Both sides enter CLOSED
62
Abrupt Termination
[Figure: timeline between A and B showing SYN, SYN ACK, ACK, Data and ACK exchanges, followed by RST segments sent by A.]

v A sends a RESET (RST) to B


§ E.g., because application process on A crashed
v That’s it
§ B does not ack the RST
§ Thus, RST is not delivered reliably
§ And: any data in flight is lost
§ But: if B sends anything more, will elicit another RST
63
TCP SYN Attack (SYN flooding)
v Miscreant creates a fake SYN packet
§ Destination is IP address of victim host (usually some server)
§ Source is some spoofed IP address
v Victim host, on receiving the SYN, creates TCP connection state (i.e., allocates buffers, creates variables, etc.) and sends SYN ACK to the spoofed address (half-open connection)
v ACK never comes back
v After a timeout, connection state is freed
v However, for this duration the connection state is held unnecessarily
v Further, the miscreant sends a large number of fake SYNs
§ Can easily overwhelm the victim
v Solutions:
§ Increase size of connection queue
§ Decrease timeout wait for the 3-way handshake
§ Firewalls: list of known bad source IP addresses
§ TCP SYN Cookies (explained on next slide)

64
TCP SYN Cookie
v On receipt of SYN, server does not create connection state
v It creates an initial sequence number (init_seq) that is a hash of
source & dest IP address and port number of SYN packet (secret
key used for hash)
§ Replies back with SYN ACK containing init_seq
§ Server does not need to store this sequence number
v If original SYN is genuine, an ACK will come back
§ Same hash function run on the same header fields to get the initial sequence
number (init_seq)
§ Checks if the ACK is equal to (init_seq+1)
§ Only create connection state if above is true
v If fake SYN, no harm done since no state was created
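
The stateless check described above can be sketched with a keyed hash from Python's standard library. This is a simplified illustration, not how any particular operating system implements SYN cookies: `SECRET_KEY`, `syn_cookie` and `ack_is_valid` are hypothetical names, HMAC-SHA256 is an assumed choice of hash, and real implementations typically fold further information (such as a time counter) into the cookie.

```python
import hashlib
import hmac

SECRET_KEY = b"hypothetical-server-secret"  # assumption: known only to the server


def syn_cookie(src_ip: str, src_port: int, dst_ip: str, dst_port: int,
               key: bytes = SECRET_KEY) -> int:
    """Derive a 32-bit initial sequence number from the SYN's addresses and ports."""
    msg = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
    digest = hmac.new(key, msg, hashlib.sha256).digest()
    return int.from_bytes(digest[:4], "big")


def ack_is_valid(ack_num: int, src_ip: str, src_port: int,
                 dst_ip: str, dst_port: int, key: bytes = SECRET_KEY) -> bool:
    """Recompute the cookie from the ACK's header fields and check init_seq + 1."""
    expected = (syn_cookie(src_ip, src_port, dst_ip, dst_port, key) + 1) % 2**32
    return ack_num == expected


if __name__ == "__main__":
    conn = ("10.0.0.1", 1234, "10.0.0.2", 80)  # made-up example addresses
    init_seq = syn_cookie(*conn)
    print(ack_is_valid((init_seq + 1) % 2**32, *conn))
```

The server stores nothing between the SYN and the ACK; connection state would be created only after `ack_is_valid` returns true.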

65
Quiz: TCP Connection Management?

Assume that one end of a TCP connection selects an initial sequence


number 120. The first TCP segment containing data sent by this end
point will have a sequence number of ____
A.120
B.121
C.122
D.128
E.0

66
Quiz: TCP Connection Management
Assume that one end point of the TCP connection sends a FIN
segment. If it never receives an ACK, what should it do?

A.Assume that the connection is closed and do nothing

B.Retransmit the FIN

C.Transmit an ACK

D.Start crying

67
Principles of congestion control

congestion:
v informally: “too many sources sending too much data too
fast for network to handle”
v different from flow control!
v manifestations:
§ lost packets (buffer overflow at routers)
§ long delays (queueing in router buffers)
v a top-10 problem!

69
Congestion
[Figure: a Router discards arriving packets into the Trash: “Ugh. I so can’t deal with this right now!”]
Router’s buffer. Incoming rate is faster than outgoing link can support.
Congestion Collapse
[Figure sequence: senders S1 and S2 feed traffic through Link A and then Link B.]
1. One sender (S1) starts, but there’s still capacity at link A.
2. Another sender (S2) starts up. Link A is showing slight delay, but still doing ok.
3. Unrelated traffic passes through and congests link B.
4. S2’s traffic is being dropped at Link B, so it starts retransmitting on top of what it was sending. (This is very bad. S2 is now sending lots of traffic over link A that has no hope of crossing link B.)
5. Increased traffic from S2 causes Link A to become congested. S1 starts retransmitting.
6. Congestion propagates backwards…
Without congestion control

congestion:
v Increases delays
§ If delays > RTO, sender retransmits
v Increases loss rate
§ Dropped packets also retransmitted
v Increases retransmissions, many unnecessary
§ Wastes capacity on traffic that is never delivered
§ Increase in load results in decrease in useful work done
v Increases congestion, cycle continues …

78
Cost of Congestion
v Knee – point after which
§ Throughput increases slowly
§ Delay increases fast
v Cliff – point after which
§ Throughput starts to drop to zero (congestion collapse)
§ Delay approaches infinity

[Figure: Throughput vs. Load and Delay vs. Load, marking the knee, the cliff, packet loss, and congestion collapse]
Congestion Collapse

This happened to the Internet (then NSFnet) in 1986


v Rate dropped from a blazing 32 Kbps to 40bps
v This happened on and off for two years
v In 1988, Van Jacobson published “Congestion Avoidance and Control”
v The fix: senders voluntarily limit sending rate

80
Approaches towards congestion control

two broad approaches towards congestion control:

end-end congestion control:
v no explicit feedback from network
v congestion inferred from end-system observed loss, delay
v approach taken by TCP

network-assisted congestion control:
v routers provide feedback to end systems
§ single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
§ explicit rate for sender to send at

TCP’s Approach in a Nutshell
v TCP connection maintains a window
§ Controls number of packets in flight

v TCP sending rate:
§ roughly: send cwnd bytes, wait RTT for ACKs, then send more bytes

rate ≈ cwnd / RTT bytes/sec

v Vary window size to control sending rate

[Figure: sender sequence number space: last byte ACKed; bytes sent, not-yet ACKed (“in-flight”); last byte sent. cwnd spans the in-flight bytes.]

83
All These Windows…

v Congestion Window: CWND


§ How many bytes can be sent without overflowing routers
§ Computed by the sender using congestion control algorithm

v Flow control window: Advertised / Receive Window (RWND)


§ How many bytes can be sent without overflowing receiver’s buffers
§ Determined by the receiver and reported to the sender

v Sender-side window = minimum{CWND, RWND}


• Assume for this discussion that RWND >> CWND
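
The relationship between the two windows and the rough sending rate can be checked numerically. The snippet below is a sketch with hypothetical helper names (`sender_window`, `approx_rate`); the example figures (10 segments of 1460 bytes, a 0.25 s RTT) are made up for illustration.

```python
def sender_window(cwnd: int, rwnd: int) -> int:
    """Sender-side window: the smaller of the congestion and receive windows."""
    return min(cwnd, rwnd)


def approx_rate(cwnd_bytes: int, rtt_seconds: float) -> float:
    """Rough sending rate in bytes/sec: one window of data per RTT."""
    return cwnd_bytes / rtt_seconds


if __name__ == "__main__":
    cwnd = 10 * 1460  # 10 segments of 1460 bytes (illustrative)
    rwnd = 65535      # a large advertised window, so cwnd is the limit
    window = sender_window(cwnd, rwnd)
    print(window, approx_rate(window, rtt_seconds=0.25))
```

When RWND is much larger than CWND, as assumed in this discussion, the congestion window alone determines the rate.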

84
CWND

v This lecture will talk about CWND in units of MSS


§ (Recall MSS: Maximum Segment Size, the amount of payload data in a TCP
packet)
§ This is only for pedagogical purposes

v Keep in mind that real implementations maintain CWND in bytes

85
Two Basic Questions

v How does the sender detect congestion?

v How does the sender adjust its sending rate?

86
Detecting Congestion: Infer Loss
v Duplicate ACKs: isolated loss
§ dup ACKs indicate network capable of delivering some segments

v Timeout: much more serious


§ Not enough dup ACKs
§ Must have suffered several losses

v Will adjust rate differently for each case

87
RECAP: TCP fast retransmit (dup acks)
[Figure: Host A sends Seq=92, 8 bytes of data, then Seq=100, 20 bytes of data, which is lost (X); Host B returns ACK=100 four times; Host A retransmits Seq=100, 20 bytes of data before the timeout expires.]

fast retransmit after sender receipt of triple duplicate ACK
Rate Adjustment

v Basic structure:
§ Upon receipt of ACK (of new data): increase rate
§ Upon detection of loss: decrease rate

v How we increase/decrease the rate depends on the


phase of congestion control we’re in:
§ Discovering available bottleneck bandwidth vs.
§ Adjusting to bandwidth variations

89
TCP Slow Start (Bandwidth discovery)
v when connection begins, increase rate exponentially until first loss event:
§ initially cwnd = 1 MSS
§ double cwnd every RTT (all ACKs)
§ Simpler implementation achieved by incrementing cwnd for every ACK received
§ cwnd += 1 for each ACK
v summary: initial rate is slow but ramps up exponentially fast

[Figure: Host A to Host B over time: one segment in the first RTT, two segments in the second, four segments in the third.]

Adjusting to Varying Bandwidth
v Slow start gave an estimate of available bandwidth

v Now, want to track variations in this available


bandwidth, oscillating around its current value
§ Repeated probing (rate increase) and back-off (rate decrease)
§ Known as Congestion Avoidance (CA)

v TCP uses: “Additive Increase Multiplicative Decrease”


(AIMD)

91
AIMD
v approach: sender increases transmission rate (window size), probing for usable bandwidth, until another congestion event occurs
§ additive increase: increase cwnd by 1 MSS every RTT until loss detected
• For each successful RTT (all ACKs), cwnd = cwnd + 1 (in multiples of MSS)
• Simple implementation: for each ACK, cwnd = cwnd + 1/cwnd (since there are cwnd/MSS packets in a window)
§ multiplicative decrease: cut cwnd in half after loss

[Figure: cwnd (TCP sender congestion window size) vs. time. AIMD saw tooth behavior, probing for bandwidth: additively increase window size … until loss occurs (then cut window in half).]
Leads to the TCP “Sawtooth”

[Figure: Window vs. t: exponential “slow start” followed by a sawtooth, with the window dropping at each Loss.]
93
Slow-Start vs. AIMD
v When does a sender stop Slow-Start and start Congestion Avoidance?

v Introduce a “slow start threshold” (ssthresh)


§ Initialized to a large value

v Convert to CA when cwnd = ssthresh: sender switches from slow-start to AIMD-style increase
§ On loss, ssthresh = CWND/2

94
Implementation
v State at sender
§ CWND (initialized to a small constant)
• the slides use multiple of MSS
§ ssthresh (initialized to a large constant)
§ [Also dupACKcount and timer, as before]

v Events
§ ACK (new data)
§ dupACK (duplicate ACK for old data)
§ Timeout
95
Event: ACK (new data)
v If CWND < ssthresh
§ CWND += 1
• Hence after one RTT (All ACKs with no drops): CWND = 2xCWND

96
Event: ACK (new data)
v If CWND < ssthresh (Slow start phase)
§ CWND += 1

v Else (“Congestion Avoidance” phase, additive increase)
§ CWND = CWND + 1/CWND
• Hence after one RTT (All ACKs with no drops): CWND = CWND + 1

97
Event: dupACK

v dupACKcount ++

v If dupACKcount = 3 /* fast retransmit */


§ ssthresh = CWND/2
§ CWND = CWND/2

98
Event: TimeOut

v On Timeout
§ ssthresh ← CWND/2
§ CWND ← 1
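
The three event handlers (new ACK, dupACK, timeout) can be collected into one small state machine. The sketch below works in units of MSS like the slides; the class name `CongestionControl`, the helper `run_rtt`, the initial ssthresh of 64, and the resetting of the duplicate-ACK counter on new ACKs and timeouts are assumptions made for illustration.

```python
class CongestionControl:
    """Sender-side congestion state in units of MSS, following the slides."""

    def __init__(self, cwnd: float = 1.0, ssthresh: float = 64.0):
        self.cwnd = cwnd          # small initial window
        self.ssthresh = ssthresh  # assumption: 64 stands in for "large"
        self.dup_ack_count = 0

    def on_new_ack(self) -> None:
        self.dup_ack_count = 0  # assumption: new data resets the counter
        if self.cwnd < self.ssthresh:
            self.cwnd += 1              # slow start: doubles per RTT
        else:
            self.cwnd += 1 / self.cwnd  # congestion avoidance: about +1 per RTT

    def on_dup_ack(self) -> None:
        self.dup_ack_count += 1
        if self.dup_ack_count == 3:  # fast retransmit
            self.ssthresh = self.cwnd / 2
            self.cwnd = self.cwnd / 2

    def on_timeout(self) -> None:
        self.ssthresh = self.cwnd / 2
        self.cwnd = 1.0
        self.dup_ack_count = 0


def run_rtt(cc: CongestionControl) -> None:
    """Deliver one ACK per segment in the current window (no losses)."""
    for _ in range(int(cc.cwnd)):
        cc.on_new_ack()


if __name__ == "__main__":
    cc = CongestionControl(cwnd=1.0, ssthresh=8.0)
    for rtt in range(5):
        run_rtt(cc)
        print(rtt + 1, round(cc.cwnd, 2))
```

In this sketch a loss-free RTT in congestion avoidance adds slightly less than 1 MSS, because each per-ACK increment of 1/CWND uses the window as already grown by earlier ACKs in the same RTT.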

99
Example
[Figure: Window vs. t, showing a Fast Retransmission and later a Timeout, with SSThresh set to half of the CWND reached before the timeout.]

Slow start in operation until it reaches half of previous CWND, i.e., SSTHRESH

Slow-start restart: Go back to CWND = 1 MSS, but take advantage of knowing the previous value of CWND
TCP Flavours
vTCP-Tahoe
§ cwnd =1 on triple dup ACK & timeout
vTCP-Reno
§ cwnd =1 on timeout
§ cwnd = cwnd/2 on triple dup ACK
vTCP-newReno
§ TCP-Reno + improved fast recovery (SKIPPED AND NOT
ON EXAM)
vTCP-SACK (NOT COVERED IN THE COURSE)
§ incorporates selective acknowledgements
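
The difference between Tahoe and Reno on a triple duplicate ACK comes down to one branch. The function below is a hedged sketch: `on_triple_dup_ack` is a hypothetical name, values are in units of MSS, and ssthresh is set to CWND/2 in both flavours as on the earlier event slides.

```python
def on_triple_dup_ack(flavour: str, cwnd: float) -> tuple[float, float]:
    """Return (new_cwnd, new_ssthresh) after a triple duplicate ACK."""
    ssthresh = cwnd / 2
    if flavour == "tahoe":
        return 1.0, ssthresh       # Tahoe: restart from slow start
    if flavour == "reno":
        return cwnd / 2, ssthresh  # Reno: halve the window and continue
    raise ValueError(f"unknown flavour: {flavour}")


if __name__ == "__main__":
    for name in ("tahoe", "reno"):
        print(name, on_triple_dup_ack(name, cwnd=24.0))
```

On a timeout both flavours behave the same way, setting cwnd back to 1.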
101
Quiz: TCP Congestion Control?
In the figure how many congestion avoidance intervals
can you identify?
A.0
B.1
C.2
D.3
E.4

Note: the transition at round 17 is not entirely accurate, the window should reduce to 21 (currently 24)
102
Quiz: TCP Congestion Control?

In the figure how many slow start intervals can you identify?
A.0
B.1
C.2
D.3
E.4

Note: the transition at round 17 is not entirely accurate, the window should reduce to 21 (currently 24)
Quiz: TCP Congestion Control
In the figure after the 16th transmission round, segment loss
is detected by _______ ?
A. Triple Dup Ack
B. Timeout

Note: the transition at round 17 is not entirely accurate, the window should reduce to 21 (currently 24)
104
Quiz: TCP Congestion Control

In the figure what is the initial value of ssthresh (slow start threshold)?
A. 0
B. 28
C. 32
D. 42
E. 64

Note: the transition at round 17 is not entirely accurate, the window should reduce to 21 (currently 24)
105
Quiz: TCP Congestion Control

In the figure what is the value of ssthresh (slow start threshold) at the 18th round?
A. 1
B. 32
C. 42
D. 21
E. 20

Note: the transition at round 17 is not entirely accurate, the window should reduce to 21 (currently 24)
106
Quiz: TCP Reliability

TCP uses cumulative ACKs like Go-Back-N but does


not retransmit the entire window of outstanding
packets upon a timeout. What mechanism lets TCP
get away with this?
A.Per-byte sequence and acknowledgement numbers
B.Triple Duplicate ACKs
C.Receiver window-based flow control
D.Timeout estimation algorithm

107
Quiz: TCP Timeout

A TCP Sender maintains an EstimatedRTT of 100 ms. Suppose the next SampleRTT is 108 ms. Which of the following is true about the sender?
A.It will increase EstimatedRTT but leave timeout unchanged
B.It will increase the timeout
C.Whether it increases EstimatedRTT will depend on the deviation
D.Whether it increases the timeout will depend on the deviation

108
