Lecture 17-18 Congestion Control
Transport Layer
A note on the use of these PowerPoint slides:
We’re making these slides freely available to all (faculty, students,
readers). They’re in PowerPoint form so you see the animations; and
can add, modify, and delete slides (including this one) and slide content
to suit your needs. They obviously represent a lot of work on our part.
In return for use, we only ask the following:
If you use these slides (e.g., in a class) that you mention their source
(after all, we’d like people to use our book!)
If you post any slides on a www site, that you note that they are
adapted from (or perhaps identical to) our slides, and note our
copyright of this material.
Computer Networking: A Top-Down Approach
8th edition
Jim Kurose, Keith Ross
Pearson, 2020
For a revision history, see the slide note for this page.
Thanks and enjoy! JFK/KWR
All material copyright 1996-2020
J.F Kurose and K.W. Ross, All Rights Reserved
Transport Layer: 3-1
Chapter 3: roadmap
Transport-layer services
Multiplexing and demultiplexing
Connectionless transport: UDP
Principles of reliable data transfer
Connection-oriented transport: TCP
Principles of congestion control
TCP congestion control
Evolution of transport-layer functionality
[Figure: two sender/receiver pairs. In one, the feedback to the sender is “Not much getting through”; in the other, the feedback is “Receiver overflowing”.]
Approaches towards congestion control
two broad approaches towards congestion control:
TCP congestion control
• Each sender limits the rate at which it sends traffic into its
connection as a function of perceived network congestion.
Simplified Network Model
The entire network is abstracted as a single router – “black box”
Receiver resources: represented by the “Receive Window Size” (RcvWin, rwnd)
Network resources: represented by the “Congestion Window Size” (CongWin, cwnd)
Simplified Network Model
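[Figure: two hosts connected through the network. Left host: MSS = 1000, with MTU labels 1040 and 1020. Right host: MSS = 1480, with MTU labels 1520 and 1500.]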
TCP Handshake (2)
Window size 10,000
MSS 1480
Initial sequence number Y
1 byte
Acknowledgment number X+1
TCP Handshake (3)
Window size 2,000
Classic TCP (send all segments)
10 segments (where 1 MSS is equal to 1000 bytes) = 10,000 bytes
Classic TCP (send all segments)
Packets may drop at the network layer due to congestion.
Classic TCP (send all segments)
sender limits transmission:
$LastByteSent - LastByteAcked \le \min\{cwnd, rwnd\}$
$\text{rate} \approx \frac{cwnd}{RTT}$ bytes/sec
cwnd is dynamic, a function of perceived network congestion
Example: MSS = 500 bytes & RTT = 200 msec
initial rate = MSS/RTT = ? kbps
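The window limit and the rate approximation above can be checked with a few lines of Python. This is only a sketch: the function names and the byte counters in the second call are made up for illustration, and the rate formula is the rough "one cwnd per RTT" estimate from the slide.

```python
def sendable_bytes(last_byte_sent, last_byte_acked, cwnd, rwnd):
    """Bytes the sender may still put in flight under min{cwnd, rwnd}."""
    in_flight = last_byte_sent - last_byte_acked
    window = min(cwnd, rwnd)
    return max(0, window - in_flight)


def approx_rate_bps(cwnd_bytes, rtt_seconds):
    """Rough sending rate in bits/sec: about one cwnd of bytes per RTT."""
    return cwnd_bytes * 8 / rtt_seconds


# Slide example: cwnd = 1 MSS = 500 bytes, RTT = 200 msec.
print(approx_rate_bps(500, 0.2) / 1000, "kbps")

# Hypothetical state: 3000 bytes in flight, cwnd = 4000, rwnd = 10000.
print(sendable_bytes(13000, 10000, 4000, 10000))
```

With the slide's numbers the first call gives 20 kbps, because 500 bytes per 0.2 s is 2500 bytes/sec, and each byte is 8 bits.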
MSS (recap)
The maximum segment size option is used in
connection setup to define the largest allowable TCP
segment. The value of MSS is determined during
connection establishment and does not change
during the connection.
How does sender perceive congestion?
TCP Slow Start
When connection begins, cwnd = 1 MSS
• Example: MSS = 500 bytes & RTT = 200 msec
• initial rate = MSS/RTT = 2500 bytes/sec = 20 kbps
available bandwidth may be >> MSS/RTT
• desirable to quickly ramp up to respectable rate
When connection begins, increase rate exponentially fast until first loss event
TCP Slow Start (more)
When connection begins, increase rate exponentially until first loss event:
• double cwnd every RTT
• done by incrementing cwnd for every ACK received
Summary: initial rate is slow but ramps up exponentially fast
[Figure: Host A sends one segment to Host B in the first RTT, then two segments, then four segments; vertical axis: time.]
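A small simulation shows why adding 1 MSS per ACK doubles cwnd every RTT. It is a sketch under simplifying assumptions that are not on the slide: no loss, one ACK per segment (no delayed ACKs), and an illustrative MSS of 1000 bytes.

```python
MSS = 1000  # bytes; illustrative value, not taken from the slide


def slow_start_rounds(rounds, mss=MSS):
    """Return cwnd (bytes) at the start of each RTT round during slow start."""
    cwnd = mss
    history = []
    for _ in range(rounds):
        history.append(cwnd)
        acks_this_round = cwnd // mss  # one ACK per segment sent this round
        for _ in range(acks_this_round):
            cwnd += mss  # +1 MSS for every ACK received
    return history


# cwnd in units of MSS over five RTTs
print([c // MSS for c in slow_start_rounds(5)])  # [1, 2, 4, 8, 16]
```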
TCP Slow Start (more)
If we look at the size of the cwnd in terms of round-trip
times (RTTs), we find that the growth rate is
exponential as shown below:
• Slow start cannot continue indefinitely. There must
be a threshold to stop this phase.
• The sender keeps track of a variable named
ssthresh (slow start threshold). When the
size of the window in bytes reaches this threshold, slow
start stops and the next phase starts.
When does the Slow Start end?
First, if there is a loss event (i.e., congestion) indicated by a timeout,
• TCP sender sets ssthresh (“slow start threshold”) to cwnd/2, half the value of cwnd when the loss was detected
• TCP sender then sets cwnd to 1 MSS and restarts slow start
The second way in which slow start may end is directly tied to the value of ssthresh
• When cwnd equals ssthresh, slow start ends and TCP transitions into congestion avoidance
mode.
• Why not keep doubling?
When does the Slow Start end?
The final way in which slow start can end is if three duplicate
ACKs are detected,
• TCP performs a fast retransmit and enters the fast recovery state.
TCP Congestion Control - Congestion Avoidance
Rather than doubling the value of cwnd every RTT, TCP adopts a
more conservative approach and increases the value of cwnd by just
a single MSS every RTT
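One common way to get "1 MSS per RTT" is the per-ACK rule cwnd = cwnd + MSS · (MSS/cwnd), which is the update shown in the summary FSM of these slides. The sketch below applies that rule for one RTT round; the MSS value and the starting window are illustrative assumptions, and one ACK per segment is assumed.

```python
MSS = 1000  # bytes; illustrative value


def congestion_avoidance_round(cwnd, mss=MSS):
    """Apply cwnd += MSS * (MSS / cwnd) once per ACK in one RTT round."""
    acks = int(cwnd // mss)  # segments sent, hence ACKs expected, this round
    for _ in range(acks):
        cwnd += mss * (mss / cwnd)
    return cwnd


cwnd = 10 * MSS
after = congestion_avoidance_round(cwnd)
# Growth over one RTT, in units of MSS: slightly under 1
print(round((after - cwnd) / MSS, 2))
```

The growth is a little below 1 MSS because cwnd in the denominator increases with every ACK inside the round.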
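Slow start with congestion avoidance
When does it end?
On a timeout, TCP’s congestion-avoidance algorithm behaves the same as slow start: ssthresh is set to half the value of cwnd at the time of the loss, cwnd is set to 1 MSS, and slow start begins again.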
Drawbacks
Fast Retransmit
Coarse timeouts remained a problem, and fast retransmit was added with TCP Tahoe.
The receiver sends an ACK every time a packet arrives. Packets that arrive after a lost packet each re-acknowledge the last in-order byte, so the sender sees duplicate ACKs.
Basic idea: use duplicate ACKs to signal a lost packet.
Upon receipt of three duplicate ACKs, the TCP sender retransmits the lost packet.
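The duplicate-ACK counting can be sketched as follows. The function name and the ACK trace are hypothetical; the trace stands for a case where the segment starting at byte 2000 is lost, so each later arrival makes the receiver repeat ACK 2000.

```python
def find_fast_retransmits(ack_numbers, threshold=3):
    """Return ACK numbers that should trigger a fast retransmit.

    ack_numbers: cumulative ACK values in arrival order.
    """
    retransmit = []
    last_ack = None
    dup_count = 0
    for ack in ack_numbers:
        if ack == last_ack:
            dup_count += 1
            if dup_count == threshold:
                retransmit.append(ack)  # resend segment starting at byte `ack`
        else:
            last_ack = ack
            dup_count = 0
    return retransmit


print(find_fast_retransmits([1000, 2000, 2000, 2000, 2000, 6000]))  # [2000]
```

The first ACK 2000 is the original; the next three are the duplicates, and the third duplicate triggers the retransmission without waiting for the timer.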
TCP Congestion Control - Fast Recovery
What does TCP do?
• ssthresh is set to half the value of cwnd
• cwnd is set to the new ssthresh plus 3 MSS, which accounts for the three duplicate ACKs received
• cwnd is increased by 1 MSS for every further duplicate ACK received for the missing segment that caused TCP to enter the fast-recovery state
TCP congestion control: AIMD
approach: senders can increase sending rate until packet loss (congestion) occurs, then decrease sending rate on loss event
Additive Increase: increase sending rate (cwnd) by 1 maximum segment size every RTT until loss detected
Multiplicative Decrease: cut sending rate (cwnd) in half at each loss event
[Figure: AIMD sawtooth behavior: probing for bandwidth. Vertical axis: TCP sender sending rate.]
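The sawtooth can be reproduced with a toy model. This is a sketch, not a TCP implementation: cwnd is counted in MSS, and "loss" is modeled by a hypothetical `capacity` threshold that the slides do not define.

```python
def aimd_trace(rounds, capacity, cwnd=1):
    """cwnd in MSS per RTT: +1 each round, halved when it exceeds capacity."""
    trace = []
    for _ in range(rounds):
        trace.append(cwnd)
        if cwnd > capacity:          # stand-in for "loss detected"
            cwnd = max(1, cwnd // 2)  # multiplicative decrease
        else:
            cwnd += 1                 # additive increase
    return trace


print(aimd_trace(12, capacity=8, cwnd=6))
# [6, 7, 8, 9, 4, 5, 6, 7, 8, 9, 4, 5]
```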
Summary: TCP Congestion Control
[Figure: FSM of TCP congestion control with three states: slow start, congestion avoidance, fast recovery.]
Initial transition (Λ): cwnd = 1 MSS; ssthresh = 64 KB; dupACKcount = 0; enter slow start
Slow start:
• new ACK: cwnd = cwnd + MSS; dupACKcount = 0; transmit new segment(s), as allowed
• duplicate ACK: dupACKcount++
• cwnd > ssthresh (Λ): go to congestion avoidance
• timeout: ssthresh = cwnd/2; cwnd = 1 MSS; dupACKcount = 0; retransmit missing segment
• dupACKcount == 3: ssthresh = cwnd/2; cwnd = ssthresh + 3·MSS; retransmit missing segment; go to fast recovery
Congestion avoidance:
• new ACK: cwnd = cwnd + MSS·(MSS/cwnd); dupACKcount = 0; transmit new segment(s), as allowed
• duplicate ACK: dupACKcount++
• timeout: ssthresh = cwnd/2; cwnd = 1 MSS; dupACKcount = 0; retransmit missing segment; go to slow start
• dupACKcount == 3: ssthresh = cwnd/2; cwnd = ssthresh + 3·MSS; retransmit missing segment; go to fast recovery
Fast recovery:
• duplicate ACK: cwnd = cwnd + MSS; transmit new segment(s), as allowed
• new ACK: cwnd = ssthresh; dupACKcount = 0; go to congestion avoidance
• timeout: ssthresh = cwnd/2; cwnd = 1 MSS; dupACKcount = 0; retransmit missing segment; go to slow start
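The three-state FSM can be written as a small Python class. This is a sketch that tracks only cwnd, ssthresh, dupACKcount, and the state; the class and method names are invented for illustration, actual (re)transmission is left to the caller, and the default MSS of 1000 bytes is an assumption.

```python
class RenoSender:
    """Sketch of the slow start / congestion avoidance / fast recovery FSM."""

    def __init__(self, mss=1000):
        self.mss = mss
        self.cwnd = mss               # 1 MSS
        self.ssthresh = 64 * 1024     # 64 KB
        self.dup_acks = 0
        self.state = "slow_start"

    def on_new_ack(self):
        if self.state == "fast_recovery":
            self.cwnd = self.ssthresh             # deflate the window
            self.state = "congestion_avoidance"
        elif self.state == "slow_start":
            self.cwnd += self.mss
            if self.cwnd > self.ssthresh:
                self.state = "congestion_avoidance"
        else:  # congestion avoidance
            self.cwnd += self.mss * self.mss / self.cwnd
        self.dup_acks = 0

    def on_dup_ack(self):
        if self.state == "fast_recovery":
            self.cwnd += self.mss                 # inflate per extra dup ACK
            return
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.ssthresh = self.cwnd / 2
            self.cwnd = self.ssthresh + 3 * self.mss
            self.state = "fast_recovery"          # caller retransmits here

    def on_timeout(self):
        self.ssthresh = self.cwnd / 2
        self.cwnd = self.mss
        self.dup_acks = 0
        self.state = "slow_start"                 # caller retransmits here


s = RenoSender(mss=1000)
for _ in range(7):
    s.on_new_ack()        # slow start: cwnd grows from 1000 to 8000 bytes
for _ in range(3):
    s.on_dup_ack()        # third duplicate ACK triggers fast recovery
print(s.state, s.ssthresh, s.cwnd)  # fast_recovery 4000.0 7000.0
```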
TCP congestion control: additive increase multiplicative decrease
approach: sender increases transmission rate (window size), probing for usable bandwidth, until loss occurs
additive increase: increase cwnd by 1 MSS every RTT until loss detected
multiplicative decrease: cut cwnd in half after loss
TCP CUBIC
Is there a better way than AIMD to “probe” for usable bandwidth?
Insight/intuition:
• Wmax: sending rate at which congestion loss was detected
• congestion state of bottleneck link probably (?) hasn’t changed much
• after cutting rate/window in half on loss, initially ramp to Wmax faster, but then approach Wmax more slowly
[Figure: time axis marked t0, t1, t2, t3, t4.]
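The slides describe the intuition without a formula. As a sketch, the cubic window function from the CUBIC specification, W(t) = C·(t − K)³ + Wmax with K = ∛(Wmax·(1 − β)/C), has exactly this shape. The parameter values below are assumptions: β = 0.5 is chosen to match "cutting in half" on the slide (the specification uses a different default), C = 0.4 is the specification's constant, and the window is measured in segments.

```python
def cubic_window(t, w_max, beta=0.5, c=0.4):
    """Window (in segments) t seconds after a loss: W(t) = C(t-K)^3 + Wmax."""
    k = (w_max * (1 - beta) / c) ** (1 / 3)  # time at which W(t) returns to w_max
    return c * (t - k) ** 3 + w_max


w_max = 100
early = cubic_window(1, w_max) - cubic_window(0, w_max)  # first second after loss
late = cubic_window(5, w_max) - cubic_window(4, w_max)   # last second before Wmax
print(early > late)  # True: fast ramp at first, slow approach near Wmax
```

With these values the window starts at β·Wmax = 50 segments right after the loss and is back at Wmax after K = 5 seconds.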
TCP and the congested “bottleneck link”
TCP (classic, CUBIC) increases its sending rate until packet loss occurs at some router’s output: the bottleneck link
[Figure: source and destination hosts, each with an application / TCP / network / link / physical stack, connected across the bottleneck link. At the bottleneck, the packet queue is almost never empty and sometimes overflows (packet loss).]
[Figure: IP datagram labeled ECN=10 and ECN=11.]
TCP fairness
Fairness goal: if K TCP sessions share same bottleneck link of
bandwidth R, each should have average rate of R/K
[Figure: TCP connection 1 and TCP connection 2 share a bottleneck router of capacity R. Plot axis: Connection 1 throughput, up to R.]
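One way to see how AIMD can move two connections toward the R/K goal is a toy simulation. The sketch below makes simplifying assumptions that are not on the slide: both flows have the same RTT, both see a loss in the same round whenever their combined rate exceeds the capacity, and rates are in arbitrary units.

```python
def two_flow_aimd(x1, x2, capacity, rounds):
    """Two AIMD flows sharing one link; both halve when the link is overloaded."""
    for _ in range(rounds):
        if x1 + x2 > capacity:   # synchronized loss: a simplifying assumption
            x1, x2 = x1 / 2, x2 / 2
        else:
            x1, x2 = x1 + 1, x2 + 1
    return x1, x2


a, b = two_flow_aimd(x1=80.0, x2=10.0, capacity=100.0, rounds=500)
print(abs(a - b) < 1)  # True: the two rates end up nearly equal
```

Additive increase leaves the gap between the two rates unchanged, while each multiplicative decrease halves it, so the gap shrinks toward zero.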
Fairness: must all network apps be “fair”?
Fairness and UDP
multimedia apps often do not use TCP
• do not want rate throttled by congestion control
instead use UDP:
• send audio/video at constant rate, tolerate packet loss
there is no “Internet police” policing use of congestion control
Fairness, parallel TCP connections
application can open multiple parallel connections between two hosts
web browsers do this, e.g., link of rate R with 9 existing connections:
• new app asks for 1 TCP, gets rate R/10
• new app asks for 11 TCPs, gets roughly R/2
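The parallel-connection arithmetic can be checked directly, assuming every TCP connection on the link gets an equal share of R. The function name is made up for illustration.

```python
from fractions import Fraction


def app_share(existing, new_connections):
    """Fraction of R a new app gets if every TCP connection gets an equal share."""
    return Fraction(new_connections, existing + new_connections)


print(app_share(9, 1))    # 1/10
print(app_share(9, 11))   # 11/20, roughly R/2
```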
[Figure: connection establishment over IP. One side: TCP handshake (transport layer), then TLS handshake (security), then data. Other side: QUIC handshake, then data.]
[Figure: application HTTP GETs carried over TLS encryption and RDT, compared with HTTP GETs carried over QUIC encrypt and QUIC RDT; one RDT is marked “error!”.]
/* event: ACK received, with ACK field value of y */
if (y > SendBase) {
    SendBase = y
    /* SendBase-1: last cumulatively ACKed byte */
    if (there are currently not-yet-acked segments)
        start timer
    else
        stop timer
}
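TCP 3-way handshake FSM
[Figure: FSM of the TCP 3-way handshake.
Client: closed → SYN sent after Socket clientSocket = newSocket("hostname","port number"), sending SYN(seq=x); SYN sent → ESTAB on receiving SYNACK(seq=y,ACKnum=x+1), sending ACK(ACKnum=y+1).
Server: closed → listen after Socket connectionSocket = welcomeSocket.accept(); listen → SYN rcvd on receiving SYN(x), sending SYNACK(seq=y,ACKnum=x+1) and creating a new socket for communication back to client; SYN rcvd → ESTAB on receiving ACK(ACKnum=y+1).]
[Figure: closing a TCP connection. Labels: LAST_ACK; FINbit=1, seq=y; TIMED_WAIT; can no longer send data; ACKbit=1; ACKnum=y+1; timed wait for 2*max segment lifetime; CLOSED.]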
TCP over “long, fat pipes”
example: 1500 byte segments, 100ms RTT, want 10 Gbps throughput
requires W = 83,333 in-flight segments
throughput in terms of segment loss probability, L [Mathis 1997]:
$$\text{TCP throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$
➜ to achieve 10 Gbps throughput, need a loss rate of $L = 2 \cdot 10^{-10}$, a very small loss rate!
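Both numbers in this example can be reproduced by plugging the slide's values into the formulas. This is a sketch: the function names are made up, and the loss rate comes from solving the throughput formula above for L.

```python
def window_segments(throughput_bps, rtt_s, segment_bytes):
    """In-flight segments needed to sustain the target throughput."""
    return throughput_bps * rtt_s / (segment_bytes * 8)


def required_loss_rate(throughput_bps, rtt_s, mss_bytes):
    """Solve throughput = 1.22 * MSS / (RTT * sqrt(L)) for L."""
    return (1.22 * mss_bytes * 8 / (rtt_s * throughput_bps)) ** 2


# Slide example: 1500-byte segments, 100 ms RTT, 10 Gbps target.
print(int(window_segments(10e9, 0.1, 1500)))   # 83333
print(required_loss_rate(10e9, 0.1, 1500))     # about 2.1e-10
```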
versions of TCP for long, high-speed scenarios