DCTCP Talk
Data Center TCP (DCTCP)
Mohammad Alizadeh, Albert Greenberg, David A. Maltz, Jitendra Padhye, Parveen Patel, Balaji Prabhakar, Sudipta Sengupta, Murari Sridharan
Roadmap
What's really going on?
  Interviews with developers and operators
  Analysis of applications
  Switches: shallow-buffered vs deep-buffered
  Measurements
[Figure: example query returning Picasso quotes. Labels: MLA, Worker Nodes, Deadline = 50ms, Deadline = 10ms]
Time is money
  Strict deadlines (SLAs)
Missed deadline
Generality of Partition/Aggregate
The foundation for many large-scale web applications.
  Web search, Social network composition, Ad selection, etc.
Example: Facebook
Partition/Aggregate ~ Multiget
  Aggregators: Web Servers
  Workers: Memcached Servers
[Figure: Internet, Web Servers, Memcached Protocol, Memcached Servers]
Workloads
Partition/Aggregate (Query)
Delay-sensitive
Delay-sensitive
Throughput-sensitive
Impairments
  Incast
  Queue Buildup
  Buffer Pressure
Incast
Caused by Partition/Aggregate.
[Figure: Worker 1, Worker 2, Worker 3, and Worker 4 send to one Aggregator. Labels: TCP timeout, RTOmin = 300 ms]
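As a rough illustration of why synchronized responses hurt, the sketch below counts how many packets overflow a switch port buffer when several workers answer the aggregator at once. The buffer size, burst size, and function name are hypothetical values chosen for illustration and do not come from the talk; only RTOmin = 300 ms is taken from the slide.

```python
RTO_MIN_MS = 300  # from the slide: minimum retransmission timeout

def incast_drops(num_workers, pkts_per_worker, buffer_pkts):
    """Packets dropped when all workers burst into one empty port queue.

    Simplification (assumption): the whole burst arrives before the
    port drains anything, so only `buffer_pkts` packets are accepted.
    """
    arriving = num_workers * pkts_per_worker
    return max(0, arriving - buffer_pkts)

for workers in (4, 16, 64):
    dropped = incast_drops(workers, pkts_per_worker=8, buffer_pkts=100)
    # If a dropped response is only recovered by a timeout, the query
    # waits at least RTO_MIN_MS before the retransmission is sent.
    stall = RTO_MIN_MS if dropped else 0
    print(f"{workers:3d} workers: {dropped:3d} drops, possible stall {stall} ms")
```

With deadlines of 10ms to 50ms, a single 300 ms stall is enough to miss the deadline, which is why the drop count matters more than the average load.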
Queue Buildup
[Figure: Sender 1, Sender 2]
2. Low Latency
  Short flows, queries
3. High Throughput
  Continuous data updates, large file transfers
Low Latency
Deep Buffers: Queuing Delays Increase Latency
Shallow Buffers: Bad for Bursts & Throughput
Reduced RTOmin (SIGCOMM '09): Doesn't Help Latency
AQM, RED: Avg Queue Not Fast Enough for Incast
Objective: Low Queue Occupancy & High Throughput
DCTCP
[Figure: Sender 2, Receiver]
[Figure: Cwnd, Buffer Size B, Throughput 100%]
TCP
DCTCP
1 0 1 1 1 1 0 1 1 1
0 0 0 0 0 0 0 0 0 1
Cut window by 5%
2. Mark based on instantaneous queue length.
  Fast feedback to better deal with bursts.
Switch side:
  Mark packets when Queue Length > K.
[Figure: queue with marking threshold K. Label: Mark]
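The switch-side rule is a single comparison against the instantaneous queue length. The sketch below contrasts it with a simplified averaged-queue rule (an exponentially weighted average compared against the same threshold, which is not full RED) to suggest why an averaged signal can react late to a burst. The threshold K = 20 packets, the averaging weight 0.1, and the queue trace are illustrative assumptions, not values from the talk.

```python
K = 20          # marking threshold in packets (assumed value)
WEIGHT = 0.1    # EWMA weight for the averaged queue (assumed value)

def mark_instantaneous(queue_len, k=K):
    """Switch rule from the slide: mark when the current queue exceeds K."""
    return queue_len > k

def marks_for_trace(trace, k=K, weight=WEIGHT):
    """Return (instantaneous_marks, averaged_marks) for a queue-length trace."""
    avg = 0.0
    inst, averaged = [], []
    for q in trace:
        avg = (1 - weight) * avg + weight * q   # smoothed queue estimate
        inst.append(mark_instantaneous(q, k))
        averaged.append(avg > k)
    return inst, averaged

# A sudden burst: the queue jumps from empty to 60 packets for 3 samples.
trace = [0, 0, 60, 60, 60, 0, 0]
inst, averaged = marks_for_trace(trace)
print("instantaneous:", inst)
print("averaged:     ", averaged)
```

In this trace the instantaneous rule marks during all three burst samples, while the averaged queue never crosses K before the burst is over.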
Sender side:
  Maintain running average of fraction of packets marked (α).
  In each RTT:
[Equation image not recovered]
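A minimal sketch of the sender-side bookkeeping, assuming the running average has the usual exponentially weighted form α ← (1 − g)·α + g·F, where F is the fraction of packets marked in the last RTT, and assuming the window is cut by the factor (1 − α/2) that appears on the Analysis slide. The gain g = 1/16 and all function names are assumptions made for illustration, so this should be read as one plausible rendering rather than the slide's exact equation.

```python
G = 1 / 16  # EWMA gain g (assumed value)

def update_alpha(alpha, marked, total, g=G):
    """One per-RTT update of the running average of the marked fraction."""
    frac = marked / total if total else 0.0
    return (1 - g) * alpha + g * frac

def cut_window(cwnd, alpha):
    """Reduce cwnd in proportion to alpha; alpha = 1 halves the window."""
    return cwnd * (1 - alpha / 2)

# Cut implied by each mark pattern if alpha had converged to its fraction.
for marks in ("1011110111", "0000000001"):
    frac = marks.count("1") / len(marks)
    print(marks, f"cut = {frac / 2:.0%}")

# Alpha converging while every RTT sees 1 of 10 packets marked.
alpha = 0.0
for _ in range(100):
    alpha = update_alpha(alpha, marked=1, total=10)
print(round(alpha, 3), cut_window(100, alpha))
```

For the second mark pattern (one mark in ten packets) the implied cut is 5%, which agrees with the "Cut window by 5%" entry shown earlier in the talk.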
DCTCP in Action
[Figure: plot with an axis in Kbytes]
Why it Works
1. High Burst Tolerance
  Large buffer headroom → bursts fit.
  Aggressive marking → sources react before packets are dropped.
2. Low Latency
  Small buffer occupancies → low queuing delay.
3. High Throughput
  ECN averaging → smooth rate adjustments, low variance.
Analysis
How low can DCTCP maintain queues without loss of throughput?
How do we set the DCTCP parameters?
  Need to quantify queue size oscillations (Stability).
[Figure: Window Size vs. Time sawtooth with levels W*+1, W*, and (W*+1)(1-α/2)]
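The sawtooth levels named in the figure can be turned into a back-of-the-envelope oscillation estimate. The sketch below computes the per-flow window swing between W*+1 and (W*+1)(1 − α/2) and, under the added assumption that N flows are identical and synchronized, the swing of their summed windows. The numeric inputs and function names are hypothetical.

```python
def sawtooth(w_star, alpha):
    """Peak, trough, and swing of one flow's window, in packets."""
    peak = w_star + 1
    trough = peak * (1 - alpha / 2)
    return peak, trough, peak - trough

def aggregate_swing(n_flows, w_star, alpha):
    """Swing of the summed windows, assuming synchronized identical flows."""
    return n_flows * sawtooth(w_star, alpha)[2]

peak, trough, swing = sawtooth(w_star=49, alpha=0.2)
print(f"peak={peak} trough={trough:.1f} swing={swing:.1f}")
print(f"aggregate swing for 2 flows: {aggregate_swing(2, 49, 0.2):.1f}")
```

The swing equals (W*+1)·α/2, so a smaller α (lighter marking) directly means smaller oscillations for the same window size.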
Evaluation
Implemented in Windows stack.
Real hardware, 1Gbps and 10Gbps experiments
  90 server testbed
  Broadcom Triumph    48 1G ports    4MB shared memory
  Cisco Cat4948       48 1G ports    16MB shared memory
  Broadcom Scorpion   24 10G ports   4MB shared memory
Numerous micro-benchmarks
  Throughput and Queue Length
  Fairness and Convergence
  Multi-hop
  Incast
  Static vs Dynamic Buffer Mgmt
  Queue Buildup
  Buffer Pressure
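To put the shared-memory sizes listed above in latency terms, the following sketch computes how long a full buffer would take to drain through one port at line rate. It assumes, as a worst case, that a single port can occupy the entire shared memory and that 1 MB = 10^6 bytes; neither assumption comes from the talk.

```python
SWITCHES = [
    # (name, port speed in bit/s, shared memory in bytes)
    ("Broadcom Triumph", 1e9, 4e6),
    ("Cisco Cat4948", 1e9, 16e6),
    ("Broadcom Scorpion", 10e9, 4e6),
]

def drain_time_ms(buffer_bytes, rate_bps):
    """Milliseconds needed to transmit `buffer_bytes` at `rate_bps`."""
    return buffer_bytes * 8 / rate_bps * 1000

for name, rate, mem in SWITCHES:
    print(f"{name}: {drain_time_ms(mem, rate):.1f} ms")
```

Under these assumptions, a full 16MB buffer holds four times the queuing delay of a 4MB buffer at the same port speed, which is the latency cost of deep buffers that the talk points to.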
Metric: Flow completion time for queries and background flows.
We use RTOmin = 10ms for both TCP & DCTCP.
Baseline
[Figure: panels for Background Flows and Query Flows]
[Figure: panels for Short messages and Query]
Conclusions
DCTCP satisfies all our requirements for Data Center packet transport.
  Handles bursts well
  Keeps queuing delays low
  Achieves high throughput
Features:
  Very simple change to TCP and a single switch parameter.
  Based on mechanisms already available in Silicon.