Scaling BGP

Luc De Ghein – Technical Leader Services


BRKRST-3321
Agenda
• Introduction
• Goal
• Scale Challenges
• Memory Utilization
• Full mesh iBGP
• Update Groups
• Slow Peer
• RR Problems & Solutions
• Deployment
• Multi-Session
• MPLS VPN
• OS Enhancements
• Key Takeaways
“We’re Gonna Need a Bigger Boat”

Jaws

BRKRST-3321 © 2016 Cisco and/or its affiliates. All rights reserved. Cisco Public 4
Goal of this Session
Covered
• Causes of scale challenges
• Solutions for scaling BGP
• What you control
• Pick the right BGP feature
• Design the network properly

Not covered
• Scaling numbers
  • # neighbors, # prefixes, # convergence time
• Buy a bigger box

Success of BGP - Scale Challenges
• BGP has been around forever
• Very robust
• Scales the Internet’s growth
• More features
• More multipath, faster convergence

More Services by BGP
[Figure: timeline with marks at 1990, 1995, 1999, 2002, 2009, 2012 and 2015 showing services and features added to BGP; row labels: services, scaling.]
Items shown: IPv4 IDR, IPv4 enterprise, MPLS VPN, BGP FC, PIC, BGP flowspec, IPv6 IDR, 6PE, 6VPE, BGP monitoring protocol, Path Diversity, multicast, mVPN, mVPN Auto Discovery, C-signalling, BGP ORR, BGP PW Signalling (VPLS), BGP-LS, BGP MAC Signaling (EVPN), AIGP, Inter-AS MPLS VPN, DMVPN, Unified MPLS

Service Address Families (For Your Reference)
Categories: IPv4, IPv6, unicast, multicast, vpn, multicast in overlay, Layer 2, linkstate
• IPv4 unicast, IPv4 multicast, IPv4 MVPN, IPv4 MDT, IPv4 tunnel
• IPv6 unicast, IPv6 multicast, IPv6 MVPN
• vpnv4 unicast, vpnv4 multicast, vpnv6 unicast, vpnv6 multicast
• nsap unicast, l2vpn vpls, l2vpn evpn, l2vpn mspw
• rtfilter unicast, linkstate
• IPv4 Flowspec, IPv6 Flowspec, vpnv4 Flowspec, vpnv6 Flowspec

Memory Utilization
High Memory Utilization - Solutions
• # Prefixes
  • aggregate
  • filter prefixes
  • partial routing table
• # Paths (possibly many per prefix)
  • reduce # BGP peers
  • do not use iBGP full mesh
• # Attributes (possibly many per path)
  • filter (extended) communities
  • limit own attributes

High Memory Utilization
soft reconfiguration inbound
• Filtered prefixes are stored: much more memory used
• Support only on router itself
• Changed filter: re-apply policy to table with filtered prefixes
[Figure: BGP updates from AS 10 to AS 20 are stored in a pre-filter BGP table, then pass the inbound filter into the BGP table.]

route refresh
• Filtered prefixes are dropped
• Support needed on peer, but this is a very old feature
• Changed filter: router sends out route refresh request to peer to get the full table from peer again
[Figure: BGP updates from AS 10 to AS 20 pass the inbound filter directly into the BGP table.]

Full Mesh iBGP
Is Full Mesh iBGP Scalable?
• Per BGP standard: iBGP needs to be full mesh
• Total iBGP sessions = n * (n-1) / 2
• Sessions per BGP speaker = n - 1

• Two solutions
1. Confederations
2. Route reflectors
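
The session counts above can be checked with a few lines of Python. This is a minimal sketch, not part of the original deck: the function names are made up for illustration, and the route reflector variant assumes a flat design in which every client peers with every RR and the RRs are fully meshed among themselves.

```python
def full_mesh_sessions(n: int) -> int:
    """Total iBGP sessions in a full mesh of n speakers: n * (n - 1) / 2."""
    return n * (n - 1) // 2


def route_reflector_sessions(n: int, num_rr: int) -> int:
    """Sessions when num_rr of the n speakers act as RRs (assumed flat design).

    Assumption: every client peers with every RR, and the RRs are fully
    meshed among themselves.
    """
    clients = n - num_rr
    return clients * num_rr + full_mesh_sessions(num_rr)


# Compare both designs for a growing number of BGP speakers.
for n in (10, 100, 1000):
    print(n, full_mesh_sessions(n), route_reflector_sessions(n, 2))
```

The full mesh grows quadratically with n, while the RR design grows linearly, which is the reason the two solutions above exist.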

Confederations
• Create # of sub-AS inside the larger confederation
• Confederation AS looks like normal AS to the outside
• Full mesh iBGP still needed inside subAS
• No full mesh needed between subAS (it’s eBGP)
• Every BGP peer needs to be in a subAS
• Each subAS can have different IGP with next-hop-self within confed
• Flexible confed eBGP peerings
  • No connectivity needed between any subAS’s
  • Redundancy needed vs increased memory/CPU
  • But full mesh between subAS’s is not needed
[Figure: AS 100 divided into subAS 65001, 65002, 65003 and 65004, containing routers R1 to R16; legend: confed eBGP, iBGP, eBGP.]

Route Reflectors
• A route reflector is an iBGP speaker that reflects routes learned from iBGP peers to other iBGP peers, called RR clients
• iBGP full mesh is turned into hub-and-spoke
• RR is the hub in a hub-and-spoke design
[Figure: AS 100 with two RRs and their clients R1 to R4 in two clusters, plus non-client R5; eBGP session to AS 101; legend: iBGP, eBGP. Callouts: "Any router can peer", "RR clients are regular iBGP peers".]

Hierarchical Route Reflectors
• Chain RRs to keep the full mesh between RRs and non-clients small
• Make RRs clients of other RRs
  • RR is a RR and RR client at the same time
• iBGP topology should follow physical topology
  • Prevents suboptimal routing, blackholing, and routing loops
• RRs in top tier need to be fully meshed
• There is no limit to the number of tiers
[Figure: three tiers of RRs (Tier 1, Tier 2, Tier 3); an RR below the top tier is both RR and RR client.]

Route Reflector – Same Cluster-ID or Not?
RR1 and RR2 have different cluster-ID (default)
• RR1 stores the path from RR2
• RR1 uses additional CPU and memory
  • Potentially for many routes
• Additional memory and processor overhead on RR

RR1 and RR2 have the same cluster-ID
• RR1 has only 1 path for routes from RRC2
• If one link RR to RR-client fails
  – iBGP session remains up, it is between loopback IP addresses
• Fewer redundant paths
[Figure: RR1 and RR2, each peering with clients RRC1 and RRC2, drawn once per case.]

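
The difference between the two cases comes from the CLUSTER_LIST loop check: an RR adds its cluster ID when it reflects a route and ignores a received route that already carries its own cluster ID. The sketch below illustrates that rule only; the function names and cluster ID values are assumptions for illustration, not taken from the deck.

```python
def reflect(cluster_list, local_cluster_id):
    """Return the CLUSTER_LIST an RR attaches when it reflects a route."""
    return [local_cluster_id] + list(cluster_list)


def accepts(cluster_list, local_cluster_id):
    """An RR ignores a route whose CLUSTER_LIST already holds its own ID."""
    return local_cluster_id not in cluster_list


# Route from RRC2, reflected by RR2 towards RR1 (cluster IDs are made up).
same_id = reflect([], "0.0.0.1")        # RR2 shares the cluster ID of RR1
different_id = reflect([], "0.0.0.2")   # RR2 uses its own cluster ID

print(accepts(same_id, "0.0.0.1"))       # RR1 view when the IDs are the same
print(accepts(different_id, "0.0.0.1"))  # RR1 view when the IDs differ
```

With the same cluster ID, RR1 discards the copy from RR2 and keeps only the path learned directly from RRC2; with different IDs it stores both, at the cost of memory and CPU.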
Picking RRs
How many? Where? Which kind?

Redundancy
• Sets of two (primary and backup)

Location
• Geo
• Datacenter
• Region

Services
• To scale: sets per (group of) address families
• Example: service 1 (one or more AFs) and service 2 (one or more AFs), each with a primary and a backup RR

Dedicated RR
• No forwarding (no FIB)
• RIB and BGP/IGP
• Needed resources: memory, CPU
• Platforms shown: 7200, ASR1K

Virtual router (vRR)
• Platforms shown: ASR9Kv, CSR1000V
• Mobility
• Manageability
• Same BGP implementation and software version as deployed on the Edge (XE/XR)
• Reduced physical footprint (power/cooling/cabling)
• Performance (multi-core) / memory (64-bit)

BGP RR Scale - Selective RIB Download
• Blocks some or all of the BGP prefixes from being installed into the RIB (and FIB)
• Only for RR which is not in the forwarding path
• For AFs IPv4/6
  • not needed for AFs vpnv4/6
• Benefit
  • Saves on memory and CPU
  • ASR1k testing indicated 300% of RR-client session scaling (in order of 1000s)
• Implemented as filter extension to table-map command

configuration (IOS)
router bgp 1
 address-family ipv4
  table-map block-into-fib filter
route-map block-into-fib deny 10

Result: no BGP prefixes in RIB, no BGP prefixes in FIB
RR1#show ip route bgp
RR1#
RR1#show ip cef
RR1#

configuration (IOS-XR)
route-policy block-into-fib
 if destination in (...) then
  drop
 else
  pass
 endif
router bgp 1
 address-family ipv4 unicast
  table-policy block-into-fib

Multi-Cluster ID
• An RR can belong to multiple clusters
• On IBGP neighbor of RR: cluster IDs on a per-neighbor basis
  • The global cluster ID is still there
• Intra-cluster client-to-client reflection can be disabled, when clients are meshed
  • Can be disabled for all clusters or per cluster
  • More work - sending more updates - for RR clients
  • Less work - sending fewer updates - for RRs
• Each set of peers in cluster ID has its own update group
• Loop-prevention mechanism is modified
  • Taking into account multiple cluster IDs

router bgp 1
 no bgp client-to-client reflection intra-cluster cluster-id 0.0.0.1
 no bgp client-to-client reflection intra-cluster cluster-id 0.0.0.2

[Figure: one RR serving cluster ID 1 (PE1, PE2) and cluster ID 2 (PE3, PE4), with no reflection inside each cluster.]

Full Mesh eBGP
BGP Route Server
• Alternative to eBGP full mesh
• Used by IX (Internet eXchange) providers
• Operational simplicity
• Reduces CPU/memory/configuration
• Context policy can be used

router bgp 999
 route-server-context rs-context
 !
  address-family ipv4 unicast
   import-map rs-import-map
 !
 neighbor 10.1.1.1 remote-as 100
 !
 address-family ipv4
  neighbor 10.1.1.1 route-server-client context rs-context
!
ip as-path access-list 100 permit ^200$
!
route-map rs-import-map permit 10
 match as-path 100

[Figure: left, eBGP full mesh between R1 to R6 in AS 100 to AS 600; right, the same routers each peering over eBGP with route server R7 in AS 999. Callouts: "no bgp enforce-first-as", "Transparent AS", "Next-hop preserved".]

Update Groups
Grouping of BGP Neighbors: Optimization
Configuration/administration
• peer groups
• templates, session-groups, af-groups, neighbor-group
• CLI only

Performance/scalability
• update groups
• Dynamic grouping of BGP peers according to common outbound policy
• Networks that have the same best-path attributes can be grouped into the same message, improving packing efficiency
• BGP formats the update messages once and then replicates to all members of the update group
  • replication instead of formatting updates per neighbor: efficiency
• dynamic = policy changes, update group membership changes
• AF independent: a peer can belong to different update groups in different address families
• BGP neighbors with same outbound policy will be put in the same update group regardless if
  • peer-groups are defined
  • templates are defined
  • neighbors are individually defined

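
The "format once, replicate to all members" idea can be sketched as follows. This is an illustrative model only: the function names, the policy tuples and the neighbor addresses are assumptions, and a real implementation compares many more outbound attributes than shown here.

```python
from collections import defaultdict


def build_update_groups(neighbors):
    """Group neighbors by outbound policy (any hashable value)."""
    groups = defaultdict(list)
    for address, outbound_policy in neighbors.items():
        groups[outbound_policy].append(address)
    return dict(groups)


def advertise(groups, format_update):
    """Format one message per group, then replicate it to every member."""
    formatted = 0
    sent = []
    for policy, members in groups.items():
        message = format_update(policy)  # expensive step, done once per group
        formatted += 1
        sent.extend((member, message) for member in members)
    return formatted, sent


# Hypothetical neighbors: three RR clients without policy, one eBGP peer.
neighbors = {
    "10.100.1.1": ("rr-client", "no-route-map"),
    "10.100.1.2": ("rr-client", "no-route-map"),
    "10.100.1.3": ("rr-client", "no-route-map"),
    "10.200.1.1": ("ebgp", "route-map OUT"),
}
groups = build_update_groups(neighbors)
formatted, sent = advertise(groups, lambda policy: f"UPDATE for {policy}")
print(len(groups), formatted, len(sent))
```

Four neighbors receive an update, but only two messages are formatted, one per update group.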
Update Group Replication (IOS)
• Update groups are very useful on all BGP speakers
  – but mostly on RR due to
    • # of peers
    • equal outbound policy
• iBGP typically has no outbound policy
  – RRs have large number of iBGP peers in one update group
[Figure: the RR formats 1 BGP update and replicates it to peers 1 ... n, which share the same outbound policy.]

RR#show ip bgp replication 2
                                                     Current  Next
Index  Members  Leader      MsgFmt  MsgRepl  Csize   Version  Version
2      101      10.100.1.2  2013    24210    0/2000  3201/0

• Index: update group 2
• Members: total # of members
• Leader: formatting according to leader’s policy
• MsgFmt: # of formatted messages
• MsgRepl: # of replications
• Csize: size of cache

Update Groups in IOS
• Cache = place to store formatted BGP messages before they are sent
• Cache is adaptive -> faster convergence
  • queue depth from 100 to 5000, sized according to:
    • Number of peers in an update group
    • Installed system memory
    • Type of address family
    • Type of peers in an update group
• Parallel processing of Route-Refresh/new BGP peers
  • By tracking the (re-)starting BGP peers: process full update to these peers, while maintaining transient updates to established peers
  • By using special refresh update groups for (re-)starting peers

Update Groups in IOS XR
[Figure: hierarchy address family > update groups > sub-groups > refresh sub-groups > filter groups > neighbors.]

RP/0/6/CPU0:router#show bgp vpnv4 unicast update-group
Update group for VPNv4 Unicast, index 0.2:
  Attributes:
    Internal
    Common admin
    First neighbor AS: 1
    Send communities
    Send extended communities
    Route Reflector Client
    4-byte AS capable
    Send AIGP
    Minimum advertisement interval: 0 secs
  Update group desynchronized: 0
  Sub-groups merged: 5
  Number of refresh subgroups: 0
  Messages formatted: 36, replicated: 68
  All neighbors are assigned to sub-group(s)
    Neighbors in sub-group: 0.2, Filter-Groups num:3
     Neighbors in filter-group: 0.3(RT num: 3)
      10.1.100.1
     Neighbors in filter-group: 0.1(RT num: 3)
      10.1.100.2
     Neighbors in filter-group: 0.2(RT num: 3)
      10.1.100.8

Slow Peer
Slow Peer (IOS)
• slow peer = cannot keep up with the rate at which we are generating update messages over a prolonged period of time (order of minutes)
• filled up cache: blocking all peers

Phases
• detection phase: track peer queue
• protection phase: “slow” update group
• recovery phase: slow update group is no longer slow

Possible causes
• High CPU
• Transport issues (packet loss/loaded links/TCP)

[Figure: RR with update group 1 and a separate “slow” update group; callout: convergence speed of update group OK.]

%BGP-5-SLOWPEER_DETECT: Neighbor IPv4 Unicast 10.100.1.1 has been detected as a slow peer
%BGP-5-SLOWPEER_RECOVER: Slow peer IPv4 Unicast 10.100.1.1 has recovered

Allows for fast and slow peers to proceed at their own speed

Slow Peer CLI
configuration
• detection
  • per AF
  • per VRF
  • per peer
  • per peer policy template
• protection, static
  • per peer(-group)
  • per peer policy template
• protection, dynamic
  • per AF
  • per VRF
  • per peer
  • per peer policy template
  • optional: permanent = peer is not moved back automatically to the update group

show commands
• show bgp ... slow command

clear commands
• clear bgp ... slow command
• This is a forced clear of the slow-peer status; the peer is moved to the original update group

Old Slow Peer Solution
Solution before this feature: manual movement

• Create a different outbound policy for the slow peer

• Policy must be different than any other


• You do not want the slow peer to move to another already existing update group

• Use something that does not affect the actual policy


• For example: change minimum advertisement interval (MRAI) of the peer (under AF)
• Also avoiding the cause for a full update (equivalent of a route-refresh)

router bgp 1
address-family vpnv4
neighbor 10.100.1.1 advertisement-interval 1

Slow Peer Mechanism Details (For Your Reference)
Identifying Slow Peer

RR#show bgp ipv4 unicast update-group 1 summary
Summary for Update-group 1, Address Family IPv4 Unicast
BGP router identifier 10.100.1.5, local AS number 1
BGP table version is 500001, main routing table version 500001
100000 network entries using 14400000 bytes of memory
BGP using 24373520 total bytes of memory
BGP activity 115574/15574 prefixes, 300000/200000 paths, scan interval 60 secs

Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
10.100.1.1      4     1    1257   67368   402061    0 2000 18:56:16        0
10.100.1.2      4     1    1219   23362   402061    0    0 18:23:46        0
10.100.1.3      4     1    1257   23398   402061    0    0 18:56:42        0
10.100.1.4      4     1   10002    1891   402061    0    0 00:01:37   100000

• Convergence is achieved if all peers are at the table version
• Is the output queue (OutQ) not empty, persistently?

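
The two checks on this output can be automated in a small script. The sketch below parses the neighbor rows copied from the output above; the function names are made up for illustration, and a single snapshot only yields candidates, since a peer counts as slow only if the queue stays non-empty over time.

```python
NEIGHBOR_ROWS = """\
10.100.1.1 4 1 1257 67368 402061 0 2000 18:56:16 0
10.100.1.2 4 1 1219 23362 402061 0 0 18:23:46 0
10.100.1.3 4 1 1257 23398 402061 0 0 18:56:42 0
10.100.1.4 4 1 10002 1891 402061 0 0 00:01:37 100000
"""


def parse_neighbors(text):
    """Parse neighbor rows: Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ ..."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 10:
            continue  # skip anything that is not a neighbor row
        rows.append({
            "neighbor": fields[0],
            "tbl_ver": int(fields[5]),
            "out_q": int(fields[7]),
        })
    return rows


def slow_peer_candidates(rows):
    """Peers with a non-empty output queue in this snapshot."""
    return [row["neighbor"] for row in rows if row["out_q"] > 0]


def converged(rows, table_version):
    """True when every peer has reached the BGP table version."""
    return all(row["tbl_ver"] == table_version for row in rows)


rows = parse_neighbors(NEIGHBOR_ROWS)
print(slow_peer_candidates(rows))
print(converged(rows, 500001))
```

Running the same check on several snapshots a few minutes apart would separate a persistently slow peer from one that is only briefly behind.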
RR Problems & Solutions
Best Path Selection - Route Advertisement on RR
[Figure: CE1 (prefix Z) is dual-homed to PE1 and PE2, which both advertise Z to the RR. The RR holds path 1 (NH: PE1, best) and path 2 (NH: PE2) but advertises only NH: PE1 to PE3, so the ingress PE does not learn the 2nd path.]

• The BGP4 protocol specifies the selection and propagation of a single best path for each prefix
• If RRs are used, only the best path will be propagated from RRs to ingress BGP speakers
• Multipath on the RR does not solve the issue of RR only sending best path
• This behavior results in a number of disadvantages for new applications and services

Why Having Multiple Paths?
• Convergence
• BGP Fast Convergence (multiple paths in local BGP table)
• BGP PIC Edge (backup paths ready in forwarding plane)

• Multipath load balancing


• ECMP

• Allow hot potato routing


• = use optimal route
• The optimal route is not always known on the border routers

• Prevent oscillation
• The additional info on backup paths leads to local recovery as opposed to relying on iBGP
• Stop persistent route oscillations caused by comparison of paths based on MED in topologies
where route reflectors or the confederation structure hide some paths (pretty rare)

Diverse BGP Path Distribution
Overview

• VPN unique RD (Route Distinguisher)


• BGP Best External
• BGP shadow RR / session
• BGP Add-Path
• BGP ORR

Unique RD for MPLS VPN
[Figure: CE1 (prefix Z) is dual-homed to the VRFs on PE1 (RD1) and PE2 (RD2). The RR reflects both Z/RD1 (NH: PE1) and Z/RD2 (NH: PE2) to PE3, whose VRF holds path 1 (NH: PE1) and path 2 (NH: PE2).]

• Unique RD per VRF per PE
• One IPv4 prefix in one VRF becomes unique vpnv4 prefix per VPN per PE
• RR advertises all paths
• Available since the beginning of MPLS VPN, but only for MPLS VPN

Shadow Route Reflector (aka RR Topologies)
[Figure: CE1 (prefix Z) is dual-homed to PE1 and PE2. RR1 holds path 1 (NH: PE1, best) and path 2 (NH: PE2) and advertises NH: PE1. Shadow RR2 holds path 1 (NH: PE1, best) and path 2 (NH: PE2, 2nd best) and advertises NH: PE2. PE3 ends up with both paths.]

router bgp 1
 address-family ipv4
  bgp additional-paths select backup
  neighbor 10.100.1.3 advertise diverse-path backup

• Easy deployment
• One additional “shadow” RR per cluster
• RR2 does announce the 2nd best path, which is different from the primary best path on RR1 by next hop

Shadow Route Reflector – RR Placement
Note: primary RRs do not need diverse path code

RR and shadow RR are co-located
• They’re on same vlan with same IGP metric towards prefix
• Note: primary and shadow RRs do not need to turn off IGP metric check
[Figure: RR1 and shadow RR2 at equal distance from PE1 and PE2. RR1: path 1 NH: PE1, best; path 2 NH: PE2. RR2: path 1 NH: PE1, best; path 2 NH: PE2, 2nd best.]

RR and shadow RR are not co-located
• All links have the same IGP cost
• RR1: path 1 NH: PE1, best; path 2 NH: PE2
• RR2: path 1 NH: PE1, 2nd best; path 2 NH: PE2, best
• RR2 advertises same path as RR1!
• Note: primary and shadow RRs need to turn IGP metric check off. All RRs need to calculate the same best path so that primary and shadow RRs do not advertise the same path

solution
RR(config-router-af)#bgp bestpath igp-metric ignore

Shadow Session
Note: second session from RR to RR-client (PE3) has diverse-path command in order to advertise 2nd best path

[Figure: CE1 (prefix Z) is dual-homed to PE1 and PE2. The RR holds path 1 (NH: PE1, best) and path 2 (NH: PE2, 2nd best) and advertises them to PE3 over two iBGP sessions; PE3 holds path 1 (NH: PE1) and path 2 (NH: PE2).]

• Easy deployment – only RR needs diverse path code and new iBGP session per each extra path (CLI knob on RR)
• Shadow iBGP session does announce the 2nd best path
• 2nd session between a pair of routers is no issue (use different loopback interfaces)

ADD Path
[Figure: CE1 (prefix Z) is dual-homed to PE1 and PE2. The RR holds path 1 (NH: PE1, best) and path 2 (NH: PE2, best2) and advertises both to PE3, which holds path 1 (NH: PE1, best) and path 2 (NH: PE2, backup/repair).]

configuration on the RR
router bgp 1
 address-family ipv4
  bgp additional-paths select best 2
  bgp additional-paths send
  neighbor PE3 advertise additional-paths best 2

configuration on the PE
router bgp 1
 address-family ipv4
  bgp additional-paths receive
  bgp additional-paths install

• PE routers need to run newer code in order to understand second path
• Path-identifier used to track different paths

Add Path - Possibilities
add-all-path (bgp additional-paths select all)
• RR will do the first best path computation and then send all paths to the border routers
• Pros
  • all paths are available on border routers
• Cons
  • all paths stored
  • more BGP info is exchanged
• Usecase: ECMP, hot potato routing

add-n-path (bgp additional-paths select best <N>)
• RR will do best path computation for up to n paths and send n paths to the border routers
• This is the only mandatory selection mode
• Pros
  • less storage used for paths
  • less BGP info exchanged
• Cons
  • more best path computation
• Usecase: Primary + n-1 backup scenario = fast convergence (n is limited to 3 (IOS) or 2 (IOS-XR), to preserve CPU power)

multipath (IOS-XR only)
• RR will do the first best path computation and then send all multipaths to the border routers
• Use case: load balancing and primary + backup scenario

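
The difference between the add-n-path and add-all-path modes can be sketched as a selection over an already ranked list of paths. This is an illustration only: the function name, the mode strings and the way path identifiers are numbered are assumptions, not router behavior taken from the deck.

```python
def paths_to_advertise(ranked_paths, mode, n=2):
    """Pick the paths an RR would send for one prefix.

    ranked_paths: next hops ordered best first (assumed to be the result of
    the best-path computation). mode: "best-n" or "all" (hypothetical names).
    """
    if mode == "best-n":
        chosen = ranked_paths[:n]
    elif mode == "all":
        chosen = list(ranked_paths)
    else:
        raise ValueError(f"unknown mode: {mode}")
    # Pair each path with a path identifier so the receiver can tell the
    # paths for the same prefix apart (numbering here is made up).
    return list(enumerate(chosen, start=1))


ranked = ["PE1", "PE2", "PE3"]
print(paths_to_advertise(ranked, "best-n", n=2))
print(paths_to_advertise(ranked, "all"))
```

With best-n the border router stores and receives fewer paths; with all it sees every exit, which is what ECMP and hot potato routing need.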
Add-Path - IOS-XR (For Your Reference)
• Path selection is configured in a route-policy
• Global command, per address family, to turn on add-path in BGP
• Configuration in VPNv4 mode applies to all VRF IPv4-Unicast AF modes unless overridden at individual VRFs

example config
router bgp 1
 address-family vpnv4
  additional-paths install backup (deprecated)
  additional-paths advertise
  additional-paths receive
  additional-paths selection route-policy apx

example RPL config
! add-n-path
route-policy ap1
 if community matches-any (1:1) then
  set path-selection backup 1 install
 elseif destination in (10.1.0.0/16, 10.2.0.0/16) then
  set path-selection backup 1 advertise install
 endif
! add-all-path
route-policy ap2
 set path-selection all advertise
! multipath
route-policy ap3
 set path-selection multipath advertise
! needed to have a non-multipath path as backup path
route-policy ap4
 set path-selection backup 1 install multipath-protect advertise

Hot Potato Routing - No RR
• Hot potato routing = packets are passed on (to next AS) as soon as received
• Shortest path through own AS must be used
• In transit AS: same prefix could be announced many times from many eBGP peers

[Figure: PE1, PE2 and PE3 each learn prefix Z over eBGP and advertise it over full mesh iBGP. PE4 holds path 1 (NH: PE1), path 2 (NH: PE2) and path 3 (NH: PE3, best).]

Hot Potato Routing - With RR
• Introducing RRs breaks hot potato routing
• Solutions: Unique RD for MPLS VPN or Add Path
• Cause: step 8 in the BGP best path selection algorithm (lowest IGP metric to the BGP next hop)

[Figure: PE1, PE2 and PE3 each learn prefix Z over eBGP and advertise it to the RR. The RR holds path 1 (NH: PE1, best), path 2 (NH: PE2) and path 3 (NH: PE3) and reflects only NH: PE1, so PE4 holds a single path (NH: PE1, best).]

Hot Potato Routing in Large Transit SP
[Figure: two topologies of border routers (BR) around a core of RRs.]

• Large transit ISPs with full mesh iBGP between regional RRs and hub/spoke between local BR and RR
• Full mesh and global hot potato routing

add-path
• add-all-path could be deployed between centralized and regional RR’s
• Also possible: remove the need for regional RR if all BR routers support add-path

BR = Border Router

BGP Optimal Route Reflection (ORR)
• Another way to allow hot-potato routing with RR
• Step 8 in the BGP best path selection algorithm is still the issue
• The RR can choose to send a different best path to different BGP border routers or set of border routers
• The RR will perform the BGP best path calculation from the perspective of the ingress border router
• The RR can run a Shortest Path First (SPF) calculation with the ingress border router as the root of the tree and calculate the cost to every other router
• Only RR needs ORR code
• Must have Link-State routing protocol
• Support per address family

[Figure: BR1, BR2 and BR3 each learn prefix Z over eBGP; core routers P1 to P4; the RR sends Z with NH = BR1, NH = BR2 or NH = BR3 to PE1, PE2 and PE3 depending on their position.]

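
The SPF step described above can be sketched with a plain Dijkstra run rooted at the ingress router instead of at the RR. This is a minimal illustration, not the ORR implementation: the topology, link costs and function names are invented, and only the IGP-cost comparison is modelled.

```python
import heapq

# Hypothetical IGP topology: (router, router, cost).
EDGES = [
    ("RR", "BR1", 1), ("RR", "BR3", 5), ("RR", "BR2", 8),
    ("BR3", "PE1", 1), ("BR2", "PE2", 1), ("BR1", "PE2", 5),
]


def build_graph(edges):
    """Build a symmetric adjacency map from the edge list."""
    graph = {}
    for a, b, cost in edges:
        graph.setdefault(a, {})[b] = cost
        graph.setdefault(b, {})[a] = cost
    return graph


def spf(graph, root):
    """Dijkstra: IGP cost from root to every reachable router."""
    dist = {root: 0}
    heap = [(0, root)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for neighbor, cost in graph[node].items():
            new_dist = d + cost
            if new_dist < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_dist
                heapq.heappush(heap, (new_dist, neighbor))
    return dist


def closest_next_hop(graph, root, next_hops):
    """Pick the BGP next hop with the lowest IGP cost as seen from root."""
    dist = spf(graph, root)
    return min(next_hops, key=lambda nh: (dist.get(nh, float("inf")), nh))


graph = build_graph(EDGES)
exits = ["BR1", "BR2", "BR3"]
for router in ("RR", "PE1", "PE2"):
    print(router, closest_next_hop(graph, router, exits))
```

In this made-up topology the RR's own view picks BR1 for everyone, while the per-ingress view picks BR3 for PE1 and BR2 for PE2, which is the hot potato result.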
Fast Convergence
BGP PIC (Prefix Independent Convergence) Edge
Problem
• Convergence in flat FIB is prefix dependent
  • More prefixes -> more convergence time
• Classical convergence (flat FIB)
  • Routing protocols react - update RIB - update CEF table (for affected prefixes)
  • Time is proportional to # of prefixes

Solution
• The idea of PIC:
  • In both SW and HW:
    • Pre-install a backup path in RIB
    • Pre-install a backup path in FIB
    • Pre-install a backup path in LFIB

Result
• Improved convergence
• Reduce packet loss
• Have the same convergence time for all BGP prefixes (PIC)

MPLS VPN Dual Homed CE - No PIC Edge
[Figure: CE1 (prefix Z) is dual-homed to PE1 and PE2; ingress PE3 holds path 1 (NH: PE1, best) and path 2 (NH: PE2).]

Steps in convergence
1. Egress PE goes down
2. IGP notifies ingress PE in sub-second

Steps in convergence on ingress PE
1. Ingress PE recomputes BGP bestpath
2. Ingress PE installs new BGP bestpath in RIB
3. Ingress PE installs new BGP bestpath in FIB
4. Ingress PE reprograms hardware

MPLS VPN Dual-Homed CE - PIC Edge
[Figure: CE1 (prefix Z) is dual-homed to PE1 and PE2; ingress PE3 holds path 1 (NH: PE1, best) and path 2 (NH: PE2, backup/repair).]

router bgp 1
 address-family vpnv4
  bgp additional-paths install

Steps in convergence
1. Egress PE goes down
2. IGP notifies ingress PE in sub-second

Steps in convergence on ingress PE
1. Switch to repair path with new Next Hop
2. Ingress PE reprograms hardware

We eliminate convergence dependence on the following, which scales with the number of prefixes:
• Scanning of the BGP table
• Bestpath calculation (because there is a pre-computed backup/repair path)
• Time to generate and propagate updates (PE and RR)
• Updating the FIB (with PIC the FIB update is prefix independent)

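
Why the switch to the repair path is prefix independent can be illustrated with a shared forwarding entry. This is a conceptual sketch under stated assumptions: the class and function names are hypothetical, and real FIB and hardware structures are more involved than a Python dictionary.

```python
class PathList:
    """Shared forwarding entry: one primary and one pre-installed backup."""

    def __init__(self, primary, backup):
        self.primary = primary
        self.backup = backup
        self.active = primary

    def switch_to_backup(self):
        """A single change, regardless of how many prefixes point here."""
        self.active = self.backup


def build_fib(prefixes, path_list):
    """Every prefix references the same PathList object (hierarchical FIB)."""
    return {prefix: path_list for prefix in prefixes}


shared = PathList(primary="PE1", backup="PE2")
fib = build_fib((f"10.{i // 256}.{i % 256}.0/24" for i in range(50000)), shared)

shared.switch_to_backup()  # egress PE1 went down
print(fib["10.0.0.0/24"].active, fib["10.195.79.0/24"].active)
```

In a flat FIB each of the 50000 entries would need its own rewrite; here one update to the shared entry moves all of them to the backup next hop.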
No BGP Best External – Default BGP Policy
[Figure: CE1 (prefix Z) is dual-homed to PE1 and PE2; full mesh iBGP between PE1, PE2 and PE3; BGP policies are all default.]

• PE1: path 1: NH: CE1, localpref 100, external, best
• PE2: path 1: NH: CE1, localpref 100, external, best
• PE1 and PE2 each advertise Z with their own next hop and localpref 100 to their iBGP peers
• PE3: path 1: NH: PE1, internal, localpref 100, best; path 2: NH: PE2, internal, localpref 100, backup/repair

No BGP Best External - Changed BGP Policy
[Figure: CE1 (prefix Z) is dual-homed to PE1 and PE2; PE1 sets local preference 200.]

• PE1: path 1: NH: CE1, localpref 200, external, best
• PE1 advertises Z with NH: PE1, localpref 200 to PE2 and PE3
• PE2: path 1: NH: CE1, localpref 100, external; path 2: NH: PE1, localpref 200, internal, best
• PE3: path 1: NH: PE1, internal, localpref 200, best; no backup/repair path

Even with full mesh in iBGP, policy can prevent egress PE from learning all paths.
If default policy is changed, one egress PE could have iBGP path to other egress PE as best path and not its own external BGP path.

BGP Best External - Changed BGP Policy
[Figure: CE1 (prefix Z) is dual-homed to PE1 and PE2; PE1 sets local preference 200; PE2 runs Best External.]

• PE1: path 1: NH: CE1, external, best; path 2: NH: PE2, localpref 100, internal, backup/repair
• PE2: path 1: NH: CE1, external, best backup/repair, advertise-best-external; path 2: NH: PE1, localpref 200, internal, best
• PE3: path 1: NH: PE1, internal, localpref 200, best; path 2: NH: PE2, localpref 100, internal, backup/repair

• With Best External, the backup PE (PE2) still propagates its own best external path to the RRs or iBGP peers
• PE1 and PE3 learn 2 paths

router bgp 1
 address-family vpnv4
  bgp additional-paths install
  bgp additional-paths select best-external
  neighbor x.x.x.x advertise best-external

Deployment
BGP Selective Download
• Access router RIB holds full Internet routing table, but fewer routes in FIB
  • Example: ME switches, ASR900
• FIB holds default route and selective more specific routes
• Enterprise CPE devices will receive full Internet routes through their BGP peering with the access router(s)

configuration
router bgp 1
 address-family ipv4
  table-map filter-into-fib filter
route-map filter-into-fib deny 10
 match community 100
ip community-list 100 permit 65510:100

[Figure: ISP ASBRs (RIB: full Internet routes, FIB: full Internet routes) connect over iBGP to the access router (bRIB: full Internet routes, FIB: default and filtered routes), which connects over eBGP to the enterprise CPE (RIB: full Internet routes, FIB: full Internet routes).]

Path MTU Discovery (PMTUD)
• MSS (Max Segment Size) – Limit on the largest segment that can traverse a TCP session
• Anything larger must be fragmented & re-assembled at the TCP layer
• MSS is 536 bytes by default for client BGP without PMTUD
• Enable PMTU for BGP with
• Older command “ip tcp path-mtu-discovery”
• Newer command “bgp transport path-mtu-discovery” (PMTUD now on by default)

• 536 bytes is inefficient for Ethernet (MTU of 1500 or more) or POS (MTU of 4470) networks
• TCP is forced to break large segments into 536 byte chunks
• Adds overheads
• Slows BGP convergence and reduces scalability

• TCP MSS set per neighbor (IOS-XR 5.4)
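
The cost of the 536-byte default can be estimated with simple arithmetic. The sketch below assumes 40 bytes of IPv4 and TCP headers without options, and the 10 MB volume of BGP update data is an arbitrary figure chosen for illustration.

```python
import math

IP_TCP_HEADERS = 40  # assumption: 20-byte IPv4 header + 20-byte TCP header


def mss_for_mtu(mtu: int) -> int:
    """Largest TCP payload that fits in one packet of the given MTU."""
    return mtu - IP_TCP_HEADERS


def segments_needed(total_bytes: int, mss: int) -> int:
    """Number of TCP segments needed to carry total_bytes of BGP data."""
    return math.ceil(total_bytes / mss)


UPDATE_BYTES = 10_000_000  # hypothetical amount of BGP update data

for label, mss in (("default", 536),
                   ("Ethernet", mss_for_mtu(1500)),
                   ("POS", mss_for_mtu(4470))):
    print(label, mss, segments_needed(UPDATE_BYTES, mss))
```

Going from an MSS of 536 to 1460 cuts the segment count to a bit more than a third, with the corresponding reduction in per-packet overhead.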

Session/Timers
• Timers = keepalive and holdtime
  • Default is ok
  • Smallest is 3/9 for keepalive/holdtime
  • Scaling <> small timers
• Use BFD
  • Built for speed
  • When failure occurs, BFD notifies BFD client (in 10s of msecs)
• Do not use Fast Session Deactivation (FSD)
  – Tracks the route to the BGP peer
  – A temporary loss of IGP route will kill off the iBGP sessions
  – Very dangerous for iBGP peers
    • IGP may not have a route to a peer for a split second
    • FSD would tear down the BGP session
  – It is off by default
    neighbor x.x.x.x fall-over
  – Next Hop Tracking (NHT), enabled by default, does the job fine

[Figure: two routers exchanging BFD control packets; BFD clients on each side: OSPF, IS-IS, EIGRP, BGP.]

Dynamic Neighbors (IOS)
• Remote peers are defined by IP address range
• Less configuration for defining neighbors
• Remote peers initiate the BGP session
• Enterprise networks (DMVPN, ...)
• iBGP and limited eBGP (limited nr of ASNs)

configuration
router bgp 1
 bgp listen range 192.168.0.0/16 peer-group 192-16
 bgp listen range 10.1.1.0/24 peer-group 10-24
 bgp listen limit 1000
 neighbor 10-24 peer-group
 neighbor 10-24 remote-as 1
 neighbor 192-16 peer-group
 neighbor 192-16 remote-as 2 alternate-as 3 4 5 6 7
 neighbor 192-16 ebgp-multihop 2
 neighbor 192-16 update-source Loopback0

[Figure: R1 with iBGP sessions to peers 1 ... n over DMVPN and eBGP sessions to peers 1 ... n.]

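
How an incoming session maps to a listen range can be sketched with the Python ipaddress module, using the ranges and AS numbers from the configuration above. The function name is hypothetical, and choosing the most specific range when several match is an assumption made for this sketch, not documented router behavior.

```python
import ipaddress

# Listen ranges and allowed remote AS numbers from the configuration above.
LISTEN_RANGES = {
    "192.168.0.0/16": "192-16",
    "10.1.1.0/24": "10-24",
}
ALLOWED_AS = {
    "10-24": {1},
    "192-16": {2, 3, 4, 5, 6, 7},  # remote-as 2 plus the alternate-as list
}


def match_dynamic_neighbor(peer_ip, peer_as):
    """Return the peer-group for an incoming session, or None to reject it."""
    address = ipaddress.ip_address(peer_ip)
    best = None
    for prefix, group in LISTEN_RANGES.items():
        network = ipaddress.ip_network(prefix)
        # Assumption: the most specific matching range wins.
        if address in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, group)
    if best is None:
        return None
    group = best[1]
    return group if peer_as in ALLOWED_AS[group] else None


print(match_dynamic_neighbor("10.1.1.7", 1))
print(match_dynamic_neighbor("192.168.5.1", 4))
print(match_dynamic_neighbor("192.168.5.1", 9))
```

A peer outside every listen range, or inside a range but with an AS number that is not configured, is not accepted.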
Multisession
IOS

Multisession
• BGP Multisession = multiple BGP (TCP) sessions between 2 BGP speakers
  • Even if there is only one BGP neighbor statement defined between the BGP speakers in the configuration
• Introduced with Multi Topology Routing (MTR)
  • One session per topology
• Now: possibility to have one session per AF/group of AFs
  • Good for incremental deployment of AFs
    • Avoids a BGP reset
    • But multisession needs to be enabled beforehand
  • Good for troubleshooting
  • Good for issues when BGP session resets
    • For example “malformed update”
• Not so good for scalability
• IOS only and not enabled by default

[Figure: R1 and R2 shown three times: a single session carrying all AFs; multisession with 1 topology per session; multisession with 1 AF per session]

IOS

Multisession (For Your Reference)

Multisession capability

BGP: 10.100.1.2 passive rcvd OPEN w/ optional parameter type 2 (Capability) len 3
BGP: 10.100.1.2 passive OPEN has CAPABILITY code: 131, length 1
BGP: 10.100.1.2 passive OPEN has MULTISESSION capability, without grouping

Multisession for MTR: 1 session per topology

R2#show bgp ipv4 unicast neighbors
BGP neighbor is 10.100.1.1, remote AS 1, internal link
 BGP multisession with 3 sessions (3 established), first up for 00:05:43
 Neighbor sessions:
 3 active, is multisession capable
 Session: 10.100.1.1 session 1
 Topology IPv4 Unicast
 Session: 10.100.1.1 session 2
 Topology IPv4 Unicast voice
 Session: 10.100.1.1 session 3
 Topology IPv4 Unicast video

Multisession without MTR: 1 session per address family

R2#show ip bgp neighbors 10.100.1.1 | include session|address family
 BGP multisession with 3 sessions (3 established), first up for 00:02:29
 Neighbor sessions:
 3 active, is multisession capable
 Session: 10.100.1.1 session 1
 Session: 10.100.1.1 session 2
 Session: 10.100.1.1 session 3
 Route refresh: advertised and received(new) on session 1, 2, 3
 Multisession Capability: advertised and received
 For address family: IPv4 Unicast
 Session: 10.100.1.1 session 1
 session 1 member
 For address family: IPv6 Unicast
 Session: 10.100.1.1 session 2
 session 2 member
 For address family: VPNv4 Unicast
 Session: 10.100.1.1 session 3
 session 3 member

IOS

Multisession
Conclusion

• Increases # of TCP sessions


• Not really needed
• Current default behavior = multisession is off
• Can be turned on by “neighbor x.x.x.x transport multi-session”
• Makes sense to have IPv4 and IPv6 on separate TCP sessions
• IPv6 over IPv4 (or IPv4 over IPv6) can be done, but next hop mediation is needed

MPLS VPN Scaling
RR-groups
• Use one RR (set of RRs) for a subset of prefixes
• By carving up range of RTs

• Only for vpnv4/6


• RR only stores and advertises the specific range of prefixes

• Less storage on RR, but more RRs needed + more peerings

[Figure: PE1 and PE2 each peer for vpnv4/6 with two sets of route reflectors: rr-group 1 (RR1, RR2) and rr-group 2 (RR1, RR2)]

RR-groups Configuration Example
• Dividing of RTs done by simple ext community list 1-99 or ext community list with regular expression 100-500
• PEs still send all vpnv4/6 prefixes to RR, but RR filters them
• Dividing RT = more work
• PEs are not involved, only RRs

RRs in rr-group 1 (RR1, RR2):

address-family vpnv4
 bgp rr-group 100
address-family vpnv6
 bgp rr-group 100

ip extcommunity-list 100 permit RT:1:(1|3|5)....

RRs in rr-group 2 (RR1, RR2):

address-family vpnv4
 bgp rr-group 100
address-family vpnv6
 bgp rr-group 100

ip extcommunity-list 100 permit RT:1:(2|4|6)....

[Figure: PE1 and PE2 peer for vpnv4/6 with the RRs of rr-group 1 and rr-group 2]

Debug output on an RR that drops a prefix outside its RT range:


BGP(4): 10.100.1.1 rcvd UPDATE w/ attr: nexthop 10.100.1.1, origin ?, localpref 100, metric 0, extended community RT:1:10001
BGP(4): 10.100.1.1 rcvd 1:10001:100.1.1.2/32, label 22 -- DENIED due to: extended community not supported;
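
How the two extended community lists split the RT space can be sketched as below. Python's `re` module stands in for the IOS regular expression engine here, which is an assumption (the two engines are not identical), and the function name is hypothetical. The patterns are the ones from the slide.

```python
import re

# Extended community list patterns per RR group, copied from the slide.
RR_GROUP_FILTERS = {
    "rr-group 1": r"RT:1:(1|3|5)....",
    "rr-group 2": r"RT:1:(2|4|6)....",
}


def accepting_groups(route_target):
    """Return the RR groups whose filter permits the given route target string."""
    return [
        group
        for group, pattern in RR_GROUP_FILTERS.items()
        if re.search(pattern, route_target)  # unanchored match, as a stand-in for IOS behavior
    ]


print(accepting_groups("RT:1:10001"))  # RT from the debug output above
print(accepting_groups("RT:1:20001"))
print(accepting_groups("RT:1:70001"))  # matches neither list
```

Under these assumptions RT:1:10001 is kept only by the rr-group 1 RRs, which fits a "DENIED" debug line taken on an rr-group 2 RR. An RT that matches neither list is stored by no RR, so the regular expressions need to cover every RT in use.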

Route Target Constraint (RTC)
[Figure: CE1 and CE3 attached to PE1, CE2 and CE4 attached to PE2, RR1 between PE1 and PE2]

1. BGP capability exchange
   • OPEN message with capability 1/132 (RTFilter) for vpnv4 & vpnv6
2. AF RTFilter exchange
   • MP_REACH_NLRI with AF RTFilter, each NLRI = origin AS + Route Target
   • NLRIs shown: Route Target 1:2, Route Target 1:1, Route Target 0:0
   • PE1 installs Default RT filter for RR
   • RR installs RT filter RT:1:1 & RT:1:2 for PE1 (implicitly denying all else)
3. AF vpnv4/6 prefixes exchange
   • PE sends all its vpnv4/6 prefixes to RR
   • RR sends only red and green (not blue) vpnv4/6 prefixes to PE1
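
The RR-side filtering in step 3 can be sketched as a set intersection. This is an illustration only: the prefix names, the mapping of colors to RTs (in particular RT 1:3 for blue) and the function name are assumptions, since the slide only states that PE1 is interested in RT 1:1 and RT 1:2.

```python
def rr_advertise(vpn_routes, rt_membership):
    """Return the prefixes that carry at least one RT the peer has signalled interest in."""
    return [prefix for prefix, rts in vpn_routes if rts & rt_membership]


# Hypothetical vpn routes held by the RR: (prefix name, set of attached RTs).
routes_on_rr = [
    ("red-prefix", {"1:1"}),
    ("green-prefix", {"1:2"}),
    ("blue-prefix", {"1:3"}),   # RT 1:3 for "blue" is an assumption
]

# RT membership the RR learned from PE1 through the RTFilter address family.
pe1_membership = {"1:1", "1:2"}

print(rr_advertise(routes_on_rr, pe1_membership))
```

Anything whose RTs do not intersect the membership set is never sent, which is where the saving in messages and PE processing comes from.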

Route Target Constraint (RTC)
• Results
• Eliminates the waste of processing power on the PE and the waste of bandwidth
• Number of vpnv4 formatted messages is reduced by 75%
• BGP Convergence time is reduced by 20 - 50%
• The more sparse the VPNs (few common VPNs on PEs), the more performance gain

• Note: PE and RR need the support for RTC


• Incremental deployment is possible (per PE)
• Behavior towards non-RT Constraint peers is not changed

• Note
• RTC clients of RR with different set of importing RTs will be in the same update group on the RR
• In IOS-XR, different filter group under same subgroup

Legacy PE RT Filtering
• Problem: If one PE does not support RTC (legacy PE), then all RRs in one cluster must store and advertise all vpn prefixes to the PE
• Solution: Legacy PE sends special prefixes to mimic RTC behavior, without RTC code

Legacy PE
• Collect import RTs
• Create route-filter VRF (same RD for all these VRFs across all PEs)
• Originate special route-filter route(s) with
  • the import RTs attached
  • one of 4 route-filter communities
  • NO-ADVERTISE community

RR
• The presence of the community triggers the RR to extract the RTs and build RT membership information
• RR only advertises wanted vpn prefixes towards legacy PE

4 route-filter communities
0xFFFF0002 ROUTE_FILTER_TRANSLATED_v4
0xFFFF0003 ROUTE_FILTER_v4
0xFFFF0004 ROUTE_FILTER_TRANSLATED_v6
0xFFFF0005 ROUTE_FILTER_v6
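
The route-filter communities are listed in hexadecimal, while a route-map may express them in decimal. The small sketch below converts between the notations; the helper names are hypothetical and the four values are the ones listed above.

```python
# The four route-filter communities from the slide.
ROUTE_FILTER_COMMUNITIES = {
    0xFFFF0002: "ROUTE_FILTER_TRANSLATED_v4",
    0xFFFF0003: "ROUTE_FILTER_v4",
    0xFFFF0004: "ROUTE_FILTER_TRANSLATED_v6",
    0xFFFF0005: "ROUTE_FILTER_v6",
}


def as_decimal(community):
    """Render a 32-bit community as a plain decimal number."""
    return str(community)


def as_new_format(community):
    """Render a 32-bit community as high16:low16."""
    return f"{community >> 16}:{community & 0xFFFF}"


for value, name in ROUTE_FILTER_COMMUNITIES.items():
    print(f"{name}: 0x{value:08X} = {as_decimal(value)} = {as_new_format(value)}")
```

The decimal form of 0xFFFF0002 is 4294901762, the value that appears as "Community: 4294901762" in the legacy PE update.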

Legacy PE RT Filtering
[Figure: PE1 (CE1 with import RT 1:1, CE3 with import RT 1:2) and PE2, the legacy PE (CE2 with import RT 1:1, CE4 with import RT 1:3), both peer with RR1. PE1 runs RTC for vpnv4 and for vpnv6; PE2 has no RTC and uses a route-filter VRF with export RT 1:1 1:3]

1. Legacy PE sends route-filter VRF route(s) with unique RD, route-filter community and importing RTs

   vpnv4/6 update with prefix(es):
   9999:9999:9.9.9.9/32                              RD:prefix
   Community: 4294901762                             One of 4 route-filter communities
   Extended Community: RT:0.1.0.0:1 RT:0.1.0.0:3     All import RTs of the legacy PE
   NO-ADVERTISE NO-EXPORT                            no-export no-advertise community

   RR builds RT membership information from this update

2. AF vpnv4/6 prefixes exchange
   • PE1 sends all its vpnv4/6 prefixes to RR
   • RR sends only red (not green) vpnv4/6 prefixes to PE2

Legacy PE RT Filtering - Configuration (For Your Reference)

RR config

router bgp 1
 address-family vpnv4
  neighbor 10.100.1.2 route-reflector-client
  neighbor 10.100.1.2 accept-route-legacy-rt

Legacy PE config

ip vrf route-filter
 rd 9999:9999
 export map SET_RT

router bgp 1
 address-family vpnv4
  neighbor 10.100.1.3 route-map legacy_PE out
 address-family ipv4 vrf route-filter
  network 9.9.9.9 mask 255.255.255.255

ip route vrf route-filter 9.9.9.9 255.255.255.255 Null0

ip prefix-list match_RT_1 seq 5 permit 9.9.9.9/32

route-map SET_RT permit 10
 match ip address prefix-list match_RT_1
 ! 4294901762 equals 0xFFFF0002
 set community 4294901762
 set extcommunity rt 0.1.0.0:1 0.1.0.0:3 additive

route-map legacy_PE permit 10
 match ip address prefix-list match_RT_1
 set community no-export no-advertise additive

[Figure: RR (new code) and PE2 (old code); PE2 has CE2 with import RT 1:1 and CE4 with import RT 1:3; route-filter VRF with export map setting RT 1:1 1:3]
Full Internet in a VRF?
• Why? Because design dictates it
• Unique RD, so that RR can advertise 2 paths?

PRO
• Remove Internet routing table from P routers
• Security: move Internet into VPN, out of global
• Added flexibility
• More flexible DDOS mitigation

CON
• Increased memory and bandwidth consumption

• Platform must support enough MPLS labels


– Label allocation is per-prefix by default
– Perhaps per-ce or per-vrf label allocation is wanted here
– Now also per-CE and per-VRF label allocation for 6PE (in IOS-XR)

Full Internet in a VRF?
Considerations

• Two Internet gateways for redundancy
• RRs are present: unique RDs needed
  • Then double # vpn prefixes
• ADD-PATH increases paths too

[Figure: Internet peerings on PE1 (RD 1:1) and PE2 (RD 1:2); both peer with the RR, which serves PE3 and PE4]
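
Why unique RDs double the number of vpn prefixes can be sketched by keying the RR table on RD:prefix. The prefix 203.0.113.0/24 is a hypothetical example and the function name is made up for this illustration; RD 1:1 and RD 1:2 are taken from the figure.

```python
def rr_table(advertisements):
    """Group received paths by vpn NLRI (RD, prefix), as an RR keys its vpn table."""
    table = {}
    for rd, prefix, next_hop in advertisements:
        table.setdefault((rd, prefix), []).append(next_hop)
    return table


# Both gateways use the same RD: one NLRI with two paths, of which only the best is reflected.
same_rd = [("1:1", "203.0.113.0/24", "PE1"), ("1:1", "203.0.113.0/24", "PE2")]

# Unique RD per gateway (as in the figure): two distinct NLRIs, both reflected.
unique_rd = [("1:1", "203.0.113.0/24", "PE1"), ("1:2", "203.0.113.0/24", "PE2")]

print(len(rr_table(same_rd)), len(rr_table(unique_rd)))
```

With a full Internet table the same doubling applies to every prefix, which is the memory cost of getting both exit paths through the RR.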

Label Allocation Mode: Per-CE Label
• One unique label per prefix is always the default
• Per-CE: one MPLS label per next-hop (so per connected CE router): 2 CEs = 2 labels
• No IP lookup needed after label lookup

• Caveats
  • No granular load balancing because the bottom label is the same for all prefixes from one CE, if platform load balances on bottom label
  • eBGP load balancing & BGP PIC are not supported (they make use of label diversity), unless resilient per-ce label is used
  • Only single hop eBGP supported, no multihop

• Number of prefixes (n) is much larger than number of CE routers (x) per VPN
• Number of MPLS labels used is very low

[Figure: CE1 ... CEx (prefixes Z1-n) attached to PE1; PE1 advertises NH: PE1, P: Z1, label L1 up to NH: PE1, P: Zx, label Lx towards PE3, which has CE2 and CE3 attached]

Per-CE Label: Caveats - PIC

Note: Resilient per-ce is enabled by configuring regular per-ce commands (label allocation mode or in RPL)

Before failure
Best paths:
  Z1: CE1
  Z2: eibgp multipath to CE1 and PE2
Backup paths (PIC):
  Z1 via PE3

After failure
Best paths:
  Z1: PE3
  Z2: PE2

Advertised with per-ce label allocation mode:
  NH: PE1, P: Z1, label L1
  NH: PE1, P: Z2, label L1

Advertised with per-prefix label allocation mode:
  NH: PE2, P: Z1, label L2
  NH: PE2, P: Z2, label L3

[Figure: PE1, PE2, PE3 and PE4 with CE1, CE2 and CE4; prefixes Z1 and Z2; arrows show the Z1, Z2 data flow before failure, the Z1 data flow after failure and the Z2 data flow after failure]

Solution: resilient per-ce: “hack” by doing IP lookup after label lookup
Per-prefix customized resilience
Label Allocation Mode: Per-VRF Label
• Per-VRF : one MPLS label per VRF (all CE routers in the VRF)
- Con: IP lookup needed after label lookup

- Con: No granular load balancing because the bottom label is the same for all prefixes, if platform load balances on
bottom label

- Potential forwarding loop during local traffic diversion to support PIC

- No support for EIBGP multipath

Number of MPLS labels used per VRF is 1 !

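[Figure: CE1 ... CEx (prefixes Z1-n) attached to PE1; PE1 advertises NH: PE1, P: Z1, label L1 up to NH: PE1, P: Zx, label L1 (same label) towards PE3, which has CE2 and CE3 attached]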

IOS-XR can do selective label mode (prefix | CE | VRF) with RPL
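
The label consumption of the three allocation modes can be compared with a few lines of arithmetic. This is a sketch of the counting rule stated on the slides (per-prefix, per-CE, per-VRF) for a single VRF; the function name and the prefix counts per CE are hypothetical.

```python
def labels_used(mode, prefixes_per_ce):
    """Return the number of MPLS labels one VRF consumes under a given allocation mode.

    prefixes_per_ce maps a CE name to the number of prefixes learned from that CE.
    """
    if mode == "per-prefix":
        return sum(prefixes_per_ce.values())   # one label per prefix (the default)
    if mode == "per-ce":
        return len(prefixes_per_ce)            # one label per next-hop / connected CE
    if mode == "per-vrf":
        return 1                               # one label for the whole VRF
    raise ValueError(f"unknown label allocation mode: {mode}")


# Hypothetical VRF with two CEs.
vrf = {"CE1": 500, "CE2": 300}
for mode in ("per-prefix", "per-ce", "per-vrf"):
    print(mode, labels_used(mode, vrf))
```

The saving in labels is paid for with the caveats listed on the slides: less label diversity for load balancing and PIC, and an extra IP lookup in per-VRF mode.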


Per-VRF Label: Caveats – Transient loop with PIC
PE1 (per-VRF label allocation mode, local preference 200), P: Z
  Path 1: NH: CE1, external, best
  Path 2: NH: PE2, localpref 100, internal, backup/repair
  Advertises: NH: PE1, localpref: 200, P: Z, Label L1 (per_vrf_PE1)

PE2 (per-VRF label allocation mode), P: Z
  Path 1: NH: CE1, external, best backup/repair, advertise-best-external
  Path 1: NH: PE1, localpref: 200, internal, best
  Advertises: NH: PE2, localpref: 100, P: Z, Label L2 (per_vrf_PE2)

router bgp 1
 address-family vpnv4
  bgp additional-paths install
  bgp additional-paths select best-external
  neighbor x.x.x.x advertise best-external

[Figure: CE1 (P: Z) dual-homed to PE1 and PE2; PE3 with CE3; packets labelled L1 | IP and L2 | IP are shown between PE1 and PE2]

IOS-XR

Selective VRF Download (SVD)


• Download to a line card only those prefixes and labels from a VRF that are actively required to forward traffic through that line card
• In IOS-XR 4.2.0 and enabled by default

Linecard Role      Which routes are present?
Core facing        Routes for all VRFs, but only the local routes
Customer facing    Routes only for VRFs which the LC is interested in (local and remote routes)
Standard           All routes are present

[Figure: PE with customer facing line cards towards the CEs (local routes) and a core facing line card towards the MPLS network (remote routes)]

OS Enhancements
Multi-Instance BGP
• A new IOS-XR BGP architecture to support multiple BGP instances
• Each BGP instance is a separate process running on the same or a different RP/DRP node
  • Different prefix tables
  • Multiple ASNs are possible
• Solves the 32-bit OS virtual memory limit
• Different BGP routers: isolate services/AFs on common infrastructure
• Achieve higher prefix scale (especially on a RR) by having different instances carrying different BGP tables
• Achieve higher session scale by distributing the overall peering sessions between instances

[Figure: PE1 and RR1 running multi-instance BGP, with a BGP vpnv4 instance (20.0.0.1, 20.0.0.2) and a BGP IPv4 instance (10.0.0.1, 10.0.0.2); RR2 running single-instance BGP (30.0.0.1, 30.0.0.2); different peerings per instance]

ASR9K: Scaling Enhancement
• BGP RIB Scale enhancement in 5.1.1
  • Only for RSP440-SE
  • Reload is needed
• Get more virtual address space for BGP process
  • From 2 GB to 2.5 GB

RP/0/RSP1/CPU0:router(admin-config)#hw-module profile scale ?
  default  Default scale profile
  l3       L3 scale profile
  l3xl     L3 XL scale profile

Profile              Layer 3 (Prefixes)      Layer 2 (MAC Table)
default              Small (512k)            Large (512k)
l3                   Large (1,000k)          Small (128k)
l3xl                 Extra large (1,300k)    Minimal
l3xl (5.1.1 RSP3)    Extra large (2,500k)    Minimal

OS Scaling Enhancements for BGP (For Your Reference)

Enhancements over OS releases:

BGP Generic Scale Enhancements (IOS)
• Parceling of BGP processes
• Created new BGP task process: “BGP Task”
• Result = optimized update generation / faster convergence

BGP Keepalive Enhancements (IOS)
• Priority queues for reading/writing Keepalive/Update messages
• Results = avoid neighbor flaps / ability to support small keepalive values in a scaled setup

BGP PE-CE Scale Enhancements (IOS)
• Modified internal data structures and optimized internal algorithms for VRF based update generation
• Result = faster convergence / greater VRF and PE-CE session scaling

BGP PE Scale Enhancements (IOS)
• Modified internal data structures for VRFs
• Result = considerable memory savings / greater prefix scalability

BGP PE Enhancements (IOS-XR)
• Optimised BGP processing of label on PE router
• Result = reduced CPU usage

BGP PE Enhancements (IOS-XR)
• Modified BGP import processing on PE router
• Result = reduced CPU usage

BGP RIB Scale Enhancement (IOS-XR)
• Only for ASR9K
• Result = more prefixes

Key Takeaways
Takeaway: When is the Boat Not Big Enough? (For Your Reference)

Measure
• What to look at: convergence, prefix instability, traffic drops, table versions, timestamps
• IOS
  show bgp all summary
• IOS-XR
  show bgp convergence
  show bgp table
  show bgp process performance-statistics detail
• NX-OS
  show bgp convergence detail

Memory
• IOS
  show bgp all summary
  show processes memory sorted
• IOS-XR
  show bgp table
  show process memory <job-id> location <>
  show watchdog memory-state
  show memory compare start | end | report
  show bgp scale
• NX-OS
  show bgp internal mem-stats detail
    - look for “Grand total”, “Private memory”, “Shared memory”
  show system resource

CPU
• IOS
  show processes cpu history
  show processes cpu | include BGP
• IOS-XR
  show processes cpu
  show processes bgp
  show processes cpu | include bgp
• NX-OS
  show processes cpu history
  show processes cpu | include bgp
  show process cpu detailed <bgp pid>
Key Takeaways

• Design
• Topology
• Features
• Address families
• Full mesh iBGP / RRs

• Memory and CPU

Thank you
