
White Paper

Cisco ACI Remote Leaf Architecture

© 2018 Cisco and/or its affiliates. All rights reserved. This document is Cisco Public Information.
Contents
Introduction
Business values of Remote leaf
Architecture overview
Hardware and Software support
IP Network (IPN) requirements for Remote leaf
Recommended QOS configuration for Remote leaf
1G or 10G connectivity from Remote leaf switches to upstream Router
Discovery of Remote leaf
Remote leaf software upgrade
ACI connectivity and policy extension to Remote leaf
Endpoint connectivity option
Remote leaf Control Plane and data plane
More than one pair of RL at Remote Location
Inter-VRF traffic between Remote leaf switches
VMM Domain Integration and vMotion with Remote leaf
External Connectivity from Remote leaf
Failure handling in Remote leaf deployment
ACI Multipod and Remote leaf Integration
Summary
For more information

Introduction
With the increasing adoption of Cisco Application Centric Infrastructure® (Cisco ACI™) as a pervasive fabric
technology, enterprises and service providers commonly need to interconnect separate Cisco ACI fabrics.
Business requirements (business continuance, disaster avoidance, etc.) lead to the deployment of separate data
center fabrics, and these need to be interconnected with each other.

Note: To best understand the design presented in this document, readers should have at least a basic
understanding of Cisco ACI and how it works and is designed for operation in a single site or pod. For more
information, see the Cisco ACI white papers available at the following link:
https://www.cisco.com/c/en/us/solutions/data-center-virtualization/application-centric-infrastructure/white-paper-
listing.html.

The following figure shows the various Cisco ACI fabric and policy domain extension options that have been offered
from the launch of Cisco ACI up to today. Cisco ACI Remote leaf is the latest addition, extending the ACI policy
domain to satellite data centers with consistent policy and centralized management.

Figure 1. ACI fabric and policy domain evolution

● The first option, available from Cisco ACI Release 1.0, consists of a classic leaf-and-spine two-tier fabric
(a single Pod) in which all the deployed leaf nodes are fully meshed with all the deployed spine nodes. A
single instance of Cisco ACI control-plane protocols runs between all the network devices within the pod.
The entire Pod is under the management of a single Cisco Application Policy Infrastructure Controller
(APIC) cluster, which also represents the single point of policy definition.
● The next step in the evolution of Cisco ACI geographically stretches a pod across separate physical data
center locations, usually deployed in the same metropolitan area. Given the common limited availability of
fiber connections between those locations, the stretched fabric uses a partial mesh topology, in which some
leaf nodes (called transit leaf nodes) are used to connect to both the local and remote spine nodes, and the
rest of the leaf nodes connect only to the local spine nodes. Despite the use of partial mesh connectivity,
functionally the stretched fabric still represents a single-pod deployment, in which a single instance of all the
Cisco ACI control-plane protocols runs across all the interconnected data center sites, creating a
single failure domain.
For more information about the Cisco ACI stretched-fabric deployment option, refer to the following link:
https://www.cisco.com/c/en/us/td/docs/switches/datacenter/aci/apic/sw/kb/b_kb-aci-stretched-fabric.html.

● To address the concerns about extending a single network fault domain across the entire stretched-fabric
topology, Cisco ACI Release 2.0 introduced the Cisco ACI Multipod architecture. This model calls for the
deployment of separate Cisco ACI Pods, each running separate instances of control-plane protocols and
interconnected through an external IP routed network (or interpod network [IPN]). The Cisco ACI Multipod
design offers full resiliency at the network level across pods, even if the deployment remains functionally a
single fabric, with all the nodes deployed across the Pods under the control of the same APIC cluster. The
main advantage of the Cisco ACI Multipod design is hence operational simplicity, with separate Pods
managed as if they were logically a single entity.
For more information about the Cisco ACI Multipod design, refer to the following link:
https://www.cisco.com/c/en/us/solutions/collateral/data-center-virtualization/application-centric-infrastructure/white-
paper-c11-737855.html.

A maximum latency of 50 msec RTT is supported between Pods.

● The need for complete isolation across separate Cisco ACI networks led to the Cisco ACI Multisite
architecture, introduced in Cisco ACI Release 3.0. Cisco ACI Multisite offers connectivity between completely
separate ACI fabrics (sites) that are managed by a Multisite Orchestrator (MSO). Each ACI
fabric has an independent APIC cluster and control plane to provide complete fault isolation. BGP EVPN is
used to exchange control-plane information and VXLAN is used for data-plane communication between ACI
sites and to extend the policy domain by carrying the policy information in the VXLAN header. Cisco ACI
Multisite Orchestrator provides centralized policy definition (intent) and management by:
◦ Monitoring the health-state of the different ACI Sites.
◦ Provisioning of day-0 configuration to establish inter-site EVPN control plane.
◦ Defining and provisioning policies across sites (scope of changes).
◦ Inter-site troubleshooting.
For more information about the Cisco ACI Multisite design, refer to the following link:
https://www.cisco.com/c/en/us/solutions/collateral/data-center-virtualization/application-centric-infrastructure/white-
paper-c11-739609.html.

● The need to extend connectivity and consistent policies to remote locations where it is not possible or
desirable to deploy a full ACI Pod (with leaf and spine nodes) led to the development of the Remote leaf
solution. With the Cisco Remote leaf solution, the APIC controller deployed in the ACI main datacenter (DC)
manages the Remote leaf switches connected over a generic IP Network (IPN). Remote location DCs may
not be large in size; hence the Remote leaf solution allows the deployment of only leaf switches, while the
spines remain in the ACI main DC. The APIC controller manages and operates the Remote leaf nodes as if
they were local leaf switches, pushing all the centrally defined ACI policies to them. This architecture is the
main focus of this document and will be discussed in great detail in the following sections.

Business values of Remote leaf


The Remote leaf solution provides multiple business values, since it allows remote DCs to be managed without
investing in APIC controllers and spine switches at remote locations. The following are key business values of the
Remote leaf solution:

● Extension of the ACI policy model outside the main datacenter to remote sites distributed over IP backbone.
● Extension of the ACI fabric to a small DC without investing in a full-blown ACI fabric.

● Centralized policy management and control plane for remote locations.
● Small form factor solution at locations with space constraints.

Figure 2. Remote leaf use cases

There are multiple use cases for Remote leaf in the DC; the following are some examples:

● Satellite/Co-Lo Data Center


Due to business demands, customers often have main DCs and many small satellite DCs distributed across
multiple locations. There is a demand for centralized management of all these DCs to simplify operations.
Customers also need to ensure that satellite DCs have the same networking, security, monitoring, telemetry,
and troubleshooting policies as the main DC. ACI Remote leaf provides the solution for these demands.
Customers can use Remote leaf switches at a co-location facility to manage and apply policies in the same way
as in on-premises DCs.
● Extension and Migration of DCs
Remote leaf deployment simplifies the provisioning of L2 and L3 multi-tenant connectivity between locations
with consistent policy. During the migration of DCs, there is a need to build a Data Center Interconnect (DCI)
over which workloads can be migrated between DCs. Remote leaf allows the extension of the L2 domain from
the ACI main DC to remote DCs. Customers can migrate workloads with vMotion using the L2 extension built
through the Remote leaf solution.
● Telco 5G distributed DCs
Telecom operators typically build DCs at central and edge locations, but now, due to increasing demands and
the need to provide a better experience to subscribers, some services are moving to the aggregation layer.
DCs are becoming smaller but far greater in number, and there is a demand for centralized management and
consistent policy for these DCs. The Cisco Remote leaf solution allows the centralized management of these
DCs by providing full day-0 and day-1 automation, consistent policy, and end-to-end troubleshooting
across any location.

Architecture overview
Figure 3. Remote leaf architecture overview

In the Remote leaf solution, the APIC controllers and spines remain at the main DC, while leaf switches at the
remote location (Remote leaf) logically connect to the spines in the main DC over an IP network. Discovery,
configuration, and policy push to the Remote leaf are done by the APIC cluster at the main DC. The Remote leaf
connects to the spines of one of the Pods in the main DC over VXLAN tunnels.

Just like a local leaf, a Remote leaf can be used to connect virtual servers, physical servers, and containers. Traffic
to the endpoints connected to the Remote leaf is forwarded locally through the Remote leaf switches. Remote
location firewalls, routers, load balancers, and other service devices can be connected to Remote leaf switches in
the same way as to local leaf switches. Customers can use ACI service graphs to perform service chaining for the
remote location DC with the ACI Remote leaf solution. At the time of writing of this document, only unmanaged
service graph mode is supported when connecting service nodes to Remote leaf switches. A local L2Out or L3Out
can also be used to connect networking nodes at remote locations using the Remote leaf solution.

The ACI main DC could be a Multipod fabric; in this case, the Remote leaf nodes get logically associated with one
specific Pod. In the following diagram, the Remote leaf pair is associated with ACI Pod1.

Figure 4. Logical connectivity of Remote leaf to the spines of a Pod in a Multipod fabric

The Remote leaf nodes need to have IP reachability to the APIC controllers to allow zero-touch provisioning of
the switches (this is discussed in more detail in the “Discovery of Remote leaf” section). The Remote leaf nodes
also need connectivity to the VTEP pool of their associated Pod (Pod1 in the following example) to make sure
remote endpoints can leverage the L3Out connection in the ACI main DC when the local L3Out fails (details are
explained in the “External Connectivity from Remote leaf” section).

Hardware and Software support


The Remote leaf solution is supported starting from the ACI 3.1(1) release and requires switches based on Cloud
Scale ASICs. The following table lists the hardware that supports Remote leaf as of the writing of this document.
Please check the latest release notes for hardware support in a specific release.

Spine
● Fixed spine: N9364C
● Modular spine with the following line cards: N9732C-EX, N9736C-FX

Leaf
● N93180YC-EX
● N93180YC-FX
● N93108TC-EX
● N93108TC-FX
● N93180LC-EX
● N9348GC-FXP
● N9336C-FX2

Customers may already be using first-generation spines that do not support the Remote leaf solution. In this case,
first-generation spines and next-generation spines can be part of the same ACI fabric. However, only
next-generation spines, as shown below, should connect to the IPN to support Remote leaf. The following diagram
shows how to connect mixed-generation spines within the fabric.

Figure 5. Different generation of spines in ACI main DC

IP Network (IPN) requirements for Remote leaf


Remote leaf switches connect to the ACI main DC over an IP Network (IPN). Below is the list of requirements for the IPN:

1. VXLAN is used as an overlay between the ACI main DC and the Remote leaf. To accommodate the overhead of
the VXLAN encapsulation, it is important to increase the supported MTU in the IPN network by at least 100 bytes to
allow data-plane communication between endpoints in the main DC and at the remote location (this is valid
assuming the endpoints source traffic with the default 1500B MTU size). Additionally, the MTU for control-plane
traffic between the spines in the main DC and the Remote leaf nodes should also be tuned (by default it tries
to use up to a 9000B MTU). The following snapshot from the APIC controller highlights how to tune the MTU of
control-plane traffic to 1500 bytes.

2. Up to 300 msec latency between ACI main DC and Remote Location.


3. Minimum 100 Mbps bandwidth in IP Network.
4. The Remote leaf is logically associated with one of the Pods of the ACI main DC. As previously mentioned, the
Remote leaf nodes should have reachability to the VTEP pool of their logically associated Pod. This could be
achieved via the backbone if the TEP pools are enterprise-routable, or via a dedicated VRF or a tunneling mechanism.

5. Reachability to the APIC cluster's infra IP addresses. APIC nodes may have received an IP address from a TEP
pool different from the one used in the Pod the RL nodes are associated with. The following is a snapshot from the
APIC that shows the infra IP addresses allocated to the APICs.

The following diagram provides an example of the reachability requirements for Remote leaf.

Figure 6. APIC and TEP Pool reachability between RL and ACI main DC

The following configuration is needed on the upstream router connected to the Remote leaf:

1. OSPF with a VLAN-4 sub-interface on the upstream router connected to the Remote leaf.


2. DHCP relay to APIC controllers.
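Figure 7 below illustrates this configuration. As a reference, a minimal, hypothetical NX-OS-style sketch of the two requirements follows; the interface numbers, IP addresses, OSPF process tag, and APIC infra addresses are assumptions, and other router platforms use equivalent commands.

feature ospf
feature dhcp
ip dhcp relay

router ospf IPN
  router-id 172.16.255.1

interface Ethernet1/1
  mtu 9216                           ! leave headroom for the VXLAN overhead (IPN requirement 1)
  no shutdown

interface Ethernet1/1.4              ! sub-interface toward the Remote leaf, VLAN 4
  encapsulation dot1q 4
  ip address 192.168.11.1/30
  ip ospf network point-to-point
  ip router ospf IPN area 0.0.0.0
  ip dhcp relay address 10.0.0.1     ! infra IP of APIC 1 (repeat for each APIC in the cluster)
  ip dhcp relay address 10.0.0.2
  ip dhcp relay address 10.0.0.3
  no shutdown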

Figure 7. Configuration for upstream router connected to Remote leaf

The following configuration is needed on the upstream router connected to the spine:

1. OSPF with a VLAN-4 sub-interface on the upstream router connected to the spines.
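Per this requirement, only the OSPF peering on a VLAN-4 sub-interface is needed on the spine-facing router; DHCP relay is not listed for this side. The following hypothetical sketch mirrors the previous one, again with assumed interface numbers and addressing.

interface Ethernet1/2.4              ! sub-interface toward the spine, VLAN 4
  encapsulation dot1q 4
  ip address 192.168.12.1/30
  ip ospf network point-to-point
  ip router ospf IPN area 0.0.0.0
  no shutdown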


The following diagram provides an example configuration for upstream routers connected to the spines.

Figure 8. Configuration for upstream router connected to spine

Recommended QOS configuration for Remote leaf


There may be a requirement to preserve the COS values coming from remote endpoints connected to EPGs and
carry the same value to the ACI main DC Pod. To achieve this goal, it is possible to enable the preservation of the
COS value (“Dot1p Preserve” flag) as part of the “External Access Policies” on APIC, as shown in the snapshot
below. With this configuration, the COS value of the original frame received from the endpoint connected to the
Remote leaf node is translated to a corresponding DSCP value that is then copied to the outer IP header of the VXLAN
encapsulated frame. This value can then be propagated across the IPN and converted back to the original COS
value before sending the traffic to the endpoints connected in the main ACI DC (and vice versa).

Another common requirement could be to map the ACI fabric classes (QoS levels) to DSCP values within the IPN. To
achieve this, the ACI fabric should be enabled with the “DSCP class-cos translation policy for L3 traffic”.
The following snapshot from the APIC controller shows how to map the ACI QoS levels and default classes to DSCP
values in the IPN.

The following are key points to remember while configuring QoS in the ACI fabric for Remote leaf:

1. The ACI fabric expects the DSCP values of packets exiting the Remote leaf and entering the spine to be the same;
hence the DSCP value of the packets can't be modified inside the IPN.
2. Dot1p preserve and DSCP translation policies can't be enabled together.

3. The following table explains the COS values that are used within the ACI fabric and the recommended action for
those COS values in the IPN.

Traffic type      COS value    Recommendation
Control plane     COS 5        Prioritize in IPN
Policy plane      COS 3        Prioritize in IPN
Traceroute        COS 6        Prioritize in IPN
SPAN              COS 4        De-prioritize in IPN
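As an illustration of these recommendations, the following hypothetical NX-OS-style sketch classifies ACI traffic at the IPN edge without rewriting DSCP (in line with point 1 above). It assumes the translation policy maps the control-plane, policy-plane, and traceroute classes to DSCP CS5, CS3, and CS6 and SPAN to CS4; the class-map names, DSCP values, and queuing treatment are assumptions and must match the actual translation policy and the capabilities of the IPN platform.

class-map type qos match-any ACI-CRITICAL
  match dscp 24,40,48                ! CS3 (policy), CS5 (control), CS6 (traceroute)
class-map type qos match-any ACI-SPAN
  match dscp 32                      ! CS4 (SPAN)

policy-map type qos ACI-IPN-CLASSIFY
  class ACI-CRITICAL
    set qos-group 3                  ! mapped to a prioritized queue by the queuing policy
  class ACI-SPAN
    set qos-group 1                  ! mapped to a de-prioritized (scavenger) queue

interface Ethernet1/1
  service-policy type qos input ACI-IPN-CLASSIFY

Note that the policy only classifies traffic into internal queuing groups and never rewrites DSCP, so the DSCP values expected by the ACI fabric are left untouched.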

1G or 10G connectivity from Remote leaf switches to upstream Router


Upstream routers at remote locations may not have 40G/100G connectivity, so it might be required to connect the
Remote leaf switches to the upstream routers at 1G or 10G. Remote leaf switches with a QSA adapter can be
connected at 1G or 10G. Please check the latest optics compatibility matrix for QSA support.

Figure 9. Options for connecting 1G or 10G links from Remote leaf to upstream router

Discovery of Remote leaf


The Remote leaf gets discovered and configured automatically as soon as it is powered up at the remote location.
The APIC controller at the ACI main DC Pod and the IPN need to be pre-configured for this. For the complete
discovery of a Remote leaf, the following two processes take place:

1. IP address allocation to the uplink interfaces, and configuration push to the Remote leaf.


2. TEP IP address allocation to the Remote leaf.

Figure 10. IP address assignment for Remote leaf uplink, and configuration push to Remote leaf

1. When the Remote leaf is powered on, it sends a broadcast DHCP discover message out of its uplink interfaces.
2. When the DHCP discover message is received by the upstream router, it relays the message to the APIC
controllers. The upstream router's interfaces toward the Remote leaf are configured to relay DHCP discover
messages to the APIC controllers.
3. The APIC controllers send a DHCP offer for the uplink interface of the Remote leaf, along with the location of
the bootstrap configuration file, in the DHCP offer message.

4. The Remote leaf picks one of the APIC controllers and sends a DHCP request for its uplink interface to that
APIC controller.
5. The APIC controller sends a DHCP ACK to complete the IP address allocation for the uplink IP address.
This process is repeated for all the uplink addresses. To get the bootstrap configuration file from the APIC
controller, the Remote leaf automatically configures a static route with the next-hop of the upstream router.

After the configuration file is received, this static route is removed and the Remote leaf is configured per the configuration file.

The next step in Remote leaf discovery is the assignment of a TEP address to the Remote leaf switches. The following
diagram and steps explain the assignment of the TEP address to the Remote leaf.

Figure 11. TEP address assignment to Remote leaf

1. The Remote leaf gets full IP connectivity to the APIC controllers once its uplinks get IP addresses; it then sends a
DHCP discover message to the APICs to receive the TEP address.
2. The APIC controllers send a DHCP offer for the TEP IP address.
3. The Remote leaf picks one of the APIC controllers and sends a DHCP request for the TEP IP.
4. The APIC controller sends a DHCP ACK to complete the DHCP process for the TEP IP address.

Once this process is over, the Remote leaf becomes active in the fabric and is shown in the topology diagram on the
APIC controller.

Remote leaf switches can be connected to an ACI Multipod fabric. The following snapshot from the APIC controller
shows an ACI fabric with two Pods; the Remote leaf switches are part of Pod2.

Remote leaf software upgrade
Remote leaf switches can be upgraded in the same way as local leaf switches, using the APIC controllers. Customers
can divide the switches into odd and even groups and upgrade one group at a time.

The following is a snapshot from the APIC controller that divides the Remote leaf switches, along with other switches,
into odd and even groups for upgrades.

ACI connectivity and policy extension to Remote leaf


One of the main use cases of Remote leaf is to provide centralized management and consistent ACI policy to
remote location DCs. Customers can define ACI networking policies (such as tenants, VRFs, BDs, EPGs, and L3Outs),
security policies (such as contracts and micro-segmentation), telemetry, monitoring, and troubleshooting
policies for the main DC on the APIC controller and extend the same policies to the remote DC using Remote leaf.

Remote leaf and local leaf switches have the same ACI features and functionality. A single EPG can have endpoints
located on a local leaf as well as on a Remote leaf. This functionality seamlessly offers L2 extension between the
ACI main DC and remote locations without much complexity or extra protocol overhead.

The following snapshot from the APIC controller shows that EPG1 is stretched to the Remote leaf. It has endpoints
connected to the Remote leaf (Node-110 and Node-111) and the local leaf (Node-201).

Endpoint connectivity option
Endpoints can be connected to Remote leaf switches in different ways; however, the Remote leaf switches should
always be configured as part of a vPC domain. The following options are possible (also shown in the diagram below):

1. Endpoints connected to the Remote leaf switches using a vPC port-channel (usually leveraging LACP).
2. Endpoints single-homed to one of the Remote leaf switches as an 'orphan port'.
3. Endpoints dual-homed to the Remote leaf switches leveraging any form of active/standby NIC
redundancy functionality (MAC pinning, NIC teaming, etc.).
Notice that options 2 and 3 are only supported starting from the ACI 3.2 release.

Figure 12. Options to connect end points to Remote leaf

Remote leaf Control Plane and data plane


In a Cisco ACI fabric, information about all the endpoints connected to leaf nodes is stored in the COOP database
available in the spine nodes. Every time an endpoint is discovered as locally connected to a given leaf node, the
leaf node originates a COOP control-plane message to communicate the endpoint information (IPv4/IPv6 and MAC
addresses) to the spine nodes. COOP is also used by the spines to synchronize this information between them.

In a Cisco ACI Remote leaf deployment as well, host information for discovered endpoints must be exchanged with
the spine nodes. The Remote leaf builds a COOP session with the spines and updates them with the information of
its locally attached hosts.

There are specific TEP IP addresses that are defined on Spine and Remote leaf to exchange control plane and
data plane information.

RL-DP-TEP (Remote leaf Data-Plane Tunnel End Point) - This is a unique IP address automatically
assigned to each Remote leaf switch from the TEP pool that is allocated to the remote location. VXLAN packets
from a Remote leaf node are originated using this TEP as the source IP address when the Remote leaf nodes are not
part of a vPC domain.

RL-vPC-TEP (Remote leaf vPC Tunnel End Point) - This is an anycast IP address automatically assigned to the
vPC pair of Remote leaf nodes from the TEP pool that is allocated to the Remote Location. All the VXLAN packets
sourced from both Remote leaf switches are originated from this TEP address if the Remote leaf switches are part
of a vPC domain.

RL-Ucast-TEP (Remote leaf Unicast Tunnel End Point) – This is an anycast IP address part of the local TEP
pool automatically assigned to all the spines to which the Remote leaf switches are being associated. When
unicast packets are sent from endpoints connected to the RL nodes to the ACI main Pod, VXLAN encapsulated
packets are sent with destination as RL-Ucast-TEP address and source as RL-DP-TEP or RL-vPC-TEP. Any Spine
in the ACI main DC Pod can hence receive the traffic, decapsulate it, perform the required L2 or L3 lookup and
finally re-encapsulate it and forward it to the final destination (a leaf in the local Pod or in a separate Pod in case of
Multipod fabric deployments).

RL-Mcast-TEP (Remote leaf Multicast Tunnel End Point) - This is another anycast IP address part of the local
TEP pool automatically assigned to all the spines to which the Remote leaf switches are being associated. When
BUM (Layer 2 Broadcast, Unknown Unicast or Multicast) traffic is generated by an endpoint connected to the
Remote leaf nodes, packets are VXLAN encapsulated by the RL node and sent with destination as RL-Mcast-TEP
address and source as RL-DP-TEP or RL-vPC-TEP. Any of the spines in the ACI Pod can receive the BUM traffic
and forward it inside the fabric.

The following diagram shows the EP learning process on the spines through the COOP session. It also shows the TEP
addresses present on the Remote leaf and the spines.

Figure 13. TEP addresses on spine and Remote leaf

Let's understand the traffic forwarding under different scenarios:

● Unicast Traffic between endpoints dual-homed to a pair of Remote leaf nodes.


● Unicast traffic between endpoints single-homed to a pair of Remote leaf nodes (Orphan ports).

● Broadcast, Unknown Unicast and Multicast (BUM) traffic generated by an endpoint connected to a pair of
Remote leaf nodes and sent to a Local Leaf (in the ACI Pod) when BD is configured in flood mode.
● Broadcast, Unknown Unicast and Multicast (BUM) traffic generated by an endpoint connected to a pair of
Remote leaf nodes and sent to Local Leaf when BD has the default proxy mode configuration.
● Unicast traffic from an endpoint connected to a Local Leaf to an endpoint connected to the Remote leaf
nodes.
● Unicast traffic from an endpoint connected to the Remote Leaf to an endpoint connected to a Local Leaf.

Unicast traffic between Remote leaf Nodes (Dual-Homed Endpoints)


It's always recommended to configure the Remote leaf switches as part of a vPC domain, even if the endpoints are
single-homed. When the Remote leaf switches are configured as a vPC pair, they establish a vPC control-plane session
through the upstream router, similar to what happens via the vPC peer-link in NX-OS standalone vPC
deployments (ACI does not require the use of a vPC peer-link between leaf nodes). Endpoint information in the
same EPG is synchronized over the vPC control-plane session.

Communication between dual-homed endpoints is usually handled locally on the RL node receiving the traffic from
the source endpoint, a behavior named “Greedy Forwarding” and highlighted in the diagram below.

Figure 14. Unicast RL to RL packet flow when end point is connected to RL with vPC

Unicast traffic between Remote Leaves with Orphan port


Communication between endpoints single-homed to the Remote leaf nodes (orphan ports) is supported from the ACI 3.2
release. EP information is synced between the Remote leaf switches over the vPC control-plane session. In the
following diagram, EP1 and EP2 are part of the same EPG. Since the EPG is deployed on both RL nodes, they
synchronize EP1 and EP2 information over the vPC control plane, and traffic between the single-homed endpoints
is forwarded by establishing a VXLAN tunnel through the upstream router.

Figure 15. Unicast RL to RL packet flow when end point is connected to RL as orphan port

Broadcast, Unknown Unicast and Multicast (BUM) traffic from Remote leaf to Local Leaf when
BD in flood mode

Figure 16. Broadcast, Unknown Unicast and Multicast traffic (BUM) traffic flow from RL to ACI main DC when bridge domain is
in flood mode

The ACI Remote leaf solution uses Head-End Replication (HREP) for Broadcast, Unknown Unicast, and Multicast (BUM)
traffic. In the above example, the endpoint (EP1) connected to the Remote leaf nodes sends an ARP request for the
endpoint connected to the local leaf (EP3); the ARP request is first flooded within the Remote leaf pair.

Based on the incoming packet hash value, one of the Remote leaf nodes acts as the designated forwarder for the BD.
The Remote leaf acting as designated forwarder encapsulates the ARP packet into a VXLAN header and forwards
a single copy to the spines, with the destination TEP address in the VXLAN header set to the anycast address of the
spines used for BUM traffic (RL-Mcast-TEP). The Remote leaf uses the anycast address of the RL vPC pair
(RL-vPC-TEP) as the source TEP address in the VXLAN header.

The spine that receives this packet updates its COOP database with EP1 information and makes an entry for EP1 with
the next-hop as RL-vPC-TEP. Since the COOP process runs between all spines, EP1 reachability information is updated
on all the spines of the Pod.

The spine floods the packet to all the leaves using the BD multicast address. Every configured BD has a unique
multicast address assigned inside the ACI Pod. When the spine floods the ARP packet with the BD multicast IP, it
reaches only the leaves that have endpoints actively connected to this BD.

Broadcast, Unknown Unicast and Multicast (BUM) traffic from Remote leaf to Local Leaf when
BD in proxy mode

Figure 17. Broadcast, Unknown Unicast and Multicast traffic (BUM) traffic flow from RL to ACI main DC when bridge domain is
in proxy mode

In the above example, EP1 connected to the Remote leaf switches sends an ARP request for EP3, which is connected to
a local leaf. EP1 and EP3 are part of the same BD, and the BD is in proxy mode. When the Remote leaf receives the ARP
request, it forwards the packet to the spine switches with the source address in the VXLAN header as RL-vPC-TEP and
the destination address as RL-Ucast-TEP.

When the spine receives this packet, it updates the COOP database with EP1 information and makes an entry for EP1
with the next-hop as RL-vPC-TEP. Since the COOP process runs between all spines, EP1 reachability information is
updated on all the spines in the Pod.

The spine forwards the ARP request to the destination leaf if it has information about EP3 in its database; otherwise,
it sends a glean message to all the leaves (including other Remote leaf nodes associated with the same ACI Pod). The
glean message triggers each leaf to send an ARP request on all the local interfaces in the BD that received the ARP
request. This triggers an ARP reply from EP3, and when the leaf receives the ARP response, it updates its hardware
table and sends a COOP update to the spine with EP3's location.

This ensures that the next ARP request generated by EP1 and received by the spines can be directly
forwarded to the leaf where EP3 is connected, allowing the completion of the ARP exchange between EP1 and
EP3 (as discussed in the following sections).

Unicast traffic from Local Leaf to Remote leaf

Figure 18. Unicast traffic flow from ACI main DC to RL

In the above example, EP3 responds to the ARP request sent from EP1. The following is the sequence of events when
the ARP response packet is forwarded from EP3 to EP1.

1. EP3 sends a unicast ARP response to EP1 with the sender MAC as EP3 and the target MAC as EP1.
2. The local leaf in the main DC receives the ARP response packet and looks up EP1's MAC address in its hardware
table. It finds EP1 with the next-hop of RL-vPC-TEP. It encapsulates the packet in a VXLAN header with the source TEP
address as the local leaf TEP (LL TEP) and the destination TEP as RL-vPC-TEP. As previously explained, the
destination TEP would be RL-DP-TEP if the Remote leaf nodes were not configured as part of a vPC domain.
3. When the spine receives the packet with the destination TEP as RL-vPC-TEP, it updates the source TEP to the
anycast IP address of the spines (RL-Ucast-TEP) and forwards the packet to the anycast IP address of the RL
vPC pair (RL-vPC-TEP).
4. One of the RL nodes receives the packet, de-encapsulates the VXLAN header, and forwards the ARP response to the
locally attached host. The RL learns EP3 with the next-hop of the anycast IP address of the spines (RL-Ucast-TEP).

Unicast traffic from Remote leaf to Local Leaf

Figure 19. Unicast traffic flow from RL to ACI main DC

In the above example, EP1 is sending a unicast packet to EP3. The following sequence of events happens when the
packet is sent from EP1 to EP3.

1. EP1 sends a unicast packet to EP3 with EP1's IP as the source and EP3's IP as the destination (the same applies to
the MAC addresses). EP1 picks one of its links toward the Remote leaf pair to forward the packet, based on hashing.
2. The packet from EP1 is received on one of the Remote leaf nodes. The receiving Remote leaf does a Layer 2 lookup
for EP3's MAC and finds the next-hop for EP3's MAC as the anycast IP address of the spines (RL-Ucast-TEP). The Remote
leaf encapsulates the packet into a VXLAN header with the source TEP as RL-vPC-TEP and the destination TEP as RL-
Ucast-TEP.
3. One of the spines receives this packet. The spine does a lookup for the inner header destination (EP3's MAC),
updates the destination TEP to the local leaf TEP (LL TEP), and forwards the packet to the leaf connected to EP3.
4. The local leaf forwards the packet to the connected endpoint (EP3).

More than one pair of RL at Remote Location


Customers may have more than one pair of Remote leaf switches at a remote location. In such cases, traffic
between the Remote leaf pairs gets forwarded through the spines in the ACI main DC Pod.

Figure 20. Traffic flow between more than one pair of RL

The following diagram and steps explain this behavior.

Figure 21. End point learning between RL pairs

1. In the above example, EP1 connected to Remote leaf pair 1 sends unicast traffic to EP2 connected to
Remote leaf pair 2.
2. When one of the Remote leaf pair 1 nodes receives this packet, assuming it does not have any information
about EP2's MAC in its hardware table, it forwards the packet to the spine. When doing so, the Remote leaf
encapsulates the packet into a VXLAN header with the source TEP as the RL pair 1 TEP and the destination as the
anycast IP address of the spines (RL-Ucast-TEP).
3. This packet can be received by any spine in the main DC. The spine looks up the MAC address of
EP2 and, assuming it knows where it is located, forwards the packet to EP2 at Remote leaf pair 2.
When doing so, and this is the critical point to understand, it changes the source TEP to RL-Ucast-TEP and the
destination to the RL pair 2 TEP.
4. As a consequence, Remote leaf pair 2 associates EP1's MAC with the anycast IP address of the spines
(RL-Ucast-TEP). Hence, Remote leaf pair 2 will always send packets to a spine in the ACI main DC before they reach
Remote leaf pair 1.

Inter-VRF traffic between Remote leaf switches


In ACI, inter-VRF traffic always gets forwarded to the spine proxy function; as a consequence, when the endpoints
are connected to Remote leaf nodes, inter-VRF traffic gets forwarded through the spines in the main DC. This is
the behavior at the time of writing of this paper; however, an optimization will be introduced in ACI release 4.0,
allowing inter-VRF traffic between endpoints connected to Remote leaf switches to be locally forwarded.

Figure 21a. Inter-VRF traffic within RL pair before ACI 4.0

Figure 22. Inter-VRF traffic within RL pair after ACI 4.0

VMM Domain Integration and vMotion with Remote leaf


The ACI fabric allows integration with multiple VMM (Virtual Machine Manager) domains. With this integration, the
APIC controller pushes the ACI policy configuration, such as networking, telemetry, monitoring, and troubleshooting
policies, to switches based on the location of the virtual instances. The APIC controller pushes the ACI policy to
Remote leaf switches in the same way as to local leaf switches. A single VMM domain can be created for compute
resources connected to both the ACI main DC Pod and the Remote leaf switches. VMM-APIC integration is also used to
push a VDS to the hosts managed by the VMM and to dynamically create port-groups as a result of the creation of EPGs
and their association to the VMM domain. This allows mobility ('live' or 'cold') to be enabled for virtual endpoints
across different compute hypervisors.

Note: It is worth noticing that mobility for virtual endpoints could also be supported if a VMM domain is not created
(i.e., the VMs are treated as physical resources).

Virtual instances in the same EPG or L2 domain (VLAN) can be behind a local leaf as well as a Remote leaf. When a
virtual instance moves from a Remote leaf to a local leaf, or vice versa, the APIC controller detects the leaf switch
to which the virtual instance has moved and pushes the associated policies to the new leaf. All VMM and container
domain integrations supported for local leaf switches are supported for Remote leaf as well.

Figure 23. vMotion between RL to ACI main DC

The above example shows the process of vMotion with the ACI fabric. The following events happen during a vMotion
event:

1. The VM has the IP address 100.1.1.100 and a default gateway of 100.1.1.1 in VLAN 100. When the VM comes
up, the ACI fabric configures the VLAN and the default gateway on the leaf switches where the VM is connected.
The APIC controller also pushes the contract and other associated policies based on the location of the VM.
2. When the VM moves from a Remote leaf to a local leaf, ACI detects the new location of the VM through the VMM
integration.
3. Depending on the EPG-specific configuration, the APIC controller may need to push the ACI policy to the
destination leaf for successful VM mobility, or the policy may already exist on the destination leaf.

The following is a snapshot from the APIC controller where a single VMM domain is extended from the local leaf in the
ACI main DC to the Remote leaf.

External Connectivity from Remote leaf


External connectivity for the remote DC can be provided through local L3Out connections on the Remote leaf switches.
The following diagram provides an example of how a local L3Out on a Remote leaf works.

Figure 24. Control plane with Local L3out on RL

In the above example, the Remote leaf has a local L3Out connection to an external router. The external router is
connected to the Remote leaf switches over vPC with an SVI. External prefixes on the Remote leaf switches are learned
with the next-hop of the router learned over the SVI.

The Remote leaf advertises the externally learned prefixes to the spines through the MP-BGP VPNv4 session between
the RLs and the spines. The spines in the ACI main DC act as BGP route reflectors (RR) for both Remote leaf and local
leaf switches. The spines advertise the prefixes received from the Remote leaf nodes to the local leaf switches
through intra-Pod MP-BGP VPNv4 sessions.

The ACI main DC can also have a local L3Out connection to the external Layer 3 domain. Server leaf switches in the
ACI main DC learn the external prefixes with the next-hop of the local border leaf TEP addresses. The ACI main Pod
prefers BL1-TEP and BL2-TEP over RL1-DP-TEP and RL2-DP-TEP due to a better IS-IS metric.

As a consequence, endpoints connected to either Remote leaf nodes or local leaf nodes use the local L3Out for
external connectivity by default. However, the ACI main DC L3Out can act as a backup for the RL, and vice versa.
For this, the BL TEP addresses have to be advertised to the RL.

Figure 25. Dataplane with Local L3out on RL

Both Remote DC and ACI main DC may have Local L3Outs for WAN connectivity and there could be cases where
hosts belonging to the same IP subnet are deployed in both locations. In this case, as explained in the previous
section, the local L3Out connection will normally be preferred for outbound traffic from DC to WAN. However, since
the Border Leaf nodes in the ACI main DC Pod and the Remote leaf switches may be advertising the same IP
subnet prefixes toward the external Layer 3 network domain, incoming traffic may take a sub-optimal path. In such a
scenario, incoming traffic destined to an endpoint connected to the Remote leaf switches might in fact ingress the
L3Out in the ACI main DC, if endpoints in the same IP subnet of the remote endpoint are connected in the Main
DC. The traffic would then be forwarded from the ACI main DC to the endpoint connected to the Remote leaf
switches via the IPN; when the endpoint replies, the outbound traffic will take the path via the local L3Out creating
an asymmetric traffic path behavior that may cause traffic drop if perimeter stateful devices (like firewalls) are
deployed between the leaf switches and the external routers.

Figure 26. Possible traffic asymmetry with same prefix being advertised from RL and ACI main DC

A possible solution to this problem is providing a more granular advertisement of routing information into the WAN,
whereby the BL nodes in the ACI main DC would advertise not only the IP subnet but also the specific endpoints
belonging to that IP subnet and discovered in the main Pod. In the same way, the RL nodes would advertise into
the WAN the IP subnet and the specific host routes for endpoints belonging to that IP subnet and locally
discovered.

This host-route advertisement capability will be available on ACI leaf switches starting from ACI software release
4.0, and it ensures that both ingress and egress paths are symmetric and use the same local L3Out connection (as
shown in the diagram below). With this capability, it is hence possible to deploy independent pairs of perimeter
firewalls in the main DC and at the Remote leaf location.

Figure 27. Solution for traffic asymmetry with same prefix being advertised from RL and ACI main DC

Failure handling in Remote leaf deployment
This section captures the different failure scenarios and explains how the network behaves during those failures.

Spine Failure in Main DC


Remote leaf switches use the anycast IP address of the spines to send unicast and BUM traffic toward the ACI Pod. If a
spine fails, another available spine can accept traffic from the Remote leaf and forward it to the final destination.
Leaf switches in the main DC are attached to multiple spines and use ECMP to pick a spine to forward traffic. When one
spine fails, the local leaf switches can pick another available spine to forward traffic to the Remote leaf.

Figure 28. Traffic forwarding when spine fails in ACI main DC

Remote leaf failure


It's always recommended to use vPC for the Remote leaf switches. When the Remote leaf nodes are part of a vPC
domain, packets from the spines are forwarded to the anycast IP of the Remote leaf vPC pair. If one Remote leaf
fails, traffic is rerouted toward the second Remote leaf, which can accept and forward packets to the attached hosts.

Similarly, when a host is dual-homed to the Remote leaf pair through vPC, the failure of a Remote leaf node would
simply cause the re-shuffling of flows on the link connecting to the second RL switch.

Figure 29. Traffic forwarding when one Remote leaf fails

Upstream router is lost


It's recommended to use multiple upstream routers for redundancy. When one of the upstream routers fails, the Remote
leaf switches can use another available upstream router for traffic forwarding.

Figure 30. Upstream router failure at Remote location

No Reachability between Remote leaf and Spine in ACI main DC


Reachability between the Remote leaf and the spines can fail for multiple reasons, listed below; hence it's
recommended to build redundancy into the Remote leaf to spine connectivity. When any of these failures
happens, the ACI Remote leaf loses connectivity to the spines and the APIC controllers.

1. All upstream routers connected to the Remote leaf fail.


2. The IP network between the Remote leaf and the spines fails.
3. All spines in the ACI main DC fail.

Figure 31. Failure scenarios when RL loses connectivity to ACI main DC

Figure 32. Failure scenario when RL loses all upstream routers, or all uplink connections

When the Remote leaf switches are configured as a vPC domain and the end hosts/L3Out are connected to the Remote leaf
switches with vPC, if all upstream routers fail (failure 1), then all the uplinks of the RL nodes fail and, as a
consequence, the vPC links toward the endpoints are brought down. This behavior avoids traffic blackholing
where an EP attached to the Remote leaf could send packets that the Remote leaf can't forward due to the uplink failure.

Figure 33. Scenario when RL loses IP network or spines in ACI main DC

When the Remote leaf switches are configured as a vPC domain and the end hosts/L3Out are connected to the Remote leaf
switches with vPC, the following behavior is seen during the 2nd (IP network) or 3rd (all spines) failure:

● Traffic to known endpoints/prefixes connected to the Remote leaf continues to work.
● Traffic to destinations outside of the Remote leaf pair is dropped, since the spine is not available.
● No configuration can be added to the Remote leaf switches, since connectivity to the APIC is lost.

Complete failure of Local L3Out at Remote leaf (WAN Isolation Scenario)


Remote leaf switches receive the BGP updates for the prefixes learned through the L3Out in the ACI main DC via the
MP-BGP session with the spines. However, when the same prefix is learned over the local L3Out, endpoints connected
to the Remote leaf switches prefer the local L3Out over the L3Out connection in the ACI main DC.

When the local L3Out connection fails, outbound traffic can instead start using the L3Out connection in the ACI
main DC, as shown in the figure below.

Figure 34. Traffic forwarding when RL loses local l3out

L3Out interface failure when L3Out on Remote leaf is configured over vPC

Figure 35. Control plane learning when l3out is configured over SVI with vPC on RL

Just like local leaf switches, Remote leaf switches can also be configured with an L3Out over SVI with vPC. In this
configuration, each RL uses a local next-hop, reachable via the SVI, to reach the external prefix.

When the local interface toward the external router goes down, the SVI interface does not go down and the next-hop
does not change; the next-hop becomes reachable via the upstream router.

Figure 36. Traffic forwarding when L3out interface fails on RL

Figure 37. Traffic forwarding when L3out interface fails on RL along with connectivity to ACI main DC

When the L3Out on a Remote leaf is configured over SVI with vPC, it does not depend on the MP-BGP neighborship with
the spines to provide reachability information for locally learned external prefixes. This ensures that even if IP
network connectivity is lost, external prefixes remain reachable via the local L3Out.

L3Out interface failure when L3Out on Remote leaf is configured over Routed Interface

Figure 38. Control plane learning when l3out is configured on RL over routed interface/sub-interface

Just like local leaf switches, Remote leaf switches can also be configured with an L3Out over a routed interface or
sub-interface. In this configuration, each RL uses a locally reachable next-hop to reach the external prefix.

When the local interface toward the external router goes down, the Remote leaf can use the peer RL to reach the
external prefix; the Remote leaf receives the external prefix update via the MP-BGP session with the spine.

Figure 39. Traffic forwarding when L3out interface fails on RL

Figure 40. Traffic forwarding when L3out interface fails on RL along with connectivity to ACI main DC

If both the local L3Out interface and the IPN connectivity fail at the same time, the Remote leaf can't receive the
BGP update from the spine for the external prefix. In this case, the RL won't have reachability to the external prefix.

ACI Multipod and Remote leaf Integration


ACI Remote leaf switches support integration with an ACI Multipod fabric. In such a deployment, the Remote leaf
switches logically become part of one of the ACI Pods. In the following example, the ACI main DC has two Pods:
Pod1 has one remote location, while Pod2 has two remote locations logically connected to it.

Figure 41. Remote leaf with ACI Multipod

When Remote leaf switches are used in a Multipod solution, a separate sub-interface must be configured on the spine
switches and a separate VRF in the IPN. This is to make sure unicast and multicast traffic take the same path and the
next-hop of an endpoint does not flap. The following section provides more details about Remote leaf with ACI Multipod.

Reason to keep separate sub-interface on Spine and separate VRF on IPN


To understand the reason for this requirement, we need to analyze what the behavior would be when using the
normal configuration on the spines required for building the Multipod fabric (i.e., leveraging a single L3Out in the
infra tenant with only sub-interfaces tagging traffic on VLAN 4 connecting the spines with the IPN devices). In
this scenario, we are going to consider how the following traffic flows would be handled.

In the example below, the Remote leaf is associated with Pod1. The term "Multipod local leaf" is used for a leaf of
the remote Pod (Pod2).

1. Unicast traffic within BD from Remote leaf to Multipod Local leaf.

2. Unicast traffic within BD from Multipod Local leaf to Remote leaf.


3. L2 multicast traffic from Multipod Local leaf to Remote leaf.

Figure 41a. Unicast traffic within BD from Remote leaf to Multipod Local leaf

The figure above highlights the unicast traffic flow between an endpoint EP1 connected to the Remote leaf
switches and an endpoint EP2 that is part of the ACI Multipod fabric but connected in a different Pod than the one to
which the RL nodes are logically associated (in the above example, the Remote leaf switches are associated with Pod1).
The following sequence of events happens to establish a unicast traffic flow between EP1 and EP2. The
assumption here is that EP1 and EP2 are part of the same BD and IP subnet.

1. EP1 sends a unicast packet to EP2 with the source MAC of EP1 and the destination MAC of EP2.
2. The packet is received by one of the Remote leaf nodes, which performs a Layer 2 lookup for EP2's MAC
address in its hardware table and finds that EP2 is reachable via the anycast VTEP address enabled on all the
spines of its Pod (Pod1). The Remote leaf switch encapsulates this packet with a VXLAN header and forwards it
toward the Multipod fabric with the source IP being the anycast IP address of the Remote leaf pair (RL-vPC-TEP) and
the destination IP being the anycast IP address of the spines in Pod1 (RL-Ucast-TEP).
3. One of the spines in Pod1 receives the packet. The spine performs a lookup in the COOP database and finds
that EP2 is part of Pod2. As a consequence, it changes the destination IP address to the anycast IP address
of the spines in Pod2.
4. One of the spines in Pod2 receives this packet, performs the lookup in the COOP database, and forwards it to
the local leaf where EP2 is located. When doing that, it changes the destination IP to the TEP address of the
local leaf (Multipod LL TEP) in Pod2 where EP2 is located.
5. The local leaf in Pod2 receives the packet, updates its hardware table with EP1 information with the next-hop of
RL-vPC-TEP (based on the source TEP of the packet), and forwards the traffic to EP2.

Figure 42. Unicast traffic within BD from Multipod Local leaf to Remote leaf

The figure above instead shows the return unicast traffic flow from EP2 to EP1, which includes the
following sequence of events:

1. EP2 sends a unicast packet to EP1 with the source MAC of EP2 and the destination MAC of EP1.
2. The Multipod leaf receives this packet, does a Layer 2 lookup for EP1 in its hardware table, and finds the
next-hop as the anycast IP address of the Remote leaf pair (RL-vPC-TEP), based on the data-plane learning
previously described. The leaf switch encapsulates the packet into a VXLAN header with the source TEP as its own
TEP (Multipod LL TEP) and the destination as RL-vPC-TEP. Since the Remote leaf and the Multipod fabric share the same
IP network in the same VRF, it can directly send the packet to the Remote leaf switches over the IP network.

3. One of the Remote leaf switches receives the packet and updates the EP2 information in its local hardware table with the next-hop of Multipod LL TEP, learned from the source TEP of the VXLAN-encapsulated packet (a minimal sketch of this data-plane learning follows this list).
4. The Remote leaf then forwards the packet to the attached host EP1.
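
Step 3 above is an example of data-plane endpoint learning: the remote endpoint is (re)learned from the outer source TEP of the VXLAN packet it arrived in. Below is a minimal sketch, assuming a simple dictionary as the endpoint table; the field names are illustrative only.

```python
# Minimal sketch of data-plane endpoint learning on a leaf: the remote endpoint
# is mapped to the outer source TEP of the VXLAN packet it arrived in.
ep_table = {}

def learn_from_vxlan(ep_table, packet):
    # EP MAC -> next-hop TEP, taken from the outer source IP of the packet
    ep_table[packet["inner_src_ep"]] = packet["outer_src"]

learn_from_vxlan(ep_table, {"inner_src_ep": "EP2", "outer_src": "Multipod-LL-TEP"})
print(ep_table)  # {'EP2': 'Multipod-LL-TEP'}
```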

Figure 43. L2 multicast traffic from Multipod Local leaf to Remote leaf

Let's now see what happens when a multicast traffic flow is sent from EP2 to EP1 without configuring a separate sub-interface and a separate VRF in the IPN.

1. EP2 sends multicast traffic with the source MAC of EP2 and a Layer 2 multicast MAC as the destination.
2. The Multipod leaf receives this packet, checks the Bridge Domain (BD) of EP2 and forwards the packet to the multicast IP address of that BD.
3. The local spines in Pod2 receive the packet and forward it to the spines of Pod1 with the BD multicast group as the destination. Since the Remote leaf switches are logically associated with Pod1, the Pod2 spines cannot perform head-end replication (HREP) of the L2 multicast traffic directly toward the Remote leaf.
4. A spine in Pod1 forwards the packet to the Remote leaf after encapsulating the multicast packet in a unicast VXLAN packet with the source as the anycast IP address of the Spines (RL-Ucast-TEP) and the destination as the anycast IP address of the Remote leaf switches (RL-vPC-TEP).
5. The Remote leaf now learns EP2 with the next-hop of RL-Ucast-TEP, whereas it had already learned EP2 with the Multipod LL TEP next-hop from the unicast traffic flow.
As a consequence, unicast and multicast traffic take different paths when a separate sub-interface and a separate VRF are not configured in the IPN. This causes the EP information to flap on the Remote leaf, as illustrated in the sketch below.
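
Reusing the learning sketch above, the flap can be illustrated by feeding the endpoint table the alternating learning events; the event list is hypothetical and only mirrors the flows of Figures 42 and 43.

```python
# Without the dedicated sub-interface/VRF, EP2 is learned from two different
# source TEPs depending on whether the frame arrived as unicast or as
# HREP'd multicast, so the entry keeps moving (an "EP flap").
events = [
    ("EP2", "Multipod-LL-TEP"),  # learned from the unicast flow (Figure 42)
    ("EP2", "RL-Ucast-TEP"),     # re-learned from the multicast flow (Figure 43)
    ("EP2", "Multipod-LL-TEP"),  # the next unicast frame moves it back again
]

ep_table = {}
for ep, tep in events:
    if ep in ep_table and ep_table[ep] != tep:
        print(f"EP flap: {ep} moved from {ep_table[ep]} to {tep}")
    ep_table[ep] = tep
```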

The Remote leaf solution with Multipod creates a separate path for the Remote leaf traffic and ensures that unicast and multicast traffic take the same path.

Figure 44. Solution for EP flap with RL and multipod

The following diagram highlights the configuration needed to support the integration of Remote leaf nodes with an ACI Multipod fabric and to ensure that unicast and multicast traffic flows take the same path, avoiding the previously described flapping of EP information. It is important to highlight that this configuration is not required without Multipod.

Figure 45. Configuration details for RL and multipod

● As part of the regular ACI Multipod configuration, an L3Out in the infra tenant is required, leveraging VLAN-4 sub-interfaces on the Spines to peer via OSPF with the IPN routers. For integrating Remote leaf nodes with a Multipod fabric, it is additionally required to create a separate L3Out in the infra tenant leveraging VLAN-5 sub-interfaces (OSPF enabled as well).
● The corresponding VLAN-5 sub-interfaces on the upstream IPN routers are configured in a different routing domain (VRF2) than the one used by the VLAN-4 sub-interfaces (VRF1) for East-West Multipod traffic. Notice that either of those two VRFs could be represented by the global routing table of the IPN device, if desired.
● The IPN must extend connectivity across Pods for this second VRF (VRF2) to which the VLAN-5 sub-interfaces are connected. This can be achieved in multiple ways, such as MPLS VPN, VXLAN or VRF-Lite.
● The Remote leaf nodes remain configured with only VLAN-4 sub-interfaces, used to peer OSPF with the IPN. The VLAN-4 sub-interfaces on the upstream routers are configured in VRF1. As a consequence, reachability information for the RL TEP pool is propagated across the IPN only in the context of VRF1.
● In addition, the APIC controller automatically applies the configuration shown in the diagram below, required to implement the following key functionalities:

Figure 46. Automatic configuration done by APIC for RL and Multipod

1. The Spines part of the Pod to which the RL nodes are logically associated start advertising the RL TEP pool on the VLAN-5 sub-interfaces toward all the other Pods.
2. The Spines are configured with a route-map that rejects the learning of the Remote leaf TEP pools of all the other Pods on the VLAN-4 sub-interfaces; each spine accepts only the TEP pool of its locally associated RL nodes on the VLAN-4 sub-interfaces.
As a result of those two automatic configuration steps, all the spines in the remote Pods have reachability information toward the RL TEP pool (10.1.1.0/24) only via the second set of sub-interfaces (leveraging VLAN tag 5), and therefore only in the context of the second VRF (VRF2) inside the IPN; the sketch below summarizes this filtering logic.
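
The effect of these two steps can be summarized with the following conceptual sketch of the prefix handling on a spine. The pools, the second RL TEP pool 10.2.1.0/24 (standing for another Pod's Remote leaf switches), and the function names are assumptions for illustration; in the fabric this is implemented with route-maps applied to the OSPF peerings, not Python.

```python
# Conceptual sketch of the automatic prefix filtering pushed by the APIC:
# a spine advertises its locally associated RL TEP pool on VLAN-5 and accepts,
# on VLAN-4, only the RL TEP pool of the Remote leaf nodes associated with it.

LOCAL_RL_POOLS = {"10.1.1.0/24"}                 # RL TEP pool associated with this Pod
LEARNED_PREFIXES = ["10.1.1.0/24", "10.2.1.0/24"]  # second pool belongs to another Pod

def accept_on_vlan4(prefix):
    """Route-map on the VLAN-4 peering: accept only the local RL TEP pool."""
    return prefix in LOCAL_RL_POOLS

def advertise_on_vlan5(prefix):
    """Advertise the locally associated RL TEP pool toward the other Pods on VLAN-5."""
    return prefix in LOCAL_RL_POOLS

for prefix in LEARNED_PREFIXES:
    print(f"{prefix}: vlan-4 accept={accept_on_vlan4(prefix)}, "
          f"vlan-5 advertise={advertise_on_vlan5(prefix)}")
```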

Let's take a look at how the unicast traffic flow from the Multipod Local leaf to the Remote leaf changes after applying this configuration.

Figure 47. Unicast traffic from Multipod local leaf to RL

1. EP2 sends a unicast packet to EP1 with the source MAC of EP2 and the destination MAC of EP1.
2. The Multipod leaf receives this packet, performs a Layer 2 lookup for EP1 in its hardware table and finds the next-hop as the anycast IP address of the Remote leaf switches (RL-vPC-TEP). The leaf switch encapsulates the packet into a VXLAN header with the source TEP as its own TEP (Multipod LL TEP) and the destination as RL-vPC-TEP.
3. The packet is received on one of the Spines of Pod2, which has reachability information for RL-vPC-TEP via the VLAN-5 sub-interfaces that are part of VRF2 in the IPN. The spines in Pod2 do not have direct reachability to the Remote leaf in the context of VRF2, hence the packet must first be routed toward the Spines in Pod1: the spine of Pod2 changes the destination TEP to the anycast IP address of the Pod1 spines used for Multipod purposes.
4. One of the Spines of Pod1 receives the packet and changes the source TEP to the anycast IP address of the Spines (RL-Ucast-TEP) and the destination to RL-vPC-TEP.
5. One of the Remote leaf switches receives the packet and updates the EP2 information in its local hardware table with the next-hop of RL-Ucast-TEP, learned from the source TEP of the VXLAN-encapsulated packet. This behavior is now the same as for the L2 multicast traffic previously described.
The traffic path is now the same for both unicast and multicast traffic from the Multipod Local leaf to the Remote leaf, hence there is no flapping of EP information on the Remote leaf, as shown in the sketch below.
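
Revisiting the earlier flap sketch under the new configuration, both flows now deliver the frame with the same outer source TEP, so the endpoint entry stays stable (again purely illustrative, with hypothetical event data).

```python
# With the VLAN-5/VRF2 path in place, both the unicast and the multicast flows
# reach the Remote leaf with RL-Ucast-TEP as the outer source, so no flap occurs.
events = [
    ("EP2", "RL-Ucast-TEP"),  # unicast flow (Figure 47)
    ("EP2", "RL-Ucast-TEP"),  # L2 multicast flow (Figure 43, unchanged)
]

ep_table = {}
for ep, tep in events:
    if ep in ep_table and ep_table[ep] != tep:
        print(f"EP flap: {ep} moved from {ep_table[ep]} to {tep}")
    ep_table[ep] = tep  # the entry remains anchored to RL-Ucast-TEP
```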

Summary
The Cisco ACI Remote leaf solution allows centralized management and consistent policy for remote DCs without the investment in APIC controllers and Spines for these smaller-size DCs. The networking, security, monitoring, telemetry and troubleshooting policies that are defined for the main DC can be used as well for the remote DCs that are connected to the ACI main DC over an IP Network.

For more information


For more information about the ACI architecture, refer to the documentation available at the following links:
https://www.cisco.com/c/en/us/solutions/data-center-virtualization/application-centric-infrastructure/index.html

https://www.cisco.com/c/en/us/solutions/data-center-virtualization/application-centric-infrastructure/white-paper-listing.html

Printed in USA C11-740861-00 07/18

