BRKDCN 3020
Yogesh Ramdoss
Principal Engineer, Customer Experience
@YogiCisco
BRKDCN-3020
Cisco Webex Teams
Questions?
Use Cisco Webex Teams to chat with the speaker after the session
How
1 Find this session in the Cisco Events Mobile App
BRKDCN-3020 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 3
Agenda
• Introduction
• Monitor and Health-Check
• Troubleshooting Tools
• Troubleshooting Traffic Forwarding
• Best Practices and Recommendations
• Summary and Take-Aways
Introduction
Switching Architecture Changes
Consolidation of Functions
FWD – Forwarding
FIRE – Fabric Interface and Replication Engine ASIC
CTS – Cisco TrustSec
SOC – Switch on Chip
[Figure: linecard block diagram. Labels: fabric interface, LC arbitration aggregator, fabric ASIC, LC CPU and LC inband, distributed forwarding card (L2 FWD, L3 FWD), FIRE ASICs, twelve port ASICs (4 x 10G each) with CTS ASICs, SOC 1 through SOC 12, S6400]
NFE – Network Forwarding Engine
[Figure: three steps labeled 1, 2, 3, with blocks NFE, NFE, ASE, SOC]
Nexus 9000 Product Family
Focus For This Session
ASICs                             Platforms
StrataXGS Trident*                94XX, 9636
StrataXGS Tomahawk*               9432C, C950X-FM-S
StrataXGS Trident* + Northstar    9396, 93128, 95XX
StrataXGS Trident* + Donner       9372, 9332, 93120
StrataDNX Jericho*                X9600-R/X-9600-RX
Tahoe-Sugarbowl                   93XX-EX, 97XX-E/EX
Tahoe-Lacrosse                    92XX, C950X-FM-E
Tahoe-Davos                       92160YC
Rocky-Homewood                    F/FX/FXP
Rocky-Bigsky                      9364C, C95XX-FM-E2
Rocky-Heavenly                    FX2
Rocky-Sundown                     FX3
This session is going to discuss the Cisco Cloud-Scale ASICs (the Tahoe and Rocky rows).
[Figure: reference topology. Labels: DCNM, L3, RR, VXLAN / EVPN, L2, vPC, Hypervisor]
• With a wide range of Nexus 9000 platforms available in the marketplace, this session
focuses on the latest models, those built on Cloud-Scale ASICs.
• We will not discuss hardware architecture in detail, but will provide a quick
refresher.
• With a good number of topics to cover, we are not going to discuss Multicast, QoS, or
Buffering.
• Please hold your questions until the end of the section.
• At any point during or after the presentation, you can ask your question in the
Webex Teams room.
Just to let you know… Reference ➔
Nexus 9000
… platform of possibilities
Monitor and Health-Check
Agenda
• Introduction
• Monitor and Health-Check
  • Hardware Diagnostics
  • On-board Failure Logging
  • Device Resource Usage
  • Control-Plane Policing
  • Hardware Rate-Limiters
• Troubleshooting Tools
• Troubleshooting Traffic Forwarding
• Best Practices and Recommendations
• Summary and Take-Aways
Hardware Diagnostics
Configuration and Commands
• Bootup diagnostics run at bootup and detect faulty hardware before it is brought
online by NX-OS. E.g., EOBCPortLoopback
Hardware Diagnostics
Configuration and Commands
Diagnostic tests status and testing intervals:
Run “show diagnostic result <options>” to find the test results.
N93128# show diagnostic content module <mod | all>
Diagnostics test suite attributes:
B/C/* - Bypass bootup level test / Complete bootup level test / NA
P/* - Per port test / NA
M/S/* - Only applicable to active / standby unit / NA
D/N/* - Disruptive test / Non-disruptive test / NA
H/O/* - Always enabled monitoring test / Conditionally enabled test / NA
F/* - Fixed monitoring interval test / NA
X/* - Not a health monitoring test / NA
E/* - Sup to line card test / NA
L/* - Exclusively run this test / NA
T/* - Not an ondemand test / NA
A/I/* - Monitoring is active / Monitoring is inactive / NA
Module 1: 1/10G-T Ethernet Module (Active)
ID Name Attributes Testing Interval(hh:mm:ss)
____ __________________________________ ____________ _________________
1) USB---------------------------> C**N**X**T* -NA-
16) RewriteEngineLoopback---------> CP*N***E**A 00:01:00
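The attribute column can be read position by position against the legend printed above it. The following is a minimal Python sketch of that decoding, assuming the eleven positions appear in the same order as the legend lines; the function name `decode_attributes` is illustrative and not part of NX-OS.

```python
# Legend copied from the "Diagnostics test suite attributes" header above.
# One dictionary per character position; "*" always means NA.
LEGEND = [
    {"B": "Bypass bootup level test", "C": "Complete bootup level test"},
    {"P": "Per port test"},
    {"M": "Only applicable to active unit", "S": "Only applicable to standby unit"},
    {"D": "Disruptive test", "N": "Non-disruptive test"},
    {"H": "Always enabled monitoring test", "O": "Conditionally enabled test"},
    {"F": "Fixed monitoring interval test"},
    {"X": "Not a health monitoring test"},
    {"E": "Sup to line card test"},
    {"L": "Exclusively run this test"},
    {"T": "Not an ondemand test"},
    {"A": "Monitoring is active", "I": "Monitoring is inactive"},
]


def decode_attributes(attrs):
    """Return the meanings of the non-NA flags in an attribute string."""
    if len(attrs) != len(LEGEND):
        raise ValueError("expected %d characters, got %d" % (len(LEGEND), len(attrs)))
    meanings = []
    for position, (flag, options) in enumerate(zip(attrs, LEGEND), start=1):
        if flag == "*":
            continue  # NA at this position
        if flag not in options:
            raise ValueError("unexpected flag %r at position %d" % (flag, position))
        meanings.append(options[flag])
    return meanings


if __name__ == "__main__":
    for name, attrs in [("USB", "C**N**X**T*"), ("RewriteEngineLoopback", "CP*N***E**A")]:
        print(name, "->", ", ".join(decode_attributes(attrs)))
```

Read this way, RewriteEngineLoopback is a complete-bootup, per-port, non-disruptive, sup-to-line-card test with monitoring active.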
On-Board Failure Logging (OBFL)
Why do we need it, and what does it do?
On-Board Failure Logging (OBFL)
Configuration and Status
On-Board Failure Logging (OBFL)
CLI Options
Nearly 20 different options!
N93128# show logging onboard ?
boot-uptime Boot-uptime
card-boot-history Show card boot history
card-first-power-on Show card first power on information
counter-stats Show OBFL counter statistics
credit-loss Show OBFL Credit Loss logs
device-version Device-version
endtime Show OBFL logs till end time mm/dd/yy-HH:MM:SS
environmental-history Environmental-history
error-stats Show OBFL error statistics
exception-log Exception-log
flow-control Show OBFL Flow Control log
internal Show Logging Onboard Internal
interrupt-stats Interrupt-stats
kernel-trace Show OBFL Kernel Trace
module Show OBFL information for Module
obfl-history Obfl-history
obfl-logs Show OBFL Tech Support Log.
stack-trace Stack-trace
starttime Show OBFL logs from start time mm/dd/yy-HH:MM:SS
status Status
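The starttime and endtime options expect timestamps in the mm/dd/yy-HH:MM:SS form shown in the help text. A small Python sketch for producing such strings from a script follows; the helper names are hypothetical, and whether both keywords can be combined on one command line is an assumption to confirm with "?" on the device.

```python
from datetime import datetime


def obfl_timestamp(dt):
    """Format a datetime as mm/dd/yy-HH:MM:SS, as shown in the CLI help."""
    return dt.strftime("%m/%d/%y-%H:%M:%S")


def obfl_window(start, end):
    """Build the time-window portion of the command (assumed combinable)."""
    if end <= start:
        raise ValueError("end time must be later than start time")
    return "starttime %s endtime %s" % (obfl_timestamp(start), obfl_timestamp(end))


if __name__ == "__main__":
    window = obfl_window(datetime(2020, 1, 17, 7, 0, 0), datetime(2020, 1, 17, 8, 0, 0))
    print("show logging onboard " + window)
```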
On-Board Failure Logging (OBFL)
Example – OBFL Exception Log
Device Resource Usage
Checking Usage of Resources
Control-Plane Policing (CoPP)
Things to Check
Control-Plane Policing (CoPP)
Quick Check – Config and Stats
Hardware Rate-Limiters (HWRL)
Things to Check
Troubleshooting Tools
Agenda
• Introduction
• Monitor and Health-Check
• Troubleshooting Tools
  • Ethanalyzer
  • SPAN to CPU
  • Consistency Checkers
  • Virtual TAC Assistant
  • Port ACL / Router ACL
• Troubleshooting Traffic Forwarding
• Best Practices and Recommendations
• Summary and Take-Aways
Ethanalyzer
Process and Configuration
Capture Interface → Filters → Stop Criteria
(1) Identify Capture Interface
• mgmt – captures traffic on mgmt0 interface
• Inband - captures traffic sent to and received from the control-plane/CPU
(2) Configure Filter
• Display-Filter – captures all traffic but displays only the traffic meeting the criteria
• Capture-Filter - captures only the traffic meeting the criteria
(3) Define Stop Criteria
• By default, the capture stops after 10 frames. This can be changed with the
limit-captured-frames option; 0 means no limit, and the capture runs until the user presses Ctrl+C.
• autostop can be used to stop the capture after a specified duration, file size, or
number of files.
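The three steps above (interface, filter, stop criteria) map directly onto the command line. The sketch below assembles such a command in Python; `build_ethanalyzer_cmd` is a hypothetical helper, and it only emits keywords that appear in this deck (interface, capture-filter, display-filter, limit-captured-frames).

```python
def build_ethanalyzer_cmd(interface="inband", capture_filter=None,
                          display_filter=None, limit=10):
    """Assemble an ethanalyzer command string from the three choices."""
    if interface not in ("inband", "mgmt"):
        raise ValueError("interface must be 'inband' or 'mgmt'")
    if limit < 0:
        raise ValueError("limit must be 0 (no limit) or a positive frame count")
    parts = ["ethanalyzer local interface", interface]
    if capture_filter:
        parts.append('capture-filter "%s"' % capture_filter)
    if display_filter:
        parts.append('display-filter "%s"' % display_filter)
    parts.append("limit-captured-frames %d" % limit)
    return " ".join(parts)


if __name__ == "__main__":
    print(build_ethanalyzer_cmd(capture_filter="host 10.5.27.38", limit=1))
    print(build_ethanalyzer_cmd(interface="mgmt", display_filter="snmp", limit=0))
```

Keeping the limit explicit avoids surprises from the 10-frame default when a longer capture is intended.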
Ethanalyzer
Introduction
• Built-in tool to analyze the traffic sent and received by CPU. Helpful to
troubleshoot High CPU or Control-plane issues like HSRP failover or OSPF
adjacency flaps.
• Based on tshark code
• Two filtering approaches for configuring a packet capture
Display-Filter Example                      Capture-Filter Example
"eth.addr==00:00:0c:07:ac:01"               "ether host 00:00:0c:07:ac:01"
"ip.src==10.1.1.1 && ip.dst==10.1.1.2"      "src host 10.1.1.1 and dst host 10.1.1.2"
"snmp"                                      "udp port 161"
"ospf"                                      "ip proto 89"
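The two filter syntaxes express the same match in different grammars. As an illustration, this Python sketch produces both forms for a source/destination host pair, mirroring the second row of the examples above; the function name is hypothetical.

```python
import ipaddress


def host_pair_filters(src, dst):
    """Return equivalent display-filter and capture-filter strings."""
    # Validate the inputs as IPv4 addresses; raises ValueError on bad input.
    ipaddress.IPv4Address(src)
    ipaddress.IPv4Address(dst)
    return {
        "display": "ip.src==%s && ip.dst==%s" % (src, dst),
        "capture": "src host %s and dst host %s" % (src, dst),
    }


if __name__ == "__main__":
    filters = host_pair_filters("10.1.1.1", "10.1.1.2")
    print(filters["display"])
    print(filters["capture"])
```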
Ethanalyzer
Putting It All Together
Real World Example
Slow Download Rate
▪ Server in VLAN 527
▪ Downloads/Uploads over the WAN are slow
▪ Downloads/Uploads on the LAN have no problem
▪ No incrementing errors on any interface and low average interface utilization
[Figure: topology]
  Host: 172.18.37.71, reached over the WAN
  Nexus 9000: Eth4/1, Eth4/2; SVI 527, 10.5.27.2/24, 64a0.e745.89c1
  Internet Gateway: 10.5.27.1, 78da.6e19.4500
  Server: 10.5.27.38, 000a.f31a.1c1c
Real World Example
Slow Download Rate
With Ethanalyzer, can we quickly validate on the Nexus 9000 if traffic is
hardware or software switched?
N9k# ethanalyzer local interface inband capture-filter "host 10.5.27.38 or host 172.18.37.71"
• If traffic is software-switched, it would be seen on the inband.
• Filter for any traffic between the hosts experiencing the slow downloads.
Real World Example
Slow Download Rate
N9k# ethanalyzer local interface inband capture-filter "host 10.5.27.38 or host 172.18.37.71"
Capturing on inband
2020-01-17 07:28:16.406589 10.5.27.38 -> 172.18.37.71 TCP 60 [TCP Keep-Alive] 28123 > http [ACK] Seq=1 Ack=1 Win=8760 Len=0
2020-01-17 07:28:16.406603 10.5.27.2 -> 10.5.27.38 ICMP 70 Redirect (Redirect for host)
2020-01-17 07:28:16.406617 10.5.27.38 -> 172.18.37.71 TCP 60 [TCP Out-Of-Order] 28123 > http [FIN, ACK] Seq=1 Ack=1 Win=8760 Len=0
2020-01-17 07:28:16.407142 10.5.27.38 -> 172.18.37.71 TCP 60 28124 > http [SYN] Seq=0 Win=8760 Len=0 MSS=1460
2020-01-17 07:28:16.407175 10.5.27.38 -> 172.18.37.71 TCP 60 [TCP Out-Of-Order] 28124 > http [SYN] Seq=0 Win=8760 Len=0 MSS=1460
etc...
• All traffic from Server (10.5.27.38) to the Internet (172.18.37.71) is being software switched
• N9K (10.5.27.2) sends ICMP redirects to Server (10.5.27.38)
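Summary lines from an inband capture like the one in this example can be tallied offline to spot flows that reach the CPU and any ICMP redirects. The following Python sketch assumes the one-line-per-packet format shown in the capture (date, time, source, "->", destination, protocol, length, info); the sample lines are copied from the slide, and the function name is illustrative.

```python
import re
from collections import Counter

SAMPLE = """\
2020-01-17 07:28:16.406589 10.5.27.38 -> 172.18.37.71 TCP 60 [TCP Keep-Alive] 28123 > http [ACK] Seq=1 Ack=1 Win=8760 Len=0
2020-01-17 07:28:16.406603 10.5.27.2 -> 10.5.27.38 ICMP 70 Redirect (Redirect for host)
2020-01-17 07:28:16.407142 10.5.27.38 -> 172.18.37.71 TCP 60 28124 > http [SYN] Seq=0 Win=8760 Len=0 MSS=1460
"""

LINE_RE = re.compile(
    r"^\S+ \S+\s+(?P<src>\d+\.\d+\.\d+\.\d+) -> (?P<dst>\d+\.\d+\.\d+\.\d+)"
    r"\s+(?P<proto>\S+)\s+\d+\s+(?P<info>.*)$"
)


def summarize(text):
    """Count packets per (src, dst, proto) and list ICMP redirect senders."""
    flows = Counter()
    redirects = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip banners such as "Capturing on inband"
        flows[(match.group("src"), match.group("dst"), match.group("proto"))] += 1
        if match.group("proto") == "ICMP" and match.group("info").startswith("Redirect"):
            redirects.append((match.group("src"), match.group("dst")))
    return flows, redirects


if __name__ == "__main__":
    flows, redirects = summarize(SAMPLE)
    for flow, count in flows.items():
        print(flow, count)
    print("redirects:", redirects)
```

Data traffic between the two hosts showing up on the inband at all is the signal that it is being punted to the CPU.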
Real World Example
Slow Download Rate
N9k# ethanalyzer local interface inband capture-filter "host 10.5.27.38" limit-captured-frames 1 detail | i Ethernet|Internet
Capturing on inband
1 packet captured
Ethernet II, Src: Cisco_1a:1c:1c (00:0a:f3:1a:1c:1c), Dst: Cisco_45:89:c1 (64:a0:e7:45:89:c1)
Internet Protocol Version 4, Src: 10.5.27.38 (10.5.27.38), Dst: 172.18.37.71 (172.18.37.71)
Real World Example
Slow Download Rate
Server’s Default Gateway: 10.5.27.2
Root cause:
▪ Server has a firewall enabled to block ALL ICMP Redirects to avoid poisoning
Fix Options:
1. Re-configure the Server’s firewall to allow ICMP redirects
2. Add a route for WAN subnets to the Server, with Internet Gateway as next-hop
3. Configure “no ip redirects” under the SVI VLAN527
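The redirect condition in this example can be checked with simple subnet arithmetic: the packet's source and the router's next hop sit in the same subnet as the receiving SVI, so the packet leaves on the interface it arrived on. The Python sketch below illustrates this simplified condition only; it assumes IP redirects are enabled on the SVI and ignores other platform-specific checks.

```python
import ipaddress


def would_trigger_redirect(svi_cidr, source_ip, next_hop_ip):
    """True when source and next hop both sit in the SVI's subnet."""
    subnet = ipaddress.ip_network(svi_cidr, strict=False)
    return (ipaddress.ip_address(source_ip) in subnet
            and ipaddress.ip_address(next_hop_ip) in subnet)


if __name__ == "__main__":
    # SVI 527 is 10.5.27.2/24, the Server is 10.5.27.38,
    # and the Internet Gateway is 10.5.27.1.
    print(would_trigger_redirect("10.5.27.2/24", "10.5.27.38", "10.5.27.1"))
```

Fix option 2 removes the condition at the source: with a host route pointing at 10.5.27.1, the Server no longer sends WAN traffic to the Nexus 9000 at all.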
SPAN to CPU
Introduction and Configuration
SPAN Replicated packet
But how do we differentiate regular control-plane packets from SPAN-to-CPU packets?
SPAN to CPU
Troubleshooting Packet Loss
N9K-A (ingress Eth1/2):
monitor session 1
  source interface eth1/2 rx
  destination interface sup-eth 0
  filter access-group ACL1
  no shut
ip access-list ACL1
  permit icmp 10.214.10.5/32 any
N9K-B (ingress Eth1/1):
monitor session 1
  source interface eth1/1 rx
  destination interface sup-eth 0
  filter access-group ACL1
  no shut
ip access-list ACL1
  permit icmp 10.214.10.5/32 any
[Figure: Host A (10.214.10.5/24), N9K-A (Eth1/2, Eth1/1), Network, N9K-B (Eth1/1), Host B (10.214.50.11/24); ICMP traffic between Host A and Host B]
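Because the same session has to be applied on each switch along the path with only the source interface changing, the configuration lends itself to templating. This is a minimal Python sketch that renders the configuration lines used in this example; the function name is hypothetical, and it emits only the commands shown on the slide.

```python
def span_to_cpu_config(session, source_if, acl_name, acl_entry):
    """Render a SPAN-to-CPU session with an ACL filter as config text."""
    lines = [
        "ip access-list %s" % acl_name,
        "  %s" % acl_entry,
        "monitor session %d" % session,
        "  source interface %s rx" % source_if,
        "  destination interface sup-eth 0",
        "  filter access-group %s" % acl_name,
        "  no shut",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    for switch, interface in [("N9K-A", "eth1/2"), ("N9K-B", "eth1/1")]:
        print("! " + switch)
        print(span_to_cpu_config(1, interface, "ACL1", "permit icmp 10.214.10.5/32 any"))
```

The ACL is emitted first here so that the filter reference in the session always points at an existing list.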
SPAN to CPU
Troubleshooting Packet Loss
Captures only the SPAN to CPU packets, not regular packets!!
SPAN to CPU
Troubleshooting Packet Loss (contd.)
SPAN to CPU
Narrow-scoped Troubleshooting
SPAN to CPU
VXLAN – Topology and Traffic Flow
Spine
10.0.0.100 10.0.0.101
Host A ICMP Host B
SPAN to CPU
VXLAN Decode Example
Available in release 7.0(3)I7(4), 9.2(1) and later releases
N9200# ethanalyzer local interf inband mirror display-filter icmp limit-cap 0 detail
Frame 1 (148 bytes on wire, 148 bytes captured)
<snip>
[Protocols in frame: eth:ip:udp:vxlan:eth:ip:icmp:data] <<< frame structure
Ethernet II, Src: 78:0c:f0:a2:2b:df (78:0c:f0:a2:2b:df), Dst: 70:0f:6a:f2:8c:05 (70:0f:6a:f2:8c:05)
<snip>
Type: IP (0x0800)
Internet Protocol, Src: 10.1.1.1 (10.1.1.1), Dst: 10.1.1.2 (10.1.1.2) <<< VTEPs
Version: 4
Header length: 20 bytes
<snip>
Source: 10.1.1.1 (10.1.1.1)
Destination: 10.1.1.2 (10.1.1.2)
User Datagram Protocol, Src Port: 22790 (22790), Dst Port: 4789 (4789) <<< VXLAN Attributes
Source port: 22790 (22790)
Destination port: 4789 (4789)
<snip>
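The frame structure in this decode (eth:ip:udp:vxlan:eth:ip:icmp) implies a fixed encapsulation overhead, which helps when reasoning about frame sizes and MTU. The Python sketch below works through that arithmetic and parses a VXLAN header per RFC 7348; it assumes an untagged outer Ethernet header and an outer IPv4 header without options (the decode shows a 20-byte header), and the VNI value in the demo is made up.

```python
import struct

OUTER_ETHERNET = 14   # untagged outer Ethernet header (assumption)
OUTER_IPV4 = 20       # matches "Header length: 20 bytes" in the decode
OUTER_UDP = 8
VXLAN_HEADER = 8
VXLAN_UDP_PORT = 4789  # destination port shown in the decode


def inner_frame_length(wire_length):
    """Length of the encapsulated inner Ethernet frame."""
    overhead = OUTER_ETHERNET + OUTER_IPV4 + OUTER_UDP + VXLAN_HEADER
    if wire_length <= overhead:
        raise ValueError("frame too short to carry a VXLAN payload")
    return wire_length - overhead


def parse_vxlan_header(header):
    """Return (vni_valid, vni) from the 8-byte VXLAN header."""
    if len(header) < VXLAN_HEADER:
        raise ValueError("VXLAN header is 8 bytes")
    # Byte 0: flags, bytes 1-3: reserved, bytes 4-6: VNI, byte 7: reserved.
    flags, vni_word = struct.unpack("!B3xI", header[:VXLAN_HEADER])
    return bool(flags & 0x08), vni_word >> 8


if __name__ == "__main__":
    print(inner_frame_length(148))
    print(parse_vxlan_header(bytes([0x08, 0, 0, 0, 0x00, 0x27, 0x10, 0x00])))
```

For the 148-byte frame captured here, the inner frame works out to 98 bytes, which would be consistent with an ICMP echo carrying 56 bytes of data (14 + 20 + 8 + 56).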
SPAN to CPU
VXLAN Example (Contd.)
SPAN to CPU
Things to Know
Consistency Checkers
What do they do?
Consistency Checkers compare the software state against the hardware state for
consistency, and report PASSED or FAILED.
[Figure labels: Protocol States, Configurations]
Consistency Checkers
Example – Unicast Route and vPC
Virtual TAC Assistant
Commands Cascading
What is it?
• It takes output and parameters from one command and passes them on to the next
command as inputs, cascading them through the entire troubleshooting sequence.
How does it help with troubleshooting?
• speeds up troubleshooting
• avoids missing commands
• avoids entering wrong command inputs
• no need to know the procedure or methodology
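The cascading idea can be illustrated with a small Python sketch: each step formats a command from the values collected so far, runs it, and extracts parameters for the next step. This is only a conceptual model, not how the feature is implemented; `fake_run` and its canned output format are invented for the demo, while the two command templates are taken from the L3 IPv4 example in this deck.

```python
import re


def cascade(steps, run, context):
    """Run steps in order, feeding extracted values into later commands.

    steps: list of (command_template, regex_with_named_groups_or_None)
    run:   callable that takes a command string and returns its output
    """
    executed = []
    for template, pattern in steps:
        command = template.format(**context)
        output = run(command)
        executed.append(command)
        if pattern is not None:
            match = re.search(pattern, output)
            if match is None:
                raise LookupError("no parameters found in output of: " + command)
            context.update(match.groupdict())
    return executed


def fake_run(command):
    """Stand-in for a device session; the output format is made up."""
    if command.startswith("show ip route"):
        return "172.16.144.254/32 via 10.1.1.2 adj 0xd0001"
    return "ok"


STEPS = [
    ("show ip route {ip} vrf {vrf}", r"adj (?P<adj>0x[0-9a-f]+)"),
    ("show hardware internal tah l3 adjacency {adj}", None),
]

if __name__ == "__main__":
    for cmd in cascade(STEPS, fake_run, {"ip": "172.16.144.254", "vrf": "tenant-1"}):
        print(cmd)
```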
Virtual TAC Assistant
L2 MAC – Command Options
DC2-VTEP# show troubleshoot ?
L2 Display L2 information
L3 Display L3 information
Virtual TAC Assistant
L2 & L3 – Command Options
DC2-VTEP# show troubleshoot ?
L2 Display L2 information
L3 Display L3 information
L2: validates programming of a MAC address in a given VLAN
Virtual TAC Assistant
Example - L3 IPv4
Step-by-step methodical check
DC2-VTEP# show troubleshoot L3 ipv4 172.16.144.254 vrf tenant-1
CHECKING HARDWARE ASIC TYPE
slot 1 quoted "show hardware internal dev-version"
<snip>
CHECK ROUTE IN PI RIB
show ip route 172.16.144.254 vrf tenant-1
<snip>
CHECK ROUTE IN PD FIB
show forwarding route 172.16.144.254/32 vrf tenant-1
<snip>
CHECK HOST ROUTE IN HARDWARE
show hardw internal tah L3 v4host | grep 172.16.144.254
<snip>
CHECK FOR THE ADJACENCY
show hardware internal tah l3 adjacency 0xd0001
<snip>
CHECK ROUTE IN SOFTWARE PT
sh hardw internal tah l3 trie detail 172.16.144.254/32 table 3
<snip>
CHECK FOR THE ROUTE IN E-TABLE
show hardware internal tah sdk l3 sw-table e-table | grep 172.16.144.254
<snip>
CHECK FOR THE ROUTE IN HASH-TABLE
show hardware internal tah sdk l3 sw-table ipv4 hash-table | grep 172.16.144.254
<snip>
RUNNING CONSISTENCY CHECKER
Consistency checker passed for 172.16.144.254/32
Virtual TAC Assistant
Example – ECMP Hardware Programming Failure Detection
Port ACL / Router ACL
Tool and Requirements
N9K# show run | include ignore-case tcam
Port ACL / Router ACL
Troubleshooting Packet Loss
root@Server~$ ping 172.18.1.100 -c 5000 -W 1 -i 0
<snip>
5000 packets transmitted, 4886 packets received, 0.2% packet loss,
Using a Port-ACL (PACL) to match bridged traffic on an L2 switchport
ip access-list 101
  statistics per-entry
  10 permit icmp 10.0.1.100/32 172.18.1.100/32
  20 permit ip any any
! Apply to server ingress interface
interface port-channel101
  ip port access-group 101 in
N9K-1# show ip access-lists 101
IPV4 ACL 101
  statistics per-entry
  10 permit icmp 10.0.1.100/32 172.18.1.100/32 [match=5000]
  20 permit ip any any [match=323321]
[Figure: Host A (10.0.1.100), Po101, N9K-1, N9K-3, WAN, Host B (172.18.1.100). Callout: 5000 ICMP Requests received by N9K-1]
[Figure: same topology with interface 1/14. Callout: 5000 ICMP Responses received by N9K-3]
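The per-entry match counters turn each ACL into a packet counter that can be compared against the number of packets the sender transmitted. The following Python sketch extracts the counters from output shaped like the example above and reports how many packets a given entry is missing; the function names are illustrative.

```python
import re

SAMPLE = """\
IPV4 ACL 101
  statistics per-entry
  10 permit icmp 10.0.1.100/32 172.18.1.100/32 [match=5000]
  20 permit ip any any [match=323321]
"""

ENTRY_RE = re.compile(r"^\s*(\d+)\s+(?:permit|deny)\s+.*\[match=(\d+)\]\s*$")


def acl_matches(output):
    """Map ACL sequence number to its match counter."""
    counters = {}
    for line in output.splitlines():
        match = ENTRY_RE.match(line)
        if match:
            counters[int(match.group(1))] = int(match.group(2))
    return counters


def missing_packets(sent, counters, sequence):
    """Packets sent minus packets counted by the given ACL entry."""
    return sent - counters.get(sequence, 0)


if __name__ == "__main__":
    counters = acl_matches(SAMPLE)
    print(counters)
    print("missing at this hop:", missing_packets(5000, counters, 10))
```

A result of zero at a hop means every packet arrived there, so the loss has to be further along the path or on the return direction.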
More Tools
Tools and Supported Products
Summary
Tool    Supported in Nexus 9000 (Broadcom)?    Supported in Nexus 9000 (Tahoe/Rocky)?    Impact
1 = “dMirror” feature
2 = Limited capabilities
3 = TCAM carving needed
Troubleshooting
Traffic
Forwarding
“It is a capital mistake to theorize
before one has data. Insensibly
one begins to twist facts to suit
theories, instead of theories to
suit facts.”
Sherlock Holmes (A Scandal in Bohemia)
Troubleshooting Methodology
• Define the problem, understand the impact, and determine the scope of the problem
based on the information gathered. This helps you make progress towards
resolution.
[Figure: example symptoms (Call Drops, Application Slowness, Webpage Won’t Load, Choppy Video) leading to Impact/Scope]
Agenda
• Introduction
• Monitor and Health-Check
• Troubleshooting Tools
• Troubleshooting Traffic Forwarding
  • Nexus 9000 Hardware Forwarding – Refresher
  • Path-of-the-Packet Troubleshooting
  • Control-Plane Traffic
  • Data-Plane Traffic
• Best Practices and Recommendations
• Summary and Take-Aways
Nexus 9000 Traffic Forwarding
SoC and Slice
• SoC has one or more slices, and a slice interconnect if more than one slice
• Slice
[Figure: LS3600FX2 – 36x 100G, with Slice 0 (1.8T) and Slice 1 (1.8T) joined by the Slice Interconnect. Other labels: Ingress Slice 1, Egress Slice 1, Ingress Slice 2]
Slice Forwarding Path
[Figure: slice forwarding path]
  Ingress →: Ingress Packets, Ingress MAC, Packet Parser, Lookup Pipeline (Lookup Key in, Lookup Result out), Ingress Forwarding Controller (Packet Payload), Replication, Slice Interconnect
  ← Egress: Egress Buffering / Queuing / Scheduling, Egress Policy, Egress Packet Rewrites, Egress MAC, Egress Packets
  Other labels: SSX, "(S6400 / LS3600FX2 only)"
Ingress Lookup Pipeline
[Figure: ingress lookup pipeline]
  From Ingress MAC → Packet Parser → Ingress Forwarding Controller → To Egress Slice
  Lookup Pipeline (Lookup Key in, Lookup Result out): Forwarding Lookup (Flex Tiles, TCAM), Ingress Classification (TCAM), Load Balancing / AFD* / DPP*, Flow Table, Flush
  Note on figure: "LSE / LS1800FX / LS3600FX2 only"
Flexible Forwarding Tiles
Forwarding Lookup
IP Unicast Forwarding
VXLAN Forwarding
• VXLAN and other tunnel encapsulation/decapsulation performed in single pass
• Encapsulation
  • L2/L3 lookup drives tunnel destination
  • Rewrite block drives outer header fields (tunnel MACs/IPs/VNID, etc.)
• Decapsulation
  • Packet parser determines whether and what type of tunnel packet
  • Forwarding pipeline determines whether tunnel is terminated locally, drives inner lookups
[Figure: Encapsulation] Receive → L2/L3 Lookups (MAC/LPM/HRT; keys BD,DMAC and VRF,IPDA; “Is packet destined to remote VTEP?”) → DST_INTF/ADJ_PTR → Tunnel ADJ PTRs → Rewrite (Outer MACs/Destination IPs/VNID; “What are the tunnel header values?”)
[Figure: Decapsulation] Receive → Parser (“Is this a tunnel packet?”) → My TEP Table (Outer VRF,IPDA; “Is the tunnel destination a TEP I terminate?”) → Inner L2/L3 Lookups (Inner MAC/VRF/IP; “If Yes, process inner packet headers”) → Rewrite (strip outer header and rewrite inner packet)
Classification TCAM
• Dedicated TCAM for packet classification
• Capacity varies depending on platform
• Leveraged by variety of features:
  • RACL / VACL / PACL
  • L2/L3 QOS
  • SPAN / SPAN ACL
  • NAT
  • COPP
  • Flow table filter (LS1800FX/ LS3600FX2)
• To increase size of a region, some other region must be sized smaller
[Figure: Ingress Slice and Egress Slice TCAM, each drawn as a grid of blocks labeled 256]
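The last point, that growing one region means shrinking another, is a fixed-budget constraint. The Python sketch below models it, assuming the "256" labels in the figure denote 256-entry blocks; the region names and sizes are hypothetical and do not reflect any platform's defaults.

```python
BLOCK = 256  # assumed carving granularity, taken from the figure labels


def resize_region(regions, grow, shrink, blocks=1):
    """Move whole blocks from one region to another; the total stays fixed."""
    delta = blocks * BLOCK
    if grow == shrink:
        raise ValueError("grow and shrink must be different regions")
    if regions[shrink] < delta:
        raise ValueError("region %r has only %d entries" % (shrink, regions[shrink]))
    resized = dict(regions)
    resized[grow] += delta
    resized[shrink] -= delta
    return resized


if __name__ == "__main__":
    # Hypothetical starting layout, in entries per region.
    layout = {"racl": 1536, "pacl": 0, "span": 512, "qos": 256}
    print(resize_region(layout, grow="pacl", shrink="racl", blocks=2))
```

Planning the move on paper first helps avoid carving a region below what an already-configured feature needs.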
Path of the Packet
Control-Plane Traffic - Setup
Path of the Packet
Process-level Debug
Control-Plane Traffic
OSPF BGP PIM
Path of the Packet (Step 2)
Control-Plane Traffic: ASIC Counters (for front-panel ports)
Drop Conditions
--------------- -----------------------------------------------------------------------------
67 : TAHOE Ingress DROP_ACL_DROP <<< Indicates traffic dropped because of ACL!!
Run “clear hardware internal interface-all asic counters mod <mod#>” to clear the conditions
Path of the Packet
Tahoe ASIC Counters
# Description # Description
1 DROP_PARSE_ERR 26 DROP_MC_GIPO_MISS
2 DROP_EOF_ERR 27 DROP_UC_HIT_NO_PATH
3 DROP_OUTER_IDS_G0 28 DROP_UNUSED
4 DROP_OUTER_IDS_G1 29 DROP_AC_SUP_DROP
5 DROP_OUTER_IDS_G2 30 DROP_AC_POL_DROP
6 DROP_OUTER_IDS_G3 31 DROP_AC_STORM_POL_DROP
7 DROP_OUTER_IDS_G4 32 DROP_FAST_CONV_LOOP_PREVENT
8 DROP_OUTER_IDS_G5 33 DROP_PP_BOUNCE_MYTEP_MISS
9 DROP_OUTER_IDS_G6 34 DROP_VLAN_MBR_INPUT
10 DROP_OUTER_IDS_G7 35 DROP_IEOR_PP_RETURN_PC_2_HG2_MISS
11 DROP_OUTER_XLATE_MISS 36 DROP_IEOR_UPLINK_UC_SAME_IF
12 DROP_INFRA_ENCAP_SRC_TEP_MISS 37 DROP_IEOR_SPINE_PROXY_PC_2_HG2_MISS
13 DROP_INFRA_ENCAP_TYPE_MISMATCH 38 DROP_VIF_MISS
14 DROP_UC_TENANT_MYTEP_ROUTE_MISS 39 DROP_UNEXPECTED_VFT
15 DROP_TENANT_MYTEP_BRIDGE_MISS 40 DROP_MISSING_VNTAG
16 DROP_ARP_ND_UCAST_MISS 41 DROP_VLAN_XLATE_MISS
17 DROP_QIQ_EXPECT_2_QTAGS 42 DROP_RBID_FTAG_MISS
18 DROP_MC_DVIF_MISS 43 DROP_IP_MTU_CHECK_FAILURE
19 DROP_SHARD_OVERRIDE_VLAN_XLATE_MISS 44 DROP_UC_RPF_FAILURE
20 DROP_FCF_CHECK_FAILED 45 DROP_MC_RPF_FAILURE
21 DROP_TTL_EXPIRED 46 DROP_L3_BINDING_FAILURE
22 DROP_SECURITY_GROUP_DENY 47 DROP_IP_UNICAST_FIB_MISS
23 DROP_LOOPBACK_OUTER_HEADER_MISMATCH 48 DROP_FIB_SA
24 DROP_OVERLAYL2_OUTER_HEADER_MISMATCH 49 DROP_FIB_DA
25 DROP_MC_IIC 50 DROP_NSH_NOT_ALLOWED
Tahoe ASIC Counters (contd.)
# Description # Description
51 DROP_SRC_VLAN_MBR 74 DROP_INNER_IDS_G2
52 DROP_NSH_SRC_SW_CHK_FAILED 75 DROP_INNER_IDS_G3
53 DROP_L2MP_IIC_FAILED 76 DROP_INNER_IDS_G4
54 DROP_L2MP_ON_CE_BD 77 DROP_INNER_IDS_G5
55 DROP_L2MP_ENCAP_FROM_EDGE 78 DROP_INNER_IDS_G6
56 DROP_L2MP_NOENCAP_FROM_CORE 79 DROP_INNER_IDS_G7
57 DROP_OUTER_TTL_EXPIRED 80 DROP_INFRA_ENCAP_SRC_TEP_DROP
58 DROP_INCORRECT_VNTAG_TYPE 81 DROP_SPLIT_HORIZON_CHECK
59 DROP_L2MP_FTAG_COMP_MISS 82 DROP_MC_FIB_MISS
60 DROP_IPV6_UC_LINK_LOCAL_CROSS_BD 83 DROP_MC_L2_MISS
61 DROP_IPV6_MC_SA_LOCAL_DA_GLOBAL_SVI 84 DROP_UC_DF_CHECK_FAILURE
62 DROP_IPV6_MC_SA_LOCAL_DA_GLOBAL_L3IF 85 DROP_UC_PC_CFG_TABLE_DROP
63 DROP_ROUTING_DISABLED 86 DROP_ILLEGAL_EXPL_NULL
64 DROP_FC_LOOKUP_MISS 87 DROP_MPLS_LOOKUP_MISS
65 DROP_NO_SGT_FROM_CORE 88 DROP_OUTER_CBL_CHECK
66 DROP_IP_SELF_FWD_FAILURE 89 DROP_NULL_SHARD_WITH_E_BIT_SET
67 DROP_ACL_DROP 90 DROP_LB_DROP
68 DROP_SMAC_MISS 91 DROP_NAT_FRAGMENT
69 DROP_SECURE_MAC_MOVE 92 DROP_ILLEGAL_DCE_PKT
70 DROP_NON_SECURE_MAC 93 DROP_DCI_VNID_XLATE_MISS
71 DROP_L2_BINDING_FAILURE 94 DROP_DCI_SCLASS_XLATE_MISS
72 DROP_INNER_IDS_G0 95 DROP_DCI_2ND_UC_TRANSIT
73 DROP_INNER_IDS_G1
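When a drop condition is reported only by number, a lookup table built from the two-column listing above saves scrolling. This Python sketch parses rows in that layout into a dictionary; the sample contains a few rows copied from the tables, and the function name is illustrative.

```python
import re

SAMPLE = """\
3 DROP_OUTER_IDS_G0 28 DROP_UNUSED
21 DROP_TTL_EXPIRED 46 DROP_L3_BINDING_FAILURE
35 DROP_IEOR_PP_RETURN_PC_2_HG2_MISS 60 DROP_IPV6_UC_LINK_LOCAL_CROSS_BD
67 DROP_ACL_DROP 90 DROP_LB_DROP
"""

# A number at a word boundary, whitespace, then a DROP_ name.
ROW_RE = re.compile(r"\b(\d+)\s+(DROP_\w+)")


def parse_counter_table(text):
    """Map drop-condition number to its name from two-column rows."""
    return {int(number): name for number, name in ROW_RE.findall(text)}


if __name__ == "__main__":
    table = parse_counter_table(SAMPLE)
    print(table[67])
```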
Path of the Packet
ASE – Application Spine Engine
Drop Conditions
Path of the Packet (Step 4)
Control-Plane Traffic: Inband Counters
N9508-A# show hardware internal cpu-mac inband counters
eth2 Link encap:Ethernet HWaddr 00:00:00:01:1b:01
BROADCAST MULTICAST MTU:9400 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
eth3 Link encap:Ethernet HWaddr 00:00:00:01:1b:01
UP BROADCAST RUNNING MULTICAST MTU:9400 Metric:1
RX packets:8484226 errors:0 dropped:0 overruns:0 frame:0
TX packets:4523271 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:860671333 (820.8 MiB) TX bytes:493276319 (470.4 MiB)
ps-inb Link encap:Ethernet HWaddr 00:00:00:01:1b:01
UP BROADCAST RUNNING MULTICAST MTU:9400 Metric:1
RX packets:14327 errors:0 dropped:0 overruns:0 frame:0
TX packets:14312 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:38890552 (37.0 MiB) TX bytes:37871460 (36.1 MiB)
[Figure: Sup Engine (NX-OS, IP Stack, Packet Manager, PS-INB, Eth2, Eth3), SC-A, FM1]
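The inband counters follow the familiar Linux ifconfig layout, so the per-interface packet, error, and drop counts are easy to pull out for periodic checks. The Python sketch below does that for output shaped like the example above; the function name is illustrative and the sample is an excerpt from the slide.

```python
import re

SAMPLE = """\
eth3 Link encap:Ethernet HWaddr 00:00:00:01:1b:01
UP BROADCAST RUNNING MULTICAST MTU:9400 Metric:1
RX packets:8484226 errors:0 dropped:0 overruns:0 frame:0
TX packets:4523271 errors:0 dropped:0 overruns:0 carrier:0
ps-inb Link encap:Ethernet HWaddr 00:00:00:01:1b:01
RX packets:14327 errors:0 dropped:0 overruns:0 frame:0
TX packets:14312 errors:0 dropped:0 overruns:0 carrier:0
"""

HEAD_RE = re.compile(r"^(\S+)\s+Link encap:")
COUNT_RE = re.compile(r"\b(RX|TX) packets:(\d+) errors:(\d+) dropped:(\d+)")


def parse_inband_counters(output):
    """Return {interface: {"RX"/"TX": {"packets", "errors", "dropped"}}}."""
    stats = {}
    current = None
    for line in output.splitlines():
        head = HEAD_RE.match(line)
        if head:
            current = head.group(1)
            stats[current] = {}
            continue
        counts = COUNT_RE.search(line)
        if counts and current is not None:
            stats[current][counts.group(1)] = {
                "packets": int(counts.group(2)),
                "errors": int(counts.group(3)),
                "dropped": int(counts.group(4)),
            }
    return stats


if __name__ == "__main__":
    for name, directions in parse_inband_counters(SAMPLE).items():
        print(name, directions)
```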
Path of the Packet (Step 4)
Control-Plane Traffic: Inband Statistics
N9508-A# show hardware internal cpu-mac inband stats
<snip>
eth3 stats:
RMON counters Rx Tx
----------------------+------------+--------------------
total packets 8406058 4481386
<snip>
65-127 bytes packets 8391840 4470748
<snip>
broadcast packets 15 561531
multicast packets 0 0
<snip>
Error counters
--------------------------------+--
CRC errors ..................... 0
Alignment errors ............... 0
Symbol errors .................. 0
Sequence errors ................ 0
RX errors ...................... 0
Missed packets (FIFO overflow) 0
Single collisions .............. 0
Excessive collisions ........... 0
Multiple collisions ............ 0
Late collisions ................ 0
Collisions ..................... 0
Defers ......................... 0
Tx no CRS ..................... 0
Carrier extension errors ....... 0
Rx length errors ............... 0
FC Rx unsupported .............. 0
Rx no buffers .................. 0
Rx undersize ................... 0
Rx fragments ................... 0
Rx oversize .................... 0
Rx jabbers ..................... 0
Rx management packets dropped .. 0
Tx TCP segmentation context .... 0
Tx TCP segmentation context fail 0
Rate statistics
-----------------------------+---------
Rx packet rate (current/peak) 160 / 1254 pps
Tx packet rate (current/peak) 112 / 889 pps
<snip>
Good health-check. Set a baseline!!
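Setting a baseline only pays off if later readings are compared against it. As a sketch of that comparison, the Python below parses the rate statistics lines shown above and flags a direction whose current rate exceeds a multiple of the recorded baseline; the baseline values and the factor of 2 are arbitrary placeholders, not recommendations.

```python
import re

SAMPLE = """\
Rx packet rate (current/peak) 160 / 1254 pps
Tx packet rate (current/peak) 112 / 889 pps
"""

RATE_RE = re.compile(r"(Rx|Tx) packet rate \(current/peak\)\s+(\d+)\s*/\s*(\d+)\s*pps")


def parse_rates(output):
    """Return {"Rx": (current, peak), "Tx": (current, peak)} in pps."""
    return {d: (int(cur), int(peak)) for d, cur, peak in RATE_RE.findall(output)}


def exceeds_baseline(rates, baseline, factor=2.0):
    """List directions whose current rate is above factor x baseline."""
    return [d for d, (current, _peak) in rates.items()
            if current > factor * baseline[d]]


if __name__ == "__main__":
    rates = parse_rates(SAMPLE)
    print(rates)
    # Placeholder baseline in pps; record real values during normal operation.
    print(exceeds_baseline(rates, {"Rx": 150, "Tx": 30}))
```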
Path of the Packet
Data-Plane: L3 Flow
[Figure labels: LSE, CPU, Slice 0, Slice Interconnect, Slice 1]
Path of the Packet (Step 1)
Data-Plane: L3 Flow – Check SW/HW FIB
[Figure: Nexus93180YC-EX, eth1/1, 10.4.10.9, Network, 10.222.222.2]
Check the Forwarding Information Base (FIB) in software and in hardware, and make sure the results match.
Path of the Packet (Step 2)
Data-Plane: L3 Flow – Route Programmed in ASIC
Table #1 is for default VRF. To find table number for other VRFs, use “show hardware internal tah L3 v4host” command
Entry in Tahoe-Sugarbowl Routing Table
module-1# show hardware internal tah L3 10.222.222.2/32 table 1
DLeft location: 0x0
FP location : 255/4/0xc
**EPE label
*Flags:
CC=Copy To CPU, SR=SA Sup Redirect,
DR=DA Sup Redirect, TD=Bypass TTL Dec,
DC=SA Direct Connect,DE=Route Default Entry,
LI=Route Learn Info,HR=Host as Route
HW Loc | Ip Entry | VRF | MPath | NumP | Base/L2ptr |CC|SR|DR|TD|DC|DE|LI|HR|
-----------|--------------|-------|-------|-------|------------|--|--|--|--|--|--|--|--|
4/16 | 10.222.222.2 | 1 | No | 0 | 0x40000002 | | | | | | | | |
… the physical interface where the packet is going to be sent out. “show hardware internal tah interface ethernet 1/1 | inc src_intf_num” should report “1”
Path of the Packet (Step 4)
Data-Plane: L3 Flow – Adjacency Programmed in ASIC
Adjacency Information in Software
DC1-BGW1# show ip adjacency 10.4.10.9
IP Adjacency Table for VRF default
Total number of entries: 1
Address MAC Address Pref Source Interface Flags
10.4.10.9 700f.6a00.cee1 50 arp Ethernet1/1
DC1-BGW1#
Address = next-hop IP address; MAC Address = destination mac-address; Interface = egress interface
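When comparing the software adjacency with what other outputs show, MAC addresses often appear in different notations (700f.6a00.cee1 here, colon-separated in packet decodes). The Python sketch below parses the adjacency row and normalizes MAC formats before comparing; it assumes the whitespace-separated column order shown in the output above, and the helper names are hypothetical.

```python
import re


def normalize_mac(mac):
    """Reduce any common MAC notation to 12 lowercase hex digits."""
    digits = re.sub(r"[^0-9a-fA-F]", "", mac).lower()
    if len(digits) != 12:
        raise ValueError("not a MAC address: %r" % mac)
    return digits


def parse_adjacency(line):
    """Split one adjacency row into its first five columns."""
    address, mac, pref, source, interface = line.split()[:5]
    return {
        "address": address,
        "mac": normalize_mac(mac),
        "pref": int(pref),
        "source": source,
        "interface": interface,
    }


if __name__ == "__main__":
    entry = parse_adjacency("10.4.10.9 700f.6a00.cee1 50 arp Ethernet1/1")
    print(entry)
    print(entry["mac"] == normalize_mac("70:0F:6A:00:CE:E1"))
```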
Path of the Packet (Steps 5 & 6)
Data-Plane: L3 Flow – Adjacency Programmed in ASIC
Adjacency Entry programmed in the Tahoe-Sugarbowl ASIC
module-1#
[Figure: Nexus93180YC-EX with Eth1/2 and Eth1/1. Labels: 10.4.10.1/30, 10.4.10.2/30, 700F.6A5E.32EB, 10.4.10.10/30, 10.4.10.9/30, 700F.6A00.CEE1, Network, 10.222.222.2]
Do you remember Virtual TAC Assistant and its benefits?
Best Practices and Recommendations
Agenda
• Introduction
• Monitor and Health-Check
• Troubleshooting Tools
• Troubleshooting Traffic Forwarding
• Best Practices and Recommendations
• Summary and Take-Aways
Based on true data!!
Customer-Reported Problem Trends
Best Practices and Recommendations
Layer 1 and Transceivers
• Connect the cable/media at both ends, insert the transceivers completely, and
use the following commands to verify speed, duplex, capabilities, supported modes,
and DOM values.
show interface eth x/y transceiver details
show interface eth x/y capabilities
show interface brief - check for the interface tuple display and others
show interface eth x/y status
• Enable auto-negotiation at both ends. Yes, we need it!
• Check for any transparent device or circuit in the middle
• Have you checked transceiver compatibility? Review the Transceiver Compatibility
Matrix at https://tmgmatrix.cisco.com/
• Internal event-history commands can help determine which device
initiated the link-down first.
Best Practices and Recommendations
Redundancy and High-Availability
Best Practices and Recommendations
System Management – Choose the right NX-OS version
General Recommendation for New and Existing Deployments:
Nexus 9000 Recommended Software bulletin at Cisco.com
* If 9.2(x) or 9.3(x) is needed to deploy new hardware or features, use the latest version available on CCO
Summary
&
Take-Aways
Summary
What do you need to do?               How is it going to help you?
monitor health and resource usage     proactively identify bottlenecks and hotspots
get familiar with built-in tools      attain better visibility and localize issues
Take-Aways
References and Useful Links
Thank you