Enterprise Sonic Distribution Lifecycle Management

Uploaded by

Freddy Vergara

Enterprise SONiC Distribution by Dell Technologies
Lifecycle Management

Abstract
This document describes the outlook of current and future management infrastructure in
SONiC, and the benefits and options offered by the new SONiC Management Framework
pioneered by Dell Technologies. Zero Touch Provisioning (ZTP), monitoring and telemetry, and
flow analysis are also discussed.

Dell Technologies Networking Infrastructure Solutions

Part Number: H18379

June 2020
Notes, cautions, and warnings

NOTE: A NOTE indicates important information that helps you make better use of your product.

CAUTION: A CAUTION indicates either potential damage to hardware or loss of data and tells you how to avoid the
problem.

WARNING: A WARNING indicates a potential for property damage, personal injury, or death.

© 2020 Dell Inc. or its subsidiaries. All rights reserved. Dell, EMC, and other trademarks are trademarks of Dell Inc. or its subsidiaries. Other
trademarks may be trademarks of their respective owners.
Contents

1 Introduction to SONiC
    Dell Technologies vision
    SONiC
    Containerized architecture
    Enterprise SONiC Distribution by Dell Technologies
        Enterprise SONiC Distribution by Dell Technologies 3.0
        Enterprise SONiC Distribution by Dell Technologies offerings
    Typographical conventions
    Support and feedback
        Technical support
        Feedback for this document

2 Outlook of the Current Management Infrastructure
    Current management models
    Config_db.json
    Minigraph.xml
    Linux shell
    Python-based SONiC CLI
    Free Range Routing
    Simple Network Management Protocol
    Summary of management models
    Challenges with current management models
    Summary

3 SONiC Management Framework
    Management framework
    Advantages of using the Management Framework CLI
    Access the CLI
    Management framework CLI
        Management Framework CLI use case example
    REST
        REST use case example
    gRPC Network Management Interface
        gNMI use case example
        Use case example for setting MTU on interface Ethernet0
    gRPC Network Operations Interface
    DevOps, automation, and telemetry
        OpenConfig option
        gNMI option
        gNOI option
    Authentication, authorization, and accounting services
    Role-based access control
    Summary

4 Zero Touch Provisioning
    Introduction
    Advantages of using ZTP
    ZTP process
    User-defined ZTP JSON file use case example
    Example of configuring ZTP using CLI
        Enable ZTP
        Verify ZTP
        Disable ZTP
    Summary

5 Monitoring and Telemetry
    Introduction
    Telemetry terminology
    gNMI modes
        Dial-in
        Dial-out
    Supported RPC operations
    Visibility with advanced telemetry
        Inband flow analysis
        Congestion monitoring
        Mirror on drop
    Summary

6 Flow Analysis
    Introduction
    sFlow
        Advantages of sFlow
        sFlow operations
        sFlow use case and example
    Introduction to packet-level network telemetry
    Benefits of packet-level network telemetry
    Everflow
        Everflow components and operation
        Everflow use case and example
    Summary

7 References
    SONiC documentation
    Ansible documentation
    Dell Technologies Networking Infrastructure Solutions documentation

1 Introduction to SONiC
Topics:
• Dell Technologies vision
• SONiC
• Containerized architecture
• Enterprise SONiC Distribution by Dell Technologies
• Typographical conventions
• Support and feedback

Dell Technologies vision


The vision at Dell Technologies is to be the essential technology company for the data era. Dell Technologies ensures modernization for
today's applications and the emerging cloud-native world. Our Networking team is committed to disrupting the fundamental economics of
the market with an open strategy that gives you the freedom of choice for networking operating systems and top-tier merchant silicon.
The Dell Technologies strategy enables business transformations that maximize the benefits of collaborative software and standards-
based hardware, including lowered costs, flexibility, freedom, and security. Dell Technologies provides further customer enablement
through validated deployment guides that demonstrate these benefits while maintaining a high standard of quality, consistency, and
support.

SONiC
Software for Open Networking in the Cloud (SONiC) is an open-source network operating system (NOS) based on Linux. Switches from
various vendors support SONiC, and the open-source community that contributes to its development is a significant advantage. SONiC has
been adopted by cloud providers and by companies that run very large enterprise data centers.
The benefits that SONiC offers are:
• Hardware independence
• Containerized architecture
• Scale with ease
• Open source
• High performance
• Agility with a flexible management framework

NOTE: For more details, see https://azure.github.io/SONiC/.

Containerized architecture
The containerized architecture provides control, flexibility, and choice, as shown in the following figure. The modules in the SONiC
architecture communicate with each other through the Redis database. The modules run in Docker containers, which include:
• DHCP-relay
• Pmon
• Snmp
• Lldp
• Bgp
• Teamd
• Database
• Swss
• Syncd

The Docker containers, some of which are integrated with the Linux host, carry out the major functions of the NOS. For example, the
SONiC shell provides the CLI.

Figure 1. SONiC containerized architecture

Tools and various management platforms integrate seamlessly into the SONiC switches. The Switch Abstraction Interface is an integral
part of the SONiC NOS. The SONiC NOS can run on supported Dell EMC PowerSwitch hardware or switch hardware from supported
vendors. This provides flexibility for the user and prevents vendor lock-in.
SONiC is a competitive choice for large enterprise, private cloud, and cloud environments and is suitable for scalable solutions. Typical
use cases include a leaf-spine (also known as Clos) architecture and a leaf-spine architecture with super spines. Figure 2 shows a typical
leaf-spine or Clos architecture. Figure 3 shows a leaf-spine topology with super spines.

Figure 2. Leaf-spine or Clos architecture

Figure 3. Leaf-spine topology with super spines

Enterprise SONiC Distribution by Dell Technologies
Dell Technologies’ distribution of SONiC is a hardened, validated, and supported version of SONiC. Enterprise SONiC Distribution by Dell
Technologies is built on community SONiC. It includes the existing features of the stock SONiC NOS plus additional features, and it is
validated for customer use cases. It also allows the user to integrate further applications and features to support the ecosystem and
partners.
NOTE: Community SONiC or stock SONiC refers to the open-source SONiC. These two terms are used interchangeably
in this document.

Enterprise SONiC Distribution by Dell Technologies 3.0


Enterprise SONiC Distribution by Dell Technologies 3.0 is pioneered by Dell Technologies. It takes the stock SONiC components from the
SONiC community. It also introduces new features that are based on customer use cases before they appear in the community.
Some of the key features of Enterprise SONiC Distribution by Dell Technologies 3.0 include:
• Zero-Touch Provisioning (ZTP)
• VRRP
• BGP
• MLAG
• Network Address Translation (NAT)
• Management Framework
• ACL
• RADIUS/TACACS+ authentication
• gNMI/REST with OpenConfig YANG models
• Telemetry
• Simple Network Management Protocol (SNMP)
• QoS
• sFlow
• VXLAN
• UDLD
• BFD

Enterprise SONiC Distribution by Dell Technologies offerings
Customers can deploy the Enterprise SONiC Distribution by Dell Technologies 3.0 offering that best suits their requirements.
The offerings include:
• Cloud standard
• Enterprise standard
• Cloud premium
• Enterprise premium

Typographical conventions
The CLI and GUI examples in this document use the following conventions:

Convention                         Use
Monospace text                     CLI examples
Underlined monospace text          CLI examples that wrap the page
Italic monospace text              Variables in CLI examples
Bold text within monospace text    Commands entered at the CLI prompt, or to highlight information in CLI output
Bold text                          GUI fields and information entered in the GUI

Support and feedback

Technical support
For technical support, visit http://www.dell.com/support or call (USA) 1-800-945-3355.

Feedback for this document


We encourage readers to provide feedback on the quality and usefulness of this publication by sending an email to
Dell_Networking_Solutions@Dell.com.

2 Outlook of the Current Management Infrastructure
Topics:
• Current management models
• Config_db.json
• Minigraph.xml
• Linux shell
• Python-based SONiC CLI
• Free Range Routing
• Simple Network Management Protocol
• Summary of management models
• Challenges with current management models
• Summary

Current management models


There are various management models available to manage, configure, and administer the stock SONiC NOS. Numerous use case
examples outline the use of different management models that the stock SONiC NOS offers. The use case for Linux shell CLI includes
configuration of features such as:
• Creation of port channels
• Addition of interface members to the port channels
• Creation of VLANs
• Addition of members to the VLANs
• Trunk or access modes for the VLAN interfaces
The config_db.json file suits use cases in which the user edits the tables for a feature directly. For example, a user can edit the ACL
tables to configure and manage ACLs, and the BGP_NEIGHBOR table holds the session configuration for BGP. The FRRouting (FRR) shell
suits routing protocol configuration: a user can configure BGP directly in the FRR shell instead of editing the config_db.json file.
This chapter discusses the use case examples of different management models.

Config_db.json
The Redis instance hosts several databases, including APPL_DB, CONFIG_DB, STATE_DB, ASIC_DB, and COUNTERS_DB.
Stock SONiC keeps the configuration in ConfigDB. ConfigDB uses a table-object schema, and config_db.json is a serialization of that
database. The following example shows the Device Metadata table, which has a single object named localhost. The object holds attributes
such as hwsku and platform.

"DEVICE_METADATA": {
    "localhost": {
        "hostname": "sonic-leaf1",
        "hwsku": "DellEMC-S5232f-C32",
        "mac": "3c:2c:30:49:20:00",
        "platform": "x86_64-dellemc_s5232f_c3538-r0",
        "type": "LeafRouter"
    }
}
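
The table-object schema can be read with any JSON parser. The following is a minimal Python sketch, not part of SONiC, that wraps the fragment above in outer braces and walks it as table, then object, then attribute. The helper names get_attribute and list_objects are hypothetical.

```python
import json

# Fragment from the DEVICE_METADATA example, wrapped in outer braces so that
# it parses as a complete JSON document.
SAMPLE = """
{
    "DEVICE_METADATA": {
        "localhost": {
            "hostname": "sonic-leaf1",
            "hwsku": "DellEMC-S5232f-C32",
            "mac": "3c:2c:30:49:20:00",
            "platform": "x86_64-dellemc_s5232f_c3538-r0",
            "type": "LeafRouter"
        }
    }
}
"""


def get_attribute(config, table, obj, attribute):
    """Return config[table][obj][attribute], or None if any level is absent."""
    return config.get(table, {}).get(obj, {}).get(attribute)


def list_objects(config):
    """Return sorted (table, object) pairs found in the configuration."""
    return sorted(
        (table, obj) for table, objects in config.items() for obj in objects
    )


config = json.loads(SAMPLE)
print(list_objects(config))
print(get_attribute(config, "DEVICE_METADATA", "localhost", "hwsku"))
```

On a switch, the same helpers could be applied to the parsed contents of /etc/sonic/config_db.json instead of the embedded sample.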

In stock SONiC, ConfigDB is implemented as database 4 of the local Redis instance. The config_db.json file, located in /etc/sonic/,
contains the startup configuration. When the system boots, the configuration is loaded from the config_db.json file into Redis. The
content of the config_db.json file is therefore the startup configuration, and the content of the Redis database is the running
configuration.
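
Because the startup and running configurations share the same table-object layout, they can be compared object by object. The following Python sketch assumes that both configurations are already available as plain dictionaries; how the running configuration is read from Redis is outside the sketch. The function name config_drift and the sample data are hypothetical.

```python
def config_drift(startup, running):
    """Return {(table, object): (startup_value, running_value)} for every
    object whose attributes differ between the two configurations."""
    drift = {}
    for table in sorted(set(startup) | set(running)):
        startup_objects = startup.get(table, {})
        running_objects = running.get(table, {})
        for obj in sorted(set(startup_objects) | set(running_objects)):
            before = startup_objects.get(obj)
            after = running_objects.get(obj)
            if before != after:
                drift[(table, obj)] = (before, after)
    return drift


# Hypothetical data: the running configuration has a changed hostname.
startup = {"DEVICE_METADATA": {"localhost": {"hostname": "sonic-leaf1"}}}
running = {"DEVICE_METADATA": {"localhost": {"hostname": "sonic-leaf2"}}}
print(config_drift(startup, running))
```

An empty result indicates that the two dictionaries describe the same configuration.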
The configuration can be modified in the config_db.json file and then loaded into the NOS. For example, the following configuration
can be added to the config_db.json file to create VLAN 1003 on a switch running the stock SONiC NOS, assign an IP address to the
VLAN, and add interfaces to the VLAN.

"VLAN": {
    "Vlan1003": {
        "members": [
            "Ethernet48",
            "Ethernet0"
        ],
        "vlanid": "1003"
    }
},
"VLAN_INTERFACE": {
    "Vlan1003|172.10.3.253/24": {}
},
"VLAN_MEMBER": {
    "Vlan1003|Ethernet0": {
        "tagging_mode": "tagged"
    },
    "Vlan1003|Ethernet48": {
        "tagging_mode": "untagged"
    }
}
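
The three tables in the fragment above repeat the same VLAN name, so they can be generated from a few inputs. The following Python sketch is an illustration only: the function name vlan_tables is hypothetical, and the enclosing table name VLAN is taken from the stock SONiC schema rather than from a validated configuration.

```python
import json


def vlan_tables(vlan_id, ip_prefix, members):
    """Build the VLAN, VLAN_INTERFACE, and VLAN_MEMBER entries for one VLAN.

    members maps an interface name to its tagging mode, which must be
    "tagged" or "untagged".
    """
    for port, mode in members.items():
        if mode not in ("tagged", "untagged"):
            raise ValueError(f"{port}: unknown tagging mode {mode!r}")
    name = f"Vlan{vlan_id}"
    return {
        "VLAN": {name: {"members": list(members), "vlanid": str(vlan_id)}},
        "VLAN_INTERFACE": {f"{name}|{ip_prefix}": {}},
        "VLAN_MEMBER": {
            f"{name}|{port}": {"tagging_mode": mode}
            for port, mode in members.items()
        },
    }


fragment = vlan_tables(
    1003,
    "172.10.3.253/24",
    {"Ethernet48": "untagged", "Ethernet0": "tagged"},
)
print(json.dumps(fragment, indent=4))
```

The printed fragment would still need to be merged into the existing config_db.json file and loaded as described in this section.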

The added configuration in the config_db.json file can be viewed using the Linux shell. See the following figure.
NOTE: For the configurations added to the config_db.json file to take effect in the redisDB, use the config load or
config reload command to force load the JSON file into the database. Alternatively, users can reboot the device.

Figure 4. Validate configured VLAN

Minigraph.xml
A legacy configuration method uses an XML file named minigraph.xml. The user can edit the minigraph.xml file located in /etc/sonic/,
or copy the file from a remote server. The configuration in minigraph.xml is loaded using the config load_minigraph command.
Alternatively, the user can reboot the device to make the configuration effective. For example, the device information can be defined
in the minigraph.xml file.
The hostname and HwSku information are defined in the root element <DeviceMiniGraph>.

<DeviceMiniGraph>
  ...
  <Hostname>sonic-leaf1</Hostname>
  <HwSku>DellEMC-S5232f-C32</HwSku>
</DeviceMiniGraph>

NOTE: The configuration using minigraph.xml is deprecated.
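
Where a legacy minigraph.xml file still needs to be read, for example during a migration, the device information can be extracted with a standard XML parser. The following Python sketch uses a simplified sample that contains only the two elements shown above. Real minigraph files contain many more elements and may declare XML namespaces, which this sketch does not handle. The function name device_info is hypothetical.

```python
import xml.etree.ElementTree as ET

# Simplified sample limited to the elements shown in the example above.
SAMPLE = """<DeviceMiniGraph>
  <Hostname>sonic-leaf1</Hostname>
  <HwSku>DellEMC-S5232f-C32</HwSku>
</DeviceMiniGraph>"""


def device_info(xml_text):
    """Return the Hostname and HwSku values found directly under the root."""
    root = ET.fromstring(xml_text)
    wanted = ("Hostname", "HwSku")
    return {
        child.tag: (child.text or "").strip()
        for child in root
        if child.tag in wanted
    }


print(device_info(SAMPLE))
```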

Linux shell
The user is greeted with the Linux shell when accessing the SONiC NOS. To list the available shell built-in commands, type help.

admin@Leaf1:~$ help
GNU bash, version 4.4.12(1)-release (x86_64-pc-linux-gnu)
These shell commands are defined internally. Type `help' to see this list.
Type `help name' to find out more about the function `name'.
Use `info bash' to find out more about the shell in general.
Use `man -k' or `info' to find out more about commands not in this list.
A star (*) next to a name means that the command is disabled.
 job_spec [&]                            history [-c] [-d offset] [n] or hist>
 (( expression ))                        if COMMANDS; then COMMANDS; [ elif C>
 . filename [arguments]                  jobs [-lnprs] [jobspec ...] or jobs >
 :                                       kill [-s sigspec | -n signum | -sigs>
 [ arg... ]                              let arg [arg ...]
 [[ expression ]]                        local [option] name[=value] ...
 alias [-p] [name[=value] ... ]          logout [n]
 bg [job_spec ...]                       mapfile [-d delim] [-n count] [-O or>
 bind [-lpsvPSVX] [-m keymap] [-f file>  popd [-n] [+N | -N]
 break [n]                               printf [-v var] format [arguments]
 builtin [shell-builtin [arg ...]]       pushd [-n] [+N | -N | dir]
 caller [expr]                           pwd [-LP]
***** OUTPUT TRUNCATED *****

NOTE: The minigraph.xml example in the Minigraph.xml section is from the Device information section of
https://github.com/Azure/SONiC/wiki/Configuration-with-Minigraph-(~Sep-2017).
For example, the following command lists the contents of the /etc/sonic/ directory:

admin@Leaf1:~$ ls /etc/sonic/
agent_config.cfg      frr                      snmp.yml
asic_config_checksum  generated_services.conf  sonic_branding.yml
config_db.json        hamd                     sonic_version.yml
constants.yml         init_cfg.json            updategraph.conf

The following example displays the contents of the config_db.json file:

admin@Leaf1:~$ cat /etc/sonic/config_db.json
{
    "DEVICE_METADATA": {
        "localhost": {
            "docker_routing_config_mode": "split",
            "hostname": "Leaf1",
            "hwsku": "DellEMC-S5248f-P-25G",
            "mac": "8c:04:ba:a7:ee:c0",
            "platform": "x86_64-dellemc_s5248f_c3538-r0",
            "type": "LeafRouter"
        }
    },
    "FLEX_COUNTER_TABLE": {
        "PFCWD": {
            "FLEX_COUNTER_STATUS": "enable"
        },
        "PORT": {
            "FLEX_COUNTER_STATUS": "enable"
        },
        "QUEUE": {
            "FLEX_COUNTER_STATUS": "enable"
        }
    },
    "HARDWARE": {
        "ACCESS_LIST": {
            "COUNTER_MODE": "per-rule",
            "LOOKUP_MODE": "optimized"
        }
    },
    "INTERFACE": {
        "Ethernet52": {
            "ipv6_use_link_local_only": "enable"
        }
    },
    "LOOPBACK_INTERFACE": {
        "Loopback0": {},
        "Loopback0|10.0.1.1/32": {}
    },
    "MGMT_INTERFACE": {
        "eth0": {},
        "eth0|100.67.204.35/24": {}
    },
    "MGMT_PORT": {
        "eth0": {
            "admin_status": "up",
            "autoneg": "true",
            "description": "Management0",
            "mtu": "1500",
            "speed": "1000"
        }
    }
***** OUTPUT TRUNCATED *****

Python-based SONiC CLI


The Python-based SONiC CLI is built on the Python Click library, which gives administrators a customizable approach to creating
command-line tools. Click lets users perform various configurations while leveraging the defaults that are provided with the package.

NOTE: For more information about Click, see https://click.palletsprojects.com/en/7.x/.

NOTE: The Python-based SONiC CLI is also referred to as the Click CLI, or SONiC CLI.

The following are some of the configurations supported using the Python-based SONiC CLI:

admin@sonic:~$ sudo config ?
Usage: config [OPTIONS] COMMAND [ARGS]...

  SONiC command line - 'config' command

Options:
  --help  Show this message and exit.

Commands:
  aaa            AAA command line
  acl            ACL-related configuration tasks
  bgp            BGP-related configuration tasks
  classifier     Classifiers related configuration tasks
  copp           Configure COPP
  core           Configure coredump
  custom_assert  Configuration action on assert
  ecn            ECN-related configuration tasks
  export
  flow           Flow related configuration tasks
  hardware       Configure hardware parameters
  hostname
  igmp_snooping  igmp-snooping configuration tasks
  interface      Interface-related configuration tasks
***** OUTPUT TRUNCATED *****

NOTE: Root privilege is required to run the configuration commands.

For example, both IPv4 and IPv6 addresses can be assigned to interfaces. To assign or remove an IP address for an interface, use the
following commands:

sudo config interface ip add <interface-name> <ip-address>


sudo config interface ip remove <interface-name> <ip-address>

To assign the IP address 192.168.1.0/31 to Ethernet0 on the switch and save the configuration, enter the following commands:

sudo config interface ip add Ethernet0 192.168.1.0/31


sudo config save
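
A /31 prefix contains exactly two addresses, one for each end of a point-to-point link. The following Python sketch uses the standard ipaddress module to validate an address before it is placed in the command above and to compute the address of the other end of the link. The function names ip_add_command and p2p_peer are hypothetical.

```python
import ipaddress


def ip_add_command(interface_name, cidr):
    """Return the config command after checking that cidr is well formed."""
    ipaddress.ip_interface(cidr)  # raises ValueError on malformed input
    return f"sudo config interface ip add {interface_name} {cidr}"


def p2p_peer(cidr):
    """Return the other address of a /31 point-to-point link."""
    interface = ipaddress.ip_interface(cidr)
    if interface.network.prefixlen != 31:
        raise ValueError("expected a /31 prefix")
    first, second = list(interface.network)  # a /31 holds two addresses
    return second if interface.ip == first else first


print(ip_add_command("Ethernet0", "192.168.1.0/31"))
print(p2p_peer("192.168.1.0/31"))
```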

The following example shows the configuration of VLAN 1611 on a switch using the Python-based SONiC CLI.
To configure an interface in a VLAN, assign the VLAN ID, then assign an interface to the VLAN ID. Optionally, an IP address can be
assigned to the VLAN.
Enter the following command to create VLAN 1611:

sudo config vlan add 1611

To add interface Ethernet4 to VLAN 1611, enter the following command:

sudo config vlan member add 1611 Ethernet4

To assign IP address 172.16.11.254/24 to VLAN 1611, enter the following command:

sudo config interface ip add Vlan1611 172.16.11.254/24
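
The three steps above follow a fixed order: create the VLAN, add the members, then optionally assign an IP address. The following Python sketch builds that command sequence as strings, using only the command forms shown in this section. It does not run the commands, and the function name vlan_commands is hypothetical.

```python
def vlan_commands(vlan_id, members, ip_prefix=None):
    """Return the ordered CLI commands that configure one VLAN."""
    if not 1 <= vlan_id <= 4094:
        raise ValueError("VLAN ID must be between 1 and 4094")
    commands = [f"sudo config vlan add {vlan_id}"]
    for port in members:
        commands.append(f"sudo config vlan member add {vlan_id} {port}")
    if ip_prefix is not None:
        # The IP address is assigned to the VLAN interface, named Vlan<ID>.
        commands.append(f"sudo config interface ip add Vlan{vlan_id} {ip_prefix}")
    return commands


for command in vlan_commands(1611, ["Ethernet4"], "172.16.11.254/24"):
    print(command)
```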

Free Range Routing


Another method to configure routing uses the Free Range Routing (FRR) shell.

NOTE: Free Range Routing is also referred to as FRR and FRRouting.

FRR provides the configuration and management of Layer 3 protocols. FRR can be used to configure routing protocols such as BGP and
RIP. FRR operates as a suite of daemons that work together, with each protocol implemented as a separate service. Some of the
advantages of using FRR include:
• High resiliency
• Independent daemons for protocols
• Flexibility
• Open source
Some of the protocols that FRR supports include:
• EIGRP
• BABEL
• RIP
• OSPF
• IS-IS
• PIM
• VRRP
• BGP
FRR also supports various use cases that require routing and routing protocols.
NOTE: For detailed information about FRR and supported platforms, see http://docs.frrouting.org/en/latest/overview.html.
By default, the FRR stack is installed in the SONiC NOS and is available when the switch boots. The FRR CLI is accessible from the Linux
shell.
SONiC routing protocols are configured in the FRR shell. To enter the FRR shell from SONiC, enter the vtysh command.

Figure 5. Leaf-spine topology

The following is a sample eBGP configuration that uses the FRR shell on Leaf1, as shown in the previous figure.

router bgp 65101
 bgp router-id 10.0.2.1
 bgp graceful-restart
 bgp bestpath as-path multipath-relax
 neighbor SPINE peer-group
 neighbor SPINE timers 3 9
 neighbor SPINE advertisement-interval 5
 neighbor 192.168.1.0 peer-group SPINE
 neighbor 192.168.1.0 remote-as 65100
 neighbor 192.168.2.0 peer-group SPINE
 neighbor 192.168.2.0 remote-as 65100
 address-family ipv4 unicast
  redistribute connected route-map spine-leaf
 exit-address-family
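
The neighbor statements in the configuration above repeat the same pattern for each spine. The following Python sketch renders a subset of that configuration (router ID, peer group, and neighbors) from a list of peers, in the same statement order as the sample, and rejects a neighbor whose AS number equals the local AS, because such a session would not be eBGP. The function name render_ebgp is hypothetical, and the output is text only; it is not applied to FRR.

```python
def render_ebgp(local_as, router_id, peer_group, neighbors):
    """Render a partial 'router bgp' stanza.

    neighbors is a list of (address, remote_as) tuples.
    """
    lines = [
        f"router bgp {local_as}",
        f" bgp router-id {router_id}",
        f" neighbor {peer_group} peer-group",
    ]
    for address, remote_as in neighbors:
        if remote_as == local_as:
            raise ValueError(f"{address}: eBGP requires a different remote AS")
        lines.append(f" neighbor {address} peer-group {peer_group}")
        lines.append(f" neighbor {address} remote-as {remote_as}")
    return "\n".join(lines)


text = render_ebgp(
    65101,
    "10.0.2.1",
    "SPINE",
    [("192.168.1.0", 65100), ("192.168.2.0", 65100)],
)
print(text)
```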

Simple Network Management Protocol


In general, network management stations use Simple Network Management Protocol (SNMP) to retrieve and modify the configuration of
managed objects through an agent on each network device. The SNMP agent in a managed device maintains the data for
managed objects in management information bases (MIBs). Managed objects are identified by their object identifiers (OIDs). A remote
SNMP manager performs an SNMP walk on the OIDs stored in MIBs on the local switch to view and retrieve information.
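A walk returns the objects under a subtree in OID order. The following Python sketch models this against an in-memory table rather than a real agent. The sample values are hypothetical, while the OIDs shown are the standard sysDescr, sysName, and ifDescr objects.

```python
def oid_key(oid):
    """Convert a dotted OID string into a tuple of integers for ordering."""
    return tuple(int(part) for part in oid.strip(".").split("."))


def walk(mib, root):
    """Return (oid, value) pairs under root in OID order, like an SNMP walk."""
    root_key = oid_key(root)
    matches = [
        (oid, value)
        for oid, value in mib.items()
        if oid_key(oid)[: len(root_key)] == root_key
    ]
    return sorted(matches, key=lambda item: oid_key(item[0]))


# Hypothetical sample data held by an agent.
mib = {
    "1.3.6.1.2.1.1.5.0": "Leaf1",            # sysName.0
    "1.3.6.1.2.1.2.2.1.2.10": "Ethernet10",  # ifDescr.10
    "1.3.6.1.2.1.2.2.1.2.2": "Ethernet2",    # ifDescr.2
    "1.3.6.1.2.1.1.1.0": "SONiC",            # sysDescr.0
}
system_subtree = walk(mib, "1.3.6.1.2.1.1")
if_descr = walk(mib, "1.3.6.1.2.1.2.2.1.2")
print(system_subtree)
print(if_descr)
```

OIDs are compared as sequences of integers, so ifDescr.2 sorts before ifDescr.10. A plain string sort would order them the other way round.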

NOTE: For more information, see SONiC system architecture.

The SNMP versions supported by SONiC are:


• SNMPv1
• SNMPv2c
The MIBs supported by Enterprise SONiC Distribution by Dell Technologies include:
• SNMPv2-MIB
• RFC1213-MIB
• LLDP-MIB
• IF-MIB
• IP-MIB
• TCP-MIB
• UDP-MIB
• IPV6-MIB
• BRIDGE-MIB
• Q-BRIDGE-MIB
• HOST-RESOURCES-MIB
• DISMAN-EVENT-MIB
• MTA-MIB
• ENTITY-MIB
• NOTIFICATION-LOG-MIB
• CISCO-PFC-EXT-MIB
• UCD-SNMP-MIB
• NET-SNMP-AGENT-MIB
• NET-SNMP-VACM-MIB
• SNMP-FRAMEWORK-MIB
• SNMP-MPD-MIB
• SNMP-TARGET-MIB
• SNMP-NOTIFICATION-MIB
• SNMP-USER-BASED-SM-MIB
• SNMP-VIEW-BASED-ACM-MIB
Thus, SNMP provides a functional model to retrieve information, configure entities, and monitor the SONiC switch.

Summary of management models


The methods to configure SONiC are listed below:
• Linux shell and Python-based SONiC CLI (see figure)
• config_db.json (see figure)
• minigraph.xml
• FRR (see figure)
• SNMP

Figure 6. Linux shell and Python-based SONiC CLI

Figure 7. Config_db.json file

Figure 8. Configuration using the FRR shell

Challenges with current management models


SONiC provides disparate choices for the configuration of the NOS. While different options are available and may look beneficial, there are
challenges:
• Configuration model is not centralized
The user has to navigate between different shells such as Click CLI and FRR shell, JSON files, and .xml files to configure the switch.
• Back-up of configurations
The user cannot back up the switch configurations from a single place. FRR, config_db.json, and minigraph.xml configurations
might be needed for a single switch.
• Varied commands and formats
The user must familiarize themselves with different configuration commands and syntaxes. For example, configurations that use the
Python-based SONiC CLI are of a different format from the configuration that is done in the FRR shell.
• Monitoring alerts
The management and configuration of alerts can prove to be a challenge, without options for a cohesive way to send alerts.
• Misconfiguration due to different options
Misconfiguration of a switch is possible if a user doesn’t update the relevant shell or config file when configuring a switch with existing
configuration files.
• Root privilege to run configuration commands
A user requires root privilege to run the configuration commands for Python-based SONiC CLI.
With such challenges existing with current management models, there is a need for a comprehensive and centralized management
framework.

Summary
This chapter covered the legacy management models for the SONiC NOS. It described the configuration methods, with examples, for the
Linux shell, Python-based SONiC CLI, FRRouting shell, config_db.json file, and SNMP. Using these disparate management models
has advantages and disadvantages and adds operational complexity. To avoid these difficulties, the SONiC Management Framework
pioneered by Dell Technologies provides a single way to configure features, managed from a single place.

3
SONiC Management Framework
Topics:
• Management framework
• Advantages of using the Management Framework CLI
• Access the CLI
• Management framework CLI
• REST
• gRPC Network Management Interface
• gRPC Network Operations Interface
• DevOps, automation, and telemetry
• Authentication, authorization, and accounting services
• Role-based access control
• Summary

Management framework
To overcome the challenges discussed in the previous chapter, Dell Technologies has taken the lead to introduce a new Management
Framework for SONiC. The Management Framework includes:
• CLI
• REST
• gNMI
• gNOI
The Management Framework CLI is a SONiC application responsible for managing configuration and status on SONiC switches. This
application provides a coherent way to validate, configure, and manage the features running on the SONiC NOS. This intuitive and holistic
CLI provides a familiar way of configuring various functions and features available in the switch. There are numerous benefits in using the
CLI as it allows for centralized operation and management.
SONiC may also be configured and administered using a REST API and a gRPC network management interface (gNMI) using YANG data
models. The Management Framework supports both standard and custom YANG models for communication with the corresponding
management servers. The Management Framework runs in a single container named sonic-mgmt-framework. The following
command lists the docker containers and displays only the Management Framework docker container:

admin@Leaf1:~$ docker ps | grep mgmt
00983c37e4a7 docker-sonic-mgmt-framework-dbg:latest "/usr/bin/supervisord" 27 hours ago Up 27 hours mgmt-framework

NOTE: Legacy methods of configuration, such as the FRR shell and config_db.json, are supported.

Advantages of using the Management Framework CLI
The management model for configuring and using SONiC switches is now centralized. You rarely need to move between
different shells and JSON files to configure and validate the switches. Backing up configurations is also simpler, because the switch
configurations from different management models need not be backed up individually.
Because the CLI provides centralized management, you do not need to learn the commands and formats associated with various shells. For
example, the differences between configuration in the Linux shell, the Python-based SONiC CLI, and the FRR shell become immaterial when
you use the CLI. The management and configuration of alerts are streamlined, with a cohesive way to send alerts. Misconfiguration
caused by the availability of different shells is eliminated by using the CLI.

18 SONIC Management Framework


Access the CLI
After logging into the switch, the user can access the CLI from Linux shell using the command shown below:
sonic-cli

Figure 9. Accessing the CLI

Management framework CLI


The CLI supports numerous show commands. To view the various show commands that the CLI supports, type show ?.

sonic# show ?
aaa Show aaa info
bfd Bidirection Forwarding Detection
bgp show bgp community, ext community, as list information
image Show image information
interface Show Interface info
ip show IP commands
ipv6 show IP commands
kdump Show kdump status
link Show link state tracking information
lldp Show lldp information
mac Show MAC
mclag Show mclag domain
mirror-session Show Mirror session information
nat Show NAT info
NeighbourSuppressStatus Show arp and nd suppression status
platform Show platform information
PortChannel LAG status and configuration
***** OUTPUT TRUNCATED ******

A user can take advantage of the CLI to access the information and data that is associated with the SONiC NOS and hardware. The
following examples depict how the show command gathers system details.
The show system command lists the attributes such as the hostname, boot time, current date and time, and domain name.

sonic# show system
-----------------------------------------------------------
Attribute Value/State
-----------------------------------------------------------
Hostname :Leaf3
Boot Time :1577362334
Current Datetime :2020-01-08T16:44:22Z+00:00
Domain Name :None

The show system memory command provides details on the total available memory and the used memory.

sonic# show system memory
-----------------------------------------------------------
Attribute Value/State
-----------------------------------------------------------
Used :2951756
Total :8162428

The CLI supports many configuration commands. To enter configuration mode, use the configure terminal command. To list the available
configuration commands, type ? at the sonic(config)# prompt.

sonic(config)# ?
aaa AAA configuration
bfd Configure BFD peers
bgp bgp command list
end Exit to the exec Mode
evpn EVPN Global Configuration
exit Exit from current mode
interface Select an interface
ip Global IP configuration subcommands
ipv6 ipv6 prefix-list
kdump kdump command
link Create link state tracking group
mclag domain
mirror-session Mirror session configuration
***** OUTPUT TRUNCATED ******

Management Framework CLI use case example


Various use cases are associated with the Management Framework CLI. The CLI allows a user to configure the Enterprise SONiC
Distribution by Dell Technologies for enterprise and scalable use cases. One use case for an enterprise data center uses ZTP to configure
several switches simultaneously. To do that, the ZTP service must be enabled on the NOS. The ZTP service is enabled by default when
the NOS boots. If the service is disabled, use the ztp enable CLI command to reenable the service.
Another use case example relates to a scalable architecture in the enterprise data center. The Layer 3 leaf-spine architecture model
requires the configuration of the leaf and spine switches, with the Layer 2 and Layer 3 boundary defined at the leaf switches.

Figure 10. Assigning IP addresses on switch Leaf1 using the CLI

The following example shows the initial configuration steps for assigning an IP address to a loopback interface, and point-to-point
interfaces in switch Leaf 1 within the leaf-spine topology shown in the previous figure.

interface Loopback 0
ip address 10.0.2.1/32
exit

interface Ethernet 0
description To-Spine1
ip address 192.168.1.1/31
no shutdown
exit

interface Ethernet 4
description To-Spine2
ip address 192.168.2.1/31
no shutdown
exit

interface Ethernet 12
description To-Dynamic-neighbor
ip address 50.50.50.2/24
no shutdown
exit

REST
SONiC Management Framework provides various common northbound interfaces (NBIs) for managing configuration and retrieving
status on SONiC switches, including the CLI, gNMI, and REST. For example, a user can use a REST client to perform POST, PUT, PATCH,
DELETE, and GET operations on the supported data models and URL paths. The management REST server uses HTTP or HTTPS.
NOTE: For more information, see https://github.com/project-arlo/SONiC/blob/master/doc/mgmt/Management%20Framework.md#3122-REST.

REST use case example


Consider the use case example of accessing the REST API using the management IP address of the switch for viewing switch information.
The management IP address of the switch in this example is 100.67.204.213. To access the REST API, in a web browser, type:
https://100.67.204.213/ui

Figure 11. SONiC REST API explorer

Select the REST API for openconfig-platform to open its page. The page lists all the REST API paths and the actions available for
each, such as GET, POST, PATCH, DELETE, and PUT. When a request is executed, the tool fetches the data through the REST API.
The following figure shows the result for the "components" data.

Figure 12. Switch data retrieval using REST API for openconfig-platform
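The same data can also be requested from a script instead of the browser-based explorer. The following Python sketch only builds the request and does not send it. The RESTCONF-style path and the credentials are assumptions made for illustration, so confirm the actual path in the REST API explorer on the switch.

```python
import base64
import urllib.request


def build_get_request(switch_ip, path, username, password):
    """Build (but do not send) an HTTPS GET request with basic authentication."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    request = urllib.request.Request(f"https://{switch_ip}{path}", method="GET")
    request.add_header("Authorization", f"Basic {token}")
    request.add_header("Accept", "application/yang-data+json")
    return request


# Assumed path and placeholder credentials, for illustration only.
req = build_get_request(
    "100.67.204.213",
    "/restconf/data/openconfig-platform:components",
    "admin",
    "example-password",
)
print(req.get_method(), req.full_url)
```

Passing the request to urllib.request.urlopen, together with an SSL context that trusts the switch certificate, would send it.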

gRPC Network Management Interface


gRPC Network Management Interface (gNMI) is a protocol, based on the gRPC remote procedure call framework, for managing network
devices. gRPC is an open-source framework, originally developed by Google, that connects services efficiently. SONiC provides the gNMI
server, while the user provides the client. gNMI uses the gRPC framework to enable telemetry and configuration management of devices.
Because gNMI can both stream data from the network device and configure and retrieve operational and configuration state, it
provides a powerful method for switch management.
The supported RPC operations include:
• Get
• Set
• Capabilities
• Subscribe

gNMI use case example


The use case for gNMI includes streaming telemetry, verifying the state of interfaces, setting the MTU for an interface, configuring and
manipulating ACL, and so on. This section provides an example of setting MTU on an interface.

Use case example for setting MTU on interface Ethernet0
The following example provides the means to set the MTU on interface Ethernet0.

sudo -i
docker exec -it telemetry bash
gnmi_set -username admin -password Dell\!234 \
  -update /openconfig-interfaces:interfaces/interface[name=Ethernet0]/config/mtu:@mtu.json \
  -target_addr 127.0.0.1:8080 -insecure true -pretty

The json file mtu.json contains the following:

{
"mtu":1514
}
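When the same change is applied to many interfaces, the payload and the command line can be generated. The following Python sketch reuses only the gnmi_set flags shown in the example above. The helper name and the placeholder password are assumptions.

```python
import json
import shlex


def build_gnmi_set(interface, mtu, username, password, target="127.0.0.1:8080"):
    """Return the mtu.json text and the gnmi_set argument list for one interface."""
    payload = json.dumps({"mtu": mtu}, indent=2)
    path = (
        "/openconfig-interfaces:interfaces/"
        f"interface[name={interface}]/config/mtu"
    )
    argv = [
        "gnmi_set",
        "-username", username,
        "-password", password,
        "-update", f"{path}:@mtu.json",
        "-target_addr", target,
        "-insecure", "true",
        "-pretty",
    ]
    return payload, argv


payload, argv = build_gnmi_set("Ethernet0", 1514, "admin", "example-password")
print(payload)              # contents to write to mtu.json
print(shlex.join(argv))     # shell-quoted command line
```

Keeping the arguments as a list avoids quoting problems if the command is later run with subprocess.run inside the telemetry container.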

gRPC Network Operations Interface


gRPC Network Operations Interface (gNOI) is a set of gRPC-based microservices that run commands on a network device
using protocol buffers. Protocol buffers are used to serialize structured data.

NOTE: For more information, see https://developers.google.com/protocol-buffers.

DevOps, automation, and telemetry


Options for DevOps, automation, and telemetry for Enterprise SONiC Distribution by Dell Technologies include:
• OpenConfig support
• gRPC Network Management Interface (gNMI)
• gRPC Network Operations Interface (gNOI)

OpenConfig option
OpenConfig aims to develop networks that are dynamic and programmable using software-defined networking principles, such as model-
driven management and operations. For more information, see http://openconfig.net/.

gNMI option
gRPC Network Management Interface (gNMI) uses the gRPC framework to enable telemetry and configuration management of devices.
gRPC is a remote procedure call framework, originally developed by Google, that connects services efficiently. The gNMI service
describes an interface through which the network management system interacts with a network element. gNMI streams data from the
network device and can configure and retrieve operational and configuration state, which provides a powerful method for
switch management.

gNOI option
gRPC Network Operations Interface (gNOI) is a set of gRPC-based microservices that run commands on a network device
using protocol buffers. Protocol buffers are used to serialize structured data.

NOTE: For more information, see https://developers.google.com/protocol-buffers.

Authentication, authorization, and accounting services
Authentication, authorization, and accounting (AAA) services secure networks against unauthorized access. Besides local authentication,
Enterprise SONiC Distribution by Dell Technologies supports remote authentication dial-in user service (RADIUS) and terminal access
controller access control system (TACACS+) client/server authentication systems. For RADIUS and TACACS+, a SONiC switch acts as a
client and sends authentication requests to a server that contains all user authentication and network service access information.
For user authentication, the SONiC REST API uses:
• HTTP basic authentication
• Server certificates
• JSON Web Token (JWT)-coded tokens
REST API authentication using a certificate requires a client certificate to be sent by the client. The certificate is signed by a certificate
authority (CA) and contains the common name (CN) field set to the name of the user.
There are three types of authentication which you can include in gNMI requests:
• Username and password
The username and password are sent in the metadata in the request.
• JSON web token (JWT)
JWT requires you first to authenticate using a gNOI RPC call by providing a username and password.
• Certificates
Certificate authentication requires the use of a valid certificate, signed by the certificate authority (CA) specified in the switch, and
must contain the username in the common name (CN) field.

Role-based access control


Role-based access control (RBAC) provides control for access and authorization. Users are granted permission based on defined roles.
The administrator can create user roles that are based on job functions to allow users appropriate system access. RBAC places limitations
on each role’s permissions to allow you to partition tasks. You can assign each user only a single role. Several users can have the same role.
A user role authenticates and authorizes a user at login and places the user in EXEC mode. Each user role assigns permissions that
determine the commands that a user can enter, and the actions a user can perform. RBAC provides an efficient way to administer user
rights.
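The relationship between users, roles, and permitted commands can be pictured as two lookup tables. The following Python sketch is a simplified model only. The role names, user names, and command prefixes are hypothetical and do not describe the roles that the product ships with.

```python
ROLE_PERMISSIONS = {
    # Hypothetical roles and the command prefixes each one may run.
    "admin": ("show", "configure"),
    "operator": ("show",),
}

# Each user has exactly one role; several users may share a role.
USER_ROLES = {"alice": "admin", "bob": "operator", "carol": "operator"}


def is_allowed(user, command):
    """Return True if the user's single role permits the command."""
    role = USER_ROLES.get(user)
    allowed_prefixes = ROLE_PERMISSIONS.get(role, ())
    return any(
        command == prefix or command.startswith(prefix + " ")
        for prefix in allowed_prefixes
    )


print(is_allowed("alice", "configure terminal"))
print(is_allowed("bob", "configure terminal"))
```

Because permissions attach to the role rather than to the user, changing what operators may do requires one edit instead of one edit per user.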

Summary
This chapter covered the advantages and use cases of the SONiC Management Framework, how to access the CLI and configure features
with it, the integration of infrastructure management with the SONiC NOS, and the available DevOps, automation, and telemetry options.
The SONiC Management Framework makes it easier to configure and manage the Enterprise SONiC Distribution by Dell Technologies NOS,
overcoming the challenges that arise from using disparate traditional management models such as the FRR shell and
config_db.json.

4
Zero Touch Provisioning
Topics:
• Introduction
• Advantages of using ZTP
• ZTP process
• User-defined ZTP JSON file use case example
• Example of configuring ZTP using CLI
• Summary

Introduction
The Zero-Touch Provisioning (ZTP) service upgrades firmware and configures many switches simultaneously. ZTP can also perform
connectivity checks after the upgrade. As the data center scales and more network equipment is added, the switches must be upgraded
to the required firmware and then configured. Updating the firmware and configuring each switch manually is inefficient and
time-consuming.
To carry out ZTP, a booting switch requires a connection to the remote provisioning server so that it can download the necessary
firmware, configuration files, and scripts. The SONiC ZTP service processes the provisioning information in the ZTP JSON file.
Bringing up the switch automatically according to user requirements, without user intervention, is a significant advantage.

Advantages of using ZTP


The benefits of using ZTP include:
• Easy installation of switch firmware
• Automatic push of switch configuration files
• Ability to run postprovisioning scripts
• Use of scripts to check connectivity

ZTP process
When a switch boots, the ZTP service starts and checks for the presence of the startup configuration file, config_db.json,
in /etc/sonic/. If the file is not present, the service uses DHCP discovery to learn where the ZTP JSON file is located.
The ZTP JSON file describes the configuration needed for the switch. The administrator can define the configurations that are
required for a particular switch.

NOTE: When ZTP is disabled by the user, the switch boots the default configuration available on the switch.

The DHCP discovery process is done on in-band interfaces and also the management interface. DHCPv4 and DHCPv6 address discovery
are supported. The first interface that receives a valid DHCP offer determines the URL specifying the location of the ZTP JSON file. At
the DHCP server end, the DHCP offer must include the DHCPv4 Option 67 or the DHCPv6 Option 59 to specify the ZTP JSON file URL.
See the following figure.

26 Zero Touch Provisioning


Figure 13. JSON URL
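The selection rule described above, in which the first valid offer that carries the URL option wins, can be sketched in Python. The data layout, interface names, and URLs below are hypothetical, and only the option numbers come from the text.

```python
# Option that carries the ZTP JSON URL, per DHCP version.
ZTP_URL_OPTION = {"dhcpv4": 67, "dhcpv6": 59}


def ztp_url_from_offers(offers):
    """Return (interface, url) from the first offer that carries a ZTP JSON URL."""
    for offer in offers:  # offers are assumed to be listed in arrival order
        option = ZTP_URL_OPTION.get(offer["protocol"])
        url = offer["options"].get(option)
        if url:
            return offer["interface"], url
    return None


offers = [
    {"interface": "Ethernet0", "protocol": "dhcpv4",
     "options": {3: "192.168.1.254"}},
    {"interface": "eth0", "protocol": "dhcpv4",
     "options": {67: "http://192.168.1.1/ztp.json"}},
    {"interface": "Ethernet4", "protocol": "dhcpv6",
     "options": {59: "http://[2001:db8::1]/ztp.json"}},
]
print(ztp_url_from_offers(offers))
```

The first offer is skipped because it lacks Option 67, so the management interface eth0 supplies the URL in this example.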

The ZTP service performs the configuration steps, which may cause a switch to reboot multiple times. After the downloaded ZTP JSON
file is processed, the SONiC ZTP service exits, as shown in the following figures.

Figure 14. ZTP JSON file

Figure 15. Sections of ZTP JSON file run

NOTE: For more information, see https://github.com/Azure/SONiC/blob/master/doc/ztp/ztp.md.

User-defined ZTP JSON file use case example


Consider the use case of enterprise data centers with leaf switch pairs at the top of rack (ToR). It is tedious to configure the switches
individually, and ZTP is an excellent choice for pushing the needed firmware and configurations.
The configuration sections that are defined in the ZTP JSON file are processed in the lexical order of their names. Naming the sections
with a numeric prefix therefore lets the user control the order in which they run.
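The effect of lexical ordering is easy to check with a short Python sketch. It assumes that lexical order means plain string comparison, which is also what the built-in sorted function uses. The unpadded names in the second list are hypothetical.

```python
sections = [
    "03-provisioning-script",
    "01-firmware",
    "04-connectivity-check",
    "02-configdb-json",
]
# Zero-padded prefixes sort in the intended run order.
run_order = sorted(sections)
print(run_order)

# Without zero padding, "10-..." sorts before "2-...".
unpadded = sorted(["2-configdb-json", "10-connectivity-check", "1-firmware"])
print(unpadded)
```

Using a fixed-width prefix such as 01, 02, and so on avoids the surprise in the second list once there are ten or more sections.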

Figure 16. Leaf-spine topology use case for ZTP

An example ZTP JSON file to update the firmware and configuration for Leaf3 in the figure is shown below:

{
  "ztp": {
    "01-firmware": {
      "install": {
        "url": "http://192.168.1.1/SONiC-OS-3.0.1-Enterprise_Base.bin"
      }
    },
    "02-configdb-json": {
      "dynamic-url": {
        "source": {
          "prefix": "http://192.168.1.1/",
          "identifier": "Leaf3",
          "suffix": "_config_db.json"
        }
      }
    },
    "03-provisioning-script": {
      "plugin": {
        "url": "http://192.168.1.1/post_install.sh"
      },
      "reboot-on-success": true
    },
    "04-connectivity-check": {
      "ping-hosts": [ "172.16.11.1", "172.16.11.2" ]
    }
  }
}

This is from the Examples section of https://github.com/Azure/SONiC/blob/master/doc/ztp/ztp.md#62-configuration-commands.


The config_db.json file for this switch is saved as Leaf3_config_db.json in the web root of the HTTP server at 192.168.1.1. This
naming allows the ZTP service to uniquely identify the configuration file associated with the switch, so the same server can host a
separate config_db.json file for each of several switches. The post_install.sh postprovisioning script is then downloaded and run.
After the system reboots, the ZTP service performs a connectivity check by pinging hosts 172.16.11.1 and 172.16.11.2.
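The per-switch file name is the concatenation of the prefix, identifier, and suffix fields. The following Python sketch shows that composition, treating the identifier as a literal string as in this example. The helper name and the Leaf4 entry are assumptions.

```python
def dynamic_url(source):
    """Join prefix, identifier, and suffix into a download URL."""
    return source["prefix"] + source["identifier"] + source["suffix"]


source = {
    "prefix": "http://192.168.1.1/",
    "identifier": "Leaf3",
    "suffix": "_config_db.json",
}
url = dynamic_url(source)
print(url)

# The same prefix and suffix serve every switch; only the identifier changes.
urls = {name: dynamic_url({**source, "identifier": name}) for name in ("Leaf3", "Leaf4")}
print(urls["Leaf4"])
```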

Example of configuring ZTP using CLI
The CLI facilitates the management of the ZTP service. The CLI can be used to do the following activities:
• Show the status of the ZTP service
• Enable the ZTP service
• Disable the ZTP service
In this section, sample configurations and show commands for ZTP are provided.

Enable ZTP
Use the ztp enable command to administratively enable ZTP.

NOTE: By default, the ZTP service is enabled. The ztp enable command reenables the ZTP after a user disables it.

Enable ZTP on the Leaf-3 switch by running the Leaf-3(config)# ztp enable command.

Verify ZTP
Use the show ztp-status command to view the current ZTP configuration of the switch and detailed information about the current
state of a ZTP session.
For example, on the Leaf-3 switch:

Leaf-3 # show ztp-status
========================================
ZTP
========================================
ZTP Admin Mode : True
ZTP Service : Inactive
ZTP Status : SUCCESS
ZTP Source : dhcp-opt67 (eth0)
Runtime : 05m 31s
Timestamp : 2019-09-11 19:12:16 UTC
ZTP JSON Version : 1.0

ZTP Service is not running

----------------------------------------
01-configdb-json
----------------------------------------
Status : SUCCESS
Runtime : 02m 48s
Timestamp : 2019-09-11 19:11:55 UTC
Exit Code : 0
Ignore Result : False

----------------------------------------
02-connectivity-check
----------------------------------------
Status : SUCCESS
Runtime : 04s
Timestamp : 2019-09-11 19:12:16 UTC
Exit Code : 0
Ignore Result : False

Disable ZTP
You can stop and disable the ZTP service. Once the ZTP service is disabled, it no longer starts automatically when the switch reboots
or when the startup configuration file is not present. You must manually reenable the service.
To disable ZTP, use the no ztp enable command.
On the Leaf-3 switch, ZTP can be disabled by running the Leaf-3(config)# no ztp enable command.

Summary
The Zero-Touch Provisioning (ZTP) service is used to upgrade firmware and configure many switches simultaneously. ZTP can also
perform connectivity checks after an upgrade when a connectivity-check section is added to the user-defined ZTP JSON file. This
chapter covered the advantages of using ZTP, the ZTP process, a use case example of a user-defined ZTP JSON file, and examples of
enabling the ZTP service, verifying its status, and disabling it.

5
Monitoring and Telemetry
Topics:
• Introduction
• Telemetry terminology
• gNMI modes
• Supported RPC operations
• Visibility with advanced telemetry
• Summary

Introduction
Network data is gaining importance in scalable infrastructures, where the number of monitored devices keeps growing. Collecting
network data, such as operational state and configuration, can significantly assist network analysis, which improves network stability
and troubleshooting. Traditional methods for collecting network data include SNMP and Syslog. Network data can be collected in
two ways:
• Pulling data from a network device
• Having the networking device stream data to the management system
The efficiency of streaming telemetry rests in the streaming of incremental data updates from the network device to the management
system. The administrator can subscribe to the data they need.
Streaming telemetry has several benefits. It removes the inefficiencies of traditional network telemetry (SNMP), which requires
management systems to poll for the data regardless of whether it has changed.
Streaming telemetry provides insight into the network data, which can be used for network analysis, remediation of issues, and network
optimization. SONiC NOS supports streaming telemetry using gNMI.

Telemetry terminology
This section describes the various terms for telemetry.

Term                  Definition
Sensor path           The path used to collect data for streaming telemetry.
Sensor group          A reusable group of multiple sensor paths and exclude filters.
Destination group     The IP address and transport port on a destination server to which telemetry data is
                      streamed. You can configure multiple destinations and reuse the destination group in
                      subscription profiles.
Subscription profile  The data collector destinations and stream attributes that are associated with sensor
                      paths. A subscription ties sensor paths and a destination group to a transport protocol,
                      encoding format, and streaming interval. The telemetry agent in the switch attempts to
                      establish a session with each collector in the subscription profile and streams data to
                      the collector. If a collector is not reachable, the telemetry agent retries the
                      connection at one-minute intervals.
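The relationship between these terms can be modeled with a few data classes. The following Python sketch is an illustrative data model only and is not a SONiC API. The class and field names, addresses, ports, and sensor path are assumptions, and exclude filters are omitted for brevity.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class SensorGroup:
    name: str
    sensor_paths: List[str]


@dataclass
class DestinationGroup:
    name: str
    destinations: List[Tuple[str, int]]  # (IP address, transport port)


@dataclass
class SubscriptionProfile:
    name: str
    sensor_groups: List[SensorGroup]
    destination_group: DestinationGroup
    transport: str
    encoding: str
    interval_seconds: int

    def streams(self):
        """List every (destination, sensor path) pair this profile would stream."""
        return [
            (dest, path)
            for dest in self.destination_group.destinations
            for group in self.sensor_groups
            for path in group.sensor_paths
        ]


interfaces = SensorGroup(
    "interfaces", ["/openconfig-interfaces:interfaces/interface/state/counters"]
)
collectors = DestinationGroup("collectors", [("192.0.2.10", 50051), ("192.0.2.11", 50051)])
profile = SubscriptionProfile("if-counters", [interfaces], collectors, "grpc", "json", 30)
print(profile.streams())
```

Because the sensor group and destination group are separate objects, the same group of collectors can be reused by several subscription profiles.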

gNMI modes
gNMI telemetry sessions can be established in two modes, which differ in which side initiates the session:
• Dial-in
• Dial-out

Monitoring and Telemetry 31


Dial-in
The collector initiates and establishes a session with the switch. Once the session is established, telemetry data is sent from the switch.

Dial-out
The switch initiates and establishes a session with the collector. Once the session is established, telemetry data is sent from the switch.

Supported RPC operations


NOTE: Enterprise SONiC Distribution by Dell Technologies supports only Dial-in mode.

The existing SONiC telemetry framework has been extended to support the new gNMI services. All four gNMI services are supported:
• Get
• Set
• Capabilities
• Subscribe
A use case example includes using gNMI for streaming telemetry. For more information about the supported gNMI services and examples,
see https://github.com/Azure/SONiC/blob/master/doc/mgmt/Management%20Framework.md.

Visibility with advanced telemetry


Networks are becoming more sophisticated and more extensive with an increase in business requirements. The advent of virtualization
and software defined networks has furthered the complexity in troubleshooting network issues. To troubleshoot the networks effectively,
superior visibility into the network is needed.
The advantages of using advanced telemetry include:
• Better network visibility
• Ease of managing scalable networks
• Assistance in real-time network issue solving
• Faster response time to network issues
• More in-depth insight into the root cause of the problem
These advantages give administrators significant insight into, and ease of management of, networks in large enterprise data centers and
the cloud. Troubleshooting packet drops, congestion, and other network issues can be difficult without the appropriate insight into the
network. The administrator might want to find a bottleneck or a point of traffic congestion in the network. Other use cases include
tracking latency in the network and troubleshooting dropped packets and their causes. These use cases provide the incentive to use
advanced telemetry.
For example, consider a fast-growing network in an enterprise data center or cloud, as shown in the following figure. As business
requirements increase, scaling the network results in numerous switches to be managed. The use cases listed above can be useful to
the network administrator.

Figure 17. Leaf-spine topology example in a scalable data center

SONiC provides granular, real-time insights from advanced merchant silicon-based telemetry to enable rapid provisioning and remediation
of network issues. These include:
• Inband flow analysis
• Congestion monitoring
• Mirror on drop
Inband flow analysis provides packet-level visibility for granular latency and path tracking at the flow level. Congestion monitoring
uses buffer monitoring to detect port congestion proactively. Mirror on drop sends notifications of packet drops, including the drop
reason and the flows affected. The collector gets the information from the switches using REST APIs.

Inband flow analysis


Inband flow analysis provides a scalable flow-monitoring solution using Inband Telemetry. It provides packet-level visibility for granular
latency, path tracking, and congestion analysis. The collector receives UDP-encapsulated information to provide a network-level solution.
Some of the critical metadata parameters include:
• Ingress and egress timestamp
• Ingress and egress port
• Device ID
• Queue ID and congestion status
• IP TTL value
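As an illustration of how a collector might use this metadata, the following Python sketch derives the per-hop latency and the path of one packet from the timestamps and device IDs. The record layout, field names, and values are hypothetical and do not represent the on-wire format.

```python
def hop_latencies(hops):
    """Return per-hop latency (egress minus ingress timestamp) keyed by device ID."""
    return {
        hop["device_id"]: hop["egress_ts_ns"] - hop["ingress_ts_ns"]
        for hop in hops
    }


# Hypothetical metadata collected for one packet, in path order.
hops = [
    {"device_id": "Leaf1", "ingress_ts_ns": 1_000, "egress_ts_ns": 1_450},
    {"device_id": "Spine1", "ingress_ts_ns": 2_100, "egress_ts_ns": 2_400},
    {"device_id": "Leaf3", "ingress_ts_ns": 3_050, "egress_ts_ns": 3_900},
]
latencies = hop_latencies(hops)
path = [hop["device_id"] for hop in hops]
slowest = max(latencies, key=latencies.get)
print(path, latencies, slowest)
```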

Congestion monitoring
Congestion monitoring provides real-time visibility into the network. It offers scalable tracking of buffer occupancies and the ability to
monitor peak or current occupancy. The granular details can be accessed through counters at the port level, port group level, or service
pool level. It also supports monitoring of various types of drop counters.
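The difference between current and peak occupancy can be shown with a small Python sketch. The class, the threshold, and the sample values are assumptions for illustration and do not reflect how the switch hardware implements buffer monitoring.

```python
class BufferMonitor:
    """Track current and peak buffer occupancy per port (illustrative only)."""

    def __init__(self, threshold_bytes):
        self.threshold_bytes = threshold_bytes
        self.current = {}
        self.peak = {}

    def update(self, port, occupancy_bytes):
        """Record a sample and return True if it reaches the congestion threshold."""
        self.current[port] = occupancy_bytes
        self.peak[port] = max(self.peak.get(port, 0), occupancy_bytes)
        return occupancy_bytes >= self.threshold_bytes


monitor = BufferMonitor(threshold_bytes=800_000)
alerts = [monitor.update("Ethernet0", sample) for sample in (120_000, 910_000, 300_000)]
print(alerts, monitor.current["Ethernet0"], monitor.peak["Ethernet0"])
```

The peak value records the burst even after the current occupancy has fallen back below the threshold, which is why both counters are useful.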

Mirror on drop
Mirror on drop captures drops in both the ingress and egress pipelines. It captures the first dropped packet and generates an event
with details of the drop. The generated event contains the drop reason and is sent to the collector for the mirror on drop flow.
The details include the first dropped packet with its drop reason, and any subsequent change in drop reason.
Inband flow analysis, congestion monitoring, and mirror on drop provide several advantages and solutions in network management. The
information obtained from granular, real-time insights from advanced merchant silicon-based telemetry will help with rapid provisioning,
troubleshooting, and resolving network issues.
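The mirror on drop reporting rule, an event for the first dropped packet and another when the drop reason changes, can be sketched in Python. Tracking the state per flow, the event layout, and the drop reason names are assumptions made for this illustration.

```python
class MirrorOnDrop:
    """Emit an event for a flow's first drop and whenever its drop reason changes."""

    def __init__(self):
        self.last_reason = {}
        self.events = []

    def packet_dropped(self, flow, reason):
        previous = self.last_reason.get(flow)
        if previous is None:
            self.events.append({"flow": flow, "event": "first-drop", "reason": reason})
        elif previous != reason:
            self.events.append({"flow": flow, "event": "reason-change", "reason": reason})
        # Repeated drops with an unchanged reason generate no further events.
        self.last_reason[flow] = reason


mod = MirrorOnDrop()
flow = ("10.0.0.1", "10.0.0.2", 6, 49152, 443)  # hypothetical 5-tuple
for reason in ["acl-deny", "acl-deny", "buffer-full"]:
    mod.packet_dropped(flow, reason)
print([event["event"] for event in mod.events])
```

Three drops produce only two events, which keeps the volume of data sent to the collector small.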

Summary
This chapter covered the methods of obtaining network data using gNMI, including the gNMI modes, the supported RPC operations,
and use cases. Advanced telemetry provides visibility into scalable networks. Because gNMI can stream data from the network device
and configure and retrieve operational and configuration state, it provides a powerful method for switch management.

6
Flow Analysis
Topics:
• Introduction
• sFlow
• Introduction to packet-level network telemetry
• Benefits of packet-level network telemetry
• Everflow
• Summary

Introduction
Flow-based services give the switch better control over traffic by providing a generic framework for the Match and Set features.
Incoming packets are classified according to match rules that use fields from the L2-L4 headers, and the defined actions are applied
accordingly. Examples include QoS remarking and policing, monitoring using SPAN and sFlow, and forwarding actions such as PBR and L2 redirect.
Flow-based monitoring conserves bandwidth by inspecting only specified traffic instead of all interface traffic. It allows you to
monitor only the traffic that is received by the source port and that matches criteria in the ingress access control lists
(ACLs). IPv4 ACLs, IPv6 ACLs, and MAC ACLs support flow-based monitoring.
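The match-and-set model described above can be sketched in a few lines of Python. This is only an illustration of first-match classification: the Rule class, its field names, and the action strings are hypothetical and do not correspond to any SONiC API.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import List, Optional


@dataclass
class Rule:
    """Hypothetical match rule; a field left as None matches any value."""
    src_net: Optional[str] = None
    dst_net: Optional[str] = None
    ip_protocol: Optional[int] = None
    action: str = "forward"

    def matches(self, pkt: dict) -> bool:
        if self.src_net and ip_address(pkt["src_ip"]) not in ip_network(self.src_net):
            return False
        if self.dst_net and ip_address(pkt["dst_ip"]) not in ip_network(self.dst_net):
            return False
        if self.ip_protocol is not None and pkt["ip_protocol"] != self.ip_protocol:
            return False
        return True


def classify(pkt: dict, rules: List[Rule]) -> str:
    """Return the action of the first matching rule, else the default action."""
    for rule in rules:
        if rule.matches(pkt):
            return rule.action
    return "forward"


# Only traffic between the two example subnets is selected for mirroring.
rules = [Rule(src_net="172.16.11.0/24", dst_net="172.16.12.0/24", action="mirror")]
pkt = {"src_ip": "172.16.11.10", "dst_ip": "172.16.12.10", "ip_protocol": 6}
print(classify(pkt, rules))
```

Traffic that matches no rule falls through to the default action, which is how flow-based monitoring avoids inspecting all interface traffic.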

sFlow
sFlow is a standards-based sampling technology that is embedded within switches and routers to monitor network traffic. sFlow provides
traffic monitoring for high-speed networks with many switches and routers. There are two types of sampling:
• Packet-based sampling of packet flows
• Time-based sampling of interface counters
sFlow monitoring consists of an sFlow agent that is embedded in the device and an sFlow collector. The sFlow agent resides anywhere
within the path of the packet. The agent combines the flow samples and interface counters into sFlow datagrams and forwards them to
the sFlow collector at regular intervals. The datagrams include information about the packet header, ingress and egress interfaces,
sampling parameters, and interface counters. Application-specific integrated circuits (ASICs) handle packet sampling. The sFlow
collector analyzes the datagrams that are received from the different devices and produces a network-wide view of the traffic flows.
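As a rough illustration of packet-based sampling, the following Python sketch keeps each packet with probability 1/N and then scales the sample count back up, which is how a collector can estimate total traffic from a small number of samples. The function names are hypothetical, and the sketch is a software model only; on the switch, sampling is done by the ASIC.

```python
import random


def sample_packets(packets, sampling_rate, rng):
    """Keep each packet with probability 1/sampling_rate."""
    return [p for p in packets if rng.randrange(sampling_rate) == 0]


def estimate_total(num_samples, sampling_rate):
    """Scale the number of samples up to an estimate of packets seen."""
    return num_samples * sampling_rate


rng = random.Random(1)          # fixed seed so the run is repeatable
packets = range(200_000)        # stand-in for a stream of packets
samples = sample_packets(packets, 10_000, rng)
print(len(samples), estimate_total(len(samples), 10_000))
```

The estimate becomes more accurate as more samples are collected, so a lower sampling rate value (more frequent sampling) trades switch and collector load for accuracy.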

Advantages of sFlow
Network monitoring on PowerSwitches running Enterprise SONiC Distribution by Dell Technologies NOS is possible using sFlow. The
benefits of using sFlow include:
• Standards-based
• Allows scaling
• Monitoring accuracy

sFlow operations
The following operations can be performed using sFlow:
• Enable sFlow
• Disable sFlow
• Configure polling interval
• Disable polling interval
• Add a collector
• Add a collector with a port number
• Delete a collector
• Add agent-id information
• Disable sFlow agent
• Enable sFlow on the interface
• Disable sFlow on the interface
• Configure sampling rate on the interface
• Disable sampling rate on the interface

sFlow use case and example


sFlow provides a convenient way of monitoring traffic using flow-based sampling technology. Some of the use cases for sFlow include
network security monitoring in large enterprise data centers, monitoring traffic for different tenants in a logical network, monitoring traffic
on interfaces of interest, and Quality of Service (QoS).
Consider the use case example of monitoring traffic on switch S5248-Leaf2 on the ports that are connected to the Z9100-Spine1 switch,
and the Z9100-Spine2 switch, as shown in Figure 18. The Ethernet 0 interface on the S5248-Leaf2 switch connects to the Z9100-Spine1
switch, and Ethernet 4 interface connects to the Z9100-Spine2 switch.

Figure 18. Leaf-spine topology use case for sFlow

sFlow allows the sampled packets to be analyzed using the sFlow collector. Other configurable parameters include the polling interval and
the sampling rate. All these details can be configured using the CLI to enable and monitor the sFlow packet samples. Configure sFlow
globally or on individual interfaces on a SONiC switch.
In this example, the collector IP address is 100.67.204.144/24. The broad steps include enabling sFlow for the interface, specifying the
collector IP address (and, optionally, a port number), defining the sFlow agent-id, and optionally modifying the sampling
rate and polling interval from the default values.
The polling interval is the interval, in seconds, at which traffic samples or counters are collected. The range is from 5 to 300
seconds, and the default is 20. Enter 0 to disable sFlow traffic polling.
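The polling interval rules stated above (range 5 to 300, default 20, 0 to disable) can be captured in a small check. This is a hypothetical helper for a provisioning script, not part of the SONiC CLI.

```python
DEFAULT_POLLING_INTERVAL = 20  # seconds, default stated in this guide


def polling_state(seconds: int = DEFAULT_POLLING_INTERVAL) -> str:
    """Classify a polling interval value according to the documented range."""
    if seconds == 0:
        return "disabled"
    if 5 <= seconds <= 300:
        return "enabled"
    raise ValueError("polling interval must be 0 or between 5 and 300 seconds")


print(polling_state(44))
```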

Enable sFlow for the interfaces


configure terminal
interface Ethernet 0
sflow enable
exit
interface Ethernet 4
sflow enable
exit

Add the sFlow collector


sflow collector collector_1 100.67.204.144

Define polling interval


sflow polling-interval 44

Verify sFlow
Validate and verify the configurations for sFlow using the following validation examples:

sonic# show sflow


---------------------------------------------------------
Global sFlow Information
---------------------------------------------------------
admin state: up
polling-interval: 44
agent-id: default
configured collectors: 1
collector_1 100.67.204.144 6343

sonic# show sflow interface


-----------------------------------------------------------
sFlow interface configurations
Interface      Admin State      Sampling Rate
Ethernet0      up               10000
Ethernet4      up               10000

Disable sFlow
To disable the sFlow configuration later, run the following commands in the CLI:

configure terminal
interface Ethernet 0
no sflow enable
exit
interface Ethernet 4
no sflow enable
exit

Introduction to packet-level network telemetry


Datacenter networks (DCNs) are crucial for large-scale online services that operate at high utilization levels. Even brief downtime can
lead to significant revenue loss. Because of this risk, it is essential to introduce an active model of DCN management, where the
infrastructure observes, analyzes, and corrects faults in near-real-time. Because the faults vary widely, such failures are difficult to debug.
Some packets may experience an exceptionally high delay between two servers, but it may be challenging to track which link is
responsible. Even when the packet-drop counters at switches display no packet loss, packets that are destined for a specific set of
servers might be dropped. TCP connections to a virtual IP may encounter intermittent timeouts, and the traceroute probes that are sent to
debug the issue may be blocked by the load balancers. The load may not be balanced among a group of ECMP (Equal Cost MultiPath) links.
The packets need examination at a granular level. Faults such as a faulty interface or software issues can cause random failures that are
difficult to troubleshoot. Collection and analysis of traffic at the packet level is called packet-level network telemetry.

Benefits of packet-level network telemetry
Packet-level network telemetry helps diagnose various faults that conventional tools cannot. The challenges that make debugging
difficult, and the corresponding benefits, are described in the following table.

Table 1. Benefits of packet-level network telemetry


Challenge             Benefit
Silent packet drops   With many switches present in the DCN environment, it is challenging to identify the switch that
                      causes the packet loss. By constantly tracing specific packets across switches, the last-hop
                      switch can be identified.

Silent black hole     Packet-level network tracing allows you to detect and localize a silent black hole. A silent
                      black hole can occur when entries in the TCAM table are corrupted, which cannot be detected by
                      monitoring the forwarding table entries.

End-to-end latency    Hop-by-hop latencies between two endpoints are traced using packet-level network telemetry.

Load imbalance        Flows that are forwarded unevenly by a group of ECMP links can be detected by packet-level
                      network telemetry, by counting the number of flows with a specific 5-tuple pattern that are
                      mapped to each link. This offers a more reliable and direct method of detection and debugging.

Protocol bugs         Bugs in the implementation of network protocols such as BGP, PFC, and RDMA can cause
                      performance and reliability issues in the network. Troubleshooting these protocols is difficult
                      as the protocols are implemented by a third party.
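The load imbalance check described in the table amounts to counting distinct 5-tuple flows per ECMP link. The Python sketch below assumes mirrored trace records have already been reduced to (5-tuple, link) pairs; the record layout and function names are hypothetical, and the sample addresses and interface names are illustrative only.

```python
from collections import Counter


def flows_per_link(records):
    """Count distinct 5-tuple flows observed on each ECMP link."""
    seen = set()
    counts = Counter()
    for five_tuple, link in records:
        if (five_tuple, link) not in seen:
            seen.add((five_tuple, link))
            counts[link] += 1
    return counts


def imbalance_ratio(counts):
    """Ratio of the busiest link to the least busy link (1.0 means balanced)."""
    return max(counts.values()) / min(counts.values())


# (src_ip, dst_ip, protocol, src_port, dst_port), uplink
records = [
    (("172.16.11.10", "172.16.12.10", 6, 40001, 80), "Ethernet0"),
    (("172.16.11.10", "172.16.12.10", 6, 40002, 80), "Ethernet0"),
    (("172.16.11.10", "172.16.12.10", 6, 40003, 80), "Ethernet0"),
    (("172.16.11.10", "172.16.12.10", 6, 40004, 80), "Ethernet4"),
    (("172.16.11.10", "172.16.12.10", 6, 40001, 80), "Ethernet0"),  # same flow again
]
counts = flows_per_link(records)
print(dict(counts), imbalance_ratio(counts))
```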

Everflow
Everflow is a network telemetry system that provides scalable and flexible access to packet-level information in large data centers.
Everflow uses “match and mirror” functionality: commodity switches can match packets against flexible patterns over packet headers
or payloads and, as the action, mirror the matching packets to analysis servers.
Everflow leverages the capability of commodity DCN switches to match based on predefined rules and then execute specific actions, such as
mirror and encapsulate, to reduce the tracing overhead. The match rules designed to handle DCN faults are as follows:
• In the DCN environment, flow size distribution is highly skewed, and even small flows are often associated with customer-facing
interactive services with strict performance requirements. Everflow traces every flow in the DCN by introducing new rules that match
based on the TCP SYN, FIN, and RST flags in the packet.
• To permit flexible tracing, the packets can be marked with an additional “debug” bit in the header. A new rule can be installed that
traces any packet with the “debug” bit set.
• Like regular data traffic, protocol traffic is also traced as it is critical for the health and performance of the DCN.
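The three match rules above can be expressed as a single predicate. The Python sketch below uses the standard TCP flag bit values; the packet dictionary layout, the debug_bit field, and the is_protocol_traffic field are hypothetical stand-ins for whatever header fields a real switch rule would match.

```python
# Standard TCP flag bit positions in the flags byte.
TCP_FIN = 0x01
TCP_SYN = 0x02
TCP_RST = 0x04
IPPROTO_TCP = 6


def should_mirror(pkt: dict) -> bool:
    """Return True if the packet matches one of the three tracing rules."""
    # Rule 1: TCP packets that open or close a connection, so every flow is seen.
    if pkt.get("ip_protocol") == IPPROTO_TCP:
        if pkt.get("tcp_flags", 0) & (TCP_SYN | TCP_FIN | TCP_RST):
            return True
    # Rule 2: packets explicitly marked for tracing.
    if pkt.get("debug_bit", False):
        return True
    # Rule 3: control-plane protocol traffic.
    return bool(pkt.get("is_protocol_traffic", False))


print(should_mirror({"ip_protocol": 6, "tcp_flags": TCP_SYN}))
```

Matching only connection setup and teardown packets keeps the mirrored volume small while still covering every TCP flow, including short ones.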

Everflow components and operation


Everflow consists of four key components:
• Controller
• Analyzer
• Storage
• Reshuffler
Applications use the packet-level information that is provided by Everflow to debug network faults. Everflow configures the rules on the
switches. Packets that match these rules are mirrored to the reshufflers and directed to the analyzers, which output the analysis results
into storage. Everflow encapsulates the mirrored Ethernet packets in a GRE tunnel.
The analyzers are a distributed set of servers, each of which processes a portion of the tracing traffic. When a packet trace is complete,
the analyzer checks it for loop and drop problems. A loop is indicated when the same device appears multiple times in the trace. A drop is
detected when the last hop of the trace is different from the expected last hop, which is computed from the topology.
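The loop and drop checks can be sketched as follows, assuming a completed trace is an ordered list of device names and that the expected last hops have already been computed from the topology. Both function names and the device names are hypothetical; this is an illustration, not the Everflow analyzer implementation.

```python
def has_loop(trace):
    """A loop is indicated when the same device appears more than once."""
    return len(set(trace)) != len(trace)


def is_dropped(trace, expected_last_hops):
    """A drop is indicated when the trace ends somewhere other than expected."""
    return not trace or trace[-1] not in expected_last_hops


expected = {"Leaf2"}
print(has_loop(["Leaf1", "Spine1", "Leaf2"]), is_dropped(["Leaf1", "Spine1", "Leaf2"], expected))
print(has_loop(["Leaf1", "Spine1", "Leaf1", "Spine1"]))
print(is_dropped(["Leaf1", "Spine1"], expected))
```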
Everflow applications interact with the controller using several APIs to debug various network faults. With these APIs, the applications can
query the packet traces, install load counters, and trace traffic by marking the debug bit.

Figure 19. Everflow architecture

Everflow use case and example


Everflow can be enabled on selected production switches on demand to debug live incidents. The topology that is used for the configuration
example is shown in Figure 20.
In the topology example, the traffic flows from Server1 (172.16.11.10) to Server2 (172.16.12.10), passing through switch Leaf1. The gateway
IP address for Server1 configured on Leaf1 is 172.16.11.253. The mirror session is applied on Leaf1 with the destination IP address of
12.0.0.2 and the GRE tunnel type of 0x88be to the collector. The interface Ethernet 8 on Leaf1 connects to Server1, while Ethernet 16
connects to Spine1. A route is added on the collector Linux VM for all the mirror traffic. The collected data is then used for filtering and
visualization.

Figure 20. Topology for Everflow configuration example

Everflow configuration can be carried out using the config_db.json file on the Enterprise SONiC Distribution by Dell Technologies
NOS. The high-level steps for configuring Leaf1 are as follows:

Configure mirror session


config mirror_session add Mirror_Ses 172.16.11.253 12.0.0.2 50 100 0x88be 0

Define the ACL table for MIRROR_ACL that lists the ports in a temporary JSON file
printf '{"ACL_TABLE": {"MIRROR_ACL": {"stage": "INGRESS", "type": "mirror", "policy_desc":
"Mirror_ACLV4_CREATION", "ports": ["Ethernet8", "Ethernet16"]}}}\n' > /tmp/apply_json2.json

Load ACL table to the DB


config load -y /tmp/apply_json2.json
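As an alternative to hand-writing the JSON inside a printf string, the same ACL table fragment could be generated with a short Python script, which avoids quoting mistakes. This is a sketch only: the render_fragment helper is hypothetical, and the script writes to a temporary directory rather than the /tmp/apply_json2.json path used in this guide.

```python
import json
import os
import tempfile

# Same ACL table definition as in the printf command in this guide.
acl_table = {
    "ACL_TABLE": {
        "MIRROR_ACL": {
            "stage": "INGRESS",
            "type": "mirror",
            "policy_desc": "Mirror_ACLV4_CREATION",
            "ports": ["Ethernet8", "Ethernet16"],
        }
    }
}


def render_fragment(fragment: dict) -> str:
    """Serialize a configuration fragment as JSON text ending in a newline."""
    return json.dumps(fragment, indent=4) + "\n"


# Write the fragment to a file that could then be passed to config load.
path = os.path.join(tempfile.mkdtemp(), "apply_json2.json")
with open(path, "w") as handle:
    handle.write(render_fragment(acl_table))
print(path)
```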

Define the ACL rule that lists the source and destination for L3 packets to be captured
printf '{"ACL_RULE": {"MIRROR_ACL|Mirror_Rule": {"PRIORITY": "999", "IP_PROTOCOL": "61",
"MIRROR_ACTION": "Mirror_Ses", "SRC_IP": "172.16.11.10/24", "DST_IP": "172.16.12.10/24"}}}\n'
> /tmp/apply_json2.json

Load ACL rule to the DB


config load -y /tmp/apply_json2.json

Save config to config DB


After the configuration is saved (for example, with config save -y), the relevant sections of config_db.json display as follows:

{
    "ACL_RULE": {
        "MIRROR_ACL|Mirror_Rule": {
            "DST_IP": "172.16.12.10/24",
            "IP_PROTOCOL": "61",
            "MIRROR_ACTION": "Mirror_Ses",
            "PRIORITY": "999",
            "SRC_IP": "172.16.11.10/24"
        }
    },
    "ACL_TABLE": {
        "MIRROR_ACL": {
            "policy_desc": "Mirror_ACLV4_CREATION",
            "ports": [
                "Ethernet8",
                "Ethernet16"
            ],
            "stage": "INGRESS",
            "type": "mirror"
        }
    },
    "MIRROR_SESSION": {
        "Mirror_Ses": {
            "dscp": "50",
            "dst_ip": "12.0.0.2",
            "gre_type": "0x88be",
            "queue": "0",
            "src_ip": "172.16.11.10/24",
            "ttl": "100"
        }
    }
}
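The three sections shown above refer to each other by name: each ACL_RULE key starts with the name of an ACL table, and MIRROR_ACTION names a mirror session. A quick consistency check along these lines can catch a misspelled name before the file is loaded. The check_mirror_config function is a hypothetical helper, and the embedded configuration is a trimmed copy of the example.

```python
def check_mirror_config(cfg: dict) -> list:
    """Return a list of cross-reference problems found in the configuration."""
    problems = []
    tables = cfg.get("ACL_TABLE", {})
    sessions = cfg.get("MIRROR_SESSION", {})
    for key, rule in cfg.get("ACL_RULE", {}).items():
        # Rule keys have the form "<table name>|<rule name>".
        table, _, _rule_name = key.partition("|")
        if table not in tables:
            problems.append(f"{key}: unknown ACL table {table}")
        session = rule.get("MIRROR_ACTION")
        if session is not None and session not in sessions:
            problems.append(f"{key}: unknown mirror session {session}")
    return problems


config = {
    "ACL_RULE": {"MIRROR_ACL|Mirror_Rule": {"MIRROR_ACTION": "Mirror_Ses", "PRIORITY": "999"}},
    "ACL_TABLE": {"MIRROR_ACL": {"stage": "INGRESS", "type": "mirror"}},
    "MIRROR_SESSION": {"Mirror_Ses": {"dst_ip": "12.0.0.2", "gre_type": "0x88be"}},
}
print(check_mirror_config(config))
```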

Summary
This chapter introduced flow-based services. It detailed the advantages of sFlow, a standards-based sampling technology, along with
its operations, use case, and example configuration. It also discussed Everflow, a network telemetry system for monitoring
packet-level information, including its components and operation, use case, and sample configuration.

7
References
Topics:
• SONiC documentation
• Ansible documentation
• Dell Technologies Networking Infrastructure Solutions documentation

SONiC documentation
The following pages provide additional information about SONiC.
SONiC GitHub
Management Documentation
Azure GitHub
SONiC Wiki page
SONiC quick start guide
SONiC Configuration Database Manual
SONiC Command Line Interface Guide
Extended SONiC Telemetry Architecture
SONiC EverFlow High Level Design

Ansible documentation
For additional information about Ansible, see:
Ansible Site

Dell Technologies Networking Infrastructure Solutions documentation
The following documentation provides additional networking solutions information.
NOTE: Access to the documentation may require user credentials. If you do not have access to a document, contact
your Dell Technologies representative.
Networking solutions: https://infohub.delltechnologies.com/t/networking-solutions-57/
