rfc3549-Linux Netlink as an IP Services Protocol
                                                                   Salim
Request for Comments: 3549                                 Znyx Networks
Category: Informational                                      H. Khosravi
                                                                   Intel
                                                                A. Kleen
                                                                    Suse
                                                            A. Kuznetsov
                                                              INR/Swsoft
                                                               July 2003
Copyright Notice
Abstract
Table of Contents
1. Introduction
   1.1. Definitions
      1.1.1. Control Plane Components (CPCs)
      1.1.2. Forwarding Engine Components (FECs)
      1.1.3. IP Services
2. Netlink Architecture
   2.1. Netlink Logical Model
   2.2. Message Format
   2.3. Protocol Model
      2.3.1. Service Addressing
      2.3.2. Netlink Message Header
      2.3.3. FE System Services’ Templates
3. Currently Defined Netlink IP Services
   3.1. IP Service NETLINK_ROUTE
      3.1.1. Network Route Service Module
      3.1.2. Neighbor Setup Service Module
      3.1.3. Traffic Control Service
   3.2. IP Service NETLINK_FIREWALL
   3.3. IP Service NETLINK_ARPD
4. References
   4.1. Normative References
   4.2. Informative References
5. Security Considerations
6. Acknowledgements
Appendix 1: Sample Service Hierarchy
Appendix 2: Sample Protocol for the Foo IP Service
Appendix 2a: Interacting with Other IP services
Appendix 3: Examples
Authors’ Addresses
Full Copyright Statement
1. Introduction
We first give some concept definitions and then describe how Netlink
fits in.
1.1. Definitions
                        ____      +---------------+
                   +->-| FW |---> | TCP, UDP, ... |
                   |   +----+     +---------------+
                   |                   |
                   ^                   v
                   |                  _|_
                   +----<----+      | FW |
                             |      +----+
                             ^         |
                             |         Y
                          To host   From host
                           stack     stack
                             ^         |
                             |_____    |
Ingress                            ^   Y
device   ____    +-------+        +|---|--+   ____   +--------+ Egress
->----->| FW |-->|Ingress|-->---->| Forw- |->| FW |->| Egress | device
        +----+   |  TC   |        |  ard  |  +----+  |   TC   |-->
                 +-------+        +-------+          +--------+
The figure above shows the Linux FE model per device. The only
mandatory part of the datapath is the Forwarding module, which is RFC
1812 conformant. The different Firewall (FW), Ingress Traffic
Control, and Egress Traffic Control building blocks are not mandatory
in the datapath and may even be used to bypass the RFC 1812 module.
These modules are shown as simple blocks in the datapath but, in
fact, could be multiple cascaded, independent submodules within the
indicated blocks. More information can be found at [10] and [11].
Packets that are not for the NE may further traverse a policy routing
submodule (within the forwarding module), if so provisioned. Another
firewall module is walked next. The firewall module can drop or
munge/transform packets, depending on the configured sub-modules
encountered and their policies. If all goes well, the Egress TC
module is accessed next.
1.1.3. IP Services
The time span of the service is from the moment when the packet
arrives at the NE to the moment that it departs. In essence, an IP
service in this context is a Per-Hop Behavior. CP components running
on NEs define the end-to-end path control for a service by running
control/signaling protocol/management-applications. These
distributed CPCs unify the end-to-end view of the IP service. As
noted above, these CP components then define the behavior of the FE
(and therefore the NE) for a described packet.
2. Netlink Architecture
The interaction between the FEC and the CPC, in the Netlink context,
defines a protocol. Netlink provides mechanisms for the CPC
(residing in user space) and the FEC (residing in kernel space) to
have their own protocol definition -- kernel space and user space
just mean different protection domains. Therefore, a wire protocol
is needed to communicate. The wire protocol is normally provided by
some privileged service that is able to copy between multiple
protection domains. We will refer to this service as the Netlink
service. The Netlink service can also be encapsulated in a different
transport layer, if the CPC executes on a different node than the
FEC. The FEC and CPC, using Netlink mechanisms, may choose to define
a reliable protocol between each other. By default, however, Netlink
provides unreliable communication.
Note that the FEC and CPC can both live in the same memory protection
domain and use the connect() system call to create a path to the peer
and talk to each other. We will not discuss this mechanism further
other than to say that it is available. Throughout this document, we
use FEC interchangeably with kernel space and CPC interchangeably
with user space. This naming is not meant, however, to restrict the
two components to these protection domains or to the same compute
node.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
|                    Netlink message header                     |
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
|                      IP Service Template                      |
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
|               IP Service specific data in TLVs                |
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The Netlink message is used to communicate between the FEC and CPC
for parameterization of the FECs, asynchronous event notification of
FEC events to the CPCs, and statistics querying/gathering (typically
by a CPC).
The Netlink message header is generic for all services, whereas the
IP Service Template header is specific to a service. Each IP Service
then carries parameterization data (CPC->FEC direction) or response
(FEC->CPC direction). These parameterizations are in TLV (Type-
Length-Value) format and are unique to the service.
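
The TLV encoding can be illustrated with a minimal sketch. It assumes the attribute layout of Linux's struct rtattr (a 16-bit length that covers the 4-byte attribute header, then a 16-bit type, with each attribute padded to a 4-byte boundary). The attribute types 1 and 2 and the helper names are hypothetical and chosen only for illustration; a given service defines its own.

```python
import struct

RTA_ALIGNTO = 4  # assumed alignment, as used by Linux attributes


def rta_align(n):
    """Round n up to the next multiple of RTA_ALIGNTO."""
    return (n + RTA_ALIGNTO - 1) & ~(RTA_ALIGNTO - 1)


def pack_attr(attr_type, value):
    """Encode one attribute: 16-bit length, 16-bit type, value, padding."""
    length = 4 + len(value)  # the length covers the attribute header
    pad = rta_align(length) - length
    return struct.pack("=HH", length, attr_type) + value + b"\0" * pad


def unpack_attrs(buf):
    """Walk a buffer of attributes and return (type, value) pairs."""
    attrs = []
    offset = 0
    while offset + 4 <= len(buf):
        length, attr_type = struct.unpack_from("=HH", buf, offset)
        if length < 4 or offset + length > len(buf):
            break  # malformed attribute, stop parsing
        attrs.append((attr_type, buf[offset + 4:offset + length]))
        offset += rta_align(length)
    return attrs


# Two hypothetical attributes: a NUL-terminated string and a 32-bit value.
blob = pack_attr(1, b"pfifo\0") + pack_attr(2, struct.pack("=I", 100))
decoded = unpack_attrs(blob)
```

Padding is not counted in the length field, so a parser has to re-align the offset after each attribute.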
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                            Length                             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|             Type              |             Flags             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Sequence Number                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       Process ID (PID)                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Length: 32 bits
   The length of the message in bytes, including the header.
Type: 16 bits
   This field describes the message content.
   It can be one of the standard message types:
      NLMSG_NOOP      Message is ignored.
      NLMSG_ERROR     The message signals an error and the payload
                      contains a nlmsgerr structure.  This can be
                      looked at as a NACK and typically it is from
                      FEC to CPC.
      NLMSG_DONE      Message terminates a multipart message.
Flags: 16 bits
   The standard flag bits used in Netlink are
      NLM_F_REQUEST   Must be set on all request messages (typically
                      from user space to kernel space)
      NLM_F_MULTI     Indicates the message is part of a multipart
                      message terminated by NLMSG_DONE
      NLM_F_ACK       Request for an acknowledgment on success.
                      Typical direction of request is from user
                      space (CPC) to kernel space (FEC).
      NLM_F_ECHO      Echo this request.  Typical direction of
                      request is from user space (CPC) to kernel
                      space (FEC).
One could create a heartbeat protocol between the FEC and CPC by
using the ECHO flags and the NLMSG_NOOP message.
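
As a rough sketch of the header layout described above, the following packs and unpacks the 16-byte header in host byte order and builds the heartbeat request mentioned in the text. The numeric values of the message types and flags are taken from Linux's <linux/netlink.h> and are not stated in this document; the helper names are hypothetical.

```python
import struct
from collections import namedtuple

# Length (32), Type (16), Flags (16), Sequence Number (32), PID (32)
NLMSG_HDR = struct.Struct("=IHHII")
NetlinkHeader = namedtuple("NetlinkHeader", "length type flags seq pid")

# Values assumed from <linux/netlink.h>.
NLMSG_NOOP, NLMSG_ERROR, NLMSG_DONE = 1, 2, 3
NLM_F_REQUEST, NLM_F_MULTI, NLM_F_ACK, NLM_F_ECHO = 0x01, 0x02, 0x04, 0x08


def pack_message(msg_type, flags, seq, pid, payload=b""):
    """Prefix a payload with a header whose Length includes the header."""
    length = NLMSG_HDR.size + len(payload)
    return NLMSG_HDR.pack(length, msg_type, flags, seq, pid) + payload


def unpack_header(buf):
    """Decode the header found at the start of buf."""
    return NetlinkHeader(*NLMSG_HDR.unpack_from(buf))


# A heartbeat request: a no-op message that asks to be echoed.
heartbeat = pack_message(NLMSG_NOOP, NLM_F_REQUEST | NLM_F_ECHO, seq=1, pid=0)
header = unpack_header(heartbeat)
```

The sequence number lets the sender match an echoed or acknowledged message to the request that caused it.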
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Netlink message header                     |
|                      type = NLMSG_ERROR                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                          Error code                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                  OLD Netlink message header                   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
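
A receiver could decode the error message laid out above roughly as follows. This is a sketch that assumes the error code is a signed 32-bit integer (Linux uses a negated errno value) and that the header of the offending request follows it directly; NLMSG_ERROR = 2 is taken from <linux/netlink.h>, and the sample values are made up.

```python
import struct

NLMSG_HDR = struct.Struct("=IHHII")  # length, type, flags, seq, pid
NLMSG_ERROR = 2                      # value assumed from <linux/netlink.h>


def parse_error(buf):
    """Return (error_code, old_header_tuple) from an NLMSG_ERROR message."""
    length, msg_type, flags, seq, pid = NLMSG_HDR.unpack_from(buf, 0)
    if msg_type != NLMSG_ERROR:
        raise ValueError("not an NLMSG_ERROR message")
    (code,) = struct.unpack_from("=i", buf, NLMSG_HDR.size)
    old_header = NLMSG_HDR.unpack_from(buf, NLMSG_HDR.size + 4)
    return code, old_header


# Hypothetical sample: request with sequence number 7 was rejected with -17.
old = NLMSG_HDR.pack(16, 24, 1, 7, 1234)
message = (NLMSG_HDR.pack(36, NLMSG_ERROR, 0, 7, 1234)
           + struct.pack("=i", -17) + old)
code, old_header = parse_error(message)
```

Because the old header carries the original sequence number, the sender can tell which request the error refers to.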
These are services that are offered by the system for general use by
other services. They include the ability to configure, gather
statistics and listen to changes in shared resources. IP address
management, link events, etc. fit here. We create this section for
these services for logical separation, despite the fact that they are
accessed via the NETLINK_ROUTE FEC. The reason that they exist
within NETLINK_ROUTE is due to historical cruft: the BSD 4.4 Route
Sockets implemented them as part of the IPv4 forwarding sockets.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    Family     |   Reserved    |          Device Type          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Interface Index                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         Device Flags                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                          Change Mask                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Family: 8 bits
   This is always set to AF_UNSPEC.
Applicable attributes:
   Attribute            Description
   ..........................................................
   IFLA_UNSPEC          Unspecified.
   IFLA_ADDRESS         Hardware address: interface L2 address.
   IFLA_BROADCAST       Hardware address: L2 broadcast address.
   IFLA_IFNAME          ASCII string device name.
   IFLA_MTU             MTU of the device.
   IFLA_LINK            ifindex of link to which this device
                        is bound.
   IFLA_QDISC           ASCII string defining egress root
                        queuing discipline.
   IFLA_STATS           Interface statistics.
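
The fixed part of this template can be sketched as a 16-byte structure. The sketch assumes host byte order and a signed interface index, as in Linux's struct ifinfomsg; AF_UNSPEC = 0 and IFF_UP = 0x1 are standard values, while ARPHRD_ETHER = 1 (Ethernet device type) and all sample values are assumptions used only for illustration.

```python
import struct
from collections import namedtuple

# Family (8), Reserved (8), Device Type (16), Interface Index (32),
# Device Flags (32), Change Mask (32)
IFINFOMSG = struct.Struct("=BBHiII")
IfInfo = namedtuple("IfInfo", "family reserved dev_type index flags change")

AF_UNSPEC = 0
ARPHRD_ETHER = 1  # assumed device type value for Ethernet
IFF_UP = 0x1      # interface is administratively up


def pack_ifinfo(dev_type, index, flags, change):
    """Build the template; the family is always AF_UNSPEC."""
    return IFINFOMSG.pack(AF_UNSPEC, 0, dev_type, index, flags, change)


def unpack_ifinfo(buf):
    """Decode the template found at the start of buf."""
    return IfInfo(*IFINFOMSG.unpack_from(buf))


# Hypothetical link with ifindex 2 that is up.
raw = pack_ifinfo(ARPHRD_ETHER, index=2, flags=IFF_UP, change=IFF_UP)
info = unpack_ifinfo(raw)
```

Attributes such as IFLA_IFNAME or IFLA_MTU would follow these 16 bytes as TLVs.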
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    Family     |    Length     |     Flags     |     Scope     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Interface Index                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Family: 8 bits
   Address Family: AF_INET for IPv4; and AF_INET6 for IPv6.
Length: 8 bits
   The length of the address mask.
Flags: 8 bits
   IFA_F_SECONDARY      For secondary address (alias interface).
Applicable attributes:
   Attribute            Description
   IFA_UNSPEC           Unspecified.
   IFA_ADDRESS          Raw protocol address of interface.
   IFA_LOCAL            Raw protocol local address.
   IFA_LABEL            ASCII string name of the interface.
   IFA_BROADCAST        Raw protocol broadcast address.
   IFA_ANYCAST          Raw protocol anycast address.
   IFA_CACHEINFO        Cache address information.
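
To show how the Length field (the mask length) combines with a raw address attribute, here is a small sketch. It assumes the 8-byte layout in the figure above and IFA_F_SECONDARY = 0x01 as in Linux; the address 192.0.2.1 and interface index 2 are made-up sample values, and describe_address is a hypothetical helper.

```python
import ipaddress
import socket
import struct

# Family (8), Length (8), Flags (8), Scope (8), Interface Index (32)
IFADDRMSG = struct.Struct("=BBBBI")
IFA_F_SECONDARY = 0x01  # value assumed from the Linux headers


def describe_address(template, address):
    """Combine the template with a raw address (e.g. from IFA_ADDRESS)."""
    family, prefix_len, flags, scope, index = IFADDRMSG.unpack_from(template)
    addr = ipaddress.ip_address(address)  # accepts 4 or 16 packed bytes
    iface = ipaddress.ip_interface(f"{addr}/{prefix_len}")
    return {
        "family": family,
        "interface": iface,
        "secondary": bool(flags & IFA_F_SECONDARY),
        "index": index,
    }


# Hypothetical secondary IPv4 address with a 24-bit mask on ifindex 2.
template = IFADDRMSG.pack(socket.AF_INET, 24, IFA_F_SECONDARY, 0, 2)
result = describe_address(template, bytes([192, 0, 2, 1]))
```

The same helper would accept a 16-byte IPv6 address, since the mask length is interpreted relative to the address family.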
Although there are many other IP services defined that are using
Netlink, as mentioned earlier, we will talk only about a handful of
those integrated into kernel version 2.4.6. These are:
This service allows CPCs to modify the IPv4 routing table in the
Forwarding Engine. It can also be used by CPCs to receive routing
updates, as well as to collect statistics.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    Family     |  Src length   |  Dest length  |      TOS      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   Table ID    |   Protocol    |     Scope     |     Type      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                             Flags                             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Family: 8 bits
   Address Family: AF_INET for IPv4; and AF_INET6 for IPv6.
TOS: 8 bits
   The 8-bit TOS (should be deprecated to make room for DSCP).
Table ID: 8 bits
   Table identifier.  Up to 255 route tables are supported.
      RT_TABLE_UNSPEC     An unspecified routing table.
      RT_TABLE_DEFAULT    The default table.
      RT_TABLE_MAIN       The main table.
      RT_TABLE_LOCAL      The local table.
Protocol: 8 bits
   Identifies what/who added the route.
      Protocol            Route origin.
      ..............................................
      RTPROT_UNSPEC       Unknown.
      RTPROT_REDIRECT     By an ICMP redirect.
      RTPROT_KERNEL       By the kernel.
      RTPROT_BOOT         During bootup.
      RTPROT_STATIC       By the administrator.
Scope: 8 bits
   Route scope (valid distance to destination).
      RT_SCOPE_UNIVERSE   Global route.
      RT_SCOPE_SITE       Interior route in the
                          local autonomous system.
      RT_SCOPE_LINK       Route on this link.
      RT_SCOPE_HOST       Route on the local host.
      RT_SCOPE_NOWHERE    Destination does not exist.
Type: 8 bits
   The type of route.
Flags: 32 bits
   Further qualify the route.
      RTM_F_NOTIFY        If the route changes, notify the
                          user.
      RTM_F_CLONED        Route is cloned from another route.
      RTM_F_EQUALIZE      Allow randomization of next hop
                          path in multi-path routing
                          (currently not implemented).
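
The route template can be decoded into the symbolic names listed above with a sketch like the following. The field order follows the figure in this section; an implementation talking to a real kernel should take the layout from the kernel's struct rtmsg definition instead. The numeric values are assumed from <linux/rtnetlink.h>, and the sample route (including type value 1) is hypothetical.

```python
import struct

# Eight 8-bit fields followed by the 32-bit Flags word (12 bytes).
RTMSG = struct.Struct("=BBBBBBBBI")

# Numeric values assumed from <linux/rtnetlink.h>.
TABLES = {0: "RT_TABLE_UNSPEC", 253: "RT_TABLE_DEFAULT",
          254: "RT_TABLE_MAIN", 255: "RT_TABLE_LOCAL"}
PROTOCOLS = {0: "RTPROT_UNSPEC", 1: "RTPROT_REDIRECT", 2: "RTPROT_KERNEL",
             3: "RTPROT_BOOT", 4: "RTPROT_STATIC"}
SCOPES = {0: "RT_SCOPE_UNIVERSE", 200: "RT_SCOPE_SITE", 253: "RT_SCOPE_LINK",
          254: "RT_SCOPE_HOST", 255: "RT_SCOPE_NOWHERE"}
FLAGS = {0x100: "RTM_F_NOTIFY", 0x200: "RTM_F_CLONED",
         0x400: "RTM_F_EQUALIZE"}


def decode_rtmsg(buf):
    """Decode the template using the field order shown in the figure."""
    (family, src_len, dst_len, tos,
     table, protocol, scope, rtype, flags) = RTMSG.unpack_from(buf)
    return {
        "family": family,
        "src_len": src_len,
        "dst_len": dst_len,
        "tos": tos,
        "table": TABLES.get(table, table),
        "protocol": PROTOCOLS.get(protocol, protocol),
        "scope": SCOPES.get(scope, scope),
        "type": rtype,
        "flags": [name for bit, name in FLAGS.items() if flags & bit],
    }


# Hypothetical static IPv4 route to a /24 in the main table.
sample = RTMSG.pack(2, 0, 24, 0, 254, 4, 0, 1, 0x100)
route = decode_rtmsg(sample)
```

Unknown table, protocol, or scope values are passed through as integers, which keeps the decoder usable for values not listed here.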
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    Family     |   Reserved1   |           Reserved2           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Interface Index                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|             State             |     Flags     |     Type      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Family: 8 bits
   Address Family: AF_INET for IPv4; and AF_INET6 for IPv6.
State: 16 bits
   A bitmask of the following states:
      NUD_INCOMPLETE   Still attempting to resolve.
      NUD_REACHABLE    A confirmed working cache entry.
      NUD_STALE        An expired cache entry.
      NUD_DELAY        Neighbor no longer reachable.
                       Traffic sent, waiting for
                       confirmation.
      NUD_PROBE        A cache entry that is currently
                       being re-solicited.
      NUD_FAILED       An invalid cache entry.
      NUD_NOARP        A device which does not do neighbor
                       discovery (ARP).
      NUD_PERMANENT    A static entry.
Flags: 8 bits
      NTF_PROXY        A proxy ARP entry.
      NTF_ROUTER       An IPv6 router.
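
Since State is a bitmask, a decoder has to test each bit separately. The sketch below does that for the 12-byte template shown above; the bit values are assumed from <linux/neighbour.h> and the sample entry is hypothetical.

```python
import struct

# Family (8), Reserved1 (8), Reserved2 (16), Interface Index (32),
# State (16), Flags (8), Type (8)
NDMSG = struct.Struct("=BBHiHBB")

# Bit values assumed from <linux/neighbour.h>.
NUD_STATES = [
    (0x01, "NUD_INCOMPLETE"), (0x02, "NUD_REACHABLE"), (0x04, "NUD_STALE"),
    (0x08, "NUD_DELAY"), (0x10, "NUD_PROBE"), (0x20, "NUD_FAILED"),
    (0x40, "NUD_NOARP"), (0x80, "NUD_PERMANENT"),
]
NTF_FLAGS = [(0x08, "NTF_PROXY"), (0x80, "NTF_ROUTER")]


def names(value, table):
    """Return the names of all bits from table that are set in value."""
    return [name for bit, name in table if value & bit]


def decode_ndmsg(buf):
    """Decode the neighbor template found at the start of buf."""
    family, _r1, _r2, index, state, flags, ntype = NDMSG.unpack_from(buf)
    return {
        "family": family,
        "index": index,
        "state": names(state, NUD_STATES),
        "flags": names(flags, NTF_FLAGS),
        "type": ntype,
    }


# Hypothetical stale entry for an IPv6 router on ifindex 3
# (10 is the Linux value of AF_INET6).
sample = NDMSG.pack(10, 0, 0, 3, 0x04, 0x80, 0)
neighbor = decode_ndmsg(sample)
```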
queuing discipline and has a queue associated with it. The queue may
be subject to a simple algorithm, like FIFO, or a more complex one,
like RED or a token bucket. The outermost queuing discipline, which
is referred to as the parent, is typically associated with a
scheduler. Within this scheduler hierarchy, however, may be other
scheduling algorithms, making the Linux Egress TC very flexible.
The service message template that makes this possible is shown below.
This template is used in both the ingress and the egress queuing
disciplines (refer to the egress traffic control model in the FE
model section). Each of the specific components of the model has
unique attributes that describe it best. The common attributes are
described below.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    Family     |   Reserved1   |           Reserved2           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Interface Index                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         Qdisc handle                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         Parent Qdisc                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           TCM Info                            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Family: 8 bits
   Address Family: AF_INET for IPv4; and AF_INET6 for IPv6.
except when used in the context of filters. In that case, this 32-
bit field is split into a 16-bit priority field and 16-bit protocol
field. The protocol values are defined in the kernel source file
<include/linux/if_ether.h>; the most commonly used one is
ETH_P_IP (the IP protocol).
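
The split of the 32-bit field for filters can be sketched as follows. The sketch assumes that the priority occupies the upper 16 bits and the protocol the lower 16 bits, with the protocol stored in network byte order; this placement is an assumption about common Linux practice and is not stated in this document. ETH_P_IP = 0x0800 comes from <include/linux/if_ether.h>, and the helper names are hypothetical.

```python
import socket

ETH_P_IP = 0x0800  # the IP protocol, from <include/linux/if_ether.h>


def make_filter_info(priority, protocol):
    """Combine a 16-bit priority and a 16-bit protocol into one field.

    Assumes priority in the upper half and the protocol, converted to
    network byte order, in the lower half.
    """
    return ((priority & 0xFFFF) << 16) | socket.htons(protocol)


def split_filter_info(info):
    """Recover (priority, protocol) from the combined 32-bit field."""
    return info >> 16, socket.ntohs(info & 0xFFFF)


# Hypothetical filter at priority 10 that matches IP packets.
info = make_filter_info(10, ETH_P_IP)
priority, protocol = split_filter_info(info)
```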
Two types of messages can be sent from the CPC to the FEC: Mode
messages and Verdict messages. Mode messages are sent
immediately to the FEC to describe what the CPC would like to
receive. Verdict messages are sent to the FEC after a decision has
been made on the fate of a received packet. The formats are
described below.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     Mode      |   Reserved1   |           Reserved2           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                             Range                             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Mode: 8 bits
Control information on the packet to be sent to the CPC. The
different types are:
Range: 32 bits
If IPQ_COPY_PACKET, this defines the maximum length to copy.
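
A Mode message following the figure above could be built as in this sketch. It mirrors the figure literally (an 8-byte message in host byte order); the kernel's own structure may be laid out differently on some architectures. IPQ_COPY_PACKET = 2 is assumed from the Linux ip_queue header, and the 128-byte range is a made-up sample value.

```python
import struct

# Mode (8), Reserved1 (8), Reserved2 (16), Range (32)
IPQ_MODE = struct.Struct("=BBHI")

IPQ_COPY_PACKET = 2  # assumed value; copy packet data up to Range bytes


def pack_mode(mode, copy_range=0):
    """Build a Mode message with the reserved fields zeroed."""
    return IPQ_MODE.pack(mode, 0, 0, copy_range)


# Hypothetical request: copy at most 128 bytes of each queued packet.
mode_msg = pack_mode(IPQ_COPY_PACKET, 128)
```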
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           Packet ID                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                             Mark                              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                          timestamp_m                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                          timestamp_u                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                             hook                              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                          indev_name                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                          outdev_name                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          hw_protocol          |            hw_type            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  hw_addrlen   |                   Reserved                    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                            hw_addr                            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           data_len                            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         Payload . . .                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Mark: 32 bits
The internal metadata value set to identify the rule by which
the packet was picked.
timestamp_m: 32 bits
Packet arrival time (seconds)
timestamp_u: 32 bits
Packet arrival time (useconds in addition to the seconds in
timestamp_m)
hook: 32 bits
The firewall module from which the packet was picked.
hw_protocol: 16 bits
Hardware protocol, in network order.
hw_type: 16 bits
Hardware type.
hw_addrlen: 8 bits
Hardware address length.
hw_addr: 64 bits
Hardware address.
data_len: 32 bits
Length of packet data.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                             Value                             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           Packet ID                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                          Data Length                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         Payload . . .                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Value: 32 bits
Payload:
Size as defined by the Data Length field.
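
A Verdict message can be sketched in the same way, again mirroring the figure literally with three 32-bit fields followed by an optional payload. The verdict values NF_DROP = 0 and NF_ACCEPT = 1 are assumed from the Linux netfilter headers; the packet ID and the helper names are hypothetical.

```python
import struct

# Value (32), Packet ID (32), Data Length (32), then the payload
VERDICT_HDR = struct.Struct("=III")

NF_DROP, NF_ACCEPT = 0, 1  # assumed verdict values


def pack_verdict(value, packet_id, payload=b""):
    """Build a Verdict message; Data Length is derived from the payload."""
    return VERDICT_HDR.pack(value, packet_id, len(payload)) + payload


def unpack_verdict(buf):
    """Return (value, packet_id, payload), checking the payload size."""
    value, packet_id, data_len = VERDICT_HDR.unpack_from(buf)
    payload = buf[VERDICT_HDR.size:VERDICT_HDR.size + data_len]
    if len(payload) != data_len:
        raise ValueError("payload shorter than Data Length")
    return value, packet_id, payload


# Accept hypothetical packet 42 unchanged (no replacement payload).
verdict = pack_verdict(NF_ACCEPT, 42)
decoded_verdict = unpack_verdict(verdict)
```

A non-empty payload would carry a modified copy of the packet, in which case Data Length gives its size.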
This service is used by CPCs for managing the neighbor table in the
FE. The message format used between the FEC and CPC is described in
the section on the Neighbor Setup Service Module.
4. References
[3] Blake, S., Black, D., Carlson, M., Davies, E., Wang, Z. and W.
Weiss, "An Architecture for Differentiated Services", RFC 2475,
December 1998.
[4] Durham, D., Boyle, J., Cohen, R., Herzog, S., Rajan, R. and A.
Sastry, "The COPS (Common Open Policy Service) Protocol", RFC
2748, January 2000.
[5] Moy, J., "OSPF Version 2", STD 54, RFC 2328, April 1998.
[8] Bernet, Y., Blake, S., Grossman, D. and A. Smith, "An Informal
Management Model for DiffServ Routers", RFC 3290, May 2002.
[10] http://www.netfilter.org
[11] http://diffserv.sourceforge.net
5. Security Considerations
6. Acknowledgements
3) Jeremy Ethridge for taking the role of someone who did not
understand Netlink and reviewing the document to make sure that it
made sense.
CP
[--------------------------------------------------------.
| .-----. |
| | . -------. |
| | CLI | / \ |
| | | | CP protocol | |
| /->> -. | component | <-. |
| __ _/ | | For | | |
| | | IP service | ^ |
| Y | foo | | |
| | ___________/ ^ |
| Y 1,4,6,8,9 / ^ 2,5,10 | 3,7 |
--------------- Y------------/---|----------|-----------
| ^ | ^
**|***********|****|**********|**********
************* Netlink layer ************
**|***********|****|**********|**********
FE | | ^ ^
.-------- Y-----------Y----|--------- |----.
| | / |
| Y / |
| . --------^-------. / |
| |FE component/module|/ |
| | for IP Service | |
--->---|------>---| foo |----->-----|------>--
| ------------------- |
| |
| |
------------------------------------------
The control plane protocol for IP service foo does the following to
connect to its FE counterpart. The steps below are also numbered
above in the diagram.
Our example IP service foo is used again to demonstrate how one can
deploy a simple IP service control using Netlink.
Appendix 3: Examples
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                          Length (52)                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|      Type (RTM_NEWQDISC)      |      Flags (NLM_F_EXCL |      |
|                               | NLM_F_CREATE | NLM_F_REQUEST) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|              Sequence Number (arbitrary number)               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Process ID (0)                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Family(AF_INET)|   Reserved1   |           Reserved2           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      Interface Index (4)                      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                   Qdisc handle (0x1000001)                    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                   Parent Qdisc (0x1000000)                    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         TCM Info (0)                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|        Type (TCA_KIND)        |           Length(4)           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Value ("pfifo")                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|      Type (TCA_OPTIONS)       |           Length(4)           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       Value (limit=100)                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
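
The arithmetic behind the example above can be checked with a short sketch. It counts the message exactly as the figure draws it (a 16-byte Netlink header, a 20-byte traffic control template, and two TLVs of one header word plus one value word each) and packs the fixed part. RTM_NEWQDISC = 36, NLM_F_EXCL = 0x200, NLM_F_CREATE = 0x400, NLM_F_REQUEST = 0x001, and AF_INET = 2 are assumed from the Linux headers; the sequence number 1 is arbitrary.

```python
import struct

# Values assumed from the Linux headers.
NLM_F_REQUEST = 0x001
NLM_F_EXCL = 0x200
NLM_F_CREATE = 0x400
RTM_NEWQDISC = 36
AF_INET = 2

NLMSG_HDR = struct.Struct("=IHHII")  # length, type, flags, seq, pid
TCMSG = struct.Struct("=BBHiIII")    # family, reserved1, reserved2,
                                     # ifindex, handle, parent, info

# The figure draws each TLV as one header word plus one value word.
tlv_bytes = 2 * (4 + 4)
total_length = NLMSG_HDR.size + TCMSG.size + tlv_bytes

flags = NLM_F_EXCL | NLM_F_CREATE | NLM_F_REQUEST

# Fixed part of the message: Netlink header followed by the template.
fixed_part = (
    NLMSG_HDR.pack(total_length, RTM_NEWQDISC, flags, 1, 0)
    + TCMSG.pack(AF_INET, 0, 0, 4, 0x1000001, 0x1000000, 0)
)
```

With these assumptions the total comes to the 52 bytes shown in the Length field, and the fixed part accounts for the first 36 of them.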
Authors’ Addresses
EMail: hadi@znyx.com
Hormuzd M Khosravi
Intel
2111 N.E. 25th Avenue JF3-206
Hillsboro OR 97124-5961
USA
Andi Kleen
SuSE
Stahlgruberring 28
81829 Muenchen
Germany
EMail: ak@suse.de
Alexey Kuznetsov
INR/Swsoft
Moscow
Russia
EMail: kuznet@ms2.inr.ac.ru