
18csc310j Unit 5

This document discusses layer 3 data center technologies. It begins with an introduction to layer 3 networks, describing them as operating at the network layer and facilitating communication between networks through routing and forwarding. It then covers several key layer 3 data center technologies, including multipath route forwarding, bonding vs. equal-cost multipath routing (ECMP), and addressing some complexities of a pure layer 3 solution using Linux features like unnumbered interfaces and anycast IP addresses. The document aims to explain how standard ARP processing still works in a layer 3 environment.

Uploaded by

Dhruv Atri
Copyright
© All Rights Reserved

18CSC310J

Data Centric Networking and System Design

for V Sem B.Tech (CSE - Cloud Computing)

Department of NWC
CO - Course Learning Outcomes
• CO1 - Apply various data centric networking concepts.
• CO2 - Identify the different data centre architectures & core network
connectivity issues.
• CO3 - Design of server architectures at the Layer 2-3 level for Data centres.
• CO4 - Demonstrate various networking protocols in layer 2 networks.
• CO5 - Evaluate and choose the appropriate networking techniques used in
Layer 3 networks

18CSC310J-DCNSD NWC/SRMIST 2
18CSC310J
Data Centric Networking and
System Design

Unit V
Outline of the Presentation
• Introduction to Layer 3 Networks
• Layer 3 Data Center Technologies
• Locator Identifier Separation Protocol (LISP)
• Layer 3 Multicasting
• Protocol: IPv4
• Protocol: IPv6
• Protocols: MPLS, OSPF
• Protocols: IS-IS, BGP
• OTV & VPLS Layer 2 Extension

Introduction to Layer 3 Networks

Layer 3 Networks
• In today’s rapidly evolving digital landscape, data centers
play a critical role in ensuring the seamless functioning of
businesses and organizations.
• Among the various types of data centers, Layer 3 data
centers stand out as the backbone of modern networking.

Layer 3 Networks
• A Layer 3 data center, also known as a network layer data
center, is an advanced infrastructure that operates at the
network layer of the OSI (Open Systems Interconnection)
model.
• It acts as a gateway, facilitating communication between
various networks, both internally and externally.
• Layer 3 data centers are responsible for routing and
forwarding data packets across multiple networks,
ensuring efficient and secure transmission.
Layer 3 Networks
• A Layer 3 Data Center is a type of data center that utilizes
Layer 3 switching technology to provide network
connectivity and traffic control.
• Layer 3 Data Centers are typically used in large-scale
enterprise networks, providing reliable services and high
performance.

Layer 3 Networks
• Layer 3 Data Centers are distinguished from other data
centers by their use of Layer 3 switching.
• Layer 3 switching, also known as Layer 3 networking, is a
switching technology that operates at the third layer of the
Open Systems Interconnection (OSI) model, the network
layer.
• This switching type manages network routing, addressing,
and traffic control and supports various protocols.

Layer 3 Networks
• Layer 3 Data Centers are typically characterized by their
use of high-performance routers and switches.
• These routers and switches are designed to deliver robust
performance, scalability, and high levels of security.
• In addition, by using Layer 3 switching, these data centers
can provide reliable network services such as network
access control, virtual LANs, and Quality of Service (QoS)
management.
Layer 3 Networks
Layer 3 DC Key Features and Functionalities:
1. Network Routing: Layer 3 data centers excel in routing data packets
across networks, using advanced routing protocols such as OSPF
(Open Shortest Path First) and BGP (Border Gateway Protocol). This
enables efficient traffic management and optimal utilization of
network resources.
2. IP Addressing: Layer 3 data centers assign and manage IP
addresses, allowing devices within a network to communicate with
each other and external networks. IP addressing helps in identifying
and locating devices, ensuring reliable data transmission.
Layer 3 Networks
Layer 3 DC Key Features and Functionalities:
3. Interconnectivity: Layer 3 data centers provide seamless
connectivity between different networks, whether they are local area
networks (LANs), wide area networks (WANs), or the internet. This
enables organizations to establish secure and reliable connections
with their branches, partners, and customers.
4. Load Balancing: Layer 3 data centers distribute network traffic
across multiple servers or network devices, ensuring that no single
device becomes overwhelmed. This helps to maintain network
performance, improve scalability, and prevent bottlenecks.
Layer 3 Networks
Benefits of Layer 3 Data Centers:
1. Enhanced Performance: Layer 3 data centers optimize network
performance by efficiently routing traffic, reducing latency, and
ensuring faster data transmission. This results in improved application
delivery, enhanced user experience, and increased productivity.
2. Scalability: Layer 3 data centers are designed to support the
growth and expansion of networks. Their ability to route data across
multiple networks enables organizations to scale their operations
seamlessly, accommodate increasing traffic, and add new devices
without disrupting the network infrastructure.
Layer 3 Networks
Benefits of Layer 3 Data Centers:
3. High Security: Layer 3 data centers provide enhanced security
measures, including firewall protection, access control policies,
and encryption protocols. These measures safeguard sensitive
data, protect against cyber threats, and ensure compliance with
industry regulations.
4. Flexibility: Layer 3 data centers offer network architecture and
design flexibility. They allow organizations to implement different
network topologies based on their specific requirements, such as
hub-and-spoke, full mesh, or partial mesh.
Layer 3 Data Center Technologies

Layer 3 Data Center
Technologies
Multipath Route Forwarding
• Many networks implement VLANs to support random IP
address assignment and IP mobility.
• The switches perform layer-2 forwarding even though
they might be capable of layer-3 IP forwarding.
• For example, they forward packets based on MAC
addresses within a subnet, yet a layer-3 switch does not
need Layer 2 information to route IPv4 or IPv6 packets.
Layer 3 Data Center
Technologies
Multipath Route Forwarding
• Cumulus has gone one step further and made it possible
to configure every server-to-ToR interface as a Layer 3
interface.
• Their design permits multipath default route
forwarding, removing the need for ToR interconnects and
common broadcast domain sharing of uplinks.
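The multipath forwarding idea can be sketched in Python as follows. This is a minimal illustration, assuming the common ECMP approach of hashing a flow's 5-tuple to pick one of several equal-cost next hops; the function name `pick_next_hop`, the ToR addresses, and the use of SHA-256 are assumptions made for the example, not the Cumulus implementation.

```python
import hashlib


def pick_next_hop(src_ip, dst_ip, proto, src_port, dst_port, next_hops):
    """Choose one of several equal-cost next hops for a flow (illustrative only)."""
    if not next_hops:
        raise ValueError("need at least one next hop")
    # All packets of one flow share the same 5-tuple, so they hash to the
    # same next hop and stay in order on a single path.
    key = f"{src_ip}|{dst_ip}|{proto}|{src_port}|{dst_port}".encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(next_hops)
    return next_hops[index]


# Hypothetical gateway addresses of the two ToR switches.
TORS = ["10.0.0.1", "10.0.0.2"]
flow = ("192.168.1.101", "192.168.2.102", 6, 40000, 443)
print(pick_next_hop(*flow, TORS))
```

Different flows spread across both uplinks, while a single flow always uses the same one.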
Layer 3 Data Center
Technologies
Bonding Vs. ECMP
• A typical server environment consists of a single server with two
uplinks.
• For device and link redundancy, uplinks are bonded into a port
channel and terminated on different ToR switches, forming an
MLAG.
• As this is an MLAG design, the ToR switches need an inter-
switch link.
• Therefore, you cannot bond server NICs to two separate ToR
switches without creating an MLAG.
Layer 3 Data Center
Technologies
Pure layer-3 solution complexities
• Firstly, we cannot have one IP address with two MAC
addresses. To overcome this, we implement additional Linux
features.
• First, Linux has the capability for an unnumbered interface,
permitting the assignment of the same IP address to both
interfaces, one IP address for two physical NICs.
• Next, we assign a /32 Anycast IP address to the host via a
loopback address.
Layer 3 Data Center
Technologies
Pure layer-3 solution complexities
• Secondly, the end hosts must send to a next-hop, not a shared
subnet.
• Linux allows you to specify an attribute to the received default
route, called “on-link.”
• This attribute tells end-hosts, “I might not be on a directly
connected subnet to the next hop, but trust me, the next hop is
on the other side of this link.”
• It forces hosts to send ARP requests regardless of common
subnet assignment.
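The effect of the on-link attribute can be modelled in a few lines of Python. This is a simplified sketch, not kernel behaviour: it assumes a host normally resolves a next hop with ARP only when that next hop falls inside a subnet configured on the interface, and that the on-link attribute (spelled `onlink` in iproute2) overrides this check. The function name and all addresses are hypothetical.

```python
import ipaddress


def can_arp_for_next_hop(interface_addr, next_hop, onlink=False):
    """Return True if the host would ARP for next_hop on this interface.

    interface_addr: address with prefix length, e.g. "10.1.1.10/32".
    onlink: models the "on-link" route attribute described above.
    """
    network = ipaddress.ip_interface(interface_addr).network
    same_subnet = ipaddress.ip_address(next_hop) in network
    # Without onlink, a next hop outside the local subnet is not considered
    # directly reachable; with onlink, the host trusts that it is.
    return same_subnet or onlink


# A host with a /32 address and a gateway on a different subnet.
print(can_arp_for_next_hop("10.1.1.10/32", "169.254.0.1"))
print(can_arp_for_next_hop("10.1.1.10/32", "169.254.0.1", onlink=True))
```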
Layer 3 Data Center
Technologies
Pure layer-3 solution complexities
• These techniques enable the assignment of the same IP
address to both interfaces and permit forwarding a default
route out of both interfaces.
• Each interface is in its own broadcast domain.
• Subnets can span two ToRs without requiring bonding
or an inter-switch link.
Layer 3 Data Center
Technologies
Standard ARP processing still works.
• Although the Layer 3 ToR switch doesn’t need Layer 2 information to
route IP packets, the Linux end-host believes it has to deal with the
traditional L2/L3 forwarding environment.
• As a result, the Layer 3 switch continues to reply to incoming ARP
requests.
• The host will ARP for the ToR Anycast gateway (even though it’s not
on the same subnet), and the ToR will respond with its MAC address.
• The host ARP table will only have one ARP entry because the default
route points to a next-hop, not an interface.
Layer 3 Data Center
Technologies
Standard ARP processing still works.
• Return traffic is slightly different, depending on what the ToR
advertises to the network.
• There are two modes; firstly, if the ToR advertises a /24 to
the rest of the network, everything works fine until
the server-to-ToR link fails.
• Then, it becomes a layer-2 problem: because the ToR advertised
that it can reach the whole subnet, return traffic can still arrive at
the ToR that has lost its server link and must then traverse an
inter-switch ToR link to get back to the server.
Layer 3 Data Center
Technologies
Standard ARP processing still works.
• But this goes against our previous design requirement of
removing any ToR inter-switch links.
• Essentially, you need to opt for the second mode and
advertise a /32 for each host back into the network.
Layer 3 Data Center
Technologies
Standard ARP processing still works.
• Take the information learned in ARP, consider it a host routing
protocol, and redistribute it into the data center protocol, i.e.,
redistribute ARP.
• The ARP table gets you the list of neighbors, and the redistribution
pushes those entries into the routed fabric as /32 host routes.
• This allows you to redistribute only those /32 routes that are active and present
in ARP tables.
• It should be noted that this is not a default mode and is currently an
experimental feature.
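The "redistribute ARP" idea can be sketched as a small Python function. This is an illustration under assumptions: the ARP table is modelled as a list of dictionaries with hypothetical `ip`, `port`, and `state` keys, and only entries in the `REACHABLE` state are turned into /32 host routes. A real switch would hand these routes to its routing protocol rather than return a list.

```python
import ipaddress


def host_routes_from_arp(arp_entries):
    """Turn active ARP entries into /32 host routes (illustrative only)."""
    routes = []
    for entry in arp_entries:
        # Skip neighbours that are not currently confirmed as present.
        if entry["state"] != "REACHABLE":
            continue
        prefix = ipaddress.ip_network(f"{entry['ip']}/32")
        routes.append((str(prefix), entry["port"]))
    return routes


# Hypothetical ARP table on a ToR switch.
arp_table = [
    {"ip": "10.1.1.10", "port": "swp1", "state": "REACHABLE"},
    {"ip": "10.1.1.11", "port": "swp2", "state": "FAILED"},
]
print(host_routes_from_arp(arp_table))
```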
Locator Identifier Separation Protocol
(LISP)

LISP
• Cisco Locator ID Separation Protocol (LISP) is a mapping
and encapsulation protocol, originally developed to
address the routing scalability issues on the Internet.

• Internet routing tables have grown exponentially, putting a
burden on BGP routers.
• Routing on the Internet is meant to be hierarchical, but
because of disaggregation, a full Internet routing table
nowadays contains over 800,000 prefixes.
LISP
• Disaggregation is the opposite of aggregation (route
summarization). We inject more specific routes when there
is an aggregate (summary route). There are two main reasons
why this happens:
• Multihoming: Customers connect to two different ISPs and
advertise their provider-independent address space (PI) to both
ISPs.
• Traffic engineering: A common practice for ingress traffic
engineering is to advertise a more specific route. This works, but
it increases the size of the Internet routing table.
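Aggregation and the effect of a more specific route can be tried out with Python's standard `ipaddress` module. This is a small sketch; the prefixes come from documentation address space and the next-hop labels `ISP-A` and `ISP-B` are hypothetical.

```python
import ipaddress


def aggregate(prefixes):
    """Summarize adjacent or overlapping prefixes into the fewest routes."""
    nets = [ipaddress.ip_network(p) for p in prefixes]
    return [str(n) for n in ipaddress.collapse_addresses(nets)]


def longest_prefix_match(dest, table):
    """Return the next hop of the most specific route that covers dest."""
    addr = ipaddress.ip_address(dest)
    matches = [
        (ipaddress.ip_network(prefix), next_hop)
        for prefix, next_hop in table.items()
        if addr in ipaddress.ip_network(prefix)
    ]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]


# Two /25 routes aggregate into one /24 summary route.
print(aggregate(["198.51.100.0/25", "198.51.100.128/25"]))

# Traffic engineering: the extra, more specific /25 attracts part of the
# traffic to ISP-B, at the cost of one more entry in every routing table.
table = {"198.51.100.0/24": "ISP-A", "198.51.100.128/25": "ISP-B"}
print(longest_prefix_match("198.51.100.200", table))
```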
LISP
• You need powerful routers with enough RAM and TCAM to store
all prefixes in the Internet routing table.
• Injecting more specific prefixes also increases the risk of route
instability.
• We need routers with powerful CPUs to process changes in the
routing table.
• With traditional IP routing, an IP address has two functions:
– Identity: To identify the device.
– Location: The location of the device in the network; we use this for
routing.
LISP
• LISP separates these two functions of an IP address into
two separate namespaces:
• Endpoint Identifier (EID): Assigned to hosts like
computers, laptops, printers, etc.
• Routing Locators (RLOC): Assigned to routers. We use
the RLOC address to reach EIDs.
LISP
• Cisco created LISP, but it is not a proprietary solution; it is
an open standard, defined in RFC 6830.
• Originally it was designed for the Internet, but nowadays
you also see LISP in other environments like data
centers, IoT, WAN, and the campus (Cisco SD-Access).
LISP
• LISP is a map-and-encapsulate protocol. There are
three essential components in a LISP environment:
• LISP sites: This is the EID namespace, where EIDs are.
• non-LISP sites: This is the RLOC namespace where we
find RLOCs. For example, the Internet.
• LISP mapping service: This is the infrastructure that
takes care of EID-to-RLOC mappings.
LISP
• We have two LISP sites, site 1 and site 2.
• In each site, there is a host and a router configured to use LISP.
• The hosts have an EID address:
• H1 EID 192.168.1.101
• H2 EID 192.168.2.102.
• The routers have an RLOC address:
• R1 RLOC 192.168.123.1
• R2 RLOC 192.168.123.2

• The RLOC space is a non-LISP area. For example, the
Internet.
LISP
When H1 wants to send an IP packet to H2, here’s what
happens:
1. H1 doesn’t have anything to do with LISP and sends an
IP packet to its default gateway (R1).
2. R1 receives the IP packet and asks the LISP mapping
system where it can find EID 192.168.2.102.
3. The mapping system replies with an EID-to-RLOC
mapping.
LISP
When H1 wants to send an IP packet to H2, here’s what
happens:
4. R1 now knows that it can reach EID 192.168.2.102
through RLOC 192.168.123.2. The router encapsulates the
IP packet with LISP encapsulation and transmits the packet.
5. R2 receives the LISP encapsulated IP packet, de-
encapsulates it, and forwards the original IP packet to H2.
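The five steps can be sketched in Python using the EID and RLOC addresses from the example. This is a toy model under assumptions: the mapping system is a plain dictionary, the /24 EID prefix lengths are assumed, and the class and function names are hypothetical. Real LISP uses Map-Request and Map-Reply messages and adds UDP and LISP headers.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class Packet:
    src: str          # source EID
    dst: str          # destination EID
    payload: bytes


@dataclass
class LispPacket:
    outer_src: str    # RLOC of the encapsulating router (ITR)
    outer_dst: str    # RLOC of the de-encapsulating router (ETR)
    inner: Packet     # original, unchanged IP packet


# Toy mapping system: EID prefix -> RLOC (prefix lengths are assumptions).
MAPPING_SYSTEM = {
    "192.168.1.0/24": "192.168.123.1",   # site 1, reached through R1
    "192.168.2.0/24": "192.168.123.2",   # site 2, reached through R2
}


def map_request(eid):
    """Steps 2-3: ask the mapping system which RLOC serves this EID."""
    addr = ipaddress.ip_address(eid)
    for prefix, rloc in MAPPING_SYSTEM.items():
        if addr in ipaddress.ip_network(prefix):
            return rloc
    return None


def itr_encapsulate(packet, my_rloc):
    """Step 4: wrap the original packet in an outer header towards the RLOC."""
    rloc = map_request(packet.dst)
    if rloc is None:
        raise LookupError(f"no EID-to-RLOC mapping for {packet.dst}")
    return LispPacket(outer_src=my_rloc, outer_dst=rloc, inner=packet)


def etr_decapsulate(lisp_packet):
    """Step 5: strip the outer header and forward the original packet."""
    return lisp_packet.inner


h1_packet = Packet("192.168.1.101", "192.168.2.102", b"hello")
tunnelled = itr_encapsulate(h1_packet, my_rloc="192.168.123.1")
print(tunnelled.outer_dst)
```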
LISP
• A very simplified one-sentence explanation is that LISP is a
tunneling protocol that uses a DNS-like system to figure out to
which router IP packets should be sent.
• The LISP routers that encapsulate and de-encapsulate have a
name:
– Ingress Tunnel Router (ITR): Router, which encapsulates IP packets.
– Egress Tunnel Router (ETR): Router, which de-encapsulates LISP
encapsulated IP packets.
– Tunnel Router (xTR): Router which performs both the ITR and ETR
functions.
LISP
Control Plane
The LISP control plane is similar to how DNS works:
– DNS resolves a hostname to an IP address.
– LISP resolves an EID to an RLOC.
LISP
Control Plane
• With traditional IP routing, we install prefixes in the routing
table. LISP doesn’t install EID-prefixes in the routing table.
• Instead, LISP uses a distributed mapping system where we
map EIDs to RLOCs.
• We store these mappings in a distributed EID-to-RLOC
database.
• When an ITR needs to find an RLOC address, it sends a Map-
Request query to the mapping system.
LISP
Data Plane
Once an ITR has figured out which RLOC to use to reach
an EID, it encapsulates the IP packet.
LISP
Data Plane
• When the ITR receives the IP packet from a host, it adds the
following headers:
• LISP Header: This header includes some LISP information
needed to forward the packet.
• The instance ID is a 24-bit value that has a similar function as the
Route Distinguisher (RD) in MPLS VPN.
• The instance ID is a unique identifier, which keeps prefixes apart
when you have overlapping (private) EID addresses in your LISP
sites.
LISP
Data Plane
• Outer LISP UDP header: The ITR selects the source port so that
traffic from one LISP site to another LISP site does not always
take the same path when there are equal-cost
multipath (ECMP) links to the destination. Different source
ports prevent polarization. The destination port is 4341.
• Outer LISP IP header: Contains the source and
destination RLOC IP addresses needed to route the
packet from the ITR to ETR.
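One plausible way for an ITR to pick the outer UDP source port is to hash the inner flow, so that different inner flows get different source ports and ECMP routers in the RLOC space spread them over several paths. The sketch below assumes this approach; the hash function, the port range 49152 to 65535, and the function name are illustrative choices, while the destination port 4341 is taken from the text above.

```python
import hashlib

LISP_DATA_PORT = 4341  # destination port of LISP-encapsulated data packets


def lisp_source_port(src_eid, dst_eid, proto, src_port, dst_port):
    """Derive the outer UDP source port from the inner flow (illustrative)."""
    key = f"{src_eid}|{dst_eid}|{proto}|{src_port}|{dst_port}".encode()
    value = int.from_bytes(hashlib.sha256(key).digest()[:2], "big")
    # Map the 16-bit hash into the range 49152..65535 (assumed range).
    return 49152 + (value % 16384)


flow_a = ("192.168.1.101", "192.168.2.102", 6, 40000, 443)
flow_b = ("192.168.1.101", "192.168.2.102", 6, 40001, 443)
print(lisp_source_port(*flow_a), lisp_source_port(*flow_b), LISP_DATA_PORT)
```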
Layer 3 Multicasting

Layer 3 Multicasting
Layer 3 multicast protocols include multicast group
management protocols and multicast routing protocols.
Layer 3 Multicasting
Multicast group management protocols:

• Internet Group Management Protocol (IGMP) and
Multicast Listener Discovery (MLD) protocol are multicast
group management protocols.
• Typically, they run between hosts and Layer 3 multicast
devices that directly connect to the hosts to establish and
maintain multicast group memberships.
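From a host's point of view, group membership is usually requested through the sockets API; the operating system then sends the IGMP membership report to the directly connected Layer 3 device. The Python sketch below shows this for IPv4. The group address 239.1.1.1, the port number, and the function names are assumptions for the example, and `join_group` needs a host with a multicast-capable interface to succeed.

```python
import socket


def build_mreq(group, interface="0.0.0.0"):
    """Pack the membership request: group address, then local interface address."""
    return socket.inet_aton(group) + socket.inet_aton(interface)


def join_group(group, port, interface="0.0.0.0"):
    """Open a UDP socket and join the multicast group on the given interface."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    # IP_ADD_MEMBERSHIP asks the IP stack to join the group; the stack then
    # reports the membership to the local multicast router using IGMP.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    build_mreq(group, interface))
    return sock


mreq = build_mreq("239.1.1.1")
print(len(mreq))
```

A receiver would call `join_group("239.1.1.1", 5000)` and then read datagrams with `recvfrom`.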
Layer 3 Multicasting
Multicast routing protocols:
• A multicast routing protocol runs on Layer 3 multicast
devices to establish and maintain multicast routes and
correctly and efficiently forward multicast packets.
• Multicast routes constitute loop-free data transmission
paths (also known as multicast distribution trees) from a
data source to multiple receivers.
• In the ASM model, multicast routes include intra-domain
routes and inter-domain routes.
Layer 3 Multicasting

Multicast routing protocols:
• An intra-domain multicast routing protocol discovers
multicast sources and builds multicast distribution trees
within an AS to deliver multicast data to receivers.
• Among a variety of mature intra-domain multicast
routing protocols, PIM is most widely used.
• Based on the forwarding mechanism, PIM has dense
mode (often referred to as PIM-DM) and sparse mode
(often referred to as PIM-SM).
Layer 3 Multicasting
Multicast routing protocols:

• An inter-domain multicast routing protocol is used for delivering
multicast information between two ASs.
• So far, mature solutions include Multicast Source Discovery Protocol
(MSDP) and MBGP.
• MSDP propagates multicast source information among different ASs.
• MBGP is an extension of the MP-BGP for exchanging multicast routing
information among different ASs.
Layer 3 Multicasting
Multicast routing protocols:
• For the SSM model, multicast routes are not divided into
intra-domain routes and inter-domain routes.
• Because receivers know the positions of the multicast
sources, channels established through PIM-SM are sufficient
for the transport of multicast information.
IP Addressing

IP Service
• IP supports the following services:
• one-to-one (unicast)
• one-to-all (broadcast)
• one-to-several (multicast)

• IP multicast also supports a many-to-many service.
• IP multicast requires support of other protocols (IGMP, multicast routing)
DATAGRAM
• A packet in the IP layer is called a datagram, a variable-
length packet
• Consisting of two parts: header and data.
• The header is 20 to 60 bytes in length and contains
information essential to routing and delivery.

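IP Datagram Format
Each row of the header is 4 bytes wide (bits 0 to 31):
• Row 1: version (4 bits) | header length (4 bits) | DS (6 bits) | ECN (2 bits) | total length in bytes (16 bits)
• Row 2: identification (16 bits) | 0 (1 bit) | DF (1 bit) | MF (1 bit) | fragment offset (13 bits)
• Row 3: time-to-live (TTL) (8 bits) | protocol (8 bits) | header checksum (16 bits)
• Row 4: source IP address (32 bits)
• Row 5: destination IP address (32 bits)
• Then: options (0 to 40 bytes), followed by the payload

• $20 \le \text{Header Size} \le (2^4 - 1) \times 4 = 60$ bytes
• $20 \le \text{Total Length} < 2^{16} = 65536$ bytes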
Fields of the IP Header
• Version (4 bits): current version is 4, next version will be 6.
-This field specifies the version of IP used for transferring data. The size of
the Version field is 4 bits. Both the sender and the receiver must use the
same version of IP to ensure proper interpretation of the fields in the
datagram.
• Header length (4 bits): length of IP header, in multiples of 4 bytes.
-You must multiply the value in this field by four to get the length of
the IP header. For example, if the value in this field is 5, the length of the
header is 5*4, which is 20 bytes (the minimum, for a header without options).
• TOTAL LENGTH: Total Length field specifies the total length of the
datagram. The size of the field is 16 bits. The Total Length field can be
calculated as follows:

Total length of the datagram = Length of the header + Length of the data
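The field layout and the length arithmetic can be checked with a short Python parser. This is a sketch that decodes only the fixed 20-byte part of the header and ignores options; the function name and the sample header bytes are chosen for illustration.

```python
import socket
import struct


def parse_ipv4_header(data):
    """Decode the fixed 20-byte part of an IPv4 header into a dictionary."""
    if len(data) < 20:
        raise ValueError("an IPv4 header is at least 20 bytes long")
    (ver_ihl, ds_ecn, total_length, ident, flags_frag,
     ttl, proto, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", data[:20])
    header_length = (ver_ihl & 0x0F) * 4      # field counts 4-byte words
    return {
        "version": ver_ihl >> 4,
        "header_length": header_length,
        "ds": ds_ecn >> 2,
        "ecn": ds_ecn & 0x03,
        "total_length": total_length,
        # Length of the data = total length - length of the header
        "data_length": total_length - header_length,
        "identification": ident,
        "df": bool(flags_frag & 0x4000),
        "mf": bool(flags_frag & 0x2000),
        "fragment_offset": flags_frag & 0x1FFF,
        "ttl": ttl,
        "protocol": proto,
        "checksum": checksum,
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
    }


# Sample header: UDP datagram from 192.168.0.1 to 192.168.0.199.
sample = bytes.fromhex("45000073000040004011b861c0a80001c0a800c7")
fields = parse_ipv4_header(sample)
print(fields["header_length"], fields["total_length"], fields["data_length"])
```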
Fields of the IP Header
• DS/ECN field (1 byte)
– This field was previously called the Type-of-Service (TOS) field.
The role of this field has been redefined, but is “backwards
compatible” with the TOS interpretation
– Differentiated Service (DS) (6 bits):
• Used to specify service level (currently not supported in the
Internet)
– Explicit Congestion Notification (ECN) (2 bits):
• New feedback mechanism used by TCP
Type of Service

The precedence subfield was designed, but never used in version 4.


Fields of the IP Header
• Identification (16 bits): Unique identification of a
datagram from a host. Incremented whenever a datagram
is transmitted

• Flags (3 bits):
– First bit always set to 0
– DF bit (Do not fragment)
– MF bit (More fragments)
(The DF and MF bits are explained later, under IP Fragmentation.)
Fields of the IP Header
• Time To Live (TTL) (1 byte):
– Specifies the maximum number of router hops a datagram may traverse before it is dropped
– Role of TTL field: Ensure that packet is eventually dropped
when a routing loop occurs
Used as follows:
– Sender sets the value (e.g., 64)
– Each router decrements the value by 1
– When the value reaches 0, the datagram is dropped

Fields of the IP Header
• Protocol (1 byte):
• Specifies the higher-layer protocol.
• Used for demultiplexing to higher layers.

• Common values: 1 = ICMP, 2 = IGMP, 4 = IP-in-IP encapsulation, 6 = TCP, 17 = UDP
Fields of the IP Header
Value (Decimal) | Protocol
1 | Internet Control Message Protocol (ICMP)
2 | Internet Group Management Protocol (IGMP)
3 | Gateway-to-Gateway Protocol (GGP)
4 | Internet Protocol (IP)
6 | Transmission Control Protocol (TCP)
8 | Exterior Gateway Protocol (EGP)
9 | Interior Gateway Protocol (IGP)
17 | User Datagram Protocol (UDP)
41 | Internet Protocol Version 6 (IPv6)
86 | Dissimilar Gateway Protocol (DGP)
88 | Interior Gateway Routing Protocol (IGRP)
89 | Open Shortest Path First (OSPF)
Fields of the IP Header

• Header checksum (2 bytes): A simple 16-bit long checksum
which is computed from the header of the datagram.
• The Header Checksum field contains the checksum, which is
used by the destination to check for the integrity of the
transmitted data by applying an algorithm on the IP
header.
• The size of this field is 16 bits. The Header Checksum value
is calculated by the sender and is sent along with the IP
header.
Fields of the IP Header

• The sender uses a specific algorithm for arriving at
the checksum value. When the datagram reaches the
destination, the checksum is calculated by the
destination by using the same algorithm.
• If the value that is calculated is not equal to the specified
Header Checksum value in the header of the datagram,
the packet is discarded.
Fields of the IP Header
• The Header Checksum value is calculated only with
the header values and not by using the data. Every
intermediate device that receives the data must calculate
the Header Checksum value before forwarding it.
• The Header Checksum value is not calculated over
the data because doing so would:
– Reduce the efficiency of the network, because datagrams
would be held by each intermediate device for longer.
– Require more processing time at every hop.
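The algorithm used for the IPv4 header checksum is the 16-bit one's complement sum (the "Internet checksum"). A minimal Python sketch follows; the function names and the sample header are chosen for illustration. The sender computes the checksum with the checksum field set to zero, and a receiver that runs the same sum over the whole header, checksum included, obtains zero when the header is intact.

```python
def internet_checksum(header: bytes) -> int:
    """16-bit one's complement of the one's complement sum of the header."""
    if len(header) % 2:
        header += b"\x00"                  # pad to a whole number of 16-bit words
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]
    while total >> 16:                     # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def header_is_valid(header: bytes) -> bool:
    """A received header is intact if the sum over all words comes out as 0."""
    return internet_checksum(header) == 0


# Sample header with the checksum field (bytes 10-11) set to zero.
unsigned = bytes.fromhex("450000730000400040110000c0a80001c0a800c7")
checksum = internet_checksum(unsigned)
signed = unsigned[:10] + checksum.to_bytes(2, "big") + unsigned[12:]
print(hex(checksum), header_is_valid(signed))
```

Because only the 20 to 60 header bytes are summed, a router that decrements the TTL can recompute the value cheaply at every hop.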
Fields of the IP Header
• Options:
• Security restrictions
• Record Route: each router that processes the packet adds its IP address to
the header.
• Timestamp: each router that processes the packet adds its IP address and
time to the header.
• (loose) Source Routing: specifies a list of routers that must be traversed.
• (strict) Source Routing: specifies a list of the only routers that can be
traversed.
• Padding: Padding bytes are added to ensure that header
ends on a 4-byte boundary
Maximum Transmission Unit
• Maximum size of an IP datagram is 65535 bytes, but the data link layer protocol generally
imposes a limit that is much smaller.
• Example:
– Ethernet frames have a maximum payload of 1500 bytes
– Therefore, IP datagrams encapsulated in an Ethernet frame cannot be longer than 1500 bytes
• The limit on the maximum IP datagram size, imposed by the data link protocol is
called maximum transmission unit (MTU)
• MTUs for various data link protocols (in bytes):
– Ethernet: 1500
– 802.3: 1492
– 802.5: 4464
– FDDI: 4352
– ATM AAL5: 9180
– PPP: negotiated
MTU
• The amount of data that can be transmitted in a single
frame is called the Maximum Transmission Unit (MTU) and varies
with the network technology that is used.
• MTU size is measured in bytes.
• For example, the MTU for Ethernet is 1,500 bytes,
whereas it is 4,352 bytes for FDDI.
IP Fragmentation
• What if the size of an IP datagram exceeds the MTU?
IP datagram is fragmented into smaller units.
• What if the route contains networks with different MTUs?
• Example path: Host A -> FDDI ring (MTU 4352) -> Router -> Ethernet (MTU 1500) -> Host B
• Fragmentation:
• IP router splits the datagram into several datagrams
• Fragments are reassembled at receiver
IP Fragmentation
• If a datagram can be accommodated in a frame, data transmission
becomes very simple
• However, if the size of the datagram is more than the value that can
be accommodated in the frame, the datagram must be divided into
logical groups called fragments.
• If a datagram cannot be accommodated in a single frame, it is
divided or fragmented and sent in multiple frames. The process of
dividing a datagram into multiple groups called fragments is
called fragmentation.
Where is Fragmentation done?

• Fragmentation can be done at the sender or at intermediate routers
• The same datagram can be fragmented several times.
• Reassembly of original datagram is only done at destination hosts !!
• Figure: a router splits an IP datagram (header H) into Fragment 1 (header H1) and Fragment 2 (header H2).
What’s involved in Fragmentation?
• The following fields in the IP header are involved: total length, identification, flags (DF, MF), and fragment offset.
• Identification: When a datagram is fragmented, the identification is the same in
all fragments
• Flags:
– DF bit is set: Datagram cannot be fragmented and must
be discarded if MTU is too small
– MF bit set: This datagram is a fragment, and an
additional fragment follows this one.
– The last bit in this field (MF) indicates whether there are any fragments following
the current one. If it is set to 0, the current
fragment is the last fragment in the datagram.
What’s involved in Fragmentation?
• The following fields in the IP header are involved: total length, identification, flags (DF, MF), and fragment offset.
• Fragment offset: Offset of the payload of the current
fragment in the original datagram
• The Fragment Offset field indicates the position of a
fragment's data relative to the start of the data in the original datagram. The position
is not expressed in bytes: the field counts units of 8 octets, so every fragment
except the last must carry a multiple of 8 octets of data. The Fragment Offset is numbered starting
from zero.
• Total length: Total length of the current fragment


Example of Fragmentation
• A datagram with size 2400 bytes must be fragmented according to an
MTU limit of 1000 bytes
• The datagram reaches the router over a link with MTU 4000 and leaves over a link with MTU 1000:

Field | IP datagram | Fragment 1 | Fragment 2 | Fragment 3
Header length | 20 | 20 | 20 | 20
Total length | 2400 | 996 | 996 | 448
Identification | 0xa428 | 0xa428 | 0xa428 | 0xa428
DF flag | 0 | 0 | 0 | 0
MF flag | 0 | 1 | 1 | 0
Fragment offset | 0 | 0 | 122 | 244
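The numbers in this example can be reproduced with a short Python function. This is a sketch under assumptions: the original datagram has a 20-byte header without options, its DF flag is 0, and it is not itself a fragment. The function name and the dictionary keys are chosen for illustration.

```python
def fragment(total_length, mtu, header_length=20):
    """Split a datagram into fragments that fit the MTU (illustrative only)."""
    payload = total_length - header_length
    # Every fragment except the last must carry a multiple of 8 data bytes,
    # because the fragment offset field counts units of 8 octets.
    max_data = (mtu - header_length) // 8 * 8
    if max_data <= 0:
        raise ValueError("MTU too small to carry any data")
    fragments = []
    offset = 0
    while offset < payload:
        size = min(max_data, payload - offset)
        more_fragments = offset + size < payload
        fragments.append({
            "total_length": size + header_length,
            "offset": offset // 8,
            "mf": 1 if more_fragments else 0,
        })
        offset += size
    return fragments


for frag in fragment(2400, 1000):
    print(frag)
```

With an MTU of 1000 and a 20-byte header, at most 980 data bytes fit, which is rounded down to 976 (122 x 8); that is why the second fragment starts at offset 122 and the third at offset 244.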
Internet Protocol
Transports a datagram from source host to
destination, possibly via several intermediate nodes
(“routers”)

Service is:
• Unreliable: Losses, duplicates, out-of-order delivery
• Best effort: Packets not discarded capriciously,
delivery failure not necessarily reported
• Connectionless: Each packet is treated
independently
8/23/2023 77
IP Service
• Delivery service of IP is minimal
• IP provides an unreliable connectionless best effort service (also
called: “datagram service”).
– Unreliable: IP does not make an attempt to recover lost packets
– Connectionless: Each packet (“datagram”) is handled independently.
IP is not aware that packets between hosts may be sent in a logical
sequence
– Best effort: IP does not make guarantees on the service (no
throughput guarantee, no delay guarantee,…)
• Consequences:
• Higher layer protocols have to deal with losses or with duplicate
packets
• Packets may be delivered out-of-sequence
21-Mar-20 79
Figure 24-3: IP Datagram
Why IPv6?
• Deficiency of IPv4
• Address space exhaustion
• New types of service → Integration
– Multicast
– Quality of Service
– Security
– Mobility (MIPv6)
• Header and format limitations
Problems with IPv4: Address Space Exhaustion

• IPv4 has 32 bit addresses.
• Flat routing infrastructure
• Results in inefficient use of address space.
• Class B addresses are almost exhausted.
• Mostly only class C addresses remain
• Addresses were expected to be exhausted within about 5 years; the central IANA pool ran out in 2011.
Problems with IPv4: Routing Table Explosion

• Classful IPv4 addressing does not permit route aggregation
-Route aggregation is an alternate term for route
summarization, which is a method used to minimize the
number of routing table entries required in an IP network.
• Number of networks is increasing very fast
(number of routes to be advertised goes up)
• Very high routing overhead
– lot more memory needed for routing table
– lot more bandwidth to pass routing information
– lot more processing needed to compute routes
Problems with IPv4: Header Limitations

• Maximum header length is 60 octets/bytes.
(Restricts options)
• Maximum packet length is 64K octets.
• ID for fragments is 16 bits. Repeats every 65536 packets.
(Will two packets in the network have same ID?)
• Variable size header.
(Slower processing at routers.)
• No ordering of options.
(All routers need to look at all options.)
Problems with IPv4: Other Limitations

• Lack of quality-of-service support.
– Only an 8-bit ToS field, which is hardly used.
– Problem for multimedia services.
• No support for security at IP layer.
• Mobility support is limited.
IP Address Extension
• Strict monitoring of IP address assignment
• Private IP addresses for intranets
– Only class C or a part of class C to an organization
– Encourage use of proxy services
• Application level proxies
• Network Address Translation (NAT)
• Remaining class A addresses may use CIDR
• Reserved addresses may be assigned

But these will only postpone address exhaustion.
They do not address problems like QoS, mobility, security.
IPng – next generation
• At least $10^9$ networks, $10^{12}$ end-systems
• Datagram service (best effort delivery)
• Independent of physical layer technologies
• Robust (routing) in presence of failures
• Flexible topology
• Better routing structures (e.g., aggregation)
• High performance (fast switching)
• Support for multicasting
IPng – next generation
• Support for mobile nodes
• Support for quality-of-service
• Provide security at IP layer
• Extensible
• Auto-configuration (plug-and-play)
• Straight-forward transition plan from IPv4
• Minimal changes to upper layer protocols

IPv6: Advanced Features
• Header format simplification
• Expanded routing and addressing capabilities
• Improved support for extensions and options
• Flow labeling (for QoS) capability
• Auto-configuration and Neighbour discovery
• Authentication and privacy capabilities
• Simple transition from IPv4

IPv6: Datagram

Format of the Base header

IPv6 Header Fields
• Version number (4-bit field)
The value is always 6.
• Flow label (20-bit field)
Used to label packets requesting special handling by routers.
• Traffic class (8-bit field)
Used to mark classes of traffic.
The nodes that originate a packet must identify different classes or different priorities of IPv6
packets. The nodes use the Traffic Class field in the IPv6 header to make this identification.
The routers that forward the packets also use the Traffic Class field for the same purpose.
• Payload length (16-bit field)
Length of the packet following the IPv6 header, in octets.
• Next header (8-bit field)
The type of header immediately following the IPv6 header.

IPv6 Header Fields
• Hop limit (8-bit field)
Decremented by 1 by each node that forwards the packet.
Packet discarded if hop limit is decremented to zero.
• Source Address (128-bit field)
An address of the initial sender of the packet.
• Destination Address (128-bit field)
An address of the intended recipient of the packet. May not be the
ultimate recipient, if Routing Header is present.
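The field widths listed above add up to a fixed 40-byte base header. The following is a minimal sketch of packing and unpacking those fields with Python's `struct` module. The function names and the example addresses (from the 2001:db8::/32 documentation range) are assumptions made for illustration only.

```python
import ipaddress
import struct

def pack_ipv6_header(traffic_class, flow_label, payload_length,
                     next_header, hop_limit, src, dst, version=6):
    # First 32 bits: version (4) | traffic class (8) | flow label (20).
    first_word = (version << 28) | (traffic_class << 20) | flow_label
    fixed = struct.pack("!IHBB", first_word, payload_length,
                        next_header, hop_limit)
    # Two 128-bit addresses follow the first 8 bytes.
    return (fixed
            + ipaddress.IPv6Address(src).packed
            + ipaddress.IPv6Address(dst).packed)

def unpack_ipv6_header(data):
    first_word, payload_length, next_header, hop_limit = struct.unpack(
        "!IHBB", data[:8])
    return {
        "version": first_word >> 28,
        "traffic_class": (first_word >> 20) & 0xFF,
        "flow_label": first_word & 0xFFFFF,
        "payload_length": payload_length,
        "next_header": next_header,
        "hop_limit": hop_limit,
        "src": str(ipaddress.IPv6Address(data[8:24])),
        "dst": str(ipaddress.IPv6Address(data[24:40])),
    }

# Example: a header announcing a 20-byte payload carried by TCP (next header 6).
header = pack_ipv6_header(0, 0x12345, 20, 6, 64, "2001:db8::1", "2001:db8::2")
fields = unpack_ipv6_header(header)
print(len(header), fields)
```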

FLOW LABEL
• In version 6, the flow label has been directly added to
the format of the IPv6 datagram to allow us to use IPv6
as a connection-oriented protocol.
• To a router, a flow is a sequence of packets that share
the same characteristics, such as traveling the same
path, using the same resources, having the same
kind of security, and so on.

Flow label
• A router that supports the handling of flow labels has a flow
label table.
• The table has an entry for each active flow label; each entry
defines the services required by the corresponding flow
label.
• When the router receives a packet, it consults its flow label
table to find the corresponding entry for the flow label value
defined in the packet.
• It then provides the packet with the services mentioned in
the entry.
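The lookup described above can be sketched as a simple dictionary keyed by flow label. The label values, next hops and queue names below are hypothetical; a real router would hold this state in hardware and would typically key it by source address as well.

```python
# Hypothetical flow label table: flow label -> services required by that flow.
flow_table = {
    0x0A1B2: {"next_hop": "R2", "queue": "real-time"},
    0x0C3D4: {"next_hop": "R3", "queue": "bulk"},
}

# Treatment for packets with no entry in the table (assumed default).
DEFAULT_SERVICE = {"next_hop": None, "queue": "best-effort"}

def services_for(flow_label):
    """Return the services entry for a packet's flow label."""
    # A label of zero means the source did not label the flow.
    if flow_label == 0:
        return DEFAULT_SERVICE
    return flow_table.get(flow_label, DEFAULT_SERVICE)

print(services_for(0x0A1B2))
print(services_for(0))
```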
Flow label
• In its simplest form, a flow label can be used to
speed up the processing of a packet by a router.
• When a router receives a packet, instead of
consulting the routing table and going through a
routing algorithm to define the address of the next
hop, it can easily look in a flow label table for the
next hop.

Flow label
• Flow label can be used to support the transmission of real-time
audio and video.
• Real-time audio or video, particularly in digital form, requires
resources such as high bandwidth, large buffers, long
processing time, and so on.
• A process can make a reservation for these resources
beforehand to guarantee that real-time data will not be delayed
due to a lack of resources.
• The use of real-time data and the reservation of these
resources require other protocols such as Real-Time Protocol
(RTP) and Resource Reservation Protocol (RSVP) in addition to
IPv6
Flow label
• To allow the effective use of flow labels, three rules have been
defined:
1. The flow label is assigned to a packet by the source host. The
label is a random number between 1 and 2^20 – 1 (the field is 20
bits wide). A source must not reuse a flow label for a new flow
while the existing flow is still alive.
2. If a host does not support the flow label, it sets this field to zero.
If a router does not support the flow label, it simply ignores it.
3. All packets belonging to the same flow have the same source,
same destination, same priority, and same options.
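Rule 1 can be sketched as follows. This assumes the 20-bit Flow Label width given in the header field list, and the helper name `new_flow_label` is hypothetical; it only illustrates "random, non-zero, not reused while the flow is alive".

```python
import random

FLOW_LABEL_BITS = 20  # width of the Flow Label field in the IPv6 base header

def new_flow_label(active_labels, rng=random):
    """Pick a random non-zero label not used by any live flow of this source."""
    max_label = (1 << FLOW_LABEL_BITS) - 1
    while True:
        label = rng.randint(1, max_label)   # zero is reserved for "no flow label"
        if label not in active_labels:
            active_labels.add(label)        # remember it until the flow ends
            return label

active = set()
first = new_flow_label(active)
second = new_flow_label(active)
print(hex(first), hex(second))
```

When a flow ends, the source would remove its label from `active` so the value can be used again.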

Header: from IPv4 to IPv6
IPv4 Vs IPv6

21-Mar-20 101
IPv4 & IPv6 Header Comparison
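IPv4 Header
Version | IHL | Type of Service | Total Length
Identification | Flags | Fragment Offset
Time to Live | Protocol | Header Checksum
Source Address
Destination Address
Options | Padding

IPv6 Header
Version | Traffic Class | Flow Label
Payload Length | Next Header | Hop Limit
Source Address
Destination Address

Legend
- Field’s name kept from IPv4 to IPv6: Version, Source Address, Destination Address
- Fields not kept in IPv6: IHL, Identification, Flags, Fragment Offset, Header Checksum, Options, Padding
- Name & position changed in IPv6: Type of Service (now Traffic Class), Total Length (now Payload Length), Time to Live (now Hop Limit), Protocol (now Next Header)
- New field in IPv6: Flow Label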
Extension Headers
• To give more functionality to the IP datagram, the base header can be
followed by up to six extension headers. Many of these headers are
options in IPv4.
• Only present when needed.
• Less used functions moved to extension headers.
Eliminates IPv4’s 40-byte limit on options
• Order of extension headers in a packet is defined.
• Headers are aligned on 8-byte boundaries.

MPLS

18CSC310J-DCNSD NWC/SRMIST 108


What is MPLS?
• From MPLS Resource center:
– “MPLS stands for "Multiprotocol Label Switching". In an MPLS
network, incoming packets are assigned a "label" by a "label
edge router (LER)". Packets are forwarded along a "label
switch path (LSP)" where each "label switch router (LSR)"
makes forwarding decisions based solely on the contents of the
label. At each hop, the LSR strips off the existing label and
applies a new label which tells the next hop how to forward the
packet.
What is MPLS?
• From MPLS Resource center:
– Label Switch Paths (LSPs) are established by network
operators for a variety of purposes, such as to guarantee a
certain level of performance, to route around network
congestion, or to create IP tunnels for network-based virtual
private networks. In many ways, LSPs are no different than
circuit-switched paths in ATM or Frame Relay networks, except
that they are not dependent on a particular Layer 2
technology.
What is MPLS?
• From MPLS Resource center:

– An LSP can be established that crosses multiple Layer 2


transports such as ATM, Frame Relay or Ethernet. Thus, one
of the true promises of MPLS is the ability to create end-to-end
circuits, with specific performance characteristics, across any
type of transport medium, eliminating the need for overlay
networks or Layer 2 only control mechanisms.”
What is MPLS?
• OK, now in plain English, please?
– Packets enter the MPLS network at a “Label Edge Router” (LER).
– The LER affixes a label to each packet and forwards it into the MPLS network.
– At each hop, the label switches in the network make their forwarding
decision based solely on the label. That decision follows
a pre-established “Label Switch Path” (LSP).
– Labels can be integrated with existing L2 information such as the DLCI or
ATM VCs.
• Diagram in class.
MPLS Motivation
• Original drivers towards label switching:
– Designed to make routers faster
• ATM switches were faster than routers
• Fixed length label lookup faster than longest match used by IP routing
• Allow a device to do the same job as a router with performance of ATM
switch
– Enabled IP + ATM integration
• Mapping of IP to ATM had become very complex, hence simplify by
replacing ATM signalling protocols with IP control protocols
MPLS Motivation
• Growth and evolution of the Internet
– The need to evolve routing algorithm
– The need for advanced forwarding algorithm
– routing vs. forwarding (switching)
• routing: flexibility
• forwarding: price/performance
• Can we forward/switch IP packets?
– Allow speed of L2 switching at L3
– Router makes L3 forwarding decision based on a single field: similar
to L2 forwarding, hence speed
Some MPLS Benefits
• Traffic Engineering - the ability to set the path traffic will take through the
network, and the ability to set performance characteristics for a class of traffic

• VPNs - using MPLS, service providers can create IP tunnels throughout their
network, without the need for encryption or end-user applications
• Layer 2 Transport - New standards being defined by the IETF's PWE3 and
PPVPN working groups allow service providers to carry Layer 2 services
including Ethernet, Frame Relay and ATM over an IP/MPLS core
Some MPLS Benefits
• Elimination of Multiple Layers - Typically most carrier networks employ an
overlay model where SONET/SDH is deployed at Layer 1, ATM is used at
Layer 2 and IP is used at Layer 3. Using MPLS, carriers can migrate many
of the functions of the SONET/SDH and ATM control plane to Layer 3,
thereby simplifying network management and network complexity.
Eventually, carrier networks may be able to migrate away from SONET/SDH
and ATM all-together, which means elimination of ATM's inherent "cell-tax" in
carrying IP traffic.
MPLS History
• IP over ATM
• IP Switching by Ipsilon
• Cell Switching Router (CSR) by Toshiba
• Tag switching by Cisco
• Aggregate Route-based IP Switching (IBM)
• IETF – MPLS
– http://www.ietf.org/html.charters/mpls-charter.html
– RFC3031 – MPLS Architecture
– RFC2702 – Requirements for TE over MPLS
– RFC3036 – LDP Specification
MPLS and ISO model
(MPLS is a layer 2.5 protocol)

Applications
TCP | UDP
IP
MPLS | MPλS
PPP | FR | ATM | Ethernet | DWDM
Physical

When a layer is added, no modification is needed on the existing layers.
Label Switching

• What is it?
• Goal: sending a packet from A to B
– We can do it in a broadcast way.
– We can use source routing where the source
determines the path.
– How do we do it on the Internet today?
• Hop-by-hop routing: continue asking who is
closer to B at every stop (hop).
Using Label on the network
(This is not new!)

• ATM: VPI/VCI
• Frame Relay: DLCI
• X.25: LCI (logical Channel Identifier)
• TDM: the time slot (Circuit Identification Code)
• Ethernet switching: ???
Q: do you see any commonality of these labels?
Label Substitution (swapping)

Label-A1 Label-B1

Label-A2 Label-B2

Label-A3 Label-B3

Label-A4 Label-B4
MPLS
• A protocol to establish an end-to-end path from
source to the destination
• A hop-by-hop forwarding mechanism
• Use labels to set up the path
– Require a protocol to set up the labels along the path
• It builds a connection-oriented service on the IP
network
Terminology
• LSR - A router that supports MPLS is called a Label Switch Router.
• LER - An LSR at the edge of the network is called a Label Edge Router (a.k.a.
Edge LSR).
– Ingress LER is responsible for adding labels to unlabeled IP packets.
– Egress LER is responsible for removing the labels.
• Label Switch Path (LSP) – the path defined by the labels through LSRs
between two LERs.
• Label Forwarding Information Base (LFIB) – a forwarding table that maps
incoming labels to outgoing labels and interfaces.
• Forwarding Equivalence Class (FEC) – a group of IP packets that follow the
same path on the MPLS network and receive the same treatment at each node.
How does it work?

Add label at the remove label at


ingress LER the egress LER

LSR LSR LER


LER

IP IP #L1 IP #L2 IP #L3 IP

IP Label Label IP
Routing Switching Switching Routing
MPLS Operation

Label Path: R1 => R2 => R3 => R4

Label Forwarding Information Base (LFIB)

Router  Incoming  Incoming   Destination    Outgoing   Outgoing
        Label     Interface  Network (FEC)  Interface  Label
R1      ---       E0         172.16.1.0     S1         6
R2      6         S0         172.16.1.0     S2         11
R3      11        S0         172.16.1.0     S3         7
R4      7         S1         172.16.1.0     E0         --

Q: create LFIB for R4 => R3 => R2 => R1
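The label path R1 => R2 => R3 => R4 can be traced with a small sketch. The dictionary layout is an assumption made for illustration; the label and interface values are the ones from the LFIB rows above, with `None` standing for "no label".

```python
# router -> {incoming label: (outgoing interface, outgoing label)}
# None stands for "no label" (an unlabeled IP packet).
LFIB = {
    "R1": {None: ("S1", 6)},    # ingress LER: push label 6
    "R2": {6: ("S2", 11)},      # LSR: swap 6 -> 11
    "R3": {11: ("S3", 7)},      # LSR: swap 11 -> 7
    "R4": {7: ("E0", None)},    # egress LER: remove the label
}
PATH = ["R1", "R2", "R3", "R4"]

def forward(path, lfib):
    """Follow a packet along the path, swapping the label at every hop."""
    label = None
    trace = []
    for router in path:
        out_interface, out_label = lfib[router][label]
        trace.append((router, label, out_interface, out_label))
        label = out_label
    return trace

for hop in forward(PATH, LFIB):
    print(hop)
```

Each hop looks up only the incoming label; the destination network is consulted once, at the ingress router.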
MPLS process

Label Switch Path

Routing Protocol

FEC FEC FEC

Label Swapping Label removal


Classification
LFIB LFIB LFIB
Label assignment

Layer 2 Layer 2 Layer 2

Layer 1 Layer 1 Layer 1

Ingress Core Egress


Node Node Node
Label Encapsulation

Label information can be carried in a packet in a variety of ways:


• A small, shim label header inserted between the Layer 2 and
network layer headers.
• As part of the Layer 2 header, if the Layer 2 header provides
adequate semantics (such as ATM).
• As part of the network layer header (future, such as IPv6).

• In general, MPLS can be implemented over any media type,


including point-to-point, Ethernet, Frame Relay, and ATM links. The
label-forwarding component is independent of the network layer
protocol.
Label Encapsulation

L2 ATM FR Ethernet PPP


Label VPI/VCI DLCI Shim Label

L2 Label IP Datagram
Header Header
MPLS Encapsulation is specified over various media
types. Labels may use existing format (e.g., VPI/VCI)
or use a new shim label format.
Shim Header

• The Label (Shim Header) is represented as a sequence of Label Stack Entries
• Each Label Stack Entry is 4 bytes (32 bits)
• 20 bits are reserved for the Label Identifier (also named Label)

Label      Exp       S        TTL
(20 bits)  (3 bits)  (1 bit)  (8 bits)

Label : Label value (0 to 15 are reserved for special use)
Exp : Experimental Use
S : Bottom of Stack (set to 1 for the last entry in the label stack)
TTL : Time To Live
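The 32-bit layout above (20 + 3 + 1 + 8 bits) can be packed and unpacked with a few shifts. This is a minimal sketch; the function names are hypothetical and the example label value 6 is borrowed from the LFIB example.

```python
import struct

def pack_label_entry(label, exp, bottom_of_stack, ttl):
    """Build one 4-byte label stack entry: Label | Exp | S | TTL."""
    if not 0 <= label < (1 << 20):
        raise ValueError("label must fit in 20 bits")
    if not 0 <= exp < 8 or not 0 <= ttl < 256:
        raise ValueError("exp must fit in 3 bits and ttl in 8 bits")
    word = (label << 12) | (exp << 9) | (int(bottom_of_stack) << 8) | ttl
    return struct.pack("!I", word)   # network byte order

def unpack_label_entry(data):
    (word,) = struct.unpack("!I", data)
    return {
        "label": word >> 12,
        "exp": (word >> 9) & 0x7,
        "s": (word >> 8) & 0x1,
        "ttl": word & 0xFF,
    }

entry = pack_label_entry(label=6, exp=0, bottom_of_stack=True, ttl=64)
print(entry.hex(), unpack_label_entry(entry))
```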
Forwarding Equivalence Class
(FEC) Classification

A packet can be mapped to a particular FEC based on the


following criteria:
• destination IP address,
• source IP address,
• TCP/UDP port,
• class of service (CoS) or type of service (ToS),
• application used,
•…
• any combination of the previous criteria.

Ingress Label FEC Egress Label


6 138.120.6.0/24 9
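Classification on a combination of criteria can be sketched as an ordered rule list. The rules, port number and FEC names below are hypothetical; only the prefix 138.120.6.0/24 comes from the example row above.

```python
import ipaddress

# Hypothetical classification rules, checked in order; a missing key means "any".
RULES = [
    {"dst": "138.120.6.0/24", "dst_port": 5060, "fec": "voice"},
    {"dst": "138.120.6.0/24", "fec": "data"},
]

def classify(dst_ip, dst_port):
    """Return the FEC name of the first matching rule, or None."""
    addr = ipaddress.ip_address(dst_ip)
    for rule in RULES:
        if addr not in ipaddress.ip_network(rule["dst"]):
            continue
        if "dst_port" in rule and rule["dst_port"] != dst_port:
            continue
        return rule["fec"]
    return None

print(classify("138.120.6.10", 5060))
print(classify("138.120.6.10", 80))
```

The ingress LER would then attach the label bound to the chosen FEC, and later hops would not repeat this classification.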
Forwarding Equivalence Classes (FEC)

LER LSR LER


LSR

IP1 IP1 #L1 IP1 #L2 IP1 #L3 IP1

IP2 IP2 #L1 IP2 #L2 IP2 #L3 IP2

IP3 IP3 #L4 IP3 #L5 IP3 #L6 IP3


IP4 IP4 #L4 IP4 #L5 IP4 #L6 IP4
• FEC = A group of packets that are treated the same way by a router.
• The concept of FECs provides for flexibility, scalability, and traffic engineering.
• In legacy routing, the FEC is determined again at each hop (e.g., from the ToS field). In MPLS
it is only done once at the network ingress.
MPLS Applications

• Traffic Engineering
• Virtual Private Network
• Quality of Service (QoS)
Traffic Engineering
• Traffic engineering allows a network administrator to make the path
deterministic and bypass the normal routed hop-by-hop paths. An
administrator may elect to explicitly define the path between stations
to ensure QoS or have the traffic follow a specified path to reduce
traffic loading across certain hops.
• The network administrator can reduce congestion by forcing the
frame to travel around the overloaded segments. Traffic engineering,
then, enables an administrator to define a policy for forwarding
frames rather than depending upon dynamic routing protocols.
• Traffic engineering is similar to source-routing in that an explicit
path is defined for the frame to travel. However, unlike source-
routing, the hop-by-hop definition is not carried with every frame.
Rather, the hops are configured in the LSRs ahead of time along with
the appropriate label values.
MPLS – Traffic Engineering

Overload !!
LER 1 LER 4 IP
IP Overload !!
IP L
IP L

Forward to IP L
LSR 2
LSR 3
LSR 4 LSR 2 LSR 3
LSR X

• End-to-end forwarding decision determined by the ingress node.
• Enables Traffic Engineering
MPLS-based VPN
• One of the most popular MPLS applications is the implementation
of VPNs.
• The basic concept is the same as ATM transparent LAN.
• Labels (instead of IP addresses) are used to interconnect multiple
sites over a carrier’s network. Each site has its own private IP
address space.
• Different VPNs may use the same IP address space.
• Same as Frame Relay separation of different user traffic… but it is
more “fashionable” to use the word “VPN” today.
MPLS VPN Connection Model

MPLS MPLS
Edge Edge
VPN_A MPLS Core VPN_A
10.2.0.0 11.5.0.0

VPN_B VPN_A
10.2.0.0 10.1.0.0
VPN_A
11.6.0.0 VPN_B
10.3.0.0
VPN_B
10.1.0.0

VPN_A: 10.2.0.0/24, 11.6.0.0/24, 11.5.0.0/24


VPN_B: 10.2.0.0/24, 10.1.0.0/24, 10.3.0.0/24
MPLS VPN - Example
192.168.1.0 192.168.2.0

E1 E1

E3 E3
E1 E2 E2
E2

192.168.3.0 -- E1 10 E3
10 E1 30 E2 30 E3 -- E1 192.168.4.0
-- E2 20 E3
20 E1 40 E2 40 E3 -- E2
LSP

70 E3 -- E1 50 E2 70 E1 -- E1 50 E3
80 E3 -- E1 60 E2 80 E1 -- E2 60 E3
LSP
MPLS and QoS
• An important proposed MPLS capability is quality of service (QoS) support.
• QoS mechanisms:
– Pre-configuration based on physical interface
– Classification of incoming packets into different classes
– Classification based on network characteristics (such as congestion,
throughput, delay, and loss)
• A label corresponding to the resultant class is applied to the packet.
• Labeled packets are handled by LSRs in their path without needing to be
reclassified.
• MPLS enables simple logic to find the state that identifies how the packet should be
scheduled.
• The exact use of MPLS for QoS purposes depends a great deal on how QoS is
deployed.
• MPLS supports various QoS models and protocols, such as IntServ, DiffServ, and RSVP.
FEC QoS Classification

LER LSR

MPLS label based on A priority scheme for


1. physical interface different label switch path (LSP)
2. Source IP address
3. Destination IP address
4. Type of Service (ToS)
5. Protocol information
6. etc.
IP Differentiated Model

Layer 3 IPv4 packet: Version/Length | ToS (1 byte) | other IP header info | Data

ToS byte:  7 6 5            4 3 2 1 0
           IP Precedence    Unused bits

MPLS label stack entry:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                Label                  | EXP |S|      TTL      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
MPLS between Carriers?
Carrier-B
Carrier-A

Carrier-C Internet Carrier-D

Q: Does LDP work on different carriers’ network?


A (short): not yet
A (long): no network-to-network interface (NNI) signaling
.. And I really don’t expect it in the near future…
Summary
• MPLS is accepted by the industry to migrate ATM-based core
to IP/MPLS-based core.
• It is applied to carrier networks and large enterprise networks.
• How do we set the label path: LDP
• What is the need: traffic classification
• What are the applications: traffic engineering, VPN, QoS, etc.
• Challenges:
– NNI for MPLS
– MPLS for the Internet
OSPF



Routing and Forwarding
• Routing is not the same as Forwarding
• Routing is the building of maps
– Each routing protocol usually has its own routing database
– Routing protocols populate the forwarding table
• Forwarding is passing the packet to the next hop device
– Forwarding table contains the best path to the next hop for
each prefix
– There is only ONE forwarding table
OSPF Background
• Developed by IETF – RFC1247
– Designed for Internet TCP/IP environment
• OSPF v2 described in RFC2328/STD54
• OSPF v3 described in RFC2740 - IPv6
• Link state/Shortest Path First Technology
• Dynamic Routing
• Fast Convergence
• Route authentication
Link State Algorithm
• Each router contains a database containing a map of the
whole topology
– Links
– Their state (including cost)
• All routers have the same information
• All routers calculate the best path to every destination
• Any link state changes are flooded across the network
– “Global spread of local knowledge”
Link State Routing
• Automatic neighbour discovery
– Neighbours are physically connected routers
• Each router constructs a Link State Packet (LSP)
– Distributes the LSP to neighbours…
– …using an LSA (Link State Advertisement)
• Each router computes its best path to every destination
• On network failure
– New LSPs are flooded
– All routers recompute routing table
Low Bandwidth Requirements

FDDI
Dual Ring
LSA

X R1

LSA

• Only changes are propagated


• Multicast used on multi-access broadcast
networks
– 224.0.0.5 used for all OSPF speakers
– 224.0.0.6 used for DR and BDR routers
“Shortest Path First”

• The optimal path is determined by the sum of the


interface costs
Cost = 1 Cost = 1
FDDI FDDI
N2 N3
Dual Ring Dual Ring

R2

R3

N1 R1 N5
Cost = 10 Cost = 10
R4

N4 Cost = 10
OSPF: How it works
• Hello Protocol
– Responsible for establishing and maintaining
neighbour relationships
– Elects Designated Router on broadcast networks

Hello

FDDI
Dual Ring
Hello Hello
OSPF: How it works
• Hello Protocol
– Hello Packets sent periodically on all OSPF enabled interfaces
– Adjacencies formed between some neighbours
• Hello Packet
– Contains information like Router Priority, Hello Interval, a list of
known neighbours, Router Dead Interval, and the network
mask
OSPF: How it works
• Trade Information using LSAs
– LSAs are added to the OSPF database
– LSAs are passed on to OSPF neighbours
• Each router builds an identical link state database
• SPF algorithm run on the database
• Forwarding table built from the SPF tree
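The step "SPF algorithm run on the database" can be sketched with Dijkstra's algorithm. The four-router topology and its link costs are hypothetical, and a real OSPF implementation works on LSAs rather than on a plain dictionary.

```python
import heapq

def spf(graph, source):
    """Dijkstra: return (cost to each node, first hop towards each node)."""
    dist = {source: 0}
    first_hop = {}
    heap = [(0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > dist[node]:
            continue  # stale heap entry, a cheaper path was already found
        for neighbour, link_cost in graph[node].items():
            new_cost = cost + link_cost
            if new_cost < dist.get(neighbour, float("inf")):
                dist[neighbour] = new_cost
                # The first hop is inherited unless we are leaving the source.
                first_hop[neighbour] = (
                    neighbour if node == source else first_hop[node])
                heapq.heappush(heap, (new_cost, neighbour))
    return dist, first_hop

# Hypothetical link state database: router -> {neighbour: interface cost}.
graph = {
    "R1": {"R2": 1, "R4": 10},
    "R2": {"R1": 1, "R3": 1},
    "R3": {"R2": 1, "R4": 10},
    "R4": {"R1": 10, "R3": 10},
}

dist, first_hop = spf(graph, "R1")
print(dist)
print(first_hop)
```

The `first_hop` mapping is what would be installed in the forwarding table: for each destination, the neighbour to hand the packet to.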
OSPF: How it works
• When change occurs:
– Announce the change to all OSPF neighbours
– All routers run the SPF algorithm on the revised database
– Install any change in the forwarding table
Broadcast Networks
• These are network technologies such as Ethernet and
FDDI
• Introduces Designated and Backup Designated routers
(DR and BDR)
– Only DR and BDR form full adjacencies with other routers
– The remaining routers remain in a “2-way” state with each other
• If they were adjacent, we’d have n-squared scaling problem
– If DR or BDR “disappear”, re-election of missing router takes
place
Designated Router

• One per multi-access network


– Generates network link advertisements for the multi-access
network
– Speeds database synchronisation

Backup
Designated Designated
Router Router

Designated Backup
Router Designated Router
Designated Router
• All routers are adjacent to the DR
– All routers are adjacent to the BDR also
• All routers exchange routing information with DR (..)
– All routers exchange routing information with the BDR
• DR updates the database of all its neighbours
– BDR updates the database of all its neighbours
• This scales! 2n problem rather than having an n-squared
problem.
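The scaling argument can be checked with a little arithmetic. This sketch assumes every other router forms one adjacency with the DR and one with the BDR, plus the adjacency between the DR and BDR themselves; the function names are hypothetical.

```python
def full_mesh_adjacencies(n):
    """Adjacencies if every router pairs with every other router."""
    return n * (n - 1) // 2

def dr_bdr_adjacencies(n):
    """Adjacencies when only the DR and BDR form full adjacencies."""
    if n < 2:
        return 0
    # (n - 2) other routers, each adjacent to DR and BDR, plus DR-BDR itself.
    return 2 * (n - 2) + 1

for n in (4, 10, 50):
    print(n, full_mesh_adjacencies(n), dr_bdr_adjacencies(n))
```

For 50 routers on one segment this is 1225 adjacencies in a full mesh against 97 with a DR and BDR.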
Designated Router

DR BDR

• Adjacencies only formed with DR and BDR


• LSAs propagate along the adjacencies
Designated Router Priority

• Determined by interface priority


• Otherwise by highest router ID
– (For Cisco IOS, this is address of loopback
interface, otherwise highest IP address on
router)
131.108.3.2 131.108.3.3

DR

R1 Router ID = 144.254.3.5 R2 Router ID = 131.108.3.3

144.254.3.5
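The selection rule above (interface priority first, then highest router ID) can be sketched as below. This is a simplification: it assumes equal priorities of 1 for the two routers in the figure, treats priority 0 as ineligible, and ignores that a real election also chooses a BDR and does not pre-empt an existing DR.

```python
import ipaddress

def elect_dr(routers):
    """routers: list of (router_id, priority). Return the winning router ID."""
    # A priority of 0 means the router never becomes DR.
    eligible = [r for r in routers if r[1] > 0]
    if not eligible:
        return None
    # Highest priority wins; ties are broken by the numerically highest router ID.
    winner = max(
        eligible,
        key=lambda r: (r[1], int(ipaddress.IPv4Address(r[0]))))
    return winner[0]

routers = [("144.254.3.5", 1), ("131.108.3.3", 1)]
print(elect_dr(routers))
```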
More Advanced OSPF
• OSPF Areas
• Virtual Links
• Router Classification
• OSPF route types
• External Routes
• Route authentication
• Equal cost multipath
OSPF Areas
• Group of contiguous hosts and networks
• Per area topological database
– Invisible outside the area
– Reduction in routing traffic
• Backbone area contiguous
– All other areas must be connected to the backbone
• Virtual Links
[Figure: Area 0 (Backbone Area) with Areas 1, 2, 3 and 4]
OSPF Areas
• Reduces routing traffic in area 0
• Consider subdividing network into areas
– Once area 0 is more than 10 to 15 routers
– Once area 0 topology starts getting complex
• Area design often mimics typical ISP core network design
• Virtual links are used for “awkward” connectivity
topologies (…)
Virtual Links
• OSPF requires that all areas MUST be connected to area
0
• If topology is such that an area cannot have a physical
connection to a device in area 0, then a virtual link must
be configured
• Otherwise the disconnected area will only be able to have
connectivity to its immediately neighbouring area, and not
the rest of the network
Classification of Routers

• Internal Router (IR)
• Area Border Router (ABR)
• Backbone Router (BR)
• Autonomous System Border Router (ASBR)
[Figure: Areas 0, 1, 2 and 3 showing an IR, an ABR/BR, and an ASBR leading to another AS]
OSPF Route Types

• Intra-Area route
– All routes inside an area
• Inter-Area route
– Routes advertised from one area to another area by an ABR
• External route
– Routes imported into OSPF from another routing protocol by an ASBR
[Figure: Areas 0, 1, 2 and 3 showing an ABR, and an ASBR leading to another AS]
External Routes

• Type 1 external metric: metrics are added to the summarised internal link cost

R1 reaches N1 via R2 (internal cost = 10, external cost = 1)
or via R3 (internal cost = 8, external cost = 2).

Network  Type 1  Next Hop
N1       11      R2
N1       10      R3        Selected Route
External Routes
• Type 2 external metric: metrics are compared without
adding to the internal link cost

R1 reaches N1 via R2 (internal cost = 10, external cost = 1)
or via R3 (internal cost = 8, external cost = 2).

Network  Type 2  Next Hop
N1       1       R2        Selected Route
N1       2       R3
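The two external metric types can be compared side by side using the costs from the example. The data layout and function names are assumptions made for illustration; the tie-break on internal cost for Type 2 is included as an assumption about how equal external costs would be resolved.

```python
# Candidate paths from R1 to external network N1 (costs from the example).
paths = [
    {"next_hop": "R2", "internal_cost": 10, "external_cost": 1},
    {"next_hop": "R3", "internal_cost": 8, "external_cost": 2},
]

def best_type1(candidates):
    """Type 1: compare internal cost plus external cost."""
    return min(candidates,
               key=lambda p: p["internal_cost"] + p["external_cost"])

def best_type2(candidates):
    """Type 2: compare external cost only (internal cost breaks ties)."""
    return min(candidates,
               key=lambda p: (p["external_cost"], p["internal_cost"]))

print("Type 1 selects", best_type1(paths)["next_hop"])
print("Type 2 selects", best_type2(paths)["next_hop"])
```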
Route Authentication
• Now recommended to use route authentication for OSPF
– …and all other routing protocols
• Susceptible to denial of service attacks
– OSPF runs directly over IP (protocol 89)
– Automatic neighbour discovery
• Route authentication – Cisco example:
router ospf <pid>
 network 192.0.2.0 0.0.0.255 area 0
 area 0 authentication
interface ethernet 0/0
 ip ospf authentication-key <password>
Equal Cost Multipath
• If n paths to same destination have equal cost, OSPF will
install n entries in the forwarding table
– Loadsharing over the n paths
– Useful for expanding links across an ISP backbone
• Don’t need to use hardware multiplexors
• Don’t need to use static routing
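How traffic is shared over the n installed entries is left to the router. One common approach, assumed here purely for illustration, is to hash each flow's addresses and ports so that all packets of a flow take the same path. The function name and the example flow are hypothetical.

```python
import hashlib

def pick_next_hop(flow, next_hops):
    """Map a flow tuple onto one of several equal-cost next hops."""
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    # Use the first 4 bytes of the hash as an index into the next-hop list.
    index = int.from_bytes(digest[:4], "big") % len(next_hops)
    return next_hops[index]

equal_cost_hops = ["R2", "R3", "R4"]
# (source IP, destination IP, protocol, source port, destination port)
flow = ("10.0.0.1", "10.1.1.1", 6, 40000, 443)
print(pick_next_hop(flow, equal_cost_hops))
```

Hashing per flow rather than per packet avoids reordering packets within a flow.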
Summary
• Link State Protocol
• Shortest Path First
• OSPF operation
• Broadcast networks
– Designated and Backup Designated Router
• Advanced Topics
– Areas, router classification, external networks, authentication,
multipath
IS-IS



IS-IS
• IS-IS is an IGP, link-state routing protocol, similar to
OSPF.
• It forms neighbor adjacencies, has areas, exchanges link-
state packets, builds a link-state database and runs the
Dijkstra SPF algorithm to find the best path to each
destination, which is installed in the routing table.
IS-IS
• ISO also uses some different terminology, for example:
• Router = Intermediate system
• Host = End system
• Unlike OSPF which was developed by the IETF (Internet
Engineering Task Force), IS-IS was originally developed
by DEC for CLNS, not IP and this is why it’s called IS-IS
(Intermediate System – Intermediate System).
IS-IS
• Later, IS-IS was adapted so that it could also route IP and
is then called integrated IS-IS.
• Nowadays, we use IP everywhere so you might wonder
why we care about this.
• When working with IS-IS, you will see some references to
CLNP/CLNS here and there.
IS-IS
• For example, when configuring a router ID (called a
Network Entity Title), it has to be configured with the
NSAP (Network Service Access Point Address) format.
• NSAP is similar to an IP address, and it is not
automatically configured so we have to understand its
format.
IS-IS
• IS-IS also rides directly on top of an Ethernet header,
using its own header format. It’s not encapsulated in an IP
packet like other routing protocols (OSPF and EIGRP)
are:
IS-IS
• IS-IS is a highly scalable routing protocol, which is why it
is used often on large service provider network
backbones.
IS-IS
Areas and Router Roles
• IS-IS uses different areas where the entire router sits in an area,
not just one of its interfaces like with OSPF.
• There is no backbone area, the backbone is formed by a string of
routers.
There are three types of routers:
• Level 1 system: this is an intra-area router. It only knows what the
local area looks like and will only learn prefixes from its own area. It
creates a level 1 link-state database and SPF tree for the area.
IS-IS
Areas and Router Roles
• Level 2 system: this is a backbone router that knows all
intra-area and inter-area routes. It creates a level 2 link-
state database and SPF tree for the backbone.
• Level 1-2 system: this is a router that performs both
roles. It creates a separate level 1 and 2 link-state
database and two SPF trees, one for each database.
IS-IS
Areas and Router Roles
• Similar to other routing protocols like OSPF and EIGRP,
IS-IS routers will send hello packets.
• When you send and receive hello packets, you will form a
neighbor adjacency. Routers will only form neighbor
adjacencies with routers that use the same level.
IS-IS
Areas and Router Roles

Above we have two routers in a single area. There is


only one area so these two routers are configured as
level 1 routers. These two routers will form a level 1
neighbor adjacency.
IS-IS
Areas and Router Roles

Level 1 routers only know what the local area looks like.
If a level 1 router wants to reach something outside of its
area, it has to use a level 2 router. In each area, we
configure one router as a level 1-2 router.
IS-IS
Areas and Router Roles
These level 1-2 routers will establish two neighbor
adjacencies:

• Level 1 neighbor adjacency with the router in the same


area.
• Level 2 neighbor adjacency with the router in the other
area.
IS-IS
Areas and Router Roles
IS-IS
Areas and Router Roles
The router in area 4 is a level 2 backbone router. There
are no level 1 routers in area 4 so we don’t need a level
1-2 router there.
Area 3 has two level 1-2 routers. These routers will form
two neighbor adjacencies with each other:
• Level 1 adjacency
• Level 2 adjacency
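The adjacency rules from the last few slides can be summarised in a short sketch. The data layout and function name are assumptions; the rule encoded is that a level 1 adjacency needs both routers to run level 1 in the same area, while a level 2 adjacency only needs both routers to run level 2.

```python
def adjacency_levels(a, b):
    """Return the set of IS-IS levels at which routers a and b become neighbors."""
    levels = set()
    # Level 1 adjacencies only form inside one area.
    if 1 in a["levels"] and 1 in b["levels"] and a["area"] == b["area"]:
        levels.add(1)
    # Level 2 adjacencies may cross area boundaries.
    if 2 in a["levels"] and 2 in b["levels"]:
        levels.add(2)
    return levels

l12_area3_a = {"area": 3, "levels": {1, 2}}
l12_area3_b = {"area": 3, "levels": {1, 2}}
l2_area4 = {"area": 4, "levels": {2}}
l1_area12 = {"area": 12, "levels": {1}}

print(adjacency_levels(l12_area3_a, l12_area3_b))  # both levels
print(adjacency_levels(l12_area3_a, l2_area4))     # level 2 only
print(adjacency_levels(l1_area12, l2_area4))       # no adjacency
```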
IS-IS
Areas and Router Roles
The level two routers form a continuous string of backbone
routers:
IS-IS
Link State Packets
IS-IS uses LSPs (Link State Packets), which are similar to OSPF’s
LSAs. In the LSP you will find:

• One or more prefixes


• Adjacent neighbors
• Metric
IS-IS
Link State Packets

Each router will create an LSP (illustrated with the green
jigsaw). In the LSP we find the directly connected
networks that are advertised in IS-IS. A few seconds
later, these routers become neighbors.
IS-IS
Link State Packets

R1 and R2 are in the same area so they will establish a


level 1 neighbor adjacency. These routers will flood their
LSPs within the area so that everyone knows about all
LSPs in the area.
IS-IS
Link State Packets

The two routers add each other’s LSP in their database.


These routers can now run SPF on their level 1 database
and figure out the shortest path to each destination.
IS-IS
Link State Packets
• Let’s connect area 12 to another area; this means we need a
level 2 router.
• Let’s convert R2 into a level 1-2 router so I can show you what
will happen.
• At this moment, we start with a clean slate so there is no
neighbor adjacency between R1 and R2:
IS-IS
Link State Packets

R2 now has a second database, the level 2 database, besides its level 1
database and level 1 LSP. It generates a level 2 LSP containing all prefixes
for interfaces that are directly connected and advertised in IS-IS.
IS-IS
Link State Packets
A few seconds later, R1 and R2 form a level 1 neighbor adjacency:
IS-IS
Link State Packets
• Once again, R1 and R2 will exchange their level 1
LSPs.
• R2 receives the level 1 LSP from R1 and it copies
new prefixes from its level 1 database to the LSP
in the level 2 database.
• In this example, that is 1.1.1.1/32 from R1.
IS-IS
Link State Packets
• Add a second area now, similar to area 12.
• There is no connection yet between the two
areas but the routers have formed a level 1
neighbor adjacency within the area:
IS-IS
Link State Packets

R4 has learned about the 3.3.3.3/32 prefix from R3 and copies this
prefix from the LSP in the level 1 database to its own LSP in the level
2 database.
BGP



Border Gateway Protocol (BGP)
• Routing/Forwarding basics
• Building blocks
• Exercises
• BGP protocol basics
• Exercises
• BGP path attributes
• Best path computation
• Exercises
Border Gateway Protocol (BGP)...

• Typical BGP topologies


• Routing Policy
• Exercises
• Redundancy/Load sharing
• Best current practices
Routing/Forwarding
Basics
IP route lookup: Longest match routing

Packet: Destination IP address: 10.1.1.1
Routers: R1, R2, R3 (all 10/8 except 10.1/16), R4 (10.1/16)

R2’s IP routing table
10/8 -> R3
10.1/16 -> R4
20/8 -> R5
30/8 -> R6
…..
IP route lookup: Longest match routing

Packet: Destination IP address: 10.1.1.1
Routers: R1, R2, R3 (all 10/8 except 10.1/16), R4 (10.1/16)

R2’s IP routing table
10/8 -> R3       10.1.1.1 & FF.0.0.0 is equal to 10.0.0.0 & FF.0.0.0: Match!
10.1/16 -> R4
20/8 -> R5
…..
IP route lookup: Longest match routing

Packet: Destination IP address: 10.1.1.1
Routers: R1, R2, R3 (all 10/8 except 10.1/16), R4 (10.1/16)

R2’s IP routing table
10/8 -> R3
10.1/16 -> R4    10.1.1.1 & FF.FF.0.0 is equal to 10.1.0.0 & FF.FF.0.0: Match as well!
20/8 -> R5
…..
IP route lookup: Longest match routing
[Figure: R1 connects to R2; R2 connects to R3 (all of 10/8 except 10.1/16) and to R4 (10.1/16)]
Packet: Destination IP address: 10.1.1.1
R2’s IP routing table:
10/8 -> R3
10.1/16 -> R4
20/8 -> R5
…..
Checking 20/8 -> R5: 10.1.1.1 & FF.0.0.0 is not equal to 20.0.0.0 & FF.0.0.0. Does not match!
IP route lookup: Longest match routing
[Figure: R1 connects to R2; R2 connects to R3 (all of 10/8 except 10.1/16) and to R4 (10.1/16)]
Packet: Destination IP address: 10.1.1.1
R2’s IP routing table:
10/8 -> R3
10.1/16 -> R4 (longest match, 16 bit netmask)
20/8 -> R5
…..
IP route lookup: Longest match
routing
• The default route is 0.0.0.0/0
• It is handled by the normal longest match algorithm
• It matches every destination, but it is always the shortest match, so it
is used only when no more specific route matches
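
The lookup walked through on the preceding slides can be sketched in a few lines of Python. This is only an illustration using the standard `ipaddress` module, not how a router implements forwarding; the table reuses R2's entries from the slides, while the `0.0.0.0/0` entry and its `default-gw` next hop are assumptions added to show the default route.

```python
import ipaddress

# R2's routing table from the slides: prefix -> next hop.
# The 0.0.0.0/0 entry and "default-gw" are hypothetical additions.
ROUTES = {
    "10.0.0.0/8": "R3",
    "10.1.0.0/16": "R4",
    "20.0.0.0/8": "R5",
    "30.0.0.0/8": "R6",
    "0.0.0.0/0": "default-gw",
}


def lookup(destination, routes=ROUTES):
    """Return (prefix, next_hop) for the longest matching prefix, or None."""
    addr = ipaddress.ip_address(destination)
    best = None
    for prefix, next_hop in routes.items():
        net = ipaddress.ip_network(prefix)
        # "addr in net" is the (destination & mask) == (prefix & mask) test.
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, next_hop)
    return None if best is None else (str(best[0]), best[1])
```

For 10.1.1.1 both 10/8 and 10.1/16 match, and the 16-bit prefix wins; a destination that matches nothing else falls through to the default route.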
Forwarding

• Uses the routing table built by routing protocols


• Performs the lookup to find next-hop and outgoing
interface
• Switches the packet with new encapsulation as per the
outgoing interface
Building Blocks

• Autonomous System (AS)


• Types of Routes
• IGP/EGP
• DMZ
• Policy
• Egress
• Ingress
Autonomous System (AS)

AS 100

• Collection of networks with same policy


• Single routing protocol
• Usually under single administrative control
• IGP to provide internal connectivity
Autonomous System(AS)...

• Identified by ‘AS number’


• Public & Private AS numbers
• Examples:
– Service provider
– Multi-homed customers
– Anyone needing policy discrimination
Routing flow and packet flow
[Figure: AS 1 and AS 2. Routing flow: each AS announces routes and the other accepts them. Packet flow: egress and ingress traffic between the two ASes.]

For networks in AS1 and AS2 to communicate:


AS1 must announce routes to AS2
AS2 must accept routes from AS1
AS2 must announce routes to AS1
AS1 must accept routes from AS2
Egress Traffic

• Packets exiting the network


• Based on
– Route availability (what others send you)
– Route acceptance (what you accept from
others)
– Policy and tuning (what you do with routes
from others)
– Peering and transit agreements
Ingress Traffic

• Packets entering your network


• Ingress traffic depends on:
– What information you send and to whom
– Based on your addressing and ASes
– Based on others’ policy (what they accept from you and what
they do with it)
Types of Routes

• Static Routes
– configured manually
• Connected Routes
– created automatically when an interface is ‘up’
• Interior Routes
– Routes within an AS
• Exterior Routes
– Routes exterior to AS
What Is an IGP?

• Interior Gateway Protocol


• Within an Autonomous System
• Carries information about internal prefixes
• Examples—OSPF, ISIS, EIGRP…
What Is an EGP?

• Exterior Gateway Protocol


• Used to convey routing information between ASes
• De-coupled from the IGP
• Current EGP is BGP4
Why Do We Need an EGP?

• Scaling to large network


– Hierarchy
– Limit scope of failure
• Define administrative boundary
• Policy
– Control reachability to prefixes
Interior vs. Exterior
Routing Protocols

• Interior
– Automatic discovery
– Generally trust your IGP routers
– Routes go to all IGP routers
• Exterior
– Specifically configured peers
– Connecting with outside networks
– Set administrative boundaries
Hierarchy of Routing Protocols

[Figure: routing protocol hierarchy. Labels: Other ISPs (BGP4), BGP4 / OSPF, Local NAP over FDDI (BGP4), Customers (BGP4/Static)]
Demilitarized Zone (DMZ)

[Figure: routers A and B in AS 100, routers C and D in AS 101, and AS 102, with a DMZ network between them]

• Shared network between ASes


Addressing - ISP
• Need to reserve address space for its
network.
• Need to allocate address blocks to its
customers.
• Need to take “growth” into consideration
• Upstream link address is allocated by
upstream provider
BGP Basics

• Terminology
• Protocol Basics
• Messages
• General Operation
• Peering relationships (EBGP/IBGP)
• Originating routes
Terminology
• Neighbor
– Configured BGP peer
• NLRI/Prefix
– NLRI - network layer reachability information
– Reachability information for an IP address & mask
• Router-ID
– By default, the highest IP address configured on the router (on Cisco
routers a loopback address takes precedence)
• Route/Path
– NLRI advertised by a neighbor
Protocol Basics
Peering
[Figure: routers A and B in AS 100, routers C and D in AS 101, router E in AS 102]

• Routing protocol used between ASes
– if you aren’t connected to multiple ASes, you don’t need BGP :)
• Runs over TCP
• Path vector protocol
• Incremental update
BGP Basics ...

• Each AS originates a set of NLRI


• NLRI is exchanged between BGP peers
• Can have multiple paths for a given prefix
• Picks the best path and installs in the IP forwarding table
• Policies applied (through attributes) influence BGP path
selection
BGP Peers
[Figure: routers A and B in AS 100 (220.220.8.0/24), routers C and D in AS 101 (220.220.16.0/24), router E in AS 102 (220.220.32.0/24); eBGP TCP/IP peer connections run between the ASes]

BGP speakers are called peers
Peers in different AS’s are called External Peers
Note: eBGP Peers normally should be directly connected.
BGP Peers
[Figure: routers A and B in AS 100 (220.220.8.0/24), routers C and D in AS 101 (220.220.16.0/24), router E in AS 102 (220.220.32.0/24); iBGP TCP/IP peer connections run inside each AS]

BGP speakers are called peers
Peers in the same AS are called Internal Peers
Note: iBGP Peers don’t have to be directly connected.
BGP Peers
[Figure: routers A and B in AS 100 (220.220.8.0/24), routers C and D in AS 101 (220.220.16.0/24), router E in AS 102 (220.220.32.0/24); BGP Update messages flow between the peers]

BGP Peers exchange Update messages containing Network Layer
Reachability Information (NLRI)
Configuring BGP Peers
[Figure: AS 100 and AS 101 joined by an eBGP TCP connection over 222.222.10.0/30]
A .2 (220.220.8.0/24) .1 B .2 (222.222.10.0/30) .1 C .2 (220.220.16.0/24) .1 D

Router B (AS 100):
interface Serial 0
ip address 222.222.10.2 255.255.255.252
router bgp 100
network 220.220.8.0 mask 255.255.255.0
neighbor 222.222.10.1 remote-as 101

Router C (AS 101):
interface Serial 0
ip address 222.222.10.1 255.255.255.252
router bgp 101
network 220.220.16.0 mask 255.255.255.0
neighbor 222.222.10.2 remote-as 100

• BGP Peering sessions are established using the BGP “neighbor”
configuration command
– External (eBGP) is configured when AS numbers are different
BGP Updates — NLRI
• Network Layer Reachability Information
• Used to advertise feasible routes
• Composed of:
– Network Prefix
– Mask Length
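
As a rough illustration of the "prefix plus mask length" pair, the sketch below encodes an IPv4 prefix the way BGP-4 carries NLRI on the wire: one octet holding the mask length, followed by only as many prefix octets as that length needs. The function names `encode_nlri` and `decode_nlri` are hypothetical, and the sketch handles IPv4 only.

```python
import ipaddress


def encode_nlri(prefix):
    """Encode one IPv4 prefix as <length octet><minimum prefix octets>."""
    net = ipaddress.ip_network(prefix)
    nbytes = (net.prefixlen + 7) // 8  # octets needed to hold the mask length
    return bytes([net.prefixlen]) + net.network_address.packed[:nbytes]


def decode_nlri(data):
    """Decode a concatenation of NLRI entries back into prefix strings."""
    prefixes, i = [], 0
    while i < len(data):
        length = data[i]
        nbytes = (length + 7) // 8
        # Pad the truncated prefix back to 4 octets (IPv4 only).
        raw = data[i + 1:i + 1 + nbytes].ljust(4, b"\x00")
        prefixes.append(f"{ipaddress.IPv4Address(raw)}/{length}")
        i += 1 + nbytes
    return prefixes
```

A /24 therefore costs four octets on the wire (one for the length, three for the prefix), while a /8 costs two.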
BGP Updates — Attributes
• Used to convey information associated with NLRI
– AS path
– Next hop
– Local preference
– Multi-Exit Discriminator (MED)
– Community
– Origin
– Aggregator
AS-Path Attribute
• Sequence of ASes a route has traversed
• Loop detection
• Apply policy
[Figure: AS 100 (180.10.0.0/16), AS 200 (170.10.0.0/16), AS 300, AS 400 (150.10.0.0/16) and AS 500]

Network Path
180.10.0.0/16 300 200 100
170.10.0.0/16 300 200
150.10.0.0/16 300 400
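Next Hop Attribute
[Figure: router A in AS 100 (160.10.0.0/16) connects to router B in AS 200 (150.10.0.0/16) over 192.20.2.0/30 (A is .1, B is .2); router C in AS 200 connects to router D in AS 300 (140.10.0.0/16) over 192.10.1.0/30 (C is .1, D is .2); router E is also in AS 300; BGP Update messages are shown]

Network Next-Hop Path
160.10.0.0/16 192.20.2.1 100

• Next hop to reach a network
• Usually a local network is the next hop in eBGP session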
Next Hop Attribute
[Figure: router A in AS 100 (160.10.0.0/16) connects to router B in AS 200 (150.10.0.0/16) over 192.20.2.0/30 (A is .1, B is .2); router C in AS 200 connects to router D in AS 300 (140.10.0.0/16) over 192.10.1.0/30 (C is .1, D is .2); router E is also in AS 300; BGP Update messages are shown]

Network Next-Hop Path
150.10.0.0/16 192.10.1.1 200
160.10.0.0/16 192.10.1.1 200 100

• Next hop to reach a network
• Usually a local network is the next hop in eBGP session
• Next Hop updated between eBGP Peers
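Next Hop Attribute
[Figure: router A in AS 100 (160.10.0.0/16) connects to router B in AS 200 (150.10.0.0/16) over 192.20.2.0/30 (A is .1, B is .2); router C in AS 200 connects to router D in AS 300 (140.10.0.0/16) over 192.10.1.0/30 (C is .1, D is .2); router E is also in AS 300; BGP Update messages are shown]

Network Next-Hop Path
150.10.0.0/16 192.10.1.1 200
160.10.0.0/16 192.10.1.1 200 100

• Next hop not changed between iBGP peers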
Next Hop Attribute (more)
• IGP should carry route to next hops
• Recursive route look-up
• Unlinks BGP from actual physical topology
• Allows IGP to make intelligent forwarding decision
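
The recursive look-up can be sketched as two longest-match passes: one in the BGP table to find the BGP next hop, and a second in the IGP table to find how to reach that next hop. Everything below is illustrative: the table contents, the `Serial0` interface name and the 10.1.2.2 IGP next hop are assumptions, not values from the slides.

```python
import ipaddress

# Hypothetical tables: the BGP route names a next hop that is not directly
# connected; the IGP table says how to reach that next hop.
BGP_ROUTES = {"160.10.0.0/16": "192.20.2.1"}
IGP_ROUTES = {"192.20.2.0/30": ("Serial0", "10.1.2.2")}  # prefix -> (interface, IGP next hop)


def longest_match(address, table):
    """Return the most specific prefix in `table` that contains `address`."""
    addr = ipaddress.ip_address(address)
    matches = [p for p in table if addr in ipaddress.ip_network(p)]
    return max(matches, key=lambda p: ipaddress.ip_network(p).prefixlen, default=None)


def resolve(destination):
    """Resolve a destination to an outgoing interface via the BGP next hop."""
    bgp_prefix = longest_match(destination, BGP_ROUTES)
    if bgp_prefix is None:
        return None
    bgp_next_hop = BGP_ROUTES[bgp_prefix]
    igp_prefix = longest_match(bgp_next_hop, IGP_ROUTES)
    if igp_prefix is None:
        return None  # next hop unreachable, so the BGP route cannot be used
    interface, igp_next_hop = IGP_ROUTES[igp_prefix]
    return {"bgp_next_hop": bgp_next_hop, "interface": interface, "via": igp_next_hop}
```

Because the second pass goes through the IGP, a change in the internal topology only changes the IGP entry; the BGP route itself stays untouched.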
BGP Updates —
Withdrawn Routes

• Used to “withdraw” network reachability


• Each Withdrawn Route is composed of:
– Network Prefix
– Mask Length
BGP Updates —
Withdrawn Routes

[Figure: AS 321 and AS 123 peer over 192.168.10.0/24 (.1 and .2). Connectivity to 192.192.25.0/24 is lost, and a BGP Update message with Withdrawn Routes: 192.192.25.0/24 is sent.]

Network Next-Hop Path
150.10.0.0/16 192.168.10.2 321 200
192.192.25.0/24 192.168.10.2 321
BGP Routing Information Base
BGP RIB
Network Next-Hop Path
*>i160.10.1.0/24 192.20.2.2 i
*>i160.10.3.0/24 192.20.2.2 i

router bgp 100
network 160.10.0.0 mask 255.255.0.0
no auto-summary

Route Table
D 10.1.2.0/24
D 160.10.1.0/24
D 160.10.3.0/24
R 153.22.0.0/16
S 192.1.1.0/24

BGP ‘network’ commands are normally used to populate the BGP RIB
with routes from the Route Table
BGP Routing Information Base
BGP RIB
Network Next-Hop Path
*> 160.10.0.0/16 0.0.0.0 i
* i 192.20.2.2 i
s> 160.10.1.0/24 192.20.2.2 i
s> 160.10.3.0/24 192.20.2.2 i

router bgp 100
network 160.10.0.0 mask 255.255.0.0
aggregate-address 160.10.0.0 255.255.0.0 summary-only
no auto-summary

Route Table
D 10.1.2.0/24
D 160.10.1.0/24
D 160.10.3.0/24
R 153.22.0.0/16
S 192.1.1.0/24

BGP ‘aggregate-address’ commands may be used to install summary
routes in the BGP RIB
BGP Routing Information Base
BGP RIB
Network Next-Hop Path
*> 160.10.0.0/16 0.0.0.0 i
* i 192.20.2.2 i
s> 160.10.1.0/24 192.20.2.2 i
s> 160.10.3.0/24 192.20.2.2 i
*> 192.1.1.0/24 192.20.2.2 ?

router bgp 100
network 160.10.0.0 mask 255.255.0.0
redistribute static route-map foo
no auto-summary

access-list 1 permit 192.1.0.0 0.0.255.255

route-map foo permit 10
match ip address 1

Route Table
D 10.1.2.0/24
D 160.10.1.0/24
D 160.10.3.0/24
R 153.22.0.0/16
S 192.1.1.0/24

BGP ‘redistribute’ commands can also be used to populate the BGP RIB
with routes from the Route Table
BGP Routing Information Base
[Figure: Update -> IN Process -> BGP RIB -> OUT Process -> Update]

Update received by the IN Process:
Network Next-Hop Path
173.21.0.0/16 192.20.2.1 100

BGP RIB
Network Next-Hop Path
*>i160.10.1.0/24 192.20.2.2 i
*>i160.10.3.0/24 192.20.2.2 i
*> 173.21.0.0/16 192.20.2.1 100

• BGP “in” process
• receives path information from peers
• results of BGP path selection placed in the BGP table
• “best path” flagged (denoted by “>”)
BGP Routing Information Base
[Figure: Update -> IN Process -> BGP RIB -> OUT Process -> Update]

BGP RIB
Network Next-Hop Path
*>i160.10.1.0/24 192.20.2.2 i
*>i160.10.3.0/24 192.20.2.2 i
*> 173.21.0.0/16 192.20.2.1 100

Update sent by the OUT Process:
Network Next-Hop Path
160.10.1.0/24 192.20.2.2 200
160.10.3.0/24 192.20.2.2 200
173.21.0.0/16 192.20.2.2 200 100 (Next-Hop changed from 192.20.2.1)

• BGP “out” process
• builds update using info from RIB
• may modify update based on config
• Sends update to peers
BGP Routing Information Base
BGP RIB
Network Next-Hop Path
*>i160.10.1.0/24 192.20.2.2 i
*>i160.10.3.0/24 192.20.2.2 i
*> 173.21.0.0/16 192.20.2.1 100

Route Table
D 10.1.2.0/24
D 160.10.1.0/24
D 160.10.3.0/24
R 153.22.0.0/16
S 192.1.1.0/24
B 173.21.0.0/16

• Best paths installed in routing table if:
• prefix and prefix length are unique
• lowest “protocol distance”
The ‘Bible’ & other resources
• Route-views.oregon-ix.net

• Internet Routing Architectures


– Bassam Halabi
– pg. 168 BGP Decision Process Summary
Types of BGP Messages
• OPEN
– To negotiate and establish peering
• UPDATE
– To exchange routing information
• KEEPALIVE
– To maintain peering session
• NOTIFICATION
– To report errors (results in session reset)
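
Every one of these messages starts with the same fixed 19-octet header: a 16-octet marker of all ones, a 2-octet total length, and a 1-octet type code (1 OPEN, 2 UPDATE, 3 NOTIFICATION, 4 KEEPALIVE). The sketch below packs and parses that header with Python's `struct` module; the helper names `build_message` and `parse_header` are hypothetical, and message bodies are treated as opaque bytes.

```python
import struct

MESSAGE_TYPES = {1: "OPEN", 2: "UPDATE", 3: "NOTIFICATION", 4: "KEEPALIVE"}
HEADER = struct.Struct("!16sHB")  # 16-octet marker, 2-octet length, 1-octet type
MARKER = b"\xff" * 16


def build_message(msg_type, body=b""):
    """Prefix an opaque message body with the common BGP header."""
    return HEADER.pack(MARKER, HEADER.size + len(body), msg_type) + body


def parse_header(data):
    """Return (type name, total length) from the start of a BGP message."""
    marker, length, msg_type = HEADER.unpack_from(data)
    if marker != MARKER:
        raise ValueError("bad marker")
    return MESSAGE_TYPES.get(msg_type, "UNKNOWN"), length
```

A KEEPALIVE has no body, so it is just the header.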
Internal BGP Peering (IBGP)
AS 100
D
A
B

• BGP peer within the same AS


• Not required to be directly connected
• Maintain full IBGP mesh or use Route Reflection
External BGP Peering (EBGP)

AS 100 AS 101
C

• Between BGP speakers in different AS


• Directly connected or peering address is reachable
An Example…
[Figure: routers A to F across AS3561, AS200, AS21, AS101 and AS675; prefix 35.0.0.0/8]

Learns about 35.0.0.0/8 from F & D


Update message

• Withdrawn routes
• Path Attributes
• Advertised routes
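
A minimal sketch of how a receiver might apply the three parts of an Update to its table: withdrawn prefixes are removed for the sending peer, and every advertised prefix is stored with the shared path attributes. The `rib` layout (prefix -> peer -> attributes) and the function name `apply_update` are assumptions made for illustration.

```python
def apply_update(rib, peer, withdrawn=(), attributes=None, advertised=()):
    """Apply one UPDATE from `peer` to `rib` (dict: prefix -> {peer: attributes})."""
    for prefix in withdrawn:
        paths = rib.get(prefix, {})
        paths.pop(peer, None)      # remove only this peer's path
        if not paths:
            rib.pop(prefix, None)  # no paths left, so the prefix is unreachable
    for prefix in advertised:
        # All prefixes advertised in one UPDATE share the same path attributes.
        rib.setdefault(prefix, {})[peer] = dict(attributes or {})
    return rib
```

With the withdrawn-route slide's values, the peer 192.168.10.2 first advertises 150.10.0.0/16 and 192.192.25.0/24, then withdraws 192.192.25.0/24, leaving only 150.10.0.0/16 in the table.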
BGP Path Attributes: Why ?
• Encoded as Type, Length & Value (TLV)
• Transitive/Non-Transitive attributes
• Some are mandatory
• Used in path selection
• To apply policy for steering traffic
BGP Path Attributes...

• Origin
• AS-path
• Next-hop
• Multi-Exit Discriminator (MED)
• Local preference
• BGP Community
• Others...
AS-PATH

• Updated by the sending router with its AS number


• Contains the list of AS numbers the update traverses.
• Used to detect routing loops
– Each time the router receives an update, if it finds its AS
number, it discards the update
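
The two rules above (prepend your own AS number when sending, discard when you see your own AS number on receipt) fit in a couple of small functions. This is a simplified sketch with hypothetical names; it models the AS path as a plain list of AS numbers and ignores AS sets and confederations.

```python
def advertise(local_as, as_path):
    """AS path as sent to an eBGP peer: the sender prepends its own AS number."""
    return [local_as] + list(as_path)


def accept(local_as, as_path):
    """Return True if a received update passes loop detection at local_as."""
    return local_as not in as_path  # own AS already in the path means a loop
```

Following 180.10.0.0/16 from AS 100 through AS 200 and AS 300 yields the path 300 200 100 seen in the figures; AS 500 accepts it, while AS 100 would discard it.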
AS-Path
• Sequence of ASes a route has traversed
• Loop detection
[Figure: AS 100 (180.10.0.0/16), AS 200 (170.10.0.0/16), AS 300, AS 400 (150.10.0.0/16) and AS 500; an update for 180.10.0.0/16 is dropped by loop detection]

Network Path
180.10.0.0/16 300 200 100
170.10.0.0/16 300 200
150.10.0.0/16 300 400
Next-Hop
[Figure: router A (150.10.1.1) in AS 200 (150.10.0.0/16) peers with router B (150.10.1.2) in AS 300; AS 100 (160.10.0.0/16) is also shown]

Network Next-Hop
150.10.0.0/16 150.10.1.1
160.10.0.0/16 150.10.1.1

• Next hop router to reach a network
• Advertising router/Third party in EBGP
• Unmodified in IBGP



Third Party Next Hop
[Figure: router C (150.1.1.1) in AS 200 has a peering with router A (150.1.1.2) in AS 201; router B (150.1.1.3) is also in AS 201; 192.68.1.0/24 is advertised with next hop 150.1.1.3]

• More efficient, but bad idea!
Next Hop...

• IGP should carry route to next hops


• Recursive route look-up
• Unlinks BGP from actual physical topology
• Allows IGP to make intelligent forwarding
decision
Local Preference

• Not for EBGP, mandatory for IBGP


• Default value is 100 on Ciscos
• Local to an AS
• Used to prefer one exit over another
• Path with highest local preference wins
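Local Preference
[Figure: AS 100 (160.10.0.0/16) is reachable from AS 400 (routers A, B and C) through AS 200 (router D, local preference 500) or AS 300 (router E, local preference 800)]

160.10.0.0/16 500
> 160.10.0.0/16 800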
Multi-Exit Discriminator

• Non-transitive
• Represented as a numeric value (0-0xffffffff)
• Used to convey the relative preference of entry points
• Comparable if paths are from the same AS
• Path with lower MED wins
• IGP metric can be conveyed as MED
Multi-Exit Discriminator (MED)
[Figure: AS 201 (routers A and B, 192.68.1.0/24) advertises 192.68.1.0/24 to router C in AS 200 with MED 2000 on one link and MED 1000 on the other; the MED 1000 path is preferred]
Origin

• Conveys the origin of the prefix


• Three values:
– IGP - Generated using “network” statement
• ex: network 35.0.0.0
– EGP - Redistributed from EGP
– Incomplete - Redistribute IGP
• ex: redistribute ospf
• IGP < EGP < INCOMPLETE
Communities

• Transitive, Non-mandatory
• Represented as a numeric value (0-0xffffffff)
• Used to group destinations
• Each destination could be member of multiple
communities
• Flexibility to scope a set of prefixes within or across
AS for applying policy
Community...
[Figure: customer AS 201 (routers A and B, 192.68.1.0/24) connects to service provider AS 200 (routers C and D); one link carries Community:201:110 and the other Community:201:120]

Community Local Preference
201:110 110
201:120 120
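
The community-to-local-preference table on this slide can be read as a simple lookup performed by the provider on received routes. The sketch below assumes the common `AS:value` notation maps to a 32-bit number with the AS in the upper 16 bits; the function names, and the choice of taking the highest local preference when several communities match, are illustrative assumptions rather than the provider's actual policy.

```python
def parse_community(text):
    """Convert 'AS:value' notation into the 32-bit community number."""
    high, low = (int(part) for part in text.split(":"))
    if not (0 <= high <= 0xFFFF and 0 <= low <= 0xFFFF):
        raise ValueError("each half must fit in 16 bits")
    return (high << 16) | low


# Mapping taken from the slide's table.
COMMUNITY_TO_LOCAL_PREF = {
    parse_community("201:110"): 110,
    parse_community("201:120"): 120,
}
DEFAULT_LOCAL_PREF = 100  # the Cisco default quoted on the Local Preference slide


def local_pref_for(communities):
    """Pick a local preference for a route from its list of communities."""
    prefs = [
        COMMUNITY_TO_LOCAL_PREF[c]
        for c in map(parse_community, communities)
        if c in COMMUNITY_TO_LOCAL_PREF
    ]
    # Assumption: if several communities match, the highest preference wins.
    return max(prefs, default=DEFAULT_LOCAL_PREF)
```

A route tagged 201:120 ends up with local preference 120, so the customer steers the provider's exit choice without the provider changing its configuration per prefix.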
Synchronization
[Figure: AS 1880 with routers A, B, C and D running OSPF; neighboring AS 690 and AS 209; prefix 35/8]

• C not running BGP (non-pervasive BGP)
• A won’t advertise 35/8 to D until the IGP is in sync
• Turn synchronization off!
– Run pervasive BGP

router bgp 1880
no sync
BGP Route Selection (bestpath)
Only one path as the bestpath !

• Route has to be synchronized


Prefix in forwarding table
• Next-hop has to be accessible
Next-hop in forwarding table
• Largest weight
Local to the router
• Largest local preference
Spread within AS
• Locally sourced
Via redistribute or network statement
BGP Route Selection ...
• Shortest AS-path length
number of ASes in the AS-path attribute
• Lowest origin
IGP < EGP < INCOMPLETE
• Lowest MED
between paths from same AS
• External over internal
closest exit from a router
• Closest next-hop
Lower IGP metric, closer exit from an AS
• Lowest router-id
• Lowest IP address of neighbor
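
The tie-breaking order on these two slides can be sketched as a chain of filters, each keeping only the candidates that are best at one step. This is a simplified model, not a vendor implementation: it skips the synchronization and next-hop reachability checks, the `Path` fields and helper names are hypothetical, and MED is compared only among paths whose AS path starts with the same neighboring AS.

```python
from dataclasses import dataclass, field
from ipaddress import ip_address

ORIGIN_RANK = {"IGP": 0, "EGP": 1, "INCOMPLETE": 2}


@dataclass
class Path:
    neighbor: str                 # neighbor IP address
    router_id: str
    as_path: list = field(default_factory=list)
    weight: int = 0
    local_pref: int = 100
    locally_sourced: bool = False
    origin: str = "IGP"
    med: int = 0
    external: bool = True         # learned via eBGP
    igp_metric: int = 0           # IGP cost to the next hop


def keep_best(candidates, key):
    """Keep the candidates with the smallest key value."""
    best = min(key(p) for p in candidates)
    return [p for p in candidates if key(p) == best]


def first_as(path):
    return path.as_path[0] if path.as_path else None


def keep_lowest_med_per_as(candidates):
    """Within each neighboring AS, keep only the paths with the lowest MED."""
    lowest = {}
    for p in candidates:
        lowest[first_as(p)] = min(p.med, lowest.get(first_as(p), p.med))
    return [p for p in candidates if p.med == lowest[first_as(p)]]


def best_path(paths):
    c = list(paths)
    c = keep_best(c, lambda p: -p.weight)              # largest weight
    c = keep_best(c, lambda p: -p.local_pref)          # largest local preference
    c = keep_best(c, lambda p: not p.locally_sourced)  # locally sourced first
    c = keep_best(c, lambda p: len(p.as_path))         # shortest AS path
    c = keep_best(c, lambda p: ORIGIN_RANK[p.origin])  # IGP < EGP < INCOMPLETE
    c = keep_lowest_med_per_as(c)                      # lowest MED, same AS only
    c = keep_best(c, lambda p: not p.external)         # external over internal
    c = keep_best(c, lambda p: p.igp_metric)           # closest next hop
    c = keep_best(c, lambda p: int(ip_address(p.router_id)))
    c = keep_best(c, lambda p: int(ip_address(p.neighbor)))
    return c[0]
```

Because each step only narrows the candidate list, a later attribute such as MED matters only when every earlier attribute is tied.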
BGP Route Selection...
[Figure: AS 400 (routers A and B) reaches AS 100 through AS 200 or through AS 300 (router D)]

AS 400’s Policy to reach AS100
AS 200 preferred path
AS 300 backup
Increase AS path attribute length by at least 1
OTV



Overlay Transport Virtualization
(OTV)
• Overlay Transport Virtualization (OTV) introduces the concept of “MAC routing,”
which means a control plane protocol is used to exchange MAC
reachability information between network devices providing LAN
extension functionality.
• This is a significant shift from Layer 2 switching that traditionally
leverages data plane learning, and it is justified by the need to
limit flooding of Layer 2 traffic across the transport infrastructure.



Overlay Transport Virtualization
(OTV)
• Layer 2 communication between sites resembles routing more
than switching.
• If the destination MAC address information is unknown, traffic is
dropped (not flooded), preventing the waste of precious
bandwidth across the WAN.



Overlay Transport Virtualization
(OTV)
• OTV also introduces the concept of the dynamic encapsulation for
Layer 2 flows that need to be sent to remote locations.
• Each Ethernet frame is individually encapsulated into an IP packet
and delivered across the transport network.
• This eliminates the need to establish virtual circuits, called
pseudowires, between the data center locations.
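
To make the "MAC routing" idea concrete, here is a toy model of an edge device's forwarding decision: MAC reachability is installed by the control plane, a frame to a known remote MAC is wrapped for delivery to the owning edge device, and a frame to an unknown MAC is dropped rather than flooded. The class, method names and dictionary-based "encapsulation" are hypothetical stand-ins and do not reflect the actual OTV packet format.

```python
class OtvEdgeDevice:
    """Toy model of MAC routing on an OTV edge device (illustrative only)."""

    def __init__(self):
        # (vlan, mac) -> IP address of the remote edge device that owns the MAC
        self.mac_table = {}

    def control_plane_update(self, vlan, mac, remote_ed_ip):
        """Install a MAC route advertised by a remote edge device."""
        self.mac_table[(vlan, mac.lower())] = remote_ed_ip

    def forward_to_overlay(self, vlan, dst_mac, frame):
        """Return the encapsulated frame, or None if the frame is dropped."""
        remote = self.mac_table.get((vlan, dst_mac.lower()))
        if remote is None:
            return None  # unknown unicast: dropped, not flooded across the WAN
        # Stand-in for OTV encapsulation: the frame becomes the payload of an
        # IP packet addressed to the remote edge device.
        return {"outer_dst_ip": remote, "payload": frame}
```

Each frame is encapsulated on its own, so no per-site tunnel state has to exist before traffic can flow.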



Overlay Transport Virtualization
(OTV)
• Immediate advantages include improved flexibility when adding or
removing sites to the overlay, more optimal bandwidth utilization
across the WAN (specifically when the transport infrastructure is
multicast enabled), and independence from the transport
characteristics (Layer 1, Layer 2, or Layer 3).
• OTV provides a native built-in multihoming capability with
automatic detection.



Overlay Transport Virtualization
(OTV)
• Two or more devices can be leveraged in each data center to
provide LAN extension functionality without running the risk of
creating an end-to-end loop that would jeopardize the overall
stability of the design.
• This is achieved by leveraging the same control plane protocol
used for the exchange of MAC address information, without the
need of extending the Spanning Tree Protocol (STP) across the
overlay.



Overlay Transport Virtualization
(OTV)
OTV Terminology
To understand how OTV works in an existing IP transport
environment, the OTV interfaces and terms are defined below.



Overlay Transport Virtualization
(OTV)
OTV Terminology
Edge device (ED): This device connects the site to the (WAN/MAN)
core and is responsible for performing all the OTV functions.
• An edge device receives Layer 2 traffic for all VLANs that need to
be extended to remote locations and dynamically encapsulates
the Ethernet frames into IP packets that are then sent across the
OTV transport infrastructure.
• For resiliency, two OTV edge devices can be deployed on each
site to provide redundancy.



Overlay Transport Virtualization
(OTV)
OTV Terminology
Internal interfaces: These are the L2 interfaces (usually 802.1q
trunks) of the ED that face the site.

• Internal interfaces are regular access or trunk ports.



Overlay Transport Virtualization
(OTV)
OTV Terminology

• Trunk configuration will extend more than one VLAN across the
overlay. There is no need to apply OTV-specific configuration to
these interfaces.
• Typical Layer 2 functions (like local switching, spanning tree
operation, data plane learning, and flooding) are performed on the
internal interfaces.



Overlay Transport Virtualization
(OTV)
OTV Terminology
Join interface: This is the L3 interface of the ED that faces the core.
The join interface is used by the edge device for different purposes:

• “Join” the overlay network and discover the other remote OTV
edge devices.
• Form OTV adjacencies with the other OTV edge devices
belonging to the same VPN.



Overlay Transport Virtualization
(OTV)
OTV Terminology

• Send/receive MAC reachability information.


• Send/receive unicast and multicast traffic.



Overlay Transport Virtualization
(OTV)
OTV Terminology
Overlay interface: This is a logical multiaccess multicast-capable
interface. It encapsulates Layer 2 frames in IP unicast or multicast
headers.
Every time the OTV edge device receives a Layer 2 frame destined
for a remote data center site, the frame is logically forwarded to the
overlay interface.
This instructs the edge device to perform the dynamic OTV
encapsulation on the Layer 2 packet and send it to the join interface
toward the routed domain.
Overlay Transport Virtualization
(OTV)
OTV Control Plane Function
• The principle of OTV is to build a control plane between the OTV
edge devices to advertise MAC address reachability information
instead of using data plane learning.
• However, before MAC reachability information can be exchanged,
all OTV edge devices must become “adjacent” to each other from
an OTV perspective.



Overlay Transport Virtualization
(OTV)
OTV Control Plane Function
Edge devices can be made adjacent in two ways, depending on the
nature of the transport network interconnecting the various sites:

• If the transport is multicast enabled, a specific multicast group can
be used to exchange the control protocol messages between the
OTV edge devices.



Overlay Transport Virtualization
(OTV)
OTV Control Plane Function
• If the transport is not multicast enabled, an alternative deployment
model is available in which one (or more) OTV edge devices are
configured as an adjacency server. All other edge devices register
with this server, which communicates to them the list of devices
belonging to a given overlay.



Overlay Transport Virtualization
(OTV)
OTV Data Plane Function
• After the control plane adjacencies are established between the OTV
edge devices and MAC address reachability information is
exchanged, traffic can start flowing across the overlay. As with
any Layer 2 switch, data plane traffic can be:
• Unicast traffic
• Multicast traffic
• Broadcast traffic



VPLS



Virtual Private LAN Service
(VPLS)
• Virtual private LAN service (VPLS) is a type of virtual private
network technology that enables the connection of one or more
local area networks (LANs) over the Internet through a single
bridged connection.
• VPLS uses Internet Protocol/Multiprotocol Label Switching (IP/MPLS) to
provide an Ethernet interface or connection to the customers or
subscribers on an Internet connection.
• This is done through the use of edge routers.



Virtual Private LAN Service
(VPLS)
• VPLS is primarily implemented to provide remote
subscribers with a LAN-type connection over a virtual
private network (VPN).
• VPLS supports most network connectivity types, including
point-to-point, point-to-multipoint, multipoint-to-point and
more.
• When subscribers log in to the virtual private LAN, they
are provided with the features and impression of a
standard LAN connection.
Virtual Private LAN Service
(VPLS)
• The VPLS uses a VPN to create and manage
connections and to move subscriber data within the
network.
• Moreover, the subscriber can change location and
still connect to the virtual LAN.



Virtual Private LAN Service
(VPLS)
• VPLS allows you to securely connect multiple LANs
over the internet, making them appear as if they
were all on the same LAN (hence the “virtual”
qualifier) even though traffic is moving across a
service provider’s network.
• With VPLS, you are able to extend a Layer 2
network across geographically dispersed sites
using a shared core network infrastructure.
Virtual Private LAN Service
(VPLS)
• A VPLS network consists of three core
components:
customer edge equipment (CE),
provider edge equipment (PE), and a
core MPLS network.
• CE devices are routers or switches that lie at the
customer premises.
Virtual Private LAN Service
(VPLS)
• PE devices are routers at the service provider’s
network edge where VPN intelligence resides and
terminates, and where tunnels are set up to
connect to other PEs in point-to-multipoint format.
• The core MPLS network resides between PEs and
switches traffic based on MPLS labels.



Virtual Private LAN Service
(VPLS)
• The full mesh of MPLS tunnels between PE
devices is referred to as a set of “pseudowires”
and makes up the core of the VPLS instance.

Virtual Private LAN Service
(VPLS)
• VPLS has lots of common elements with MPLS service.
Like MPLS, VPLS provides similar levels of QoS and
network visibility.
• As mentioned above, VPLS effectively sits on top of an
MPLS network with the MPLS acting as the “engine” for
traffic routing.
• As with MPLS, VPLS needs to be rolled out with a one-carrier
approach, using the same carrier at each site within the
VPLS network.
Virtual Private LAN Service
(VPLS)
• However, there are a few distinct differences worth noting
that may dictate WAN architecture choices.
• VPLS sets up virtualized LAN-like environments with
static routing between sites, while MPLS has dynamic
routing capabilities that may come in handy in the event of
a network outage.
• VPLS places network control primarily on the customer
side, while MPLS places network control primarily in the
hands of the carrier.
Virtual Private LAN Service
(VPLS)
• With VPLS, each location and device appears to be within
the same LAN, using the same range of IP addresses.
• This approach allows end users to utilize their own
equipment and take control of Layer 3 traffic without
depending on the carrier for this.
• Also, because you’re using only one set of IP addresses,
troubleshooting any network element changes is
simplified a bit.



Virtual Private LAN Service
(VPLS)
• For smaller WANs where control, ease of install, and cost
are paramount, VPLS likely reigns supreme.
• For larger WANs with more mission-critical traffic (voice or
important data services), MPLS may be better suited.
• MPLS’s self-healing routes provide improved redundancy,
and the lack of control it provides comes with less
administrative burden and an improved ability to scale
across higher site counts.
