Understanding Policy-Based Routing in Data Center Fabrics: VXLAN EVPN

I. Executive Summary: Policy-Based Routing in the DC Fabric

The modern Data Center (DC) fabric, built upon VXLAN EVPN principles, mandates a flexible and resilient approach to traffic management, particularly for North-South (N/S) flows exiting the environment. Conventional routing, based on the destination IP address (longest prefix match), efficiently handles standard traffic delivery, but specific compliance, security, and application optimization requirements often necessitate diverting traffic from its calculated optimal path. This concept, often referred to as “traffic steering,” requires the selective redirection of traffic flows originating from specific internal subnets. The target is either a Layer 4 through Layer 7 (L4-L7) service node, such as a firewall, load balancer, or deep packet inspection (DPI) platform, or an external path that differs from the one the routing protocol calculates and installs for the external destination.

Policy-Based Routing (PBR) is the foundational mechanism for achieving this redirection. It uses access control lists (ACLs) to match criteria beyond the destination IP, combined with route maps that define an alternate next-hop. This blog focuses on selectively redirecting traffic flows that originate from specific internal subnets toward an external path that differs from the one the routing protocol calculates and installs for the external destination.

II. Architectural Foundation: The Border Leaf Policy Enforcement Point

The Border Leaf (BL) switch plays a pivotal role in a VXLAN EVPN fabric, serving as the critical demarcation point for all N/S connectivity. It acts as a VXLAN Tunnel Endpoint (VTEP) for transit traffic, facilitating the necessary protocol and encapsulation translation between the internal VXLAN domain (BGP EVPN) and the external network (traditional IP/BGP routing).   

The Border Leaf North-South Handoff Role

Within the context of VXLAN EVPN, the Border Leaf terminates the VXLAN encapsulation for traffic destined outside the fabric. When an internal Leaf VTEP sends VXLAN-encapsulated traffic destined for an external prefix, the Border Leaf receives the packet, decapsulates it, and performs a Layer 3 lookup within the relevant tenant VRF. The Border Leaf typically learns external network routes via an external BGP (eBGP) session or another routing protocol and advertises them into the fabric as EVPN Type-5 IP prefix routes, which are then propagated throughout the fabric via the EVPN control plane.

Because the Border Leaf is the device that performs the ultimate L3 routing decision for egress traffic, it represents the ideal, centralized point of policy enforcement. The policy application must take place at the Layer 3 termination point of the tenant segment, which is the Switch Virtual Interface (SVI) associated with the L3 Virtual Network Identifier (L3VNI) and the specific Tenant VRF. Applying the policy to the ingress direction of this SVI ensures that the policy mechanism can inspect the source IP address (e.g., the internal subnet requiring steering) immediately after decapsulation and before the routing lookup consults the VRF’s FIB (Forwarding Information Base).   

Policy Application Context on the SVI

Traffic originating from an internal Leaf (Host VTEP) destined for the external network first arrives at the Border Leaf via the VXLAN tunnel. It is then mapped to the appropriate Tenant VRF via the L3VNI. The Border Leaf performs the decapsulation, and the packet enters the local ingress SVI of the tenant VRF. If the policy is applied to this SVI, the match criteria (the source IP of the internal subnet) are visible, and the policy can override the default routing behavior (e.g., routing to the WAN Edge Router via a default route) by redirecting the packet to an alternative next-hop (e.g., a firewall, or secondary path).   
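As a quick sanity check of the mapping described above, the following show commands can confirm that the L3VNI is associated with the tenant VRF and that the SVI belongs to that VRF. This is a sketch that assumes the VRF name (PROD) and SVI (Vlan300) used in the configuration later in this post. Output formats vary by release, so none is reproduced here.

```
! L3VNI-to-VRF association on the NVE interface
show nve vni

! Interfaces that are members of the tenant VRF
show vrf PROD interface

! Configuration of the L3VNI SVI (vrf member, ip forward, ip policy)
show running-config interface Vlan300
```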

If a packet matches a permit statement of the PBR policy on the Border Leaf, it is steered toward the service next-hop. If it matches a deny statement, or if no match occurs, the packet falls through and is subjected to the normal destination-based routing lookup defined in the VRF’s routing table. This architectural placement is crucial for ensuring that policy steering is applied only to selected flows, while the vast majority of traffic maintains efficient, hardware-forwarded default routing.

Network Topology

A physical and logical topology illustrating this scenario typically involves the following elements:

  1. Internal Fabric (VXLAN EVPN): Leaf switches (Host VTEPs) and Spine switches (Route Reflectors/Underlay transit).
  2. Border Leaf (Policy Enforcement Point): NX-OS switch acting as a VTEP, with physical links to the external devices.
  3. External Network: An Edge Router or WAN device, typically the default gateway for external prefixes, reachable through a VRF-Lite peering link.
  4. Service Node (Alternate Path): An alternate Edge Router connected to the Border Leaf, used for specialized service access.

III. Redirecting internal subnet traffic to specialized services

To demonstrate the application of traffic steering at the Border Leaf, consider redirecting traffic from an internal subnet (10.10.30.0/24) to a specialized service (host 192.168.10.100).

Scenario Definition

A multi-tenant VXLAN EVPN fabric hosts several business units. VRF PROD contains mission-critical applications, including a sensitive Internal Subnet (10.10.30.0/24). Due to strict regulatory compliance mandates, systems in this subnet need to send data to an external server for deep packet inspection and comprehensive logging by a highly resilient SaaS service (represented by 192.168.10.100 in this lab).

  • Internal Subnet (Source): 10.10.30.0/24 within VRF PROD.
  • External Destination (Steered): The host 192.168.10.100/32, reached as N/S traffic.
  • Default Path (Standard Traffic): The default route (0.0.0.0/0) in VRF PROD points directly to the external Edge Router (10.1.3.1).
  • Alternate Path (Steered Traffic): Traffic from 10.10.30.0/24 to 192.168.10.100/32 in VRF PROD is forwarded to the alternate external Edge Router (10.1.3.5).

The objective is to implement a policy on the Border Leaf such that only traffic sourced from 10.10.30.0/24 and destined for 192.168.10.100/32 is directed to the alternate Edge Router, while all other traffic within VRF PROD follows the normal default route to the external Edge Router.

Configuration Steps

1. Feature and TCAM resource check

  • TCAM: Verify that the Ingress RACL region [ing-racl] is carved with a non-zero size. Changing the hardware TCAM carving requires a reload.
  • PBR Feature: Policy-Based Routing functionality must be enabled: LEAF-BORDER-1(config)# feature pbr
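A minimal sketch of the check above, assuming a platform where PBR entries are programmed into the ing-racl region as this post describes. The size 512 is an illustrative assumption. The right value, and which other region may have to shrink to make room, depend on the platform and on what is already carved.

```
! Current TCAM carving; look for the ing-racl (Ingress RACL) row
show hardware access-list tcam region

configure terminal
  ! Only needed if ing-racl has no space; 512 is an assumed example value
  hardware access-list tcam region ing-racl 512
  feature pbr
end

! TCAM carving changes take effect only after a save and reload
copy running-config startup-config
reload
```

Enabling feature pbr before the reload is harmless, but the policy cannot be programmed in hardware until the carved region is active.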

2. PBR Configuration Flow

Policy-Based Routing in NX-OS follows a straightforward three-step process: matching the traffic of interest (using an ACL), defining the policy action (using a route-map to set a next-hop), and applying the policy to the ingress interface.

  1. Define the Source Traffic (ACL): An extended IP access list is created to explicitly match the source subnet and the desired destination (in this case, the host 192.168.10.100/32).
  2. Define the Policy (Route-Map): A route-map is created to permit the matching traffic and apply the redirection action using the set ip next-hop command. Traffic that does not match the permitted ACL implicitly falls through to be handled by the normal routing table lookups.
  3. Apply Policy to Tenant SVI: The policy is applied to the ingress direction of the Tenant VRF SVI on the Border Leaf, ensuring the policy is processed as soon as the packet exits the VXLAN tunnel.

# Access list to match the source and destination IPs / subnets
ip access-list PBR
  10 permit ip 10.10.30.0/24 192.168.10.100/32 

# Route map to apply the action for the matched traffic
route-map PBR-RM permit 10
  match ip address PBR 
  set ip next-hop 10.1.3.5 
route-map PBR-RM pbr-statistics

# apply the route map in the ingress SVI
interface Vlan300
  description PROD-VRF
  no shutdown
  mtu 9216
  vrf member PROD
  ip forward
  ip policy route-map PBR-RM
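
Section II notes that a deny result sends the packet back to normal destination-based routing. The fragment below is a minimal sketch of that behavior, assuming a hypothetical host 10.10.30.250 that should be exempt from steering and a hypothetical ACL name PBR-EXEMPT (neither is part of the lab). Support for deny statements in PBR route maps can vary by platform and release, so check the PBR guidelines for your hardware before relying on it.

```
! Hypothetical exemption: this host keeps using the default route
ip access-list PBR-EXEMPT
  10 permit ip 10.10.30.250/32 192.168.10.100/32

! Lower sequence numbers are evaluated first; deny means "do not policy-route"
route-map PBR-RM deny 5
  match ip address PBR-EXEMPT

! Existing steering statement from the lab
route-map PBR-RM permit 10
  match ip address PBR
  set ip next-hop 10.1.3.5
```

Keeping the exemption in its own route-map sequence, rather than adding deny entries to the PBR ACL itself, leaves the steering ACL as a plain list of permits.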

IV. Testing & Verification

1. Ping 192.168.10.100 from 10.10.30.100 (an IP in the 10.10.30.0/24 subnet) BEFORE ‘ip policy route-map PBR-RM’ is applied to the Vlan300 SVI.



Endpoint# ping 192.168.10.100 vrf prOD 
PING 192.168.10.100 (192.168.10.100): 56 data bytes
Request 0 timed out
Request 1 timed out
Request 2 timed out
Request 3 timed out
Request 4 timed out

2. Ping 192.168.10.100 from 10.10.30.100 (an IP in the 10.10.30.0/24 subnet) AFTER ‘ip policy route-map PBR-RM’ is applied to the Vlan300 SVI.

Endpoint# ping 192.168.10.100 vrf prOD  # PBR policy is applied while ping is going - ping started replying
PING 192.168.10.100 (192.168.10.100): 56 data bytes
Request 0 timed out
Request 1 timed out
Request 2 timed out
64 bytes from 192.168.10.100: icmp_seq=3 ttl=252 time=1.159 ms
64 bytes from 192.168.10.100: icmp_seq=4 ttl=252 time=0.648 ms

Endpoint# ping 192.168.10.100 vrf prOD # ping started after PBR policy is applied
PING 192.168.10.100 (192.168.10.100): 56 data bytes
64 bytes from 192.168.10.100: icmp_seq=0 ttl=252 time=1.153 ms
64 bytes from 192.168.10.100: icmp_seq=1 ttl=252 time=0.706 ms
64 bytes from 192.168.10.100: icmp_seq=2 ttl=252 time=0.683 ms
64 bytes from 192.168.10.100: icmp_seq=3 ttl=252 time=0.734 ms
64 bytes from 192.168.10.100: icmp_seq=4 ttl=252 time=0.65 ms

Endpoint# traceroute 192.168.10.100 vrf prOD 
traceroute to 192.168.10.100 (192.168.10.100), 30 hops max, 40 byte packets
 1  10.10.30.1 (10.10.30.1)  1.052 ms  0.568 ms  0.482 ms
 2  10.1.3.2 (10.1.3.2)  23.387 ms  0.869 ms  0.696 ms
 3  192.168.10.100 (192.168.10.100)  0.923 ms  0.724 ms  0.686 ms

V. Operational Verification and Troubleshooting

Verification is paramount in a policy-driven environment to ensure that targeted traffic is being steered correctly and that untargeted traffic continues to follow the standard destination-based routing path.

Key NX-OS Verification Commands for Policy-Based Routing

Table 2: Key NX-OS Verification Commands

Function | PBR Command (NX-OS)
Feature Status | show feature pbr
Policy Application | show ip policy vrf PROD detail
Traffic Hits/Counters | show route-map PBR-RM pbr-statistics
Route-map | show route-map PBR-RM
Route Policy Manager (RPM) check | show run rpm
Next hop routing requests information | show system internal rpm pbr vrf PROD ip nexthop
Next-Hop Status | show track (if configured)
Routing Context | show ip route vrf PROD
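
The show track entry in the table only returns data if the route-map next-hop is tied to a tracking object, which the lab configuration does not do. A minimal sketch of that optional setup follows, assuming an ICMP IP SLA probe toward 10.1.3.5. The object numbers (10) and the 5-second frequency are arbitrary illustrative choices, not values from the lab, and the exact IP SLA sub-mode syntax should be checked against your release.

```
feature sla sender

! Probe the alternate Edge Router from inside the tenant VRF
ip sla 10
  icmp-echo 10.1.3.5
    vrf PROD
    frequency 5
ip sla schedule 10 life forever start-time now

! Track object follows the probe result
track 10 ip sla 10 reachability

! Replace the plain next-hop with a tracked one
route-map PBR-RM permit 10
  match ip address PBR
  no set ip next-hop 10.1.3.5
  set ip next-hop verify-availability 10.1.3.5 track 10
```

The intent of this design is that steering stops when the probe fails, so matched traffic is not forwarded toward a next-hop that is no longer reachable.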

Troubleshooting Flow

If traffic redirection fails, a systematic troubleshooting approach is necessary:

  1. Policy Interface Check: Verify that the ip policy route-map command is correctly applied to the ingress direction of the Tenant VRF SVI (Vlan300).
  2. ACL and Match Check: Check the ACL definition (show ip access-lists PBR) and the route-map counters (show route-map PBR-RM pbr-statistics). If traffic is not matching the ACL, the policy is ignored and traffic follows the default routing table.
  3. Next-Hop Reachability (Critical Step): The PBR next-hop IP address must be reachable within the VRF of the interface where the policy is applied. Check show ip route vrf PROD to ensure that a route to the next-hop is present (usually via a directly connected interface).
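
The three checks above can be run as a short command sequence on the Border Leaf. This is a sketch that assumes the names and addresses from the lab (VRF PROD, Vlan300, route-map PBR-RM, ACL PBR, next-hop 10.1.3.5). Output is omitted because its format differs between releases.

```
! 1. Is the policy attached to the tenant SVI?
show ip policy vrf PROD
show running-config interface Vlan300

! 2. Is traffic hitting the route-map? (counters need pbr-statistics enabled)
show ip access-lists PBR
show route-map PBR-RM pbr-statistics

! 3. Is the next-hop resolved inside the tenant VRF?
show ip route 10.1.3.5 vrf PROD
show ip arp 10.1.3.5 vrf PROD
ping 10.1.3.5 vrf PROD
```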

VI. Conclusion and Best Practices

For network environments built on Cisco NX-OS VXLAN EVPN, the implementation of policy-based traffic steering at the Border Leaf is essential for integrating security and compliance services and other services with special routing needs.

For successful deployment, network architects must ensure the VXLAN EVPN infrastructure is properly designed and implemented using Cisco best practices. A sample configuration of a working VXLAN EVPN fabric with PBR, based on the topology described above, is included below.

VII. Configuration Example

# Spine-1

nv overlay evpn
feature ospf
feature bgp
feature pim
feature interface-vlan
feature vn-segment-vlan-based
feature nv overlay

ip pim rp-address 10.254.254.254 group-list 239.239.239.0/24
ip pim ssm range 232.0.0.0/8
ip pim anycast-rp 10.254.254.254 10.10.100.1
ip pim anycast-rp 10.254.254.254 10.10.100.2

interface Ethernet1/50
  description Link to leaf1
  mtu 9216
  ip address 10.1.2.1/30
  ip ospf network point-to-point
  ip router ospf UNDERLAY area 0.0.0.0
  ip pim sparse-mode
  no shutdown

interface Ethernet1/51
  description Link to Leaf-2-BL
  mtu 9216
  ip address 10.1.2.5/30
  ip ospf network point-to-point
  ip router ospf UNDERLAY area 0.0.0.0
  ip pim sparse-mode
  no shutdown

interface loopback0
  description Loopback for Router ID
  ip address 10.10.100.1/32 tag 54321
  ip router ospf UNDERLAY area 0.0.0.0
  ip pim sparse-mode

interface loopback1
  description Loopback for VTEP (PIP) 
  ip address 10.10.100.11/32 tag 54321
  ip router ospf UNDERLAY area 0.0.0.0
  ip pim sparse-mode

interface loopback254
  description Loopback for PIM
  ip address 10.254.254.254/32
  ip router ospf UNDERLAY area 0.0.0.0
  ip pim sparse-mode
icam monitor scale

router ospf UNDERLAY
  router-id 10.10.100.1
  log-adjacency-changes detail
router bgp 65501
  router-id 10.10.100.1
  neighbor 10.10.100.3
    remote-as 65501
    update-source loopback0
    address-family l2vpn evpn
      send-community
      send-community extended
      route-reflector-client
  neighbor 10.10.100.4
    remote-as 65501
    update-source loopback0
    address-family l2vpn evpn
      send-community
      send-community extended
      route-reflector-client

# Spine-2

nv overlay evpn
feature ospf
feature bgp
feature pim
feature interface-vlan
feature vn-segment-vlan-based
feature nv overlay

ip pim rp-address 10.254.254.254 group-list 239.239.239.0/24
ip pim ssm range 232.0.0.0/8
ip pim anycast-rp 10.254.254.254 10.10.100.1
ip pim anycast-rp 10.254.254.254 10.10.100.2

interface Ethernet1/35
  description Link to leaf1
  mtu 9216
  ip address 10.1.2.9/30
  ip ospf network point-to-point
  ip router ospf UNDERLAY area 0.0.0.0
  ip pim sparse-mode
  no shutdown

interface Ethernet1/36
  description Link to Leaf-2-BL
  mtu 9216
  ip address 10.1.2.13/30
  ip ospf network point-to-point
  ip router ospf UNDERLAY area 0.0.0.0
  ip pim sparse-mode
  no shutdown

interface loopback0
  description Loopback for Router ID
  ip address 10.10.100.2/32 tag 54321
  ip router ospf UNDERLAY area 0.0.0.0
  ip pim sparse-mode

interface loopback1
  description Loopback for VTEP (PIP) 
  ip address 10.10.100.12/32 tag 54321
  ip router ospf UNDERLAY area 0.0.0.0
  ip pim sparse-mode

interface loopback254
  description Loopback for PIM
  ip address 10.254.254.254/32
  ip router ospf UNDERLAY area 0.0.0.0
  ip pim sparse-mode
icam monitor scale

router ospf UNDERLAY
  router-id 10.10.100.2
  log-adjacency-changes detail
router bgp 65501
  router-id 10.10.100.2
  neighbor 10.10.100.3
    remote-as 65501
    update-source loopback0
    address-family l2vpn evpn
      send-community
      send-community extended
      route-reflector-client
  neighbor 10.10.100.4
    remote-as 65501
    update-source loopback0
    address-family l2vpn evpn
      send-community
      send-community extended
      route-reflector-client

# Leaf-1

nv overlay evpn
feature ospf
feature bgp
feature pim
feature interface-vlan
feature vn-segment-vlan-based
feature nv overlay

fabric forwarding anycast-gateway-mac eeee.eeee.eeee
ip pim rp-address 10.254.254.254 group-list 239.239.239.0/24
vlan 1,20,30,40,300
vlan 20
  vn-segment 20020
vlan 30
  vn-segment 20030
vlan 40
  vn-segment 20040
vlan 300
  name PROD-VRF
  vn-segment 30300

route-map FABRIC-REDIST-SUBNET permit 10
  match tag 12345 
vrf context PROD
  vni 30300
  rd auto
  address-family ipv4 unicast
    route-target both auto
    route-target both auto evpn
  address-family ipv6 unicast
    route-target both auto
    route-target both auto evpn

interface Vlan20
  no shutdown
  vrf member PROD
  ip address 10.10.20.1/24 tag 12345
  fabric forwarding mode anycast-gateway

interface Vlan30
  no shutdown
  vrf member PROD
  ip address 10.10.30.1/24 tag 12345
  fabric forwarding mode anycast-gateway

interface Vlan300
  description PROD-VRF
  no shutdown
  mtu 9216
  vrf member PROD
  ip forward

interface nve1
  no shutdown
  host-reachability protocol bgp
  source-interface loopback1
  member vni 20020
    mcast-group 239.239.239.20
  member vni 20030
    mcast-group 239.239.239.30
  member vni 30300 associate-vrf

interface Ethernet1/49
  description Link to Spine2
  mtu 9216
  ip address 10.1.2.10/30
  ip ospf network point-to-point
  ip router ospf UNDERLAY area 0.0.0.0
  ip pim sparse-mode
  no shutdown

interface Ethernet1/50
  description Link to Spine1
  mtu 9216
  ip address 10.1.2.2/30
  ip ospf network point-to-point
  ip router ospf UNDERLAY area 0.0.0.0
  ip pim sparse-mode
  no shutdown

interface Ethernet1/52
  switchport
  switchport mode trunk
  switchport trunk allowed vlan 20,30
  no shutdown

interface loopback0
  description Loopback for Router ID
  ip address 10.10.100.3/32
  ip router ospf UNDERLAY area 0.0.0.0
  ip pim sparse-mode

interface loopback1
  description Loopback for VTEP (PIP) 
  ip address 10.10.100.13/32
  ip router ospf UNDERLAY area 0.0.0.0
  ip pim sparse-mode
icam monitor scale

router ospf UNDERLAY
  router-id 10.10.100.3
  log-adjacency-changes detail
router bgp 65501
  router-id 10.10.100.3
  neighbor 10.10.100.1
    remote-as 65501
    update-source loopback0
    address-family l2vpn evpn
      send-community
      send-community extended
  neighbor 10.10.100.2
    remote-as 65501
    update-source loopback0
    address-family l2vpn evpn
      send-community
      send-community extended
  vrf PROD
    address-family ipv4 unicast
      advertise l2vpn evpn
      redistribute direct route-map FABRIC-REDIST-SUBNET
      maximum-paths ibgp 2
evpn
  vni 20020 l2
    rd auto
    route-target import auto
    route-target export auto
  vni 20030 l2
    rd auto
    route-target import auto
    route-target export auto

# Leaf-2-BL

nv overlay evpn
feature ospf
feature bgp
feature pim
feature pbr
feature interface-vlan
feature vn-segment-vlan-based
feature nv overlay

fabric forwarding anycast-gateway-mac eeee.eeee.eeee
ip pim rp-address 10.254.254.254 group-list 239.239.239.0/24
vlan 1,20,30,40,300
vlan 300
  name PROD-VRF
  vn-segment 30300

route-map FABRIC-REDIST-SUBNET permit 10
  match tag 12345 
ip access-list PBR
  10 permit ip 10.10.30.0/24 192.168.10.100/32
route-map PBR-RM permit 10
  match ip address PBR 
  set ip next-hop 10.1.3.5
route-map PBR-RM pbr-statistics
vrf context PROD
  vni 30300
  rd auto
  address-family ipv4 unicast
    route-target both auto
    route-target both auto evpn
  address-family ipv6 unicast
    route-target both auto
    route-target both auto evpn

interface Vlan300
  description PROD-VRF
  no shutdown
  mtu 9216
  vrf member PROD
  ip forward
  ip policy route-map PBR-RM

interface nve1
  no shutdown
  host-reachability protocol bgp
  source-interface loopback1
  member vni 30300 associate-vrf

interface Ethernet1/49
  description Link to Spine2
  mtu 9216
  ip address 10.1.2.14/30
  ip ospf network point-to-point
  ip router ospf UNDERLAY area 0.0.0.0
  ip pim sparse-mode
  no shutdown

interface Ethernet1/51
  description Link to Spine1
  mtu 9216
  ip address 10.1.2.6/30
  ip ospf network point-to-point
  ip router ospf UNDERLAY area 0.0.0.0
  ip pim sparse-mode
  no shutdown

interface Ethernet1/53
  vrf member PROD
  ip address 10.1.3.2/30
  no shutdown

interface Ethernet1/54
  vrf member PROD
  ip address 10.1.3.6/30
  no shutdown

interface loopback0
  description Loopback for Router ID
  ip address 10.10.100.4/32
  ip router ospf UNDERLAY area 0.0.0.0
  ip pim sparse-mode

interface loopback1
  description Loopback for VTEP (PIP) 
  ip address 10.10.100.14/32
  ip router ospf UNDERLAY area 0.0.0.0
  ip pim sparse-mode
icam monitor scale

router ospf UNDERLAY
  router-id 10.10.100.4
  log-adjacency-changes detail
router bgp 65501
  router-id 10.10.100.4
  neighbor 10.10.100.1
    remote-as 65501
    update-source loopback0
    address-family l2vpn evpn
      send-community
      send-community extended
  neighbor 10.10.100.2
    remote-as 65501
    update-source loopback0
    address-family l2vpn evpn
      send-community
      send-community extended
  vrf PROD
    address-family ipv4 unicast
      advertise l2vpn evpn
      redistribute direct route-map FABRIC-REDIST-SUBNET
      maximum-paths ibgp 2
    neighbor 10.1.3.1
      remote-as 65502
      address-family ipv4 unicast
    neighbor 10.1.3.5
      remote-as 65503
      address-family ipv4 unicast


# Ext-router-1


feature bgp

interface Ethernet1/50
  ip address 10.1.3.1/30
  no shutdown

interface loopback100
  ip address 192.168.1.100/32

icam monitor scale
 
router bgp 65502
  address-family ipv4 unicast
    network 0.0.0.0/0
  neighbor 10.1.3.2
    remote-as 65501
    address-family ipv4 unicast
# If no default route is received from upstream, use a default route to Null0
ip route 0.0.0.0/0 Null0

# Ext-router-2 


feature bgp

interface Ethernet1/50
  ip address 10.1.3.5/30
  no shutdown

interface loopback100
  ip address 192.168.10.100/32

icam monitor scale

router bgp 65503
  address-family ipv4 unicast
  neighbor 10.1.3.6
    remote-as 65501
    address-family ipv4 unicast

Reference: Cisco Nexus 9000 Series NX-OS Unicast Routing Configuration Guide, Configuring Policy-Based Routing: https://www.cisco.com/c/en/us/td/docs/dcn/nx-os/nexus9000/104x/unicast-routing-configuration/cisco-nexus-9000-series-nx-os-unicast-routing-configuration-guide/m_configuring_policy-based_routing_101x.html
