ACI TCAM Resource Utilization and Optimization

Policy CAM and TCAM

In this article, I will discuss ACI TCAM resource utilization and optimization. In ACI, policy contracts/filters are programmed in the Policy CAM and the Overflow TCAM (OTCAM). The policy CAM (Content Addressable Memory) is a hardware resource used by Cisco switches; Cisco ACI leaf switches use it to filter traffic between EPGs. A policy CAM performs only exact matches on ones and zeros (binary CAM). It is implemented using forwarding table tile memory as a hash table lookup, where the hash is derived from numerous fields in the contract, such as source EPG, destination EPG, protocol, and L4 ports. TCAM (Ternary Content Addressable Memory), also known as overflow TCAM (OTCAM), is a specialized hardware resource designed for rapid table lookups. A TCAM can also match a third state, 'any value', which makes it suitable for policies/filters with multiple port ranges. Both policy CAM and TCAM resources are used for filters (Access Control Lists, ACLs) that define which EPG (security zone) can talk to which EPG (security zone). The policy CAM size varies depending on the hardware, and the way the policy CAM handles Layer 4 operations and bidirectional contracts also varies with the hardware and the forwarding scale profile. Generally, -FX, -FX3, and -GX leaf switches offer more capacity than -EX and -FX2 switches.

TCAM and policy CAM are precious hardware resources of an ACI Nexus switch, and their usage should be planned with great care. With many EPGs, if every EPG needs to talk to every other EPG, policy CAM consumption grows on the order of (# of EPGs) * (# of EPGs - 1) * (# of filters). An environment with many-to-many EPG contract relationships, multiple filters, and EPGs concentrated on a few leaf switches can easily hit the policy CAM limit and exhaust the hardware resources. Once the resources are exhausted, additional policies/contracts are not programmed in hardware, and the system will exhibit unexpected behavior.
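To see how quickly this grows, here is a minimal Python sketch of that worst-case calculation (illustrative numbers only; real consumption also depends on the hardware and on the optimization techniques discussed later in this article):

# Worst-case policy CAM entries when every EPG talks to every other EPG.
def full_mesh_entries(num_epgs: int, filters_per_contract: int) -> int:
    return num_epgs * (num_epgs - 1) * filters_per_contract

print(full_mesh_entries(50, 5))   # 12250
print(full_mesh_entries(120, 5))  # 71400 - already beyond a 64K policy CAM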

Contracts and filters are configured on the APIC and pushed to the switches. The policy interaction and flow for contracts and rules are listed below:

  1. The Policy Manager on the APIC communicates with the Policy Element Manager on the switch.
  2. The Policy Element Manager on the switch programs the Object Store on the switch.
  3. The Policy Manager on the switch communicates with the Access Control List Quality of Service (ACLQOS) client on the switch.
  4. The ACLQOS client programs the hardware.

Understanding the Utilization of Policy-CAM and Overflow TCAM

Monitoring and baselining policy CAM and OTCAM resources is critical to avoid exhausting policy resources. It also helps with planning endpoint allocation across the fabric leaves, choosing contract/filter types, and deploying contract/rule optimization techniques. Policy utilization and scale limits are available through the APIC GUI or through the CLI on the switches.

* Policy CAM and OTCAM are per-switch resources, not fabric-wide resources.

Verifying utilization using APIC GUI

TCAM utilization is available in the APIC GUI under:

  1. Operations > Capacity Dashboard
  2. Fabric > Inventory > Pod > ‘click on one of the leaf switches’ > Summary
  3. Tenants > Application profiles > ‘Application profile name’ > Application EPGs > ‘EPG name’ > Operational > Deployed Leaves

Operations > Capacity Dashboard > Leaf Capacity

figure 1a – TCAM utilization under capacity dashboard

Fabric > Inventory > Pod > ‘click on one of the leaf switches’ > Summary

figure 1b – TCAM utilization under inventory

Tenants > Application profiles > ‘Application profile name’ > Application EPGs > ‘EPG name’ > Operational > Deployed Leaves

figure 1c – TCAM utilization under tenant/AP/EPG

On some ACI versions, a falsely high TCAM utilization is reported under Tenants > Application Profiles > ‘Application Profile name’ > Application EPGs > ‘EPG name’ > Operational > Deployed Leaves. It displays a higher (incorrect) value than the one shown in the capacity dashboard. See bug ID CSCwb19242.

In the APIC screenshot above, the capacity dashboard shows node 101 with 221 rules out of a 64K available resource. That looks like plenty, but it doesn’t give the whole picture. The 64K is not the TCAM resource; it is the policy CAM capacity carved out of the leaf’s memory (static RAM). The overflow TCAM is fixed at 8K.

LEAF-101# vsh_lc
module-1# show platform internal hal health-stats | grep _count
mcast_count                   : 0 
max_mcast_count               : 8192 
policy_count                  : 221 
max_policy_count              : 65536 > policy CAM carved based on the forwarding scale profile applied
policy_otcam_count            : 322 
max_policy_otcam_count            : 8192 > Max 8K entries for OTCAM
policy_label_count                : 0 
max_policy_label_count            : 0 
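If you want to track these counters programmatically, a small script can parse the health-stats output (collected however you prefer, for example over SSH) and compute utilization percentages. A minimal Python sketch, assuming the exact output format shown above:

import re

# Parse 'show platform internal hal health-stats | grep _count' output
# and report policy CAM and OTCAM utilization as percentages.
def policy_utilization(output: str) -> dict:
    counters = {}
    for line in output.splitlines():
        m = re.match(r"\s*(\w+)\s*:\s*(\d+)", line)
        if m:
            counters[m.group(1)] = int(m.group(2))
    return {
        "policy_cam_pct": 100 * counters["policy_count"] / counters["max_policy_count"],
        "otcam_pct": 100 * counters["policy_otcam_count"] / counters["max_policy_otcam_count"],
    }

sample = """policy_count                  : 221
max_policy_count              : 65536
policy_otcam_count            : 322
max_policy_otcam_count            : 8192"""
print(policy_utilization(sample))  # ~0.34% policy CAM, ~3.93% OTCAM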

Proactive monitoring of policy CAM and OTCAM using syslog messages

Both policy CAM and OTCAM resources can be configured with thresholds that generate syslog messages when crossed. These syslog messages can be used to proactively monitor, manage, and plan the resources.

1. Configure the thresholds

For testing purposes, I configured the warning to trigger at 5% of OTCAM usage. The same needs to be configured for the policy CAM (replace ‘Policy Overflow TCAM entry’ with ‘Policy entry’ in the screenshot below).

Fabric > Fabric Policies > Policies > Monitoring > default > Stats Collection Policies

figure 2a – Configure OTCAM usage thresholds

Fabric > Fabric Policies > Policies > Monitoring > default > Callhome/Smart Callhome/SNMP/Syslog/TACACS

figure 2b – Add ‘Equipment Capacity Entity’ as a monitored object for syslog
figure 2c – Syslog message when the OTCAM usage crossed 5%

Policy CAM and TCAM Optimization Techniques

Depending on the leaf switch hardware, Cisco ACI offers many optimizations to either allocate more policy CAM space or to reduce the policy CAM consumption. The most common Cisco ACI policy resource optimization techniques are listed below:

  1. Configure Cisco ACI leaf switches with policy CAM intensive forwarding scale profile
  2. Plan the placement of hosts with EPG distribution across leaf switches in mind
  3. Use vzAny, preferred group contracts
  4. Optimize range usage in filters
  5. Use Bidirectional subjects if possible
  6. Filters can be reused with an indirection feature (at the cost of granularity of hardware statistics that you may be using when troubleshooting)

1. Forwarding Scale Profiles Overview

https://www.cisco.com/c/en/us/td/docs/switches/datacenter/aci/apic/sw/all/forwarding-scale-profiles/cisco-apic-forwarding-scale-profiles/m-overview-and-guidelines.html

ACI can be configured with different Forwarding Scale Profiles to suit the topology and deployment use cases. Here are the forwarding scale profiles and their purpose.

  • Dual Stack—The default profile for all new supported switches that allows both IPv4 and IPv6 configurations.
  • High Dual Stack—Provides increased IPv4, IPv6, and policy scale numbers compared to the default Dual Stack profile. This profile supports different scalability limits based on the specific switch platforms.
  • High LPM—Provides scalability similar to the Dual Stack profile, but for deployments that require higher scale for longest prefix match (LPM) and lower policy scale.
  • High Policy—This profile is similar to the Dual Stack profile but with higher policy scale. This profile has specific hardware requirements.
  • IPv4 Scale—This profile is designed for IPv4-only deployments and allows you to configure higher IPv4 scale where no IPv6 configurations are needed.
  • High IPv4 EP Scale—This profile is recommended to be used only for the ACI border leaf (BL) switches in Multi-Domain (ACI-SDA) Integration. It provides enhanced IPv4 EP and LPM scales specifically for these BLs and has specific hardware requirements.
  • Multicast Heavy—This profile provides an enhanced multicast scale and has specific hardware requirements.

Before you change the forwarding scale profile, make sure all the requirements are thoroughly understood and vetted. Changing the profile because of one requirement, such as policy, may have an unintended impact on other requirements, such as LPM (Longest Prefix Match), IPv4, and IPv6 routing table capacity.

For a fabric with dedicated border leaf switches, it is recommended to change the forwarding scale profile for increased IPv4 and IPv6 routing table capacity. ‘High LPM’ provides capacity for 128,000 IPv4 and IPv6 prefixes at the expense of low policy numbers: the ‘High LPM’ forwarding scale profile has a policy capacity of only 8,000. In a dedicated border leaf scenario this is not an issue, since no endpoints are expected to connect to the BL and the contract requirement is minimal.

LEAF-201# vsh_lc -c "show platform internal hal l3 routingthresholds"
Executing Custom Handler function 

OBJECT 0: 
trie debug threshold                                      : 0 
tcam debug threshold                                      : 3072 
Supported UC lpm entries                                  : 16384 
Supported UC lpm Tcam entries                             : 4096 
Current v4 UC lpm Routes                                  : 30 
Current v6 UC lpm Routes                                  : 0 
Current v4 UC lpm Tcam Routes                             : 45 
Current v6 UC lpm Tcam Routes                             : 34 
Current v6 wide UC lpm Tcam Routes                        : 0 
Maximum HW Resources for LPM                              : 20480 
Current LPM Usage in Hardware                             : 211 
Number of times limit crossed                             : 0 
Last time limit crossed                                   : N/A
Number of times host route limit crossed                  : 0 
Last time host route limit crossed                        : N/A

For compute leaves with high contract requirements, the ‘High Policy’ profile may be required. The ‘High Policy’ profile has some limitations and specific hardware requirements we need to be aware of.

1.1 Forwarding Scale Profiles – Scalability

The following table provides forwarding scale profiles scalability information in Release 5.2(5).

* The LPM scale listed in the following table includes any combination of IPv4 and IPv6 prefixes. The total of (# of IPv4 prefixes) + 2*(# of IPv6 prefixes) must not exceed the listed scale.
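Put differently, each IPv6 prefix consumes two units of the listed LPM scale. A quick Python sketch of that capacity check (profile limits as listed in the table below):

# LPM capacity check: an IPv6 prefix costs two units of the listed scale.
def lpm_within_scale(v4_prefixes: int, v6_prefixes: int, lpm_scale: int) -> bool:
    return v4_prefixes + 2 * v6_prefixes <= lpm_scale

print(lpm_within_scale(10000, 4000, 20000))  # True:  10000 + 8000 = 18000
print(lpm_within_scale(10000, 6000, 20000))  # False: 10000 + 12000 = 22000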

| Scale Profile | Scale for EX, FX2, FX2-E, and GX2A Switches | Scale for FX, FX3, FXP, GX, and GX2B Switches |
| --- | --- | --- |
| Dual Stack | EP MAC: 24,000; EP IPv4: 24,000; EP IPv6: 12,000; LPM: 20,000; Policy: 64,000; Multicast: 8,000 | EP MAC: 24,000; EP IPv4: 24,000; EP IPv6: 12,000; LPM: 20,000; Policy: 64,000; Multicast: 8,000 |
| High Dual Stack | EP MAC: 64,000; EP IPv4: 64,000; EP IPv6: 24,000; LPM: 38,000; Policy: 8,000; Multicast: 512 | EP MAC: 64,000; EP IPv4: 64,000; EP IPv6: 48,000; LPM: 38,000; Policy: 128,000; Multicast: 32,000 |
| High IPv4 EP Scale | Not supported | EP MAC: 24,000; EP IPv4 Local: 24,000; EP IPv4 Total: 280,000; EP IPv6: 12,000; LPM: 40,000; Policy: 64,000; Multicast: 8,000 |
| High LPM | EP MAC: 24,000; EP IPv4: 24,000; EP IPv6: 12,000; LPM: 128,000; Policy: 8,000; Multicast: 8,000 | EP MAC: 24,000; EP IPv4: 24,000; EP IPv6: 12,000; LPM: 128,000; Policy: 8,000; Multicast: 8,000 |
| High Policy | EP MAC: 16,000; EP IPv4: 16,000; EP IPv6: 8,000; LPM: 8,000; Policy: 100,000; Multicast: 8,000 | EP MAC: 24,000; EP IPv4: 24,000; EP IPv6: 12,000; LPM: 20,000; Policy: 256,000; Multicast: 8,000 |
| IPv4 Scale | EP MAC: 48,000; EP IPv4: 48,000; EP IPv6: 0; LPM: 38,000; Policy: 64,000; Multicast: 8,000 | EP MAC: 48,000; EP IPv4: 48,000; EP IPv6: 0; LPM: 38,000; Policy: 64,000; Multicast: 8,000 |
| Multicast Heavy | Not supported | EP MAC: 24,000; EP IPv4 Local: 24,000; EP IPv4 Total: 64,000; EP IPv6: 4,000; LPM: 20,000; Policy: 64,000; Multicast: 90,000 with (S,G) scale not exceeding 72,000 |

1.2 Configuring Forwarding Scale Profile

The forwarding scale profile is configured in three steps: create a policy, add the policy to a policy group, and associate the policy group with a node. It can also be done from the capacity dashboard.

  1. Create the forwarding scale policy

Fabric > Access Policies > Policies > Switch > Forwarding Scale Profile > Create Forwarding Scale Profile Policy

figure 3 – Forwarding scale profile policy configuration

2. Add the policy to a policy group

Fabric > Access Policies > Switch > Leaf Switches > Policy Group > Create Access Switch Policy Group > Select Forward Scale Profile Policy

figure 4 – Assign the forwarding scale profile policy to the switch policy group

3. Associate the policy group to a node

Fabric > Access Policies > Switch > Leaf Switches > Profile > Create Leaf Profile

figure 5 – Associate the switch policy group to the leaf profile (Leaf selectors)

4. Configuring the forwarding scale profile from the capacity dashboard

This approach is a bit tricky. Modifying the leaf profile from the capacity dashboard changes the profile already associated with the leaf switch; if the switch is using the default switch profile, the change will impact all leaf switches using the default switch profile, not only the specific switch. Before modifying the forwarding scale profile, it is good practice to configure non-default switch profiles to avoid unintentionally changing the forwarding scale profile through the default switch profile.

Operations > Capacity Dashboard > Configure Profile

figure 6 – Configure forwarding scale profile from capacity dashboard

1.3 View the Capacity of Leaf Switch Forwarding Scale Profile

The capacity of a specific leaf switch’s forwarding scale profile can be viewed from the APIC at Operations > Capacity Dashboard.

Operations > Capacity Dashboard > Leaf Capacity > Choose a leaf

figure 7 – Capacity of forwarding scale profile for a specific switch (node 101 in this case)

2. Plan the placement of Application EPGs by distributing across leaf switches

ACI provides a flexible and automated way of optimizing EPG deployment to leaf switches through deployment immediacy for bare metal servers and through deployment and resolution immediacy for VMs. Using on-demand deployment immediacy improves policy resource consumption, as an EPG won’t be deployed and hardware resources won’t be consumed unless an endpoint requires them. If no host assigned to an EPG is connected to a specific leaf switch, the EPG won’t be deployed there. Distributing application EPGs across the leaf switches improves policy CAM and TCAM utilization.

Example – let’s say we have 50 application EPGs and 10 leaf switches

If application servers from all EPGs are randomly connected across all 10 leaf switches, then each switch will have all 50 application EPGs programmed. If each EPG needs to communicate with all other EPGs using 5 rules, the policy resource requirement is 50 * 49 * 5 = 12,250. Every leaf switch will consume 12,250 of its policy resources.

If we plan and distribute the application EPGs so that each leaf has 10 or fewer EPGs, then the policy resource requirement becomes 10 * 49 * 5 = 2,450. Each leaf switch will consume 2,450 of its policy resources instead of 12,250 as in the first case. Proper planning of application placement pays a big policy resource dividend.
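The same worst-case formula from earlier makes this comparison easy to script (a sketch with this example’s numbers):

# Per-leaf policy CAM demand for the two placement scenarios above.
def per_leaf_entries(epgs_on_leaf: int, total_epgs: int, filters: int) -> int:
    # Each locally deployed EPG needs rules toward every other EPG.
    return epgs_on_leaf * (total_epgs - 1) * filters

print(per_leaf_entries(50, 50, 5))  # 12250 - every EPG on every leaf
print(per_leaf_entries(10, 50, 5))  # 2450  - EPGs distributed, 10 per leaf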

3. Use vzAny, preferred group

vzAny

If a contract/rule is needed by all EPGs in a given VRF, using vzAny as the provided and/or consumed side of the contract will reduce policy resource consumption.

Example – if hosts in all EPGs in a given VRF need access to a syslog server, deploying the contract/rule as consumed by vzAny and provided by the syslog EPG minimizes the resource usage significantly as the number of EPGs grows.

Example – output with a syslog EPG and four application EPGs in a given VRF, deploying the syslog contract/rule as consumed by the individual application EPGs and provided by the syslog EPG. As the number of EPGs grows, the policy resource requirement grows.

figure 8 – when the example contract / rule applied at EPG level

Example – output with a syslog EPG and four application EPGs in a given VRF, deploying the syslog contract/rule as consumed by vzAny and provided by the syslog EPG. As the number of EPGs grows, the policy resource requirement remains the same.

figure 9 – when the example contract / rule applied at vzAny level

EPG pcTags – 16396 for the syslog EPG; 16391, 49167, 16390, and 25 for the four application EPGs.

The pcTag can be found using the APIC UI or a moquery of the objects.

Tenant > Networking > VRF

figure 10 – VRF scope and Class ID

Tenant > Networking > VRF > VRF name

figure 11 – EPG PC Tag under VRF
APIC1# moquery -c fvCtx | grep SG- -A 8 
name                 : SG-PBR
annotation           : 
bdEnforcedEnable     : no
childAction          : 
descr                : 
dn                   : uni/tn-Belete-Test/ctx-SG-PBR
extMngdBy            : 
ipDataPlaneLearning  : enabled
knwMcastAct          : permit
lcOwn                : local
modTs                : 2022-10-10T23:19:16.022-04:00
monPolDn             : uni/tn-common/monepg-default
nameAlias            : 
ownerKey             : 
--
rn                   : ctx-SG-PBR
scope                : 2752513
seg                  : 2752513
status               : 
uid                  : 15374
userdom              : :all:
vrfId                : 0
vrfIndex             : 0
APIC1# moquery -c fvEPg | grep PBR- -A 15 -B 5
# fv.AEPg
name                 : PBR-client
annotation           : 
childAction          : 
configIssues         : 
configSt             : applied
descr                : 
dn                   : uni/tn-Belete-Test/ap-AP/epg-PBR-client
exceptionTag         : 
extMngdBy            : 
floodOnEncap         : disabled
fwdCtrl              : 
hasMcastSource       : no
isAttrBasedEPg       : no
isSharedSrvMsiteEPg  : no
lcOwn                : local
matchT               : AtleastOne
modTs                : 2023-02-08T09:15:37.662-04:00
monPolDn             : uni/tn-common/monepg-default
nameAlias            : 
pcEnfPref            : enforced
pcTag                : 16391
pcTagAllocSrc        : idmanager
prefGrMemb           : exclude
prio                 : level3
rn                   : epg-PBR-client
scope                : 2752513
shutdown             : no
status               : 
triggerSt            : triggerable
txId                 : 13835058055324251201
uid                  : 15374
userdom              : :all:
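As a shortcut, moquery also accepts a property filter, which avoids grepping through the full class output (the EPG name below is from this example; adjust it to your environment):

APIC1# moquery -c fvAEPg -f 'fv.AEPg.name == "PBR-client"' | grep -E 'dn|pcTag|scope'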

For details about pcTag and contracts, see:

ACI Contract Priority

Preferred Group

A preferred group may also be a possible solution for optimizing policy resource consumption when a full mesh is required between a subset of EPGs. A preferred group allows unrestricted communication between the EPGs in a VRF that are members of the group. Communication between members of the preferred group and non-members still uses regular contracts. The number of filtering rules used by preferred groups depends on the number of EPGs that are not in the preferred group.

If there are ‘X’ EPGs outside of the preferred group and ‘Y’ EPGs within it, the policy CAM rules required to achieve full mesh between preferred group members number (X * 2) + 1: a deny from any to each EPG excluded from the preferred group, a deny from each excluded EPG to any, and one permit-all to allow communication between EPGs within the preferred group. With a large number of EPGs in the preferred group, the rules added for the preferred group are normally far fewer than the rules needed for an EPG full mesh.

To compare the policy resource requirements for an ACI setup with ‘Y’ EPGs that require full mesh and ‘X’ EPGs with no full mesh requirement:

  1. Without a preferred group, Y * (Y - 1) * 2 rules are required to achieve full mesh between the EPGs with one contract filter.
  2. With a preferred group, (X * 2) + 1 rules are required to achieve full mesh between the EPGs.

With ‘X’ = 6 and ‘Y’ = 3, the policy rule count with a preferred group is 13 and without is 12, so the preferred group doesn’t help optimize policy CAM usage. But with ‘X’ = 120 and ‘Y’ = 120, the rule count with a preferred group is 241 versus 28,560 without, so the preferred group significantly improves policy resource usage. As the number of EPGs in the preferred group increases, so does the level of optimization.
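A Python sketch of that break-even comparison using the two formulas above:

# Rules with and without a preferred group, per the formulas above.
def rules_without_pg(y_in_group: int, filters: int = 1) -> int:
    return y_in_group * (y_in_group - 1) * 2 * filters  # full mesh, both directions

def rules_with_pg(x_outside_group: int) -> int:
    return x_outside_group * 2 + 1  # two denies per excluded EPG + one permit-all

print(rules_without_pg(3), rules_with_pg(6))      # 12 vs 13: no benefit
print(rules_without_pg(120), rules_with_pg(120))  # 28560 vs 241: large benefit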

4. Optimize range usage in filters

Linear rules

In an environment with heavy filter/contract usage, optimizing the filters/rules is important to use the resources efficiently. Applying linear (range-free) contract filters between EPGs increases both the policy CAM and the OTCAM counts.

module-1# show platform internal hal health-stats | grep _count
mcast_count                   : 0 
max_mcast_count               : 8192 
policy_count                  : 221 
max_policy_count              : 65536 
policy_otcam_count            : 322 
max_policy_otcam_count            : 8192 
policy_label_count                : 0 
max_policy_label_count            : 0 
module-1# show platform internal hal health-stats | grep _count
mcast_count                   : 0 
max_mcast_count               : 8192 
policy_count                  : 223 > Policy CAM increased by 2 adding a contract with one filter between two EPGs 
max_policy_count              : 65536 
policy_otcam_count            : 324 > OTCAM increased by 2 after adding a contract with one filter between two EPGs 
max_policy_otcam_count            : 8192 
policy_label_count                : 0 
max_policy_label_count            : 0 

The OTCAM is 8K, but the policy CAM is usually much larger, so how do these align if both resources increase linearly? Wouldn’t the OTCAM be the bottleneck, limiting the leaf’s policy support to the 8K OTCAM capacity? Actually, no. The leaf switch uses hash banks (forwarding table tile memory) to overflow entries and free up OTCAM space. The hash banks can be viewed from the leaf CLI.

figure 12 – Hash bank output

Rules with port ranges

In an environment with filters containing port ranges, proper care should be taken to avoid unnecessarily exhausting the OTCAM resource. One range takes only one entry in the policy CAM, but if the same EPG pair defines additional ranges, they may need to go into the OTCAM and be expanded. As a best practice, keep the range definitions to fewer than 4 ranges per EPG pair. Consolidate ranges in line with this best practice to avoid hitting the 8K OTCAM limit before the policy CAM is exhausted.

Another important consideration for port ranges is to align their boundaries to multiples of 16 (1-15, 16-31, 32-47, 48-63, …). If the range is small, define it as individual ports; a rule of thumb is that a range of 10 ports or fewer should be defined as individual port filters.

module-1# show platform internal hal health-stats | grep _count
mcast_count                   : 0 
max_mcast_count               : 8192 
policy_count                  : 221 
max_policy_count              : 65536 
policy_otcam_count            : 322 
max_policy_otcam_count            : 8192 
policy_label_count                : 0 
max_policy_label_count            : 0 

module-1# show platform internal hal health-stats | grep _count
mcast_count                   : 0 
max_mcast_count               : 8192 
policy_count                  : 223 > increased by 2 after a filter with a range 16-31 applied
max_policy_count              : 65536 
policy_otcam_count            : 324 > increased by 2 after a filter with a range 16-31 applied
max_policy_otcam_count            : 8192 
policy_label_count                : 0 
max_policy_label_count            : 0 
module-1# 
module-1# 
module-1# show platform internal hal health-stats | grep _count
mcast_count                   : 0 
max_mcast_count               : 8192 
policy_count                  : 223 > increased by 2 after a filter with a range 16-30 (not aligned) applied
max_policy_count              : 65536 
policy_otcam_count            : 330 > increased by 10 after a filter with a range 16-30 (not aligned) applied
max_policy_otcam_count            : 8192 
policy_label_count                : 0 
max_policy_label_count            : 0 

If range-based entries that are not aligned to multiples of 16 are pushed, they can expand into multiple blocks of OTCAM, consuming many entries. Use 16-aligned ranges as much as possible.
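The behavior resembles classic TCAM range expansion, where a range must be broken into aligned power-of-two value/mask blocks. The Python sketch below illustrates why 16-31 fits one block while 16-30 expands into several; the actual ASIC carving may differ, so treat the counts as illustrative rather than exact:

# Greedy expansion of a port range into aligned power-of-two blocks,
# the way a TCAM represents a range as value/mask entries.
# Illustrative only; the real hardware expansion may differ.
def expand_range(lo: int, hi: int) -> list:
    blocks = []
    while lo <= hi:
        size = lo & -lo or 65536        # largest block alignment at lo
        while size > hi - lo + 1:       # shrink until the block fits the range
            size //= 2
        blocks.append((lo, size))
        lo += size
    return blocks

print(expand_range(16, 31))  # [(16, 16)]                           - 1 block
print(expand_range(16, 30))  # [(16, 8), (24, 4), (28, 2), (30, 1)] - 4 blocks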

5. Use Bidirectional subjects if possible

In an environment with heavy rule usage, it is important to properly identify the provider and consumer EPGs and apply contracts accordingly. The optimization gain grows as the number of EPGs, filters, and contracts increases. The following output shows how three EPGs with one filter rule/contract improved policy resource utilization from 12 policy entries to 4 by converting a full mesh of consumed/provided relationships into one EPG providing and two EPGs consuming.

If the filter doesn’t specify ports (for example, ‘permit any any’), this optimization technique has no impact.

module-1# show platform internal hal health-stats | grep _count
mcast_count                   : 0 
max_mcast_count               : 8192 
policy_count                  : 221 
max_policy_count              : 65536 
policy_otcam_count            : 322 
max_policy_otcam_count            : 8192 
policy_label_count                : 0 
max_policy_label_count            : 0 
module-1# 
module-1# show platform internal hal health-stats | grep _count
mcast_count                   : 0 
max_mcast_count               : 8192 
policy_count                  : 233 > increased by 12 for three EPGs all consuming and providing
max_policy_count              : 65536 
policy_otcam_count            : 334 > increased by 12 for three EPGs all consuming and providing
max_policy_otcam_count            : 8192 
policy_label_count                : 0 
max_policy_label_count            : 0 
module-1# 
module-1# show platform internal hal health-stats | grep _count
mcast_count                   : 0 
max_mcast_count               : 8192 
policy_count                  : 225 > increased by 4 for three EPGs two consuming and one providing
max_policy_count              : 65536 
policy_otcam_count            : 326 > increased by 4 for three EPGs two consuming and one providing
max_policy_otcam_count            : 8192 
policy_label_count                : 0 
max_policy_label_count            : 0 
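The arithmetic behind those two outputs is simply the number of provider/consumer pairs times two directional rules (a sketch):

# Each provider/consumer EPG pair programs a forward and a reverse rule.
def contract_entries(providers: set, consumers: set) -> int:
    pairs = [(p, c) for p in providers for c in consumers if p != c]
    return len(pairs) * 2

epgs = {"EPG-A", "EPG-B", "EPG-C"}
print(contract_entries(epgs, epgs))                     # 12: all provide and consume
print(contract_entries({"EPG-A"}, {"EPG-B", "EPG-C"}))  # 4: one provider, two consumers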

6. Policy Compression

Policy compression is used to compress hardware policy resource entries so they use less space. Policy compression was introduced to compress bidirectional rules into one entry (bidirectional rule compression) and to allow reusing filters across multiple EPG pairs (policy table compression).

‘Enable Policy Compression’ under contract > subject > filter enables both bidirectional rule compression and policy table compression. Bidirectional rule compression requires both ‘Apply Both Directions’ and ‘Reverse Filter Ports’ to be selected in the contract subject configuration.

Policy table compression requires Cisco APIC Release 4.0 or later and -FX or later leaf switches. By default, even if multiple pairs of EPGs use the same contract with the same filter, separate entries are allocated in the TCAM for every pair of EPGs. If you select the ‘Enable Policy Compression’ option, Cisco ACI instead programs an indirect association between EPG pairs and filters by using a label called a policy group (PG): the EPG pairs are programmed in a policy group table with a label that points to the policy CAM entry. The space for the PG label lookup table is repurposed dynamically from the existing policy CAM space. If compression-eligible rules are present, the PG label lookup table is carved out; once those rules are deleted, the space can be reused for regular policy CAM entries. If the PG label lookup table space runs out and there is space in the regular TCAM, those rules will be programmed in uncompressed form.

The optimization impact of ‘Enable Policy Compression’ depends on the contract design, such as how many EPG pairs can be combined into one policy group label. If only a few EPG pairs can reuse a contract, the net impact can even be negative, because the space carved out of the policy CAM for the PG table may go unused.
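Conceptually, the PG label indirection works like the toy model below: EPG pairs share one filter entry through a label instead of each pair carrying its own copy. This mirrors the idea only, not the real hardware tables, and the names are made up:

# Toy model of PG label indirection (illustrative names and layout).
filter_rules = ["tcp/443", "tcp/8443"]
pairs = [("EPG-A", "EPG-X"), ("EPG-B", "EPG-X"), ("EPG-C", "EPG-X")]

# Uncompressed: one policy CAM entry per (EPG pair, filter rule).
uncompressed = len(pairs) * len(filter_rules)

# Compressed: pairs resolve to a PG label; the filter is programmed once.
pg_label_table = {pair: "PG-1" for pair in pairs}     # label lookup table
policy_cam = {("PG-1", rule) for rule in filter_rules}
compressed = len(policy_cam)

print(uncompressed, compressed)  # 6 vs 2 policy CAM entries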

The PG usage can be monitored from the CLI or the capacity dashboard.

module-1# show platform internal hal health-stats | grep _count
mcast_count                   : 0 
max_mcast_count               : 8192 
policy_count                  : 221 
max_policy_count              : 65536 
policy_otcam_count            : 322 
max_policy_otcam_count            : 8192 
policy_label_count                : 0 
max_policy_label_count            : 0 

module-1# show platform internal hal health-stats | grep _count
mcast_count                   : 0 
max_mcast_count               : 8192 
policy_count                  : 222 > increased only by one, three EPGs using the contract
max_policy_count              : 55296 > 10K used for PG
policy_otcam_count            : 323 > increased only by one, three EPGs using the contract
max_policy_otcam_count            : 8192 
policy_label_count                : 2 
max_policy_label_count            : 40960 > PG label table carved out

figure 13 – Capacity dashboard showing carved PG usage and capacity

Limitations and Guidelines

  1. It is not possible to convert a pre-existing subject; you must delete the subject and reconfigure it with the compression option.
  2. Within an EPG pair, only one contract can be compressed. The feature analyzes all the contracts and selects the one that gives the best savings.
  3. Compression disables individual filter rule statistics.
  4. A contract can include both compressed and non-compressed filters.
  5. Use the same contract name when reusing filters with policy table compression.
  6. Policy compression works with permit and permit-log rules only.
  7. Policy compression is not enabled for vzAny contracts.
  8. Policy compression can be enabled for user-defined rules only; it is not applicable to implicit rules.

Automation

Automation is key to effectively deploying ACI and its services, and filters are one of the ACI objects that can be automated. Because automation makes deployment easy, there is a risk of unoptimized deployments that exhaust policy resources. Serious consideration and effort are needed to make sure the logic, the source of information, and the algorithm used to generate filters and contracts are optimized.
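For example, automation tooling can canonicalize and deduplicate filter definitions before pushing them, so identical filters are reused rather than re-created per application. A minimal Python sketch with made-up filter definitions:

# Deduplicate filter definitions so identical filters are reused across
# contracts instead of being created once per application (made-up data).
requested = {
    "app1-web": [("tcp", 443, 443)],
    "app2-web": [("tcp", 443, 443)],   # same content as app1-web
    "app3-db":  [("tcp", 1433, 1433)],
}

canonical = {}  # filter content -> single reusable filter name
for name, entries in requested.items():
    canonical.setdefault(tuple(sorted(entries)), name)

mapping = {name: canonical[tuple(sorted(e))] for name, e in requested.items()}
print(mapping)  # {'app1-web': 'app1-web', 'app2-web': 'app1-web', 'app3-db': 'app3-db'}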

Conclusion

Using the above optimization techniques, all or some depending on the environment’s specific requirements and suitability, it is possible to avoid policy resource exhaustion. Filters and contracts should always be reviewed to avoid unnecessary resource utilization, since OTCAM and policy CAM are expensive leaf switch resources. Properly planned filter/contract reuse with compression goes a long way in your optimization journey, and a well-crafted plan for vzAny, preferred groups, forwarding scale profiles, port ranges versus individual port filters, and endpoint placement makes your ACI policy enforcement journey pleasant.

Important Links

ACI Contract

ACI Contract Priority

ACI Contract White Paper

ACI Forwarding Scale Profile Guide
