Cisco UCS LACP Port Channel Flapping

I recently ran into this problem during a deployment and wasn't able to find much information about it on the Internet, so I figured I'd write a quick blog post documenting the symptoms and the solution in case other people hit the same thing.

When connecting UCS Fabric Interconnects to non-Cisco switches (in this case Juniper EX series), we noticed some strange behavior: once every 30 minutes or so, the uplink port-channel members would go down briefly and come back up, and then everything would be fine until the next occurrence. Each occurrence generated an F0727 fault - UCS complaining that its port channels had no operational members.

I turned up LACP traceoptions logging on the Juniper side to see if it might be an issue with the LACP protocol itself, but it did not yield any useful information as to the root cause of the issue.

I turned to the UCS logs to see if I could get any further information, and noticed these log entries repeated many times:

2018 Apr 11 13:25:51 ucscluster-A %LLDP-1-NO_DCBX_ACKS_RECV_FOR_LAST_10_PDUs
2018 Apr 11 13:28:07 ucscluster-A %LLDP-1-NO_DCBX_ACKS_RECV_FOR_LAST_10_PDUs
2018 Apr 11 13:30:21 ucscluster-A %LLDP-1-NO_DCBX_ACKS_RECV_FOR_LAST_10_PDUs
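
If you want to check how regularly these messages show up in your own logs, here is a minimal Python sketch that pulls the timestamps out of lines in the format above and prints the gap between consecutive entries. The function names and the `marker` parameter are my own (hypothetical), and the timestamp layout is assumed to match the three lines shown here; adjust the parsing if your FI logs differ.

```python
from datetime import datetime

# Sample lines copied from the log excerpt above.
LOG_LINES = [
    "2018 Apr 11 13:25:51 ucscluster-A %LLDP-1-NO_DCBX_ACKS_RECV_FOR_LAST_10_PDUs",
    "2018 Apr 11 13:28:07 ucscluster-A %LLDP-1-NO_DCBX_ACKS_RECV_FOR_LAST_10_PDUs",
    "2018 Apr 11 13:30:21 ucscluster-A %LLDP-1-NO_DCBX_ACKS_RECV_FOR_LAST_10_PDUs",
]


def parse_timestamp(line):
    """Parse the leading 'YYYY Mon DD HH:MM:SS' fields of a log line."""
    stamp = " ".join(line.split()[:4])
    return datetime.strptime(stamp, "%Y %b %d %H:%M:%S")


def dcbx_intervals(lines, marker="NO_DCBX_ACKS_RECV"):
    """Return the gaps, in whole seconds, between consecutive matching lines."""
    times = [parse_timestamp(line) for line in lines if marker in line]
    return [int((later - earlier).total_seconds())
            for earlier, later in zip(times, times[1:])]


intervals = dcbx_intervals(LOG_LINES)
print(intervals)  # [136, 134]
```

For the three entries above, the messages are a little over two minutes apart.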

A Google search then turned up a Cisco support article which explains that LLDP was the actual root cause of the issue I was seeing.

From the Support article:

"Data Center Bridging Capability Exchange (DCBX) Type Length Values (TLV) are packaged within a Link Layer Discovery Protocol (LLDP) frame that is exchanged between the switch and the converged network adapter (CNA). One such Control Sub-TLV is used for acknowledgement (ACK), which is sequence-based. For example, the switch sends a Control Sub-TLV with a SeqNo of 1 and an AckNo of 2. The host is supposed to inverse this, and send an LLDP frame with a Control Sub-TLV with a SeqNo of 2 and an AckNo of 1. Refer to the Packet Captures section of this article for more details.

The switch expects this exchange from the host every 30 seconds. If the switch does not see this exchange for 100 Protocol Data Units (PDUs), which is 3000 seconds or 50 minutes, the switch disables with this error."
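
To make the quoted behavior concrete, here is a toy Python model of it. This is only a sketch of the description in the quote, not the real DCBX state machine: `ControlTlv`, `is_inverse`, `consecutive_misses` and `seconds_until_disable` are all hypothetical names, and the "inverse" check simply mirrors the SeqNo/AckNo example from the article.

```python
from dataclasses import dataclass


@dataclass
class ControlTlv:
    """Toy stand-in for the sequence/ack fields of a DCBX Control Sub-TLV."""
    seq_no: int
    ack_no: int


def is_inverse(sent, reply):
    """True if the reply swaps SeqNo and AckNo, as the quote describes."""
    return reply.seq_no == sent.ack_no and reply.ack_no == sent.seq_no


def consecutive_misses(sent, replies):
    """Count how many replies in a row, at the end of the list, were missing
    (None) or did not acknowledge the sent TLV."""
    misses = 0
    for reply in replies:
        if reply is not None and is_inverse(sent, reply):
            misses = 0      # a good ACK resets the counter
        else:
            misses += 1     # no reply, or a reply that is not an ACK
    return misses


def seconds_until_disable(pdu_interval_s, missed_pdu_limit):
    """Time a peer can go without ACKing before the limit is reached."""
    return pdu_interval_s * missed_pdu_limit


sent = ControlTlv(seq_no=1, ack_no=2)
replies = [ControlTlv(seq_no=2, ack_no=1), None, None]
print(consecutive_misses(sent, replies))      # 2
print(seconds_until_disable(30, 100) / 60)    # 50.0 (minutes)
```

A peer that speaks LLDP but never sends the DCBX acknowledgement would never reset the counter in this model, so the limit is eventually reached no matter how healthy the link is. That lines up with what I was seeing against the Juniper switches.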

Now that I knew what the issue was, I started looking for a way to disable LLDP within UCSM and came up empty. More Googling didn't turn up anything definitive on how to actually do it, so I finally admitted defeat and opened a Cisco TAC case. The engineer very quickly responded that the reason I couldn't find a way to disable LLDP on the FIs is that the capability is not exposed via UCSM or the CLI; it has to be done via a debug plugin (dplug). He also pointed me to an enhancement request that would let customers enable/disable LLDP themselves (you'll need a Cisco account to read it).

A 15-minute call later, the dplug was loaded and we verified that LLDP was indeed disabled - further monitoring confirmed that disabling LLDP had resolved the flapping port channels.

An alternative option would be to disable LLDP on the upstream switches instead of the UCS, but I elected to make the configuration change on the UCS side to keep our switches' LLDP configuration standardized.

Hopefully this will help someone else out there having the same issue - thanks for reading!
