Introduction
Recommendation: If you intend to use LACP, consider deploying it from day one rather than migrating from Distributed Uplink containers. The migration is simple, but it adds steps that would otherwise be unnecessary.
Image Source: KB2051826
Distributed Switches support an enhanced version of LACP which allows for the use of dynamic link aggregation. Prior to vSphere 5.5 only a single Link Aggregation Group (LAG) could be created – with the release of vSphere 5.5 up to 64 LAGs can be created per Distributed Switch. Despite these enhancements a number of configuration requirements do exist in the current implementation – consult KB2051307 for further details. There is more detailed documentation that outlines the differences between vSphere 5.0, 5.1 and 5.5 which you should consult if you are running a mixed environment – these are detailed in KB2051316
However, expressed briefly, the current LACP implementation has these limitations:
- It is not compatible with software iSCSI multipathing.
- It is not supported between two nested ESXi hosts (virtualized ESXi hosts).
- It cannot be used in conjunction with the ESXi dump collector. For this feature to work, the VMkernel port used for management purposes must be on a vSphere Standard Switch.
- Port Mirroring cannot be used in conjunction with LACP to mirror the LACPDU packets used for negotiation and control.
- The teaming health check does not work for LAG ports, as the LACP protocol itself is capable of ensuring the health of the individual LAG ports. However, the VLAN and MTU health checks can still check LAG ports.
The VMwareTechPub YouTube channel has a short video which explains how enhanced LACP functions, together with a second video that guides you through the configuration process.
LAGs are similar to the Uplink containers in the Distributed Portgroup. When you define a Distributed Switch you set how many Uplink containers you require – the same is true of LAGs. You define the LAG, and then indicate how many vmnics it will support. If you intend to use LACP as the primary method for communications you could choose to create a Distributed Switch with 0 Uplinks, and then define the LAGs. This would mean there would be no need to migrate from using Distributed Uplinks to LAGs.
Create a Link Aggregation Group (LAG)
1. Select the Distributed Switch, click the Manage tab and select the Settings column
2. Click the green + to create a LAG
3. Set a friendly name for the LAG, and specify how many network ports (or vmnics) it will support – in this case our server has four network cards; our Standard Switch0 is using vmnic0/1, and the Distributed Switch is using vmnic2/3. Change the mode from Passive to Active, which will trigger a negotiation with the physical switch. Finally, select a load-balancing algorithm supported by the physical switch
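The choices made in this dialog – a friendly name, a port count that cannot exceed the free vmnics, an Active/Passive mode, and a load-balancing algorithm the physical switch supports – can be modelled as a small validation sketch. This is illustrative Python, not the vSphere API; the `LagConfig` class and the algorithm names are assumptions for the example:

```python
from dataclasses import dataclass

# Illustrative subset of hashing algorithms a physical switch might offer --
# check your switch documentation for the real list.
SUPPORTED_ALGORITHMS = {
    "src_dst_ip",
    "src_dst_mac",
    "src_dst_ip_tcp_udp_port",
    "src_dst_ip_tcp_udp_port_vlan",
}

@dataclass
class LagConfig:
    name: str            # friendly name, e.g. "LAG1"
    num_ports: int       # how many vmnics the LAG will carry
    mode: str            # "active" triggers negotiation with the switch
    load_balancing: str  # must be an algorithm the physical switch supports

def validate_lag(lag: LagConfig, free_vmnics: int) -> list:
    """Return a list of configuration problems (empty list means OK)."""
    problems = []
    if lag.mode not in ("active", "passive"):
        problems.append("mode must be 'active' or 'passive'")
    if lag.num_ports > free_vmnics:
        problems.append(f"LAG wants {lag.num_ports} ports but only "
                        f"{free_vmnics} vmnics are free")
    if lag.load_balancing not in SUPPORTED_ALGORITHMS:
        problems.append(f"unsupported load-balancing: {lag.load_balancing}")
    return problems
```

In the walkthrough above, vmnic2/3 are free for the Distributed Switch, so a two-port Active LAG such as `LagConfig("LAG1", 2, "active", "src_dst_ip")` validates cleanly with `free_vmnics=2`.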
Reconfigure Portgroups to use the LAG in Standby Mode
Next we will reassign the portgroups that we want to use with LACP. This is a two-stage process to avoid a situation where VMs or other traffic become disconnected.
1. Right-click the Distributed Switch, select Manage Distributed Portgroups
2. Select Teaming and Failover
3. Select the Portgroups you wish to modify. In our case we selected dVLAN101 and dVLAN102
WARNING: During the migration phase the selected Distributed Portgroups will have connectivity to both Uplinks and LAGs. However, once the VMware ESXi hosts have had their physical NICs reassigned, it is entirely possible for a portgroup to find that its Uplink containers contain no vmnics at all – at which point communication would cease. Care must be taken to avoid situations that generate network outages.
4. Next move the LAG from Unused Uplinks to Standby Uplinks. This is an intermediate configuration as we migrate from using Uplinks to LAGs as the container for assigning physical vmnics. You cannot use LAGs and Uplinks together
If you move the LAG straight to the Active state at this stage, the Web Client will warn you that this is an improper configuration.
When you click Next, the Web Client will warn you that Uplinks and LAGs normally cannot be used together – they are supported in this combination only during the migration stage.
[Update] Later I was asked by Michael Webster of longwhiteclouds.com whether it was possible for multiple LAGs to back a portgroup, and for load-balancing to be done between them. I was a bit skeptical. I figured that although you might be able to allocate multiple LAGs, the load-balancing settings on the LAG itself would apply. I set up a nested environment to test this configuration (2xLAGs with 2xNICs assigned to each). Once I had two LAGs I tried to add them to the portgroup as either Active/Active or Active/Passive. I discovered this wasn't even a supported configuration. So only one LAG (with many vmnics) can be assigned to a portgroup.
5. Click Next and Finish
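The teaming rules this section walks through – at most one LAG per portgroup, a LAG never Active alongside standalone Uplinks, and LAG-in-Standby valid only as the migration step – can be sketched as a hypothetical checker. The function and field names are assumptions for this illustration, not anything the Web Client exposes:

```python
from dataclasses import dataclass, field

@dataclass
class TeamingPolicy:
    active: list = field(default_factory=list)   # e.g. ["Uplink1", "Uplink2"]
    standby: list = field(default_factory=list)
    unused: list = field(default_factory=list)

def is_lag(uplink: str) -> bool:
    # Assumption for this sketch: LAG entries are named "LAG...".
    return uplink.startswith("LAG")

def check_policy(policy: TeamingPolicy, migrating: bool = False) -> list:
    """Flag combinations the Web Client would warn about or reject."""
    problems = []
    everything = policy.active + policy.standby + policy.unused
    if sum(1 for u in everything if is_lag(u)) > 1:
        problems.append("only one LAG may back a portgroup")
    lag_active = any(is_lag(u) for u in policy.active)
    uplinks_active = any(not is_lag(u) for u in policy.active)
    if lag_active and uplinks_active:
        problems.append("a LAG cannot be Active alongside standalone Uplinks")
    if any(is_lag(u) for u in policy.standby) and not migrating:
        problems.append("LAG in Standby is only valid during migration")
    return problems
```

During the migration phase the intermediate state (Uplinks Active, LAG Standby) passes only with `migrating=True`; the finished state has the LAG as the sole Active entry and the Uplink containers Unused.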
Reassign vmnics from the Uplinks to the LAG
1. Right-click the Distributed Switch and select Add and Manage Hosts
2. Select Manage Host Networking
3. Select the hosts attached to the Distributed Switch. Enable the option Configure identical network settings on multiple hosts (Template Mode). Template mode allows the administrator to make a change to one host, and have that configuration applied to all the other hosts.
4. Select one of your VMware ESXi hosts to be the template host
5. Deselect Manage VMkernel Adapters, and ensure that Manage Physical Adapters is selected
6. In the Manage physical adapters page, notice how all the vmnics are assigned to the Uplink container.
Select the first vmnic, in our case this is vmnic2, and click the Assign Uplink option – in the subsequent dialog box select a free network port within the LAG group…
Once finished, click Apply to all to have these settings applied to all the VMware ESXi hosts
Assign the LAG to be Active for Distributed Portgroups
1. Right-click the Distributed Switch, select Manage Distributed Portgroups
2. Select Teaming and Failover
3. Select the Portgroups you wish to modify. In our case we selected dVLAN101 and dVLAN102
4. Next move the LAG to be the Active uplink, and move the Uplink containers to Unused uplinks. Once the LAG is fully active, the load-balancing settings on a Distributed Portgroup are overridden by the setting on the LAG itself.
From the Topology view on the Distributed Switch you can see that it is the LAG – containing vmnic2/3 – that is now responsible for the traffic.
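The override behaviour described in step 4 – the LAG's hashing algorithm winning over the portgroup's own teaming policy once the LAG is Active – reduces to a one-line rule. A sketch with assumed names:

```python
def effective_load_balancing(portgroup_policy: str, lag_policy: str,
                             lag_active: bool) -> str:
    """Once a LAG is the Active uplink its hashing algorithm applies;
    otherwise the Distributed Portgroup's own teaming policy does."""
    return lag_policy if lag_active else portgroup_policy
```

For example, with the LAG active, a portgroup configured for "route based on originating virtual port" still load-balances with the LAG's algorithm.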