F5 Load balancer lag with SSRS VM

This may be more of a perfect-storm situation than a major concern across all SSRS scale-out deployments using an F5 hardware load balancer. I'll detail what I found & how we were able to work around & eventually resolve this issue.

The 3 components that affected & essentially caused this issue are:

  • F5 Network Load Balancer
  • SQL Server Reporting Services (2014, though I believe all versions would be affected)
  • VMware Windows Server 2012 R2 virtual machine(s)

The Problem

Essentially, the issue we encountered was “lag” on page, folder & report loads in all of our non-production environments. Each page was taking anywhere between 2 and 12 seconds to load completely. By comparison, our production environment showed an average load time of less than 600ms with peaks under 200ms.

As you can see from the timing excerpt I maintained, we would see lag on some environments, but not consistently on all of them. I also worked through a process of elimination, changing a single component at a time & recording the results. This was a lengthy process!

[Image: NLB_Lag timing excerpt]
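For anyone wanting to repeat this kind of comparison, the timings can be collected with a small script rather than by hand. The following is only a minimal Python sketch, not the method we used: the function names, the number of runs and the example URLs are all hypothetical, and it assumes the report server URL can be fetched without Windows authentication (otherwise the request will fail and you would need to add your own authentication handling).

```python
import statistics
import time
import urllib.request


def time_request(url, timeout=30.0):
    """Return the seconds taken to fetch `url` and read the whole response body."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read()
    return time.perf_counter() - start


def summarize(samples_s):
    """Summarize a list of timings (in seconds) as min/median/max in milliseconds."""
    ms = [s * 1000.0 for s in samples_s]
    return {
        "count": len(ms),
        "min_ms": round(min(ms), 1),
        "median_ms": round(statistics.median(ms), 1),
        "max_ms": round(max(ms), 1),
    }


def measure(url, runs=10, pause_s=1.0):
    """Time `runs` sequential requests to `url`, pausing between them."""
    samples = []
    for _ in range(runs):
        samples.append(time_request(url))
        time.sleep(pause_s)
    return summarize(samples)


# Example usage (hypothetical URLs, replace with your own environments):
# print("via load balancer:", measure("http://reports-test.example.local/Reports"))
# print("direct to node:   ", measure("http://ssrs-node1.example.local/Reports"))
```

Running the same measurement against the load-balanced address and against an individual node, before and after each single change, gives a like-for-like record of whether that change actually made a difference.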

Light at the end of the tunnel?

Working closely with VM admins & network admins, we tried different configuration changes: host changes, subnet changes, proxy changes, all showing mixed results. We’d often see a positive improvement that would later turn out to be an anomaly.

A glimmer of hope came when I installed Wireshark on a few of the servers to try to analyze the issue more closely. As soon as Wireshark was opened, we got consistently fast results. Turn it off & things eventually slowed down again.

Based on this revelation, our VM admin came across a somewhat relevant post on VMware’s KB site:

https://kb.vmware.com/selfservice/search.do?cmd=displayKC&docType=kc&docTypeID=DT_KB_1_1&externalId=1027511

Though this article refers to Linux VMs, the additional info section is what led him to a solution:

“Using promiscuous mode in the Linux virtual machine leads to resolving the performance issue too because activating this mode leads to disabling LRO.”

By disabling LRO (Large Receive Offload) on the VMXNET3 NIC of the VM, we saw consistent performance improvements in load times. This remained the case even after Wireshark was uninstalled & also on VMs where Wireshark was never installed. All results were under 400ms!

A VMware admin may be able to provide further details on how to disable LRO, but this certainly resolved the lag issue we were having!

Hope this helps someone else out!
