Deprecated TCP Chimney and outdated hardware

batiati
Posts: 29
Joined: Mon Apr 08, 2019 1:16 pm
Location: Brazil

Thu Jul 25, 2019 5:20 pm

Hi all,

Let me share my experience dealing with outdated hardware in my first StarWind deployment.

I have two refurbished Dell PowerEdge R610 servers (dual Xeon X5630, 16 GB RAM) acting as storage nodes (converged scenario) with StarWind vSAN Free Edition.
Each node has 6 x 1 TB SSDs (RAID 5) and 2 x Broadcom NetXtreme II dual-port 10 Gbps network cards (BCM57711 / Dell KJYD8).

Although these servers are from 2011, they are still useful today, so I decided to install the latest available Windows Server (2019) on them.

All driver / firmware updates have been applied, and all recommendations from the StarWind KB have been followed (jumbo frames, netsh tweaks, etc.).
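
For reference, the jumbo-frames part of that can be done from PowerShell like this (the adapter name SAN01 and the 9014-byte value are only examples here; check what your driver and the KB actually expect):

Code: Select all

# Set jumbo frames on one of the SAN/SYNC adapters (value depends on the driver)
Set-NetAdapterAdvancedProperty -Name "SAN01" -RegistryKeyword "*JumboPacket" -RegistryValue 9014

# Confirm the value the driver actually accepted
Get-NetAdapterAdvancedProperty -Name "SAN01" -RegistryKeyword "*JumboPacket"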

...

First, I was very happy with the iperf results on the 10 Gbps Broadcom NICs, because I could achieve almost full wire speed on each link (~9.5 Gbps).
But my disappointment came when I ran iperf on all four interfaces at once: each link barely reached 4 Gbps, and sometimes when one port sped up, another one slowed down.
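
For anyone who wants to reproduce the all-links-at-once test, here is roughly how it can be done with iperf3, one server/client pair per link (the 10.10.x.x addresses and port numbers below are placeholders, not my actual config):

Code: Select all

# On node B: one iperf3 server per link, each on its own port (run in separate sessions)
iperf3 -s -p 5201
iperf3 -s -p 5202

# On node A: one client per link, bound to the local address of that NIC
iperf3 -c 10.10.1.2 -B 10.10.1.1 -p 5201 -t 30
iperf3 -c 10.10.2.2 -B 10.10.2.1 -p 5202 -t 30
# ...repeat for the remaining two links with their own subnets/ports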

So I started wondering whether it was a Broadcom card limitation or even a PCIe 2.0 bandwidth limit (PCIe 2.0 runs at 5 GT/s per lane, roughly 4 Gbps usable after 8b/10b encoding, so an x8 slot tops out around 32 Gbps per card).
I tried several configurations and experiments; in short, here's what worked for me:

1. Because my Broadcom network card does not support RDMA but does support TOE, I moved from Windows Server 2019 back to 2016.

TCP Chimney has been discontinued in Windows Server 2019, so you can no longer enable it there (netsh fails!).
https://blogs.technet.microsoft.com/ask ... kb4014193/

Code: Select all

netsh int tcp set global chimney=enabled
If you don't have RDMA support, enabling TCP Chimney really makes a huge difference.
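
To double-check that the offload state actually took effect on 2016, you can verify it with either of these (just a sanity check; the output labels can vary a bit between builds):

Code: Select all

# "Chimney Offload State" should report "enabled"
netsh int tcp show global

# PowerShell equivalent on Server 2016
Get-NetOffloadGlobalSetting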

2. I had to tune the RSS parameters, dedicating physical processor cores to each NIC and limiting the number of RSS queues to the number of assigned cores.

I experimented with the BIOS node interleaving setting, and node interleaving enabled (NUMA off) worked better in the iperf tests.

Code: Select all

# Give each adapter its own range of processors and limit it to 2 RSS queues
Set-NetAdapterRss -Name SAN01 -NumaNode 0 -BaseProcessorNumber 0 -MaxProcessorNumber 2 -Profile ClosestStatic -NumberOfReceiveQueues 2
Set-NetAdapterRss -Name SAN02 -NumaNode 0 -BaseProcessorNumber 4 -MaxProcessorNumber 6 -Profile ClosestStatic -NumberOfReceiveQueues 2
Set-NetAdapterRss -Name SYNC01 -NumaNode 0 -BaseProcessorNumber 8 -MaxProcessorNumber 10 -Profile ClosestStatic -NumberOfReceiveQueues 2
Set-NetAdapterRss -Name SYNC02 -NumaNode 0 -BaseProcessorNumber 12 -MaxProcessorNumber 14 -Profile ClosestStatic -NumberOfReceiveQueues 2
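
It's also worth checking what the driver actually accepted afterwards, since not every adapter honours all of these parameters:

Code: Select all

# Show the effective RSS configuration for each SAN/SYNC adapter
Get-NetAdapterRss -Name "SAN01","SAN02","SYNC01","SYNC02"
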
3. Enabling Ethernet flow control on the switch ports and NIC ports helped sustain uniform throughput when all NICs are busy.
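
On the NIC side, flow control usually sits in the advanced driver properties. Something like the snippet below works on many adapters, but the display name and the accepted values depend on the driver (the Broadcom driver may label it differently), so treat it as a sketch:

Code: Select all

# Check the current flow control setting
Get-NetAdapterAdvancedProperty -Name "SAN01" -DisplayName "Flow Control"

# Enable flow control in both directions (the value string varies per driver)
Set-NetAdapterAdvancedProperty -Name "SAN01" -DisplayName "Flow Control" -DisplayValue "Rx & Tx Enabled"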

After that, on both nodes, I reached about 80% of wire speed running iperf on all four interfaces at the same time, which is higher than my SSD array can deliver anyway.
Oleg(staff)
Staff
Posts: 568
Joined: Fri Nov 24, 2017 7:52 am

Mon Jul 29, 2019 9:28 am

Hi batiati,
Thank you for your detailed instructions.