Deprecated TCP Chimney and outdated hardware
Posted: Thu Jul 25, 2019 5:20 pm
Hi all,
Let me share my experience dealing with outdated hardware in my first StarWind deployment.
I have two refurbished Dell PowerEdge R610 servers (dual Xeon X5630, 16 GB RAM) acting as storage nodes (converged scenario) with StarWind vSAN Free Edition.
Each node has 6 x 1 TB SSDs (RAID 5) and 2 x Broadcom NetXtreme II dual-port 10 Gbps network cards (BCM57711 / Dell KJYD8).
Although these servers are from 2011, they are still useful today, so I decided to install the latest available Windows Server 2019 on them.
All driver / firmware updates have been applied, and all recommendations from the StarWind KB have been followed (jumbo frames, netsh tweaks, etc.).
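For reference, this is roughly how I applied and verified jumbo frames on the 10 GbE ports (just a sketch: the registry keyword/value depend on the driver, the interface names are simply how I labeled my links, and the ping target address is only an example):

Code:
# Set a ~9K MTU on all four StarWind links (keyword/value may differ per driver)
Set-NetAdapterAdvancedProperty -Name SAN01,SAN02,SYNC01,SYNC02 -RegistryKeyword "*JumboPacket" -RegistryValue 9014
# Verify end to end with a do-not-fragment ping (8972 bytes = 9000 minus IP/ICMP headers)
ping 172.16.10.2 -f -l 8972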
...
First, I was very happy with the iperf results on the 10 GbE Broadcom NICs, because I could achieve almost full speed on each link (~9.5 Gbps).
But my disappointment came when I ran iperf on all four interfaces at once: each link barely reached 4 Gbps, and sometimes when one port reached a higher speed it slowed down another one.
So I started wondering whether it was caused by a Broadcom card limitation or even a PCIe 2.0 (5 GT/s) bandwidth limit.
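In case anyone wants to reproduce the test, this is roughly how I ran it (iperf3 syntax; the addresses and ports are only examples, shown for two of the four links):

Code:
# On the partner node: one listener per link, each bound to that link's address
iperf3 -s -B 172.16.10.2 -p 5201
iperf3 -s -B 172.16.20.2 -p 5202
# On this node: start all clients at the same time, bound to the matching local address
iperf3 -c 172.16.10.2 -p 5201 -B 172.16.10.1 -t 30
iperf3 -c 172.16.20.2 -p 5202 -B 172.16.20.1 -t 30
# (repeat for the remaining links)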
I tried several configurations and experiments; in short, this is what worked for me:
1. Because my Broadcom network cards do not support RDMA, but do support TOE, I moved from Windows Server 2019 back to 2016.
TCP Chimney has been discontinued in Windows Server 2019, so you can no longer enable it (netsh fails!).
https://blogs.technet.microsoft.com/ask ... kb4014193/
If you don't have RDMA support, enabling TCP Chimney really makes a huge difference.
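This is the setting that 2019 rejects but 2016 still accepts:

Code:
netsh int tcp set global chimney=enabled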
2. I had to tune RSS parameters, dedicating physical processor cores to each NIC and limiting the number of RSS queues to the number of assigned cores.
I tried both enabling and disabling node interleaving in the BIOS, and node interleaving enabled (NUMA off) worked better in the iperf tests.
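These are the RSS settings I ended up with on the four StarWind interfaces:

Code:
Set-NetAdapterRss -Name SAN01 -NumaNode 0 -BaseProcessorNumber 0 -MaxProcessorNumber 2 -Profile ClosestStatic -NumberOfReceiveQueues 2
Set-NetAdapterRss -Name SAN02 -NumaNode 0 -BaseProcessorNumber 4 -MaxProcessorNumber 6 -Profile ClosestStatic -NumberOfReceiveQueues 2
Set-NetAdapterRss -Name SYNC01 -NumaNode 0 -BaseProcessorNumber 8 -MaxProcessorNumber 10 -Profile ClosestStatic -NumberOfReceiveQueues 2
Set-NetAdapterRss -Name SYNC02 -NumaNode 0 -BaseProcessorNumber 12 -MaxProcessorNumber 14 -Profile ClosestStatic -NumberOfReceiveQueues 2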
3. Enabling TCP flow control on the switch and NIC ports helped sustain uniform throughput when all NICs are busy.
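On the NIC side, flow control can be toggled from PowerShell; a quick sketch, assuming the Broadcom driver exposes a "Flow Control" advanced property (the display name and value strings vary between drivers):

Code:
# Check the current flow control setting on the four StarWind links
Get-NetAdapterAdvancedProperty -Name SAN01,SAN02,SYNC01,SYNC02 -DisplayName "Flow Control"
# Enable RX and TX flow control (the exact DisplayValue text depends on the driver)
Set-NetAdapterAdvancedProperty -Name SAN01,SAN02,SYNC01,SYNC02 -DisplayName "Flow Control" -DisplayValue "Rx & Tx Enabled"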
After that, on both nodes, I reached about 80% of the wire speed by running iperf on all four interfaces at the same time, which is higher than my SSD speed.