MPIO: Poor performance using Starwind 6.0


Matthias
Posts: 14
Joined: Tue Apr 02, 2013 1:38 pm

Mon Apr 08, 2013 1:24 pm

Thank you all for coming back to me about this...

I applied the recommended TCP/IP settings according to the pinned post.
I applied the TcpAckFrequency registry patch (a rough sketch of how I did this is below).
I also experimented with the offload settings.
I rolled back to the previous release of the Realtek drivers.

Note: After all those changes, I have only disabled/re-enabled the network adapters so far, but have NOT rebooted the server.
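
For reference, here is roughly how I applied the TcpAckFrequency value, as a minimal Python sketch. It writes the commonly recommended DWORD value of 1 under every interface GUID key (strictly, only the iSCSI-facing adapters need it) and has to be run elevated, so please treat it as an illustration rather than a tested tool:

    # Sketch: write TcpAckFrequency=1 under every TCP/IP interface key
    # (the usual recommendation for iSCSI traffic; run from an elevated prompt).
    import winreg

    BASE = r"SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\Interfaces"

    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, BASE) as interfaces:
        subkey_count = winreg.QueryInfoKey(interfaces)[0]      # number of adapter GUID subkeys
        for i in range(subkey_count):
            guid = winreg.EnumKey(interfaces, i)               # one subkey per adapter/interface
            with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, BASE + "\\" + guid,
                                0, winreg.KEY_SET_VALUE) as key:
                winreg.SetValueEx(key, "TcpAckFrequency", 0,
                                  winreg.REG_DWORD, 1)         # 1 = acknowledge every segment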
None of the changes had any significant impact on performance, neither with nor without MPIO.
The symptoms are still the same: small transfer sizes are ultra slow, while large sizes seem adequate. With small transfer sizes, Resource Monitor shows no significant load on the NICs.

I am planning to order new NICs; however, this will take a few days. I am still afraid the issue will not be resolved even then...

Best regards,
Matthias
jeddyatcc
Posts: 49
Joined: Wed Apr 25, 2012 11:52 pm

Mon Apr 08, 2013 4:40 pm

I took a much closer look at your tests. I think you are really seeing the difference between TCP/iSCSI and SAS/SATA; they use very different mechanisms to do the same thing. If you have it available, move your SSDs to SATA II connections and retest. In a perfect world with MPIO enabled, that is the best you are going to get. The next test is to mount the drive via SMB and run ATTO again. This won't use MPIO, but you will see what one NIC can do for small transfers over SMB.

I think you will find that iSCSI/TCP is not built for many small transfers, whereas large sustained transfers will push it to the limit. Now the big question is: does it matter? For what I do, so far the answer has been no, but I can see a point in the not-so-distant future where iSCSI just can't compete with locally attached shared storage (Fibre Channel, SAS). The only reason I'm using iSCSI is that I can spread the disks across different data centers and still access them over normal commodity switches. For that, there is a penalty on small transfers... If someone else has different numbers, please share, as otherwise I would have to consider everything you are seeing as normal for iSCSI, even with 10 Gb/s NICs.
Matthias
Posts: 14
Joined: Tue Apr 02, 2013 1:38 pm

Mon Apr 08, 2013 5:35 pm

Thank you, jeddyatcc.

My benchmarks with a localhost iSCSI connection on the server machine were great: 85-90 % of a local SATA connection (see previous post).
So I am not sure this is a general iSCSI issue.

One of your questions was whether the problem matters. In short: I think yes. We are using iSCSI to host VMs that compile software. They generate a high I/O workload, i.e. many small files. This is also exactly what you want to boost when using SSDs instead of HDDs: I/O throughput.

A few posts earlier, Max raised a similar doubt about whether the problem matters:
>>> iSCSI uses 64k blocks thus, shows good results in ATTO benchmark.
I think this suggestion is wrong: ATTO benchmarks storage performance at a higher level than iSCSI.

Just for fun, I ran ATTO on an SMB2 share using one NIC: for small transfer sizes, the performance is equal to iSCSI with MPIO enabled (3 NICs).

I am still convinced that there must be an error in my system setup.

Best regards,
Matthias
jeddyatcc
Posts: 49
Joined: Wed Apr 25, 2012 11:52 pm

Mon Apr 08, 2013 6:18 pm

You may be right, but the best test will be to load up a system locally using Hyper-V and time your compiles, then move the VHD to iSCSI storage and try again. I honestly don't think you will see as big a difference as you expect, but then you will know whether it is a problem or not.
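
If it helps, this is the kind of comparison I mean, sketched in Python; the drive letters and the msbuild command line are placeholders, not anything from your setup:

    # Sketch: time the same rebuild on a local disk and on the iSCSI-backed disk.
    # D:\ (local SSD), E:\ (iSCSI volume) and the build command are placeholders.
    import subprocess
    import time

    def timed_build(workdir):
        start = time.perf_counter()
        subprocess.run(["msbuild", "solution.sln", "/t:Rebuild"],
                       cwd=workdir, check=True)                # placeholder build step
        return time.perf_counter() - start

    local = timed_build(r"D:\build\project")    # copy of the sources on local storage
    iscsi = timed_build(r"E:\build\project")    # identical copy on the iSCSI volume
    print(f"local: {local:.1f}s  iscsi: {iscsi:.1f}s  ratio: {iscsi/local:.2f}x")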
Matthias
Posts: 14
Joined: Tue Apr 02, 2013 1:38 pm

Mon Apr 08, 2013 6:33 pm

Thanks. I ordered an Intel I350-T2 NIC and will come back later this week...
Anatoly (staff)
Staff
Posts: 1675
Joined: Tue Mar 01, 2011 8:28 am

Wed Apr 10, 2013 10:15 am

We'll gladly look forward to hearing back from you.
Best regards,
Anatoly Vilchinsky
Global Engineering and Support Manager
www.starwind.com
av@starwind.com
Matthias
Posts: 14
Joined: Tue Apr 02, 2013 1:38 pm

Sun Apr 14, 2013 9:27 pm

Finally, I managed to install the new Intel NIC in the server machine.

Connections using a single port are 10-15 % faster compared to the Realtek NIC.

However, unfortunately, changing the NICs (or some side effect of the change) has led to a more serious problem:
While an ATTO benchmark shows full NIC load in both directions for a single connection, MPIO read performance is now horribly slow.
A 128k ATTO transfer shows > 100 MB/s write speed, but only ~35 MB/s read.
The settings are close to default:
- 2 x I350 NIC (server), 2 x NC362 (initiator), all settings at default except jumbo frames enabled (9k)
- Round Robin
- The TcpAckFrequency registry patch has been applied.

Best regards,
Matthias
imrevo
Posts: 26
Joined: Tue Jan 12, 2010 9:20 am
Location: Germany

Tue Apr 16, 2013 4:16 am

Good morning Matthias,
Matthias wrote: While an ATTO benchmark shows full NIC load in both directions for a single connection, MPIO read performance is now horribly slow.
A 128k ATTO transfer shows > 100 MB/s write speed, but only ~35 MB/s read.
The settings are close to default:
- 2 x I350 NIC (server), 2 x NC362 (initiator), all settings at default except jumbo frames enabled (9k)
- Round Robin
- The TcpAckFrequency registry patch has been applied.
May I ask for the IOPS setting and the IP configuration of the iSCSI vmkernels?

Could you post some screenshots of the ESXi network configuration, especially the MTU settings of vSwitches and vmkernels? I once forgot to set the MTU for the vSwitch to 9000, giving me similar results...
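
A quick way to check whether 9000-byte frames really pass unfragmented on every path is a don't-fragment ping with an 8972-byte payload (8972 bytes plus 28 bytes of ICMP/IP headers make up the 9000-byte frame). Here is a small Python sketch; the two target addresses are placeholders for your iSCSI portal IPs:

    # Sketch: verify that jumbo frames pass end-to-end on each iSCSI path.
    # On Windows, ping -f sets "don't fragment" and -l sets the payload size.
    import subprocess

    for target in ["192.168.10.1", "192.168.20.1"]:            # placeholder portal IPs
        result = subprocess.run(["ping", "-f", "-l", "8972", "-n", "2", target],
                                capture_output=True, text=True)
        ok = "TTL=" in result.stdout and "fragmented" not in result.stdout
        print(target, "jumbo OK" if ok else "jumbo NOT passing:")
        if not ok:
            print(result.stdout)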

bye
Volker
Matthias
Posts: 14
Joined: Tue Apr 02, 2013 1:38 pm

Tue Apr 16, 2013 8:24 pm

Dear Volker,
Thank you for your reply.

I am sorry that I didn't point this out clearly again; our setup was described in more detail in post #1. There is no VMware in our setup, only a Windows Storage Server 2012 machine (iSCSI target, StarWind) and a Windows Server 2008 R2 client (iSCSI initiator).
I did not change the IOPS setting, and jumbo frames are enabled on all components.

As there was a big difference in small-transfer-size performance between a local loopback connection on the target and our MPIO setup using 3 NICs, I just started following the performance tips. Now my setup seems to be messed up...

Best regards,
Matthias
Matthias
Posts: 14
Joined: Tue Apr 02, 2013 1:38 pm

Thu Apr 18, 2013 9:09 pm

Hi all,

Finally, I SOLVED my issues.

There was a strange problem with my Cisco SGS200 switch: a single-NIC ATTO benchmark was fine, but load from multiple connections resulted in sporadic, severe slowdowns.

Meanwhile, the Intel I350 NIC works well in a 3-path MPIO setup together with one Realtek adapter. Some results are attached. Although still far from loopback performance for small transfer sizes, this is a significant improvement, maybe also due to the TCP parameter tuning...

Thank you all very much.
Attachment: 2xIntel-Plus-1xRealtek.PNG (final result)
Anatoly (staff)
Staff
Posts: 1675
Joined: Tue Mar 01, 2011 8:28 am

Wed May 15, 2013 11:09 am

Thank you very much for your feedback!
We are really glad that your issue got solved, and we truly hope that no new ones will appear.
It was our pleasure to assist you.
Best regards,
Anatoly Vilchinsky
Global Engineering and Support Manager
www.starwind.com
av@starwind.com