Mellanox ConnectX-3 slow writes

Software-based VM-centric and flash-friendly VM storage + free version

Moderators: anton (staff), art (staff), Anatoly (staff), Max (staff)

lohelle
Posts: 144
Joined: Sun Aug 28, 2011 2:04 pm

Fri Sep 23, 2016 7:20 pm

We are having some performance issues in our environment, specifically with write performance on one path to one of our SANs.

We have a dual-port 40GbE Mellanox ConnectX-3 for sync + iSCSI and a dual-port 10GbE Emulex for iSCSI on each HA pair.
This is a VMware environment, and all hosts have dual-port 10GbE Emulex NICs.

The problem is the performance to the 40GbE path on one of the SAN nodes.
To narrow down the problem, I have installed a 1.6TB Intel P3600 NVMe drive on that node and am testing with a standalone target. We have tried multiple hosts (using different switch paths), so I do not think the problem is there.

When using a path to the 40GbE NIC, the maximum write transfer speed peaks at 200-300 MB/s.

Using one of the 10GbE paths, the scaling is very good, and the performance peaks at near line rate.
Using round robin + some MPIO tweaks, the P3600 drive started to become the limiting factor, at least for writes. But only when also disabling the single 40GbE path, so that the load was balanced across the dual 10GbE NICs.
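For reference, the round-robin tweak most commonly applied on the ESXi side is lowering the path-switch IOPS limit from its default of 1000 to 1. A minimal sketch, assuming the device ID `naa.xxxx` is a placeholder for your actual LUN:

```shell
# List devices and their current path selection policy
esxcli storage nmp device list

# Set the path selection policy to round robin for the device
# (naa.xxxx is a placeholder for the real device identifier)
esxcli storage nmp device set --device=naa.xxxx --psp=VMW_PSP_RR

# Switch paths after every 1 I/O instead of the default 1000,
# which spreads load across paths much more evenly
esxcli storage nmp psp roundrobin deviceconfig set --device=naa.xxxx --type=iops --iops=1
```

Whether IOPS=1 is optimal varies by workload and array; it is a starting point, not a universal recommendation.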

Do you have any idea where I should troubleshoot? Are there any advanced driver settings we should try?
I know this is not actually a StarWind problem, but I guess many people here have experience with similar problems and solutions.
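As a hedged example of how the advanced driver settings can be inspected and changed on the Windows side from PowerShell (the adapter name pattern and the "Jumbo Packet" display name are assumptions; display names vary by driver version):

```shell
# List every advanced property the NIC driver exposes,
# with current values (adapter name pattern is a placeholder)
Get-NetAdapterAdvancedProperty -Name "Ethernet 40G*"

# Example: enable jumbo frames on that adapter; the exact
# -DisplayName and valid -DisplayValue depend on the driver version
Set-NetAdapterAdvancedProperty -Name "Ethernet 40G*" -DisplayName "Jumbo Packet" -DisplayValue "9014"
```

Properties like receive/send buffer sizes, RSS, and interrupt moderation are typically adjusted the same way.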
Btw, testing with NTttcp we get much better performance (20 Gbit/s+).
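For reference, a minimal NTttcp run of the kind mentioned above might look like this (the IP address 10.0.0.2, thread count, and runtime are placeholders):

```shell
# On the receiving node (10.0.0.2 is that node's IP)
ntttcp.exe -r -m 8,*,10.0.0.2 -t 60

# On the sending node, targeting the receiver's IP:
# 8 threads spread across all CPUs, 60-second run
ntttcp.exe -s -m 8,*,10.0.0.2 -t 60
```

Raising the thread count in `-m` or the buffer length (`-l`) is the usual way to push a 40GbE link closer to line rate.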

Attached is a Bench32 test using the 10GbE path and the 40GbE path.
The OS is Windows Server 2012 R2; we have tested a few driver versions with the same result. We let the driver setup optimize the Windows settings.
Attachments
40GbE
40gbe path.png (77.35 KiB)
10GbE
10gbe path.png (72.37 KiB)
Al (staff)
Staff
Posts: 43
Joined: Tue Jul 26, 2016 2:26 pm

Thu Sep 29, 2016 3:58 pm

Hello Lohelle,

Thank you for your post here.

Could you please specify the driver version you are using? Also, it would be great if you could tell us which ESXi version you are running.
lohelle
Posts: 144
Joined: Sun Aug 28, 2011 2:04 pm

Thu Sep 29, 2016 8:20 pm

I'm currently on driver version 5.22.12447.0.
ESXi 6.0 build 4192238
Al (staff)
Staff
Posts: 43
Joined: Tue Jul 26, 2016 2:26 pm

Tue Oct 04, 2016 4:05 pm

Hello Lohelle,

I have a couple of questions.

When did this problem start? Did you upgrade the drivers for your Mellanox cards?

We have tested Mellanox ConnectX-3 cards in our lab with the 5.22.12433 drivers and did not see such performance issues. Could you please try this driver and update us with the results?
lohelle
Posts: 144
Joined: Sun Aug 28, 2011 2:04 pm

Tue Oct 04, 2016 5:40 pm

The problem was there from the start when installing/upgrading to the new 40 Gig card. I did try swapping the cards, with the same result.

I think I will reinstall Windows, just to rule out some driver/settings issues. Will try the driver you recommend. Thanks!

I know this is not a StarWind issue, but I also know you have a lot of experience with Mellanox, and that you are good at troubleshooting network and storage performance issues. Comes with the job, I guess.

Thanks!
Al (staff)
Staff
Posts: 43
Joined: Tue Jul 26, 2016 2:26 pm

Wed Oct 05, 2016 2:51 pm

Hello Lohelle,

I hope the reinstall will help you :)

Could you please update us as soon as you have tested it?
Al (staff)
Staff
Posts: 43
Joined: Tue Jul 26, 2016 2:26 pm

Fri Oct 21, 2016 4:05 pm

Hello Lohelle.

Do you have any updates for us?

Thanks.
lohelle
Posts: 144
Joined: Sun Aug 28, 2011 2:04 pm

Fri Oct 21, 2016 5:09 pm

I have been on Gran Canaria for a couple of weeks, but I hope to test in a few days.
Will StarWind run fine on Windows Server 2016?
Al (staff)
Staff
Posts: 43
Joined: Tue Jul 26, 2016 2:26 pm

Fri Oct 21, 2016 5:22 pm

Hello Lohelle,

We have tested StarWind on Windows Server 2016 Technical Preview 5 and it worked fine.

We are testing it on a release version now.
lohelle
Posts: 144
Joined: Sun Aug 28, 2011 2:04 pm

Sat Oct 29, 2016 8:18 pm

Hello!
After installing Windows Server 2016 (from scratch), performance was much better.
It was a little unstable (it varied between 400-800 MB/s in a VM), but much better.

I'm looking forward to installing a few 40 Gig cards in my most powerful VMware hosts and running some benchmarks with an end-to-end 40 Gig network.

A quick NTttcp test between the StarWind nodes showed 25 Gbit/s on the sync connection. I guess it would do 30 Gbit/s+ with some tweaking of the parameters.
anton (staff)
Site Admin
Posts: 4021
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands
Contact:

Tue Nov 01, 2016 9:48 am

I'd suggest waiting for the upcoming V8 R5 and giving iSER a try, as it makes things MUCH faster (we'll publish a doc with performance numbers for iSCSI vs. iSER soon after release).
Even if you run StarWind inside a VM, you can try passing the RDMA-capable Mellanox NIC into the StarWind Windows Control VM via SR-IOV to implement RDMA sync traffic. With ESXi 6.5 and newer driver versions, Mellanox promises miracles ;)
lohelle wrote: Hello!
After installing Windows Server 2016 (from scratch), performance was much better.
It was a little unstable (it varied between 400-800 MB/s in a VM), but much better.

I'm looking forward to installing a few 40 Gig cards in my most powerful VMware hosts and running some benchmarks with an end-to-end 40 Gig network.

A quick NTttcp test between the StarWind nodes showed 25 Gbit/s on the sync connection. I guess it would do 30 Gbit/s+ with some tweaking of the parameters.
Regards,
Anton Kolomyeytsev

Chief Technology Officer & Chief Architect, StarWind Software
