vSphere VSA sync/iSCSI network speed

kwaleeb
Posts: 4
Joined: Thu Apr 11, 2019 3:49 am

Thu Apr 11, 2019 4:03 am

So I'm currently doing some testing with StarWind VSA for a 2-node vSphere setup and am running into some issues. Both servers are equipped with 5x 1.2TB write-intensive SSDs in a RAID5 on a P440ar storage controller, plus dual-port 25Gb/s HPE/Mellanox 640FLR-SFP28 Ethernet cards directly connected to one another for Sync and iSCSI/heartbeat. I've installed the OVAs on both servers and set them up with the proper networking (Sync vSwitch, iSCSI vSwitch, management vSwitch, all VMXNET3 adapters on the VMs).

However, when I run iperf between the StarWind VSAs themselves over their direct links, I only get about 15Gb/s tops, not the 25Gb/s the cards are rated for. The cards show a 25Gb/s link speed on the physical NICs tab in ESXi, and I've enabled jumbo frames on all the vSwitches. I've also applied every available firmware update to the storage controllers, network adapters, and BIOS/iLO, so both hosts are on the same, most recent versions.

The network adapters show as 10Gb/s in the StarWind VSA web interface, but that obviously isn't accurate or I wouldn't be getting 15Gb/s. Is this a limitation of the VMXNET3 adapter, or is there something I'm missing?
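For reference, the test between the two VSAs looks roughly like this (iperf3 shown; the Sync-subnet IPs are just examples):

    # on VSA 1 (server side)
    iperf3 -s

    # on VSA 2 (client side), over the direct Sync link
    iperf3 -c 172.16.10.1 -t 30

    # a single TCP stream often can't saturate a 25Gb/s link;
    # parallel streams show what the path can really do
    iperf3 -c 172.16.10.1 -t 30 -P 4

Jumbo frames can also be verified end-to-end from the ESXi hosts (vmkernel interface name is an example; 8972 bytes of payload + 28 bytes of IP/ICMP headers = 9000 MTU, and -d sets don't-fragment):

    vmkping -I vmk1 -d -s 8972 172.16.10.2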
Serhi
Posts: 21
Joined: Mon Mar 25, 2019 4:01 pm

Thu Apr 11, 2019 2:08 pm

Hello

The reason is probably RAID5.

BR, Serhi
kwaleeb
Posts: 4
Joined: Thu Apr 11, 2019 3:49 am

Thu Apr 11, 2019 3:40 pm

But doesn't iperf run its tests in memory? And even if the traffic were hitting the RAID5 array, it's an all-flash array, so shouldn't it manage more than 15Gb/s?
Boris (staff)
Staff
Posts: 805
Joined: Fri Jul 28, 2017 8:18 am

Thu Apr 11, 2019 10:52 pm

Serhi,

iperf does not rely on storage, as it runs data transfer to RAM.
---------
kwaleeb,

It's a usual situation for the VMXNET3 adapter to report a 10Gbps link speed while in fact being faster. In my experience, I've never seen an ESXi environment hit the ceiling when testing network throughput with NICs faster than 10Gbps: it floats at 15-25Gbps for 25-40Gbps NICs, with some random peaks. I believe it comes down to the way VMXNET3 is implemented, so we just need to wait for VMware to fix it. It could be a long wait, though. As for drivers, I've run into situations where the latest driver version gave worse results than the next-to-latest one, for instance. Make sure you are not missing any updated networking VIB at the ESXi level.
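For example, you can check the installed Mellanox driver VIB and the firmware/driver pair on an uplink like this (the vmnic name and bundle path below are just examples):

    esxcli software vib list | grep nmlx
    esxcli network nic get -n vmnic4      # shows driver and firmware versions for that uplink
    esxcli software vib update -d /tmp/MLNX-NATIVE-ESX-offline_bundle.zip   # reboot afterwards

Inside the Linux-based VSA itself, ethtool will typically keep reporting 10000Mb/s for a VMXNET3 interface regardless of what it can actually push.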
kwaleeb
Posts: 4
Joined: Thu Apr 11, 2019 3:49 am

Fri Apr 12, 2019 12:54 am

So today I updated the ESXi network drivers/VIBs for the adapter (nmlx5-core) with the most recent ones from Mellanox's website and disabled DRSS/RSS mode as recommended in the release notes for vSphere 6.7. This helped a little, and I can now get 20Gb/s with iperf directly from ESXi host to ESXi host.
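For anyone else doing this, the commands were along these lines (parameter names per the Mellanox release notes; a reboot is needed afterwards):

    # check the current module parameters
    esxcli system module parameters list -m nmlx5_core

    # disable DRSS/RSS as per the release notes
    esxcli system module parameters set -m nmlx5_core -p "DRSS=0 RSS=0"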

We have two identical hosts with this hardware/config:
HPE DL380 Gen9
2x Intel Xeon E5-2660 v3 2.6GHz
128GB DDR4-2133 (HP certified)
HPE 640FLR-SFP28 25Gb/s Mellanox Ethernet (firmware version 14.23.8052, driver version 4.17.14.2)
HPE P440ar storage controller (firmware version 6.88)
5x HPE 1.2TB write-intensive SSDs in RAID5

1x 250GB Virtual Disk for the StarWind VM and other local VMs
1x 100GB Virtual Disk passed directly to the StarWind VM for image storage
1x 100GB Virtual Disk passed directly to the test VM

I'm using the most recent StarWind VSA image for vSphere. The VSA's StarWind image files are stored on 100GB virtual disks that were created on the RAID controller and passed through to the VM using "New RAW disk" in ESXi. The disks were zeroed, formatted with XFS file systems, and mounted at /mnt/disk1. I then used the StarWind Management Console to create a 40GB Hard Disk/Virtual Disk device without any RAM or flash cache and replicated the image to the second host. The Sync and Heartbeat/iSCSI channels are set to the two separate VMXNET3 interfaces, which run through separate vSwitches to each of the 640FLR-SFP28's 25Gb/s interfaces.
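The disk prep inside the VSA was along these lines (the device name is an example; the RAW disk simply shows up as another SCSI disk in the guest):

    mkfs.xfs /dev/sdb
    mkdir -p /mnt/disk1
    mount /dev/sdb /mnt/disk1
    # plus an /etc/fstab entry so the mount persists across reboots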

[screenshots: StarWind device and replication configuration]

I also set 'Disk.DiskMaxIOSize' to 512 on each ESXi host, added the dynamic iSCSI targets in VMware, and created a 40GB VMFS6 datastore from the STARWIND device presented through iSCSI.
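For reference, the same steps can be done from the ESXi shell (the vmhba name and target IP are examples):

    esxcli system settings advanced set -o /Disk/DiskMaxIOSize -i 512
    esxcli iscsi adapter discovery sendtarget add -A vmhba64 -a 172.16.20.1:3260
    esxcli storage core adapter rescan -A vmhba64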

[screenshots: iSCSI targets and the new VMFS6 datastore]

To test actual disk performance, I spun up a Win10 Pro virtual machine with its hard disk located on the 40GB datastore that runs through StarWind's iSCSI interface to the ESXi hosts. Using CrystalDiskMark 6, these were my results. C: is stored on the 40GB HA image through StarWind iSCSI; E: is a RAW 100GB virtual disk passed directly to the VM from the RAID controller. Nothing else was running on the server while these tests were being done.

StarWind disk:
[screenshot: CrystalDiskMark results]

RAW Virtual Disk (direct storage controller access):
[screenshot: CrystalDiskMark results]

The same results occur when using Windows' winsat.

StarWind disk:
[screenshot: winsat results]

RAW Virtual Disk (direct storage controller access):
[screenshot: winsat results]
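For anyone repeating the winsat runs, they were along these lines (drive letters per the layout above):

    winsat disk -drive c
    winsat disk -drive e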


What is the issue? Obviously something is causing latency/speed problems, yet the network still runs at a minimum of 15Gb/s. Any suggestions?
Oleg(staff)
Staff
Posts: 568
Joined: Fri Nov 24, 2017 7:52 am

Mon Apr 15, 2019 2:30 pm

Hi kwaleeb,
kwaleeb wrote: "To test actual disk performance, I spun up a Win10 Pro virtual machine with its hard disk located on the 40GB datastore that runs through StarWind's iSCSI interface to the ESXi hosts."
Could you please clarify what disk you used for this test machine? Was it thick provisioned, eager zeroed?
kwaleeb
Posts: 4
Joined: Thu Apr 11, 2019 3:49 am

Tue Apr 16, 2019 2:59 pm

The Win10 Pro virtual machine's VMDK was located on the StarWind datastore. The VMDK that the StarWind VSA VM uses to store image files was eager zeroed, but the Win10 Pro VMDK itself was not.

I know you're supposed to thick provision eager zero all the VMDKs that the StarWind VM uses to store image files, but do I need to thick provision eager zero all the VMs that are stored on the StarWind datastore as well?
Boris (staff)
Staff
Posts: 805
Joined: Fri Jul 28, 2017 8:18 am

Wed Apr 17, 2019 1:39 pm

If you are after storage performance, the answer is yes: eager zeroed thick disks are recommended for the VMs on the StarWind datastore as well.
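An existing VMDK can be converted in place from the ESXi shell while the VM is powered off (the datastore and VM paths below are just examples):

    # inflate a thin-provisioned disk to eagerzeroedthick
    vmkfstools -j /vmfs/volumes/StarWind-DS/Win10/Win10.vmdk

    # or eager-zero a lazy-zeroed (zeroedthick) disk
    vmkfstools -k /vmfs/volumes/StarWind-DS/Win10/Win10.vmdk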