
ESXi SSD / NVME actual IOPS numbers

Posted: Thu May 26, 2016 1:00 am
by dwright1542
I've been working with the StarWind (SW) guys for the better part of six months trying to get some PCIe flash cards / SSDs running properly in a hyperconverged environment.


I've tried many different combos, and even set up a physical SW server with some LSI Nytro drives.

Basically, in a hyperconverged environment on Fusion-io ioDrive2 cards, we can only get 20k IOPS at 4K random read, QD32.

In the physical environment, with cards that can locally do 330k IOPS, I can only get 60k across the network.

Tried Dell C2100s and HP DL385 G8s, all with tons of memory and the BIOS set to performance mode. All the normal stuff; it doesn't make a difference.

10G network, Intel X520 adapters, jumbo frames.
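(For reference, a quick sanity check that jumbo frames really pass end-to-end is a don't-fragment vmkping from the ESXi shell; the target IP below is just a placeholder:

vmkping -d -s 8972 192.168.10.12

8972 bytes is the 9000-byte MTU minus 28 bytes of IP/ICMP headers.)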

The SW guys are stumped, and VMware support is a JOKE anymore (send them logs and they MAY get back to you, maybe not). We're kind of at a loss.

Does anyone have real-world data they can share showing more than 20k IOPS on ESXi in a hyperconverged environment?

Re: ESXi SSD / NVME actual IOPS numbers

Posted: Thu May 26, 2016 4:36 pm
by nohope
May I ask how you benchmarked the physical storage? As far as I know, it is a bit tricky to benchmark reads and writes directly on an SSD; the drive's built-in cache can skew the results, and there is no easy way around that.

Re: ESXi SSD / NVME actual IOPS numbers

Posted: Fri May 27, 2016 1:53 pm
by dwright1542
nohope wrote:May I ask how you benchmarked the physical storage? As far as I know, it is a bit tricky to benchmark reads and writes directly on an SSD; the drive's built-in cache can skew the results, and there is no easy way around that.
Sure. We run IOmeter from the SW VM to get a baseline. Then we share out that SSD and run the same test from a guest on the same machine.

(OK, take the SSD out of the picture; it's the same issue with a SW RAM disk.)
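For anyone who wants to reproduce a comparable profile without the IOmeter GUI, a rough equivalent of the 4K random read, QD32 run with Microsoft's diskspd would look like this (the test file path and size are placeholders; run it inside the guest against the presented LUN):

diskspd.exe -c20G -b4K -r -o32 -t1 -d60 -Sh -L T:\iotest.dat

-o32 with a single thread gives the QD32 queue depth, -r makes the I/O random, -Sh disables software caching and hardware write caching, and -L records latency statistics.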

Re: ESXi SSD / NVME actual IOPS numbers

Posted: Fri May 27, 2016 2:52 pm
by hste
Have you looked into low latency settings?
http://www.vmware.com/files/pdf/techpap ... here55.pdf
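That paper is the one covering the per-VM Latency Sensitivity feature in vSphere 5.5. As a rough sketch (check the paper for the prerequisites before relying on it), the setting can be changed in the VM's advanced options in the vSphere Web Client, or as a .vmx entry:

sched.cpu.latencySensitivity = "high"

The high setting generally expects full CPU and memory reservations on the VM, so budget for that.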

hste

Re: ESXi SSD / NVME actual IOPS numbers

Posted: Sat May 28, 2016 4:43 am
by dwright1542
That's a good link, thanks. I'm relying right now on the SW guys to nail this down. I feel like that may just be squeezing the final IOPS out of the system. I'm only getting 20% of the native IOPS, so there must be something HUGE bottlenecking the system.

-D

Re: ESXi SSD / NVME actual IOPS numbers

Posted: Mon May 30, 2016 11:57 am
by anton (staff)
Here's the whole story:

1) The Microsoft storage stack is very old code; it was written and designed when the underlying storage was slow. As a result, Windows isn't going to deliver all possible IOPS from a single very fast NVMe card. There are a couple of workarounds:

a) Disable the I/O scheduler. There will be fewer context switches, and I/O will be synchronous instead of async.

HKLM\SYSTEM\CurrentControlSet\Control\StorVSP\IOBalance\Enabled:0

Measure performance on the NVMe-mapped disk BEFORE and AFTER applying this key (don't forget to reboot). It's possible to gain 30-40% in IOPS depending on your configuration (NVMe type, number of CPU cores and their utilization). There's a quick example of setting the key at the bottom of this post.

b) Split your card into many logical volumes. Microsoft keeps one I/O queue per volume, so with more volumes you get more queues, and they are served by the CPU cores more effectively (there's a diskpart sketch at the bottom of this post as well).

2) StarWind is also very old code, so we have issues delivering high IOPS from one namespace. Increasing the number of namespaces helps (just like with Microsoft in 1b), so create more LUNs with StarWind (we'll hide them behind vVols for VMware, so that's not an issue; Hyper-V will have SMB3 and VHDX files).

Combine 1a with 2, or 1b with 2 (but not both 1a and 1b!), and you'll get 100-250K IOPS from a single properly mapped StarWind-managed LUN. StarWind staff should help you with it ;)
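For reference, a minimal sketch of applying the 1a key with reg.exe from an elevated prompt (double-check the path against your own build, and reboot afterwards):

reg add "HKLM\SYSTEM\CurrentControlSet\Control\StorVSP\IOBalance" /v Enabled /t REG_DWORD /d 0 /f
reg query "HKLM\SYSTEM\CurrentControlSet\Control\StorVSP\IOBalance" /v Enabled

And a sketch of the 1b split using diskpart, assuming the NVMe card shows up as disk 2 and you want volumes of roughly 200 GB each (the disk number, size, label, and drive letter are placeholders):

diskpart
select disk 2
create partition primary size=204800
format fs=ntfs quick label=NVME1
assign letter=T
(repeat the create/format/assign steps for the remaining volumes)

When benchmarking the split layout, note that diskspd accepts several targets in a single run, e.g. diskspd.exe -b4K -r -o32 -t1 -d60 -Sh T:\io1.dat U:\io2.dat, which makes it easy to see whether the extra queues actually add up.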
dwright1542 wrote:That's a good link, thanks. I'm relying right now on the SW guys to nail this down. I feel like that may just be squeezing the final IOPS out of the system. I'm only getting 20% of the native IOPS, so there must be something HUGE bottlenecking the system.

-D

Re: ESXi SSD / NVME actual IOPS numbers

Posted: Fri Jun 17, 2016 7:12 pm
by dwright1542
anton (staff) wrote:Here's the whole story:

1) The Microsoft storage stack is very old code; it was written and designed when the underlying storage was slow. As a result, Windows isn't going to deliver all possible IOPS from a single very fast NVMe card. There are a couple of workarounds:

a) Disable the I/O scheduler. There will be fewer context switches, and I/O will be synchronous instead of async.

HKLM\SYSTEM\CurrentControlSet\Control\StorVSP\IOBalance\Enabled:0

Measure performance on the NVMe-mapped disk BEFORE and AFTER applying this key (don't forget to reboot). It's possible to gain 30-40% in IOPS depending on your configuration (NVMe type, number of CPU cores and their utilization).

b) Split your card into many logical volumes. Microsoft keeps one I/O queue per volume, so with more volumes you get more queues, and they are served by the CPU cores more effectively.

2) StarWind is also very old code, so we have issues delivering high IOPS from one namespace. Increasing the number of namespaces helps (just like with Microsoft in 1b), so create more LUNs with StarWind (we'll hide them behind vVols for VMware, so that's not an issue; Hyper-V will have SMB3 and VHDX files).

Combine 1a with 2, or 1b with 2 (but not both 1a and 1b!), and you'll get 100-250K IOPS from a single properly mapped StarWind-managed LUN. StarWind staff should help you with it ;)
This is affecting the native SW RAM disk as well. Is that the same issue? I'll pass this info along to Oles, who has been assisting.

Re: ESXi SSD / NVME actual IOPS numbers

Posted: Fri Jun 24, 2016 9:01 am
by Dmitry (staff)
Thank you, Darren. We'll keep the community updated.

Re: ESXi SSD / NVME actual IOPS numbers

Posted: Tue Jul 26, 2016 4:35 pm
by Trevbot
Any update on this?

Re: ESXi SSD / NVME actual IOPS numbers

Posted: Wed Jul 27, 2016 9:53 am
by Oles (staff)
Hello Guys,

Unfortunately, we do not have an update for this, but we will continue working until we get to the root cause of the issue.

Thank you for your understanding and patience.

Re: ESXi SSD / NVME actual IOPS numbers

Posted: Wed Aug 03, 2016 4:07 am
by dwright1542
I've actually gotten some decent numbers by shutting all the caching off, using a minimal amount of RAM for SW, and installing PernixData on the front side. This really is a far cry from the original intent of this software.

Hoping this gets fixed soon. Drive speeds have gone nuts in the last year.

Re: ESXi SSD / NVME actual IOPS numbers

Posted: Tue Aug 09, 2016 2:29 pm
by Al (staff)
Hello gentlemen,

We are working hard on a resolution and will update the community as soon as possible.

The speed of drives is growing exponentially! :D

Re: ESXi SSD / NVME actual IOPS numbers

Posted: Thu Oct 13, 2016 5:53 pm
by kevrags
I'd like to bump this to see if there has been any progress on this.

Thank you,

Kevin

Re: ESXi SSD / NVME actual IOPS numbers

Posted: Mon Oct 24, 2016 3:40 pm
by Al (staff)
Hello gentlemen,

We are doing our best to resolve this and update the community.

Kevrags,

Thanks for jumping in.

Re: ESXi SSD / NVME actual IOPS numbers

Posted: Mon Oct 24, 2016 7:50 pm
by anton (staff)
Yes, major progress with v8 R5, which we'll push out in the next couple of weeks.
kevrags wrote:I'd like to bump this to see if there has been any progress on this.

Thank you,

Kevin