ESXi SSD / NVMe actual IOPS numbers

Software-based VM-centric and flash-friendly VM storage + free version

Moderators: anton (staff), art (staff), Max (staff), Anatoly (staff)

dwright1542
Posts: 19
Joined: Thu May 07, 2015 1:58 am

Thu May 26, 2016 1:00 am

I've been working with the SW guys for the better part of six months trying to get some PCIe flash cards / SSDs running properly in a hyperconverged environment.

I've tried many different combos, and even set up a physical SW server with some LSI Nytro drives.

Basically, in a hyperconverged environment on Fusion-io ioDrive2 cards, we can only get 20k IOPS at 4k random read, QD32.

In the physical environment, with cards that can locally do 330k IOPS, I can only get 60k across the network.

Tried Dell C2100s and HP DL385 G8s, all with plenty of memory and the performance profile set in the BIOS. All the usual stuff. It doesn't make a difference.

10G network, Intel X520 adapters, jumbo frames.
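For what it's worth, a quick Python back-of-the-envelope check (my own numbers, assuming 4 KiB per I/O) says the 10G link itself shouldn't be the ceiling at the rates we're seeing: 20k IOPS is only ~82 MB/s and 60k is ~246 MB/s, both far below 10 GbE line rate, which points at a latency/queueing problem rather than raw bandwidth (the card's full 330k would, admittedly, roughly saturate a single 10 GbE link).

# Rough sanity check: convert the observed IOPS figures to bandwidth
# and compare against an approximate 10 GbE payload rate. IOPS numbers
# are from this post; 4 KiB matches the QD32 random-read test.
BLOCK = 4 * 1024                      # bytes per I/O
LINE_RATE_MB_S = 10_000 / 8 * 0.95    # ~10 GbE in MB/s, minus rough framing overhead

for label, iops in [("hyperconverged (observed)", 20_000),
                    ("physical SW target (observed)", 60_000),
                    ("ioDrive2 local spec", 330_000)]:
    mb_s = iops * BLOCK / 1e6
    print(f"{label:30s}: {iops:>7,} IOPS ~ {mb_s:7.1f} MB/s "
          f"({mb_s / LINE_RATE_MB_S:.0%} of 10 GbE)")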

The SW guys are stumped, and VMware support is a joke these days (send them logs and they MAY get back to you, or maybe not). We're kind of at a loss.

Does anyone have real-world data you can share that shows more than 20k IOPS on ESXi in a hyperconverged environment?
nohope
Posts: 18
Joined: Tue Sep 29, 2015 8:26 am

Thu May 26, 2016 4:36 pm

May I ask how you benchmarked the physical storage? As far as I know, it's a bit tricky to measure reads and writes directly against an SSD; the drive's built-in cache gets in the way and can't really be bypassed.
dwright1542
Posts: 19
Joined: Thu May 07, 2015 1:58 am

Fri May 27, 2016 1:53 pm

nohope wrote:May I ask how you benchmarked the physical storage? As far as I know, it's a bit tricky to measure reads and writes directly against an SSD; the drive's built-in cache gets in the way and can't really be bypassed.
Sure. We run Iometer from the SW VM to get a baseline. Then we share out that SSD and run the same test from a guest on the same machine.

(OK, take the SSD out of the equation: it's the same issue with a SW RAM disk.)
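In case anyone wants to eyeball the same access pattern without Iometer, below is a rough Python sketch of a 4 KiB random-read test with 32 workers (roughly QD32: one outstanding I/O per worker). It's only an illustration of the profile, not our actual Iometer config; the target path is a placeholder, and unless you point it at a raw device or otherwise bypass the OS cache, the numbers will be inflated by caching, which is exactly the caveat nohope raised. Python overhead also caps how high it can count, so treat Iometer/fio/diskspd as the real measurement tools.

import os, random, threading, time

TARGET  = r"D:\bench\testfile.bin"   # placeholder: a large pre-created file (or raw device) on the disk under test
BLOCK   = 4 * 1024                   # 4 KiB per read, matching the Iometer profile
SPAN    = 8 * 1024**3                # size of the region to read from (8 GiB here)
WORKERS = 32                         # 32 workers with one outstanding I/O each ~ QD32
RUNTIME = 30                         # seconds

counters = [0] * WORKERS
stop = threading.Event()

def worker(idx):
    # Each worker opens its own handle and issues synchronous 4 KiB reads
    # at random, block-aligned offsets until the timer expires.
    fd = os.open(TARGET, os.O_RDONLY | getattr(os, "O_BINARY", 0))
    try:
        while not stop.is_set():
            os.lseek(fd, random.randrange(SPAN // BLOCK) * BLOCK, os.SEEK_SET)
            os.read(fd, BLOCK)
            counters[idx] += 1
    finally:
        os.close(fd)

threads = [threading.Thread(target=worker, args=(i,)) for i in range(WORKERS)]
start = time.time()
for t in threads:
    t.start()
time.sleep(RUNTIME)
stop.set()
for t in threads:
    t.join()

elapsed = time.time() - start
print(f"~{sum(counters) / elapsed:,.0f} IOPS (4 KiB random read, {WORKERS} workers)")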
hste
Posts: 17
Joined: Wed Mar 05, 2014 9:42 pm

Fri May 27, 2016 2:52 pm

Have you looked into the low-latency settings?
http://www.vmware.com/files/pdf/techpap ... here55.pdf

hste
dwright1542
Posts: 19
Joined: Thu May 07, 2015 1:58 am

Sat May 28, 2016 4:43 am

That's a good link, thanks. I'm relying on the SW guys to nail this down right now; I feel like that may just be squeezing out the last few IOPS. I'm only getting 20% of the native IOPS, so something HUGE must be bottlenecking the system.

-D
anton (staff)
Site Admin
Posts: 4010
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands

Mon May 30, 2016 11:57 am

Here's the whole story:

1) The Microsoft storage stack is very old code; it was written and designed when the underlying storage was slow. As a result, Microsoft isn't going to deliver all the possible IOPS from a single very fast NVMe card. There are a couple of workarounds:

a) Disable the I/O scheduler. There will be fewer context switches, and I/O will be handled synchronously instead of asynchronously (a scripted version of this change is sketched below):

HKLM\SYSTEM\CurrentControlSet\Control\StorVSP\IOBalance\Enabled:0

Measure performance on the NVMe-mapped disk BEFORE and AFTER applying this key (and don't forget to reboot). Depending on your configuration (NVMe type, number of CPU cores and their utilization), it's possible to gain 30-40% in IOPS.

b) Split your card into many logical volumes. Microsoft keeps one I/O queue per volume, so with more volumes you get more queues, and the CPU cores will service them more effectively.

2) StarWind is also very old code, so we have issues delivering high IOPS from a single namespace. Increasing the number of namespaces helps (just like 1b on the Microsoft side), so create more LUNs with StarWind (we'll hide them behind vVols for VMware, so that's not an issue; Hyper-V will get SMB3 and VHDX files).

Combine 1a and 2, or 1b and 2 (but not both 1a and 1b!), and you'll get 100-250K IOPS from a single properly mapped StarWind-managed LUN. StarWind staff should help you with it ;)
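If you'd rather script the change from 1a than edit the registry by hand, here's a minimal Python sketch (an illustration only, not an official StarWind tool): run it elevated on the Windows host, reboot, and compare your before/after numbers.

# Sketch only: applies the registry value from workaround 1a above.
# Run elevated on the Windows host, then reboot and re-run the benchmark.
import winreg

KEY_PATH = r"SYSTEM\CurrentControlSet\Control\StorVSP\IOBalance"

with winreg.CreateKeyEx(winreg.HKEY_LOCAL_MACHINE, KEY_PATH, 0,
                        winreg.KEY_READ | winreg.KEY_WRITE) as key:
    # Enabled = 0 disables the I/O balancer, as described above
    winreg.SetValueEx(key, "Enabled", 0, winreg.REG_DWORD, 0)
    value, _ = winreg.QueryValueEx(key, "Enabled")
    print(f"IOBalance\\Enabled is now {value} (reboot required to take effect)")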
dwright1542 wrote:That's a good link, thanks. I'm relying on the SW guys to nail this down right now; I feel like that may just be squeezing out the last few IOPS. I'm only getting 20% of the native IOPS, so something HUGE must be bottlenecking the system.

-D
Regards,
Anton Kolomyeytsev

Chief Technology Officer & Chief Architect, StarWind Software

dwright1542
Posts: 19
Joined: Thu May 07, 2015 1:58 am

Fri Jun 17, 2016 7:12 pm

anton (staff) wrote:Here's the whole story:

1) The Microsoft storage stack is very old code; it was written and designed when the underlying storage was slow. As a result, Microsoft isn't going to deliver all the possible IOPS from a single very fast NVMe card. There are a couple of workarounds:

a) Disable the I/O scheduler. There will be fewer context switches, and I/O will be handled synchronously instead of asynchronously:

HKLM\SYSTEM\CurrentControlSet\Control\StorVSP\IOBalance\Enabled:0

Measure performance on the NVMe-mapped disk BEFORE and AFTER applying this key (and don't forget to reboot). Depending on your configuration (NVMe type, number of CPU cores and their utilization), it's possible to gain 30-40% in IOPS.

b) Split your card into many logical volumes. Microsoft keeps one I/O queue per volume, so with more volumes you get more queues, and the CPU cores will service them more effectively.

2) StarWind is also very old code, so we have issues delivering high IOPS from a single namespace. Increasing the number of namespaces helps (just like 1b on the Microsoft side), so create more LUNs with StarWind (we'll hide them behind vVols for VMware, so that's not an issue; Hyper-V will get SMB3 and VHDX files).

Combine 1a and 2, or 1b and 2 (but not both 1a and 1b!), and you'll get 100-250K IOPS from a single properly mapped StarWind-managed LUN. StarWind staff should help you with it ;)
This is affecting the native SW RAM disk as well. Is that the same issue? I'll pass this info along to Oles, who's been assisting.
Dmitry (staff)
Staff
Posts: 82
Joined: Fri Mar 18, 2016 11:46 am

Fri Jun 24, 2016 9:01 am

Thank you, Darren, we'll keep the community updated.
Trevbot
Posts: 12
Joined: Sun Mar 08, 2015 8:59 pm

Tue Jul 26, 2016 4:35 pm

Any update on this?
Oles (staff)
Staff
Posts: 91
Joined: Fri Mar 20, 2015 10:58 am

Wed Jul 27, 2016 9:53 am

Hello Guys,

Unfortunately, we do not have an update for this, but we will continue working until we get to the root cause of the issue.

Thank you for your understanding and patience.
dwright1542
Posts: 19
Joined: Thu May 07, 2015 1:58 am

Wed Aug 03, 2016 4:07 am

I've actually gotten some decent numbers by shutting all the caching off, using a minimal amount of RAM for SW, and putting PernixData in front. That really is a far cry from the original intent of this software.

Hoping this gets fixed soon. Drive speeds have gone nuts in the last year.
Al (staff)
Staff
Posts: 43
Joined: Tue Jul 26, 2016 2:26 pm

Tue Aug 09, 2016 2:29 pm

Hello gentlemen,

We are working hard on a resolution and will update the community as soon as possible.

The speed of drives is growing exponentially! :D
kevrags
Posts: 12
Joined: Wed Dec 26, 2012 8:11 pm

Thu Oct 13, 2016 5:53 pm

I'd like to bump this to see if there has been any progress on this.

Thank you,

Kevin
Al (staff)
Staff
Posts: 43
Joined: Tue Jul 26, 2016 2:26 pm

Mon Oct 24, 2016 3:40 pm

Hello gentlemen,

We are doing our best to resolve this and update the community.

Kevrags,

Thanks for jumping in.
anton (staff)
Site Admin
Posts: 4010
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands

Mon Oct 24, 2016 7:50 pm

Yes, major progress with v8r5, which we'll push out in the next couple of weeks.
kevrags wrote:I'd like to bump this to see if there has been any progress on this.

Thank you,

Kevin
Regards,
Anton Kolomyeytsev

Chief Technology Officer & Chief Architect, StarWind Software
