Hyper-V issue

Software-based VM-centric and flash-friendly VM storage + free version

Moderators: anton (staff), art (staff), Max (staff), Anatoly (staff)

batiati
Posts: 29
Joined: Mon Apr 08, 2019 1:16 pm
Location: Brazil
Contact:

Mon Aug 26, 2019 7:21 pm

Folks,
I need help with a problem not directly related to StarWind VSAN.

I have a fresh install of a converged Hyper-V environment where I can access the iSCSI volumes from the host with good performance, higher than 160k / 90k IOPS (measured with diskspd.exe, read and write with 4K sequential blocks).

However, the VMs have unbelievably slow disk performance, not reaching 1,000 IOPS in the same operations. I'm sure it's not StarWind's fault; the same happens for VMs on local disks on both hosts. I'm using fixed-size VHDX, formatted with NTFS, 4K blocks, on top of the CSV formatted the same way.
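For reference, a diskspd.exe command along these lines reproduces that kind of 4K sequential test; the file path, test file size, duration and thread/queue values below are only illustrative placeholders, not necessarily the exact parameters used for these measurements:

Code:

# 100% sequential 4K reads: 4 threads, 32 outstanding I/Os per thread,
# OS/hardware caching disabled (-Sh), latency statistics enabled (-L)
diskspd.exe -c10G -b4K -d60 -t4 -o32 -w0 -Sh -L D:\iotest.dat

# same test with 100% writes
diskspd.exe -c10G -b4K -d60 -t4 -o32 -w100 -Sh -L D:\iotest.dat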

I read somewhere about advanced IOBalance settings, but it had no effect for me.
CurrentControlSet\Control\StorVSP\IOBalance\Enabled

I wonder if anyone has experienced a similar problem. I have some experience with Hyper-V, but I have never seen this before (I confess I'm quite new to storage and clustering environments).

I'm thinking about formatting everything and starting over again, but I'd like to understand and solve this issue.
Oleg(staff)
Staff
Posts: 568
Joined: Fri Nov 24, 2017 7:52 am

Tue Aug 27, 2019 1:24 pm

Hi batiati,
Could you please clarify what network cards you are using, and from what vendor?
batiati
Posts: 29
Joined: Mon Apr 08, 2019 1:16 pm
Location: Brazil
Contact:

Tue Aug 27, 2019 5:09 pm

Hi Oleg,
Oleg(staff) wrote: Could you please clarify what network cards you are using, and from what vendor?
It's QLogic (or Broadcom) 10Gb SFP+, with iSCSI and Sync on separate NICs.

But it doesn't seem to be a network problem; I have tested with iperf and diskspd.exe between the Hyper-V host and the StarWind node, both with great performance.

- On StarWind I'm using RAID 10, NTFS formatted with 4K blocks, and flat images without cache.
- On Hyper-V I'm using MPIO for the iSCSI targets and NTFS formatted with 4K blocks.
- On the guest VM I'm using fixed-size VHDX disks, NTFS formatted with 4K blocks.
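For anyone who wants to double-check the same layering, here is a quick verification sketch from an elevated PowerShell prompt (the drive letter is just an example):

Code:

# NTFS cluster size on a given volume ("Bytes Per Cluster" should show 4096)
fsutil fsinfo ntfsinfo D:

# default MPIO load-balance policy on the Hyper-V host
Get-MSDSMGlobalDefaultLoadBalancePolicy

# iSCSI sessions and their state
Get-IscsiSession | Select-Object TargetNodeAddress, IsConnected, IsPersistent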

The problem is just inside the user's VM, where the 4k operations are too slow:

Code:

------------------------------------------------------------
                    |    100% Writes   |     100% Reads    |
4K                  |    IOPS|     MBps|     IOPS|     MBps|
--------------------|--------|---------|---------|---------|
StarWind disk       |  92,915|   362.95|  140,015|   546.93|
Hyper-V host iSCSI  |  59,963|   234.23|  169,514|   662.17|
Guest VM vhdx fixed |  20,008|    78.16|   25,936|   101.31|
------------------------------------------------------------
As we can see, from StarWind to Hyper-V we got 35% fewer IOPS for writes (due to the SYNC) and a 21% IOPS increase for reads (due to MPIO), not bad at all :D !
But inside the VM we got a huge penalty: 66% fewer IOPS for writes and 84% fewer IOPS for reads!
There is no QoS enabled for this VM, and enabling it had no effect either.

Thanks a lot!
Oleg(staff)
Staff
Posts: 568
Joined: Fri Nov 24, 2017 7:52 am

Wed Aug 28, 2019 8:37 am

Thank you for the details.
Performance inside VMs can be related to the NIC settings on the management network. VMQ should be disabled at all levels, inside the VMs as well. Also, EEE control policies should be changed to High Performance. These changes can be done in the advanced network card settings. In the case of NIC teaming for management, they should be applied to all NICs in the team.
Please also check the power settings of the servers; they should be set to High Performance.
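A rough PowerShell sketch of those changes (the adapter and VM names are placeholders; the exact EEE property name depends on the NIC driver, so it has to be looked up in the advanced properties first):

Code:

# disable VMQ on the physical management NICs (adapter names are examples)
Disable-NetAdapterVmq -Name "Management1", "Management2"

# disable VMQ on the virtual NIC of a VM (VM name is an example)
Set-VMNetworkAdapter -VMName "TestVM" -VmqWeight 0

# EEE / energy-saving settings are driver specific; list the advanced
# properties to find the matching keyword before changing it
Get-NetAdapterAdvancedProperty -Name "Management1" |
    Format-Table DisplayName, RegistryKeyword, DisplayValue

# set the host power plan to High Performance (well-known scheme GUID)
powercfg /setactive 8c5e7fda-e8bf-4a96-9a85-a6e23a8c635c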
batiati
Posts: 29
Joined: Mon Apr 08, 2019 1:16 pm
Location: Brazil
Contact:

Tue Sep 03, 2019 8:59 pm

Hi,
So, almost solved this issue

I don't know why, but all VMs cloned from that specific .vhdx showed the same performance issue.
I created a new .vhdx file and installed a fresh Windows on it, and now it's much better:

Code:

------------------------------------------------------------
                    |    100% Writes   |     100% Reads    |
4K                  |    IOPS|     MBps|     IOPS|     MBps|
--------------------|--------|---------|---------|---------|
StarWind disk       |  92,915|   362.95|  140,015|   546.93|
Hyper-V host iSCSI  |  59,963|   234.23|  169,514|   662.17|
Guest VM vhdx fixed |  52,094|   203.49|   61,297|   239.44|
------------------------------------------------------------
The write IOPS are great, but the read IOPS are very disappointing...

I ran some tests:

- I tried to run diskspd.exe on 2 VMs at the same time, but the sum of IOPS from both stayed at ~60k, no improvement.

- I tried to use a passthrough disk on the VM, with the same result, no improvement;

- I tried to directly attach a StarWind iSCSI disk inside the VM and boom! 150k read IOPS, just like the Hyper-V host!
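For reference, the in-guest iSCSI connection can be reproduced roughly like this from inside the VM (the portal address and target IQN below are placeholders, not the real ones):

Code:

# make sure the iSCSI initiator service is running inside the guest
Set-Service MSiSCSI -StartupType Automatic
Start-Service MSiSCSI

# register the StarWind portal and connect the target persistently
New-IscsiTargetPortal -TargetPortalAddress "172.16.10.1"
Connect-IscsiTarget -NodeAddress "iqn.2008-08.com.starwindsoftware:sw1-testlun" -IsPersistent $true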

Are there any tweaks on Hyper-V to improve the performance? Specifically the 4K read performance, because the 64K tests are great too.
Thanks a lot!
Oleg(staff)
Staff
Posts: 568
Joined: Fri Nov 24, 2017 7:52 am

Wed Sep 04, 2019 7:38 am

Hi batiati,
Please try with a fixed .vhdx. Please also make sure that you are using a SCSI controller for the .vhdx connection.
Did you try the proposed tweaks?
batiati
Posts: 29
Joined: Mon Apr 08, 2019 1:16 pm
Location: Brazil
Contact:

Wed Sep 04, 2019 11:44 am

Oleg(staff) wrote: Hi batiati,
Please try with a fixed .vhdx. Please also make sure that you are using a SCSI controller for the .vhdx connection.
Did you try the proposed tweaks?
Yes, the VHDX is fixed size, and the VM is Gen2 using a SCSI controller.
I followed all the best practices in StarWind's KB (the iSCSI disks work great on the Hyper-V host), and I did some power config tweaks: BIOS set to maximum performance, processor C-states disabled, power plan on host and VM set to High Performance, VMQ disabled on the NIC and the VM. I tried on Hyper-V 2016 and 2012 R2.

I'll try to connect another iSCSI disk to a VMware host and run some tests with a VM on ESXi; maybe it is not a Hyper-V issue.
Oleg(staff)
Staff
Posts: 568
Joined: Fri Nov 24, 2017 7:52 am

Thu Sep 05, 2019 10:14 am

batiati wrote: I'll try to connect another iSCSI disk to a VMware host and run some tests with a VM on ESXi; maybe it is not a Hyper-V issue.
Please share your findings.
batiati
Posts: 29
Joined: Mon Apr 08, 2019 1:16 pm
Location: Brazil
Contact:

Fri Sep 06, 2019 2:37 am

Comparison between Hyper-V and VMware ESXi

Previous Hyper-V

Code:

------------------------------------------------------------
                    |    100% Writes   |     100% Reads    |
4K                  |    IOPS|     MBps|     IOPS|     MBps|
--------------------|--------|---------|---------|---------|
StarWind disk       |  92,915|   362.95|  140,015|   546.93|
Hyper-V host iSCSI  |  59,963|   234.23|  169,514|   662.17|
Guest VM vhdx fixed |  52,094|   203.49|   61,297|   239.44|
------------------------------------------------------------
Now on VMware (using the same StarWind installation on Windows hosts):

Code:

------------------------------------------------------------
                    |    100% Writes   |     100% Reads    |
4K                  |    IOPS|     MBps|     IOPS|     MBps|
--------------------|--------|---------|---------|---------|
VMWare VM zeroed    |  48,480|   189.38|   97,873|   383.32|
------------------------------------------------------------
It was a quick VMware setup, no tuning; I just connected the iSCSI datastore and configured it with Round Robin.
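For completeness, the Round Robin part can be scripted with PowerCLI roughly like this; the host name and the vendor filter are assumptions on my side, not details taken from this setup:

Code:

# connect to vCenter/ESXi (PowerCLI module required; server name is an example)
Connect-VIServer -Server esxi01.lab.local

# switch the path selection policy to Round Robin on the StarWind-backed LUNs
Get-VMHost | Get-ScsiLun -LunType disk |
    Where-Object { $_.Vendor -eq "STARWIND" } |
    Set-ScsiLun -MultipathPolicy RoundRobin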


As we can see, the writes are slightly lower on VMware than on Hyper-V (maybe due to some TCP offload capabilities enabled on Windows).
Reads are far better on VMware, almost 100k IOPS. It proves that something is very wrong with my Hyper-V, but I don't know what to think anymore!
batiati
Posts: 29
Joined: Mon Apr 08, 2019 1:16 pm
Location: Brazil
Contact:

Fri Sep 06, 2019 5:18 pm

I spent some time reading about the Hyper-V IO balancer. I had already seen something about configuring the IOBalance registry keys on Hyper-V, but there were many variations from post to post and no updated official documentation from Microsoft ... I tried some of them, but with no effect, and in the end I thought it was some kind of deprecated feature, applicable only to Hyper-V 2008.

Finally I found a very clear presentation with useful tweaks, among them the IOBalance issue.
http://www.defense-ops.com/wp-content/u ... tices.pptx

So, here is the right registry key, changed and tested on 2012 R2, then removed and tested again! I will try it again on Windows Server 2019 soon.

Code:

HKLM\System\CurrentControlSet\Control\StorVsp\IOBalance\Enabled
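The post doesn't spell out the value that was written; the commonly cited tweak is to create the IOBalance key and set Enabled to 0 (REG_DWORD) to turn the balancer off, roughly like this from an elevated prompt, followed by a host reboot:

Code:

# create/overwrite the value; 0 disables the Hyper-V storage IO balancer
# (commonly cited value, not confirmed in this thread)
reg add "HKLM\System\CurrentControlSet\Control\StorVsp\IOBalance" /v Enabled /t REG_DWORD /d 0 /f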
After registry changes and host reboot:

Code:

------------------------------------------------------------
                    |    100% Writes   |     100% Reads    |
4K                  |    IOPS|     MBps|     IOPS|     MBps|
--------------------|--------|---------|---------|---------|
StarWind disk       |  92,915|   362.95|  140,015|   546.93|
Hyper-V host iSCSI  |  59,963|   234.23|  169,514|   662.17|
Guest VM vhdx fixed |  57,981|   226.49|  138,365|   540.29|
------------------------------------------------------------
Now I'm happy with the guest VM performance!
Oleg(staff)
Staff
Posts: 568
Joined: Fri Nov 24, 2017 7:52 am

Mon Sep 09, 2019 2:05 pm

Thank you for sharing your results and findings.
batiati
Posts: 29
Joined: Mon Apr 08, 2019 1:16 pm
Location: Brazil
Contact:

Mon Oct 14, 2019 11:11 pm

Learning from my own mistakes ...
batiati wrote: So, almost solved this issue

I don't know why, but all VMs cloned from that specific .vhdx showed the same performance issue.
I created a new .vhdx file and installed a fresh Windows on it, and now it's much better:
I had the same issue again, and I finally realized what I was doing wrong: I had enabled NTFS Dedup on the CSV, and that thing is a big IOPS killer. That is why I achieved fine performance on a new VHDX: the dedup job hadn't run yet. When I turned dedup off and ran the unoptimization job, all VHDX files performed as expected.
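A sketch of turning dedup off and rehydrating the data in PowerShell (the CSV path is an example):

Code:

# stop further optimization on the CSV volume
Disable-DedupVolume -Volume "C:\ClusterStorage\Volume1"

# rehydrate the already-optimized files back to normal NTFS files
Start-DedupJob -Volume "C:\ClusterStorage\Volume1" -Type Unoptimization

# monitor progress until the volume reports no optimized files left
Get-DedupJob
Get-DedupStatus -Volume "C:\ClusterStorage\Volume1"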
batiati wrote: So, here is the right registry key, changed and tested on 2012 R2, then removed and tested again! I will try it again on Windows Server 2019 soon.

Code:

HKLM\System\CurrentControlSet\Control\StorVsp\IOBalance\Enabled
No, this registry trick does not work on either WS2016 or WS2019, only on 2012 R2; on both, I had the same issue with capped 4K reads.

Regards!
Boris (staff)
Staff
Posts: 805
Joined: Fri Jul 28, 2017 8:18 am

Tue Oct 15, 2019 7:33 am

batiati,

If I am not mistaken, dedup is configured to run once every 24 hours, usually during non-production hours, precisely because of its IOPS-hungry nature. Basically, your findings are in line with this.
It's great that you were finally able to figure this out. And thanks for sharing it with the community, too.
batiati
Posts: 29
Joined: Mon Apr 08, 2019 1:16 pm
Location: Brazil
Contact:

Tue Oct 15, 2019 12:46 pm

Boris (staff) wrote: If I am not mistaken, dedup is configured to run once every 24 hours, usually during non-production hours, precisely because of its IOPS-hungry nature
Yep, but I meant the performance penalty on an already deduplicated volume, not during the dedup job execution.
A VM running on a .vhdx placed on a deduplicated volume shows very high disk latency during the diskspd test.

From https://arstechnica.com/civis/viewtopic ... &t=1294043:

Is Hyper-V in general supported with a Deduplicated volume?

We spent a lot of time to ensure that Data Deduplication performs correctly on general virtualization workloads. However, we focused our efforts to ensure that the performance of optimized files is adequate for VDI scenarios. For non-VDI scenarios (general Hyper-V VMs), we cannot provide the same performance guarantees.

As a result, we do not support deduplication of arbitrary in use VHDs in Windows Server 2012 R2. However, since Data Deduplication is a core part of the storage stack, there is no explicit block in place that prevents it from being enabled on arbitrary workloads.
Boris (staff)
Staff
Posts: 805
Joined: Fri Jul 28, 2017 8:18 am

Tue Oct 15, 2019 3:32 pm

Ah, I see your point now. It does make sense.