Performance troubleshooting assistance for 2-node Storage

Software-based VM-centric and flash-friendly VM storage + free version

Moderators: anton (staff), art (staff), Max (staff), Anatoly (staff)

bkdilse
Posts: 52
Joined: Mon Sep 10, 2018 6:14 am

Wed Nov 07, 2018 10:24 am

Hi,

I've been having various issues, and support has been helpful in trying to resolve them. I still have the main issue of performance, and have not been able to get to the bottom of it. This is for a setup with separate Compute and Storage nodes.

Both Storage Nodes have identical hardware:
Xeon Quad Core E3-1225 V3
32GB RAM
10GB RAID10 (6TB WD Red Drives)
4 x 2GB HA Devices with 2GB L1 Write-Back cache.
2 x 1Gb NICs for Sync Channel (cross-over cables)
4 x 1Gb NICs for iSCSI Channel (connected to Gigabit switch)
Jumbo Frames enabled on Sync and iSCSI NICs

Compute Nodes have 2 flavours, ESXi and Hyper-V Server 2016:
4 x 1Gb NICs for iSCSI (on each environment)
Jumbo Frames enabled on the iSCSI NICs
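
(Incidentally, jumbo frames can be sanity-checked end-to-end with a don't-fragment ping; the target IP below is just an example.)

# Windows: 8972 bytes of payload + 28 bytes of ICMP/IP headers = one 9000-byte frame
ping -f -l 8972 10.10.1.2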

I've run diskspd with various options and see varying results on guest VMs (ESXi and Hyper-V), on the Hyper-V hosts, and on the VSAN nodes. Reads from the HA device are quick, but writes are very slow.

A 5GB ISO file copy from VSAN1 > VSAN2 is very fast.
A 5GB ISO file copy from Hyper-V Host > CSV is very fast.
A 5GB ISO file copy from VMware/Hyper-V Guest > CSV is very slow.

Do you have any guidelines/tweaks I could try?
Do you have recommended options I should use with diskspd?

I raised a support case yesterday, but Artem Gaevoy closed it, suggesting I log it here (ref: 79716).
artem (staff)
Staff
Posts: 8
Joined: Thu Nov 08, 2018 11:50 am

Thu Nov 08, 2018 12:49 pm

Hello bkdilse,

Could you please provide me with the results of your storage testing?
Also, as far as I understand, you see poor performance when copying from a VM to the CSV? Try disabling VMQ inside the VM and on the nodes.
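For example, something like this (adapter and VM names are placeholders):

# On the Hyper-V host: disable VMQ on the physical NICs backing the vSwitch
Disable-NetAdapterVmq -Name "Ethernet 1"
# Zero the VMQ weight on the VM's virtual NIC
Set-VMNetworkAdapter -VMName "TestVM" -VmqWeight 0
# Inside the guest: disable VMQ on the vNIC as well
Disable-NetAdapterVmq -Name "Ethernet"
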
Also, can you provide me with your node specification? Do you use teaming or a switch between the nodes, or is it a direct connection? Can you share your network diagram?

Since you are running an NFR licence, free products are supported only via the StarWind online community forum. Please check this article:
https://www.starwindsoftware.com/support
bkdilse
Posts: 52
Joined: Mon Sep 10, 2018 6:14 am

Thu Nov 08, 2018 1:09 pm

Hi,

I scrapped my testing results, as they varied so much on each run. I can run these again; do you have any recommended parameters I should use with diskspd?

The problem also exists with VMware Guests, which do not use VMQ, so I did not think this was the cause.

I've already provided the VSAN node specs: no teaming, 1Gb NICs, 1Gb switch, cross-over for Sync. If you mean the Compute nodes... ESXi is using Round Robin with 4 storage NICs on different subnets, and Hyper-V is using a converged network (physical team with vSwitch) with 4 iSCSI vNICs on different subnets.
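
For reference, the Round Robin policy can be confirmed from PowerCLI roughly like this (host name and device name are placeholders; run after Connect-VIServer):

# List the multipathing policy of every LUN on the host
Get-VMHost "esxi01" | Get-ScsiLun -LunType disk |
    Select-Object CanonicalName, MultipathPolicy
# Switch a LUN to Round Robin if it isn't already
Get-VMHost "esxi01" | Get-ScsiLun -CanonicalName "eui.1234567890abcdef" |
    Set-ScsiLun -MultipathPolicy RoundRobin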

Please see ticket ref: 78331. I don't have a network diagram, as it's only a dev environment, but that ticket contains all the IPs and the layout.

As mentioned, please let me know the suggested diskspd parameters that would make for a reasonable test.
artem (staff)
Staff
Posts: 8
Joined: Thu Nov 08, 2018 11:50 am

Thu Nov 08, 2018 2:26 pm

No preferences, just standard diskspd. I will check your logs and get back to you as soon as I can.
bkdilse
Posts: 52
Joined: Mon Sep 10, 2018 6:14 am

Thu Nov 08, 2018 6:38 pm

Thanks, I'll run these as soon as I get a chance.
Boris (staff)
Staff
Posts: 805
Joined: Fri Jul 28, 2017 8:18 am

Fri Nov 09, 2018 7:04 pm

Keep us updated with your results.
bkdilse
Posts: 52
Joined: Mon Sep 10, 2018 6:14 am

Sun Nov 11, 2018 9:36 am

OK, here are a few tests from a VM on Hyper-V and one on VMware. I had to use some switches that I found others using.
I know these are sequential tests, but you can see that writes are much slower than reads.
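
The switches were roughly along these lines (illustrative; the exact runs are in the attachment, and the test file path is an example):

# 60-second sequential read: 512KB blocks, 2 threads, 8 outstanding I/Os, caching disabled
diskspd.exe -b512K -d60 -t2 -o8 -si -w0 -Sh -c10G C:\ClusterStorage\Volume1\test.dat
# Same run as a 100% sequential write
diskspd.exe -b512K -d60 -t2 -o8 -si -w100 -Sh -c10G C:\ClusterStorage\Volume1\test.dat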

Please check these results and advise what I can try next.
Attachments
DiskspdResults.zip
Read and Write tests
(4.89 KiB) Downloaded 332 times
artem (staff)
Staff
Posts: 8
Joined: Thu Nov 08, 2018 11:50 am

Mon Nov 12, 2018 9:30 am

Hi bkdilse,

I checked your logs and didn't find anything specific. Please provide us with a network diagram, since we need to know how your nodes are connected physically.
Also, please double-check that the disk in the VM is fixed (for Hyper-V) and eager zeroed (for ESXi).
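
For example (paths, names, and sizes are placeholders):

# Hyper-V: convert a dynamic VHDX to a fixed one
Convert-VHD -Path "D:\VMs\test.vhdx" -DestinationPath "D:\VMs\test-fixed.vhdx" -VHDType Fixed
# VMware (PowerCLI): add an eager zeroed thick disk to a VM
New-HardDisk -VM (Get-VM "TestVM") -CapacityGB 40 -StorageFormat EagerZeroedThick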
bkdilse
Posts: 52
Joined: Mon Sep 10, 2018 6:14 am

Mon Nov 12, 2018 9:52 am

artem (staff) wrote:Hi bkdilse,

I checked your logs and didn't find anything specific. Please provide us with a network diagram, since we need to know how your nodes are connected physically.
Also, please double-check that the disk in the VM is fixed (for Hyper-V) and eager zeroed (for ESXi).

As mentioned earlier, please see ticket ref: 78331, which details the network layout.

When you say you've "checked the logs", do you mean the ones in the ticket I logged, or the diskspd ones? The diskspd ones clearly show slow writes (unless I'm reading them wrong).

Both VMs are thin provisioned. They have always been thin and have never had issues. Are you suggesting that StarWind-hosted VMs must be thick?

To test this, I am happy to build another 2 thick-provisioned VMs purely for testing. I am not going to convert all my VMs to thick, only to find that this is not the issue.

EDIT: I'll clone the VMware VM, set its disks to thick (this will be quicker), and post the exact same test results.
artem (staff)
Staff
Posts: 8
Joined: Thu Nov 08, 2018 11:50 am

Mon Nov 12, 2018 10:47 am

Yes, I checked the logs you sent us last time. If you are looking for performance, you must use thick provisioned eager zeroed disks; it is a well-known fact. Please test this disk type on your VM to make sure that this is not the issue.
bkdilse
Posts: 52
Joined: Mon Sep 10, 2018 6:14 am

Mon Nov 12, 2018 11:00 am

artem (staff) wrote:Yes, I checked the logs you sent us last time. If you are looking for performance, you must use thick provisioned eager zeroed disks; it is a well-known fact. Please test this disk type on your VM to make sure that this is not the issue.
OK thanks for confirming.

Just cloned the VM and set the disk to thick eager zeroed... and got worse performance (which does not make sense). This is the random performance issue I have been seeing. I can post the results, but basically read is now around 6MB/s and write is 3MB/s.
I'm now powering up the original VM (thin) to re-run the test.
bkdilse
Posts: 52
Joined: Mon Sep 10, 2018 6:14 am

Mon Nov 12, 2018 11:35 am

So, powered up the original (thin) VM.

1st test, with the same command line, I got 210MB/s read.
2nd test, with the same command line, I got 7MB/s read.

The only thing I can think of is the NIC cards, as ESXi is using Round Robin. I will try this again when I get to the physical hardware and unplug all but 1 NIC cable, to work out whether it is the NICs causing this.
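
Alternatively, I could check the per-path state from PowerCLI before pulling cables (host name is a placeholder):

# Show the state of every path behind each iSCSI LUN
Get-VMHost "esxi01" | Get-ScsiLun -LunType disk |
    Get-ScsiLunPath | Select-Object Name, State, Preferred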

Can you think of anything else that may be causing this?
artem (staff)
Staff
Posts: 8
Joined: Thu Nov 08, 2018 11:50 am

Mon Nov 12, 2018 3:52 pm

Yes, sure. Can you provide me with fresh logs? Use the same credentials that Roman sent you to upload them. I want to double-check that everything is OK with the StarWind configuration.
bkdilse
Posts: 52
Joined: Mon Sep 10, 2018 6:14 am

Mon Nov 12, 2018 7:04 pm

artem (staff) wrote:Yes, sure. Can you provide me with fresh logs? Use the same credentials that Roman sent you to upload them. I want to double-check that everything is OK with the StarWind configuration.
Just uploaded.
artem (staff)
Staff
Posts: 8
Joined: Thu Nov 08, 2018 11:50 am

Mon Nov 19, 2018 5:27 pm

I didn't find any misconfiguration. Can you connect the StarWind drive directly inside the VM (via iSCSI) and test performance?
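
For instance (the portal IP is a placeholder):

# Inside the guest: connect to the StarWind target with the Windows iSCSI initiator
Start-Service MSiSCSI
New-IscsiTargetPortal -TargetPortalAddress "172.16.10.10"
Get-IscsiTarget | Connect-IscsiTarget
# The StarWind disk should then show up for testing
Get-Disk | Where-Object BusType -eq "iSCSI"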