Performance troubleshooting assistance for 2 node Storage


bkdilse
Posts: 52
Joined: Mon Sep 10, 2018 6:14 am

Mon Nov 19, 2018 5:56 pm

Thanks for the confirmation, I'll give that a go.
artem (staff)
Staff
Posts: 8
Joined: Thu Nov 08, 2018 11:50 am

Tue Nov 20, 2018 5:12 pm

Please let me know when you have any results.
bkdilse
Posts: 52
Joined: Mon Sep 10, 2018 6:14 am

Wed Nov 21, 2018 4:24 pm

artem (staff) wrote:Didn't find any misconfiguration. Can you connect the StarWind drive directly in the VM (via iSCSI) and test the performance?
So, I gave the VM a 2nd NIC on the same subnet as the VSAN Node, ran the same command line on the 5GB Test Device, and am getting 4.46MB/s READ (very bad).

I will now try a physical machine, so that it bypasses the Compute Nodes completely.
bkdilse
Posts: 52
Joined: Mon Sep 10, 2018 6:14 am

Wed Nov 21, 2018 4:44 pm

Interesting result. So this is now a Physical Server, with an additional NIC, connected on the same subnet as VSAN Node 1.

Source NIC1 > VSANN01 NIC1.
5GB Device (No cache, no HA).
1.68MB/s READ

Would you agree that this is now pointing to an issue on the VSAN rather than the Compute Nodes?

I will leave this setup, for further testing, so please let me know if you have any tests that can help us diagnose this.
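To put the two read figures in perspective, here is a rough sketch that expresses them as a share of the nominal rate of a single link. It assumes the test NICs run at 1 Gbit/s (the thread does not state the speed at this point) and ignores protocol overhead; the function name is made up for illustration.

```python
def percent_of_line_rate(measured_mb_s: float, link_gbit_s: float = 1.0) -> float:
    """Return measured throughput as a percentage of the nominal link rate."""
    # 1 Gbit/s = 1000 Mbit/s = 125 MB/s (decimal units, no protocol overhead).
    line_rate_mb_s = link_gbit_s * 1000 / 8
    return measured_mb_s / line_rate_mb_s * 100


# Read results quoted in the posts above; link speed is an ASSUMPTION (1 Gbit/s).
for label, mb_s in [("VM via iSCSI", 4.46), ("physical server", 1.68)]:
    print(f"{label}: {mb_s} MB/s = {percent_of_line_rate(mb_s):.1f}% of a 1 Gbit/s link")
```

Under that assumption both results sit at a few percent of what one link could carry, which is consistent with the suspicion that the bottleneck is not raw link capacity.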
Michael (staff)
Staff
Posts: 317
Joined: Thu Jul 21, 2016 10:16 am

Thu Nov 22, 2018 4:39 pm

Hello Bharat,
I have checked this thread and case 78331, and it seems that you have misconfigured the setup or have hardware issues. As an exception, I would suggest scheduling a remote session to review the config.
Before the meeting we must have the network diagram of the setup. It should show not only the interface names and their IPs, but also a picture of the physical connection for each interface. You can check the example here.
bkdilse
Posts: 52
Joined: Mon Sep 10, 2018 6:14 am

Tue Nov 27, 2018 9:27 am

Just rebuilt my nodes, because the Windows allocation units were not lined up with the RAID Sector size (in case that is causing an issue).

Waiting for 2nd Node to complete sync, before I re-test this.
artem (staff)
Staff
Posts: 8
Joined: Thu Nov 08, 2018 11:50 am

Fri Nov 30, 2018 11:47 am

Thank you for your response.

Just keep us updated.
bkdilse
Posts: 52
Joined: Mon Sep 10, 2018 6:14 am

Tue Dec 04, 2018 6:04 pm

Performance is a little better with the sector alignment corrected. I'm still testing this.

I'm now seeing the MB/s shoot up on the console, which I never saw before. I'll post updates once I've re-configured networking.
Boris (staff)
Staff
Posts: 805
Joined: Fri Jul 28, 2017 8:18 am

Tue Dec 04, 2018 7:43 pm

Feel free to post any news or observations.
bkdilse
Posts: 52
Joined: Mon Sep 10, 2018 6:14 am

Tue Dec 04, 2018 7:50 pm

Boris (staff) wrote:Feel free to post any news or observations.
OK, here you go...

The disk speed tests are giving varied results, which is understandable as the workload varies depending on what my VMs are doing, e.g. DB backups, full server backups, downloading patches, etc.

I haven't had time to do the network diagram, as I have a complex network, and am in the middle of downsizing/simplifying.

Reads are currently between 200MB/s - 350MB/s, which is great.
Writes have gone from a slow 2-4MB/s to 12-16MB/s, which is better, but still not right. Any suggestions on where this might be going wrong?
Boris (staff)
Staff
Posts: 805
Joined: Fri Jul 28, 2017 8:18 am

Thu Dec 06, 2018 4:56 pm

What is the scenario of the latest tests? Please share more details. The more details you provide, the better we will understand what is going wrong there.
bkdilse
Posts: 52
Joined: Mon Sep 10, 2018 6:14 am

Tue Dec 11, 2018 9:37 am

Boris (staff) wrote:What is the scenario of the latest tests? Please share more details. The more details you provide, the better we will understand what is going wrong there.
Sorry, been really busy.

This last test was on a Hyper-V Windows 10 guest, after I rebuilt the VSAN Nodes to align the 64K cluster size (OS) with the 64K sector size (RAID10).
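As a minimal sketch of what "aligned" means here: a layout is usually considered aligned when the partition starts on a RAID unit boundary and the cluster size and RAID unit size divide evenly into one another. The 64K values come from the post; the 1 MiB partition offset is an assumption, since the thread does not give it, and the function name is hypothetical.

```python
KIB = 1024


def layout_is_aligned(partition_offset: int, cluster_size: int, raid_unit: int) -> bool:
    """Check that clusters never straddle a RAID unit boundary unnecessarily."""
    # The partition must begin exactly on a RAID unit boundary.
    starts_on_boundary = partition_offset % raid_unit == 0
    # One size must be a whole multiple of the other.
    sizes_divide = raid_unit % cluster_size == 0 or cluster_size % raid_unit == 0
    return starts_on_boundary and sizes_divide


cluster_size = 64 * KIB         # NTFS allocation unit, from the post
raid_unit = 64 * KIB            # RAID10 "sector" size, from the post
partition_offset = 1024 * KIB   # ASSUMPTION: 1 MiB offset, not stated in the thread

print(layout_is_aligned(partition_offset, cluster_size, raid_unit))
```

The real partition offset would need to be read from the disk itself before drawing any conclusion from a check like this.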

Without a network diagram, here is an overview of the system:
VSAN01:
4 x 6TB RAID10
2 x 2TB ESXi Devices, with 2GB L1 Cache each in W/B mode
2 x 2TB Hyper-V Devices, with 2GB L1 Cache each in W/B mode
4 x iSCSI (1GB)
2 x Sync (1GB)

VSAN02:
4 x 6TB RAID10
2 x 2TB ESXi Devices, with 2GB L1 Cache each in W/B mode
2 x 2TB Hyper-V Devices, with 2GB L1 Cache each in W/B mode
4 x iSCSI (1GB)
2 x Sync (1GB)

HYPV01:
6 x (1GB) (Physical Team, with vSwitch, and 4 x iSCSI connections)

HYPV02:
6 x (1GB) (Physical Team, with vSwitch, and 4 x iSCSI connections)

ESXI01:
6 x (1GB) (vDS, and 4 x iSCSI connections using 4 Port Groups)

ESXI02:
6 x (1GB) (vDS, and 4 x iSCSI connections using 4 Port Groups)

As you can see from the above, each VSAN Node has 4 x iSCSI links, and each VM environment has a total of 8 iSCSI links (maybe too many?).
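The link counts above can be turned into rough bandwidth ceilings with a short sketch. It assumes that "1GB" in the listing means 1 Gbit/s Ethernet, that links aggregate perfectly, and that writes to the HA devices also have to cross the sync links to the partner node; none of this is confirmed in the thread, and the names are illustrative only.

```python
GBIT_TO_MB_S = 1000 / 8  # 1 Gbit/s = 125 MB/s (decimal units, no overhead)


def ceiling_mb_s(links: int, gbit_per_link: float = 1.0) -> float:
    """Nominal aggregate throughput of a group of links, in MB/s."""
    return links * gbit_per_link * GBIT_TO_MB_S


# Two hosts per environment, four iSCSI connections each (from the overview).
hosts_per_env = 2
iscsi_per_host = 4
env_iscsi_links = hosts_per_env * iscsi_per_host

iscsi_ceiling = ceiling_mb_s(4)  # per VSAN node, 4 x iSCSI
sync_ceiling = ceiling_mb_s(2)   # per VSAN node, 2 x Sync (ASSUMED write path)

print(f"iSCSI links per environment: {env_iscsi_links}")
print(f"iSCSI ceiling per node: {iscsi_ceiling:.0f} MB/s")
print(f"Sync ceiling per node:  {sync_ceiling:.0f} MB/s")
print(f"Best reported write (16 MB/s) vs sync ceiling: {16 / sync_ceiling:.1%}")
```

Under these assumptions the reported 12-16MB/s writes are far below even the sync ceiling, so the number of iSCSI links by itself does not look like the limiting factor.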

VMware environment has around 12 VMs running, and Hyper-V environment has around 14 VMs running.

Any thoughts on the above?
Do you need more details on any components?
Boris (staff)
Staff
Posts: 805
Joined: Fri Jul 28, 2017 8:18 am

Wed Dec 12, 2018 4:50 pm

What MPIO policy do you use for your Hyper-V nodes?
bkdilse
Posts: 52
Joined: Mon Sep 10, 2018 6:14 am

Thu Dec 13, 2018 6:57 am

Boris (staff) wrote:What MPIO policy do you use for your Hyper-V nodes?
I'm using "Least Queue Depth".
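For readers unfamiliar with the term, the idea behind a "Least Queue Depth" policy is that each new I/O goes to the path with the fewest requests currently in flight. The toy model below only illustrates that idea; it is not the Windows MPIO implementation, and the path names are hypothetical.

```python
def pick_least_queue_depth(outstanding: dict) -> str:
    """Return the path with the fewest in-flight I/Os (ties go to the first listed)."""
    return min(outstanding, key=outstanding.get)


# Four hypothetical iSCSI paths, all idle to begin with.
paths = {"iscsi-1": 0, "iscsi-2": 0, "iscsi-3": 0, "iscsi-4": 0}

# Issue eight I/Os with none completing in between.
for _ in range(8):
    chosen = pick_least_queue_depth(paths)
    paths[chosen] += 1

print(paths)  # each path ends up with 2 outstanding I/Os
```

In this simplified model the load spreads evenly when all paths behave the same; a slower path would accumulate queued requests and receive fewer new ones.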
Boris (staff)
Staff
Posts: 805
Joined: Fri Jul 28, 2017 8:18 am

Thu Dec 13, 2018 6:16 pm

Submit a support ticket at https://www.starwindsoftware.com/support-form and reference this thread.