Need Help - Setting up a 3 node VSan


kb9jlo
Posts: 20
Joined: Thu Jun 29, 2017 2:33 pm
Location: Central IL - USA

Fri Sep 15, 2017 7:26 pm

I found the answer: you can manually move the master in a VMware HA cluster.

https://communities.vmware.com/thread/543945
Attachments
ha_cluster.JPG (21.61 KiB)
Dan Reynolds
Sergey (staff)
Staff
Posts: 86
Joined: Mon Jul 17, 2017 4:12 pm

Mon Sep 18, 2017 4:34 pm

Thank you for your reply. In your previous message, you said:
kb9jlo wrote:I have a problem - when I set up the new replica it didn't ask me for the cache memory and set it at 1 MB. I'm experiencing high latency numbers now.
Could you please clarify how you replicated it?
kb9jlo
Posts: 20
Joined: Thu Jun 29, 2017 2:33 pm
Location: Central IL - USA

Mon Sep 18, 2017 5:22 pm

Sergey (staff) wrote:Thank you for your reply. In your previous message, you said:
kb9jlo wrote:I have a problem - when I set up the new replica it didn't ask me for the cache memory and set it at 1 MB. I'm experiencing high latency numbers now.
Could you please clarify how you replicated it?
Yes, I created the new replica from another node. If it asked for the cache memory size, I either don't remember or missed it. It did ask for the disk size and the flash cache size.

What happens if I restart the service while that particular node is serving up the disks at that time? When I checked, I saw that all the VMware hosts have iSCSI sessions to this particular node. BTW, all nodes are synchronized.
Dan Reynolds
Boris (staff)
Staff
Posts: 805
Joined: Fri Jul 28, 2017 8:18 am

Wed Sep 20, 2017 2:19 pm

Before restarting the service or rebooting a StarWind VSAN VM, make sure you have active paths to both StarWind VSAN VMs (i.e. the Round Robin policy is used). In your case, you seem to be using the Most Recently Used policy, which does not give you active paths to both nodes. I highly recommend changing it to Round Robin to avoid issues or downtime during a service stop/restart or a StarWind VSAN VM reboot.
Boris (staff)
Staff
Posts: 805
Joined: Fri Jul 28, 2017 8:18 am

Wed Sep 20, 2017 2:36 pm

As for the cache size, please refer to https://knowledgebase.starwindsoftware. ... dance/661/ for operations on L2 cache and to https://knowledgebase.starwindsoftware. ... -l1-cache/ for L1 cache. You can check the size of both caches in StarWind Management Console, if it is available, or in the .swdsk files. Note that L2 (flash) cache is standalone, so it is not shown in the HA .swdsk file, only in the ordinary one.
kb9jlo
Posts: 20
Joined: Thu Jun 29, 2017 2:33 pm
Location: Central IL - USA

Wed Sep 20, 2017 3:21 pm

Boris (staff) wrote:Before restarting the service or rebooting a StarWind VSAN VM, make sure you have active paths to both StarWind VSAN VMs (i.e. the Round Robin policy is used). In your case, you seem to be using the Most Recently Used policy, which does not give you active paths to both nodes. I highly recommend changing it to Round Robin to avoid issues or downtime during a service stop/restart or a StarWind VSAN VM reboot.
Can you remind me where to set this up, please? Is this on the StarWind side or the VM side?
Dan Reynolds
kb9jlo
Posts: 20
Joined: Thu Jun 29, 2017 2:33 pm
Location: Central IL - USA

Wed Sep 20, 2017 3:47 pm

kb9jlo wrote:Can you remind me where to set this up, please? Is this on the StarWind side or the VM side?
NEVER MIND! I found it. They're all set to round robin.
Dan Reynolds
Boris (staff)
Staff
Posts: 805
Joined: Fri Jul 28, 2017 8:18 am

Wed Sep 20, 2017 3:49 pm

The Round Robin multipathing policy is set separately for each datastore. In the standalone vSphere client, select the node, go to Configuration -> Storage, right-click the StarWind datastore (or each of them, if you have more than one), and select Properties -> Manage Paths. There, set the policy to Round Robin and press Change.
Screenshot_3.png (47.04 KiB)
For the web client, select Storage -> your StarWind datastore -> Manage tab -> Connectivity and Multipathing. Then, select the node and press Edit Multipathing.
Here is a screenshot from the web console:
Screenshot_2.png (54.93 KiB)
These instructions (specifically those for the web client) are valid for ESXi 6.0, while for ESXi 6.5 they may differ a bit. Yet, I believe you have the general idea now. Let me know if you need additional instructions.
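If you would rather check the policy from the command line than click through every datastore, here is a minimal sketch in Python. It assumes you have saved the text output of `esxcli storage nmp device list` from a host and that each device block contains a `Path Selection Policy:` line with `VMW_PSP_RR` meaning Round Robin. The function names, the device IDs, and the sample text below are made up for illustration and are not taken from this system.

```python
ROUND_ROBIN = "VMW_PSP_RR"  # PSP name ESXi reports for Round Robin


def parse_nmp_device_list(text):
    """Map each device ID to its 'Path Selection Policy' value.

    Assumes the layout printed by `esxcli storage nmp device list`:
    an unindented device ID line followed by indented 'Key: Value' lines.
    """
    policies = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if not line[0].isspace():
            # Unindented line starts a new device block.
            current = line.strip()
            continue
        key, sep, value = line.strip().partition(":")
        if current and sep and key == "Path Selection Policy":
            policies[current] = value.strip()
    return policies


def devices_not_round_robin(text):
    """Return the sorted IDs of devices whose policy is not Round Robin."""
    policies = parse_nmp_device_list(text)
    return sorted(dev for dev, psp in policies.items() if psp != ROUND_ROBIN)


# Hypothetical sample output, shortened; real output has more fields.
SAMPLE = """\
eui.aaaa000000000001
   Device Display Name: Hypothetical Disk A
   Path Selection Policy: VMW_PSP_MRU
   Path Selection Policy Device Config: Current Path=vmhba33:C0:T0:L0
   Working Paths: vmhba33:C0:T0:L0
eui.aaaa000000000002
   Device Display Name: Hypothetical Disk B
   Path Selection Policy: VMW_PSP_RR
   Working Paths: vmhba33:C0:T1:L0, vmhba33:C1:T1:L0
"""

if __name__ == "__main__":
    print(devices_not_round_robin(SAMPLE))
```

Any device the script reports can then be switched in the client as described above, or, as far as I know, with `esxcli storage nmp device set --device <device id> --psp VMW_PSP_RR` on the host.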
kb9jlo
Posts: 20
Joined: Thu Jun 29, 2017 2:33 pm
Location: Central IL - USA

Wed Sep 20, 2017 3:52 pm

Thanks. I didn't expect a reply back so fast! 8)

It always takes me a bit to remember where that is, and I didn't remember whether there was something on the StarWind side either. Fortunately, I set it up when I started. Again, I just didn't remember. There are a lot of little pieces to this complex puzzle! :D
Dan Reynolds
Boris (staff)
Staff
Posts: 805
Joined: Fri Jul 28, 2017 8:18 am

Wed Sep 20, 2017 4:11 pm

You are welcome.
kb9jlo
Posts: 20
Joined: Thu Jun 29, 2017 2:33 pm
Location: Central IL - USA

Wed Oct 04, 2017 9:03 pm

Every time my system starts resyncing, the latency across the board jumps up to 100% and about half of my VM desktops become almost unusable.

I've tried turning down the "Synchronization Priority", but that doesn't seem to help much. I have dual InfiniBand loops for sync/heartbeat. What are some things I can do to diagnose and repair this?

Does a sync start completely over? It seems like it takes FOREVER no matter what speed you have it set at.

Other than a reboot of a StarWind node, what causes resyncs? It seems like they're kicking off for no reason...
Dan Reynolds
Boris (staff)
Staff
Posts: 805
Joined: Fri Jul 28, 2017 8:18 am

Thu Oct 05, 2017 10:13 am

kb9jlo,

Could you please describe the scenario in which a full sync is triggered on your system? Details about whether L1/L2 cache is configured on your StarWind datastores would be appreciated as well.