Minimising the risk of a full sync in StarWind HA following an outage
Posted: Thu Jun 23, 2022 12:03 pm
I have looked at:
https://knowledgebase.starwindsoftware. ... -blackout/
https://knowledgebase.starwindsoftware. ... xpectedly/
https://knowledgebase.starwindsoftware. ... may-start/
I am trying to build a lab setup for a 3-node HA HCI cluster plus an independent file witness, using StarWind Free, that will minimise the risk of a resync. The final setup will be a pair of 5 TB LUNs created in StarWind in high availability. The interlink sync channel will be either 40 Gb/s or 100 Gb/s (final spec to be decided), and I cannot test a 10 TB resync over either at this time. The servers will present a volume to StarWind based on a "hardware cache disabled" RAID1 set of 6 SSDs, so an all-flash infrastructure with no hardware caching (an 8 GB cache is available and I may look at enabling it for testing later).
Looking at the above links, I have come to the conclusion that to potentially avoid full resyncs (assuming all 3 nodes CAN come back online) I should do the following:
[*] Disable caching. Caching will most likely cause a full resync.
[*] Ensure that the primary node (if known) is booted first. StarWind will query the nodes + witness to determine "who is the most up to date" and "Fast" sync those changes to the partner nodes.
[*] Avoid the use of "Mark as synchronised" as much as possible, as this forces a full resync to the other partner nodes.
[*] If (for example) node 1 was determined by StarWind to have the most up-to-date data and node 1 is not available (i.e. node 1 has "died" and will not be restarted anytime soon), then a full resync is inevitable, as you must manually mark either node 2 or node 3 as synchronised (thus triggering a full resync).
Am I correct in thinking that StarWind marks LUNs as read-only when performing a full sync but allows writes when performing a "Fast" sync?
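Since I cannot test the 10 TB resync yet, here is a rough back-of-envelope sketch in Python of the best-case transfer time over each candidate link. It assumes decimal units (1 TB = 10^12 bytes), a single copy of the data crossing one link, and the full line rate being usable. The function name and the `efficiency` parameter are my own for illustration and have nothing to do with StarWind itself. Real resync times will be longer because of protocol overhead and storage throughput.

```python
def full_resync_seconds(data_tb: float, link_gbps: float, efficiency: float = 1.0) -> float:
    """Best-case seconds to copy data_tb terabytes over a link_gbps link.

    efficiency is a hypothetical fudge factor (0 < efficiency <= 1) for the
    fraction of line rate actually achieved; 1.0 means the theoretical minimum.
    """
    if link_gbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link_gbps must be > 0 and efficiency in (0, 1]")
    total_bits = data_tb * 1e12 * 8           # decimal terabytes to bits
    usable_bits_per_second = link_gbps * 1e9 * efficiency
    return total_bits / usable_bits_per_second


if __name__ == "__main__":
    # Two 5 TB LUNs = 10 TB in total, over the two candidate interlinks.
    for gbps in (40, 100):
        seconds = full_resync_seconds(10, gbps)
        print(f"{gbps} Gb/s: {seconds / 60:.1f} minutes at full line rate")
```

At full line rate this works out to roughly 33 minutes at 40 Gb/s and roughly 13 minutes at 100 Gb/s, which I treat as a lower bound only.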