HA 2 Node Behavior?

Software-based VM-centric and flash-friendly VM storage + free version

Moderators: anton (staff), art (staff), Max (staff), Anatoly (staff)

Rusty
Posts: 9
Joined: Sat Feb 01, 2014 4:05 pm

Wed Feb 05, 2014 10:33 pm

Hi,

Just wondering this...

If I have an HA environment with 2 nodes, and the nodes stop seeing each other (network disconnect), what is the default behavior? I'm assuming that in order to be HA, they both need to stay up and running.

But then what happens when network connectivity between them is restored? Does the administrator have to choose which copy survives? If so, can that be done differently on a per-LUN basis, or do you just choose a surviving server?

Also, with a 2-node HA setup, can you easily upgrade to a 3-node setup and pay the difference in licensing? Can this be done without downtime?
Anatoly (staff)
Staff
Posts: 1675
Joined: Tue Mar 01, 2011 8:28 am

Thu Feb 06, 2014 9:52 am

"...and the nodes stop seeing each other (network disconnect)"
Are we talking about a total network failure or just the SyncChannel malfunctioning?
Best regards,
Anatoly Vilchinsky
Global Engineering and Support Manager
www.starwind.com
av@starwind.com
Rusty
Posts: 9
Joined: Sat Feb 01, 2014 4:05 pm

Thu Feb 06, 2014 1:39 pm

Probably a network failure, but that could be due to a bad cable, bad NIC, bad switch, etc. So the physical port may remain up (or may not), but the connection between the servers may go down.
Anatoly (staff)
Staff
Posts: 1675
Joined: Tue Mar 01, 2011 8:28 am

Thu Feb 06, 2014 3:38 pm

Actually, according to our best practices, a total network failure shouldn't happen, since we recommend keeping the SyncChannel and the HeartBeat, which StarWind nodes use to monitor the health status of the HA device(s), on different data links and subnets. Anyway, there are three types of failure:
1. SyncChannel failure. When something is wrong with the SyncChannel, the StarWind nodes check each other's health status through the HeartBeat channel and decide which node contains the most recent data. That node stays online, while the other one goes offline temporarily until the network is fixed.
2. HB (HeartBeat) channel failure. Nothing actually happens here: the SyncChannel is the primary channel that monitors the health status of the nodes, so the HA data store just continues running.
3. Failure of all links. This is the bad situation. Since StarWind has no way to see what is wrong with the second node, each node decides that its partner is down and marks itself as Synchronized. This is a typical split-brain issue, which causes data corruption.
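
For readers who prefer code, the three cases above can be summarized as a small decision table. This is only a toy model of the behavior described in this post, not StarWind code or a StarWind API; the names `Outcome` and `ha_outcome` are made up for illustration.

```python
from enum import Enum


class Outcome(Enum):
    """Hypothetical labels for the outcomes described in the post."""
    KEEPS_RUNNING = "HA data store keeps running on both nodes"
    PARTNER_OFFLINE = "node with the most recent data stays online, partner goes offline"
    SPLIT_BRAIN = "each node marks itself Synchronized (split brain, data corruption risk)"


def ha_outcome(sync_channel_up: bool, heartbeat_up: bool) -> Outcome:
    """Map link state to the outcome described for a 2-node HA device."""
    if sync_channel_up:
        # Case 2 (HeartBeat failure) and the healthy case: the SyncChannel
        # is the primary health monitor, so nothing changes.
        return Outcome.KEEPS_RUNNING
    if heartbeat_up:
        # Case 1: SyncChannel is down, but the nodes can still negotiate
        # over the HeartBeat channel which copy is the most recent.
        return Outcome.PARTNER_OFFLINE
    # Case 3: no link at all, so neither node can see its partner.
    return Outcome.SPLIT_BRAIN


if __name__ == "__main__":
    for sync_up in (True, False):
        for hb_up in (True, False):
            result = ha_outcome(sync_up, hb_up)
            print(f"sync={sync_up!s:5} heartbeat={hb_up!s:5} -> {result.value}")
```

The only combination that ends in split brain is the one where both links are down, which is why the SyncChannel and the HeartBeat should sit on different data links and subnets.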

I hope that helps you better understand how StarWind handles network failures.
Rusty
Posts: 9
Joined: Sat Feb 01, 2014 4:05 pm

Mon Feb 10, 2014 7:43 pm

One more question. If we start with 2 nodes and want to expand to 3 nodes, can that be done seamlessly without downtime?
Anatoly (staff)
Staff
Posts: 1675
Joined: Tue Mar 01, 2011 8:28 am

Tue Feb 11, 2014 3:18 pm

"One more question. If we start with 2 nodes and want to expand to 3 nodes, can that be done seamlessly without downtime?"
I can guarantee that! 8)