In a two node setup, I don't see how it's possible to be very sure on which node is the master without other nodes acting as a witness. Having a 3rd ( or 5th!) node as a witness could prevent a split brain scenario.. I've been bitten by this design flaw several times over the last few years.
This was behind my suggestion of a maintenance mode operation although cannot the logic work something like this...
A typical UPS controlled shutdown would:
- Detect power outage
- Instigate shutdown after certain period (say 2 minutes)
- Issue command to shutdown virtual machines & standalone physical machines
- Issue command sometime later to shutdown the Hyper-V hosts
- Issue command to shutdown both SAN nodes
The above has to happen pretty quickly as most average loaded UPS can't last that long - 30 minutes if you are lucky.
So the two StarWind nodes receive notification that the host operating system is shutting down. At this time, hopefully most iSCSI targets will have disconnected but not necessarily but this is doom and gloom time so forceful disconnecting targets is the preferred option to simply loosing power mid-write.
At this time, both StarWind nodes are still in communication with each other - the network is still functional. Why cannot they communicate with each other and work out who has the master copy?
BEGIN
Primary node #1 to #2 - you shutdown yet?
Node #2 - not quite, hang on - flushing stuff (holding up shutdown at this point which is perfectly acceptable)
Node #2 - okay, I've forcefully disconnected any targets I had left and flushed to disk
Primary node - okay, I got that and I'm going to flag myself as master. You get that node #2?
Node #2 - yes, I got that and I'm letting shutdown carry on or Hello, node #2? Ohh you've gone - all bets off now
END
On power-up, the primary node knows it's got the master copy and starts processing iSCSI requests (like manually saying "Mark Sync"). If the other nodes come up first, then they don't do anything until either the primary node wakes up and starts synchronising with them or some manual intervention is carried out.
Cheers, Rob.