StarWind’s automatic failover and failback

Software-based VM-centric and flash-friendly VM storage + free version

Moderators: anton (staff), art (staff), Max (staff), Anatoly (staff)

imrevo
Posts: 26
Joined: Tue Jan 12, 2010 9:20 am
Location: Germany

Tue Jan 12, 2010 9:23 am

Good morning,

The news item at http://www.starwindsoftware.com/news/27 says that StarWind supports automatic failover and failback.

Does it really support automatic failback?

cu
Volker
Constantin (staff)

Tue Jan 12, 2010 10:11 am

Automatic failback will be available in StarWind 5.5, which is scheduled for release around the end of January.
imrevo
Posts: 26
Joined: Tue Jan 12, 2010 9:20 am
Location: Germany

Tue Jan 12, 2010 1:40 pm

Hi,

that's great news. Hopefully, the status displays for HA Images will be more consistent in 5.5 than they are now:

I recently had one server showing 8 of 10 HA Images as "synchronized" (both partners) and 2 as "synchronized" (self only) while the partner server showed 5 HA Images as "not synchronized" (both partners) and 5 HA Images as "not synchronized" (self only).

Trying to sync the images resulted in various "already in sync" and "partner not in sync" messages (tried in either direction), and it was quite hard to figure out which server should sync which images in which direction.

If this happens again (hopefully it doesn't), I'll send you some logs because I really want to know why the 2 partners showed different status for several HA Images.

cu
Volker
Constantin (staff)

Tue Jan 12, 2010 1:57 pm

If one HA node is shown as synchronized and the other is not, the synchronization process is still running: one storage node has already finished, while the other is still retrieving data.
imrevo
Posts: 26
Joined: Tue Jan 12, 2010 9:20 am
Location: Germany

Tue Jan 12, 2010 2:24 pm

I know, and the progress counts up to 100.

But that's not what I meant.

Let's get into detail:

Image: HAImage1
Server1 is showing its own status and the partner status as "synchronized".
Server2 is showing its own status as "synchronized" and the partner status (Server1) as "not synchronized".

Or even worse, Server2 is showing both itself and the partner as "not synchronized" while Server1 shows both itself and the partner (Server2) as "synchronized".

This is _before_ any sync operation has been started, e.g. after a network failure at server2.

I take it that this shouldn't happen, should it?

cu
Volker
Constantin (staff)

Tue Jan 12, 2010 3:50 pm

What did you do to get into that state? Normally, if node 1 fails, you resync it from node 2. However, if as part of a test you simply pull the network plug and that network was also the sync channel, this can happen (the strange sync status you are reporting). This is why we suggest a redundant sync channel (in case one dies): if you lose your sync channel in such a way (by just yanking the network cord), bad things can happen!
Does that make sense?
In the event that you lose your sync channel completely, your best course of action to avoid corruption is to recreate the HA device using your existing image files.
imrevo
Posts: 26
Joined: Tue Jan 12, 2010 9:20 am
Location: Germany

Wed Jan 13, 2010 6:06 am

Hi Constantin,

Well, I did nothing to provoke it, as this is a production environment. It happened when the switch died, so you are right in assuming the network connectivity was lost.

Now that we've sorted that one out, I've got one more question.

As StarWind runs on top of Windows, reboots are inevitable (e.g. when patching the OS). Is it better to stop the service before the reboot, or doesn't that matter? I'm afraid that the service might be terminated ungracefully by Windows if it takes too long to shut down.

In the long run, I'll avoid those reboots completely by adding another StorageWorks X510 to replace the current server, which has some other services running and thus needs to be patched. The X510 is a nice device to use with StarWind if you disable the Windows Home Server functions and the SMB protocol and close all ports but 3389 and 326x :-) The performance is outstanding, and you can use the built-in RAID functionality of Windows for mirroring. Combined with StarWind HA, this makes a very reliable iSCSI target for my vSphere cluster (unless the switch dies :oops: )

Do you know of any devices like the X510, but with more than one NIC?

cu
Volker
Constantin (staff)

Wed Jan 13, 2010 1:44 pm

It doesn't matter; after a reboot you will need to resync the target either way. My personal opinion: the best option is to use Server Core - it has a small footprint, and the patches are smaller and fewer.
I can't recommend any particular hardware, sorry. For HA you need two open ports: 3260 and 3261.
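As a side note, a quick way to check that those two ports are actually reachable from a partner node is a plain TCP connect test. This is a generic sketch, not a StarWind utility, and the host name `partner-node` is a placeholder:

```python
# Hypothetical sketch: verify that the HA ports (3260/3261) are reachable
# on a partner node. This is a generic TCP connect test, not a StarWind tool.
import socket

def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    for port in (3260, 3261):
        state = "open" if port_open("partner-node", port) else "closed"
        print(f"port {port}: {state}")
```

Running this from each node against its partner quickly shows whether a firewall or a dead switch is blocking the sync/heartbeat ports.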
Aitor_Ibarra
Posts: 163
Joined: Wed Nov 05, 2008 1:22 pm
Location: London

Wed Jan 13, 2010 3:17 pm

I've not tried Server Core, but one big thing you can do to improve performance (if you are pushing StarWind really hard with 10GbE) is to disable the Windows Firewall, as packet inspection is quite a CPU overhead at high speeds. Only do this if the network used for iSCSI is secure!

As for HA - I hope that 5.5 can survive failure of the sync network, as even with redundant links it's still quite possible for it to fail, and such a failure shouldn't mean that you end up with data corruption. When an HA target is synced/created, the two nodes should elect one as primary, so that if the link between them fails while both servers are still running, the secondary one can immediately cut communications with the initiators.

cheers,

Aitor
Constantin (staff)

Wed Jan 13, 2010 3:41 pm

I don't know of any HA solution that can work without a heartbeat. But note: if the heartbeat fails, you will still be able to read data from storage; if you try to write data, however, it will be written to only one node, and that can lead to data corruption.
Aitor_Ibarra
Posts: 163
Joined: Wed Nov 05, 2008 1:22 pm
Location: London

Wed Jan 13, 2010 5:53 pm

Hi Constantin,

I wasn't saying that I expected HA to work without a heartbeat or some form of data synchronisation. What I mean is that once the heartbeat/sync link is down, one node has to shut down immediately or you will get data corruption. It's the classic split-brain problem. With StarWind 5 HA, MPIO on the initiators will write data to both nodes; if those nodes aren't in sync, they will both be corrupted very quickly, and it will be impossible to resync them. It's safer to do the sync over the same NIC you use for talking to initiators: that way, sync and MPIO will both fail at the same time, on one node only. But if you are using different NICs (for performance) and your sync channel goes, you get corruption and no way to recover. Hence the need for the partner servers to decide which one should be primary and which secondary in the event of a failure of the link between them.

cheers,

Aitor
JLaay
Posts: 45
Joined: Tue Nov 03, 2009 10:00 am

Thu Jan 14, 2010 9:44 am

Hi Aitor,

Concerning "Hence the need for the partner servers to decide which one should be primary and which secondary in the event of a failure of the link between them":

Am I correct in saying that if primary targets are divided over both HA servers, this complicates the split-brain issue/solution even further?
In my opinion, in that situation both servers have to decide who is boss at a level lower than the server level, namely per target.

Greetz Jaap
Constantin (staff)

Thu Jan 14, 2010 10:46 am

MPIO will write data to one node, but we cannot predict which one it will be written to. Also, you can use NIC teaming with two network cards and use the team both for connecting to the initiator and as the heartbeat channel.
Aitor_Ibarra
Posts: 163
Joined: Wed Nov 05, 2008 1:22 pm
Location: London

Thu Jan 14, 2010 11:14 am

Hi Jaap,

That's an interesting question. I guess it would be a bad thing for each target to have a different primary, since both servers would still be visible to the initiators; if the sync link goes down, you want all targets on one of the servers to disappear. However, if the shutdown was done by removing the targets, rather than by breaking the network connection, targets could be handled independently, and you could have some targets primary on one server and others on the other. This might actually be desirable if the initiators for a particular target are closer on the network to one of the nodes (e.g. in a geo cluster).

I suppose the easiest thing would be for StarWind to follow whatever has been set for the target - e.g. have the user set the primary node for each HA target. And maybe a server-level option which changes all the defined HA targets so that the server is always the primary/secondary node.

cheers,

Aitor
Constantin (staff)

Thu Jan 14, 2010 11:54 am

Let's imagine we mark target 1 as primary and target 2 as slave. Then the heartbeat fails. How will the initiator know which storage node is primary? The only possible workaround would be to make all customers use our iSCSI initiator, which could determine which node is primary once it is informed that the heartbeat has failed.