Correct procedure after planned downtime

Software-based VM-centric and flash-friendly VM storage + free version


robnicholson
Posts: 359
Joined: Thu Apr 14, 2011 3:12 pm

Tue Aug 05, 2014 3:25 pm

Just impatience ;-) The iSCSI targets are now reconnecting to the one node that's up. So the steps that we'd have to document currently, even after a clean shutdown, are:
  1. Decide which of the nodes you trust as having the latest version - although after a clean shutdown it should be safe to pick either
  2. Manually mark that node as synchronised
  3. Manually start the re-sync to the other node
The reason for not starting the re-sync first is that, in the lab, it currently doesn't seem to do a fast resync, so the resync could take some time. Marking one node as synchronised first means you can at least get everything else back online while the resync is carried out. A rough script of those three steps is below.
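Something like this is what I'd script - a minimal sketch only, where `mark_synchronised` and `start_resync` are hypothetical placeholders for whatever the StarWind console or its management interface actually exposes, not a real API:

Code:
# Sketch of the three documented recovery steps. The two StarWind calls
# below are hypothetical stand-ins, not a real API.

def mark_synchronised(node: str) -> None:
    raise NotImplementedError  # step 2: replace with the real console action

def start_resync(source: str, target: str) -> None:
    raise NotImplementedError  # step 3: replace with the real console action

def recover(nodes: list[str]) -> None:
    # Step 1: pick the node we trust as having the latest data. After a
    # clean shutdown either should do, so ask the operator to confirm.
    print("Nodes:", ", ".join(nodes))
    trusted = input("Which node do you trust as most recent? ").strip()
    assert trusted in nodes, f"unknown node {trusted!r}"

    # Step 2: mark it synchronised so clients can reconnect immediately.
    mark_synchronised(trusted)

    # Step 3: resync the remaining node(s) from the trusted one.
    for other in nodes:
        if other != trusted:
            start_resync(source=trusted, target=other)

recover(["SAN90", "SAN91"])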

Cheers, Rob.
robnicholson
Posts: 359
Joined: Thu Apr 14, 2011 3:12 pm

Tue Aug 05, 2014 3:30 pm

Another bit of feedback:
  • SAN91 powered up
  • SAN90 left powered down
  • TEST90 powered up - file server using iSCSI initiator
  • Manually marked SAN91 storage as synchronised
  • TEST90 reconnected and mounted storage
  • SAN90 powered up an hour later
  • Automatic synchronisation with SAN91 was carried out
So the difference here is that autosync kicked in with this scenario. But strangely it never kicks in when both nodes are brought up together after a clean shutdown. That seems a little counterintuitive - I'd expect autosync to kick in after a clean power-down and power-up sequence.

Cheers, Rob.
robnicholson
Posts: 359
Joined: Thu Apr 14, 2011 3:12 pm

Tue Aug 05, 2014 3:47 pm

Except that the iSCSI initiator has not yet reconnected to SAN90 (the second node brought up), even though the storage has since auto-synchronised.
robnicholson
Posts: 359
Joined: Thu Apr 14, 2011 3:12 pm

Tue Aug 05, 2014 4:57 pm

Hmm, not good - the LSFS storage is stuck in a continual synchronisation loop. I checked a while back and it was at over 50%. Just checked now and it's back to 9%. The non-LSFS storage has synchronised fine. Will leave it overnight.
Anatoly (staff)
Staff
Posts: 1675
Joined: Tue Mar 01, 2011 8:28 am

Sat Aug 09, 2014 11:54 am

robnicholson wrote:I'm also a little concerned about where disk IO goes when you power back up after a clean shutdown. The iSCSI connection is to both nodes but the nodes are not connected. What is the algorithm in this case? In the case of a power outage with clean shutdown, when power is restored and the SAN is restarted, disk IO will start. Where does it go?
I'm not sure that I understood your question correctly, so correct me if I'm answering the wrong thing, but StarWind will automatically detect which node has the most recent data and will initiate the synchronization in the proper direction automatically.
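Roughly speaking, the decision works like the sketch below - an illustration only, not the actual StarWind code; the timestamps stand in for whatever internal write markers the nodes really compare:

Code:
# Illustration: whichever node carries the newest write marker becomes
# the synchronization source; every other node becomes a target.
def pick_sync_direction(last_write: dict[str, int]) -> tuple[str, list[str]]:
    """last_write maps node name -> last-write time (epoch seconds)."""
    source = max(last_write, key=last_write.get)      # newest data wins
    targets = [n for n in last_write if n != source]  # the rest resync
    return source, targets

src, dst = pick_sync_direction({"SAN90": 1407252000, "SAN91": 1407252300})
print(f"sync {src} -> {', '.join(dst)}")  # sync SAN91 -> SAN90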
robnicholson wrote:it's taking a long time to sync. I'm tempted to raise a support call on this specific case.
Well, we've got nothing from you in our mailboxes yet. Let me know if you sent anything.
robnicholson wrote:Power restored but assume something has gone wrong with SAN90 - it hasn't powered up for some reason. SAN91 has come back online as well as TEST90. However, iSCSI is stuck in reconnecting (see attached screenshot).
It happens in one of two cases:
1. The StarWind target is in the Not Synchronized state
2. The MS iSCSI initiator has hung some connections and is glitching
robnicholson wrote:I've just found "Mark as synchronised" which I assume is the option to be chosen in the disaster scenario? It says it'll start processing client requests immediately - sounds like that's what I want it to do. The iSCSI initiator on TEST90 is still saying "Reconnecting" - will leave it a little longer to see if it manages to reconnect automatically.
Next time don't forget to check the FAQ: http://www.starwindsoftware.com/starwind-faq#q_11
robnicholson wrote:Just impatience ;-) The iSCSI targets are now reconnecting to the one node that's up. So the steps that we'd have to document currently, even after a clean shutdown, are:

Decide which of the nodes you trust as having the latest version - although after a clean shutdown it should be safe to pick either
Manually mark that node as synchronised
Manually start the re-sync to the other node

The reason for not starting the re-sync first is that, in the lab, it currently doesn't seem to do a fast resync, so the resync could take some time. Marking one node as synchronised first means you can at least get everything else back online while the resync is carried out.
Well, wow :)
The steps look good to me.
robnicholson wrote:Hmm, not good - the LSFS storage is stuck in a continual synchronisation loop. I checked a while back and it was at over 50%. Just checked now and it's back to 9%. The non-LSFS storage has synchronised fine. Will leave it overnight.
Have you checked for disk errors?
Check the event log to see when the synchronization broke, and then check the StarWind logs from that time.
Best regards,
Anatoly Vilchinsky
Global Engineering and Support Manager
www.starwind.com
av@starwind.com
robnicholson
Posts: 359
Joined: Thu Apr 14, 2011 3:12 pm

Sat Aug 09, 2014 12:55 pm

Anatoly (staff) wrote:Have you checked for disk errors?
I'm just doing that - my lab PC is currently on the bench with all its guts hanging out :-) I'm doing some low-level benchmarks plus wiping the disks using Active KillDisk:

http://www.killdisk.com

The side effect of wiping the disks is that it writes to every single sector - many times over if you use the higher wipe functions.

That said, I don't think it's disk errors but I'm willing to spend the time to remove that from the equation.

Cheers, Rob.
jimbyau
Posts: 22
Joined: Wed Nov 21, 2012 2:12 am

Sun Aug 10, 2014 3:07 am

epalombizio wrote:It would really be good if there was an option to determine the last HA member to write to disk and auto sync based on that member being the primary.

While I haven't installed v8 yet, I've been bitten by this issue multiple times in v6. If we have a power outage and both HA members shut down cleanly, there is no reason that I know of why it couldn't be set up to auto-sync. I was hoping this would have been fixed in v8.
I dealt with this issue by placing an additional UPS inline for our primary node; it will be the last node to go down and is programmed to shut down cleanly, whereas node #2 will go down hard straight away.
robnicholson
Posts: 359
Joined: Thu Apr 14, 2011 3:12 pm

Tue Aug 12, 2014 10:48 am

I'm still getting inconsistency here in the lab with this. Summary:
  1. SAN node #1 running
  2. SAN node #2 running
  3. Test server running using iSCSI
  4. Single LSFS storage without deduplication
Perform clean shutdown:
  1. Shutdown test server and wait for iSCSI targets to disconnect on both nodes
  2. Count to 10 to allow everything to flush to disk
  3. Check both nodes are synchronised
  4. Shutdown node #2 and wait until powered off
  5. Shutdown node #1 and wait until powered off
As far as I know, this is a clean shutdown and both nodes are synchronised. If I were scripting it, it would look something like the sketch below.
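A minimal sketch, assuming hypothetical status and shutdown helpers - StarWind's real management interface isn't shown here; replace the stubs with whatever the console or your out-of-band management actually exposes:

Code:
import time

# Hypothetical helpers - stand-ins, not a real StarWind or vendor API.
def targets_connected(node: str) -> bool:
    raise NotImplementedError  # is any iSCSI initiator still connected?

def is_synchronised(node: str) -> bool:
    raise NotImplementedError  # do the node's HA devices report Synchronized?

def shutdown(node: str) -> None:
    raise NotImplementedError  # e.g. an IPMI/iLO call; block until powered off

def wait_for(condition, timeout_s: int = 600, poll_s: int = 5) -> None:
    """Poll `condition` until it returns True or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if condition():
            return
        time.sleep(poll_s)
    raise TimeoutError("condition not met before timeout")

NODES = ["SAN90", "SAN91"]  # node #1, node #2

# Steps 1-2: the test server is already down; wait until no initiator is
# connected to either node, then give the cache a moment to flush.
for node in NODES:
    wait_for(lambda n=node: not targets_connected(n))
time.sleep(10)

# Step 3: both nodes must report Synchronized before either goes down.
for node in NODES:
    wait_for(lambda n=node: is_synchronised(n))

# Steps 4-5: node #2 first, then node #1.
shutdown("SAN91")
shutdown("SAN90")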

A wish here - ability to put StarWind into maintenance mode - a special state where a clean shutdown can be carried out safe in the knowledge that it will only wake up cleanly when one takes it out of maintenance mode. XenServer has this feature and it's a nice addition.

When I power back up in the order node #1, node #2 and test server, then more often than not the nodes come back up unsynchronised. They very occasionally (maybe 1 in 10) auto-synchronise on power-up. It doesn't seem to matter how long I wait between powering up the nodes, i.e. power up node #1, wait until the StarWind service is running, power up node #2, wait an hour and then start up the test server.

What this means is that in the case of a power outage, where the UPS sends the signal to shut down cleanly, StarWind will not come back up cleanly afterwards and will require manual intervention.

I'm sure this isn't how it's supposed to work?

Cheers, Rob.
robnicholson
Posts: 359
Joined: Thu Apr 14, 2011 3:12 pm

Tue Aug 12, 2014 10:55 am

Another snippet here. In the attached screenshot, I've just shut down the second node and it has disconnected from the console as expected. However, note that it says that the primary node is synchronised. So how come, when I shut this server down and restart it, it changes to Unsynchronised? Nothing has changed (well, aside from a reboot), so the node should still be synchronised, should it not? This may be the reason that auto-synchronisation isn't working.

Cheers, Rob.
Attachment: sshot-43.png (49.78 KiB)
robnicholson
Posts: 359
Joined: Thu Apr 14, 2011 3:12 pm

Tue Aug 12, 2014 11:00 am

Actually, you don't even have to power down. Simply restart the service and the storage goes from synchronised to unsynchronised. This just seems a little odd to me. No data is being written to the storage, so IMO it should not go unsynchronised just because the service is restarted.

Cheers, Rob.
Anatoly (staff)
Staff
Posts: 1675
Joined: Tue Mar 01, 2011 8:28 am

Thu Aug 14, 2014 8:40 am

robnicholson wrote:Count to 10 to allow everything to flush to disk
Why count? The StarWind service will not stop until the data has been flushed.
robnicholson wrote:A wish here - ability to put StarWind into maintenance mode - a special state where a clean shutdown can be carried out safe in the knowledge that it will only wake up cleanly when one takes it out of maintenance mode. XenServer has this feature and it's a nice addition.
May I double-check why we want it? I mean, what's the reason and purpose?
robnicholson wrote:Another snippet here. In the attached screenshot, I've just shut down the second node and it has disconnected from the console as expected. However, note that it says that the primary node is synchronised. So how come, when I shut this server down and restart it, it changes to Unsynchronised? Nothing has changed (well, aside from a reboot), so the node should still be synchronised, should it not? This may be the reason that auto-synchronisation isn't working.
The trick here is that the primary node doesn't know what happened while it was down - maybe you brought the second node up and made it primary, for example. That's why the sync is not automated here.
Best regards,
Anatoly Vilchinsky
Global Engineering and Support Manager
www.starwind.com
av@starwind.com
robnicholson
Posts: 359
Joined: Thu Apr 14, 2011 3:12 pm

Thu Aug 14, 2014 7:49 pm

Anatoly (staff) wrote:May I double-check why we want it? I mean, what's the reason and purpose?
Because of the confusion around node synchronisation after a clean power-down. Just an idea so that one can control the function manually. If auto-synchronisation worked better, it might not be needed.

This is possibly the most worrying scenario:
  1. Power to computer room fails
  2. UPS initiates shutdown sequence
  3. VMs shut down first
  4. Hyper-V hosts shut down next
  5. At this point, all targets are disconnected
  6. Shutdown SAN node #2
  7. Shutdown SAN node #1
Power is restored and the power-up sequence restarts things. The system will not automatically come back up, because all StarWind storage is unsynchronised and requires a manual resynchronisation step. That's not an ideal DR plan. What I'd want on power restore is something like the sketch below.
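A minimal sketch of the recovery logic I'd hope for, assuming hypothetical status helpers and a flag recorded at shutdown time - none of this is a real StarWind feature, it's what I'd like the service to do itself:

Code:
# All four helpers below are hypothetical placeholders, not StarWind
# functionality - this sketches automatic recovery on power restore.
def clean_shutdown_flag(node: str) -> bool:
    raise NotImplementedError  # was a clean, synchronised shutdown recorded?

def is_synchronised(node: str) -> bool:
    raise NotImplementedError  # does the node report Synchronized?

def mark_synchronised(node: str) -> None:
    raise NotImplementedError  # the console's "Mark as synchronised" action

def start_resync(source: str, target: str) -> None:
    raise NotImplementedError  # kick off resync from source to target

def auto_recover(nodes: list[str], shutdown_order: list[str]) -> None:
    # Only safe if every node recorded a clean, synchronised shutdown.
    if not all(clean_shutdown_flag(n) for n in nodes):
        raise RuntimeError("unclean shutdown - fall back to manual recovery")

    # The node shut down last holds the most recent state; trust it.
    trusted = shutdown_order[-1]
    mark_synchronised(trusted)

    # Resync everyone else from the trusted node.
    for node in nodes:
        if node != trusted:
            start_resync(source=trusted, target=node)

auto_recover(["SAN90", "SAN91"], shutdown_order=["SAN91", "SAN90"])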

Cheers, Rob.
Anatoly (staff)
Staff
Posts: 1675
Joined: Tue Mar 01, 2011 8:28 am

Wed Aug 20, 2014 2:47 pm

Well, the trick here is that it will be faster and easier to make the autosync work better rather than coding the maintenance mode. Moreover, if somebody is confused, my mailbox is always open 8)
Best regards,
Anatoly Vilchinsky
Global Engineering and Support Manager
www.starwind.com
av@starwind.com
Anatoly (staff)
Staff
Posts: 1675
Joined: Tue Mar 01, 2011 8:28 am

Fri Aug 22, 2014 9:47 am

Yes, you are correct about marking a node as "synchronized". In case of disaster recovery you mark the last failed node as "synchronized" and then choose it as the synchronization source for the rest of the nodes. As for request processing, running nodes send heartbeat ping requests to failed nodes, and until those nodes are back up they will not send any data to them, to ensure minimal data loss. In addition, to ensure data integrity, consider write-through caching as an option, since data-transfer acknowledgment packets are sent only once the data is actually written to disk and not simply stored in the cache of synced cluster nodes. I advise using a quorum drive to determine which node failed first/last; it is common with Windows failover clustering (a rough illustration of the idea is below):
http://technet.microsoft.com/en-us/libr ... 31739.aspx
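For illustration only - a toy sketch of the witness idea, where each node periodically stamps a shared quorum location and recovery trusts the freshest stamp; this is not how StarWind or Windows failover clustering actually implement it, and the quorum path is assumed:

Code:
import json
import time
from pathlib import Path

# Toy witness/quorum sketch: each node periodically stamps a file on a
# shared quorum disk; after an outage, the freshest stamp tells you which
# node wrote last. Illustration only, with an assumed quorum path.
QUORUM = Path("Q:/witness")

def heartbeat(node: str) -> None:
    """Called periodically by each node while it is up."""
    QUORUM.mkdir(parents=True, exist_ok=True)
    (QUORUM / f"{node}.json").write_text(
        json.dumps({"node": node, "stamp": time.time()})
    )

def last_writer() -> str:
    """After recovery: the node with the newest stamp failed last."""
    records = [json.loads(p.read_text()) for p in QUORUM.glob("*.json")]
    return max(records, key=lambda r: r["stamp"])["node"]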
Best regards,
Anatoly Vilchinsky
Global Engineering and Support Manager
www.starwind.com
av@starwind.com
robnicholson
Posts: 359
Joined: Thu Apr 14, 2011 3:12 pm

Fri Aug 22, 2014 3:35 pm

I'm still concerned about that very real scenario I posted last. Without manual intervention, the entire infrastructure cannot recover from a power outage. This is supposed to be high availability! Compared to a single node, this HA is actually worse from one point of view.

Imagine the discussion with one's boss:

Boss: Why were the systems offline for six hours over the weekend?
Me: Because the power went off
Boss: But I thought we'd bought "high-availability"
Me: We have but it needs manual intervention by IT to bring it back on-line
Boss: That doesn't sound very high availability to me?
Me: Yes but when it did come back up, nothing was lost and if a disk controller had failed, we'd have been fine
Boss: Flim flam, we still lost £500k of revenue because the website was down
Me: I'll get my coat...

Cheers, Rob.