Full Sync after First node restart

simlac
Posts: 6
Joined: Fri Oct 16, 2020 7:46 am

Fri Jan 15, 2021 10:55 am

Hi all,

I'm implementing a three-node cluster: two data nodes and a witness.
Everything runs smoothly, except when I reboot the "First" node.

When it comes back online, all the synchronised devices start a full sync.
When I restart the "Second" node instead, only a delta sync starts. The same is true when I use Heartbeat replication without a witness.
I do not put the devices in maintenance mode, as I want to simulate a node failure.

Is there a way to avoid that full sync?

Thanks!
yaroslav (staff)
Staff
Posts: 2346
Joined: Mon Nov 18, 2019 11:11 am

Fri Jan 15, 2021 12:11 pm

Greetings, please collect logs from all 3 boxes (https://knowledgebase.starwindsoftware. ... collector/) and share them with me via Google Drive. Also make sure to follow this reboot procedure: https://knowledgebase.starwindsoftware. ... installed/

It may be that node 1 is the only active server while node 2 is not synchronized.
simlac
Posts: 6
Joined: Fri Oct 16, 2020 7:46 am

Fri Jan 15, 2021 1:48 pm

Thanks for your quick answer!
You'll find the files here: https://drive.google.com/drive/folders/ ... sp=sharing

As for the shutdown procedure, everything works great when I follow it, but I would like to test the case of a node crash.

Thanks again,
yaroslav (staff)
Staff
Posts: 2346
Joined: Mon Nov 18, 2019 11:11 am

Mon Jan 18, 2021 4:53 am

I believe I found the problem. Please wait for all the HA devices to synchronize. I found that the last time, the service was restarted while the other side was not synchronized.
Also, I noticed that at 14:20 on 15/01 the service was stopped on 02, while at 14:21 it was stopped on the 2 other nodes. Full sync is expected here because the StarWind Service was stopped on all 3 nodes.
Please make sure to test an unexpected shutdown of a server as described here: https://knowledgebase.starwindsoftware. ... installed/ (in your case, use an unexpected shutdown instead of the graceful one).

If you aim to test the unexpected shutdown of all 3 servers, though, full synchronization is expected. You can avoid it by putting the StarWind HA devices into maintenance mode, as discussed in the article I shared.
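
To make the rule above easier to follow, here is a minimal Python sketch of it. It is only an illustration of what is stated in this thread, not StarWind code or a StarWind API. The RestartEvent class, its field names, and full_sync_expected are hypothetical names made up for the example.

```python
from dataclasses import dataclass


@dataclass
class RestartEvent:
    """Hypothetical description of one restart test (not a StarWind structure)."""
    all_nodes_stopped: bool      # the service was down on every node at once
    partner_synchronized: bool   # the other side was synchronized at restart time
    maintenance_mode: bool       # HA devices were put into maintenance mode first


def full_sync_expected(event: RestartEvent) -> bool:
    """Return True when, per the explanation in this thread, a full sync is expected."""
    # Maintenance mode is described in the thread as the way to avoid full sync.
    if event.maintenance_mode:
        return False
    # Full sync is expected if every node was stopped, or if the service was
    # restarted while the other side was not synchronized.
    return event.all_nodes_stopped or not event.partner_synchronized


# Single node crash with a synchronized partner: no full sync expected.
single_crash = RestartEvent(all_nodes_stopped=False,
                            partner_synchronized=True,
                            maintenance_mode=False)
print(full_sync_expected(single_crash))
```

With this model, a single node crash while the partner is synchronized should not lead to a full sync, which is the case being tested here.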
simlac
Posts: 6
Joined: Fri Oct 16, 2020 7:46 am

Mon Jan 18, 2021 7:43 am

Thanks once again for your work!

I think I had the issue even with fully synced devices and only the first node shut down.
I will try again with this configuration and the shutdown documentation you provided.

I'll keep you in the loop about it, and provide the logs if necessary.

Cheers,
yaroslav (staff)
Staff
Posts: 2346
Joined: Mon Nov 18, 2019 11:11 am

Mon Jan 18, 2021 9:32 am

Yes, please keep me posted.
simlac
Posts: 6
Joined: Fri Oct 16, 2020 7:46 am

Tue Jan 19, 2021 9:49 am

Hi again!
I reproduced the issue with a device, "testRebootW", fully synced (the other devices were left in maintenance mode).
Both the GUI and the CLI show the device as synced.
At around 10:10, I stopped the server hosting the device noted as "First", and on boot, a full synchronization started.

You'll find the generated logs here:
https://drive.google.com/drive/folders/ ... sp=sharing

Thanks again,
Cheers
yaroslav (staff)
Staff
Posts: 2346
Joined: Mon Nov 18, 2019 11:11 am

Thu Jan 21, 2021 8:50 pm

Greetings,

Could you log a case with us? Please fill in this form: https://www.starwindsoftware.com/support-form.
Use 444545 and this forum thread as your references.
simlac
Posts: 6
Joined: Fri Oct 16, 2020 7:46 am

Fri Jan 22, 2021 8:25 am

Done!
I defeated the evil blue reset button.
Once again, thanks for your work.

Cheers!
yaroslav (staff)
Staff
Posts: 2346
Joined: Mon Nov 18, 2019 11:11 am

Fri Jan 22, 2021 12:08 pm

:)