Recovery after disk failure

Software-based VM-centric and flash-friendly VM storage + free version

Moderators: anton (staff), art (staff), Max (staff), Anatoly (staff)

jdeshin
Posts: 63
Joined: Tue Sep 08, 2020 11:34 am

Tue Sep 15, 2020 10:36 am

Hi, Yaroslav.
Do you have any ideas or methods for restoring storage after a disk failure?
yaroslav (staff)
Staff
Posts: 2279
Joined: Mon Nov 18, 2019 11:11 am

Tue Sep 15, 2020 10:47 am

Hi, sorry for the delay. Type the IP addresses in place of {0}:

$syncInterface2="#p1={0}:3260" -f $addr,
$hbInterface2="#p1={0}:3260" -f $addr,
$selfSyncInterface="#p1={0}:3260" -f $addr2,
$selfHbInterface="#p1={0}:3260" -f $addr2

Here is more info on these parameters: https://www.starwindsoftware.com/help/H ... ategy.html. There should be a familiar sample script at the bottom.
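For readers unfamiliar with the syntax: `-f` is PowerShell's format operator, so `{0}` is filled in from the variable on the right-hand side when the script runs. A minimal sketch, assuming two hypothetical node addresses (192.168.0.1 and 192.168.0.2 are placeholders, not values from this thread):

```powershell
# Hypothetical addresses, for illustration only.
$addr  = "192.168.0.1"   # first node (assumed)
$addr2 = "192.168.0.2"   # second node (assumed)

# The -f operator replaces {0} with the value on its right.
$syncInterface2    = "#p1={0}:3260" -f $addr
$selfSyncInterface = "#p1={0}:3260" -f $addr2

$syncInterface2      # #p1=192.168.0.1:3260
$selfSyncInterface   # #p1=192.168.0.2:3260
```

So either setting `$addr` and `$addr2` correctly or typing the addresses directly into the strings should give the same result.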
jdeshin
Posts: 63
Joined: Tue Sep 08, 2020 11:34 am

Tue Sep 15, 2020 11:40 am

I don't think that is the solution, since the parameters are substituted correctly. Please see the screenshot below.
[Screenshot: Безымянный.png]
Have you blocked the restore functionality in the free version?
yaroslav (staff)
Staff
Posts: 2279
Joined: Mon Nov 18, 2019 11:11 am

Tue Sep 15, 2020 12:07 pm

No.
This script makes it possible to restore the HA device synchronization partner. Please fill in this form to log a new case: https://www.starwindsoftware.com/support-form. Please use this forum thread as the reference.
Let us take a closer look at this issue together.
jdeshin
Posts: 63
Joined: Tue Sep 08, 2020 11:34 am

Tue Sep 15, 2020 12:40 pm

Yaroslav, I have created a support request as you recommended.
I hope we find a solution, and I will publish it here.
Kind regards.
yaroslav (staff)
Staff
Posts: 2279
Joined: Mon Nov 18, 2019 11:11 am

Tue Sep 15, 2020 12:54 pm

Thank you!
Serhi
Posts: 21
Joined: Mon Mar 25, 2019 4:01 pm

Tue Sep 15, 2020 1:19 pm

In my opinion, there is only one way to solve the problem you are facing. You have to collect the logs and provide them to the StarWind support team. You can find the link to StarWind Log Collector below:
https://knowledgebase.starwindsoftware. ... collector/

Then, please upload them somewhere.
jdeshin
Posts: 63
Joined: Tue Sep 08, 2020 11:34 am

Tue Sep 15, 2020 7:48 pm

Dear Serhi,
Thank you very much!
I will collect the logs.
yaroslav (staff)
Staff
Posts: 2279
Joined: Mon Nov 18, 2019 11:11 am

Wed Sep 16, 2020 3:44 am

Thank you!
jdeshin
Posts: 63
Joined: Tue Sep 08, 2020 11:34 am

Fri Sep 18, 2020 10:48 am

Yesterday the engineer and I looked into the problem and concluded that it is impossible to restore operability by regular means in the free version.
A possible reason is missing or mismatched functionality in the COM objects that are used when working through PowerShell and the REST interface of the StarWind service.
This information will be shared with the development team.
So, at this time there is only one way to restore functionality: restoring the configuration files, as I suggested earlier.
I hope the StarWind team will fix this.
yaroslav (staff)
Staff
Posts: 2279
Joined: Mon Nov 18, 2019 11:11 am

Fri Sep 18, 2020 11:01 am

Thank you for the update and your time and effort!
jdeshin
Posts: 63
Joined: Tue Sep 08, 2020 11:34 am

Tue Nov 03, 2020 9:42 pm

Hello all!
In version 8.0.0.13861 the problem was fixed.
Many thanks to the StarWind support service and personally to Ivan Ischenko!
I hope that the product will get better and better with every release.

Below is a short explanation of how to recover functionality after a disk failure.
For example, we have the following environment:
[Screenshot: labsheet1.png]
At this point we have replaced the failed disk, assigned it the same drive letter, and created a folder for the images.
The screenshot of the StarWind console:
[Screenshot: afterfailure.png]
First, we need to delete all information about the failed devices from StarWind.cfg. The StarWind Virtual SAN service must be stopped before editing StarWind.cfg; otherwise, the old records reappear in the file after the service restarts. I suspect that the service writes them to the file from memory while stopping.
In my case, the records that should be deleted are shown in the picture below:
[Screenshot: editconfig.png]
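The stop-then-edit order can be sketched in PowerShell as follows. This is only an illustration, not part of the official procedure: the service name `StarWindService` and the configuration path are assumptions based on a default installation, so check both on your node first. The backup copy is an extra precaution that is not mentioned in the steps above. Run it from an elevated session on the failed node.

```powershell
# Assumed values; verify them on your system before running.
$serviceName = "StarWindService"   # assumed name of the StarWind Virtual SAN service
$configPath  = "C:\Program Files\StarWind Software\StarWind\StarWind.cfg"   # assumed default path

# Stop the service first, so it cannot write the old records back on shutdown.
Stop-Service -Name $serviceName

# Keep a timestamped copy of the configuration before touching it.
$backupPath = "{0}.{1}.bak" -f $configPath, (Get-Date -Format "yyyyMMdd-HHmmss")
Copy-Item -Path $configPath -Destination $backupPath

# Open the file and remove the records of the failed devices by hand.
notepad.exe $configPath

# Do not start the service yet: the failed HA partner is removed on the
# healthy node first, and the service on this node is started after that.
```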
Last edited by jdeshin on Tue Nov 03, 2020 10:08 pm, edited 1 time in total.
jdeshin
Posts: 63
Joined: Tue Sep 08, 2020 11:34 am

Tue Nov 03, 2020 9:52 pm

Then we need to delete the failed HA partner on the healthy node. To do this, we need to find the device name and target name of the failed device. You can do this as shown in the screenshot below:
[Screenshot: gettarget.png]
Then we delete this HA partner using the RemoveHAPartner.ps1 script, which is located in the StarWindX\Samples\powershell folder:
[Screenshot: removetarget.png]
After that, we need to start the StarWind Virtual SAN service on the failed node and add it as an HA partner using the AddHAPartner.ps1 script:
[Screenshot: addha.png]
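Since the exact script parameters are only visible in the screenshots, one way to check which parameters your copies of the sample scripts accept is to ask PowerShell to list them. A small sketch, assuming the samples live under the default installation path (the `$samples` value is an assumption; adjust it to your setup):

```powershell
# Assumed location of the StarWindX sample scripts; adjust if installed elsewhere.
$samples = "C:\Program Files\StarWind Software\StarWind\StarWindX\Samples\powershell"

foreach ($script in "RemoveHAPartner.ps1", "AddHAPartner.ps1") {
    $path = Join-Path -Path $samples -ChildPath $script
    "{0} parameters:" -f $script
    # Get-Command on a script path exposes the names declared in its param() block.
    (Get-Command -Name $path).Parameters.Keys
}
```

This lists only the parameter names. The default values are in the `param(...)` block at the top of each script, so it is worth opening the scripts and reviewing them before running.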
jdeshin
Posts: 63
Joined: Tue Sep 08, 2020 11:34 am

Tue Nov 03, 2020 10:03 pm

After that, the failed node returns to a working state:
[Screenshot: afterrecover.png]
The one downside is that if you use several HA devices, all of them will be unavailable while the StarWind service restarts.
yaroslav (staff)
Staff
Posts: 2279
Joined: Mon Nov 18, 2019 11:11 am

Wed Nov 04, 2020 5:24 am

Greetings,

Thanks a lot for such a detailed guide. We highly appreciate your collaboration.