Recover HA after second node failure

harish.patil
Posts: 23
Joined: Sun Oct 27, 2019 8:31 am

Wed May 19, 2021 5:30 pm

We created a two-node HA setup with vSAN Free. Because of an OS error, we had to format and reinstall the secondary node. We now want to restore the earlier HA setup. How do we do that, and which scripts should we run, step by step? Can someone help with this? Our storage is currently running on the primary (healthy) node. Details are as follows:

Node 1 (Healthy)
IP: 10.10.10.6
HB Interface: 192.168.100.6
Devices:
quorum - HAImage1 - 50GB
datastor - HAImage2 - 500GB
filestor - HAImage3 - 1000GB
linuxstor - HAImage4 - 1000GB
csv - HAImage5 - 500GB

Node 2 (new)
IP: 10.10.10.8
HB Interface: 192.168.100.9


Please help!
harish.patil

Wed May 19, 2021 6:54 pm

P.S. The images are still available on the new node.
yaroslav (staff)
Staff
Posts: 2279
Joined: Mon Nov 18, 2019 11:11 am

Thu May 20, 2021 7:45 am

Welcome to the StarWind Forum. Are the HA devices on the affected node gray? If so, please share the StarWind.cfg file from the affected node with me. You can also try removing the grayed-out partner HA devices: on the healthy node, run RemoveHAPartner.ps1 from C:\Program Files\StarWind Software\StarWind\StarWindX\Samples\powershell. If that does not remove the gray HA devices, share the .cfg file with me here.
If the gray HA devices go away, replicate the healthy HA devices to the affected node with AddHAPartner.ps1 from the same folder.

Let me know if you have additional questions.
harish.patil

Thu May 20, 2021 8:01 am

Hi yaroslav,

Thanks for replying.

The affected node does not have any HA devices because it is newly installed; however, the image files are available. We want to know how to reconfigure the new node so that it communicates with the existing healthy node.

Thanks
yaroslav (staff)

Thu May 20, 2021 12:11 pm

Do you have an old config file from the affected node?
harish.patil

Thu May 20, 2021 12:36 pm

I do not have the "StarWind.cfg" file; however, I have the ".swdsk" and "_HA.swdsk" configuration files for each device. Let me know if that helps.
yaroslav (staff)

Thu May 20, 2021 4:34 pm

Hi,
There are two ways to restore everything.
1. Long and easy, with no risk: delete the files from the underlying storage and start replicating the disks by running AddHAPartner.
2. Complicated and fast: you need to modify the config file (share both StarWind.cfg files with me). I am not sure it will work as it should, so please make sure you have solid backups that are not located on StarWind HA devices.
harish.patil

Thu May 20, 2021 4:51 pm

I have attached the config files from both nodes. Meanwhile, we will try to recreate one of the images. We will also make sure we take proper backups of all VMs and data.
Attachments
StarWind-Config-Files.rar (16.13 KiB)
yaroslav (staff)

Thu May 20, 2021 5:15 pm

Hi, the .cfg file from the affected node still lists all the devices. Would you be so kind as to provide a screenshot from the Management Console?
harish.patil

Thu May 20, 2021 7:09 pm

Please find attached a screenshot of the console.
Attachments
IMG-20210521-WA0000.jpg (129.34 KiB)
yaroslav (staff)

Fri May 21, 2021 12:53 pm

You need to remove the replication partner that points to the affected server (run RemoveHAPartner.ps1 from C:\Program Files\StarWind Software\StarWind\StarWindX\Samples\powershell), remove the HA devices on the affected server from the underlying storage, and then recreate the replica with AddHAPartner.ps1.
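
For a single device, that sequence might look roughly like the sketch below. It assumes the sample scripts accept the named parameters from their param() blocks (addr, addr2, deviceName, partnerTargetName, imagePath2, imageName2, targetAlias2), that the scripts connect to the healthy node (192.168.100.6) with the reinstalled node (192.168.100.9) as the partner, and that HAImage4 (linuxstor) is the device being repaired. The partner target IQN is a placeholder and the image path is only an example of the path format, so both need to be replaced with real values.

```powershell
# Sketch only: the values below are assumptions to adapt, not a verified procedure.
$samples     = "C:\Program Files\StarWind Software\StarWind\StarWindX\Samples\powershell"
$healthyNode = "192.168.100.6"   # Node 1 (healthy)
$rebuiltNode = "192.168.100.9"   # Node 2 (reinstalled)
$deviceName  = "HAImage4"        # linuxstor

# Placeholder: the IQN of the stale partner target listed for this device.
$partnerTarget = "<partner target IQN>"

# 1. Remove the stale replication partner from the device on the healthy node.
& "$samples\RemoveHAPartner.ps1" -addr $healthyNode -deviceName $deviceName `
    -partnerTargetName $partnerTarget

# 2. The old HA image files on the reinstalled node are deleted by hand,
#    so the script just waits here until that has been done.
Read-Host "Delete the old image files for $deviceName on $rebuiltNode, then press Enter"

# 3. Recreate the replica on the reinstalled node from the healthy copy.
#    imagePath2 is an example of the path format; point it at the real folder.
& "$samples\AddHAPartner.ps1" -addr $healthyNode -addr2 $rebuiltNode -deviceName $deviceName `
    -imagePath2 "My computer\C\starwind" -imageName2 "linuxstor" -targetAlias2 "linuxstor"
```

Passing the values as named parameters leaves the sample files themselves unmodified, which makes it easier to repeat the same steps for the other devices.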
harish.patil

Fri May 21, 2021 6:34 pm

Hi,
Sorry for the late reply.
I ran the RemoveHAPartner code below successfully:

Code:

param($addr="192.168.100.9", $port=3261, $user="root", $password="starwind", $deviceName="HAImage4", $partnerTargetName="iqn.2008-08.com.starwindsoftware:revmaxsr6.revmax.co.in-linuxstor")

#
# RemoveHAPartner.ps1
#
Import-Module StarWindX

try
{
    Enable-SWXLog

    $server = New-SWServer $addr $port $user $password
    $server.Connect()

    Remove-HAPartner $server -deviceName $deviceName -partnerTargetName $partnerTargetName
}

catch
{
    Write-Host $_ -foreground red
}

finally
{
    $server.Disconnect()
}


But I am facing an issue with the AddHAPartner code, which is given below:

Code:

param($addr="192.168.100.9", $port=3261, $user="root", $password="starwind", $deviceName="HAImage4",
    $addr2="192.168.100.6",$port2=$port, $user2=$user, $password2=$password,
#secondary node
    $imagePath2="My computer\C\starwind",
    $imageName2="linuxstor",
    $createImage2=$true,
    $targetAlias2="linuxstor",
    $autoSynch2=$true,
    $poolName2="pool1",
    $syncSessionCount2=1,
    $aluaOptimized2=$true,
    $syncInterface2="#p1={0}" -f $addr,
    $hbInterface2="",
    $selfSyncInterface="#p1={0}" -f $addr2,
    $selfHbInterface=""
    )
    
Import-Module StarWindX

try
{
    Enable-SWXLog -level SW_LOG_LEVEL_DEBUG
    
    $server = New-SWServer $addr $port $user $password
    $server.Connect()

    $device = Get-Device $server -name $deviceName
    if( !$device )
    {
        Write-Host "Device not found" -foreground red
        return
    }

    $node = new-Object Node
    $node.HostName = $addr2
    $node.HostPort = $port2
    $node.Login = $user2
    $node.Password = $password2
    $node.ImagePath = $imagePath2
    $node.ImageName = $imageName2
    $node.CreateImage = $createImage2
    $node.TargetAlias = $targetAlias2
    $node.SyncInterface = $syncInterface2
    $node.HBInterface = $hbInterface2
    $node.AutoSynch = $autoSynch2
    $node.SyncSessionCount = $syncSessionCount2
    $node.ALUAOptimized = $aluaOptimized2
    $node.PoolName = $poolName2

    Add-HAPartner $device $node $selfSyncInterface $selfHbInterface
}
catch
{
    Write-Host $_ -foreground red 
}
finally
{
    $server.Disconnect()
}



The above code gives me the following error:
PS C:\Users\administrator.REVMAX\Desktop> C:\Users\administrator.REVMAX\Desktop\AddHaPartner.ps1
Exception calling "AddPartner" with "1" argument(s): "Request to REVMAXSR9.REVMAX.CO.IN ( 192.168.100.9 ) : 3261
-
control 0x000000FD39878A00 -AddPartner:"" -PartnerTargetName:"#p1=iqn.2008-08.com.starwindsoftware:revmaxsr6.revmax.co.in-linuxstor" -Priority:"#p1=2" -nodeType:"#p1=1" -PartnerIP:"#p1=REVMAXSR6.REVMAX.CO.IN:1" -AuthChapType:"#p1=none" -AuthChapLogin:"#p1=0b" -AuthChapPassword:"#p1=0b" -AuthMChapName:"#p1=0b" -AuthMChapSecret:"#p1=0b" -Replicator:"#p1=0"
-
200 Failed: invalid partner info.. "

PS C:\Users\administrator.REVMAX\Desktop>
I am also attaching a screenshot of the affected node's console.
Attachments
IMG-20210522-WA0000.jpg (110.01 KiB)
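
One way to compare the script's inputs with the request shown in the error is to print what the param() defaults resolve to. The sketch below only reproduces the string formatting from the AddHAPartner script above, using the same addresses; it does not contact StarWind, and `$resolved` is a name introduced here purely for illustration.

```powershell
# Addresses copied from the param() block of the AddHAPartner script in this post.
$addr  = "192.168.100.9"
$addr2 = "192.168.100.6"

# Same -f formatting the script uses for its interface defaults.
$resolved = [ordered]@{
    syncInterface2    = "#p1={0}" -f $addr
    hbInterface2      = ""
    selfSyncInterface = "#p1={0}" -f $addr2
    selfHbInterface   = ""
}

# Print each value in quotes so that empty strings are visible.
foreach ($name in $resolved.Keys) {
    Write-Host ("{0,-18} = '{1}'" -f $name, $resolved[$name])
}
```

Both heartbeat values come out empty and the sync interface strings contain only an IP address, which can then be checked against the defaults in the installed sample script.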
yaroslav (staff)

Sat May 22, 2021 9:03 am

Did you remove the replication partners from the underlying storage and from the StarWind Management Console before adding the replicas?
harish.patil

Sun May 23, 2021 8:05 am

I ran the "RemoveHAPartner" script as per your last reply. I am using the free edition, so I can't use the Management Console to remove the partner.
yaroslav (staff)

Sun May 23, 2021 12:33 pm

The Management Console is still available for monitoring. Did you run the script for all devices? Were the replicas removed for the selected devices in the Management Console?
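
If the script has not yet been run for every device, a loop along these lines could cover all five in one pass. It is a minimal sketch that reuses only the StarWindX calls already shown in this thread (New-SWServer, Connect, Remove-HAPartner, Disconnect) and assumes the healthy node at 192.168.100.6, port 3261, and the sample's default credentials. The partner target IQNs are placeholders, because only the linuxstor target name appears in this thread.

```powershell
Import-Module StarWindX

# Assumption: connect to the healthy node using the sample's default port and credentials.
$addr = "192.168.100.6"

# Device names are from the first post; the IQN values are placeholders to fill in.
$partnerTargets = [ordered]@{
    "HAImage1" = "<partner target IQN for quorum>"
    "HAImage2" = "<partner target IQN for datastor>"
    "HAImage3" = "<partner target IQN for filestor>"
    "HAImage4" = "<partner target IQN for linuxstor>"
    "HAImage5" = "<partner target IQN for csv>"
}

$server = $null
try
{
    $server = New-SWServer $addr 3261 "root" "starwind"
    $server.Connect()

    foreach ($deviceName in $partnerTargets.Keys)
    {
        try
        {
            Write-Host "Removing partner from $deviceName"
            Remove-HAPartner $server -deviceName $deviceName -partnerTargetName $partnerTargets[$deviceName]
        }
        catch
        {
            # Report the failure and continue with the next device.
            Write-Host "$deviceName : $_" -ForegroundColor Red
        }
    }
}
catch
{
    Write-Host $_ -ForegroundColor Red
}
finally
{
    # Disconnect only if the server object was actually created.
    if ($server) { $server.Disconnect() }
}
```

Catching errors per device means one failed removal does not stop the remaining devices from being processed; the result for each device can then be confirmed in the Management Console.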