RAID Failure - Rebuilding Node

bg_IT
Posts: 14
Joined: Mon Aug 13, 2018 11:22 pm

Sun Sep 26, 2021 12:49 pm

Morning all,

I provide support for a small call center. Back in 2017/2018 I configured a two-node StarWind VSAN (Free) on a couple of servers, then set them up as hyperconverged Hyper-V hosts. After some initial issues the system has run flawlessly. This past week the unthinkable happened: a RAID drive failed, and during the rebuild a second drive failed. To StarWind's credit, we were only down about 10 minutes while the VMs came back up on the other node (thank you for this great product!). That being said, I'm in the unfortunate position of rebuilding the server.

I did have a little luck: before destroying the virtual disk and rebuilding (bare metal), I was able to gather StarWind.cfg, Datastore_HA.swdsk, Datastore.swdsk, and a copy of the VSAN version we were using from the old server.

Finally, my question: would it be easier to set up the new node using those files, or to set up a fresh node and use the AddHaPartner PowerShell script to add it to the VSAN? Can you give a brief overview of the recommended steps?

Thank you
yaroslav (staff)
Staff
Posts: 2361
Joined: Mon Nov 18, 2019 11:11 am

Sun Sep 26, 2021 9:45 pm

Hi,

Thanks for the kind words!
Remove the replication partner for the faulty server by running the RemoveHAPartner script from the HEALTHY server. Then, once the faulty array has been rebuilt, run AddHaPartner from the HEALTHY server.
bg_IT
Posts: 14
Joined: Mon Aug 13, 2018 11:22 pm

Sun Sep 26, 2021 11:33 pm

Thank you Yaroslav,

It looks like the AddHaPartner.ps1 script is missing from the version I'm running. I could download a newer version and take the script from it. Would that cause any problems?

Thanks again
yaroslav (staff)
Staff
Posts: 2361
Joined: Mon Nov 18, 2019 11:11 am

Mon Sep 27, 2021 3:01 am

Hi,

I would recommend updating both hosts and their Management Console (just in case). But first, fix the replication; do the update once your system is healthy again.
Let me know if you need the script alone.
bg_IT
Posts: 14
Joined: Mon Aug 13, 2018 11:22 pm

Mon Sep 27, 2021 11:34 am

Thank you,

If you could provide a copy of the PowerShell script, that would be great.
yaroslav (staff)
Staff
Posts: 2361
Joined: Mon Nov 18, 2019 11:11 am

Mon Sep 27, 2021 12:43 pm

Code:

param($addr="192.168.0.1", $port=3261, $user="root", $password="starwind", $deviceName="HAImage1",
	$addr2="192.168.0.2", $port2=$port, $user2=$user, $password2=$password,
#secondary node
	$imagePath2="My computer\C\starwind",
	$imageName2="partnerImg22",
	$createImage2=$true,
	$targetAlias2="partnerha22",
	$autoSynch2=$true,
	$poolName2="pool1",
	$syncSessionCount2=1,
	$aluaOptimized2=$true,
	$syncInterface2="#p1={0}:3260" -f $addr,
    $hbInterface2="",
    $selfSyncInterface="#p1={0}:3260" -f $addr2,
    $selfHbInterface=""
	)
	
Import-Module StarWindX

try
{
    Enable-SWXLog -level SW_LOG_LEVEL_DEBUG
    
    $server = New-SWServer $addr $port $user $password
    $server.Connect()

	$device = Get-Device $server -name $deviceName
	if( !$device )
	{
		Write-Host "Device not found" -foreground red
		return
	}

    $node = new-Object Node
    $node.HostName = $addr2
    $node.HostPort = $port2
    $node.Login = $user2
    $node.Password = $password2
    $node.ImagePath = $imagePath2
    $node.ImageName = $imageName2
    $node.CreateImage = $createImage2
    $node.TargetAlias = $targetAlias2
    $node.SyncInterface = $syncInterface2
    $node.HBInterface = $hbInterface2
	$node.AutoSynch = $autoSynch2
	$node.SyncSessionCount = $syncSessionCount2
	$node.ALUAOptimized = $aluaOptimized2
	$node.PoolName = $poolName2

    Add-HAPartner $device $node $selfSyncInterface $selfHbInterface
}
catch
{
	Write-Host $_ -foreground red 
}
finally
{
	# $server is still $null if New-SWServer itself failed, so guard the call
	if( $server )
	{
		$server.Disconnect()
	}
}
See the script above. It is also attached as AddHaPartner.zip.
bg_IT
Posts: 14
Joined: Mon Aug 13, 2018 11:22 pm

Mon Sep 27, 2021 4:11 pm

Just checking to see if I'm on the right track here.

Excerpt from StarWind.cfg on the failed server:

Code:

<device file="My Computer\X\witness\witness.swdsk" node="-1" name="imagefile1"/>
    <device name="HAImage1" OwnTargetName="iqn.2008-08.com.starwindsoftware:node1-witness" file="My Computer\X\witness\witness_HA.swdsk" serialId="574A5713E3DCBE32" asyncmode="yes" readonly="no" highavailability="yes" buffering="no" header="65536" reservation="no" CacheMode="wb" CacheSizeMB="128" CacheBlockExpiryPeriodMS="5000" AluaNodeGroupStates="0,0" Storage="imagefile1"/>
    <device file="My Computer\X\DataStore\DataStore.swdsk" node="-1" name="imagefile2"/>
    <device name="HAImage2" OwnTargetName="iqn.2008-08.com.starwindsoftware:node1-datastore" file="My Computer\X\DataStore\DataStore_HA.swdsk" serialId="69484FA902A4BCB9" asyncmode="yes" readonly="no" highavailability="yes" buffering="no" header="65536" reservation="no" CacheMode="wb" CacheSizeMB="16384" CacheBlockExpiryPeriodMS="5000" AluaNodeGroupStates="0,0" Storage="imagefile2"/>
What I think the variables should be for HAImage1 (witness):

Code:

param($addr="192.168.100.107", $port=3261, $user="root", $password="starwind", $deviceName="HAImage1",
	$addr2="192.168.100.110", $port2=$port, $user2=$user, $password2=$password,
#secondary node
	$imagePath2="My Computer\X\witness",
	$imageName2="HAImage1",
	$createImage2=$true,
	$targetAlias2="witness",
	$autoSynch2=$true,
	$poolName2="pool1",
	$syncSessionCount2=1,
	$aluaOptimized2=$true,
	$cacheMode2="wb",
	$cacheSize2=128,
	$syncInterface2="#p1=172.20.10.1:3260" -f $addr,
    $hbInterface2="#p1=172.20.30.1:3260" -f $addr,
    $selfSyncInterface="#p1=172.20.10.2:3260" -f $addr2,
    $selfHbInterface="#p1=172.20.30.2:3260" -f $addr2
	)
Thank you
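A side note on the interface strings in the parameters above: PowerShell's `-f` operator only substitutes where the format string contains a placeholder such as `{0}`. Once the IP is typed in directly there is no placeholder left, so the trailing `-f $addr` has no effect and the string is used as written. A minimal sketch of the difference, using addresses from this thread (the variable names `$fromTemplate` and `$hardCoded` are only for illustration):

```powershell
# Template form from the sample script: {0} is replaced by the argument.
$addr = "192.168.100.107"
$fromTemplate = "#p1={0}:3260" -f $addr

# Edited form: no {0} placeholder, so -f returns the string unchanged.
$hardCoded = "#p1=172.20.10.1:3260" -f $addr

Write-Host $fromTemplate   # #p1=192.168.100.107:3260
Write-Host $hardCoded      # #p1=172.20.10.1:3260
```

So the hard-coded values should work as intended; the `-f $addr` suffix is simply redundant there.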
yaroslav (staff)
Staff
Posts: 2361
Joined: Mon Nov 18, 2019 11:11 am

Tue Sep 28, 2021 8:16 am

Yes,

Please run the script per device from the Healthy Server. Otherwise, you will destroy the healthy side.
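"Per device" here means one run for HAImage1 and one for HAImage2. The sketch below is a dry run that only prints the command line for each device, so nothing is changed until the values have been reviewed. The script path is hypothetical, the device names and image paths are read off the .cfg excerpt earlier in the thread, and `targetAlias2` for HAImage2 is a guess. Any parameter not listed (image name, sync and heartbeat interfaces, credentials) would fall back to the defaults in the script's `param` block, so those need to be set as well.

```powershell
# Hypothetical path: wherever the AddHaPartner script above was saved.
$script = ".\AddHaPartner.ps1"

# One entry per HA device. Values are read off the .cfg excerpt in this thread
# and are assumptions to verify (targetAlias2 for HAImage2 is a guess).
$devices = @(
    @{ deviceName = "HAImage1"; imagePath2 = "My Computer\X\witness";   targetAlias2 = "witness" },
    @{ deviceName = "HAImage2"; imagePath2 = "My Computer\X\DataStore"; targetAlias2 = "datastore" }
)

# Dry run: build the command line for each device without executing anything.
$lines = @(foreach ($d in $devices) {
    $argText = ($d.GetEnumerator() | Sort-Object Key |
        ForEach-Object { "-{0} '{1}'" -f $_.Key, $_.Value }) -join " "
    "$script $argText"
})
$lines | ForEach-Object { Write-Host $_ }

# To run for real on the healthy server, splat each entry instead:
#   foreach ($d in $devices) { & $script @d }
```

Keeping the per-device values in one table makes it easier to compare them against the .cfg before anything touches the healthy node.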
bg_IT
Posts: 14
Joined: Mon Aug 13, 2018 11:22 pm

Tue Sep 28, 2021 2:54 pm

Hi Yaroslav,

I'm receiving the error "A positional parameter cannot be found that accepts the argument 'Node'" when I run AddHaPartner. Is this because the version I'm running predates the AddHaPartner script (it wasn't included in the PowerShell samples folder)?

Thank you
yaroslav (staff)
Staff
Posts: 2361
Joined: Mon Nov 18, 2019 11:11 am

Tue Sep 28, 2021 4:10 pm

Yes, please update the software. Be aware that this involves downtime, and that the HA device will show the "NOT SYNCHRONIZED" status on the active side. Could you check whether there is a SyncHaDevice.ps1 in D:\Program Files\StarWind Software\StarWind\StarWindX\Samples\powershell?
Alternatively, to avoid downtime, you can update only the Integrated Component Library when running the installer (make sure this is the only option ticked!).
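Before and after updating the library, it may help to confirm whether the installed StarWindX module exposes the command the script calls. The sketch below uses the standard `Get-Command` cmdlet; `Test-ModuleCommand` is a hypothetical helper name, and `StarWindX` and `Add-HAPartner` are the names used by the script in this thread. It only checks that the command is present, not that its parameters match what the script passes.

```powershell
# Returns $true if the named module is installed and exports the named command.
function Test-ModuleCommand {
    param([string]$ModuleName, [string]$CommandName)
    $cmd = Get-Command -Module $ModuleName -Name $CommandName -ErrorAction SilentlyContinue
    return [bool]$cmd
}

# StarWindX and Add-HAPartner are the names used by the script in this thread.
if (Test-ModuleCommand -ModuleName "StarWindX" -CommandName "Add-HAPartner") {
    Write-Host "Add-HAPartner is available."
} else {
    Write-Host "Add-HAPartner not found; the StarWindX library likely needs updating."
}
```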
bg_IT
Posts: 14
Joined: Mon Aug 13, 2018 11:22 pm

Tue Sep 28, 2021 4:28 pm

Thank you,

I was wondering if updating only the Integrated Component Library would work. That's great, I'll go that route and update software after.
bg_IT
Posts: 14
Joined: Mon Aug 13, 2018 11:22 pm

Tue Sep 28, 2021 4:56 pm

Is there a place I could download different versions of VSAN? I worry that going straight to the latest version might not be a good idea, even if it's just for the Libraries.
yaroslav (staff)
Staff
Posts: 2361
Joined: Mon Nov 18, 2019 11:11 am

Tue Sep 28, 2021 7:40 pm

Installing just libraries does not affect the service.
You can jump to the latest available one.