Restore Failed Node - Recreate node

aveonhealth
Posts: 4
Joined: Wed Jun 19, 2019 2:34 pm

Thu Feb 06, 2020 7:25 pm

After much reading, researching, and experimenting in my lab, I was finally able to restore a failed node or a failed image. This guide will help you whether you have completely lost a node or only lost one or more images.

Install StarWind VSAN Free (In case you completely lost a node)
• Confirm your interfaces are configured properly
o Sync interface subnet
o HB-iSCSI interface subnet
o LAN-Management interface subnet
• Install StarWind VSAN Free software

Restore missing image files
When a StarWind node is lost, four important files need to be restored. This guide will help you recover them:
Starwind.cfg: this file holds the configuration of your image files, HAImage files, and targets. We will restore it from a backup; if a backup is not available, we can use the original (default) one and modify it.
*.img: this file is the virtual hard drive that holds your data. We will recreate it using a PowerShell script.
*.swdsk: this file holds details about the format of the *.img. We will restore it from a backup; if a backup is not available, we can copy it from the partner node and modify it.
*_HA.swdsk: this file is what makes the *.img file replicate among StarWind nodes. We will restore it from a backup; if a backup is not available, we can copy it from the partner node and modify it.

Restore *.img file
We will use PowerShell to create a blank *.img file
• Copy all sample script files to your desktop. (Usually located in c:\Program Files\StarWind Software\StarWind\StarWindX\Samples\powershell\)
• Edit the PowerShell script CreateImageFile.ps1
o Set $fileName (without extension) and $filePath for the *.img file to be restored, and set the file size in MB. Example: $fileName="witness", $filePath="My Computer\C\VHD", $size=1024
o Comment out everything under #Create Device and #Create Target. We don’t need these lines because we will manually restore from backups or from the partner node
 #create device
 #$device = Add-ImageDevice -server $server -path $filePath -fileName $fileName -sectorSize $sectorSize -NumaNode $numaNode -CacheMode $cacheMode -CacheSize $cacheSize
 #$device
 #create target
 #$target = New-Target -server $server -alias $targetAlias -devices $device.Name
 #$target
o Save and run the script
o Stop the StarWind VSAN service

Restore *_HA.swdsk file
• Restore file from backup or copy it from partner node
• Confirm the <serial_id> matches the <serial_id> in the partner node’s *_HA.swdsk
• Confirm the <eui_64> matches the <serial_id>
• Confirm all <interval size="x" units="GB"> fields match. Preferably, they should also match the partner node
• Confirm <storage id="1" name="imagefilex" matches what you have in starwind.cfg (if you don’t have a backup of your starwind.cfg, start with imagefile1 and increase by 1 as you restore additional images)
• Confirm <storage id="2" name="iqn.2008-08.com.starwindsoftware:**"> points to the partner target (if you copied this file from the partner node, make sure to change this, otherwise it will be pointing to itself; if you don’t know the partner target, run the enumDevicesTargets script on the partner node to discover it)
• Confirm <link id="1" type="data"/> matches the partner sync interface IP (if you copied this file from the partner node, make sure to change this, otherwise it will be pointing to itself; if you don’t know the IP, check the sync network interface on the partner node)
• Confirm <link id="2" type="control"/> matches the partner HB-iSCSI interface IP (if you copied this file from the partner node, make sure to change this, otherwise it will be pointing to itself; if you don’t know the IP, check the HB-iSCSI network interface on the partner node)
• Change <auto_sync> to false. This gives you control over when the full sync starts, so that data is copied from the working node to the restored node and not the other way around
• Change node 1 <sync_status> to 0 and node 2 <sync_status> to 1
• Confirm <node id="2" name="iqn.2008-08.com.starwindsoftware:**"> points to the partner target (it should point to the same target as the field <storage id="2" name="iqn.2008-08.com.starwindsoftware:**">)
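Some of the checks in this list can be scripted instead of compared by eye. The following is only a rough sketch, not a StarWind tool: the function names Get-HeaderText and Test-HAHeader are made up for illustration, the element names come from the list above, and it assumes that "matches" means identical text and that <auto_sync> holds its value as element text. Because I don’t know how the elements are nested, every lookup searches the whole document.

```powershell
# Sketch only. Element names (serial_id, eui_64, auto_sync, storage, node) are
# taken from the post; their nesting is unknown, so lookups use //name XPaths.
# Treating "matches" as "identical text" is an assumption.
function Get-HeaderText {
    param([xml]$Doc, [string]$XPath)
    $node = $Doc.SelectSingleNode($XPath)
    if ($null -eq $node) { return $null }
    return $node.InnerText.Trim()
}

function Test-HAHeader {
    param(
        [xml]$Restored,  # *_HA.swdsk on the node being restored
        [xml]$Partner    # *_HA.swdsk on the healthy partner node
    )
    $issues = @()
    $serial = Get-HeaderText $Restored '//serial_id'

    if ([string]::IsNullOrEmpty($serial)) {
        $issues += 'serial_id is missing or empty'
    }
    if ($serial -ne (Get-HeaderText $Partner '//serial_id')) {
        $issues += 'serial_id differs from the partner header'
    }
    if ((Get-HeaderText $Restored '//eui_64') -ne $serial) {
        $issues += 'eui_64 does not match serial_id'
    }
    if ((Get-HeaderText $Restored '//auto_sync') -ne 'false') {
        $issues += 'auto_sync is not set to false'
    }

    # <storage id="2"> and <node id="2"> should name the same partner target.
    $storage2 = $Restored.SelectSingleNode('//storage[@id="2"]')
    $node2    = $Restored.SelectSingleNode('//node[@id="2"]')
    if ($null -ne $storage2 -and $null -ne $node2 -and
        $storage2.GetAttribute('name') -ne $node2.GetAttribute('name')) {
        $issues += 'storage id=2 and node id=2 point to different targets'
    }

    return ,$issues
}

# Usage (both paths are hypothetical):
# $restored = Get-Content 'C:\starwind\witness_HA.swdsk' -Raw
# $partner  = Get-Content '\\partner\c$\starwind\witness_HA.swdsk' -Raw
# Test-HAHeader -Restored $restored -Partner $partner
```

An empty result means none of these particular checks failed. It does not cover the link IPs or the sync_status values, which still need to be reviewed by hand.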

Restore *.swdsk file
• Restore file from backup or copy it from partner node
• Confirm all <interval size="x" units="GB"> fields match. Preferably, they should also match the partner node
• Confirm <serial_id> is globally unique. No other *.swdsk on any of your StarWind VSAN nodes should be using this serial ID
• Confirm <eui_64> matches <serial_id>
• Confirm <storage id="1" name="My computer\C\starwind\*.img"/> points to the location of the *.img file you restored earlier
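The uniqueness check on <serial_id> is easy to get wrong when several images live on a node. Below is a small sketch that groups header files by their serial ID and reports any ID used more than once. Find-DuplicateSerialId is a made-up name, the folder in the usage comment is hypothetical, and the script assumes <serial_id> is an element whose text is the ID. The *_HA.swdsk files are left out in the usage example because they are expected to share their ID with the partner.

```powershell
# Sketch only: reports serial_id values that appear in more than one header.
# Assumes <serial_id> is an element whose text content is the ID.
function Find-DuplicateSerialId {
    param([string[]]$Path)   # full paths of the *.swdsk files to compare

    $Path | ForEach-Object {
        $doc  = [xml](Get-Content -LiteralPath $_ -Raw)
        $node = $doc.SelectSingleNode('//serial_id')
        if ($null -ne $node) {
            [pscustomobject]@{ File = $_; SerialId = $node.InnerText.Trim() }
        }
    } |
        Group-Object -Property SerialId |
        Where-Object { $_.Count -gt 1 }
}

# Usage (folder is hypothetical; repeat for header files copied from other nodes):
# $files = Get-ChildItem 'C:\starwind' -Filter *.swdsk |
#     Where-Object { $_.Name -notlike '*_HA.swdsk' }
# Find-DuplicateSerialId -Path $files.FullName |
#     ForEach-Object { "{0} is used by: {1}" -f $_.Name, ($_.Group.File -join ', ') }
```

No output means no duplicates were found among the files you passed in.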

Restore Starwind.cfg file
• Restore the file from backup, or use the default one if you don’t have a backup
• Scroll about ¾ of the way down and look for the <devices> section. It contains several commented-out lines. Find the closing </devices> tag; you will insert your devices just before it
• Confirm <device name="imagefilex" file="My computer\C\starwind\*.swdsk"/>: the name must match the <storage id="1" name="imagefilex" entry in your *_HA.swdsk, and file must match the file path of your *.swdsk file
• Confirm <device name="HAImagex HAImage you are restoring" OwnTargetName="This node’s target NOT partner target" file="file path to *_HA.swdsk" serialID="the same <serial_id> as in *_HA.swdsk" asyncmode="yes" readonly="no" highavailability="yes" buffering="no" header="65536" reservation="no" CacheMode="no" CacheSizeMB="128" AluaNodeGroupStates="0,0" Storage="imagefilex"/>
• Scroll down to <targets>
• Confirm <target name="this should be the same as above OwnTargetName" alias="normally the last part of your target Example witness or csv1 or smb" devices="HAImagex which HAImage will this belong to"
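The cross-references in starwind.cfg (HA device to its Storage device, and target to its HA device) can be sanity-checked with a short script. This is a sketch under assumptions, not an official validator: Test-CfgReferences is a made-up name, the attribute names are the ones listed above, and it assumes the devices attribute of a target holds one device name or a comma-separated list.

```powershell
# Sketch only: checks that each HA device refers to an existing storage device
# and is exposed by a target named after its OwnTargetName.
# Attribute names follow the post; a comma-separated "devices" list is assumed.
function Test-CfgReferences {
    param([xml]$Cfg)

    $issues  = @()
    $devices = @($Cfg.SelectNodes('//device'))
    $targets = @($Cfg.SelectNodes('//target'))
    $names   = @($devices | ForEach-Object { $_.GetAttribute('name') })

    foreach ($ha in $devices) {
        if ($ha.GetAttribute('highavailability') -ne 'yes') { continue }
        $haName = $ha.GetAttribute('name')

        # Storage="imagefilex" must name a <device> defined in the same file.
        if ($names -notcontains $ha.GetAttribute('Storage')) {
            $issues += ('{0}: Storage does not match any device name' -f $haName)
        }

        # A target with name = OwnTargetName must list this HA device.
        $own     = $ha.GetAttribute('OwnTargetName')
        $exposed = $false
        foreach ($t in $targets) {
            $listed = @($t.GetAttribute('devices') -split ',' | ForEach-Object { $_.Trim() })
            if ($t.GetAttribute('name') -eq $own -and $listed -contains $haName) {
                $exposed = $true
            }
        }
        if (-not $exposed) {
            $issues += ('{0}: no target named after OwnTargetName lists this device' -f $haName)
        }
    }

    return ,$issues
}

# Usage (path is hypothetical):
# Test-CfgReferences -Cfg (Get-Content 'C:\Program Files\StarWind Software\StarWind\StarWind.cfg' -Raw)
```

Commented-out sample entries in the file are ignored by the XML parser, so only the devices and targets you actually added are checked.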

Manual sync
• The StarWind service should start automatically but should not be synchronizing. If it starts synchronizing, stop the StarWind service and check <auto_sync> in your *_HA.swdsk file. Make sure it is set to false
• Edit the SyncHADevice script
o Set the correct HAImage in $deviceName="HAImagex"
o Confirm $device.MarkAsSynchronized() is commented out. It should read #$device.MarkAsSynchronized()
o Save and run the script to start the synchronization
• Once the HAImage has been restored and synced, stop the StarWind service and change <auto_sync> in *_HA.swdsk from false to true
• Restart the server and check that everything is synced
Oleg(staff)
Staff
Posts: 568
Joined: Fri Nov 24, 2017 7:52 am

Mon Feb 10, 2020 5:34 pm

Thank you for your input.