Storage is not available until Priority1 Node is UP

Kishore CA
Posts: 26
Joined: Wed Jan 04, 2023 10:38 am

Wed Jul 26, 2023 12:20 pm

Hi Team,
I have configured StarWind Virtual SAN for vSphere in a 2-node hyperconverged scenario with VMware vSphere 8, and I have created 3 volumes (code attached for reference).
The storage works fine. Node 1 has 1st priority and Node 2 has 2nd priority. During testing, however, I shut down both servers and then powered on Node 2 first and Node 1 later. None of the storage volumes were available until Node 1 was up.

Kindly help me with this.

In a real outage, if Node 1 fails to start, the storage will not come up and I will be stuck with data loss.

Please help me out.

Code for Volume1

Code: Select all

param($addr="10.110.13.123", $port=3261, $user="root", $password="starwind",
	$addr2="10.110.13.124", $port2=$port, $user2=$user, $password2=$password,
#common
	$initMethod="SyncFromFirst",
	$size=1540066,
	$sectorSize=512,
	$failover=0,
	$bmpType=1,
	$bmpStrategy=0,
#primary node
	$imagePath="/mnt/sas/",
	$imageName="MasterImgMgmt",
	$createImage=$true,
	$storageName="",
	$targetAlias="MGMT-Volume",
	$autoSynch=$true,
	$poolName="",
	$syncSessionCount=1,
	$aluaOptimized=$true,
	$cacheMode="wb",
	$cacheSize=2048,
	$syncInterface="#p2=10.110.15.112:3260" -f $addr2,
	$hbInterface="#p2=10.110.16.112:3260,10.110.13.124:3260" -f $addr2,
	$createTarget=$true,
	$bmpFolderPath="",
#secondary node
	$imagePath2="/mnt/sas/",
	$imageName2="PartnerImgMgmt",
	$createImage2=$true,
	$storageName2="",
	$targetAlias2="MGMT-Volume",
	$autoSynch2=$true,
	$poolName2="",
	$syncSessionCount2=1,
	$aluaOptimized2=$false,
	$cacheMode2=$cacheMode,
	$cacheSize2=$cacheSize,
	$syncInterface2="#p1=10.110.15.111:3260" -f $addr,
	$hbInterface2="#p1=10.110.16.111:3260,10.110.13.123:3260" -f $addr,
	$createTarget2=$true,
	$bmpFolderPath2=""
	)
Code for Volume2

Code: Select all

param($addr="10.110.13.123", $port=3261, $user="root", $password="starwind",
	$addr2="10.110.13.124", $port2=$port, $user2=$user, $password2=$password,
#common
	$initMethod="SyncFromFirst",
	$size=2097152,
	$sectorSize=512,
	$failover=0,
	$bmpType=1,
	$bmpStrategy=0,
#primary node
	$imagePath="/mnt/sas/",
	$imageName="MasterImgVM",
	$createImage=$true,
	$storageName="",
	$targetAlias="VM-Volume",
	$autoSynch=$true,
	$poolName="",
	$syncSessionCount=1,
	$aluaOptimized=$true,
	$cacheMode="wb",
	$cacheSize=3072,
	$syncInterface="#p2=10.110.15.112:3260" -f $addr2,
	$hbInterface="#p2=10.110.16.112:3260,10.110.13.124:3260" -f $addr2,
	$createTarget=$true,
	$bmpFolderPath="",
#secondary node
	$imagePath2="/mnt/sas/",
	$imageName2="PartnerImgVM",
	$createImage2=$true,
	$storageName2="",
	$targetAlias2="VM-Volume",
	$autoSynch2=$true,
	$poolName2="",
	$syncSessionCount2=1,
	$aluaOptimized2=$false,
	$cacheMode2=$cacheMode,
	$cacheSize2=$cacheSize,
	$syncInterface2="#p1=10.110.15.111:3260" -f $addr,
	$hbInterface2="#p1=10.110.16.111:3260,10.110.13.123:3260" -f $addr,
	$createTarget2=$true,
	$bmpFolderPath2=""
	)
Code for Volume3

Code: Select all

param($addr="10.110.13.123", $port=3261, $user="root", $password="starwind",
	$addr2="10.110.13.124", $port2=$port, $user2=$user, $password2=$password,
#common
	$initMethod="SyncFromFirst",
	$size=911360,
	$sectorSize=512,
	$failover=0,
	$bmpType=1,
	$bmpStrategy=0,
#primary node
	$imagePath="/mnt/ssd/",
	$imageName="MasterImgBCKUP",
	$createImage=$true,
	$storageName="",
	$targetAlias="BACKUP-Volume",
	$autoSynch=$true,
	$poolName="",
	$syncSessionCount=1,
	$aluaOptimized=$true,
	$cacheMode="wb",
	$cacheSize=3072,
	$syncInterface="#p2=10.110.15.112:3260" -f $addr2,
	$hbInterface="#p2=10.110.16.112:3260,10.110.13.124:3260" -f $addr2,
	$createTarget=$true,
	$bmpFolderPath="",
#secondary node
	$imagePath2="/mnt/ssd/",
	$imageName2="PartnerImgBCKUP",
	$createImage2=$true,
	$storageName2="",
	$targetAlias2="BACKUP-Volume",
	$autoSynch2=$true,
	$poolName2="",
	$syncSessionCount2=1,
	$aluaOptimized2=$false,
	$cacheMode2=$cacheMode,
	$cacheSize2=$cacheSize,
	$syncInterface2="#p1=10.110.15.111:3260" -f $addr,
	$hbInterface2="#p1=10.110.16.111:3260,10.110.13.123:3260" -f $addr,
	$createTarget2=$true,
	$bmpFolderPath2=""
	)
Kishore CA

Wed Jul 26, 2023 1:35 pm

Before shutting down the servers, I made sure all 3 volumes were successfully synchronised between the nodes. Synchronisation ran from Node 1 to Node 2.

When I started Node 2, the storage was not available. Once I started Node 1, the storage became available and synchronisation started again, this time from Node 2 to Node 1.
yaroslav (staff)
Staff
Posts: 2361
Joined: Mon Nov 18, 2019 11:11 am

Wed Jul 26, 2023 10:59 pm

Before the restart, did you check that the iSCSI paths were connected and active (make sure ALL the mirrors are available under Storage -> Storage Adapters -> Software iSCSI)?
Please also make sure the rescan script is set up and working fine.
Kishore CA

Thu Jul 27, 2023 12:22 pm

Hi sir,

Thanks for the response.
yaroslav (staff) wrote:Before the restart, did you check that the iSCSI paths were connected and active (make sure ALL the mirrors are available under Storage -> Storage Adapters -> Software iSCSI)?
Please also make sure the rescan script is set up and working fine.
Yes sir, the iSCSI paths are connected and active.

The rescan script is set up and working fine; I checked it on both VMs.

This is what I observed today.

Case 1: When both StarWind VMs are ON, everything works fine. After successful synchronization between both nodes, if either VM is turned off, the storage stays available and works fine.

Case 2: When both VMs have been turned off, the storage is not available with only one VM (node) turned ON, whether Node 1 or Node 2. Both VMs have to be turned ON; synchronization then starts again between the nodes and the storage becomes available. Is there any way to make the storage available on a node even if it is the only one turned on?

Case 3: When both VMs are turned ON, synchronization of all pools starts again. How can I avoid that? Do I have to change anything in the code?

Thanks in advance.
yaroslav (staff)

Thu Jul 27, 2023 1:55 pm

Kishore CA wrote:Case 2: When both VMs have been turned off, the storage is not available with only one VM (node) turned ON, whether Node 1 or Node 2. Both VMs have to be turned ON; synchronization then starts again between the nodes and the storage becomes available. Is there any way to make the storage available on a node even if it is the only one turned on?
This is an extra safety measure.
In an outage scenario, StarWind VSAN needs to verify the partner's status before it starts synchronization on its own, so a node can stay in the Not Synchronized state until the partner wakes up. The software tries to avoid synchronizing in the wrong direction, so it either lets the user launch a full synchronization or waits for the partner to confirm its status. Try shutting down nodes 01 and 02 with an interval of ~5 minutes.
You can always check the synchronization status of a specific node or device in the Events tab, or by searching

Code: Select all

to 3 from 
and

Code: Select all

to 1 from 
events in the logs. The former indicates a synchronization drop; the latter indicates that the device is synchronized or has finished synchronizing ("to 1 from 2" means that the node finished synchronization).
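
The two search strings can also be picked out of a log with a small script. The following is a minimal sketch, not a StarWind tool: only the fragments `to 3 from` and `to 1 from` come from this thread, while the function name, the sample lines, and the assumption that each state change sits on a single log line are made up for illustration.

```python
import re

# Matches the state-change fragments mentioned above, e.g. "to 1 from 2".
# Only "to 3 from" (sync dropped) and "to 1 from" (synchronized) are taken
# from the thread; the rest of the log line format is an assumption.
STATE_CHANGE = re.compile(r"\bto ([13]) from (\d+)")

LABELS = {"3": "synchronization dropped", "1": "synchronized"}


def find_sync_events(lines):
    """Return (line_number, label, matched_text) for every matching line."""
    events = []
    for number, line in enumerate(lines, start=1):
        match = STATE_CHANGE.search(line)
        if match:
            events.append((number, LABELS[match.group(1)], match.group(0)))
    return events


if __name__ == "__main__":
    # Hypothetical log lines, only to show the shape of the output.
    sample = [
        "device HAImage1: state changed to 3 from 1",
        "unrelated line",
        "device HAImage1: state changed to 1 from 2",
    ]
    for number, label, text in find_sync_events(sample):
        print(f"line {number}: {label} ({text})")
```

Reading the events in order for one device shows whether a drop was later followed by a completed synchronization.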
Kishore CA

Thu Jul 27, 2023 5:25 pm

yaroslav (staff) wrote:The software tries to avoid synchronization running in the wrong way, so it either lets user to launch full synchronization or waits partner to confirm its own status.
Sir,
so in case of an outage, will running the synchronisation or advanced synchronisation script bring my storage online?
yaroslav (staff) wrote:Try shutting down 01 and 02 with an interval of ~5 mins.
Pardon me sir, could you please elaborate on what exactly I should test?
yaroslav (staff)

Thu Jul 27, 2023 7:44 pm

Hi,

Yes, you should be able to mark the device as synchronized manually, but you must verify that the data is OK on the side you mark as synchronized.
Please also remove the write-back cache (see https://knowledgebase.starwindsoftware. ... -l1-cache/) from all your HA devices; in my experience, devices with WB cache are likely to end up mutually Not Synchronized.

The test I asked you to perform should create a situation where the HA device is "aware" that the partner has gone offline. Still, please remove the write-back cache first.
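
Why the staggered shutdown matters can be shown with a toy model. This is only an illustration of the reasoning in this reply, not StarWind's actual logic: the `NodeRecord` fields, the `may_serve_alone` rule, and the `shutdown_records` helper are all assumptions invented for the example.

```python
from dataclasses import dataclass


@dataclass
class NodeRecord:
    """What a node could remember about its last shutdown (hypothetical)."""
    name: str
    was_synchronized: bool   # local copy was in sync when the node stopped
    saw_partner_leave: bool  # partner was already gone when this node stopped


def may_serve_alone(record: NodeRecord) -> bool:
    # A node can only be sure it holds the newest data if it was in sync
    # and it kept running after the partner went away.
    return record.was_synchronized and record.saw_partner_leave


def shutdown_records(staggered: bool):
    """Model a two-node shutdown, either at once or with node 01 first."""
    node01 = NodeRecord("01", was_synchronized=True, saw_partner_leave=False)
    # With a delay, node 02 notices that 01 is gone before it stops itself.
    node02 = NodeRecord("02", was_synchronized=True, saw_partner_leave=staggered)
    return node01, node02


if __name__ == "__main__":
    for staggered in (False, True):
        for record in shutdown_records(staggered):
            print(staggered, record.name, may_serve_alone(record))
```

In this model a simultaneous shutdown leaves neither node able to decide on its own, while after a staggered shutdown only the node that stopped last could come up without its partner. Whether a real device behaves this way still has to be checked in the events and logs.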
Kishore CA

Fri Jul 28, 2023 2:54 pm

Hi sir,
Thanks for the response.
I have removed the write-back cache.

Today's observations after removing the write-back cache:
Case 1: Previous state of the VMs: both VMs are off.
When either node is switched on, the storage volume stays in the Not Synchronised state, the storage is not available, and I see "HAImage1 not ready" in the server log.

Case 2: When both VMs are switched on, synchronisation starts and the storage becomes available.

I didn't find the PowerShell script for manually marking a pool as synchronised. Could you please tell me where I can find it?
yaroslav (staff)
Staff
Posts: 2361
Joined: Mon Nov 18, 2019 11:11 am

Fri Jul 28, 2023 3:15 pm

Please review the SyncHaDeviceAdvanced.ps1.
Did you try shutting down with the delay?
Kishore CA

Fri Jul 28, 2023 4:00 pm

yaroslav (staff) wrote:Please review the SyncHaDeviceAdvanced.ps1.
I tried SyncHaDeviceAdvanced.ps1 and also SyncHADevice.ps1, but neither worked. I tried SyncType 1, 2 and 3 as well, but each time it threw an error saying the partner iSCSI target was not found.
Which sync type should I use, sir?
yaroslav (staff) wrote:Did you try shutting down with the delay?
Yes, but the problem is the same, with no improvement.
Last edited by Kishore CA on Fri Jul 28, 2023 5:18 pm, edited 2 times in total.
Kishore CA

Fri Jul 28, 2023 4:06 pm

I have one more question.

If one of my nodes goes down with a hard disk failure or any other failure and can only be brought back up with a fresh installation, while the other node is in either the Synchronised or the Not Synchronised state, how do I create HA between the existing node and the newly installed node?
yaroslav (staff)
Staff
Posts: 2361
Joined: Mon Nov 18, 2019 11:11 am

Fri Jul 28, 2023 11:25 pm

To mark a device as synchronized manually:
1. Stop StarWindService on all nodes.
2. Open the *_HA.swdsk file.
3. Set <sync_status>1</sync_status> for the HA device you need to mark as synchronized and <sync_status>0</sync_status>
4. Repeat the appropriate changes on the opposite node.
5. Start the service.
If one node is not reachable, do this only on the remaining one.
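
Before editing the file by hand, it can help to list the current `sync_status` values. The sketch below is read-only and rests on assumptions: that the `*_HA.swdsk` file is well-formed XML and that `sync_status` appears as element text, as the steps above suggest. The function name, the sample document, and its surrounding element names (`device`, `node`, `partner`) are hypothetical; the real file layout may differ.

```python
import xml.etree.ElementTree as ET


def list_sync_status(xml_text):
    """Return (parent_tag, value) for every <sync_status> element found."""
    root = ET.fromstring(xml_text)
    found = []
    # Walk every element so each <sync_status> is reported with its parent.
    for parent in root.iter():
        for child in parent:
            if child.tag == "sync_status":
                found.append((parent.tag, (child.text or "").strip()))
    return found


if __name__ == "__main__":
    # Hypothetical layout; only the <sync_status> element comes from the thread.
    sample = """
    <device>
      <node><sync_status>1</sync_status></node>
      <partner><sync_status>0</sync_status></partner>
    </device>
    """
    for parent_tag, value in list_sync_status(sample):
        print(f"{parent_tag}: sync_status={value}")
```

Running this against the text of the real file on each node, with the service stopped and a backup copy taken first, shows which entries currently hold 1 and which hold 0 before anything is changed.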
Kishore CA

Sat Jul 29, 2023 1:42 am

yaroslav (staff) wrote:To mark manually as synchronized
1. Stop StarWindService on all nodes.
2. Go to *_HA.swdsk
3. Set <sync_status>1</sync_status> for the HA device you need to mark as synchronized and <sync_status>0</sync_status>
4. Repeat appropriate changes for the opposite node.
5. Start the service.
If one node is not reachable, do that for the remaining one.
Thanks sir I'll try this.
Kishore CA wrote:One more question raised in me.

If one of my node is down with hard disk failure or any other failure and to be bought up with only fresh installation while another node is either in Synchronised or in Non Synchronised state.. how to create HA between existing node and newly installed node???
Is there any workaround for this sir??
yaroslav (staff)

Sat Jul 29, 2023 1:47 am

Remove the replica (RemoveHAPartner) from the unused node and run the AddHaPartner script.
Kishore CA

Sat Jul 29, 2023 8:19 am

yaroslav (staff) wrote:To mark manually as synchronized
1. Stop StarWindService on all nodes.
2. Go to *_HA.swdsk
3. Set <sync_status>1</sync_status> for the HA device you need to mark as synchronized and <sync_status>0</sync_status>
4. Repeat appropriate changes for the opposite node.
5. Start the service.
If one node is not reachable, do that for the remaining one.

This is not working sir.

It was in the same state <sync_status>1</sync_status> and <sync_status>0</sync_status>