PS shows synchronized but Web Console does not

cjanoch
Posts: 3
Joined: Thu Jun 11, 2026 12:00 am

Thu Jun 11, 2026 4:24 pm

Having an issue with a new deployment of the free version of VSAN on ESX 8.0 that I could not find previously listed in the forum:

I have a new, 2-node HA CVM deployed into an ESX cluster.

Networking: Management and Heartbeat on a dedicated vSwitch into a LAN network switch. iSCSI traffic on a dedicated vSwitch into an isolated SAN switch. Sync traffic on a direct NIC connection between the two hosts. (ping works fine between both CVMs on all interfaces, and the Web Console on the CVM shows all interfaces are up)

Storage: 2 LUNs (one 400 GB SSD and one 14 TB SAS), backed by VMDKs on dedicated datastores on local RAID 6 arrays, were created successfully with CreateHA_2.ps1. Both iSCSI targets are connected to the ESX servers, and the disks mount without any issues.

Issue: The initial full sync starts normally and completes successfully (~2 hrs and ~24 hrs for the two LUNs). SyncHaDevice.ps1, run on each CVM, shows that both HAimage1 and HAimage2 are synchronized. The Web Console on each CVM, however, shows both LUNs with limited availability and raises alerts that the replication partner is not synchronised.

Shutting down the primary node causes the iSCSI storage to become unavailable, so HA is not working (and the Web Console status is accurate).

I have tried running SyncHaDevice.ps1 with MarkAsSynchronized uncommented (viewtopic.php?f=5&t=7257), stopping and restarting the vsan service, and manually forcing a full sync, but I get the same results: SyncHaDevice.ps1 reports that both HAimage1 and HAImage2 are synchronized and the Web Console shows that they are not.

Do you have any suggestions for what to try next to troubleshoot the issue? Is there a specific error that I should be looking for in the log files?
Thanks!
yaroslav (staff)
Staff
Posts: 4341
Joined: Mon Nov 18, 2019 11:11 am

Thu Jun 11, 2026 8:17 pm

VMDKs, especially lazy-zeroed or thin-provisioned ones, cap performance quite heavily. That's why we use RDM or pass-through.
cjanoch wrote: "alerts on each CVM that the replication partner is not synchronised."
Check the Windows-based console. Please also check the appliances view.
cjanoch wrote: "Shutting down the primary node causes the iSCSI storage to become unavailable, so HA is not working (so the web console status is accurate.)"
Are you sure that the devices are indeed synchronized and connected over iSCSI?
cjanoch

Thu Jun 11, 2026 9:15 pm

Hi Yaroslav -- thanks for the response.

I'm using the free version, so I didn't think that any features on the Windows-based console were functional.
Thanks for the tip -- glad to get a graphical status on sync progress!

The Windows console shows that both HAImage1 and HAImage2 are up and synchronized, and both ESX servers can see and browse the files in the Datastores on the mounted LUNs.
[Attachment: IMG01_0611.png (174.28 KiB)]
The appliance console shows that the partners are not synchronized and the LUNs have limited availability (so the ESX hosts are probably only seeing the LUNs on the primary CVM).
[Attachment: IMG02_0611.png (73.7 KiB)]
In the Windows Console view, however, I can now see that:
(1) the Failover Strategy seems to be node majority and not heartbeat, and
(2) Asynchronous mode seems to be enabled.
Do these need to be changed? And if so, what PS script/option do I need to use to change them?

After I better understand the HA and sync configurations, I'll replace the VMDK with an RDM. There is no data (except testing) on the LUNs, so no concerns about re-creating them.

Thanks so much for the assistance!
cjanoch

Thu Jun 11, 2026 9:36 pm

Assuming that the issue might be related to the Failover Strategy (it would make sense to me if the system is waiting for a 3rd node before making the LUN highly available), I am attaching the first part of my CreateHA_2.ps1 script in case that will be your next ask.

Thanks again!

param($addr="192.168.100.121", $port=3261, $user="root", $password="starwind",
$addr2="192.168.100.122", $port2=$port, $user2=$user, $password2=$password,
#common
$initMethod="Clear",
$size=405504,
$sectorSize=512,
$failover=1,
$bmpType=1,
$bmpStrategy=1,
[Uint32][ValidateRange(0, 4294967294)]
$maxUnmap=128,
#primary node
$imagePath="/mnt/sdb1/vSAN-SSD",
$imageName="Primary-vSAN-SSD",
[bool][ValidateSet($false, $true)]
$createImage=$true,
$storageName="",
$targetAlias="vSAN-SSD",
$poolName="pool1",
$syncSessionCount=1,
[bool][ValidateSet($false, $true)]
$aluaOptimized=$true,
$cacheMode="none",
$cacheSize=0,
$syncInterface="#p2=10.0.20.122:3260",
$hbInterface="#p2=10.0.21.122:3260",
[bool][ValidateSet($false, $true)]
$createTarget=$true,
$bmpFolderPath="",
#secondary node
$imagePath2="/mnt/sdb1/vSAN-SSD",
$imageName2="Secondary-vSAN-SSD",
[bool][ValidateSet($false, $true)]
$createImage2=$true,
$storageName2="",
$targetAlias2="vSAN-SSD",
$poolName2="pool1",
$syncSessionCount2=1,
[bool][ValidateSet($false, $true)]
$aluaOptimized2=$false,
$cacheMode2=$cacheMode,
$cacheSize2=$cacheSize,
$syncInterface2="#p2=10.0.20.121:3260",
$hbInterface2="#p2=10.0.21.121:3260",
[bool][ValidateSet($false, $true)]
$createTarget2=$true,
$bmpFolderPath2=""
)
yaroslav (staff)

Fri Jun 12, 2026 3:11 am

Glad to read it helped. The Management Console, even for VSAN Free, still works for checking status and logs.
Yes, you are right, the failover strategy was not set correctly. If network redundancy allows, use heartbeat (i.e., failover=0). Check out this script: viewtopic.php?f=5&t=6852.
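As a rough sketch of that change, and assuming the device is simply recreated with the same CreateHA_2.ps1 posted earlier in this thread (the linked script may take a different approach), only the `$failover` default in the common section of the param block would differ. The other values are copied from the original post. The note on what `1` means is inferred from the Node Majority status seen in the Windows console, not from documentation.

```powershell
param(
    #common
    $initMethod="Clear",
    $size=405504,
    $sectorSize=512,
    # 0 = heartbeat, as suggested in the reply above.
    # The original post used 1, which the Windows console reported as node majority.
    $failover=0,
    $bmpType=1,
    $bmpStrategy=1
)
```

The node-specific parameters (image paths, sync and heartbeat interfaces, and so on) would stay as they were in the original script.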
Good luck with your project.