Having an issue with a new deployment of the free version of VSAN on ESX 8.0 that I could not find previously listed in the forum:
I have a new 2-node HA CVM deployed into an ESX cluster.
Networking: Management and Heartbeat on a dedicated vSwitch into a LAN network switch. iSCSI traffic on a dedicated vSwitch into an isolated SAN switch. Sync traffic on a direct NIC connection between the two hosts. Ping works fine between both CVMs on all interfaces, and the Web Console on each CVM shows all interfaces are up.
Storage: 2 LUNs (one 400 GB SSD and one 14 TB SAS), backed by VMDKs on dedicated datastores on local RAID 6 arrays, were created successfully with CreateHA_2.ps1. Both iSCSI targets are connected to the ESX servers, and the disks mount without any issues.
Issue: Initial full sync starts normally (~2 hrs and ~24 hrs) and completes successfully. SyncHaDevice.ps1 run on each CVM shows that both HAimage1 and HAimage2 are synchronized. However, the Web Console on each CVM shows both LUNs with limited availability, along with alerts that the replication partner is not synchronized.
Shutting down the primary node causes the iSCSI storage to become unavailable, so HA is not working and the Web Console status is accurate.
I have tried running SyncHaDevice.ps1 with MarkAsSynchronized uncommented (viewtopic.php?f=5&t=7257), stopping and restarting the vsan service, and manually forcing a full sync, but I get the same results: SyncHaDevice.ps1 reports that both HAimage1 and HAImage2 are synchronized and the Web Console shows that they are not.
Do you have any suggestions for what to try next to troubleshoot the issue? Is there a specific error that I should be looking for in the log files?
Thanks!