This is a two-node HA cluster on Windows Server 2016 with a 6 TB StarWind vSAN.
It has been working relatively flawlessly up until this point.
The shared volume went down once synchronization started failing, as expected when the iSCSI connector goes offline.
I marked the most recent node as synchronized and then had everything resync.
The iSCSI connector came back up, but my Cluster Volume resource still reported that it was down and would not come online.
I verified that the sync status was 1 on both nodes. I tried to remove the volume resource in Cluster Manager and re-add it, but it reports that there are no available disks on either machine. Other threads mention removing a problematic node and re-adding it. I tried that, and now the node I removed will not rejoin the cluster, though that is not really important at this time.
I would be fine with it mounting on a single node so I can check the data consistency.
Our entire HA infrastructure has been down for 2 days, and I am at a loss for what to do next.
EDIT:
A little more info and some developments, as I have been trying to correct the issue:
Once the iSCSI connectors are in place, I can verify that the shared disk shows up in Disk Management on both nodes. However, it mounts as 'RAW' instead of NTFS, and any interaction with the disk prompts a 'resource is busy' alert.
Event Viewer logs the following when the disk is mounted or interacted with:
The system failed to flush data to the transaction log. Corruption may occur in VolumeId: E:, DeviceName: \Device\HarddiskVolume13.
({Device Busy}
The device is currently busy.)
A corruption was discovered in the file system structure on volume E:.
The exact nature of the corruption is unknown. The file system structures need to be scanned online.
chkdsk /r E:
The type of the file system is NTFS.
Volume label is CSVOL1.
Stage 1: Examining basic file system structure ...
Deleting corrupt attribute record (0x80, "")
from file record segment 0x7F.
Deleting corrupt attribute record (0x80, "")
from file record segment 0xCD.
Deleting corrupt attribute record (0x80, "")
from file record segment 0xD4.
Deleting corrupt attribute record (0x80, "")
from file record segment 0xE1.
Deleting corrupt attribute record (0x80, "")
from file record segment 0x114.
Deleting corrupt attribute record (0x80, "")
from file record segment 0x18C.
Deleting corrupt attribute record (0x80, "")
from file record segment 0x18D.
Deleting corrupt attribute record (0x80, "")
from file record segment 0x193.
Deleting corrupt attribute record (0x80, "")
from file record segment 0x1A3.
Deleting corrupt attribute record (0x80, "")
from file record segment 0x1B4.
512 file records processed.
File verification completed.
An unspecified error occurred (6e74667363686b2e 109f).
An unspecified error occurred (6e74667363686b2e 1583).
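For anyone trying to read the chkdsk output above, here is a minimal Python sketch that pulls out the affected file record segments and decodes the hex pair on the "unspecified error" lines. The function names and the shortened `SAMPLE` text are made up for illustration. Decoding the first value as ASCII is plain hex conversion; reading the second value as a location inside chkdsk's NTFS checker is only an assumption, not something the output confirms.

```python
import re

# Shortened excerpt of the chkdsk output quoted above (illustrative sample only).
SAMPLE = '''Deleting corrupt attribute record (0x80, "")
from file record segment 0x7F.
Deleting corrupt attribute record (0x80, "")
from file record segment 0xCD.
An unspecified error occurred (6e74667363686b2e 109f).
An unspecified error occurred (6e74667363686b2e 1583).
'''


def affected_segments(text):
    """Return the file record segment numbers chkdsk reported, as integers."""
    pattern = r"from file record segment (0x[0-9A-Fa-f]+)"
    return [int(value, 16) for value in re.findall(pattern, text)]


def decode_errors(text):
    """Decode each '(<hex tag> <hex number>)' pair on 'unspecified error' lines.

    The tag is decoded as ASCII text. The number is converted to decimal;
    treating it as a position inside the checker is an assumption.
    """
    pattern = r"An unspecified error occurred \(([0-9a-f]+) ([0-9a-f]+)\)"
    return [
        (bytes.fromhex(tag).decode("ascii"), int(number, 16))
        for tag, number in re.findall(pattern, text)
    ]


if __name__ == "__main__":
    print(affected_segments(SAMPLE))
    print(decode_errors(SAMPLE))
```

On this sample the tag decodes to `ntfschk.` in both cases, so the two errors appear to come from the NTFS checking code itself rather than from a specific file. The segment list gives the MFT records whose unnamed 0x80 ($DATA) attribute was deleted, which is where I would start when checking what data was lost.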