There is some trouble with my pre-prod cluster.
I've got a 2-node HA storage cluster running StarWind Virtual SAN Free edition v8.0.0 (Build 12166). Each node is a Huawei RH5288v3 server with Windows Server 2012 R2 Standard. The disk space on each node consists of two separate RAID10 logical disks: one (My Computer\D\) for local storage and the other (My Computer\E\) for the partner's synchronized replica. Both disks are NTFS-formatted LVM.
Both nodes were configured for synchronous replication of their local storage, which serves as backend HA datastores for a VMware cluster. So every server hosts one iSCSI target with one Device for its local storage and one iSCSI target with one Device for the replicated storage.
Everything worked fine through several network tests.
Until today.
I rebooted one node today. It was a normal reboot, not a crash test (that's in my plans, btw =]). After the reboot, the local Device on that node became non-active. Its replica on the remote node is normal (without synchronization, of course). The replica of the remote node's Device on the "emergency" node is also normal, and this Device appears to be fully synchronized.
But I noticed one strange thing. The path of the replica file now points to the... LOCAL storage file!
And I found information about the conflict in the StarWind log:
...
8/16 19:54:01.338 c38 IMG: *** SscPort_Create: File 'My Computer\D\storage-mmt\storage-mmt.img' is already opened as 'imagefile2'.
8/16 19:54:01.338 c38 SCSI: *** iScsiSscDevice::open: Failed to create 'imagefile1' device.
8/16 19:54:01.338 c38 Plugin: iScsiPlugin::pluginCallback: Device is not opened by service core!
8/16 19:54:01.338 c38 Plugin: iScsiPlugin::pluginCallback: Device 'imagefile1' not found or has wrong type!
8/16 19:54:01.338 c38 HA: *** CHADevice::get_storage_device_interface_by_name: EXITing with failure(failed), error getting storage device interface for device imagefile1, error code 0x57.
8/16 19:54:01.338 c38 HA: *** SscPort_Create: Getting storage device interface failed, error code 0x57.
8/16 19:54:01.338 c38 HA: *** CHADevice::release_storage_device_interface: Error releasing storage device interface, error code 0x57.
8/16 19:54:01.338 c38 SCSI: *** iScsiSscDevice::open: Failed to create 'HAImage1' device.
8/16 19:54:01.338 c38 Srv: iScsiServer::deviceOpen: Device 'HAImage1' - open failed!
...
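A minimal sketch, assuming the log lines keep the format shown above, of how such conflicts could be pulled out of a larger log. It pairs each "is already opened as" line with the first "Failed to create" line that follows it. The names `find_conflicts`, `CONFLICT_RE`, `FAILED_RE` and `SAMPLE` are made up for illustration and are not part of any StarWind tooling.

```python
import re

# Matches the conflict line format seen in the excerpt above (assumed stable).
CONFLICT_RE = re.compile(
    r"File '(?P<path>[^']+)' is already opened as '(?P<owner>[^']+)'"
)
# Matches the device-creation failure reported after a conflict.
FAILED_RE = re.compile(r"Failed to create '(?P<device>[^']+)' device")


def find_conflicts(lines):
    """Return a list of (path, owner, failed_device) tuples.

    failed_device is the first device reported as "Failed to create"
    after the conflict line, or None if no such line follows.
    """
    conflicts = []
    pending = None  # (path, owner) waiting for its "Failed to create" line
    for line in lines:
        match = CONFLICT_RE.search(line)
        if match:
            if pending is not None:
                conflicts.append((pending[0], pending[1], None))
            pending = (match["path"], match["owner"])
            continue
        match = FAILED_RE.search(line)
        if match and pending is not None:
            conflicts.append((pending[0], pending[1], match["device"]))
            pending = None
    if pending is not None:
        conflicts.append((pending[0], pending[1], None))
    return conflicts


# Two lines copied from the log excerpt above.
SAMPLE = [
    r"8/16 19:54:01.338 c38 IMG: *** SscPort_Create: File 'My Computer\D\storage-mmt\storage-mmt.img' is already opened as 'imagefile2'.",
    r"8/16 19:54:01.338 c38 SCSI: *** iScsiSscDevice::open: Failed to create 'imagefile1' device.",
]

if __name__ == "__main__":
    for path, owner, device in find_conflicts(SAMPLE):
        print(f"{path}: held by {owner}, blocked {device}")
```

On the excerpt above this reports that the D:\ image file is held by 'imagefile2', which is why 'imagefile1' cannot be created.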
The replica device (storage-210 - HAImage2) now seems to be using the wrong (alien) data file, while the local device is orphaned, although all .swdsk files contain the correct paths. One more thing is not clear to me: why does StarWind consider the "storage-210" device healthy and synchronized in this situation?
So, finally: why did this happen, and how can I fix it (via PS, if possible)? I'm keeping in mind that the evaluation period ends in 2 weeks, and I don't want this problem to repeat. Thanks to the GUI for now; it's much more informative than the PS script output.
WBR, Sergey Goncharov.