The storage device became non-active after a node reboot

serhiogo
Posts: 30
Joined: Thu Aug 16, 2018 12:49 pm

Thu Aug 16, 2018 7:05 pm

Hello colleagues.

I have a problem with my pre-production cluster.

I've got a 2-node HA storage cluster running StarWind Virtual SAN Free edition v8.0.0 (Build 12166). Each node is a Huawei RH5288v3 server with Windows Server 2012 R2 Standard. The disk space of each node consists of two separate RAID10 logical disks: one (My computer\D\) for the local store and the other (My computer\E\) for the partner's synchronized replica. Both disks are NTFS-formatted LVM.
Both nodes were configured for synchronous replication of their local storage, which serves as the back-end HA datastores of a VMware cluster. So every server contains one iSCSI target with one device for local storage and one iSCSI target with one device for replicated storage.
Everything worked fine through some network tests.
Until today.
I rebooted one node today. It was a normal reboot, not a crash test (that is in my plans, by the way =]). After the reboot, the local device became non-active. The replica of this device on the remote node is normal (without synchronization, of course). The replica of the remote node's device on the "emergency" node is normal too; this device appears to be fully synchronized.
Image

But I noticed one strange thing. The path of the replica file now points to the... LOCAL storage file!

Image

And I found information about the conflict in the StarWind log:

...
8/16 19:54:01.338 c38 IMG: *** SscPort_Create: File 'My Computer\D\storage-mmt\storage-mmt.img' is already opened as 'imagefile2'.
8/16 19:54:01.338 c38 SCSI: *** iScsiSscDevice::open: Failed to create 'imagefile1' device.
8/16 19:54:01.338 c38 Plugin: iScsiPlugin::pluginCallback: Device is not opened by service core!
8/16 19:54:01.338 c38 Plugin: iScsiPlugin::pluginCallback: Device 'imagefile1' not found or has wrong type!
8/16 19:54:01.338 c38 HA: *** CHADevice::get_storage_device_interface_by_name: EXITing with failure(failed), error getting storage device interface for device imagefile1, error code  0x57.
8/16 19:54:01.338 c38 HA: *** SscPort_Create: Getting storage device interface failed, error code 0x57.
8/16 19:54:01.338 c38 HA: *** CHADevice::release_storage_device_interface: Error releasing storage device interface, error code 0x57.
8/16 19:54:01.338 c38 SCSI: *** iScsiSscDevice::open: Failed to create 'HAImage1' device.
8/16 19:54:01.338 c38 Srv: iScsiServer::deviceOpen: Device 'HAImage1' - open failed!
...
The full log is saved; feel free to ask for it.
The replica device (storage-210 - HAImage2) now seems to be using a foreign data file, while the local device is orphaned, although all .swdsk files contain the correct paths. One more thing is not clear to me: why does StarWind consider the "storage-210" device healthy and synchronized in this situation?
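
For anyone who wants to scan a full log for the same symptom, here is a minimal Python sketch. It only looks for the "is already opened as" message quoted in the excerpt above. The regular expression is derived from that single line, so it is an assumption that other StarWind builds word the message the same way. The names `find_open_conflicts` and `Conflict` are hypothetical helpers, not part of any StarWind tool.

```python
import re
from typing import List, NamedTuple

# Pattern derived from the one "already opened" line quoted in this thread.
# Assumption: other builds may phrase this message differently.
CONFLICT_RE = re.compile(
    r"File '(?P<path>[^']+)' is already opened as '(?P<holder>[^']+)'"
)


class Conflict(NamedTuple):
    path: str    # image file that could not be opened a second time
    holder: str  # device name that already holds the file


def find_open_conflicts(log_text: str) -> List[Conflict]:
    """Return every 'already opened' conflict found in the log text."""
    return [
        Conflict(m.group("path"), m.group("holder"))
        for m in CONFLICT_RE.finditer(log_text)
    ]


# Two lines copied from the log excerpt in this post.
sample = (
    "8/16 19:54:01.338 c38 IMG: *** SscPort_Create: File "
    "'My Computer\\D\\storage-mmt\\storage-mmt.img' is already opened as 'imagefile2'.\n"
    "8/16 19:54:01.338 c38 SCSI: *** iScsiSscDevice::open: "
    "Failed to create 'imagefile1' device.\n"
)

for conflict in find_open_conflicts(sample):
    print(f"{conflict.path} is held by {conflict.holder}")
```

On the excerpt this reports that storage-mmt.img is held by 'imagefile2', which matches what the GUI shows: the replica device has opened the local device's image file. The sketch only detects the symptom; it does not explain why the wrong file was opened.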

So finally: why did it happen, and how can I fix it (via PowerShell, if possible)? I'm aware that the evaluation period will end in 2 weeks, and I don't want this problem to repeat. Thanks to the GUI for now; it's much more informative than the PS script output.

WBR, Sergey Goncharov.
Boris (staff)
Staff
Posts: 805
Joined: Fri Jul 28, 2017 8:18 am

Thu Aug 16, 2018 8:54 pm

On VSAN1, do you have both header files, *.swdsk and *_HA.swdsk, for storage-mmt? Do they have the correct file extensions?
serhiogo
Posts: 30
Joined: Thu Aug 16, 2018 12:49 pm

Thu Aug 16, 2018 9:24 pm

Boris (staff) wrote: On VSAN1, do you have both header files, *.swdsk and *_HA.swdsk, for storage-mmt? Do they have the correct file extensions?
Did you mean VSAN-1-MMT? Yes, I have both header files for storage-mmt. In fact, none of the necessary files of the storage devices has disappeared on either node, and every header file contains the correct path to its storage .img file. The only problem is that the storage-210 device on vsan-1-mmt started to use a file that is NOT its own.
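
The header-file check asked about above can also be scripted. This is a minimal Python sketch under stated assumptions: that both headers sit in one directory and are named `<device>.swdsk` and `<device>_HA.swdsk`, which is inferred only from the wildcard patterns in the question. `check_header_files` is a hypothetical helper, and the sketch does not read or validate the contents of the headers.

```python
from pathlib import Path
from typing import Dict


def check_header_files(device_dir: str, device_name: str) -> Dict[str, bool]:
    """Report whether both expected header files exist for a device.

    Assumption: headers are named '<device>.swdsk' and '<device>_HA.swdsk'
    and live in the same directory (inferred from the patterns in this thread).
    """
    expected = [f"{device_name}.swdsk", f"{device_name}_HA.swdsk"]
    directory = Path(device_dir)
    if not directory.is_dir():
        # Missing directory: nothing can be present.
        return {name: False for name in expected}
    # Compare names case-insensitively, as Windows file systems usually do.
    present = {p.name.lower() for p in directory.iterdir() if p.is_file()}
    return {name: name.lower() in present for name in expected}
```

A call such as `check_header_files(r"D:\storage-mmt", "storage-mmt")` (the directory is a guess based on the path in the log) returns a dictionary mapping each expected file name to `True` or `False`. Matching on the full file name also catches a wrong extension, since a misnamed header simply shows up as missing.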
Boris (staff)
Staff
Posts: 805
Joined: Fri Jul 28, 2017 8:18 am

Thu Aug 16, 2018 10:18 pm

Submit a ticket through https://www.starwindsoftware.com/support-form with reference to this topic.
serhiogo
Posts: 30
Joined: Thu Aug 16, 2018 12:49 pm

Fri Aug 17, 2018 8:05 am

Boris (staff) wrote: Submit a ticket through https://www.starwindsoftware.com/support-form with reference to this topic.
Done.
Michael (staff)
Staff
Posts: 317
Joined: Thu Jul 21, 2016 10:16 am

Mon Oct 08, 2018 8:36 am

Just an update for the community: the issue has been resolved.