Lost Datastores After StarWind Reboot

Software-based VM-centric and flash-friendly VM storage + free version

Moderators: anton (staff), art (staff), Anatoly (staff), Max (staff)

DarkDiamond
Posts: 29
Joined: Sat Dec 31, 2011 5:57 pm

Mon Jan 14, 2013 3:23 am

Hi,

I recently had to reboot my StarWind target for Windows patching. I shut down all VMs connected to the datastores that were going offline, patched the server, and upgraded to StarWind 6.0.5189 after patching. Once the StarWind server came back online, two of my datastores would not mount on my ESXi cluster. I rebooted both of my hosts and the datastores still wouldn't mount, even after multiple refreshes and rescans of my HBAs. The devices are visible on both hosts, just not the datastores themselves.

I was able to add the storage manually on the first host in the cluster and chose "Keep existing VMFS signature". However, I'm unable to mount the LUNs on the second host. When I go to the "Add Storage" link and select one of the LUNs, the "Keep Existing Signature" option is disabled and my only choice is "Format".
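For what it's worth, the same operation can also be attempted from the ESXi shell, which sometimes gives more detail than the Add Storage wizard. A rough sketch (the datastore label below is just an example; substitute your own):

```shell
# List volumes this host detects as snapshots/replicas -- if a LUN shows
# up here, the host thinks the on-disk VMFS signature doesn't match the
# device it now sees behind that LUN
esxcli storage vmfs snapshot list

# Mount a detected snapshot volume while keeping its existing signature
# (the CLI equivalent of the "Keep existing signature" option)
esxcli storage vmfs snapshot mount -l "ImageFile5"
```

If the second host refuses the mount here too, the error text from the CLI is usually more specific than the greyed-out wizard option.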

My other two datastores came back fine. The only difference I can see is a gap in the names: the ones that came back online are "ImageFile1" and "ImageFile2", while the ones that wouldn't are "ImageFile5" and "ImageFile6" (I originally had additional LUNs that I removed because I no longer needed them).

I'm at a loss for further troubleshooting. I've attached my StarWind log. The target is set to "Clustering = True", and the two LUNs that won't mount are both set to "Asynchronous Mode = Yes".

Thanks :)
Dark Diamond
Attachments
starwind.zip
(30.31 KiB) Downloaded 833 times
Max (staff)
Staff
Posts: 533
Joined: Tue Apr 20, 2010 9:03 am

Mon Jan 14, 2013 12:32 pm

Hi,
Did you unmount the datastores prior to disconnecting the iSCSI targets?

Also, I've seen this behavior when trying to re-add the datastores from vCenter; that case also produced a "not able to change the host configuration" message.
Did you try to re-add the datastore from the host directly?
Max Kolomyeytsev
StarWind Software
DarkDiamond

Mon Jan 14, 2013 2:17 pm

I did not unmount the datastores before rebooting the target. It doesn't look like I damaged anything, because I was able to mount them from one host, just not the other. I tried to re-add the datastore from the host directly (even after a reboot) with no success.
Max (staff)

Tue Jan 15, 2013 12:20 pm

Got it.
I would recommend trying this:
1. "Remove from inventory" all the VMs you have on iSCSI.
2. Unmount the iSCSI datastores.
3. Remove the SAN's IP address from both dynamic and static discovery.
4. Run a rescan and verify that all iSCSI devices are gone.
5. Re-add the IPs to the iSCSI initiator's dynamic discovery.
6. Try to re-add the datastores.
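For reference, steps 3-6 can also be done from the ESXi shell. A sketch, assuming a software iSCSI adapter; the adapter name and SAN IP below are examples, check yours with `esxcli iscsi adapter list`:

```shell
ADAPTER=vmhba33          # example adapter name
SAN_IP=192.168.0.10      # example StarWind SAN address

# 3. Remove the SAN from dynamic (send targets) discovery; static entries
#    can be removed the same way with "statictarget remove" plus the IQN
esxcli iscsi adapter discovery sendtarget remove -A $ADAPTER -a $SAN_IP:3260

# 4. Rescan, then confirm no sessions to the SAN remain
esxcli storage core adapter rescan -A $ADAPTER
esxcli iscsi session list

# 5. Re-add the SAN to dynamic discovery
esxcli iscsi adapter discovery sendtarget add -A $ADAPTER -a $SAN_IP:3260

# 6. Rescan again; the datastores should then be mountable via Add Storage
esxcli storage core adapter rescan -A $ADAPTER
```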
Max Kolomyeytsev
StarWind Software
DarkDiamond

Fri Jan 18, 2013 3:24 pm

I was able to reproduce the issue again last night. Unfortunately I won't have a chance to try your steps for a little while yet. However, here are the steps I used to recreate the problem:

1. Create a 2TB LUN and mount it on both hosts.
2. Storage vMotion a VM or two onto the new LUN.
3. Shut down all VMs.
4. Restart the StarWind service on the SAN node.

My other datastores (which are 400GB in size) stay mounted, but the new 2TB LUN is unmounted. I'm unable to mount it on more than one host, even after rebooting the ESXi hosts themselves. I have to force-mount the LUN on the first host (keep existing signature), which makes it impossible to mount on the second host.

It almost seems to me that something about the size of the LUN is causing the issue, given that my older 400GB LUNs have stayed online (twice now, after repeated attempts to reproduce)...
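In case it helps narrow down whether size really is the variable, a few data points can be pulled from the ESXi shell to compare the 2TB LUN against the healthy 400GB ones (the datastore name below is just an example):

```shell
# Per-device details, including the reported size, for every attached LUN
esxcli storage core device list

# VMFS version and block size of a mounted datastore
vmkfstools -Ph /vmfs/volumes/ImageFile1

# Whether the host is treating the missing 2TB volume as a snapshot
esxcli storage vmfs snapshot list
```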

DarkDiamond

Max (staff) wrote:
Got it.
I would recommend trying this:
1. "Remove from inventory" all the VMs you have on iSCSI.
2. Unmount the iSCSI datastores.
3. Remove the SAN's IP address from both dynamic and static discovery.
4. Run a rescan and verify that all iSCSI devices are gone.
5. Re-add the IPs to the iSCSI initiator's dynamic discovery.
6. Try to re-add the datastores.
Max (staff)

Fri Jan 18, 2013 5:37 pm

If it is VMFS5, there shouldn't be any trouble.
Also, what are the cache settings on the 2TB and the 400GB LUNs?
Max Kolomyeytsev
StarWind Software
DarkDiamond

Fri Jan 18, 2013 6:27 pm

The 2TB LUN was set to 512MB read-through cache; the 400GB LUNs were set to 2GB read-through. I'm running the most recent build of StarWind.
Max (staff) wrote:
If it is VMFS5, there shouldn't be any trouble.
Also, what are the cache settings on the 2TB and the 400GB LUNs?
Max (staff)

Sun Jan 20, 2013 12:33 pm

I'll forward the steps to our QA department; let's see what they find.
I'll keep you updated.
Max Kolomyeytsev
StarWind Software
Max (staff)

Tue Jan 22, 2013 11:10 am

Just got an update from our QA department.
It looks like they couldn't reproduce the problem following the steps you provided.
I think we need to dive into the details here. Also, have you had a chance to contact VMware about this issue?
Max Kolomyeytsev
StarWind Software