Scary Failure

dwright1542
Posts: 19
Joined: Thu May 07, 2015 1:58 am

Mon May 25, 2015 12:40 am

We've been testing SW v8.0.7929 on ESXi 5.5, with a Server 2012 3TB volume and a cache volume on a FusionIO drive.

Yesterday, the SW VM started throwing:

Event 129 Reset to device, \Device\RaidPort0, was issued

and

Event 153 The IO operation at logical block address 14588400 for Disk 1 was retried. (many of these, different addresses)

and finally

Event 7034 The StarWind Virtual SAN service terminated unexpectedly. It has done this 1 time(s).

ESXi then lost connectivity, and the VMs on SW disappeared. After a restart of SW, that ESXi box could not find the datastore. It saw that there was a disk there, recognized a proper partition layout, and gave me the option of creating a new datastore and formatting that drive.

From another ESXi box, I then attached to the SW iSCSI, and lo and behold, there were the VMs and a proper datastore.

So I shut down the original ESXi box and the SW VM, brought them back up, and everything is fine.

There was no indication of HW drive / RAID problems at all outside of the SW VM. None of the VMs that were directly on the parent datastore showed any issues or were affected.
Vladislav (Staff)
Staff
Posts: 180
Joined: Fri Feb 27, 2015 4:31 pm

Mon May 25, 2015 9:44 am

Hi,

Those events point to underlying storage issues, but since you said the RAID is healthy, the VM where StarWind is installed requires further investigation.

If Windows thinks the RAID is not healthy, then StarWind definitely sees the same thing.

Please upload the StarWind dump file somewhere, if one was created after the StarWind service failure, and send me the link. Please also attach the StarWind logs and the Windows System and Application logs from both nodes. I'll take a closer look at the logs and let you know the result.
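
Before sending the logs, it may help to see how often each of the reported events occurred. The following is a minimal Python sketch, not a StarWind tool. It assumes the Windows System log has been exported to CSV and that the event ID sits in a column named "Event ID"; the column name, the function name tally_events, and the sample rows are all assumptions for illustration.

```python
import csv
import io
from collections import Counter

# Event IDs quoted in the original post.
EVENT_IDS_OF_INTEREST = {129, 153, 7034}


def tally_events(csv_text, id_column="Event ID"):
    """Count rows per event ID, keeping only the IDs of interest.

    id_column is an assumption about the export format; adjust it to
    match the header of the CSV you actually have.
    """
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        raw = (row.get(id_column) or "").strip()
        if raw.isdigit() and int(raw) in EVENT_IDS_OF_INTEREST:
            counts[int(raw)] += 1
    return counts


if __name__ == "__main__":
    # Made-up sample rows, for illustration only.
    sample = (
        "Level,Event ID\n"
        "Warning,129\n"
        "Warning,153\n"
        "Warning,153\n"
        "Error,7034\n"
        "Information,7036\n"
    )
    for event_id, count in sorted(tally_events(sample).items()):
        print(event_id, count)
```

A burst of 153 retries shortly before the 7034 service termination would be consistent with the sequence described in the first post, but the counts alone do not identify the cause.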
dwright1542
Posts: 19
Joined: Thu May 07, 2015 1:58 am

Thu May 28, 2015 3:06 am

Single node. Where do I find the dump file? Can I PM you the location?
Vladislav (Staff)
Staff
Posts: 180
Joined: Fri Feb 27, 2015 4:31 pm

Thu May 28, 2015 10:31 am

You can find the *.mdmp file here: C:\Program Files\StarWind Software\StarWind

Please drop me the link at support@starwindsoftware.com
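
If that folder holds more than one dump, the most recent one is probably the one from this failure. A minimal Python sketch for picking it out follows; the function name newest_dump is hypothetical, and it assumes the dumps sit directly in the folder given above rather than in a subfolder.

```python
from pathlib import Path


def newest_dump(folder, pattern="*.mdmp"):
    """Return the most recently modified file matching pattern, or None."""
    dumps = [p for p in Path(folder).glob(pattern) if p.is_file()]
    return max(dumps, key=lambda p: p.stat().st_mtime, default=None)


if __name__ == "__main__":
    # Folder taken from the post above; change it if StarWind is
    # installed elsewhere.
    folder = r"C:\Program Files\StarWind Software\StarWind"
    print(newest_dump(folder))
```

It prints None when no matching file exists, which would suggest that no dump was written when the service terminated.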