Server 2003 hard shutdown halts cluster

tkloppel
Posts: 1
Joined: Fri Aug 08, 2008 6:13 pm

Wed Aug 13, 2008 12:05 pm

First, we're using the trial of StarWind version 3.5.5 to evaluate the product.

I created a 2 GB quorum image, configured with default options and enabled for multiple connections (clustering).

I have created the cluster without any issues, and it appears to function properly.

I can shut down a node and the other takes over without delay, and I can initiate failures and move the cluster group between nodes. But when I pull the power cord on node1, node2 is not able to take over. It attempts three times and then kills the cluster service. In the StarWind log, it looks like there is a lock or some kind of reservation on the quorum target from node1 that is preventing node2 from taking control.

Once the cluster service gets restarted, node2 becomes active, but not until about 3 minutes have passed.

Any help here would be great. I can send logs if needed.
anton (staff)
Site Admin
Posts: 4021
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands

Thu Aug 14, 2008 5:12 pm

1) Are you using the most recent version of StarWind?

2) If "yes" to 1), please send the logs to support@rocketdivision.com (zipped, if you please), plus a detailed description of what you do :)

Thank you very much for your cooperation!
Regards,
Anton Kolomyeytsev

Chief Technology Officer & Chief Architect, StarWind Software

Shawn
Posts: 2
Joined: Wed Sep 24, 2008 5:34 pm

Wed Sep 24, 2008 5:43 pm

I am having the same problem with the evaluation software in an almost identical setup. I am running the trial version which I downloaded about a week ago:

StarWind iSCSI Target v3.5.4 (Build 20080527, Win64)
Trial license, days left: 24)

I even seem to be seeing the same log messages as mentioned above ("ScsiOp 0x28 - Reservation conflict."). However, my node 2 never becomes active and eventually stops trying to attach. I did notice in the StarWind device properties that Persistent reservation is set to "Yes". I am not sure whether this can be turned off or whether it is the problem.
Has a resolution been proposed or found?
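
For anyone comparing logs from both nodes, here is a rough Python sketch that tallies reservation conflicts per SCSI opcode. The only thing taken from this thread is the message text "ScsiOp 0x28 - Reservation conflict."; the function name, the sample line prefixes, and the 0x2A line are made up for illustration, and the real StarWind log layout may differ.

```python
import re
from collections import Counter

# Message text as quoted in the post; everything around it on the line is ignored.
CONFLICT_RE = re.compile(r"ScsiOp (0x[0-9A-Fa-f]{2}) - Reservation conflict")


def count_reservation_conflicts(log_text: str) -> Counter:
    """Count 'Reservation conflict' messages per SCSI opcode (lowercased hex)."""
    return Counter(m.group(1).lower() for m in CONFLICT_RE.finditer(log_text))


if __name__ == "__main__":
    # Hypothetical sample lines; only the message text matches the thread.
    # 0x28 is the SCSI READ(10) opcode, 0x2A is WRITE(10).
    sample = "\n".join([
        "node2: ScsiOp 0x28 - Reservation conflict.",
        "node2: ScsiOp 0x28 - Reservation conflict.",
        "node2: ScsiOp 0x2A - Reservation conflict.",
        "node2: login ok",
    ])
    for opcode, hits in sorted(count_reservation_conflicts(sample).items()):
        print(opcode, hits)
```

A steady stream of conflicts on node 2 after node 1 loses power would fit the picture of a reservation that node 1 still appears to hold.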
Val (staff)
Posts: 496
Joined: Tue Jun 29, 2004 8:38 pm

Thu Sep 25, 2008 9:54 pm

Try changing the following parameter in StarWind.cfg, then restart the service:
from <iScsiPingPeriod value="0"/>
to <iScsiPingPeriod value="5"/>

This should help.
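
If you would rather script the change than edit the file by hand, a minimal Python sketch could look like the one below. It assumes the element appears exactly once and in the form quoted above; the function name is hypothetical, and reading and writing the actual StarWind.cfg (and restarting the service) is left to you because the install path varies.

```python
import re

# Matches the element in the form quoted in this post: <iScsiPingPeriod value="N"/>
PING_RE = re.compile(r'(<iScsiPingPeriod\s+value=")(\d+)("\s*/>)')


def set_ping_period(cfg_text: str, value: int) -> str:
    """Return cfg_text with the iScsiPingPeriod value replaced by `value`.

    Raises ValueError unless exactly one matching element is found, so an
    unexpected file layout is never silently rewritten.
    """
    matches = PING_RE.findall(cfg_text)
    if len(matches) != 1:
        raise ValueError(
            f"expected exactly one iScsiPingPeriod element, found {len(matches)}"
        )
    return PING_RE.sub(lambda m: f"{m.group(1)}{value}{m.group(3)}", cfg_text)


if __name__ == "__main__":
    # Stand-in for the contents of StarWind.cfg; only this one line is from the thread.
    before = '<iScsiPingPeriod value="0"/>'
    print(set_ping_period(before, 5))
```

Keep a backup copy of the original file before writing the result back.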
Best regards,
Valeriy
Shawn
Posts: 2
Joined: Wed Sep 24, 2008 5:34 pm

Mon Sep 29, 2008 11:03 pm

It worked!
anton (staff)
Site Admin
Posts: 4021
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands

Tue Sep 30, 2008 6:59 am

Good :)
Regards,
Anton Kolomyeytsev

Chief Technology Officer & Chief Architect, StarWind Software
