Virtual SAN V8 crashes with 3ware controller (?)

Software-based VM-centric and flash-friendly VM storage + free version

Moderators: anton (staff), art (staff), Max (staff), Anatoly (staff)

joerg
Posts: 12
Joined: Wed Dec 03, 2014 8:00 pm
Location: Germany

Wed Dec 03, 2014 8:16 pm

Hi there,

This is my first post, so please excuse my English...

I (wanted to) use the StarWind iSCSI clustered disk as a CSV in my Windows Server 2012 R2 cluster. All went fine with V6. But after upgrading to V8, the cluster server with an LSI 3ware RAID controller ran into problems: it could not initialize the iSCSI disk and hung. The cluster server without an LSI 3ware controller was able to initialize the disk, but the cluster went down when the disk was added as a CSV.
I went back to StarWind iSCSI V6 and solved my problem that way. But if there is a solution or a known "bug"/issue, I am very interested in it.

cheers
Joerg
anton (staff)
Site Admin
Posts: 4010
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands

Wed Dec 03, 2014 9:11 pm

It's not a known issue. For now I'd suspect it's because of the emulated block size with the 3ware controller. To be 100% sure (and of course we need to fix the "crash", as it is not something we expect our app to do...) we need you to re-install V8 and send us the StarWind log for a failed attempt to do something :)
Regards,
Anton Kolomyeytsev

Chief Technology Officer & Chief Architect, StarWind Software

joerg
Posts: 12
Joined: Wed Dec 03, 2014 8:00 pm
Location: Germany

Fri Dec 05, 2014 7:06 am

Many thanks for the quick reply...

OK, I will reinstall V8 and produce some logs (is log level 3 needed?)
What I know so far are the events in the event log (my logs are in German, so this is only some info translated by me):

ID 129: additional information: \device\RaidPort1
ID 39: The initiator got a task to reset the target
ID 9: target did not answer in time
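
As a rough aid for reading a list of events like the one above, here is a minimal Python sketch that tallies events per ID. The notes are only the translated texts from this post, not official event descriptions, and the sample list and the `summarize` helper are hypothetical.

```python
from collections import Counter

# Notes as translated in this post; they are not the official event texts.
EVENT_NOTES = {
    129: "additional information: \\device\\RaidPort1",
    39: "initiator got a task to reset the target",
    9: "target did not answer in time",
}

def summarize(event_ids):
    """Count occurrences per event ID and attach the translated note."""
    counts = Counter(event_ids)
    return [(eid, n, EVENT_NOTES.get(eid, "unknown"))
            for eid, n in sorted(counts.items())]

# Hypothetical sample; real IDs would come from an exported event log.
sample = [129, 39, 9, 9, 129, 9]
for eid, n, note in summarize(sample):
    print(f"ID {eid}: {n}x  {note}")
```

Seeing which ID dominates, and in what order they appear, may help when comparing a failed V8 attempt against a working V6 setup.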
anton (staff)
Site Admin
Posts: 4010
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands

Fri Dec 05, 2014 4:41 pm

Yes, please do it.

There's no need to start with Level3 logs. Please start with Level1 and we'll ask for 2 and 3 if needed (Level3 is very slow...)

There's no need to translate from German. We have native German speakers here :)
Regards,
Anton Kolomyeytsev

Chief Technology Officer & Chief Architect, StarWind Software

joerg
Posts: 12
Joined: Wed Dec 03, 2014 8:00 pm
Location: Germany

Sat Dec 06, 2014 8:20 pm

mail with logs is on its way...
Anatoly (staff)
Staff
Posts: 1675
Joined: Tue Mar 01, 2011 8:28 am

Mon Dec 08, 2014 2:48 pm

Received. I've just answered you by email.
Best regards,
Anatoly Vilchinsky
Global Engineering and Support Manager
www.starwind.com
av@starwind.com
jhamm@logos-data.com
Posts: 78
Joined: Fri Mar 13, 2009 10:11 pm

Tue Dec 09, 2014 8:56 pm

I have also been experiencing issues with a 3ware RAID controller and V8. I never had an issue with V6. I was thinking it was an issue with my controller, but now I'm not so sure. Perhaps one clue: these issues did not really start for me until I installed the Hardware VSS Provider. I was running V8 for about a month with no issues, but after installing the Hardware VSS Provider on the StarWind servers, Hyper-V cluster nodes, and backup (Veeam) server, it seems that the StarWind Hardware VSS Provider crashes and restarts on at least some of these servers several times throughout the day. At the same time that the Hardware VSS Provider service crashes, I also get a 3ware error logged in the event log. I opened a support case a couple of weeks ago about the Hardware VSS Provider not being able to snapshot the LSFS devices, and I have been working with Vitaliy. I currently have build 7450 installed on both HA nodes.
jhamm@logos-data.com
Posts: 78
Joined: Fri Mar 13, 2009 10:11 pm

Wed Dec 10, 2014 3:45 am

I did speak with LSI/3ware. They said the error message might be related to StarWind sending block sizes greater than 128k?
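
To make the 128k figure concrete, here is a small Python sketch of the arithmetic only. It assumes "128k" means 128 KiB (131072 bytes), and it does not describe what StarWind, NTFS, or the 3ware firmware actually do; the function names are hypothetical.

```python
LIMIT = 128 * 1024  # assumed meaning of "128k": 128 KiB = 131072 bytes

def exceeds_limit(transfer_bytes, limit=LIMIT):
    """True if a single transfer is larger than the limit."""
    return transfer_bytes > limit

def split_transfer(transfer_bytes, limit=LIMIT):
    """Break one transfer into pieces, none larger than the limit."""
    full, rest = divmod(transfer_bytes, limit)
    return [limit] * full + ([rest] if rest else [])

# A 300 KiB request is over the limit and would need three pieces.
print(exceeds_limit(300 * 1024))
print(split_transfer(300 * 1024))
```

If the controller really does reject requests above that size, something in the I/O path would have to do a split like this before the request reaches it.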
Max (staff)
Staff
Posts: 533
Joined: Tue Apr 20, 2010 9:03 am

Thu Dec 11, 2014 6:06 pm

StarWind works over NTFS so I guess it comes down to the size specified when creating the partition.
I'm looking into the logs Joerg sent a few days ago to see if there are any leads to the issue.
Max Kolomyeytsev
StarWind Software
jhamm@logos-data.com
Posts: 78
Joined: Fri Mar 13, 2009 10:11 pm

Thu Dec 11, 2014 6:12 pm

So it would be NTFS passing along the block size; StarWind really has nothing to do with the block size, correct?
Max (staff)
Staff
Posts: 533
Joined: Tue Apr 20, 2010 9:03 am

Thu Dec 11, 2014 7:29 pm

Unless you're doing a direct RAID volume pass-through, you're not putting big writes on the disks.
Max Kolomyeytsev
StarWind Software
joerg
Posts: 12
Joined: Wed Dec 03, 2014 8:00 pm
Location: Germany

Wed Jan 28, 2015 8:47 pm

I restarted tests this week, but I surrender.

I had email contact with Max and Anatoly and did some testing over the last weeks:
- I configured Maximum Burst Length with different values
- I played with the sector size of the LUN (4k and 512B)
- I tried an extra HDD connected not to the RAID controller but to onboard SATA
all with the same (bad) effect.
My server was unbootable after each test and I had to do a cold start (after waiting some time to see whether it would restart or not), so each test took at least one hour. I have now stopped testing and gone back to a setup with V6, which fulfills all my requirements and runs perfectly...

cheers
Joerg
Max (staff)
Staff
Posts: 533
Joined: Tue Apr 20, 2010 9:03 am

Wed Jan 28, 2015 8:51 pm

Thank you for the update!
Max Kolomyeytsev
StarWind Software