Move to a supported OS and performance goes down the tubes?

digitalis99
Posts: 44
Joined: Sat Sep 15, 2012 6:24 am

Sun Mar 10, 2013 5:53 am

I've been running a CDP target in ROW mode off of a Win7 test machine for a while now. I have an Intel 10Gbps NIC in this machine, and I have 30-50 machines booting up off of that ROW target simultaneously. I used to store the image files on flash storage, but have since moved them to a ramdisk to improve performance. On flash-backed storage, the Intel NIC would only hit about 35% usage. With ramdisk storage, the NIC will hit 60% usage. I was pretty happy with that, since all the machines could boot quite quickly.

Then I decided to switch to a "supported" OS. :roll:

I installed Windows Server 2008 R2 on the exact same hardware that I was previously using with Win7. Same NIC, same cable, same switch, same ramdisk software (Softperfect), same nodes booting off the ROW target, same everything except for the host OS. Performance tanked.

The Intel NIC only gets to about 27% usage, and the system kernel time is through the roof. Only about 6 machines at a time can effectively boot from the target, whereas before, I could have all 30-50 boot simultaneously no problem.

Yes, I'm running the latest Intel driver for the NIC in both OSes, so don't bother asking. Any idea why the performance of Starwind in Win2k8 R2 is SO bad in comparison to an identical setup with Win7 as the host OS?

Makes me want to go back to an unsupported OS...just 'cause the performance doesn't stink.
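
For scale, the NIC utilization figures quoted above can be turned into rough throughput numbers. This is only a back-of-the-envelope sketch: it assumes the percentages are of the full 10 Gbps line rate, ignores protocol overhead, and assumes load is spread evenly across clients. The function name and labels are made up for illustration.

```python
LINK_GBPS = 10  # the Intel 10Gbps NIC mentioned in the post


def throughput_mb_per_s(utilization_pct, link_gbps=LINK_GBPS):
    """Convert NIC utilization (percent) to approximate MB/s (decimal megabytes)."""
    bits_per_s = link_gbps * 1e9 * utilization_pct / 100
    return bits_per_s / 8 / 1e6


# Utilization figures taken from the post; the even per-client split is an assumption.
scenarios = [
    ("Win7 + flash", 35),
    ("Win7 + ramdisk", 60),
    ("2008 R2 + ramdisk", 27),
]

for label, pct in scenarios:
    total = throughput_mb_per_s(pct)
    low, high = total / 50, total / 30  # 30-50 clients booting at once
    print(f"{label}: ~{total:.0f} MB/s total, ~{low:.1f}-{high:.1f} MB/s per client")
```

Aggregate throughput alone may not capture the problem if only about 6 clients can boot at a time, so these numbers are a starting point rather than a diagnosis.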
Anatoly (staff)
Staff
Posts: 1675
Joined: Tue Mar 01, 2011 8:28 am

Wed Mar 13, 2013 5:05 pm

To be honest, that is a really interesting situation.
OK, my first guess: I have seen situations where old NIC drivers showed better performance compared to the latest ones. Question: can you confirm that the drivers were the same on Win7 and WS2008R2?

Also, maybe you changed Jumbo frames or some other settings on Win7 that you haven't changed on WS?
Best regards,
Anatoly Vilchinsky
Global Engineering and Support Manager
www.starwind.com
av@starwind.com
digitalis99
Posts: 44
Joined: Sat Sep 15, 2012 6:24 am

Wed Mar 13, 2013 5:12 pm

The NIC driver (downloaded from Intel) was the same version for both OSes. I have tried enabling all of the jumbo frame, chimney, ECN, and similar options as recommended in the forum post here. Nothing has made a difference so far. Nothing I've done makes performance go back to what it was when Win7 was the host OS.
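
One way to check whether the two hosts really ended up with the same TCP settings is to save the output of `netsh int tcp show global` on each and diff the key/value pairs. The sketch below assumes the output has been saved as text with `name : value` lines; the helper names and the sample values are made up for illustration and are not taken from this thread.

```python
def parse_netsh(text):
    """Parse 'name : value' lines from saved netsh output into a dict."""
    settings = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # skip headers, separators, and blank lines
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key and value:
            settings[key] = value
    return settings


def diff_settings(a, b):
    """Return {name: (value_in_a, value_in_b)} for every setting that differs."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


# Illustrative sample text only; real values must come from the actual hosts.
WIN7_SAMPLE = """
TCP Global Parameters
----------------------------------------------
Receive-Side Scaling State          : enabled
Chimney Offload State               : automatic
ECN Capability                      : disabled
"""

WS2008R2_SAMPLE = """
TCP Global Parameters
----------------------------------------------
Receive-Side Scaling State          : enabled
Chimney Offload State               : enabled
ECN Capability                      : enabled
"""

differences = diff_settings(parse_netsh(WIN7_SAMPLE), parse_netsh(WS2008R2_SAMPLE))
for name, (win7, ws) in differences.items():
    print(f"{name}: Win7={win7} WS2008R2={ws}")
```

This only covers the global TCP parameters; per-adapter driver settings such as jumbo frames would need to be compared separately.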
Anatoly (staff)
Staff
Posts: 1675
Joined: Tue Mar 01, 2011 8:28 am

Fri Mar 15, 2013 5:26 pm

Well, that is really interesting. It looks like we need to do some benchmarking. Here is a document that should help:
http://www.starwindsoftware.com/starwin ... t-practice
digitalis99
Posts: 44
Joined: Sat Sep 15, 2012 6:24 am

Sun Mar 17, 2013 9:48 pm

So far, in my testing, I've found that the stickied forum post about TCP/IP tuning should probably be permanently deleted. :shock:

In my environment, at least, modifying ECN, chimney, jumbo frames, and similar parameters per the doc is a sure-fire way to get poor performance in Starwind. I get much better performance by just leaving everything at Windows default. I don't know why that is, exactly, but it's clearly the case.

At this point, I really don't think tuning is the culprit here, as the bootup delays for 30-50 machines off a ROW target are consistent regardless of any tuning I've done or undone. I'm going to dig into the log files to see if there's anything obvious there.
anton (staff)
Site Admin
Posts: 4010
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands

Sun Mar 17, 2013 10:04 pm

Interesting... Can you trace it down to a particular parameter making everything flaky?
Regards,
Anton Kolomyeytsev

Chief Technology Officer & Chief Architect, StarWind Software
digitalis99
Posts: 44
Joined: Sat Sep 15, 2012 6:24 am

Sun Mar 17, 2013 10:07 pm

I haven't found any parameter that makes it flaky in terms of the number of machines it will allow to connect to the target and boot simultaneously. The parameter changes only affect the performance of a session after it's connected.
digitalis99
Posts: 44
Joined: Sat Sep 15, 2012 6:24 am

Mon Mar 18, 2013 11:30 pm

I still haven't found any good explanation for this. I have noticed, however, that when systems are delayed in the boot process, it's during the switchover from the iPXE boot ROM to the MS initiator. I'm combing through Starwind log files to see if I can find good examples and track down why those examples took longer than they should have.
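
The log combing described above could be sketched roughly as follows: for each client, measure the time between its last iPXE-phase event and its first MS-initiator event, then flag the clients whose gap is unusually long. This assumes the events have already been extracted from the logs into `(client, phase, timestamp)` tuples; the phase labels, client names, and threshold are hypothetical, and no particular Starwind log format is implied.

```python
from datetime import datetime


def handoff_gaps(events):
    """Return {client: seconds between last 'ipxe' event and first 'initiator' event}."""
    last_ipxe = {}
    gaps = {}
    for client, phase, ts in sorted(events, key=lambda e: e[2]):
        if client in gaps:
            continue  # handoff already measured for this client
        if phase == "ipxe":
            last_ipxe[client] = ts
        elif phase == "initiator" and client in last_ipxe:
            gaps[client] = (ts - last_ipxe[client]).total_seconds()
    return gaps


def slow_clients(gaps, threshold_s):
    """Return the clients whose handoff took longer than threshold_s seconds."""
    return sorted(c for c, g in gaps.items() if g > threshold_s)


# Hypothetical events for two clients, for illustration only.
events = [
    ("node01", "ipxe", datetime(2013, 3, 18, 23, 0, 0)),
    ("node01", "initiator", datetime(2013, 3, 18, 23, 0, 5)),
    ("node02", "ipxe", datetime(2013, 3, 18, 23, 0, 1)),
    ("node02", "initiator", datetime(2013, 3, 18, 23, 1, 31)),
]

gaps = handoff_gaps(events)
print(slow_clients(gaps, threshold_s=30))
```

Sorting the flagged clients by timestamp afterwards might show whether the slow handoffs cluster around moments when many machines switch over at once.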
Anatoly (staff)
Staff
Posts: 1675
Joined: Tue Mar 01, 2011 8:28 am

Sat Mar 23, 2013 9:57 am

Would it be possible for you to do the following:
*Benchmark the RAID arrays on the SW host
*Create a Starwind RAM drive, connect to it from a machine that uses the same paths as the machines that had performance issues, and run benchmark tests
*Create a Starwind Basic Image drive, connect to it from a machine that uses the same paths as the machines that had performance issues, and run benchmark tests

The ATTO Benchmark tool should be a good place to start.
We'd appreciate it if you could post the benchmarking results afterwards.

Thanks
digitalis99
Posts: 44
Joined: Sat Sep 15, 2012 6:24 am

Wed Mar 27, 2013 6:24 am

I should probably clarify my thread title. "Throughput performance" doesn't appear to be the issue. "Bootup performance", on the other hand, is what has suffered. I get great throughput. It appears to be roughly the same as when Win7 Pro was the host OS on the target server. What doesn't happen is all clients booting up simultaneously. In the transition from iPXE to Windows during the middle of the boot phase, many of the clients pause or lock up. Some hang so badly that, I believe, they BSOD and reset.

There's just something about the host OS change on the target server that really messes up ROW mode on a completely unpredictable basis. How many clients have you guys tested simultaneously booting?
Anatoly (staff)
Staff
Posts: 1675
Joined: Tue Mar 01, 2011 8:28 am

Mon Apr 01, 2013 11:13 am

Hm... Let's try to approach it from another side. This issue looks related to MS, since the OS is the only thing that has changed (just my opinion). Have you tried asking this question on the MS forum or contacting MS support (our product is MS certified, so they should give you proper support)?
digitalis99
Posts: 44
Joined: Sat Sep 15, 2012 6:24 am

Mon Apr 01, 2013 3:23 pm

I haven't asked MS yet, but I'm not sure what help they could give. That phase of bootup isn't logged very well, and likely couldn't be logged at all in ROW with discard mode, which is what I use.
Anatoly (staff)
Staff
Posts: 1675
Joined: Tue Mar 01, 2011 8:28 am

Wed Apr 03, 2013 1:03 pm

Well, I think it's not about the client machines that are booting, but about the SAN host OS. I really think it's worth a shot to ask the MS reps.