Just can't figure this out
Posted: Sun Jul 28, 2013 11:42 pm
We built a brand new 10Gb SAN to replace an older 1Gb one and just can't figure out what is going on with the performance.
Setup is like this:
HP C7000 chassis with BL460c blades, each with 2x quad-core CPUs and 32GB RAM, running ESXi 5.1 on VC Flex-10 10GbE NICs. These connect back to the latest version of StarWind running on an HP DL360 G5 with an HP P812 controller connected to an HP MSA70 with 25x 300GB 15k SAS disks, plus an HP NC522SFP dual-port 10GbE NIC connected to the Flex-10 modules via HP 0.5m SFP+ DAC cables. StarWind is running on Server 2008 R2 with all the latest updates installed.
In theory the Server 2008 R2 test VM running on one of the blades should be able to push 800-900MB/s or more to the StarWind SAN, but this is not the case. Best case is about 550MB/s write and 450MB/s read.
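For reference, here is a rough sketch of where that 800-900MB/s expectation sits relative to the wire limit. It assumes standard Ethernet framing (38 bytes of per-frame overhead including preamble and inter-frame gap) and 20-byte IP and TCP headers with no options; the MTU values are assumptions, since the post does not say whether jumbo frames are enabled, and iSCSI PDU headers are ignored.

```python
# Back-of-the-envelope ceiling for TCP payload throughput on an Ethernet link.
# Assumptions (not from the setup above): no TCP options, no VLAN tag,
# iSCSI header overhead ignored, MB = 10**6 bytes.

ETH_OVERHEAD = 14 + 4 + 8 + 12   # header + FCS + preamble + inter-frame gap
IP_TCP_HEADERS = 20 + 20         # IPv4 + TCP headers without options


def tcp_payload_mb_per_s(link_bits_per_s, mtu):
    """Return the maximum TCP payload rate in MB/s for a given link and MTU."""
    payload = mtu - IP_TCP_HEADERS      # useful bytes per frame
    on_wire = mtu + ETH_OVERHEAD        # bytes the frame occupies on the wire
    raw_mb_per_s = link_bits_per_s / 8 / 1e6
    return raw_mb_per_s * payload / on_wire


if __name__ == "__main__":
    for mtu in (1500, 9000):
        print(mtu, round(tcp_payload_mb_per_s(10e9, mtu), 1))
```

Under these assumptions the ceiling is a little under 1.2GB/s at MTU 1500 and a little over 1.2GB/s with jumbo frames, so 800-900MB/s is not an unreasonable target for a single 10GbE link.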
Local performance tests directly on the SAN yield about 1.2GB/s write and 1.6GB/s read using both HD Tune 5.0 and ATTO.
So I then created an 8GB RAM drive on the SAN and ran the same benchmarks and got around 3.2GB/s write and 2.4GB/s read.
Then I shared out this RAM drive to the ESXi hosts using StarWind and guess what, tests only get about 600MB/s write and 500MB/s read.
I switched to an RDM in VMware instead of a datastore but it yielded no improvement.
Thinking there was something wrong with my network config, I went back to the StarWind server and used the Microsoft iSCSI initiator to connect to the 8GB RAM drive being shared out from StarWind. Assuming this would benchmark at pretty much full 10GbE speed, I was disappointed to only see about 400MB/s write and 350MB/s read (please note this is the MS initiator running on the same server as StarWind and the RAM drive).
So then I thought the HP NC522SFP adapter or its driver was causing the problem, so I disabled the NIC and installed the MS loopback adapter. Surely this would give me max performance, but it was worse than the 10GbE NIC: only about 350MB/s write and 300MB/s read. Weird.
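One way to separate the Windows TCP stack from the iSCSI layers in a loopback test like this is to push raw bytes over a local TCP socket and time it. The following is only a minimal sketch: the function name, transfer size, and chunk size are made up for illustration, and a single-threaded Python sender can itself become the bottleneck, so a low number here proves little while a high number suggests the TCP stack alone is not the limit.

```python
import socket
import threading
import time


def loopback_throughput_mb_per_s(total_bytes=64 * 1024 * 1024,
                                 chunk_size=256 * 1024):
    """Send total_bytes over a 127.0.0.1 TCP connection and return MB/s."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))        # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    received = [0]

    def sink():
        # Accept one connection and discard everything it sends.
        conn, _ = server.accept()
        with conn:
            while True:
                data = conn.recv(chunk_size)
                if not data:
                    break
                received[0] += len(data)

    receiver = threading.Thread(target=sink)
    receiver.start()

    payload = b"\0" * chunk_size
    client = socket.create_connection(("127.0.0.1", port))
    start = time.perf_counter()
    sent = 0
    while sent < total_bytes:
        client.sendall(payload)
        sent += chunk_size
    client.close()                       # signals end of stream to the sink
    receiver.join()                      # wait until all bytes are drained
    elapsed = time.perf_counter() - start
    server.close()
    return received[0] / elapsed / 1e6


if __name__ == "__main__":
    print(round(loopback_throughput_mb_per_s(), 1), "MB/s over loopback")
```

If raw loopback TCP runs well above the 350MB/s seen through the initiator, the bottleneck is more likely in the iSCSI initiator/target path or in how the benchmark issues I/O than in the NIC or its driver.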
So what on earth could be going on here?
I followed all the recommendations for performance tuning, like enabling/disabling RSS, TCP Chimney, TCP offloading, Delayed ACK, etc., but it made no major difference overall.
Where do I go from here?