Performance Problems with HA-targets

Software-based VM-centric and flash-friendly VM storage + free version

Moderators: anton (staff), art (staff), Max (staff), Anatoly (staff)

Chr.Raible
Posts: 1
Joined: Tue Jan 05, 2010 1:37 pm

Wed Jan 13, 2010 2:44 pm

Hi all,

first of all, sorry for my bad English. If anything is unclear, please ask and I'll try my best to explain in more detail!

Now I want to explain my infrastructure so that you know better what I'm talking about ;)

I have 2 ESX servers in my infrastructure.
The local HDDs are configured in RAID 5. This local storage is used by the StarWind server for the HA targets.
Each ESX host has four 1 Gbit/s NICs and is connected to the other host over a Cisco switch.

On each ESX server I installed a StarWind server in a VM running Windows Server 2008.
Each VM has 4 CPUs, 4 GB RAM and two NICs.


Now to my problem.

I have configured an HA volume as described in the Starwind_HA_ESX.pdf guide, but the performance is very, very BAD.
The primary HA target is located on ESX-1 and the secondary HA target on ESX-2. Both virtual machines have their own replication network interface.

1.)

If I install a new virtual machine on the HA target on ESX-1 (that's important) and test the performance with the following command:


dd if=/dev/zero of=file bs=1024 count=$((5000*1024)) 
The maximum write throughput is 24.2 MB/s.


2.)

If I install a new VM on the HA target on ESX-2 and test the performance with the same command, the maximum write throughput is about half of that: 12.2 MB/s.
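
Side note on the test itself: bs=1024 issues 1 KiB writes, and without a final flush the number also depends on the guest's page cache. A variant with larger blocks and a flush at the end usually gives a more representative sequential figure (a sketch, assuming GNU dd inside the guest):

# 1 MiB blocks, about 5 GiB written, flush to disk before the rate is reported
dd if=/dev/zero of=file bs=1M count=5120 conv=fdatasync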


Does anyone have an idea what went wrong? I think 24 MB/s is not very fast; I've read in other threads about 100 MB/s or much faster.
Is the problem that the StarWind servers run as VMs on the ESX hosts themselves?

Thanks for reading this.

Sincerely,

Chr.Raible
imrevo
Posts: 26
Joined: Tue Jan 12, 2010 9:20 am
Location: Germany

Wed Jan 13, 2010 3:11 pm

Chr.Raible wrote: Is the problem that the StarWind servers run as VMs on the ESX hosts themselves?
Definitely, as the "disk" I/O from your VM on ESX-1 is going through your NICs twice: out to the switch, back to the virtual NIC of the 2008 server, and out again to the partner server on ESX-2.

As ESX-1 is your primary server, you can surely imagine the path the data takes when you are testing on ESX-2 :-)

You should use external servers as iSCSI targets.

cu
Volker
Constantin (staff)

Wed Jan 13, 2010 3:51 pm

Two questions: did you enable 9K jumbo frames, and did you apply the registry tweaks?
Youp
Posts: 5
Joined: Wed Jan 13, 2010 4:10 pm

Wed Jan 13, 2010 4:21 pm

Hello, I have the same performance issue. I am testing StarWind under Windows 2008 on an ESX server, my disk is an Intel X25-M SSD, and with other SAN software I get better performance. I use these commands:

with DD

dd if=/dev/zero of=file bs=1024 count=$((1000*1024))
1024000+0 records in
1024000+0 records out
1048576000 bytes (1.0 GB) copied, 34.7032 seconds, 30.2 MB/s

-> on a VM installed directly on this disk (not over the iSCSI connection):

dd if=/dev/zero of=file bs=1024 count=$((1000*1024))
1024000+0 records in
1024000+0 records out
1048576000 bytes (1.0 GB) copied, 12.0554 seconds, 87.0 MB/s


With Seeker

./seeker /dev/sda
Seeker v2.0, 2007-01-15, http://www.linuxinsight.com/how_fast_is_your_disk.html
Benchmarking /dev/sda [8192MB], wait 30 seconds..............................
Results: 2663 seeks/second, 0.38 ms random access time

-> on a VM installed directly on this disk (not over the iSCSI connection):

./seeker /dev/sda
Seeker v2.0, 2007-01-15, http://www.linuxinsight.com/how_fast_is_your_disk.html
Benchmarking /dev/sda [20480MB], wait 30 seconds..............................
Results: 12225 seeks/second, 0.08 ms random access time


With HDPARM

hdparm -tT /dev/sda

/dev/sda:
Timing cached reads: 22488 MB in 2.00 seconds = 11266.96 MB/sec
Timing buffered disk reads: 428 MB in 3.01 seconds = 142.10 MB/sec


-> on a VM installed directly on this disk (not over the iSCSI connection):

hdparm -tT /dev/sda

/dev/sda:
Timing cached reads: 22456 MB in 2.00 seconds = 11252.44 MB/sec
Timing buffered disk reads: 778 MB in 3.01 seconds = 258.72 MB/sec


How can I increase performance? How do I find the bottleneck? Is it iSCSI, or Windows 2008?

thanks
Constantin (staff)

Wed Jan 13, 2010 5:23 pm

On all network hardware, servers and clients you have to enable 9K jumbo frames, and in Windows apply the following registry tweaks under
HKLM\System\CurrentControlSet\Services\Tcpip\Parameters:
GlobalMaxTcpWindowSize=0x01400000
TcpWindowSize=0x01400000
Tcp1323Opts=3
SackOpts=1
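
For reference, a sketch of how these values could be set from an elevated command prompt, assuming the standard reg.exe tool (a reboot is needed for the TCP parameters to take effect):

rem all four values go under the Tcpip\Parameters key
reg add "HKLM\System\CurrentControlSet\Services\Tcpip\Parameters" /v GlobalMaxTcpWindowSize /t REG_DWORD /d 0x01400000 /f
reg add "HKLM\System\CurrentControlSet\Services\Tcpip\Parameters" /v TcpWindowSize /t REG_DWORD /d 0x01400000 /f
reg add "HKLM\System\CurrentControlSet\Services\Tcpip\Parameters" /v Tcp1323Opts /t REG_DWORD /d 3 /f
reg add "HKLM\System\CurrentControlSet\Services\Tcpip\Parameters" /v SackOpts /t REG_DWORD /d 1 /f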
Youp
Posts: 5
Joined: Wed Jan 13, 2010 4:10 pm

Thu Jan 14, 2010 9:19 am

These settings are already applied on the iSCSI storage ... the clients are ESXi 4 ...
Constantin (staff)

Thu Jan 14, 2010 10:02 am

Did you turn on 9K jumbo frames in ESX?
Youp
Posts: 5
Joined: Wed Jan 13, 2010 4:10 pm

Thu Jan 14, 2010 11:22 am

No, I am comparing with other solutions, and 9K jumbo frames are not activated ...
Constantin (staff)

Thu Jan 14, 2010 11:42 am

Please activate it, and on the switches too. And please compare again after that.
Youp
Posts: 5
Joined: Wed Jan 13, 2010 4:10 pm

Thu Jan 14, 2010 1:03 pm

I have activated jumbo frames on ESX and Windows:

Switch Name Num Ports Used Ports Configured Ports MTU Uplinks
vSwitch1 64 3 64 9000

PortGroup Name VLAN ID Used Ports Uplinks
Kernel ISCSI JUMBO 0 1
ISCSI JUMBO 0 1


On the Windows StarWind storage server (10.0.10.227 is the ESX server):

Pinging 10.0.10.227 with 9000 bytes of data:
Reply from 10.0.10.227: bytes=9000 time<1ms TTL=64
Reply from 10.0.10.227: bytes=9000 time<1ms TTL=64
Reply from 10.0.10.227: bytes=9000 time<1ms TTL=64
Reply from 10.0.10.227: bytes=9000 time=2ms TTL=64
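
Note that a plain 9000-byte ping can be silently fragmented, so by itself it does not prove that jumbo frames survive the whole path. A stricter check sets the don't-fragment bit with a payload that just fits into a 9000-byte frame (a sketch, assuming vmkping on the ESX console and the standard Windows ping; the StarWind box's IP is a placeholder):

# from the ESX console to the StarWind server: DF bit set, 8972 = 9000 - 20 (IP) - 8 (ICMP)
vmkping -d -s 8972 <starwind_ip>
rem from the Windows StarWind server to the ESX vmkernel port, same idea
ping -f -l 8972 10.0.10.227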


The results are the same :(

hdparm -tT /dev/sda

/dev/sda:
Timing cached reads: 22968 MB in 1.99 seconds = 11532.00 MB/sec
Timing buffered disk reads: 406 MB in 3.01 seconds = 134.91 MB/sec

./seeker /dev/sda
Seeker v2.0, 2007-01-15, [ ... ]
Benchmarking /dev/sda [8192MB], wait 30 seconds..............................
Results: 2689 seeks/second, 0.37 ms random access time

dd if=/dev/zero of=file bs=1024 count=$((1000*1024))
1024000+0 records in
1024000+0 records out
1048576000 bytes (1.0 GB) copied, 35.2593 seconds, 29.7 MB/s
Constantin (staff)

Thu Jan 14, 2010 4:14 pm

Do you use RAID 5 volumes? If yes, format the Windows volumes that you are storing the images on with a 64K allocation unit size (instead of the default), and re-test.
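
For example, a sketch assuming the images sit on an empty, dedicated data volume E: (reformatting erases it):

format E: /FS:NTFS /A:64K /Q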
Youp
Posts: 5
Joined: Wed Jan 13, 2010 4:10 pm

Thu Jan 14, 2010 4:36 pm

No, I don't use RAID; Windows is installed on one SSD drive.
Constantin (staff)

Fri Jan 15, 2010 3:18 pm

We are investigating your case now and will respond to you early next week.
JLaay
Posts: 45
Joined: Tue Nov 03, 2009 10:00 am

Sat Jan 16, 2010 4:29 pm

Hi all,

Chr.Raible: I can imagine that you are disappointed, but:
- How many and what kind of HDDs do you use in the RAID 5?
- How much memory does the physical server have?
- What was the configuration behind this 'over 100 MB/s'? Read or write I/O testing, or a mix? Compare apples with apples :)
- Is this (StarWind as a VM) the config you will also use for production, or just for testing functionality? Otherwise testing this way will always disappoint you.
- Is StarWind developed for this kind of config?
- Start the W2008 VM with 1 CPU; more isn't always better and might well slow down overall performance.
- Add more memory to the W2008 VM (in steps).

For good iSCSI performance you need a properly configured network and a good, fast disk subsystem; both are interdependent for throughput.
I'm not familiar with SSDs, but I've read there are still issues with write performance, depending on brand/type.

For testing throughput, IOmeter (Windows) is a good tool (see the VMware communities). Get the default input file for testing:
http://communities.vmware.com/thread/73745

Network
For StarWind, for example, you need at least one dedicated 1 Gbit NIC for iSCSI and one dedicated 1 Gbit NIC for the sync link.
With a 1 Gbit NIC the theoretical throughput of the iSCSI link can be no more than about 125 MB/s (1 Gbit/s ÷ 8), and in practice somewhat less after protocol overhead.

Disk subsystem
For better read, and certainly better write, performance you need multiple 10k/15k SAS spindles in a RAID 5 or, better, RAID 10 configuration.
SATA disks will work, but relatively you will need even more spindles.

For a good iSCSI connection between ESXi hosts and storage you need to use jumbo frames (and flow control); see:

http://www.conetrix.com/Blog/post/Insta ... re-v4.aspx

You need jumbo MTU 9000 on (a command sketch for items 1 and 2 follows this list):
1. vSwitch
# esxcfg-vswitch -l
2. vmkNICs
# esxcfg-vmknic -l
3. NIC in OS
4. ports on your physical switch(es). Depending on brand/type of the switch this is for the entire switch or per VLAN!
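
A sketch of items 1 and 2 from the ESX(i) console, assuming vSwitch1 and the 'Kernel ISCSI JUMBO' port group shown earlier in the thread; on ESX(i) 4 a vmkernel NIC has to be removed and re-created to change its MTU, so substitute your own IP and netmask for the placeholders:

# set MTU 9000 on the vSwitch
esxcfg-vswitch -m 9000 vSwitch1
# re-create the vmkernel NIC with MTU 9000 (IP/netmask below are placeholders)
esxcfg-vmknic -d "Kernel ISCSI JUMBO"
esxcfg-vmknic -a -i <vmkernel_ip> -n <netmask> -m 9000 "Kernel ISCSI JUMBO"
# verify with the list commands above
esxcfg-vswitch -l
esxcfg-vmknic -l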

Flow control on
1. NIC in OS
2. ports on your physical switch(es)

+ tweaks (Windows) as suggested by Starwind
anton (staff)
Site Admin
Posts: 4010
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands

Sat Jan 16, 2010 6:39 pm

Give the most recent V5.x branch a try. It has a new cache implemented, so slowed-down network issues (collisions, dropped frames) and remote disk subsystem performance should be "smoothed" out. V5.2 has the cache enabled for non-HA nodes, and V5.5 and up will have HA nodes cached as well.

Use Intel's IOmeter to run performance benchmarks; both dd and hdparm are on the ancient side of the world. Also, just enabling jumbo frames is not the only thing you need to do. Use NTttcp and iperf to make sure the network does wire speed with TCP (and thus iSCSI).
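
A minimal sketch of such a raw TCP check with iperf (version 2.x assumed, installed on both the StarWind box and a test client; the IP is a placeholder):

# on the StarWind server
iperf -s
# on the client, 30-second run with a large TCP window
iperf -c <starwind_ip> -w 1M -t 30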
Regards,
Anton Kolomyeytsev

Chief Technology Officer & Chief Architect, StarWind Software
