Recommended SAN design for vSphere

Software-based VM-centric and flash-friendly VM storage + free version

Moderators: anton (staff), art (staff), Max (staff), Anatoly (staff)

Philip_B
Posts: 11
Joined: Wed Apr 07, 2010 12:49 am

Wed Apr 07, 2010 1:05 am

Greetings-

I realize how complicated optimizing a SAN design can be, but I am hoping to get some pointers for our specific environment. I have read a great deal on this forum and also on the VMware forum about the various little "caveats" that trip people up. I'm hoping that, within some basic parameters, some of the more experienced users could make a few recommendations that will get us started down the right path.

Hardware that we have:

VMware ESXi 4.0 U1 servers: qty (3), each a Supermicro with 32 GB RAM, dual Intel 5410 CPUs, and qty (4) Intel 1 Gb NICs, booting from USB flash.
Storage: LSI/3ware 8-port SAS RAID controller with 8 SAS HDDs (I'd like to configure RAID 50). The server has qty (6) Intel 1 Gb NICs and runs Server 2003 Standard with 4 GB of RAM.
24-port Gigabit Ethernet switch (managed) supporting VLANs, trunking, etc.

I'm thinking of a config that has multiple VLANs on each VMware host (rather than bonding NICs) and presents multiple LUNs at different IPs on the storage end.

I realize that I am focusing on performance and not failover. I'd like to do both at some point, but given our limited hardware and budget, I'd like as much performance as possible now and a path to increased availability down the road.

Does anyone have any recommendations on what to do with this hardware? Thanks in advance.
Constantin (staff)

Wed Apr 07, 2010 11:19 am

Teaming the NICs lets you get maximum performance as well as higher redundancy than you would have without NIC teaming.
peekay
Posts: 27
Joined: Sun Aug 09, 2009 2:38 am

Wed Apr 07, 2010 1:19 pm

I had a similar challenge with our configuration. However, the ESX 4 server we have has a dual-port iSCSI HBA (QLogic) which cannot be teamed. One item to note is that StarWind presents ALL LUNs on ALL NICs, so the only way to present specific LUNs to a NIC is by using access rights (for each NIC) or, in my case, using MPIO on ESX (with two VLANs) and selecting the preferred path for each LUN. With access rights set up on all StarWind NICs so that specific StarWind devices are "presented" only to specific ESXi IPs, you effectively limit what appears on each ESXi iSCSI adapter. I have not tried this, but I discussed it in a thread not long ago. This should also work with NIC teaming, since teamed NICs appear as a single adapter.
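
For what it's worth, the "preferred path" piece can also be done from the ESX command line instead of the vSphere Client. Something roughly like the lines below - I'm going from memory on the exact 4.x esxcli syntax, and the device/path names are only placeholders:

esxcli nmp device setpolicy --device eui.XXXXXXXXXX --psp VMW_PSP_FIXED
(keeps the Fixed path selection policy on that LUN)

esxcli nmp fixed setpreferred --device eui.XXXXXXXXXX --path vmhba33:C0:T1:L0
(pins the LUN to the path, and therefore the NIC/VLAN, you want it to use)

esxcli nmp path list
(verifies which path is now marked as preferred)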

:D
anton (staff)
Site Admin
Posts: 4010
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands

Wed Apr 07, 2010 3:55 pm

We should receive some QLogic HBAs from the friendly QLogic guys soon, so we'll be able to tell you for sure how to configure them and extract maximum performance from them :)
peekay wrote:I had a similar challenge with our configuration. However, the ESX 4 server we have has a dual-port iSCSI HBA (QLogic) which cannot be teamed. One item to note is that StarWind presents ALL LUNs on ALL NICs, so the only way to present specific LUNs to a NIC is by using access rights (for each NIC) or, in my case, using MPIO on ESX (with two VLANs) and selecting the preferred path for each LUN. With access rights set up on all StarWind NICs so that specific StarWind devices are "presented" only to specific ESXi IPs, you effectively limit what appears on each ESXi iSCSI adapter. I have not tried this, but I discussed it in a thread not long ago. This should also work with NIC teaming, since teamed NICs appear as a single adapter.

:D
Regards,
Anton Kolomyeytsev

Chief Technology Officer & Chief Architect, StarWind Software

Philip_B
Posts: 11
Joined: Wed Apr 07, 2010 12:49 am

Wed Apr 07, 2010 4:38 pm

Thank you for the comments.

It is my understanding that teaming NICs in ESXi 4 will just provide redundancy and not necessarily any performance increase. For example, I believe that two teamed NICs could not be attached to a pair of 1 Gb switch ports in a trunk to create an effective 2 Gb pipe... so you only get 1 Gb connections, with redundancy should one NIC fail.

Also, I believe that VMware will limit connections to 1 Gb per LUN. If this is not correct, please let me know.

I think the best performance can be found (while sacrificing redundancy) by connecting each NIC to a different VLAN (different IP subnet) and connecting to a different LUN presentation (different IP). The issue here (I think) is sorting out multipathing, which I am very confused about.

I was thinking of carving X LUNs on the storage side and placing each of the 6 NICs in 6 separate subnets (VLANs on one switch).

Each VMware server would connect to Y LUNs... but I'm not sure if this is the best idea, and I'm not sure how vMotion would work... I realize that the same LUN can be presented across multiple IPs, so this gets a little tricky in terms of tuning the number of LUNs and the number of VMs per LUN.
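
To make that concrete, here is the rough layout I have in mind (the VLAN numbers, subnets, and IPs below are made up purely for illustration):

VLAN 101: storage NIC 192.168.101.10 <-> ESX vmk1 192.168.101.21 (target portal / LUN group 1)
VLAN 102: storage NIC 192.168.102.10 <-> ESX vmk2 192.168.102.21 (target portal / LUN group 2)
...and so on for the remaining NICs, with each ESX host getting one VMkernel port per subnet it needs to reach.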

If the initial goal is performance, there has to be a single config that makes the most sense and steps around the performance gotchas that exist in VMware.

Anyone have input on this?
Constantin (staff)

Thu Apr 15, 2010 2:51 pm

Philip, I would recommend reading the following articles about network redundancy and getting maximum performance out of ESX(i):
http://kb.vmware.com/kb/1001938
http://www.tcpdump.com/kb/virtualizatio ... ation.html
The first article covers the requirements for achieving maximum performance, and the second is a how-to guide.
Philip_B
Posts: 11
Joined: Wed Apr 07, 2010 12:49 am

Thu Apr 15, 2010 10:04 pm

Thank you. Those links were helpful.

Also, other users may find this article helpful in making decisions in their own environments:

http://virtualgeek.typepad.com/virtual_ ... phere.html
anton (staff)
Site Admin
Posts: 4010
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands

Fri Apr 16, 2010 8:49 am

Wow! It's a good one. Thank you for posting this!
Philip_B wrote:Thank you. Those links were helpful.

Also, other users may find this article helpful in making decisions in their own environments:

http://virtualgeek.typepad.com/virtual_ ... phere.html
Regards,
Anton Kolomyeytsev

Chief Technology Officer & Chief Architect, StarWind Software

Philip_B
Posts: 11
Joined: Wed Apr 07, 2010 12:49 am

Mon Jul 05, 2010 3:42 pm

We continue to see very poor performance (15-25 MB/s in IOMeter) and have been unable to find the problem.

I'm hoping that if I detail the steps we have taken you may be able to find where we have gone wrong. Below is a summary of the hardware and config steps we have taken. We have also had Bob Boule connect and remotely verify our settings with no success.

Storage Box

Dual Xeon 5050 CPUs with 8 GB of RAM (Supermicro), 8 NICs
LSI 9690 8-port controller with 15K RPM disks - RAID 50

ESXi 4 Box
Supermicro dual Xeon 5420 w/ 16 GB of RAM and four 1 Gb NICs.

We have a clean install of Server 2008 R2 on the storage server with updated firmware and drivers for the controller and NICs. We installed the latest version of StarWind yesterday and presented two image devices - one backed by 100 GB of disk and the other a 2 GB RAM disk.

We have dedicated one NIC to this test (to keep things simple), set jumbo frames to 9014 (the max for the device), and set VLAN tagging to VLAN ID 2.

This NIC is plugged into a switch port that is in VLAN 2.
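
To rule out the Windows side, we also plan to verify the MTU and large frames from the storage server itself - the IP below is just a placeholder for whichever VMkernel address the ESX host ends up using:

netsh interface ipv4 show subinterfaces
(shows the effective MTU on each NIC in Server 2008 R2)

ping -f -l 8972 <ESX VMkernel IP>
(don't-fragment ping with an 8972-byte payload; 8972 + 28 bytes of IP/ICMP headers = 9000)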

We have followed the guidance provided by Bob Boule @ StarWind, created a port group on the ESX box, and enabled jumbo frames using the following steps:

CONFIGURING ESX SERVER

There is no GUI in VirtualCenter for configuring jumbo frames; all of the configuration must be done from a command line on the ESX server itself. There are two basic steps:

Configure the MTU on the vSwitch.
Create a VMkernel interface with the correct MTU.
First, we need to set the MTU for the vSwitch. This is pretty easily accomplished using esxcfg-vswitch:

esxcfg-vswitch -m 9000 vSwitch1

A quick run of “esxcfg-vswitch -l” (that’s a lowercase L) will show the vSwitch’s MTU is now 9000; in addition, “esxcfg-nics -l” (again, a lowercase L) will show the MTU for the NICs linked to that vSwitch are now set to 9000 as well.

Second, we need to create a VMkernel interface. This step is a bit more complicated, because we need to have a port group in place already (created with "esxcfg-vswitch -A IPStorage vSwitch1", for example), and that port group needs to be on the vSwitch whose MTU we set previously:

esxcfg-vmknic -a -i 172.16.1.1 -n 255.255.0.0 -m 9000 IPStorage

This creates a VMkernel port with an MTU of 9000 on the IPStorage port group, which sits on vSwitch1 (the vSwitch whose MTU was previously set to 9000). Be sure to use an IP address that is appropriate for your network when creating the VMkernel interface.

To test that everything is working so far, use the vmkping command:

vmkping -s 9000 172.16.1.200

Clearly, you’ll want to substitute the IP address of your storage system in that command.


The two NICs that are in this port group are also plugged into switch ports that are part of VLAN 2.

From the storage box we can ping the ESX box, and the ESX box can see and mount the LUNs, but we can't ping the storage from the ESX box using vmkping -s 9000 x.x.x.x.
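
One thing I still need to double-check on our end, based on VMware's jumbo frame documentation: a 9000-byte ICMP payload plus the IP/ICMP headers won't fit in a 9000-byte MTU, so the usual test is an 8972-byte payload with the don't-fragment flag set, e.g.:

vmkping -d -s 8972 x.x.x.x

If that fails while a plain vmkping works, the MTU mismatch is somewhere along the path (vSwitch, VMkernel port, physical switch port, or the storage NIC).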

If we attach to the storage with either a physical server or a VM and run IOMeter, we get results in the range of 10-25 MB/s. There is very little difference between the RAM disk LUN and the RAID 50 LUNs, so I believe we have an issue with the network setup.

Any guidance or thoughts would be appreciated. I have looked far and wide for any other documentation on how to configure this and have not found any information that resolves this issue.

Thanks in advance.
Max (staff)
Staff
Posts: 533
Joined: Tue Apr 20, 2010 9:03 am

Wed Jul 21, 2010 1:55 pm

Enable jumbo frames for your ESX hosts (see step 1 below for how the vSwitches map to port groups and physical NICs):

Create your vSwitch.

Set a 9K MTU on the vSwitch:

esxcfg-vswitch -m 9000 vSwitch1

Create the port group for the VMkernel interface:

esxcfg-vswitch -A JumboVMK vSwitch1

Create the VMkernel port with a 9K MTU:

esxcfg-vmknic -a -i 192.168.1.64 -n 255.255.255.0 -m 9000 JumboVMK

Here are the steps that need to be taken in your case:
1. Create a separate vSwitch for each VMkernel port you will use (one for each physical NIC).
2. Assign each new vSwitch a physical adapter and create a VMkernel port on the new vSwitch (see above).
3. Bind the physical NICs that are being used to the ESX iSCSI stack:

esxcli swiscsi nic add -n vmk_interface_name -d vmhba_name

4. Once this is done, rescan your HBA to discover your targets across all eligible paths.
5. Next, set the path selection policy on the multipathed device to Round Robin so the load is balanced across all available NICs (see the note after step 6 below). You can grab the eui of the device and check its current Round Robin settings with:

esxcli nmp roundrobin getconfig -d eui.XXXXXXXXXX

6. Finally, choose how Round Robin spreads the load over the NICs. For example:

esxcli nmp roundrobin setconfig --device eui.XXXXXXXX --iops 1 --type iops

This switches paths after every I/O operation, spreading the load evenly down the multiple paths.
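
A quick addition on steps 4 and 5 - I'm going from memory on the exact vSphere 4 CLI syntax, and the adapter/device names are placeholders you would replace with your own:

esxcfg-rescan vmhba_name
(step 4: rescans the software iSCSI adapter from the command line)

esxcli nmp device setpolicy --device eui.XXXXXXXXXX --psp VMW_PSP_RR
(step 5: switches that device to the Round Robin path selection policy, before tuning it with the roundrobin setconfig command above)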
Max Kolomyeytsev
StarWind Software