understanding HA setup and performance

jsmith007
Posts: 3
Joined: Thu Apr 08, 2010 8:35 pm

Thu Apr 08, 2010 9:07 pm

Hi All,
I am new here and this is my first post. I am testing HA on the latest release of StarWind iSCSI and have some questions about failover performance. My setup involves two servers (server1 and server2, acting as the HA pair) and a Hyper-V server that runs the VM. I followed the technical guide called "hyperv and high availability storage." I am not using Microsoft clustering in any way, since I only have one Hyper-V server. I set up HA storage on server1 and server2 and, using the Microsoft iSCSI initiator on the Hyper-V server, connected to the HA storage on both server1 and server2. Server1 was set as primary and server2 as secondary.
I exported a Windows 7 VM to the shared HA storage from the Hyper-V server and then imported it back. The Windows 7 VM ran very smoothly while server1 was primary (MPIO is working and set to round robin as suggested), and server1 and server2 were fully synced before the import.
So now it was time to test StarWind's HA. Windows 7 was running and pinging google.com at this point. I killed the StarWind process from Task Manager on server1, and Windows 7 continued without any issues or loss of performance. A couple of seconds later the StarWind software reported the loss of connection to server1. I then restarted the service on server1 and logged back on to it. Now I saw a yellow exclamation mark indicating a problem with HA and sync. At this point Windows 7 was clearly running from server2, since I had intentionally killed the process on server1. I then selected "fully sync" to resync server1. First of all, it took forever to start. Then I started noticing performance problems on the Windows 7 VM: very slow responses on everything. For example, when I clicked X to close a window, it took a while to close.
So I wanted to investigate HA further and see what was going on. On server2, the HA image file was barely 1 GB. I had initially set up the HA disk as 50 GB on both servers. That surprised me, since I expected it to be at least 20 to 30 GB, which is the default install size of Windows 7. On the primary, server1, the image file was slightly more than 50 GB, which seems to make sense.
The sync was taking so long that I decided to switch to "fast sync", and surprisingly it was still slow.
So my questions are: how does HA really work when it fails over? Is a performance hit this bad expected for one VM? What if there are many VMs on the shared HA storage? What is the performance hit, as a percentage or an absolute number, on the failover server?
I have fairly decent systems for this testing, and the performance was really slow, which makes me concerned about further testing for production use. The Hyper-V server is a Dell 2900 with dual quad-core CPUs and a hardware RAID card, with RAID 1 plus RAID 5 for data. Server1 and server2 are also 2900 boxes with 12 GB of memory and Dell hardware RAID, with the HA storage on RAID 5.
So is this how HA works, with a slow failover, or is something wrong with my setup? Can anyone answer this? I am worried about how I could run multiple production VMs on HA storage this slow.
Thank you for your answer in advance.
jsmith007
Posts: 3
Joined: Thu Apr 08, 2010 8:35 pm

Thu Apr 08, 2010 11:50 pm

Hi, I want to correct one thing before anyone has a chance to answer. I was wrong about the size of the image on the secondary HA node: it is exactly the same as on the primary. Good, that is clear. Now, I managed to run another Windows 7 VM on the same HA storage (I have a total of 2 Windows 7 VMs), and the disk queue length grew to 5 on both VMs. At this point performance on the Windows 7 VMs is really bad. I see the busy circle every time I click on something. I think I understand HA and what it is designed to do. What I don't understand is why my VMs perform so poorly. I have hardware RAID 5 with 4 to 6 SATA disks. Again, I see the performance problem with only 2 Windows VMs on the HA storage. How can I improve the performance? How many VMs should I be able to run under my setup without taking a hit on the disk?

Thanks,
Henry
anton (staff)
Site Admin
Posts: 4010
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands

Wed Apr 14, 2010 4:05 pm

First of all, we'd love to see your disk subsystem benchmarks. You could have BRILLIANT hardware, but with an improper configuration you could easily end up with a bummer... So please use Intel Iometer, or at least ATTO Disk Benchmark, and record some runs on the hardware you use. If we see it's OK and write speed is acceptable (if write-back cache is delayed you're going to get a clumsy 10-15 MB/sec, which is complete nonsense for 2010), we'll check what could be done on our side. This is number ONE. And here's NUMBER TWO: drop us an e-mail at support@starwindsoftware.com and apply for the Beta program. We'll issue you a V5.4 Beta version with reworked caching that you could give a try. It has a cross-bar switch implemented in software and should be miles ahead of V5.3 when it comes to HA performance. Hope this helps :)
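As a rough first sanity check before running Iometer or ATTO, sequential write throughput can be estimated with a few lines of Python. This is only a minimal sketch and not a replacement for those tools: the function name `sequential_write_mbps`, the 64 MB test size, and the 64 KB block size are illustrative choices, and the script writes to a temporary file on the local disk. To test an iSCSI volume you would point it at a file on that volume instead.

```python
import os
import tempfile
import time


def sequential_write_mbps(path, total_mb=64, block_kb=64):
    """Write total_mb of data to path in block_kb chunks and return MB/s.

    The flush + fsync at the end asks the OS to push its cached data to the
    storage stack, so the timing is not just a memory copy. A RAID controller
    with write-back cache may still absorb part of the writes.
    """
    block = os.urandom(block_kb * 1024)
    blocks = (total_mb * 1024) // block_kb
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(blocks):
            f.write(block)
        f.flush()
        os.fsync(f.fileno())
    elapsed = max(time.perf_counter() - start, 1e-9)
    return (blocks * block_kb / 1024) / elapsed


def main():
    # Assumption: a temp file on the local disk stands in for the real target.
    fd, path = tempfile.mkstemp(suffix=".bin")
    os.close(fd)
    try:
        print(f"sequential write: {sequential_write_mbps(path):.1f} MB/s")
    finally:
        os.remove(path)


if __name__ == "__main__":
    main()
```

A result in the 10-15 MB/sec range on the RAID volume itself would point at the disk subsystem or its cache settings rather than at the HA layer.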
Regards,
Anton Kolomyeytsev

Chief Technology Officer & Chief Architect, StarWind Software

jsmith007
Posts: 3
Joined: Thu Apr 08, 2010 8:35 pm

Wed Apr 14, 2010 8:41 pm

I ran the ATTO benchmark with default settings, and here are the results. For the iSCSI targets that I set up with HA, read is 80 MB/s and write is about 35 MB/s (both HA targets run RAID 5 on a hardware RAID card). For an iSCSI target without HA, just a normal image file setup, I get 110 MB/s read and 130 MB/s write on RAID 10. This is a separate server, but its configuration is almost identical to the HA servers. It just so happened that I set this one up with RAID 10 to support an Oracle database on it. All the disks involved are SATA II. My question is, how do I improve read/write on the RAID 5 HA targets? Does it have to do with HA? Do I need to install and configure any StarPort software on any of these HA targets? I don't have another RAID 10 server with SATA II disks anywhere, so I cannot run the same test for HA on RAID 10. If you have any information on this, I would appreciate it. Last question: in terms of backing up HA, is the recommended way to back up with StarWind software, or to use third-party backup software through the VM or LUN?

Thank you for your answers in advance...
anton (staff)
Site Admin
Posts: 4010
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands

Thu Apr 15, 2010 11:34 am

Oh, this way it makes sense. For HA, data is reported as "written" only after it has been physically written to both nodes. So if you simply split the two writes, you get 130 MB/sec / 2 = 65 MB/sec. Add the broken pipeline, re-buffering and so on, and your 35 MB/sec looks pretty close to what you should expect. With cached HA you'll get MUCH better performance (close to non-cached non-HA). So proceed with the V5.4 Beta unless you have a production environment :)
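The arithmetic above can be sketched in a few lines of Python. This is a back-of-envelope model only, assuming the two node writes happen one after the other rather than in parallel; the function names are hypothetical and not part of any StarWind tool. Note that the 130 MB/sec baseline was measured on the RAID 10 non-HA box, while the HA nodes run RAID 5, so the comparison is rough.

```python
def serialized_mirror_write_mbps(single_node_mbps, nodes=2):
    """Upper bound on write throughput when every write must complete on
    each node in turn before it is acknowledged (assumed model)."""
    if single_node_mbps <= 0 or nodes < 1:
        raise ValueError("throughput must be positive and nodes >= 1")
    return single_node_mbps / nodes


def implied_efficiency(measured_mbps, bound_mbps):
    """Fraction of the theoretical bound actually achieved; the shortfall
    covers overheads such as pipeline stalls and re-buffering."""
    return measured_mbps / bound_mbps


# Figures from the thread: 130 MB/sec non-HA write, 35 MB/sec HA write.
bound = serialized_mirror_write_mbps(130.0)
efficiency = implied_efficiency(35.0, bound)

print(f"theoretical bound: {bound:.1f} MB/sec")
print(f"measured HA write is {efficiency:.0%} of that bound")
```

With these inputs the bound is 65 MB/sec, and the measured 35 MB/sec is a little over half of it, which is the gap attributed above to the broken pipeline and re-buffering.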
