Poor performance (high latency) on StarWind vs FreeNAS

Software-based VM-centric and flash-friendly VM storage + free version

Moderators: anton (staff), art (staff), Max (staff), Anatoly (staff)

mardes
Posts: 11
Joined: Sat Dec 01, 2012 2:30 pm

Thu Jun 13, 2013 1:19 pm

I have to ask a question (hoping that it's my mistake).

I've noticed very poor performance with StarWind, but only during multiple concurrent disk accesses (concurrent threads), especially with ESXi.
So I decided to investigate a little and make some comparisons with FreeNAS.
The result is that on sequential writes you might think everything is OK, but under real workloads you hit a real bottleneck.


I ran my tests with the latest StarWind build (6.0.5569) on these systems:
1) HP ProLiant DL380 G4 - Dual Xeon 3.2 GHz 64-bit Server - 4 GB RAM
2) HP ProLiant DL360 G4 - Dual Xeon 3.2 GHz 64-bit Server - 4 GB RAM
3) HP ProLiant DL160 G5 Storage Server - Dual Xeon 3.2 GHz 64-bit Server - 12 GB RAM


The results are almost the same on all of them, so I think there really is a problem with StarWind.
The most remarkable result, I think, is the high latency (compared to FreeNAS).

I've executed all tests on each system with FreeNAS first and then StarWind (so both were tested on the identical system).

Please see the attached images...
Attachments
comp3.png
comp3.png (71.32 KiB) Viewed 13461 times
comp2.png
comp2.png (177.04 KiB) Viewed 13462 times
comp1.png
comp1.png (111.54 KiB) Viewed 13458 times
mardes
Posts: 11
Joined: Sat Dec 01, 2012 2:30 pm

Fri Jun 14, 2013 11:25 am

Anatoly, thanks for your suggestion.

I forgot to mention that:
1) I've already verified that I have no network issues (tested with iPerf, at 100% of 1 Gigabit)
2) I've tested StarWind with a 512 MB write-back cache (all files used for the tests are under 512 MB, so they stay in cache)
3) If I test StarWind with a RAM disk it's OK (0.5 ms latency)
4) If I test performance on the physical disk (where the StarWind .img resides), the performance is also BAD (10-15 ms)
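
A local-disk latency check like the one in point 4 can be sketched in a few lines. This is a minimal Python sketch (not the benchmark I actually used), assuming the storage is reachable as an ordinary file path; it times synchronous 4 KB writes, which is roughly the pattern a cache-less target imposes on its backing .img file:

```python
import os
import statistics
import time

def measure_write_latency(path, block_size=4096, writes=100):
    """Issue synchronous 4 KB writes and record each latency in milliseconds."""
    data = os.urandom(block_size)  # random payload, not all zeros
    latencies = []
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        for _ in range(writes):
            start = time.perf_counter()
            os.write(fd, data)
            os.fsync(fd)  # force the write down to the physical disk
            latencies.append((time.perf_counter() - start) * 1000.0)
    finally:
        os.close(fd)
    return statistics.mean(latencies), max(latencies)

avg_ms, worst_ms = measure_write_latency("latency_probe.img")
print(f"avg {avg_ms:.2f} ms, worst {worst_ms:.2f} ms")
os.remove("latency_probe.img")
```

The file name `latency_probe.img` and the write count are arbitrary; the point is only that each write is individually flushed and timed.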

So, now I know that StarWind cannot do better.

But the question is: what should people implement for an iSCSI SAN (StarWind or FreeNAS)?

The fact is that if I use FreeNAS on all the hardware I've tested, I get impressively high performance (maybe because of how that OS talks to the hardware disks directly, versus the Windows driver path).
You can see it in the image I attach: I compared a local physical Crucial SSD drive with FreeNAS and StarWind.
The results show the huge difference in performance.
Attachments
freeNas vs StarWind vs SSD.png
freeNas vs StarWind vs SSD.png (83.32 KiB) Viewed 13426 times
mardes
Posts: 11
Joined: Sat Dec 01, 2012 2:30 pm

Sat Jun 15, 2013 9:34 am

I'm sorry, but I must post another test, hoping there is a solution for the poor Windows disk performance. Yes, Windows: because I think it's not a StarWind limitation but a Windows limitation (I've tried up to Windows Server 2012, all with the same results).

I've tested a crazy thing, just for demonstration purposes:
1) I configured an iSCSI target on FreeNAS
2) I accessed that target from the Windows host on which StarWind resides, so now I have (e.g.) disk D:
3) In the StarWind console I created a target (with no cache) using a standard .img file on disk D: (the disk served by FreeNAS) (yes, it's crazy)
4) I accessed this StarWind target from my Windows client

So, I was testing the performance of a StarWind target that uses an iSCSI disk served by FreeNAS rather than a local Windows physical disk.

The result is unequivocal.
Now StarWind performs very well (as well as it does with its pure RAM-based target disk).

Look at the attached image:
Disk C is a local physical SSD on my Windows client
Disk Z is a StarWind iSCSI disk stacked on a Windows physical disk
Disk G is a StarWind iSCSI disk stacked on a FreeNAS iSCSI disk (the crazy setup, for testing only).

What does it mean?
It means that if you need disk performance, you should never use a Windows OS.
I would like to use StarWind, but it's Windows-only.

Paradoxically, if you really want to use StarWind, you should put your physical disks in a Linux box, configure them as iSCSI targets, and then access those targets from your Windows StarWind host.

That's all.
Attachments
starwind stacked on freeNas.png
starwind stacked on freeNas.png (115.81 KiB) Viewed 13385 times
mardes
Posts: 11
Joined: Sat Dec 01, 2012 2:30 pm

Sun Jun 16, 2013 2:32 am

I ran tests with real data and also a "random data pattern".
Tomorrow I'll post results with IOMeter as well.
Today I migrated 6 VMs on ESXi from StarWind to FreeNAS
using the same physical machine (so with no change in SAN hardware).
So the tests are run with real, real, real work.
Now my VMs are much more responsive, especially on concurrent disk access.

Last year I went crazy looking for the reason for this poor performance.
Now that I've migrated the VMs to a FreeNAS iSCSI datastore, I've discovered the reason.
User avatar
anton (staff)
Site Admin
Posts: 4010
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands
Contact:

Sun Jun 16, 2013 12:23 pm

I had to clean the topic a bit as it turned more emotional than professional :) Now back to our lambs (c) ...

1) Use Intel IOMeter. There are tons of good (and bad) benchmarks, and we physically cannot afford to spend time learning all of them.

2) Configure a proper test bed. Take as many variables out of the picture as you can. Say, for the initiator use a Windows Server 2012 machine (bare metal, not a VM), and
for the target use the same hardware, just swapping the boot disk between Windows (for StarWind and the MS target) and FreeBSD (FreeNAS, actually). We're looking for
latency, so one spindle will be fine for the tests. Running tests in loopback will give you number A (raw disk performance), and running the same pattern
test over GbE will tell us how much latency every target (StarWind, MS target, and FreeNAS) brings to the table with number B.

3) I understand you're interested in ESXi performance for now, but to know whether this is an ESXi, StarWind, handshake, or other issue, start with
Windows-to-Windows or Windows-to-<others> tests. If everything goes flawlessly, we'll find out what's wrong with the ESXi config (if any) or ESXi drivers, etc.

4) Document everything: hardware, software configs, and interconnection diagrams. Make sure you run a "short stroke" (using, say, the first 500 GB of the disk)
and use the same space for all the tests. And write some random (not all zeros!) pattern to it, as SSDs and ZFS will give crazy numbers when doing I/O
on non-allocated data.
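
The random pre-fill from point 4 can be sketched as below. This is a hypothetical Python helper, assuming the test area is exposed as a plain file (for a raw device you would open the device path instead, with the appropriate privileges):

```python
import os

def prefill_random(path, size_bytes, chunk=1 << 20):
    """Overwrite the first size_bytes of the test area with random data,
    so SSDs and ZFS cannot short-circuit reads of unallocated blocks."""
    written = 0
    with open(path, "wb") as f:
        while written < size_bytes:
            n = min(chunk, size_bytes - written)
            f.write(os.urandom(n))  # non-zero, incompressible pattern
            written += n
        f.flush()
        os.fsync(f.fileno())  # make sure the fill actually hit the disk
    return written

# Pre-fill a small demo file (a real run would cover the first ~500 GB of the disk).
total = prefill_random("testarea.img", 4 << 20)
print(total)
```

The 4 MiB demo size and the file name are placeholders; only the all-random, fsync-ed fill matters for making subsequent read tests honest.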

5) When running, turn cache OFF. The MS target has none, StarWind should have its cache disabled, and the same should be configured for ZFS: no dedupe,
no WB cache, no L2ARC. Non-cached numbers will give us the LOW watermark of performance, the WORST case, which is what we're looking for. Cache will improve the numbers,
but in tests it will just mess everything up.

6) Run a proper test pattern: 4-8 workers (simulating VMs) with a queue of at least 8-16 I/Os each.

4KB blocks (the native block size for modern hard disks) 100% read
4KB blocks 100% write
64KB blocks 100% read
64KB blocks 100% write

the same but sequential. That's 8 charts for every target; 3 targets will give us 24 pictures, and 32 after adding the raw disk in loopback (Windows should be fine).

Make sure you run each test for a long time (10+ minutes) so everything stabilizes. I'd even run 30-minute tests if you care and have the time.
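
For illustration only, the worker idea from point 6 looks roughly like this in Python. IOMeter is the real tool; this toy keeps an effective queue depth of 1 per thread, whereas IOMeter would keep 8-16 I/Os outstanding per worker:

```python
import os
import random
import statistics
import threading
import time

def worker(path, results, idx, block=4096, iters=200):
    """One simulated VM: 4 KB reads at random offsets, timed one by one."""
    size = os.path.getsize(path)
    lat = []
    with open(path, "rb") as f:
        for _ in range(iters):
            off = random.randrange(0, size - block)
            t0 = time.perf_counter()
            f.seek(off)
            f.read(block)
            lat.append((time.perf_counter() - t0) * 1000.0)
    results[idx] = statistics.mean(lat)  # average response time, in ms

# Create a small random test file, then run 4 workers concurrently.
with open("bench.img", "wb") as f:
    f.write(os.urandom(8 << 20))

results = [0.0] * 4
threads = [threading.Thread(target=worker, args=("bench.img", results, i))
           for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("per-worker avg response time (ms):", [round(r, 3) for r in results])
```

The file size, worker count, and iteration count are arbitrary demo values; an 8 MiB file will largely sit in the page cache, so the numbers only illustrate the mechanics, not real disk latency.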

After finishing Part I and confirming it's OK, we'll continue with Part II (the ESXi initiator). Then we can dive into a stripped-down set of tests inside VMs.
Going to that scenario directly will tell us nothing, as there are too many places where things can break (ESXi handshake, ZFS cache, etc.).

7) We'd be happy to help you with a remote session (if required), and we'd also love to see you blog the final results for StarWind vs. MS target vs. FreeNAS
on the same hardware. Just *PLEASE* don't post premature numbers (ones we have not confirmed, since at that point we can do nothing about them).
People will love what you do.

Thank you!
Regards,
Anton Kolomyeytsev

Chief Technology Officer & Chief Architect, StarWind Software

Image
User avatar
anton (staff)
Site Admin
Posts: 4010
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands
Contact:

Sun Jun 16, 2013 12:27 pm

It's still not clear what's going on... With RAM and the network as a tier, StarWind does a good job for you, so it's not the iSCSI engine being
slow and adding huge extra latency; it's something wrong with either the ESXi-to-StarWind handshake or the I/O subsystem of the Windows machine running StarWind
(we need local disk performance tests to find out who's who).

Please use the methodology I've described and we'll see what's messy and what should be fixed (your config or our software?), and we'll be happy to
assist, proceed, and do our homework if required. That's it :)
mardes wrote:I ran tests with real data and also a "random data pattern".
Tomorrow I'll post results with IOMeter as well.
Today I migrated 6 VMs on ESXi from StarWind to FreeNAS
using the same physical machine (so with no change in SAN hardware).
So the tests are run with real, real, real work.
Now my VMs are much more responsive, especially on concurrent disk access.

Last year I went crazy looking for the reason for this poor performance.
Now that I've migrated the VMs to a FreeNAS iSCSI datastore, I've discovered the reason.
Regards,
Anton Kolomyeytsev

Chief Technology Officer & Chief Architect, StarWind Software

User avatar
anton (staff)
Site Admin
Posts: 4010
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands
Contact:

Wed Jun 19, 2013 10:30 am

OK, so we ran the tests here: non-cached StarWind vs. MS target vs. FreeNAS on the same hardware. We were not able to reproduce your issues.
Not even close! We'll publish the numbers here, and I'll also ask the guys to post some screenshots. My offer to jump in on your hardware config to see what's wrong
on your side is still valid. Thanks!
Regards,
Anton Kolomyeytsev

Chief Technology Officer & Chief Architect, StarWind Software

User avatar
Max (staff)
Staff
Posts: 533
Joined: Tue Apr 20, 2010 9:03 am

Wed Jun 19, 2013 12:52 pm

So here are the test results.
Storage server:
Windows 2012 Datacenter
Intel Core i7 CPU 2600 @ 3.4 GHz 3.7 GHz
8 GB RAM
Realtek PCIe GBE Controller
OS disk - WD Blue 1TB WD10EALX
Storage disk - OCZ Vertex3 120 GB
Storage SATA WD Black 1TB WD1002FAEX

Client server:
Windows 2012 Datacenter
Intel Core i7 CPU 2600 @ 3.40GHz 3.70 GHz
8 GB RAM
Realtek PCIe GBE Family Controller
System HDD WD Blue 1TB WD10EALX
20130619table_new.jpg
20130619table_new.jpg (105.62 KiB) Viewed 13158 times
* ART stands for Average response time
20130619chart1.jpg (46.02 KiB) Viewed 13172 times
20130619chart2.jpg
20130619chart2.jpg (36.12 KiB) Viewed 13170 times
Max Kolomyeytsev
StarWind Software
User avatar
anton (staff)
Site Admin
Posts: 4010
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands
Contact:

Wed Jun 19, 2013 2:34 pm

Thank you Max!

* ART is Average Response Time

It's clearly seen that FreeNAS is losing to everything, including non-cached StarWind and the MS target, which is cache-less by design.

We'd like to continue the discussion to figure out what's wrong with your config.

P.S. Single worker / single I/O and 10 GbE tests are coming.
Regards,
Anton Kolomyeytsev

Chief Technology Officer & Chief Architect, StarWind Software

mardes
Posts: 11
Joined: Sat Dec 01, 2012 2:30 pm

Thu Jun 20, 2013 3:38 pm

First of all, I apologize to everyone for the automatic translation of this post!

I encourage everyone to do their own testing!
I tried it on 3 different HP servers.

Many of the posts in this thread are gone. But I have them all saved on my PC!
I could publish them in full on my blog, but that would seem improper (just as it is improper to censor and cut posts, tailoring a thread to showcase the magnificence of one's own product), but it's possible that I will publish them!

[ ... ]

We'll see you on my blog!

ps: They have also disabled my notification option on this topic. In fact, I read it by accident and discovered the staff's cut-and-paste. What a shame!
User avatar
anton (staff)
Site Admin
Posts: 4010
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands
Contact:

Thu Jun 20, 2013 3:56 pm

1) This is a support forum and not a place to demonstrate your ego. That's why some of your (and my) posts were deleted.

2) You raised a question blaming us for low performance. We did our job verifying that it's not what you say, in 100% of the cases at least, and we also offered to run proper tests and offered our
help (a remote shell) to do this on your hardware, just to figure out what's broken. You refused, and now you're trying to play Che Guevara here. Nice. But not on this forum. People want a tracked solution to their
issues, and your non-constructive position does not help anybody (including you). See 1)

3) When you feel you want to HELP and to ACCEPT HELP, let me know (I've contacted you by forum e-mail with no response).

Good luck.

== Case closed ==
mardes wrote:First of all, I apologize to everyone for the automatic translation of this post!

I encourage everyone to do their own testing!
I tried it on 3 different HP servers.

Many of the posts in this thread are gone. But I have them all saved on my PC!
I could publish them in full on my blog, but that would seem improper (just as it is improper to censor and cut posts, tailoring a thread to showcase the magnificence of one's own product), but it's possible that I will publish them!

[ ... ]

We'll see you on my blog!

ps: They have also disabled my notification option on this topic. In fact, I read it by accident and discovered the staff's cut-and-paste. What a shame!
Regards,
Anton Kolomyeytsev

Chief Technology Officer & Chief Architect, StarWind Software

mardes
Posts: 11
Joined: Sat Dec 01, 2012 2:30 pm

Wed Jul 10, 2013 1:39 pm

I've run other tests this past week.
I think I've resolved the mystery. The misunderstanding was due to the available cache.
The hardware I used was an HP server with twin Intel Xeons and 12 GB of RAM, so FreeNAS had set its cache to 7 GB while I could only test StarWind with a 512 MB cache.
This made the results incomparable.
I got a trial license from the StarWind team and re-ran the PassMark advanced disk benchmark with StarWind also using a 7 GB cache.
The test was run on the same hardware and the same storage drive, doing first a FreeNAS boot and then a Windows 2012 boot (StarWind).

Attached here are 2 screenshots:
1) Cumulative PassMark test on both FreeNAS and StarWind (all the lines are very close together):
around 100 MB/sec are the file-server-profile results (PassMark)
around 40 MB/sec are the database-profile results (PassMark)
FreeNAS results are all between 12:06 and 12:37
StarWind results are all between 13:03 and 13:34

2) Salient differences: iSCSI disk = 200 GB (so not fully cache-fillable), PassMark profile = file server (StarWind performed better)
legend: green (12:34) = FreeNAS, red (13:32) = StarWind


So the mystery is resolved. Under the same RAM/cache conditions, StarWind is NOT slower than FreeNAS.
Attachments
Passmark cumulaive.PNG
Passmark cumulaive.PNG (23.2 KiB) Viewed 12743 times
Passmark salient  - greenFN redSW passmark-fileServer-profile  on iSCSI 200GB (not full cache fillable).PNG
Passmark salient - greenFN redSW passmark-fileServer-profile on iSCSI 200GB (not full cache fillable).PNG (17.65 KiB) Viewed 12743 times
User avatar
anton (staff)
Site Admin
Posts: 4010
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands
Contact:

Wed Jul 10, 2013 6:41 pm

OK, thank you! More stuff is coming (a log-structured file system to accelerate writes, flash caching to compare against L2ARC, and so on).
Regards,
Anton Kolomyeytsev

Chief Technology Officer & Chief Architect, StarWind Software
