performance questions


yyrkoon
Posts: 8
Joined: Sun Dec 10, 2006 10:08 pm

Sun Dec 10, 2006 10:47 pm

Hello,

First of all, I would like to point out that there seems to be a small compatibility issue with Windows Vista RC2 build 5744 (Ultimate). When installing to %root%/Program Files/, Vista by default requires admin privileges to install programs and afterwards marks the directory as read-only. The end result was that the trial window of your program did not open and no files could be modified, which also made it impossible to add new devices. I'm not sure if this setting is changeable in Vista, but that is part of the reason I'm "beta testing": to find these things out. The cheap fix was to uninstall and reinstall to a different location (i.e. other than the Windows default Program Files location).

Anyhow, I've done some simple testing and have a question or two. Currently I have StarWind running and am using MS' initiator, which comes with Vista Ultimate. I've done a little benchmarking with Sandra (yeah, I know) and HDTune (HDTach doesn't seem to like Vista). I set up a 1GB RAM disk, then mounted it via the initiator on the same machine.

My questions are:

1) Sandra reported 109MB/s and HDTune reported 115MB/s average throughput. The memory is capable of 10GB/s, so I know it is not the bottleneck. I do notice that while I'm running the benchmarks my CPU load is 100%, but the speed fluctuates very little. I would like to know the architectural limitations of this iSCSI software implementation.

2) Assuming my CPU is not fast enough to handle the load without bottlenecking the rest of the system in this case, what would be a good starting point (system-wise) for ideal performance?

3) Assuming I separated the target and initiator onto separate machines, what load average can I expect from StarWind/the initiator?

4) Are there any other bottlenecks I need to watch out for?

I do realize that using the "client/server" on the same system is less than ideal, but at this point in time I have no other alternative. Basically, this is a test case to see if software iSCSI is feasible for my solution.
anton (staff)
Site Admin
Posts: 4008
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands

Mon Dec 11, 2006 12:24 am

Did you check Vista Build 6000? It's gold already. However, we'll verify what you've reported. Thanks for pointing it out!

1) I guess you're using the same machine for both initiator and target, so you're measuring who knows what... In other words, there are multiple copy operations, multiple memory mapping/remapping, context switches, paging and so on involved.

There are NO limitations in the software. It will run as fast as your network supports. We're capable of doing wire speed with GbE and around 500-600 MB/sec with 10 GbE (I don't know of any single iSCSI implementation doing wire speed on 10 GbE).

2)-3) Depends on the CPU, of course, and on the TCP offload engine on the initiator. For cheap GbE and a 2.4GHz P4 CPU I remember having 40% of the processor used for iSCSI read/write.

4) Yes. Network hardware. Switches should support jumbo frames. The software stack should be optimized (check this forum for iSCSI-optimized settings), etc.
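
For a rough sense of what "wire speed" on GbE means in MB/s, and how much jumbo frames can buy, here is a back-of-the-envelope sketch in Python. It is only an illustration, not anything taken from StarWind: the function name is made up, the overheads are the standard Ethernet, IPv4 and TCP header sizes without options, and it ignores iSCSI PDU headers, ACK traffic and retransmits, so the results are an upper bound at best.

```python
# Rough upper bound on TCP payload throughput over Gigabit Ethernet.
# Assumptions (not from the thread): IPv4 + TCP without options,
# no iSCSI PDU headers, no ACK traffic, no retransmits.

LINK_BYTES_PER_S = 1_000_000_000 / 8      # 1 Gbit/s expressed in bytes/s
ETH_OVERHEAD = 8 + 14 + 4 + 12            # preamble+SFD, header, FCS, inter-frame gap
IP_TCP_HEADERS = 20 + 20                  # IPv4 header + TCP header


def tcp_payload_mb_per_s(mtu):
    """Payload MB/s (1 MB = 10**6 bytes) for back-to-back full-size frames."""
    payload = mtu - IP_TCP_HEADERS        # bytes of user data per frame
    on_wire = mtu + ETH_OVERHEAD          # bytes the frame occupies on the wire
    return LINK_BYTES_PER_S * payload / on_wire / 1e6


if __name__ == "__main__":
    for mtu in (1500, 9000):
        print(f"MTU {mtu}: about {tcp_payload_mb_per_s(mtu):.1f} MB/s")
```

Under these assumptions the ceiling is roughly 118.7 MB/s at MTU 1500 and 123.9 MB/s at MTU 9000, so jumbo frames add only a few percent of raw payload; their bigger effect is usually fewer packets for the CPU to handle.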

Running initiator and target on the same machine isn't just less than ideal. It's a complete waste of time.
Regards,
Anton Kolomyeytsev

Chief Technology Officer & Chief Architect, StarWind Software

yyrkoon

Fri Dec 15, 2006 11:11 pm

Hello again anton,

Well, a few things I'd like to mention.

1) The main reason I'm still (or was) using Vista RC2 is that I wanted to see if it was worth paying for. I'm still trying to make that determination, so paying for Vista is not an option for the moment.

2) The Program Files "bug" also doesn't seem to be limited to Vista. I've since done away with Vista and am running XP Pro, so maybe it is a .NET security implementation of some sort (assuming your management application even uses .NET)? In both cases I let the installer install to its default location, and the program would not show the trial dialog/form, and the options to add devices were all grayed out. I had to go to Add/Remove Programs, uninstall, and reinstall to a different location before the options to add devices were enabled. Maybe I missed something in the documentation?

[EDIT]

Hmm, maybe it is related to Windows Defender? It seems this copy of XP also has Program Files locked as read-only; you can change it, but when you go back and check, it's read-only again . . .
anton (staff)

Sat Dec 16, 2006 12:35 am

1) I think you can install Vista build 6000 and use it without registration for 30 days or so. That would be a much better way to test everything.

2) We've already changed all installation directories in the scripts. Thank you very much for pointing this out!
yyrkoon

Wed Dec 20, 2006 8:21 pm

Ok, I've done a bit of testing, and have yet more questions now . . .

First of all, let me say that 85MB/s on cheap GbE equipment (jumbo frames enabled) is what I'd consider pretty good. Well, that is 85MB/s using a RAM disk. Using an IMG file, I get 49MB/s on a RAID0 array that is capable of 96MB/s. Using an SPTI connection is actually worse: with a single ATA 100 drive I get about 37.x MB/s, and around 39.x MB/s doing a read test on the RAID0 array (this RAID array is actually the target's boot disk, so I didn't want to give it anything more than READ permissions).

What I've noticed is that it does not matter where the drive is located (PCI bus controller, motherboard controller, etc.); the speed does not change. CPU usage on both machines is reasonably low, and both the target's and the initiator's GbE controllers support jumbo frames, checksum offload, and TCP large send offload. Since one of these controllers is an onboard GbE controller and the other is a cheap PCI GbE adapter, I was not really expecting even 80% of capability, and am quite happy with 85MB/s. However, in the case of the single ATA 100 drive I'm getting 10MB/s lower throughput, and on the 2x SATA RAID0 array I'm getting only about half the throughput.

My question is: is this expected, and possibly due to TCP transmission overhead? Is there any way to "correct" this? I've changed all the settings I could possibly change in the registry, and I think I've already proven that I can get 85MB/s, but I'm not achieving it with hard drives. Anything else I'm missing?
anton (staff)

Wed Dec 20, 2006 9:44 pm

Well, if you have PCI 32-bit/33MHz connected NICs, it's OK; if they have direct, PCIe or PCI-X attachment, it's slow. You should be hitting at least 110MB/sec in such a case.

This situation happens because 1) the pipeline is comparatively long and 2) iSCSI breaks the original 64K requests into smaller ones, and they are smaller than the stripe size on your array.

There are a couple of things you can do. First, try decreasing the stripe size (say, use 4KB; however, it would be a CPU killer). The second approach would be for RDS to write its own RAID0 implementation (we've only got our own RAID1 so far) with variable stripe size and a tighter iSCSI stack <-> storage stack interface.
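
To put a number on the 32-bit/33MHz PCI remark: conventional PCI is a shared parallel bus, and its theoretical peak is simply bus width times clock. A quick Python sketch follows. The helper name is hypothetical, and real sustained throughput is lower than this peak because of arbitration and protocol overhead and because other PCI devices share the same bus.

```python
def pci_peak_mb_per_s(bus_width_bits, clock_mhz):
    """Theoretical peak of a parallel PCI bus, assuming one transfer per clock."""
    return bus_width_bits / 8 * clock_mhz


if __name__ == "__main__":
    peak = pci_peak_mb_per_s(32, 33.33)
    print(f"32-bit/33MHz PCI peak: about {peak:.0f} MB/s")
    # 85 MB/s is the RAM disk figure reported earlier in the thread.
    print(f"85 MB/s is about {85 / peak:.0%} of that peak")
```

The peak works out to about 133 MB/s, which is why 85MB/s through a plain PCI NIC counts as a reasonable result while the same figure on PCIe or PCI-X would be slow.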
yyrkoon

Thu Dec 21, 2006 1:43 am

anton (staff) wrote: This situation happens because 1) the pipeline is comparatively long and 2) iSCSI breaks the original 64K requests into smaller ones, and they are smaller than the stripe size on your array.

There are a couple of things you can do. First, try decreasing the stripe size (say, use 4KB; however, it would be a CPU killer). The second approach would be for RDS to write its own RAID0 implementation (we've only got our own RAID1 so far) with variable stripe size and a tighter iSCSI stack <-> storage stack interface.

Well, if and when you guys do write a RAID0 plugin, I would be more than glad to beta test it for you (provided my trial hasn't run out).

Since playing with StarWind, I've thought about all kinds of fun things you can do with iSCSI: multiple targets -> single initiator with large RAM disks, for instance (would cost a lot of money for sure).

Anyhow, what if I were to add another 2x HDDs to the RAID0 array? Do you suppose this would speed transfers up a decent bit, or am I pretty much maxed out on HDDs? The reason I ask is that my plan for this solution was to run RAID5 or RAID6, and if 39MB/s is the best I can do . . . Needless to say, I'm hoping that adding more disks would increase performance, but since I don't have another 2 HDDs to add to the mix right now, I cannot test it.
anton (staff)

Thu Dec 21, 2006 2:08 am

1) We will. Not at the moment, but at the beginning of next year.

2) You can drop me a message at anton@rocketdivision.com and we'd be happy to generate a longer trial key for you.

3) Yes, iSCSI itself is just an engine. It can power an M1 Abrams or a Hayabusa motorcycle :)

4) It would not work. See, a stripe size of, say, 64K means one part of the virtual RAID volume is on the first disk, the next is on the second, etc. So any request smaller than 64KB has a very big chance of being issued to only one hard disk. A physical one.

You can grab the DataCore or FalconStor products to check what they can do on the same hardware. It would be nice to compare the results, BTW.
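
The stripe-size argument in 4) can be sketched in a few lines of Python. This is an illustration only: it assumes the usual RAID0 layout in which stripe k lives on disk k mod N, which may not match the actual controller, and the function name is made up.

```python
KB = 1024


def disks_touched(offset, length, stripe_size, num_disks):
    """Return the set of RAID0 member disks that one request touches.

    Assumes the common layout: stripe k is stored on disk k % num_disks.
    """
    first_stripe = offset // stripe_size
    last_stripe = (offset + length - 1) // stripe_size
    return {s % num_disks for s in range(first_stripe, last_stripe + 1)}


if __name__ == "__main__":
    # Aligned requests of various sizes against a 64K stripe.
    for req in (8 * KB, 32 * KB, 64 * KB, 128 * KB):
        for disks in (2, 4):
            hit = disks_touched(0, req, 64 * KB, disks)
            print(f"{req // KB:>3}K request, {disks} disks: {len(hit)} disk(s) busy")
    # Same 64K request against a 4K stripe on 2 disks: both disks work.
    print(sorted(disks_touched(0, 64 * KB, 4 * KB, 2)))
```

With a 64K stripe, an aligned request of 64K or less lands on a single disk whether the array has 2 or 4 members, which is the reason adding disks would not help here. Dropping the stripe to 4K spreads the same 64K request over every disk.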
yyrkoon

Sun Dec 24, 2006 10:36 am

anton (staff) wrote:
You can grab DataCore or FalconStor thing to check what they can do on the same hardware. Would be nice to compare the results BTW.

Well, just taking one look at those, I can tell they probably want A LOT of money . . .

However, I've done A LOT more testing with different iSCSI implementations, Samba, and Windows file sharing, and I have to say, so far StarWind is the fastest. The closest was Openfiler, but it was still 10MB/s slower than StarWind (on a single PATA 100 drive). Ubuntu Dapper / Samba . . . what can I say, I love Debian, but any OS that hard locks (crashes) when you set the MTU larger than 1500 is just plain unacceptable . . .

Too bad you guys/gals do not have a home edition. I could use this for data backup at home also, but $400 USD is far too much for home use.
anton (staff)

Sun Dec 24, 2006 4:33 pm

Well, neither DataCore nor FalconStor is cheap :)

Which other iSCSI implementations did you test? What was used as the iSCSI initiator?

Actually, we're planning to bring StarWind Personal back. So you can drop me a message at info@rocketdivision.com so we can discuss features and personal pricing for you :)

Thanks!
yyrkoon

Tue Dec 26, 2006 10:55 am

I've tested Openfiler's iSCSI implementation and Ubuntu Dapper 6.06 LTS with the iSCSI Enterprise Target implementation so far.

As initiator, I've always used MS' initiator.

The main problem I have with Dapper is that driver support is not complete, so I'm not able to test with jumbo frames enabled on my hardware; setting the MTU > 1500 causes a serious page segment fault, completely locking up the system and requiring a hard reboot.

dd if=/dev/zero of=/home/test/test.img bs=1024k count=10000 gave me 157MB/s writes on this software RAID0 array (10GB file), and reading it back gave me 75.5MB/s (dd if=/home/test/test.img of=/dev/null bs=4k). However, disk access over the network gave me only 25MB/s, and network utilization was only roughly 20% on average.
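
For anyone who wants to repeat this kind of local sequential test on a machine without dd (a Windows target, for example), roughly the same measurement can be sketched in Python. This is a hypothetical helper, not what was used for the numbers above; the function names and default sizes are made up. As with dd, the read-back figure mostly reflects the OS cache unless the file is much larger than RAM.

```python
import os
import tempfile
import time


def write_throughput(path, block_size=1024 * 1024, count=16):
    """Write count blocks of zeros sequentially; return MB/s (10**6 bytes)."""
    block = b"\0" * block_size
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(count):
            f.write(block)
        f.flush()
        os.fsync(f.fileno())          # make sure data reached the device
    elapsed = time.perf_counter() - start
    return block_size * count / elapsed / 1e6


def read_throughput(path, block_size=4096):
    """Read the file back sequentially in small blocks; return MB/s."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    return total / elapsed / 1e6


if __name__ == "__main__":
    # Small demo file in a temporary directory; use a far larger file
    # on the disk under test for meaningful numbers.
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "test.img")
        print(f"write: {write_throughput(target):.1f} MB/s")
        print(f"read:  {read_throughput(target):.1f} MB/s")
```

Pointing the path at the iSCSI-mounted volume instead of a local disk gives a like-for-like comparison between local and network throughput.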

Openfiler's implementation is kind of neat in that it uses a web interface to control volumes, quotas, etc., but it left me feeling unsatisfied (the interface isn't very intuitive IMO). Plus, like I said before, it is at least 10MB/s slower using a single ATA 100 disk (40GB Seagate Barracuda 7200RPM drive).

Oh well, I don't think you guys really have much competition from the *NIX side, at least not for a while. I've seen claims that IET has achieved some impressive numbers using 10GbE connections, but from what I've seen, it's mostly smoke and mirrors.

[EDIT]

Oh, and whatever you did to the forums, as far as looks go, I personally think it looked better before :)

Oh, and yeah, Openfiler's implementation reports to the OS as a USB drive. I haven't checked IET's, but I know your implementation reports as an iSCSI device . . .
anton (staff)

Sat Jan 13, 2007 6:40 am

1) You may try to use StarPort instead of the MS iSCSI initiator. See, it looks like the MS initiator was never tested against anything except MS' own iSCSI target (ex-SBS WinTarget). At least that's the only configuration working at nearly wire speed ATM.

2) I see... Well, that's the problem with open source solutions :) You always pay, either with your money or with the working time you spend trying to make them work :) However, I believe their maintainers will fix the issues you pointed out one day.

3) No, we're not afraid of them YET. Freeware *nix iSCSI targets are playing in a different league for now. Everything could change in the future, however :)

4) Yup! You're not alone :) We've finally rolled back to the old forum design...
yyrkoon

Mon Jan 22, 2007 9:31 am

I'm using StarPort now, but not for iSCSI. Last night I set my Dapper 6.06 LTS box up for AoE, and the only Windows software that supported an AoE initiator was StarPort . . .

I'm still at an impasse: no matter what sort of storage protocol I use, ~30MB/s seems to be my limit, UNLESS I use your StarWind iSCSI target software.

Since my last post, I've tried NFS, FTP, iSCSI, CIFS (Samba), and now, as of last night, ATA over Ethernet. I guess I'll have to wait for my Intel Pro 1000 PCI-E card to arrive next week for better results than what I'm getting now, or until you guys make a RAID0-RAID5 iSCSI plugin.

All this being said, when you're ready with the RAID plugins, I'm willing to beta test for you. Just keep in mind that on my end I would most likely be limited to either 1) motherboard fake RAID or 2) Windows software RAID. I cannot justify spending more until I see much better results than I've been experiencing.
anton (staff)

Mon Jan 22, 2007 9:20 pm

That's a good idea! I mean waiting for proper GbE hardware (PCIe, better PCI-X or direct attached). 32-bit/33MHz PCI spoils everything.

Sure! You'll be one of the first to know about our own software RAID implementation :) We'll keep you updated.
yyrkoon

Thu Jan 25, 2007 10:22 pm

Well, anton, I'm not sure what to say. The Intel Pro 1000 PT card (PCI-E) did make a difference, but not a whole lot. This time I used StarWind + StarPort, and the best transfer I received was an average of around 40MB/s, with a peak of around 55MB/s, using an IMG file. SPTI is much slower (between 20-30MB/s).

This is what I've done, just in case I've missed something important:

1) Installed StarWind on the target machine, enabled jumbo frames, and checked all other hardware options for the adapter.
2) Installed StarPort on the initiator, enabled 9014-byte jumbo frames, and checked and changed any necessary hardware options.
3) Followed the instructions here -> http://www.starwindsoftware.com/forums/ ... mbo+frames, but instead of putting the settings in the key you suggested, I did it for the individual adapters, in their respective locations.
4) Set up the disks on the target.
5) Initialized the 'disks' on the initiator side and formatted them with varying block sizes, which didn't seem to make much, if any, difference (at least towards better speed; sometimes it did get worse).

Anyhow, I'm almost hoping I missed something somewhere, because ~30MB/s is what I'd also get using Samba or another file-level access method, and we all know iSCSI HAS to be better than Samba. It is really frustrating. If I haven't missed anything, then perhaps my hardware is holding me back, but to be honest, it is not worth spending large amounts on new hardware to find this out (at least, not for me).