Jumbo Frame needed?

Software-based VM-centric and flash-friendly VM storage + free version

Moderators: anton (staff), art (staff), Max (staff), Anatoly (staff)

Locked
Chris

Thu Dec 09, 2004 7:10 pm

I'm still doing tests to see if iSCSI performance is acceptable. I've set up a test lab with the following hardware:

Host A: IBM xSeries 330 Server
- dual P3 1.4
- 4G ECC RAM
- 2X18G 15K RPM SCSI
- Intel Pro 1000 MT Desktop Adapter (32-bit/66 MHz)
- OS: CentOS 3.3 (Red Hat Enterprise Linux 3 clone)

Host B:
- P4 3.0C HT
- 2G DDR 400
- 74G Raptor 10K RPM
- Intel Pro 1000 CT CSA
- OS: Windows XP SP1

Host C:
- P4 1.6C (no HT)
- 1G DDR 333
- 2x200G Maxtor ATA 8M cache
- Intel Pro 1000 MT Desktop (PCI Bus)
- OS: Windows 2000 Server SP4

Switch: Dell PowerConnect 2616
- 16 port gigabit
- 48Gb fabric capacity
- No Jumbo Frame Support

Running iperf, I'm getting the following results:
Host A - Host B
- 8K - 320Mb/s CPU: 7%
- 16K - 510Mb/s CPU: 12%
- 32K - 630Mb/s CPU: 15%
- 64K - 800Mb/s CPU: 22%
- 128K - 810Mb/s CPU: 22%
- 256K - 810Mb/s CPU: 27%
- 512K - 810Mb/s CPU: 33%
- Over 512K - 810Mb/s CPU: 35%

Host B - Host C
- 8K - 320Mb/s CPU: 7%
- 16K - 510Mb/s CPU: 12%
- 32K - 630Mb/s CPU: 15%
- 64K - 810Mb/s CPU: 20%
- 128K - 830Mb/s CPU: 22%
- 256K - 860Mb/s CPU: 25%
- 512K - 860Mb/s CPU: 35%
- Over 512K - 860Mb/s CPU: 37%

The Intel NICs have send and checksum offload, so the CPU time is not too bad. Host C is running gigabit over a regular PCI bus, so the result is very good. I'm waiting for the new board with the Intel CSA NIC to ship.
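For context on why 860 Mb/s through a plain PCI slot is a good result, here is a back-of-the-envelope check. The PCI figures are standard textbook numbers (32-bit bus at 33 MHz), not measurements from this lab:

```python
# Rough ceiling check for a gigabit NIC on a conventional 32-bit/33 MHz
# PCI slot (Host C's "regular PCI bus"). Standard PCI figures, assumed.

PCI_WIDTH_BITS = 32
PCI_CLOCK_HZ = 33_000_000

# Theoretical shared PCI bandwidth in megabits per second (~132 MB/s)
pci_mbps = PCI_WIDTH_BITS * PCI_CLOCK_HZ / 1_000_000

observed_mbps = 860          # best Host B -> Host C iperf result above
utilization = observed_mbps / pci_mbps

print(f"PCI ceiling: {pci_mbps:.0f} Mb/s, "
      f"observed: {observed_mbps} Mb/s ({utilization:.0%} of the bus)")
```

Since the PCI bus is shared with every other device on it and also carries DMA descriptor traffic, pushing over 80% of the theoretical ceiling through one NIC is close to the practical limit.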

I haven't fully tuned the TCP stacks yet; on Linux in particular, no tuning has been done at all.
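As a rough guide to what TCP tuning buys on a LAN: the socket buffers only need to cover the bandwidth-delay product to keep the pipe full. A minimal sketch, assuming a 0.2 ms round-trip time for a switched gigabit LAN (an assumption, not a measurement from this thread):

```python
# Back-of-the-envelope TCP buffer sizing via the bandwidth-delay product.
# The RTT is an assumed figure for a switched gigabit LAN.

link_bps = 1_000_000_000      # gigabit Ethernet
rtt_s = 0.0002                # ~0.2 ms round trip (assumed)

# A TCP window at least this large is needed to keep the link saturated
bdp_bytes = int(link_bps / 8 * rtt_s)

print(f"Bandwidth-delay product: {bdp_bytes} bytes (~{bdp_bytes // 1024} KB)")
```

On a LAN the product is small, so default socket buffers are often adequate; buffer tuning matters much more as the round-trip time grows.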

We'll be running RAID 10 on the iSCSI target so disk performance should not be the issue.

I don't have access to a switch with Jumbo Frame support at this time (I know I can directly connect two hosts, but that is not the scenario we are going to deploy). So what can you achieve without Jumbo Frames? Do Jumbo Frames really help with speed and CPU load?
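The arithmetic behind this question can be sketched: a bigger MTU barely improves payload efficiency (so raw throughput gains are small), but it cuts the number of frames, and with it the per-packet CPU work, by roughly 6x at line rate. A sketch using standard Ethernet framing constants, assuming IPv4/TCP with no options:

```python
# What Jumbo frames change on paper: per-frame overhead is fixed, so a
# larger MTU means fewer frames (and fewer per-packet CPU events) per
# gigabit. Standard Ethernet framing constants; IPv4/TCP, no options.

ETH_OVERHEAD = 14 + 4 + 8 + 12   # header + FCS + preamble + inter-frame gap
IP_TCP_HEADERS = 20 + 20

def efficiency(mtu):
    payload = mtu - IP_TCP_HEADERS       # TCP payload per frame
    on_wire = mtu + ETH_OVERHEAD         # bytes actually on the wire
    return payload / on_wire

def frames_per_sec(mtu, link_bps=1_000_000_000):
    return link_bps / 8 / (mtu + ETH_OVERHEAD)

for mtu in (1500, 9000):
    print(f"MTU {mtu}: {efficiency(mtu):.1%} payload efficiency, "
          f"{frames_per_sec(mtu):,.0f} frames/s at line rate")
```

Payload efficiency only climbs from about 95% to about 99%, which matches the small throughput gains reported in the replies below; the frame-rate reduction is where the CPU savings come from.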

Thanks

Chris
Guest

Fri Dec 10, 2004 5:18 pm

Chris,

I don't think anybody can give you exact numbers :-) From what I've seen in my tests, enabling Jumbo frames boosts throughput just a bit (maybe you'll be able to get around 950-970 megabits), but CPU usage drops noticeably.

About RAID mapping: of course you'll notice some performance degradation. It depends heavily on the tests and applications you run. Sustained transfers from a mapped RAID would not differ much (requests pipeline nicely), but occasional requests would see added latency...

Let me know when you have some time to tune the TCP stacks on both the Linux and Windows machines. We'll try to help you tune StarWind for your task (asynchronous vs. synchronous access, caching, etc.).

Thank you!
Chris wrote:I'm still doing tests to see if iSCSI performance is acceptable. I've set up a test lab with the following hardware [...]
anton (staff)
Site Admin
Posts: 4010
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands
Contact:

Fri Dec 10, 2004 5:20 pm

Last post was from me :-)
Regards,
Anton Kolomyeytsev

Chief Technology Officer & Chief Architect, StarWind Software

Val (staff)
Posts: 496
Joined: Tue Jun 29, 2004 8:38 pm

Fri Dec 10, 2004 5:46 pm

Hi Chris,

I've done some simple tests with Jumbo frames using our iSCSI initiator and target.

Client: StarPort on Pentium 2.4, 1GB, Intel 875 PBZ, Intel 1000 CT onboard
Server: StarWind on Pentium 2.8, 512MB, Intel 875 PBZ, Intel 1000 CT onboard
Direct point-to-point Ethernet connection

The TCP/IP stacks use default parameters (an NTttcp test showed 952 Mb/s with 9K Jumbo frames, 930 Mb/s without Jumbo frames).

Here are the results with different Jumbo frame settings:
(I used Iometer connected to a remote StarWind remdrive, 100 requests, R/W 50/50, for two request sizes: 256K and 64K)

(the first value is throughput in MB/s, the second is the client PC's processor load)

1) no Jumbo frames:
- 256K: 61.7, 16.7%
- 64K : 52.2, 17.4%

2) Jumbo - 4KB:
- 256K: 69.3, 12.1%
- 64K : 62.3, 13.2%

3) Jumbo - 9KB:
- 256K: 75.9, 9.9%
- 64K : 67.5, 10.5%

4) Jumbo - 16KB:
- 256K: 58.3, 7.8%
- 64K : 52.3, 8.5%

As you can see, the best results were reached using 9K Jumbo frames.
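The "best results" call can be double-checked by dividing throughput by CPU load, using the 256K-request numbers from the table above:

```python
# Throughput per percent of CPU, computed from the 256K-request figures
# in the table above. Higher is better.

results = {                 # MTU setting -> (MB/s, % CPU) at 256K requests
    "no jumbo": (61.7, 16.7),
    "4K":       (69.3, 12.1),
    "9K":       (75.9,  9.9),
    "16K":      (58.3,  7.8),
}

for setting, (mbs, cpu) in results.items():
    print(f"{setting:>8}: {mbs / cpu:.2f} MB/s per % CPU")
```

9K wins on both raw throughput and throughput per CPU percent; 16K is the cheapest per frame but loses so much raw throughput that it ends up behind 9K on both counts.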
Best regards,
Valeriy
Chris

Fri Dec 10, 2004 6:28 pm

Thanks a lot.

So it seems Jumbo Frames do not help much with speed, but make quite a difference for CPU load.
anton (staff)
Site Admin
Posts: 4010
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands
Contact:

Sat Dec 11, 2004 8:15 am

Correct :-) I have slightly different hardware from what Valeriy has, and I also got the best results (Xfer / % CPU) with 9K Jumbo frames. 16K ones seem to be very expensive and kind of overkill :-)
Chris wrote:Thanks a lot.

So it seems Jumbo Frames do not help much with speed, but make quite a difference for CPU load.
Regards,
Anton Kolomyeytsev

Chief Technology Officer & Chief Architect, StarWind Software
