AoE performance slowww

Initiator (iSCSI, FCoE, AoE, iSER and NVMe over Fabrics), iSCSI accelerator and RAM disk

Moderators: anton (staff), art (staff), Max (staff), Anatoly (staff)

cdevidal
Posts: 7
Joined: Sat Aug 02, 2008 11:07 pm

Wed Aug 20, 2008 1:59 pm

Problem
I am averaging 1.5MB/sec. backing up data from my AoE setup. Sometimes I can get up to 4MB/sec., but that's rare. Only ONE client is pulling the files, there is zero multiuser access going on. Anyone have any thoughts?

Target
Software: vblade 18 OR kvblade alpha-3 (same results)
CPU: 1.7GHz P4
Memory: 256MB (system only uses 23MB)
OS: CentOS 4.6, current updates and kernel
Network: 32-bit Gigabit Broadcom card
Disk: Single 750GB PATA

Network
Switch: US Robotics Gigabit
Cables: Cat 5e

Initiator
Software: StarPort 3.6 Build 0x20080410
CPU: Dual Xeon 3.4GHz
Memory: 3GB
OS: Windows 2003 SP2, current updates and drivers
Network: 64-bit Gigabit Broadcom (I think? Built into the motherboard, which is a relatively new HP DL380 G4, pretty fast)
Partition: NTFS
Files: Usually megabytes in size, rarely small
Backup program: BakBone's NetVault

The bottleneck could be the target's NIC but I don't think that's the case; I can pull a 1GB file using netcat+cpio at an average of 10MB/sec., even when the backup is running over AoE at the same time!

I've noticed that often the backup will stall for several seconds for no apparent reason. But even when it's going at full steam the speed hovers at around 2.5MB/sec. (with the stalls, I average 1.5MB/sec.) This same backup software can back up files directly from the initiator's local hard drive at around 10MB/sec.
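A rough back-of-the-envelope check of those two figures, as a sketch only: it assumes no data at all moves during a stall, and the helper name `stall_fraction` is made up for illustration. If the backup runs at about 2.5MB/sec. while moving and averages 1.5MB/sec. overall, the share of wall-clock time spent stalled can be estimated like this:

```python
def stall_fraction(avg_rate_mb_s, running_rate_mb_s):
    """Estimate the fraction of wall-clock time spent stalled.

    Assumes throughput is running_rate_mb_s while data is moving and
    exactly zero during a stall (a simplification, not a measurement).
    """
    if running_rate_mb_s <= 0:
        raise ValueError("running rate must be positive")
    return 1.0 - avg_rate_mb_s / running_rate_mb_s


# Figures quoted in the post: 1.5MB/sec. average, ~2.5MB/sec. while running.
print(f"stalled roughly {stall_fraction(1.5, 2.5):.0%} of the time")
```

Under that assumption the backup would be sitting idle for about 40% of the run, so the stalls matter nearly as much as the low running speed.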

At the moment I'm using kvblade on the target; the load average (as reported by uptime) is less than 0.10 and kvblade itself uses less than 3% of the CPU.

Thoughts?
Last edited by cdevidal on Wed Aug 20, 2008 2:12 pm, edited 1 time in total.
-- Chris de Vidal

You're a good person? Yeah, right ;-)
Prove it: TenThousandDollarOffer.com
anton (staff)
Site Admin
Posts: 4008
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands

Wed Aug 20, 2008 2:10 pm

You should get wire speed. Enable Jumbo frames, and make sure you have the latest firmware flashed into your CoRAID box. Do SMB or iSCSI over the same hardware work well?
Regards,
Anton Kolomyeytsev

Chief Technology Officer & Chief Architect, StarWind Software

cdevidal
Posts: 7
Joined: Sat Aug 02, 2008 11:07 pm

Wed Aug 20, 2008 2:18 pm

anton (staff) wrote: You should get wire speed. Enable Jumbo frames
The switch doesn't support jumbo :-(

anton (staff) wrote: make sure you have the latest firmware flashed into your CoRAID box.
It's a Linux kvblade or vblade setup (both do the same thing).

anton (staff) wrote: Do SMB or iSCSI over the same hardware work well?
I'm just about to test an iSCSI target with the StarPort initiator, will let you know. Will try SMB and also NetVault.
-- Chris de Vidal

anton (staff)
Site Admin
Posts: 4008
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands

Wed Aug 20, 2008 2:26 pm

1) No Jumbo, no performance.

2) AoE without dirt-cheap (per gigabyte) AoE hardware makes no sense.

3) Sure.
Regards,
Anton Kolomyeytsev

cdevidal
Posts: 7
Joined: Sat Aug 02, 2008 11:07 pm

Wed Aug 20, 2008 4:21 pm

anton (staff) wrote: 1) No Jumbo, no performance.
See my first post; I'm able to get 10MB/sec with netcat+cpio even while backups are running, so I don't think Jumbo is the limiting factor. I would be perfectly happy with 10MB/sec. I guess I could try NIC->NIC and enable Jumbo that way, but I don't think that is what is limiting us.

anton (staff) wrote: 2) AoE without dirt-cheap (per gigabyte) AoE hardware makes no sense.
We're extremely tight on our budget and from what I read on Google, kvblade seemed to be good enough.

I'm not looking for the ultimate in speed but 1.5MB/sec. is ridiculous. Even 5MB/sec. would be sufficient.

anton (staff) wrote: 3) Sure.
The iSCSI test (IET v0.4.5, StarPort initiator) is giving me 5-6MB/sec, which is good enough.

I just found a page on AoE optimization, I'll try these, or maybe I'll just use iSCSI.
-- Chris de Vidal

anton (staff)
Site Admin
Posts: 4008
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands

Wed Aug 20, 2008 6:44 pm

Sounds like a latency issue to me. Check the NIC properties for the number of collisions it reports.
Regards,
Anton Kolomyeytsev

cdevidal
Posts: 7
Joined: Sat Aug 02, 2008 11:07 pm

Sat Aug 23, 2008 11:59 am

anton (staff) wrote: Sounds like a latency issue to me. Check the NIC properties for the number of collisions it reports.
No collisions, according to Linux. Not sure how to find that in Windows.

Do you say that because backups are pausing (which is still happening) or because AoE was so slow? Because iSCSI is three to five times faster, so I really don't think collisions would be the cause of the AoE slowness; if that were true, iSCSI would be slow, too.

But iSCSI backups are still pausing at times for no apparent reason. It is probably because the tape drives are shoe-shining; that is, they don't get data fast enough to keep the tape streaming at their minimum native speed, so they must pause, wait for the buffers to fill, and start again. I'll have to do some test backups to a virtual library to confirm this; virtual libraries don't suffer from shoe-shining.


UPDATE
The virtual library did stop and start, and there were significant variations in speed (from 2 to 30MB/sec.), even within individual files. I don't know what's up with that; perhaps the initiator server was busy handling requests? Its CPU isn't maxed out... and the connection between the iSCSI target and the initiator is just a single network cable, no network switch is involved. So there isn't network traffic.

So I don't have an answer for why backups pause mid-stream, but hey, 6-10MB/sec. is much better than before and it's good enough for our backups.
-- Chris de Vidal

anton (staff)
Site Admin
Posts: 4008
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands

Sat Aug 23, 2008 3:59 pm

Well... In your place, instead of trying to find out what's broken, I'd invest in a pair of cheap GbE NICs and a good Cat 5e crossover cable. With the switch out of the path you could enable Jumbo frames and stop wasting 1/3 of the bandwidth on AoE packing only two 512-byte sectors into a single 1500-byte Ethernet frame (Jumbo frames have much better efficiency), and you'd also get lower latency for both AoE and iSCSI.
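The "1/3 of the bandwidth" figure can be checked with a little arithmetic. The sketch below is an illustration only: the 22-byte figure for the AoE plus ATA headers carried inside the Ethernet payload is my understanding of the AoE specification and should be treated as an assumption, the 9000-byte Jumbo MTU is just a common value, and the function names are hypothetical. The sectors-per-frame result for a 1500-byte MTU is two for any header of 0 to 476 bytes, so it does not depend on that assumption.

```python
SECTOR_BYTES = 512
AOE_HEADER_BYTES = 22  # assumed: AoE common header + ATA command header


def aoe_data_bytes(mtu, header=AOE_HEADER_BYTES):
    """Whole 512-byte sectors that fit in one frame, expressed in bytes."""
    sectors = (mtu - header) // SECTOR_BYTES
    return sectors * SECTOR_BYTES


def aoe_efficiency(mtu, header=AOE_HEADER_BYTES):
    """Fraction of the Ethernet payload (MTU) that carries disk data."""
    return aoe_data_bytes(mtu, header) / mtu


for mtu in (1500, 9000):  # standard frame vs. a common Jumbo frame size
    print(f"MTU {mtu}: {aoe_data_bytes(mtu)} data bytes, "
          f"{aoe_efficiency(mtu):.0%} of the payload used")
# MTU 1500: 1024 data bytes, 68% of the payload used
# MTU 9000: 8704 data bytes, 97% of the payload used
```

With standard frames only 1024 of the 1500 payload bytes carry sector data, which is where the roughly one-third loss comes from; this ignores Ethernet framing overhead and says nothing about latency.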
Regards,
Anton Kolomyeytsev

cdevidal
Posts: 7
Joined: Sat Aug 02, 2008 11:07 pm

Thu Aug 28, 2008 2:39 pm

anton (staff) wrote: Well... In your place, instead of trying to find out what's broken, I'd invest in a pair of cheap GbE NICs and a good Cat 5e crossover cable.
Actually, I just switched to that with the same performance. But I'll try to enable Jumbo; as of right now it is broken on the initiator server.
-- Chris de Vidal

aaron (staff)
Posts: 70
Joined: Fri Jan 11, 2008 6:13 am
Location: BVI

Fri Aug 29, 2008 12:36 am

Updating hardware is not enough. You also need to apply some tuning to make TCP work fast with iSCSI: Nagle's algorithm, delayed ACK, etc.
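For readers unfamiliar with the Nagle setting, here is a minimal sketch of what turning it off looks like at the socket level, using Python's standard `socket` module. It is not how StarPort or IET are configured (those have their own settings, and delayed ACK is normally an OS-level parameter that is not shown here); the helper name `make_low_latency_socket` is hypothetical.

```python
import socket


def make_low_latency_socket():
    """Create a TCP socket with Nagle's algorithm disabled.

    With TCP_NODELAY set, small writes are sent immediately instead of
    being held back to coalesce with later data, which helps
    request/response protocols where each command waits on a reply.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


sock = make_low_latency_socket()
# A non-zero value means Nagle's algorithm is off for this socket.
print(sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0)
sock.close()
```

The option only affects the socket it is set on, so an initiator or target would have to set it (or expose an equivalent setting) on its own connections.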
Regards,
Aaron Korfer

Sales & Support
Rocket Division Software