Degrading performance

Public beta (bugs, reports, suggestions, features and requests)

Moderators: anton (staff), art (staff), Max (staff), Anatoly (staff)

Delo123
Posts: 30
Joined: Wed Feb 12, 2014 5:15 pm

Wed Feb 12, 2014 9:18 pm

Doing my first tests with Beta 3 and StarWind in general.

When copying files onto a StarWind LUN I tend to see massive drops in bandwidth during the file copy.
Sometimes I get rates around 200 MB/s, and the next time I copy it will start around 10 MB/s and sometimes drop to 0 KB/s for a few seconds.
The storage is exclusively for StarWind; nothing else is running on it.
The link is 4 Gbit (4x1 Gbit) using MS MPIO, connecting to 4 target IPs on StarWind, so 16 sessions in total.
Iometer can max out the 4 Gbit/s for a sustained period.

Any ideas before getting into details?
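For reference, a 4x4 multipath layout like this can be brought up with the Microsoft iSCSI initiator cmdlets roughly as follows (a minimal sketch, not the exact commands used here; the addresses are made up and the MPIO feature with the Microsoft DSM is assumed to be installed):

# Sketch: one iSCSI session per initiator/target portal pair (4 x 4 = 16 sessions).
$initiatorPortals = '10.0.1.1','10.0.2.1','10.0.3.1','10.0.4.1'     # made-up addresses
$targetPortals    = '10.0.1.10','10.0.2.10','10.0.3.10','10.0.4.10'

# Let the Microsoft DSM claim iSCSI disks for multipathing.
Enable-MSDSMAutomaticClaim -BusType iSCSI

foreach ($tp in $targetPortals) {
    New-IscsiTargetPortal -TargetPortalAddress $tp | Out-Null
}
foreach ($ip in $initiatorPortals) {
    foreach ($tp in $targetPortals) {
        Get-IscsiTarget |
            Connect-IscsiTarget -IsMultipathEnabled $true -IsPersistent $true `
                -InitiatorPortalAddress $ip -TargetPortalAddress $tp | Out-Null
    }
}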
Attachments: 19.jpg, 200.jpg, Iometer.jpg
anton (staff)
Site Admin
Posts: 4008
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands

Wed Feb 12, 2014 9:57 pm

There can be MILLIONS of reasons. The file system cache on the client is one of the most obvious ones. To proceed we need to know:

1) StarWind software configuration (what storage did you use? FLAT vs. LSFS? LU size? How much RAM is allocated for write-back cache?)

2) A Task Manager video (or at least check the numbers yourself) of the reported memory used as a file cache (a quick way to check this is sketched below)

Please provide at least 1) so we can proceed further. Thanks!
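If a screenshot is inconvenient, a quick way to read the file-cache numbers is via the performance counters (a sketch; note that \Memory\Cache Bytes is the active system file cache working set, while Task Manager's "Cached" figure also includes the standby lists):

# Report the file-cache related memory counters in GB.
Get-Counter '\Memory\Cache Bytes','\Memory\Standby Cache Normal Priority Bytes' |
    Select-Object -ExpandProperty CounterSamples |
    Select-Object Path, @{ n = 'GB'; e = { [math]::Round($_.CookedValue / 1GB, 2) } }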
Regards,
Anton Kolomyeytsev

Chief Technology Officer & Chief Architect, StarWind Software

Delo123
Posts: 30
Joined: Wed Feb 12, 2014 5:15 pm

Wed Feb 12, 2014 11:43 pm

Hi Anton,

Thanks.
I created 3 LUNs, all thin provisioned:
1. 150 GB LUN without dedupe, LSFS, 1 GB write-back cache
2. 4 TB LUN with dedupe, LSFS, 4 GB write-back cache
3. 250 GB LUN without dedupe, 4 GB write-back cache

The storage source for LUNs 1 and 2 is a local Adaptec 71605Q: 6 RAID 5 arrays of 18 TB each (4 TB disks), so 36 disks in total.
The storage source for LUN 3 is an iSCSI SAN connected with MS iSCSI. No "real" RAID since it is a 20 TB flash device.
The guest seems to have enough memory free (3 GB RAM reported used when the drop occurred this time).
The StarWind server has 256 GB of memory with 33 GB currently used.
I see the same "drops" on all 3 LUNs.

Actually, we are looking to possibly replace DataCore with this system and are trying to find the optimal configuration. We are currently limited to 4 Gbit/s since the existing environment is 100% FC, and I am waiting for 10 Gbit/s NICs and switches to arrive. But I wonder anyway...
I am willing to change every setting / setup to reach better numbers, as this is a POC for StarWind.

PS: Is it possible to change the write-back cache after the volume has been created?
Attachments: filecopy.jpg
anton (staff)
Site Admin
Posts: 4008
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands

Wed Feb 12, 2014 11:47 pm

OK, I'll kick the LSFS R&D team tomorrow to look @ your issue. Is there any chance you'd try FLAT images with the same amount of cache allocated? Thanks!
Regards,
Anton Kolomyeytsev

Chief Technology Officer & Chief Architect, StarWind Software

Delo123
Posts: 30
Joined: Wed Feb 12, 2014 5:15 pm

Wed Feb 12, 2014 11:49 pm

A file copy from LUN 3 (flash) to LUN 1 or LUN 2 (disk) on the guest is even worse, not getting over 50 MB/s,
although both arrays should be able to do much more than that.

PS: The flash LUN is a native LUN from the array. LUNs 1 and 2 are created from thick-provisioned vdisks on a Windows storage pool (simple/stripe), roughly the setup sketched below.
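For reference, a simple (striped) thick-provisioned backing vdisk like that can be carved out of a Windows storage pool along these lines (a sketch with made-up pool/disk names and size, not the exact commands used here):

# Sketch: build a pool from all poolable disks and create a fixed, striped vdisk on it.
$poolDisks = Get-PhysicalDisk -CanPool $true
New-StoragePool -FriendlyName 'StarWindPool' `
    -StorageSubSystemFriendlyName (Get-StorageSubSystem)[0].FriendlyName `
    -PhysicalDisks $poolDisks

New-VirtualDisk -StoragePoolFriendlyName 'StarWindPool' -FriendlyName 'LUN1-Backing' `
    -ResiliencySettingName Simple -ProvisioningType Fixed -Size 150GB

# Then initialize, partition and format the new disk as usual:
Get-VirtualDisk -FriendlyName 'LUN1-Backing' | Get-Disk |
    Initialize-Disk -PartitionStyle GPT -PassThru |
    New-Partition -UseMaximumSize -AssignDriveLetter |
    Format-Volume -FileSystem NTFS -Confirm:$false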
Attachments: Copylun3Lun1.jpg
Delo123
Posts: 30
Joined: Wed Feb 12, 2014 5:15 pm

Thu Feb 13, 2014 12:02 am

I have now created 2 new LUNs (thick provisioned).
Copying to the "disk" LUN seems much better. However, I still see a drop after about 40%, but overall it is pretty consistent at 220 MB/s.
Copying to "flash" is weird. It also starts out at 220 MB/s, but after 59% the speed heads toward 0 KB/s and just seems to hang there...

PS: I added a 3rd screenshot, where you can actually see the "hanging".
Attachments: Filecopy_FlattoFlashhang.jpg, Filecopy_FlattoFlash59hang.jpg (here you can see the speed dropping when copying to flash), Filecopy_FlattoDisk.jpg
Delo123
Posts: 30
Joined: Wed Feb 12, 2014 5:15 pm

Thu Feb 13, 2014 12:30 am

I just found the TcpAckFrequency setting in one of your posts.
I will set this for all iSCSI NICs and let you know in the morning if it changes something...
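For anyone following along, that is this registry value; a rough sketch of applying it (this version sets it on every TCP/IP interface key, whereas in practice you would limit it to the interface GUIDs of the iSCSI NICs):

# Disable delayed ACK (TcpAckFrequency = 1) on the TCP/IP interface keys.
$base = 'HKLM:\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\Interfaces'
Get-ChildItem $base | ForEach-Object {
    New-ItemProperty -Path $_.PSPath -Name TcpAckFrequency -Value 1 `
        -PropertyType DWord -Force | Out-Null
}
# The NICs (or the host) need to be restarted before the change takes effect.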
Delo123
Posts: 30
Joined: Wed Feb 12, 2014 5:15 pm

Thu Feb 13, 2014 1:12 am

Copying to the "disk" LUN now works without a dip;
however, copying to the "flash" LUN starts dropping again around 57% as before.
On the other hand, I am getting 80K IOPS now on the flash LUN (40K before).
Still not near the 250K IOPS I get on the StarWind host side, but OK, it's only a 4 Gbit link...

I will do some more tests tomorrow.
Attachments: flash.jpg, disk.jpg
Delo123
Posts: 30
Joined: Wed Feb 12, 2014 5:15 pm

Thu Feb 13, 2014 12:39 pm

Seeing some drops after all anyway...
However, I just checked the switch and am seeing output drops there.

Last clearing of "show interface" counters never
Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 316350
Queueing strategy: fifo
Output queue: 0/40 (size/max)
5 minute input rate 110834000 bits/sec, 1599 packets/sec
5 minute output rate 619000 bits/sec, 1098 packets/sec
38201875 packets input, 96968650787 bytes, 0 no buffer
Received 1109 broadcasts (1049 multicasts)
0 runts, 0 giants, 0 throttles
0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
0 watchdog, 1049 multicast, 0 pause input
0 input packets with dribble condition detected
62671944 packets output, 99101848772 bytes, 0 underruns
0 output errors, 0 collisions, 1 interface resets
0 babbles, 0 late collision, 0 deferred
0 lost carrier, 0 no carrier, 0 PAUSE output
0 output buffer failures, 0 output buffers swapped out

It's a Cisco 2960, and for some reason it doesn't seem to support send flow control, only receive. Quite stupid if you ask me.
The receive side is disabled, and it is also disabled on all NICs. Could this explain it? I guess I then have to wait for the 10 Gbit NICs to compare.
Do you agree?
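In case it is useful to anyone, the NIC-side flow control can usually be checked and flipped from PowerShell as well (a sketch; the adapter names are made up, the "Flow Control" display name and the accepted values vary by driver, and the switch side still has to be configured separately):

# Inspect and enable flow control on the iSCSI NICs (driver-dependent property name/values).
Get-NetAdapterAdvancedProperty -Name 'iSCSI*' -DisplayName 'Flow Control'
Set-NetAdapterAdvancedProperty -Name 'iSCSI*' -DisplayName 'Flow Control' `
    -DisplayValue 'Rx & Tx Enabled'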
Attachments: 112.jpg
Delo123
Posts: 30
Joined: Wed Feb 12, 2014 5:15 pm

Fri Feb 14, 2014 9:46 am

Anton,

I created the LSFS LUNs on Storage Spaces virtual disks. In the RC post I just read that Storage Spaces isn't supported/working.
May I ask what exactly shouldn't work? Apart from performance "issues", which are probably network related on my side, it does seem to be "working".
Delo123
Posts: 30
Joined: Wed Feb 12, 2014 5:15 pm

Fri Feb 14, 2014 10:08 am

Maybe an issue anyway:

What I am trying to achieve:
- Created a 108 TB virtual disk from 6x18 TB Adaptec LUNs.
- In StarWind I create 10x10 TB LUNs with deduplication enabled and offer these LUNs to the guest.
- On the guest I try to either create a dynamic disk out of these 10 LUNs or add them to a storage pool. Both fail (the step is sketched below) with this Windows error:
The IO operation at logical block address \Device\MPIODisk6 for Disk (PDO name: ) was retried.

You can download the StarWind logs from here to have a look:
http://share.greenpower.nl/share/i7qfzv ... q177ehyt1p

Is this due to Storage Spaces on the StarWind side?
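For completeness, the guest-side step that fails boils down to roughly this (a sketch; the pool name is made up):

# Pool the MPIO-presented StarWind LUNs on the guest; the "was retried" MPIO events
# quoted above show up around this step.
$luns = Get-PhysicalDisk -CanPool $true | Where-Object BusType -eq 'iSCSI'
New-StoragePool -FriendlyName 'GuestPool' `
    -StorageSubSystemFriendlyName (Get-StorageSubSystem)[0].FriendlyName `
    -PhysicalDisks $luns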
anton (staff)
Site Admin
Posts: 4008
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands

Mon Feb 17, 2014 6:49 pm

Yes, that's a known issue. We don't work well on top of Storage Spaces (yet).
Delo123 wrote: Maybe an issue anyway: [...] Is this due to Storage Spaces on the StarWind side?
Regards,
Anton Kolomyeytsev

Chief Technology Officer & Chief Architect, StarWind Software

AKnowles
Posts: 27
Joined: Sat Feb 15, 2014 6:18 am

Tue Feb 18, 2014 4:06 am

Just an FYI regarding Storage Spaces... I've done some testing using Storage Spaces vs. the standard (Computer Management) mirrored/striped options. What I found was that the native method was always faster than Storage Spaces. This is without using flash-based caching. I personally decided that Storage Spaces isn't quite there yet and doesn't even come close to using a hardware RAID controller anyway. Nor does ReFS do much for you, as it doesn't support NFS shares among other things. Maybe the next generation of both Storage Spaces and ReFS will do better, but as of 2012 R2 I'm not seeing much use for them.
anton (staff)
Site Admin
Posts: 4008
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands

Tue Feb 18, 2014 10:35 am

You're not alone with this conclusion :)
AKnowles wrote: Just an FYI regarding Storage Spaces... [...] as of 2012 R2 I'm not seeing much use for them.
Regards,
Anton Kolomyeytsev

Chief Technology Officer & Chief Architect, StarWind Software

Klas
Posts: 13
Joined: Mon Aug 27, 2012 10:49 am

Tue Feb 18, 2014 10:39 am

AKnowles wrote: Just an FYI regarding Storage Spaces... [...] as of 2012 R2 I'm not seeing much use for them.
Hello!

I think you should reconsider Storage Spaces. While waiting for StarWind 8 we have a new Hyper-V cluster running Storage Spaces. The setup is 8 x 900 GB SAS disks and one Intel 910 PCIe SSD as the cache and Tier 0 layer. We created one large LUN with a 10 GB write cache and the performance is "like" a PCIe SSD across all the disks.
Writes go to the SSD, and reads are usually served from Tier 0; if not, Tier 1 (the SAS disks) is mostly idling and therefore responsive. If you create the Hyper-V VHDX files using a logical block size of 4K, the performance in the VM is almost the same as on the LUN, and the LUN performance is almost that of the Intel PCIe SSD.

We have ~80 concurrent active users (825 user accounts) and are planning to run a file server, Exchange, a web server, RDS, DBs, AD, etc. on one Hyper-V cluster. Fast, resilient and low footprint.
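For anyone who wants to reproduce a setup like this, the tiered virtual disk plus the 4K-sector VHDX translate roughly to the following on 2012 R2 (a sketch; the pool name, tier sizes, resiliency setting and paths are made up and depend on the actual pool):

# Sketch: tiered Storage Spaces virtual disk with a 10 GB write-back cache.
$pool = 'TierPool'
$ssd  = New-StorageTier -StoragePoolFriendlyName $pool -FriendlyName 'SSDTier' -MediaType SSD
$hdd  = New-StorageTier -StoragePoolFriendlyName $pool -FriendlyName 'HDDTier' -MediaType HDD

New-VirtualDisk -StoragePoolFriendlyName $pool -FriendlyName 'TieredLUN' `
    -StorageTiers $ssd, $hdd -StorageTierSizes 600GB, 5TB `
    -ResiliencySettingName Simple -WriteCacheSize 10GB

# VHDX files with a 4K logical sector size for the guests:
New-VHD -Path 'D:\VMs\guest01.vhdx' -SizeBytes 200GB -Dynamic `
    -LogicalSectorSizeBytes 4096 -PhysicalSectorSizeBytes 4096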