Loopback Performance Issues (Where am I going wrong?)


verticalsoft
Posts: 14
Joined: Mon Oct 08, 2018 3:07 pm

Mon Oct 08, 2018 3:28 pm

I have some StarWind Virtual SAN performance issues and would love some pointers.

I have a 2-node setup with the following:

2 x Dell R630, each with:

2 x 1 GbE NICs (management)
4 x 10 GbE Intel X540 (all set up with jumbo frames and performance-checked with iperf3, getting approx. 9 Gbit/s; a re-check sketch follows this list)
All SAS SSD based:
Boot: 2 x SAS, mirrored
Data: 6 x 1.6 TB SAS SSD (RAID 5), 7.2 TB total
64 GB RAM
Dual Intel Xeon E5-2620 v3
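
For completeness, this is roughly how the 10 GbE links can be re-checked from cmd (the partner address 10.0.0.2 is just a placeholder for the other node):

netsh interface ipv4 show subinterfaces
:: the 10 GbE interfaces should report an MTU of 9000 (or the driver's jumbo value)
ping -f -l 8972 10.0.0.2
:: -f = don't fragment; 8972 = 9000 minus 28 bytes of IP/ICMP headers
iperf3 -c 10.0.0.2 -P 4 -t 30
:: roughly 9.4 Gbit/s aggregate is the practical ceiling for a single 10 GbE link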

The underlying storage performance is good, with a sustained sequential write of 2,500 MB/s and read of 3,500 MB/s (ATTO Disk Benchmark).

If I create a device (not HA just yet), connect it using the 127.0.0.1 loopback, bring it online, and perform some tests, the performance is horrible. Please note I've not set up High Availability yet.

I've tried various settings with thick and thin provisioning and various cache settings, and the performance is virtually always the same.

Am I doing something wrong here, or am I testing this incorrectly? Please note I've not even replicated yet, so no NIC interfaces are actually being tested. I'm quite familiar with StarWind and am looking for some suggestions.

Kind regards
David
Attachments
ATTO Benchmark — Starwind Test.jpg (165.43 KiB)
Michael.Blanchard2
Posts: 8
Joined: Mon Sep 17, 2018 9:17 pm

Mon Oct 08, 2018 5:39 pm

I had similar issues at first, and what fixed it for me was:

1. Disabling all the cache; at that speed, cache is almost useless anyway.
2. What is the underlying network speed for the replication network? You have a separate network just for replication, right? It's over 1 Gb, right?
3. iSCSI or SMB? Did you optimize the connection? Disable unneeded protocols? Disable Nagle? Enable jumbo frames? (A rough sketch of the Nagle tweak follows this list.)
4. I run InfiniBand (regretfully), and I eventually found I had a few settings wrong and a bad NIC that was affecting replication.
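
For point 3, here is a rough sketch of the Nagle tweak that is commonly recommended for iSCSI NICs, run from an elevated prompt. The {INTERFACE-GUID} is a placeholder for the GUID of your iSCSI NIC's interface key, and a reboot (or at least restarting the iSCSI sessions) is needed afterwards:

reg add "HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\Interfaces\{INTERFACE-GUID}" /v TcpAckFrequency /t REG_DWORD /d 1 /f
reg add "HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\Interfaces\{INTERFACE-GUID}" /v TcpNoDelay /t REG_DWORD /d 1 /f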
verticalsoft
Posts: 14
Joined: Mon Oct 08, 2018 3:07 pm

Mon Oct 08, 2018 7:15 pm

Michael.Blanchard2 wrote:I had similar issues at first, and what fixed it for me was:

1. Disabling all the cache; at that speed, cache is almost useless anyway.
2. What is the underlying network speed for the replication network? You have a separate network just for replication, right? It's over 1 Gb, right?
3. iSCSI or SMB? Did you optimize the connection? Disable unneeded protocols? Disable Nagle? Enable jumbo frames?
4. I run InfiniBand (regretfully), and I eventually found I had a few settings wrong and a bad NIC that was affecting replication.
Hello, and thanks for your reply.

I'm not even talking about replication yet; I'm talking about local StarWind storage using iSCSI loopback on 127.0.0.1. This shouldn't use any network resources, as it's not replicated yet.
I just don't get the performance loss.

There's no point replicating it yet, and when I do, it will use dedicated 10 GbE links.

Kind regards
Dave
verticalsoft
Posts: 14
Joined: Mon Oct 08, 2018 3:07 pm

Mon Oct 08, 2018 10:10 pm

Bump
verticalsoft
Posts: 14
Joined: Mon Oct 08, 2018 3:07 pm

Tue Oct 09, 2018 6:41 pm

Anyone?
Vitaliy (Staff)
Staff
Posts: 35
Joined: Wed Jul 27, 2016 9:55 am

Tue Oct 09, 2018 8:17 pm

Hello,
Could you please try the following:

1. Use the Microsoft DiskSPD utility for benchmarking:
https://gallery.technet.microsoft.com/D ... e-6cd2f223
Below are a few commands for cmd covering several patterns (the switches are summarized after these steps):
diskspd.exe -t6 -b4K -r -w0 -o64 -d60 -h -L -c50G D:\test.io > c:\test\4k_random_readSSD.txt
diskspd.exe -t6 -b4K -r -w100 -o64 -d60 -h -L -c50G D:\test.io > c:\test\4k_random_writeSSD.txt
diskspd.exe -t6 -b64K -r -w0 -o64 -d60 -h -L -c50G D:\test.io > c:\test\64k_random_readSSD.txt
diskspd.exe -t6 -si -b64K -w0 -o64 -d60 -h -L -c50G D:\test.io > c:\test\64k_seq_readSSD.txt
diskspd.exe -t6 -b64K -r -w100 -o64 -d60 -h -L -c50G D:\test.io > c:\test\64k_random_writeSSD.txt
diskspd.exe -t6 -si -b64K -w100 -o64 -d60 -h -L -c50G D:\test.io > c:\test\64k_seq_writeSSD.txt

The device size has to be more than 20 GB.
Avoid any caching on the StarWind device for now.

2. Connect the device in the iSCSI initiator twice via loopback (see the sketch after these steps).

3. For comparison, create an HA device and benchmark it as well.
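
For reference, this is my reading of the DiskSPD switches used above (the DiskSPD documentation is authoritative):

:: -t6   six worker threads per target       -b4K / -b64K   block size
:: -r    random I/O (replaced by -si for sequential with a shared interlocked offset)
:: -w0 / -w100   write percentage (0 = pure read, 100 = pure write)
:: -o64  64 outstanding I/Os per thread      -d60   run for 60 seconds
:: -h    disable software caching and hardware write caching
:: -L    collect latency statistics          -c50G  create a 50 GB test file

And a rough command-line sketch for step 2 (the target IQN below is only an example; the iSCSI Initiator GUI achieves the same thing, and MPIO with iSCSI support must already be enabled):

iscsicli QAddTargetPortal 127.0.0.1
iscsicli ListTargets
:: log in twice to the same target to get two loopback sessions for MPIO
iscsicli QLoginTarget iqn.2008-08.com.starwindsoftware:node1-target1
iscsicli QLoginTarget iqn.2008-08.com.starwindsoftware:node1-target1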
verticalsoft
Posts: 14
Joined: Mon Oct 08, 2018 3:07 pm

Tue Oct 09, 2018 8:20 pm

I shall try your benchmark suggestions and get back to you.

Kind regards
David
Oleg(staff)
Staff
Posts: 568
Joined: Fri Nov 24, 2017 7:52 am

Wed Oct 10, 2018 7:32 am

Thank you!
Yes, please share your results.
verticalsoft
Posts: 14
Joined: Mon Oct 08, 2018 3:07 pm

Wed Oct 10, 2018 10:05 am

Vitaliy (Staff) wrote:Hello,
Could you please try the following:

1. Use the Microsoft DiskSPD utility for benchmarking:
https://gallery.technet.microsoft.com/D ... e-6cd2f223
Below are a few commands for cmd covering several patterns:
diskspd.exe -t6 -b4K -r -w0 -o64 -d60 -h -L -c50G D:\test.io > c:\test\4k_random_readSSD.txt
diskspd.exe -t6 -b4K -r -w100 -o64 -d60 -h -L -c50G D:\test.io > c:\test\4k_random_writeSSD.txt
diskspd.exe -t6 -b64K -r -w0 -o64 -d60 -h -L -c50G D:\test.io > c:\test\64k_random_readSSD.txt
diskspd.exe -t6 -si -b64K -w0 -o64 -d60 -h -L -c50G D:\test.io > c:\test\64k_seq_readSSD.txt
diskspd.exe -t6 -b64K -r -w100 -o64 -d60 -h -L -c50G D:\test.io > c:\test\64k_random_writeSSD.txt
diskspd.exe -t6 -si -b64K -w100 -o64 -d60 -h -L -c50G D:\test.io > c:\test\64k_seq_writeSSD.txt

The device size has to be more than 20 GB.
Avoid any caching on the StarWind device for now.

2. Connect the device in the iSCSI initiator twice via loopback.

3. For comparison, create an HA device and benchmark it as well.

Hello Vitaliy,

I've attached all the logs as requested, each prefixed with US = underlying storage or SW = StarWind storage.
The SW config was a 100 GB thick-provisioned device, attached locally using 2 MPIO 127.0.0.1 connections, formatted as NTFS and mounted as drive Z:.

Does this help?
Attachments
Tests.zip (17.84 KiB)
Vitaliy (Staff)
Staff
Posts: 35
Joined: Wed Jul 27, 2016 9:55 am

Wed Oct 10, 2018 4:28 pm

Thank you for the report.
Could you please create a replica and test the same HA device?
We need this for a deeper understanding of the performance issue.

Also, below you can find the best-practice settings for the RAID controller:
RAID5 on SSD:
Disk cache policy: default/enabled;
Read policy: No Read Ahead;
Write Policy: Write Through;
Strip Size: 64 KB

Do they match your environment? (A quick way to read these settings back is sketched below.)
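
If it helps, on a Dell PERC these can usually be read back from the command line with something like the following (assuming perccli is installed and the controller index is 0; storcli uses the same syntax):

perccli64 /c0/vall show all
:: look for Read Ahead, Write policy, Strip Size and Disk Cache Policy on the virtual drive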
verticalsoft
Posts: 14
Joined: Mon Oct 08, 2018 3:07 pm

Wed Oct 10, 2018 5:03 pm

Vitaliy (Staff) wrote:Thank you for the report.
Could you please create a replica and test the same HA device?
We need this for a deeper understanding of the performance issue.

Also, below you can find the best-practice settings for the RAID controller:
RAID5 on SSD:
Disk cache policy: default/enabled;
Read policy: No Read Ahead;
Write Policy: Write Through;
Strip Size: 64 KB

Do they match your environment?

Hello Vitaliy,
RAID5 on SSD: yes, they match 100%.

I created an HA device and re-did the iSCSI setup as follows:

2 loopback connections only (no MPIO link to the replicated device), policy: Least Queue Depth
The results were virtually the same.

I then added the 3rd connection to the remote HA device via MPIO and attached the results. Performance did increase, but only because I could see the iSCSI traffic going to the remote device at circa 6 Gbps. Shouldn't the iSCSI be using the loopback, and if so, why is its performance so bad? (A sketch of how I check the session/path layout is below.)
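
This is roughly how I check the session/path layout from cmd (the disk number 1 is a placeholder for the StarWind LUN's MPIO disk number):

iscsicli SessionList
mpclaim -s -d
mpclaim -s -d 1
:: the path list shows the load-balance policy and which sessions (loopback vs. partner link) are active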
Attachments
Tests with HA.zip (9.29 KiB)
Vitaliy (Staff)
Staff
Posts: 35
Joined: Wed Jul 27, 2016 9:55 am

Wed Oct 10, 2018 6:55 pm

David,

Did you configure everything according to this guide?
https://www.starwindsoftware.com/resour ... erver-2016
verticalsoft
Posts: 14
Joined: Mon Oct 08, 2018 3:07 pm

Wed Oct 10, 2018 6:59 pm

Vitaliy (Staff) wrote:David,

Did you configure everything according to this guide?
https://www.starwindsoftware.com/resour ... erver-2016
100%. I've set this up several times before, and I use the same configuration for my customers...

Kind regards
David.

I'm just wondering if this is a bug with the current version?
Vitaliy (Staff)
Staff
Posts: 35
Joined: Wed Jul 27, 2016 9:55 am

Thu Oct 11, 2018 2:29 pm

verticalsoft wrote:
100%. I've set this up several times before, and I use the same configuration for my customers...

I'm just wondering if this is a bug with the current version?
1. And your customers have good performance, right?
2. No, we have not noticed this performance issue with the current build.
3. What SSD disks do you have?
verticalsoft
Posts: 14
Joined: Mon Oct 08, 2018 3:07 pm

Thu Oct 11, 2018 2:39 pm

Vitaliy (Staff) wrote:
verticalsoft wrote:
100%. I've set this up several times before, and I use the same configuration for my customers...

I'm just wondering if this is a bug with the current version?
1. And your customers have good performance, right?
2. No, we have not noticed this performance issue with the current build.
3. What SSD disks do you have?
Hello,

6 x 1.6 TB SAS enterprise SSDs (HGST) in each node!
The underlying storage works beautifully; I'm just not seeing the fast path on the iSCSI loopback!

Why would iSCSI use the remote replica and not its local storage?

I need to test at my customers' sites, but they are on an older version and it's in production, so I can't really mess around too much.

Kind regards
Dave