Poor performance

Software-based VM-centric and flash-friendly VM storage + free version


jdeshin
Posts: 63
Joined: Tue Sep 08, 2020 11:34 am

Fri Oct 01, 2021 6:33 am

Hello All!
According to the StarWind performance benchmarking best practices
https://www.starwindsoftware.com/best-p ... practices/
I've created a RAM disk without any replication, etc. Then I've run performance tests with the parameters given in:

https://www.starwindsoftware.com/blog/s ... ce-testing
and
https://www.starwindsoftware.com/cache- ... erformance

for 4K random read, with the same diskspd parameters as you published:

-r -F10 -o20 -c50G -d30
-r -b 4 -t 2 -o 32 -w 0 -d 0
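
For reference, the first parameter set assembles into a full command roughly like this (the target file path is my own, and -b4K/-w0 are implied by "4K random read"):

# 4K random read: 10 threads in total (-F10), 20 outstanding I/Os per thread (-o20),
# auto-created 50 GB test file (-c50G), 30-second run (-d30)
diskspd.exe -r -b4K -w0 -F10 -o20 -c50G -d30 C:\test\testfile.dat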

My results are:

| MiB/s  | IOPS   | latency (ms) |
| 937.73 | 240059 | 0.8          |
| 533.30 | 136523 | 0.4          |

and your published results are:

| MiB/s | IOPS    | latency (ms) |
| 1560  | 399499  | 0.5          |
| ~2200 | ~550000 | ~0.68        |

As you can see, your results are considerably better. What should I configure to get "the same" results?

Best regards,
Yury
yaroslav (staff)
Staff
Posts: 2361
Joined: Mon Nov 18, 2019 11:11 am

Fri Oct 01, 2021 1:52 pm

Hi,

I'd like to mention that Storage Spaces is a software RAID whose performance may degrade over time. Please try a hardware RAID for your storage system instead of a software one.
Did you benchmark the underlying storage performance? If so, please share the results.
jdeshin
Posts: 63
Joined: Tue Sep 08, 2020 11:34 am

Fri Oct 01, 2021 3:28 pm

Hi Yaroslav,
I've run the performance test on the same volume that holds the StarWind image (using its free space). The results are below:
-c50G -r -b4K -d30 -w0 -F10 -o20 -D -L -h
| MiB/s | IOPS | latency (ms) |
| 1944.49 | 497789.71 | 0.401 |

-c50G -r -b4K -d30 -w0 -t2 -o32 -D -L -h
| 819.97 | 209912.80 | 0.304 |

I've also measured the RDMA traffic on the sync channels and on the iSCSI initiators.
The bytes/sec value does not exceed roughly 450 MB/s and is the same for RDMA and the iSCSI target(s).
Additionally, when I run the test simultaneously on two different volumes that belong to different storage pools, the total traffic stays roughly the same.
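
(For anyone repeating this: one way to sample such traffic is via performance counters. Counter set and instance names can vary per system, so treat these as examples:)

# Sample RDMA and per-adapter throughput once per second for 30 seconds
Get-Counter -Counter @(
    '\RDMA Activity(*)\RDMA Inbound Bytes/sec',
    '\RDMA Activity(*)\RDMA Outbound Bytes/sec',
    '\Network Adapter(*)\Bytes Total/sec'
) -SampleInterval 1 -MaxSamples 30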

Best regards,
Yury
jdeshin
Posts: 63
Joined: Tue Sep 08, 2020 11:34 am

Fri Oct 01, 2021 4:11 pm

Dear Yaroslav,
My first results were for a StarWind RAM disk, not Storage Spaces.

Best regards,
Yury
yaroslav (staff)
Staff
Posts: 2361
Joined: Mon Nov 18, 2019 11:11 am

Fri Oct 01, 2021 10:29 pm

Hi,

It seems that iSCSI itself is the bottleneck here. Try populating the initiator with 5 local and 5 partner iSCSI sessions.
Also, try connecting the local IP to the local IP. Say, if your iSCSI IP address is 172.16.10.10, connect 172.16.10.10 to 172.16.10.10 in the iSCSI Initiator.
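
If you prefer scripting it, here is a sketch using the built-in MSiSCSI PowerShell cmdlets (the IP address and the target IQN are examples; substitute your own):

# Register the local portal and connect 5 sessions from the local IP to itself
New-IscsiTargetPortal -TargetPortalAddress 172.16.10.10 -InitiatorPortalAddress 172.16.10.10
1..5 | ForEach-Object {
    Connect-IscsiTarget -NodeAddress 'iqn.2008-08.com.starwindsoftware:target1' `
        -TargetPortalAddress 172.16.10.10 -InitiatorPortalAddress 172.16.10.10 `
        -IsMultipathEnabled $true -IsPersistent $true
}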
jdeshin
Posts: 63
Joined: Tue Sep 08, 2020 11:34 am

Sat Oct 02, 2021 4:49 pm

Hi Yaroslav!
OK, and how can I troubleshoot this case?
One of the nodes has a high CPU load when I work with the StarWind disk. I suspect that your loopback accelerator does not work properly on that node.
On the "good" nodes, when I add a connection, I get a list of IP addresses:
good.png
and on "poor" node only 127.0.0.1
bad.png
What should I do?
How can I check that the accelerator driver is working properly?
Do you have any registry keys, files, etc. to check?
Could your advanced support staff help me?

Best regards,
Yury
yaroslav (staff)
Staff
Posts: 2361
Joined: Mon Nov 18, 2019 11:11 am

Tue Oct 05, 2021 4:57 am

Please provide me with a screenshot of the Discovery tab. This could be an MS iSCSI Initiator GUI glitch.
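
As for checking the driver: a generic way is to query the loaded system drivers. The '*StarWind*' match below is an assumption; use the actual driver name from your installation:

# List StarWind-related kernel drivers and their current state
Get-CimInstance Win32_SystemDriver |
    Where-Object { $_.DisplayName -like '*StarWind*' } |
    Select-Object Name, DisplayName, State, StartMode, PathName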
jdeshin
Posts: 63
Joined: Tue Sep 08, 2020 11:34 am

Tue Oct 05, 2021 1:44 pm

I have reinstalled the OS and the StarWind software on the server. This time the CPU load is the same, but I still have an open question:
How can I determine that the local accelerator driver works properly? It's a bad idea to reinstall software instead of diagnosing the source of the problem :)

Best regards,
Yury
yaroslav (staff)
Staff
Posts: 2361
Joined: Mon Nov 18, 2019 11:11 am

Thu Oct 07, 2021 9:56 am

CPU utilization depends on the synthetic test parameters: when the threads parameter is configured, the CPU is utilized to the maximum possible extent for the number of threads you specified. Did you try connecting local-to-local iSCSI sessions (e.g., 172.16.10.1 to 172.16.10.1)?
jdeshin
Posts: 63
Joined: Tue Sep 08, 2020 11:34 am

Fri Oct 08, 2021 9:23 am

Dear Yaroslav,
I've made many tests in various combinations :) and below are my final results:

I used a three-node HA configuration of StarWind vSAN Free with a volume size of 1 TB (a 3 TB volume cannot be created via PowerShell) on Microsoft Hyper-V Server 2019.
Each node has one dual-port Mellanox ConnectX-6 100 Gbit/s adapter for sync, one dual-port Mellanox ConnectX-4 25 Gbit/s adapter for iSCSI/heartbeat, one dual-port Mellanox ConnectX-4 25 Gbit/s adapter for cluster and management in an LACP team, and one dual-port Marvell 4xxx 10 Gbit/s adapter for the VMs' external network.
A simple volume in Storage Spaces containing 8x480 GB Micron 5200 MAX disks was used as the storage for the StarWind images on each server.
Each server has 512 GB of RAM and 2x AMD 7543 processors (2.8 GHz, 32 cores) with HT (SMT) off, configured for maximum performance.

1. If I use the 127.0.0.1 address, I am limited to about 40,000 IOPS of 4K random write.

2. If I use the 10.1.X.Y IP addresses, the diskspd results are better:
~90,000 IOPS 4K write, ~150,000 IOPS 4K read, and ~1,800 MB/s 128K sequential write. CPU consumption is about 25% max on the 2x AMD 7543 (32-core, 2.8 GHz) processors.
At the same time, the underlying storage (8x480 GB Micron 5200 MAX, simple volume) delivers ~450,000 IOPS 4K write, ~300,000 IOPS 4K read, and ~3,500 MB/s 128K random write.
Below are the results for the disk queue inside the VM and on the underlying storage:
VM.png
X.png
As you can see, the disk queue inside the VM is about 197 while, at the same time, the disk queue on the underlying storage is about 9. So, I think that StarWind is the bottleneck.
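
(The queue depths in the screenshots are Performance Monitor values; the same counter can be sampled from PowerShell, run once inside the VM and once on the host:)

# Current queue depth of each physical disk, sampled once per second for 30 seconds
Get-Counter '\PhysicalDisk(*)\Current Disk Queue Length' -SampleInterval 1 -MaxSamples 30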

When I try to copy a file inside my VM, I get the following results:
copy_file.png
If I use the 127.0.0.1 address, the results are better, with an average throughput of about 150 MB/s.

I hope that my results will be helpful for anyone trying to find information about StarWind performance, and will help you make your product better.

Best regards,
Yury
yaroslav (staff)
Staff
Posts: 2361
Joined: Mon Nov 18, 2019 11:11 am

Fri Oct 08, 2021 12:33 pm

Thanks for sharing more details with me. I believe this is related to iSCSI itself. Try at least 5 "local" and 3 "partner" iSCSI sessions; that should improve the performance.
Finally, file copy is not a reliable test: Windows Server is known to show slow SMB file transfer speeds (https://docs.microsoft.com/en-us/window ... e-transfer). What you can do is use xcopy, as Microsoft suggests, or tweak these small things (a PowerShell sketch follows the list):
1. Set FirstBurstLength to 262144 under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Class\{4d36e97b-e325-11ce-bfc1-08002be10318}\000X\Parameters.
2. Remove the antivirus.
3. Set the BIOS to high performance. Source: https://community.mellanox.com/s/articl ... nce-tuning: "In general, we wish to tune the BIOS for high performance. In most cases, BIOS performance tuning will be executed once. Note: the maximum performance configuration does not suit all applications as it consumes much power. However, in the case of benchmark testing and performance-sensitive clusters, the performance configuration is recommended."
4. Check the power settings on the hosts; they should be set to High performance.
5. Disable VMQ, RSS, and RSC for the iSCSI adapters.
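
A PowerShell sketch of items 1, 4, and 5 (the 000X instance index and the adapter names are placeholders you must adjust for your system):

# 1. Set FirstBurstLength to 256 KB (replace 000X with your initiator's instance index)
Set-ItemProperty -Path 'HKLM:\SYSTEM\CurrentControlSet\Control\Class\{4d36e97b-e325-11ce-bfc1-08002be10318}\000X\Parameters' `
    -Name FirstBurstLength -Value 262144

# 4. Switch the host to the built-in High performance power plan
powercfg /setactive 8c5e7fda-e8bf-4a96-9a85-a6e23a8c635c

# 5. Disable VMQ, RSS, and RSC on the iSCSI adapters (names assumed to start with "iSCSI")
Disable-NetAdapterVmq -Name 'iSCSI*'
Disable-NetAdapterRss -Name 'iSCSI*'
Disable-NetAdapterRsc -Name 'iSCSI*'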
jdeshin
Posts: 63
Joined: Tue Sep 08, 2020 11:34 am

Fri Oct 08, 2021 1:27 pm

Points 2, 3, and 4 were already done.
I've tried point 1, but it had no big effect.
I've used 4 iSCSI sessions to the local StarWind server.

Best regards,
Yury
yaroslav (staff)
Staff
Posts: 2361
Joined: Mon Nov 18, 2019 11:11 am

Sat Oct 09, 2021 2:07 pm

Thank you for the update. Please see if the other fixes help to improve the iSCSI performance.
jdeshin
Posts: 63
Joined: Tue Sep 08, 2020 11:34 am

Sat Oct 09, 2021 3:41 pm

Dear Yaroslav,
Unfortunately, my test period in the datacenter has expired and I no longer have access to the hardware. Since I was not able to achieve acceptable performance, I will have to choose the classic solution with shared SAS DAS storage.

Maybe in the future I will try your software again (the Linux version, NVMe-oF, etc.).

Best regards,
Yury
yaroslav (staff)
Staff
Posts: 2361
Joined: Mon Nov 18, 2019 11:11 am

Sat Oct 09, 2021 4:54 pm

Yury,

It is sad to read that you were not able to implement the suggested changes.
Hope to hear back from you!