
Poor performance with replica enabled on sequential writes

Posted: Thu Jan 14, 2016 11:40 am
by santiagocastro
Hi all!

During a few tests with a new replica configuration I found a serious issue involving sequential writes with outstanding operations. I used StarWind Virtual SAN v8.0.8730 on Windows Server 2012 R2.

Here are my results using CrystalDiskMark. Note that the Sequential Write (Q= 32,T= 1) test uses 128KiB per I/O, while the Sequential Write (T= 1) test uses 1MiB per I/O. Physical storage should not be a bottleneck, because the L1 cache is larger than the volume.
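Because the two sequential write tests use different I/O sizes, it can help to convert the reported throughput into I/Os per second before comparing them. Below is a minimal sketch in Python, assuming the decimal MB/s unit stated in the CrystalDiskMark output (1 MB = 1,000,000 bytes). The function name `iops` is only for illustration, and the figures are copied from the Scenario #1 and Scenario #2 results in this post.

```python
# CrystalDiskMark reports MB/s with 1 MB = 1,000,000 bytes,
# while the block sizes are binary (KiB / MiB).
KIB = 1024
MIB = 1024 * KIB


def iops(mb_per_s: float, block_bytes: int) -> float:
    """Convert throughput in decimal MB/s to I/O operations per second."""
    return mb_per_s * 1_000_000 / block_bytes


# Sanity check against a row where CrystalDiskMark prints both values:
# Random Read 4KiB (Q= 32) in Scenario #1 is 136.157 MB/s [33241.5 IOPS].
print(f"4KiB random read check: {iops(136.157, 4 * KIB):.1f} IOPS")

# Sequential write rows: Scenario #1 (no replica) vs. Scenario #2 (replica).
rows = [
    ("Seq write Q=32, 128KiB, no replica", 1122.517, 128 * KIB),
    ("Seq write Q=32, 128KiB, replica", 14.287, 128 * KIB),
    ("Seq write Q=1, 1MiB, no replica", 655.435, MIB),
    ("Seq write Q=1, 1MiB, replica", 130.435, MIB),
]
for label, mbps, block in rows:
    print(f"{label}: {iops(mbps, block):.1f} IOPS")
```

With the replica enabled, the Q=32 test completes only about a hundred 128KiB writes per second, which is fewer operations per second than the 1MiB test at queue depth 1.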

Scenario #1 10GB LSFS volume with 20GB L1 WB cache and WITHOUT replica configured. Good results.

-----------------------------------------------------------------------
CrystalDiskMark 4.0.3 x64 (C) 2007-2015 hiyohiyo
Crystal Dew World : [ ... ]
-----------------------------------------------------------------------
* MB/s = 1,000,000 bytes/s [SATA/600 = 600,000,000 bytes/s]
* KB = 1000 bytes, KiB = 1024 bytes

Sequential Read (Q= 32,T= 1) : 1504.296 MB/s
Sequential Write (Q= 32,T= 1) : 1122.517 MB/s Very good result
Random Read 4KiB (Q= 32,T= 1) : 136.157 MB/s [ 33241.5 IOPS]
Random Write 4KiB (Q= 32,T= 1) : 90.135 MB/s [ 22005.6 IOPS]
Sequential Read (T= 1) : 1076.068 MB/s
Sequential Write (T= 1) : 655.435 MB/s
Random Read 4KiB (Q= 1,T= 1) : 25.029 MB/s [ 6110.6 IOPS]
Random Write 4KiB (Q= 1,T= 1) : 24.109 MB/s [ 5886.0 IOPS]

Test : 8192 MiB [P: 0.8% (0.1/10.0 GiB)] (x9)
Date : 2016/01/14 10:00:42
OS : Windows Server 2012 R2 [6.3 Build 9600] (x64)

Scenario #2 10GB LSFS volume with 20GB L1 WB cache and with replica configured and synchronized. Bad results.

-----------------------------------------------------------------------
CrystalDiskMark 4.0.3 x64 (C) 2007-2015 hiyohiyo
Crystal Dew World : [ ... ]
-----------------------------------------------------------------------
* MB/s = 1,000,000 bytes/s [SATA/600 = 600,000,000 bytes/s]
* KB = 1000 bytes, KiB = 1024 bytes

Sequential Read (Q= 32,T= 1) : 980.588 MB/s
Sequential Write (Q= 32,T= 1) : 14.287 MB/s !!!!! Why??????
Random Read 4KiB (Q= 32,T= 1) : 122.178 MB/s [ 29828.6 IOPS]
Random Write 4KiB (Q= 32,T= 1) : 74.826 MB/s [ 18268.1 IOPS] --> Good result; this value shows no network latency or bandwidth issue
Sequential Read (T= 1) : 992.622 MB/s
Sequential Write (T= 1) : 130.435 MB/s --> Good result; this value shows no network latency or bandwidth issue
Random Read 4KiB (Q= 1,T= 1) : 24.311 MB/s [ 5935.3 IOPS]
Random Write 4KiB (Q= 1,T= 1) : 5.279 MB/s [ 1288.8 IOPS]

Test : 8192 MiB [P: 0.8% (0.1/10.0 GiB)] (x9)
Date : 2016/01/14 10:30:13
OS : Windows Server 2012 R2 [6.3 Build 9600] (x64)

Scenario #3 10GB LSFS volume with 20GB L1 WB cache and with replica removed from replication manager while configured. Still bad results.

-----------------------------------------------------------------------
CrystalDiskMark 4.0.3 x64 (C) 2007-2015 hiyohiyo
Crystal Dew World : [ ... ]
-----------------------------------------------------------------------
* MB/s = 1,000,000 bytes/s [SATA/600 = 600,000,000 bytes/s]
* KB = 1000 bytes, KiB = 1024 bytes

Sequential Read (Q= 32,T= 1) : 837.927 MB/s
Sequential Write (Q= 32,T= 1) : 30.132 MB/s !!!!! Why this bad value?????? There is no network activity!!
Random Read 4KiB (Q= 32,T= 1) : 5.656 MB/s [ 1380.9 IOPS]
Random Write 4KiB (Q= 32,T= 1) : 94.785 MB/s [ 23140.9 IOPS] --> Better result because there is no network activity
Sequential Read (T= 1) : 50.536 MB/s
Sequential Write (T= 1) : 307.899 MB/s --> Better result because there is no network activity
Random Read 4KiB (Q= 1,T= 1) : 0.522 MB/s [ 127.4 IOPS]
Random Write 4KiB (Q= 1,T= 1) : 15.577 MB/s [ 3803.0 IOPS]

Test : 8192 MiB [P: 0.8% (0.1/10.0 GiB)] (x9)
Date : 2016/01/14 10:42:02
OS : Windows Server 2012 R2 [6.3 Build 9600] (x64)

Because of these results, I suspect there is a software-related problem handling large I/Os with outstanding operations when a volume is configured as HA.
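One rough way to test this suspicion is Little's law (outstanding I/Os = IOPS x average latency), which gives the average per-I/O latency implied by each Scenario #2 write row. This is only a sketch: the helper name `implied_latency_ms` is hypothetical, and the estimate assumes the benchmark keeps the queue full at the stated depth for the whole run.

```python
def implied_latency_ms(mb_per_s: float, block_bytes: int, queue_depth: int) -> float:
    """Estimate average per-I/O latency via Little's law.

    Assumes the queue stays full at `queue_depth` and that MB/s is
    decimal (1 MB = 1,000,000 bytes), as CrystalDiskMark reports it.
    """
    iops = mb_per_s * 1_000_000 / block_bytes
    return queue_depth / iops * 1000.0


KIB = 1024

# Scenario #2 (replica configured and synchronized), write rows only.
scenario2_writes = [
    ("Sequential Write 128KiB, Q=32", 14.287, 128 * KIB, 32),
    ("Random Write 4KiB, Q=32", 74.826, 4 * KIB, 32),
    ("Sequential Write 1MiB, Q=1", 130.435, 1024 * KIB, 1),
    ("Random Write 4KiB, Q=1", 5.279, 4 * KIB, 1),
]
for label, mbps, block, qd in scenario2_writes:
    print(f"{label}: ~{implied_latency_ms(mbps, block, qd):.2f} ms per I/O")
```

Under these assumptions, a 128KiB write at queue depth 32 takes hundreds of milliseconds on average, while a 1MiB write at queue depth 1 completes in under 10 ms. That gap is hard to explain by network latency or bandwidth alone.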

Any suggestion to get better results will be highly appreciated.

Re: Poor performance with replica enabled on sequential writes

Posted: Fri Jan 15, 2016 10:03 am
by LogitComputer
Hello Santiago,

I was about to report the same issue, because I had headaches with my database guy yesterday.
Workaround (not the final solution): move the VM to your other cluster node until the root cause is found:

Scenario 1, Flat-Disk, iSCSI via 127.0.0.1, no replica
Very good read/write ratios as expected

Scenario 2, Flat-Disk, iSCSI MCS via 127.0.0.1 as Active / 10Gb Ethernet 10.0.0.2 as Passive, with replica
Very good read (around 1200-1400MB/sec), very poor write (roughly 20-45MB/sec) on SSD

Scenario 3, Flat-Disk, iSCSI MCS via 10Gb Ethernet 10.0.0.2 as Active / 127.0.0.1 as Passive, with replica (moved VM to the other cluster node)
Good read (around 850MB/sec), good write (around 650MB/sec) on SSD

Conclusion:
Non-hyper-converged via 127.0.0.1 --> very good performance overall, as expected
Hyper-converged via 127.0.0.1 --> very poor write performance, read as expected
Hyper-converged via "partner cluster node" --> good performance overall, as expected

I will post my results later, and will use CrystalDiskMark and/or Iometer if support wants.

Regards,

Josip.

Re: Poor performance with replica enabled on sequential writes

Posted: Fri Jan 15, 2016 10:27 am
by santiagocastro
Hi Josip,

Thanks for your reply, same conclusion here! Sequential write performance also improves in the hyper-converged scenario when the iSCSI connections go to local private IPs (i.e. 192.168.x.x, not 127.0.0.1).

I think the problem is related to loopback acceleration, either in the StarWind software or in the Windows implementation.

Could you try my workaround in your scenario with your database?

Regards,

Santi.

Re: Poor performance with replica enabled on sequential writes

Posted: Fri Jan 15, 2016 10:38 am
by LogitComputer
Not in the next couple of hours, maybe tomorrow, but it will be a different test-machine because the new DB2 Server goes live in 7 hours :)

Re: Poor performance with replica enabled on sequential writes

Posted: Fri Jan 15, 2016 6:03 pm
by anton (staff)
Guys,

1) Please post your numbers (ones you can share) here.

2) Please DO NOT USE CrystalDiskMark; use Intel Iometer or Oracle VDBench instead (very good for performance tests on storage that does caching, spoofing and in-line dedupe, and very flexible in terms of I/O range, I/O pattern, etc.).

We'll be happy to take a look and troubleshoot the issue. Especially if it's a repro from many customers!

Re: Poor performance with replica enabled on sequential writes

Posted: Wed Feb 10, 2016 4:46 pm
by Anatoly (staff)
Gentlemen, any update on this please?

Re: Poor performance with replica enabled on sequential writes

Posted: Thu Feb 11, 2016 1:48 pm
by santiagocastro
Sorry, too busy these days.

I won't be able to repeat the tests for a few weeks...

Re: Poor performance with replica enabled on sequential writes

Posted: Fri Feb 12, 2016 3:09 pm
by Tarass (Staff)
No problem, take your time and come back again with an interesting update :-)

Re: Poor performance with replica enabled on sequential writes

Posted: Fri Oct 21, 2016 7:26 am
by leon3147
Same here :-(
Poor write speed with a replica over a 10G link.

An answer from support on this would be welcome!

Re: Poor performance with replica enabled on sequential writes

Posted: Mon Oct 24, 2016 4:19 pm
by Al (staff)
Hello gentlemen,

Please refer to the original thread:
https://forums.starwindsoftware.com/vie ... f=5&t=4581