Slow sync data rate

happymeal
Posts: 3
Joined: Thu Feb 27, 2020 11:38 am

Thu Feb 27, 2020 11:53 am

Morning!

So this weekend I tried to set up a 2-node cluster on-site, using StarWind VSAN, on Server 2016 Datacenter.
I've created a 5.5TB HA device between my two nodes (I'm using it to host storage for several VMs), and once the device is created, the sync is...slow. The most I've seen it reach is 56KBps.

My servers are HP DL380 G9s, with 4 x 2TB SAS drives in RAID5. They connect to the network using 1Gb NICs, with one used for the iSCSI heartbeat; however, I've connected the two servers directly to each other using a built-in 10Gb NIC.

Both 10Gb NICs have 192.168.x addresses, and are set to Private.
They both have:
-Flow Control disabled
-Jumbo Packet set to 9014 bytes
-Receive buffers set to 3000
-Transmit buffers set to 5000
-Speed/Duplex set to 10GBps Full Duplex
-Disabled VMQ
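
One way to confirm that the jumbo setting above really works end to end is to ping the partner with the largest payload that fits in a single unfragmented frame. The sketch below only works out that payload size. It assumes the driver's 9014-byte figure includes the 14-byte Ethernet header (giving an IP MTU of 9000) and that the link uses IPv4 without options; both are assumptions, not something stated in the post.

```python
# Assumed header sizes (IPv4 without options, standard Ethernet II framing).
ETHERNET_HEADER_BYTES = 14
IPV4_HEADER_BYTES = 20
ICMP_HEADER_BYTES = 8


def max_ping_payload(jumbo_packet_bytes: int) -> int:
    """Largest ICMP echo payload that fits in one frame without fragmentation."""
    # Assumption: the driver value counts the Ethernet header, so subtract it
    # to obtain the IP MTU.
    ip_mtu = jumbo_packet_bytes - ETHERNET_HEADER_BYTES
    return ip_mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES


print(max_ping_payload(9014))  # 8972
```

On Windows the resulting size can be tried with `ping -f -l 8972` against the partner's 192.168.x address; if the reply reports that the packet needs to be fragmented, jumbo frames are not in effect along the whole path.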

I've run iperf between the two using the 192.168.x addresses and I'm seeing a connection speed of 500-600MBps and a bandwidth of 5GBps.

I've also tried doing this with just a 10GB HA device, but the sync rate is still the same.

Don't suppose anyone out there has had this issue before, and could help improve the sync speed?
Is there anything I can add to this to help clarify it?

*EDIT- I've since restarted the servers and made sure the drivers were up-to-date, and I'm now seeing a Transfer of 1.25MBps and a Bandwidth of 10.4MBps. I checked the servers and can see I've used a Cat5 crossover cable.*
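
The paired figures in this post look like iperf's Transfer and Bandwidth columns, which describe the same measurement in different units. The sketch below converts one into the other. It assumes a 1-second reporting interval, that iperf's "MBytes" are 2**20 bytes, and that its bandwidth is reported in decimal bits per second; none of this is confirmed in the thread.

```python
MIB = 1024 * 1024  # assumption: iperf "MBytes" are binary megabytes


def transfer_to_mbits_per_s(mbytes: float, interval_s: float = 1.0) -> float:
    """Convert MBytes moved in one reporting interval to decimal Mbit/s."""
    return mbytes * MIB * 8 / interval_s / 1e6


# 1.25 MBytes per interval is roughly the 10.4 bandwidth figure quoted above.
print(round(transfer_to_mbits_per_s(1.25), 1))       # 10.5
# 600 MBytes per interval is roughly the 5 bandwidth figure, in Gbit/s.
print(round(transfer_to_mbits_per_s(600) / 1000, 1))  # 5.0
```

Read this way, the second set of numbers would be about 10 Mbit/s rather than 10 megabytes per second, far below what a 10Gb link should deliver.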
Michael (staff)
Staff
Posts: 317
Joined: Thu Jul 21, 2016 10:16 am

Fri Feb 28, 2020 5:24 pm

Most probably the reason is here:
> 4 x 2TB SAS drives in a RAID5.
Check out recommended settings here: https://knowledgebase.starwindsoftware. ... ssd-disks/
I believe it makes sense to start the investigation from double-checking the performance of underlying storage on both servers.
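
As a rough first look at the underlying storage, a short sequential-write test can be run on a volume that lives on the RAID5 array on each server. This is only a minimal sketch, not a substitute for a dedicated benchmarking tool: the function name and parameters are hypothetical, it measures sequential writes only, and controller or OS caching can inflate the result.

```python
import os
import tempfile
import time


def sequential_write_mb_per_s(directory: str, total_mb: int = 256,
                              block_kb: int = 1024) -> float:
    """Write total_mb of random data to a temp file and return MiB/s."""
    block = os.urandom(block_kb * 1024)
    blocks = (total_mb * 1024) // block_kb
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb", buffering=0) as f:
            for _ in range(blocks):
                f.write(block)
            # Ask the OS to flush to the device so its cache is not all we time.
            os.fsync(f.fileno())
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)
    return (blocks * block_kb / 1024) / elapsed


if __name__ == "__main__":
    # Point this at a directory on the array under test; the system temp
    # directory is used here only so the sketch runs anywhere.
    print(f"{sequential_write_mb_per_s(tempfile.gettempdir(), total_mb=16):.1f} MiB/s")
```

Running it on both nodes and comparing the numbers gives a quick indication of whether one array is noticeably slower than the other.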
happymeal

Mon Mar 02, 2020 9:53 am

Ok I'll check the storage out.
We've been using RAID5 for a while (not seen any performance issues with backing up/restoring). We've had to squeeze as much storage out of it as possible, hence the RAID5, and we can't afford to jump over to SSDs just yet.
Michael (staff)

Tue Mar 03, 2020 11:02 am

Please keep the community updated.
happymeal

Mon Jun 15, 2020 10:57 am

Ok - just to bring you up to speed:

3rd time writing this post - 3rd time's the charm.

I've replaced both Cat5e cables with Cat7e ones. The servers are now directly connected, using one 10Gb NIC for CSV sync and another for the heartbeat, and I'm seeing a transfer rate of around 6.5GBps. It currently takes 20-40 minutes to create a 1TB share using the 2-node creation script, so there's an improvement!
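
The two figures in this paragraph can be checked against each other with simple arithmetic. The sketch below assumes the "6.5GBps" figure is an iperf bandwidth in Gbit/s and that TB means 10**12 bytes; both are assumptions about how the numbers were reported.

```python
def transfer_minutes(size_tb: float, rate_gbit_s: float) -> float:
    """Minutes needed to move size_tb terabytes at rate_gbit_s gigabits/s."""
    bits = size_tb * 1e12 * 8
    return bits / (rate_gbit_s * 1e9) / 60


# A 1TB device at 6.5 Gbit/s: the low end of the 20-40 minute range.
print(round(transfer_minutes(1.0, 6.5), 1))       # 20.5
# The original 5.5TB device at the same rate, in hours.
print(round(transfer_minutes(5.5, 6.5) / 60, 1))  # 1.9
```

Under those assumptions, creating a 1TB device in about 20 minutes would mean the initial sync is running close to the measured link rate, and the 40-minute runs are at roughly half of it.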

The downside now: I'm seeing Sync issues.

Before I can put this into production, I need to test some common scenarios, one of which is an unexpected server outage. I've tested it by shutting down one of our servers out of hours, but when the server comes back up StarWind struggles to recover. I've tried running the sync script from the StarWindX folder, and even though the script tells me it's synced, the GUI shows it's not. I've even left it overnight to see if StarWind recovers and syncs, but no joy. The CSV works as expected when I add data to it though (I'm guessing that's expected?)

Anyone else seeing this? I can't put it into Production unless I've tested every scenario and I can't seem to recover from a Server outage. Any suggestions?
yaroslav (staff)
Staff
Posts: 2279
Joined: Mon Nov 18, 2019 11:11 am

Tue Jun 16, 2020 6:12 am

> but when the Server comes back up Starwind struggles to recover.
Could you be more specific? Does it have trouble starting synchronization, or does a full sync get triggered? Do you use write-back cache? Logs will be helpful; see how to collect logs here: https://knowledgebase.starwindsoftware. ... collector/. Use Dropbox or Google Drive to share the logs.
> The downside now: I'm seeing Sync issues.
Could you tell me which StarWind build is installed?
> Any suggestions?
As long as the StarWind HA device is Not Synchronized on one side, the local iSCSI target (i.e., the target on the not-synchronized side) will be Reconnecting. All I/O is done on the synchronized partner. The CSV stays in the Redirected state as long as it is owned by the Not Synchronized side. That's expected behavior. You should check what's going on in iSCSI Initiator. Please share a screenshot with me.
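
The behavior described above can be summarized as a small toy model. This is purely an illustration of the rules as stated in this reply, not StarWind or Windows code: the `Node` class and function names are hypothetical, and the "Connected" and "Direct" labels for the healthy case are assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    name: str
    synchronized: bool  # is the HA device Synchronized on this side?


def local_target_state(node: Node) -> str:
    """State of the node's local iSCSI target, per the rule above."""
    # "Connected" for the healthy case is an assumed label.
    return "Connected" if node.synchronized else "Reconnecting"


def nodes_serving_io(nodes: List[Node]) -> List[str]:
    """All I/O is done on the synchronized side(s)."""
    return [n.name for n in nodes if n.synchronized]


def csv_state(owner: Node) -> str:
    """CSV is Redirected while owned by the Not Synchronized side."""
    # "Direct" for the healthy case is an assumed label.
    return "Direct" if owner.synchronized else "Redirected"


node1 = Node("node1", synchronized=True)
node2 = Node("node2", synchronized=False)  # e.g. just back from an outage

print(local_target_state(node2))          # Reconnecting
print(nodes_serving_io([node1, node2]))   # ['node1']
print(csv_state(owner=node2))             # Redirected
```

In this model the CSV keeps accepting data after an outage because the synchronized partner serves the I/O, which matches the observation that the CSV still works while the GUI shows the device as not synced.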

I'd like to see the logs anyway.