Maximum Performance in 2-Node-HA-Cluster

Software-based VM-centric and flash-friendly VM storage + free version

Moderators: anton (staff), art (staff), Max (staff), Anatoly (staff)

KPS
Posts: 51
Joined: Thu Apr 13, 2017 7:37 am

Thu Apr 13, 2017 2:09 pm

Hi!

I need to replicate a disk to another room. As I need to run a high-performance database on these systems, I need to know whether StarWind can deliver this.

I need at least 60,000 IOPS on the 4 TB volume (8k random, 50% read/write) that will be replicated. Is this possible? If yes, can you tell me something about the required hardware?

- Can an all-flash array provide that I/O speed?
- Are additional NVMe cards an improvement, or is an all-flash array without NVMe faster?

Thank you for your help

Regards,
KPS
Max (staff)
Staff
Posts: 533
Joined: Tue Apr 20, 2010 9:03 am

Thu Apr 13, 2017 7:10 pm

Hi KPS,
That's definitely doable.
So for 60K IOPS at 4 TB you'll probably need something like 6x 800 GB SSDs or 11x 400 GB SSDs. 6 SSDs should bring you to ~55-65K IOPS, and 11 SSDs will definitely do more than that.
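To make the arithmetic behind those drive counts explicit, here is a minimal sizing sketch. It assumes a RAID 5 layout (n-1 drives usable) and roughly 10K sustained mixed-random IOPS per SATA/SAS SSD; both figures are illustrative assumptions, not StarWind guidance.

```python
# Minimal sizing sanity check for the drive counts above (illustrative only).
TARGET_IOPS = 60_000
TARGET_USABLE_GB = 4_000
PER_SSD_MIXED_IOPS = 10_000   # assumed sustained 8k 50/50 random IOPS per SSD

def ssds_needed(ssd_gb: int) -> int:
    """Smallest RAID 5 member count meeting both the IOPS and capacity targets."""
    n = 2
    while n * PER_SSD_MIXED_IOPS < TARGET_IOPS or (n - 1) * ssd_gb < TARGET_USABLE_GB:
        n += 1
    return n

print(ssds_needed(800))   # -> 6  (matches the 6x 800 GB suggestion)
print(ssds_needed(400))   # -> 11 (matches the 11x 400 GB suggestion)
```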
NVMe is not necessary at these speeds.
One more thing you need is redundant 10 GbE connections between the nodes to make HA work correctly.
Max Kolomyeytsev
StarWind Software
KPS
Posts: 51
Joined: Thu Apr 13, 2017 7:37 am

Thu Apr 13, 2017 7:29 pm

Hi!

That sounds really good! Thank you for your answer!!

Just some additional questions:

- Is a redundant 10 GbE connection sufficient as the sync connection, or would there be a benefit from 40 or 100 GbE cards?
- In this setup: should I use a standard (thick-provisioned) or an LSFS volume to get the best performance? If LSFS: should I then use RAID 5 or 6 instead of RAID 10?
- Is there _any_ benefit if an NVMe device is added to an all-flash array, or does it only make sense with spindles and slow down fast all-flash arrays?

Thank you and best wishes,
KPS
Max (staff)
Staff
Posts: 533
Joined: Tue Apr 20, 2010 9:03 am

Tue Apr 18, 2017 12:47 pm

- Is a redundant 10 GbE connection sufficient as the sync connection, or would there be a benefit from 40 or 100 GbE cards?
Redundant 10 GbE plus 1x 1 GbE is sufficient: the 10 GbE links are used for sync and iSCSI connections, the 1 GbE link for the heartbeat.
- In this setup: should I use a standard (thick-provisioned) or an LSFS volume to get the best performance? If LSFS: should I then use RAID 5 or 6 instead of RAID 10?
I would suggest using a thick-provisioned device for best performance; since you're using SSDs already, you don't need to worry about the RAID 5 performance penalty. LSFS overhead doesn't pay off at your capacity and also won't accelerate much beyond what you already get with SSDs. It could be used if you had a hybrid storage environment.
- Is there _any_ benefit if an NVMe device is added to an all-flash array, or does it only make sense with spindles and slow down fast all-flash arrays?
In a nutshell: is there any benefit - yes; is there one big enough in your case to justify NVMe - I don't think so. NVMe does give a performance benefit, but don't expect a huge boost if your database actively uses its entire capacity. The best results are observed when the NVMe cache/tier size is greater than or equal to the working set.
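To illustrate the working-set point with generic cache math (not StarWind-specific behaviour), here is a small sketch; the latency figures are assumptions for illustration only.

```python
# Rough effective-latency model: hit ratio approximated as cache size / working set.
NVME_LATENCY_US = 20     # assumed NVMe cache-hit latency
SSD_LATENCY_US = 100     # assumed backing all-flash latency

def effective_latency_us(cache_gb: float, working_set_gb: float) -> float:
    """Average latency if a fraction (cache / working set) of accesses hit the NVMe layer."""
    hit_ratio = min(1.0, cache_gb / working_set_gb)
    return hit_ratio * NVME_LATENCY_US + (1 - hit_ratio) * SSD_LATENCY_US

print(effective_latency_us(cache_gb=800, working_set_gb=800))   # 20 us: cache covers the working set
print(effective_latency_us(cache_gb=200, working_set_gb=800))   # 80 us: most I/O still lands on the SSDs
```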
Max Kolomyeytsev
StarWind Software
KPS
Posts: 51
Joined: Thu Apr 13, 2017 7:37 am

Tue Apr 18, 2017 2:34 pm

Hi Max!

Thank you for your input!

Just some additional questions:

@RAID: You wrote "RAID 5 on SSDs". Would you prefer RAID 5 over RAID 10?

@NVMe: I will test this next week.

At the moment, I still don't see the real use case for LSFS. If I need to reserve 2-3 times the space of the working set, a flat SSD-only setup should not be more expensive than a hybrid array with LSFS, or am I wrong?

Best wishes,
KPS
Max (staff)
Staff
Posts: 533
Joined: Tue Apr 20, 2010 9:03 am

Wed Apr 19, 2017 1:23 pm

RAID preference depends on required performance and capacity: need IOPS - go for RAID 10; need capacity - RAID 5.
As for LSFS - 2-3 times is the maximum possible number. A typical 65/35 read/write workload doesn't cause more than 1.5x overhead. This allows you to get a hybrid array with flash-like features for ~$0.3-0.4/GB before dedupe even kicks in. However, as flash gets cheaper, LSFS starts making much more sense at bigger capacities.
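As a worked example of the comparison KPS raised, here is the cost arithmetic using the figures from this thread (1.5x typical LSFS space overhead, ~$0.3-0.4/GB effective hybrid cost); the all-flash price per GB is an assumption for illustration, not a quote.

```python
# Back-of-the-envelope cost comparison for a 4 TB working set (illustrative only).
USABLE_GB = 4_000
LSFS_OVERHEAD = 1.5                 # typical 65/35 R/W overhead (2-3x is the maximum)
HYBRID_EFFECTIVE_USD_PER_GB = 0.35  # mid-point of the ~$0.3-0.4/GB figure above
ALL_FLASH_USD_PER_GB = 0.60         # assumed 2017-era enterprise SSD price

raw_gb_to_provision = USABLE_GB * LSFS_OVERHEAD          # space to reserve for LSFS
hybrid_cost = USABLE_GB * HYBRID_EFFECTIVE_USD_PER_GB    # ~$1,400
all_flash_cost = USABLE_GB * ALL_FLASH_USD_PER_GB        # ~$2,400

print(raw_gb_to_provision, hybrid_cost, all_flash_cost)
```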
Max Kolomyeytsev
StarWind Software
Alexey Sergeev
Posts: 26
Joined: Mon Feb 13, 2017 12:48 pm

Tue Aug 22, 2017 1:03 pm

There's been a discussion about performance here, so I'd like to post another question.

What actual performance boost should I expect from an LSFS device based on this hardware?

5x 2 TB SATA HDDs, 5400 RPM, 4k sectors, RAID 5
IBM ServerRAID M5120
RAM L1 Cache: 1GB
RAM Deduplication cache: 4GB
Logical block size: 512 bytes

Compared with a thick device on the same hardware, I've got very good results in random read/write operations, but sequential writes are noticeably slower.
I've attached a few diskspd test results; could you take a look, please? Those were standalone targets connected only via the loopback address.
We don't have any write-intensive applications in our environment (common file shares mostly), so I need to decide which type of device would be better.
I know about overprovisioning in the case of an LSFS device and am going to place 2 TB of data in there with 15 GB RAM for deduplication and 2 GB RAM for L1 cache.

What do you think about it?
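For reference, here is how the RAM figures above scale if extrapolated linearly (15 GB dedupe RAM plus 2 GB L1 cache for 2 TB of data); this is just the ratio from this post scaled up, not official StarWind sizing guidance.

```python
# Linear extrapolation of the RAM sizing used in this post (illustrative only).
DEDUPE_RAM_GB_PER_TB = 15 / 2      # 15 GB dedupe RAM for 2 TB of data
L1_CACHE_RAM_GB_PER_TB = 2 / 2     # 2 GB L1 cache for 2 TB of data

def lsfs_ram_estimate_gb(data_tb: float) -> float:
    """Estimated dedupe + L1 cache RAM at the ratio above."""
    return data_tb * (DEDUPE_RAM_GB_PER_TB + L1_CACHE_RAM_GB_PER_TB)

print(lsfs_ram_estimate_gb(2))   # 17.0 GB, matching this post
print(lsfs_ram_estimate_gb(4))   # 34.0 GB if the dataset doubles
```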
Attachments
flat.7z
(2.83 KiB) Downloaded 587 times
lsfs.7z
(2.76 KiB) Downloaded 578 times
anton (staff)
Site Admin
Posts: 4021
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands
Contact:

Tue Aug 22, 2017 2:33 pm

With LSFS you shift the workload, so you get moderately OK sequential writes and reads, the same random reads (we can't do much except cache them, and it's a bad pattern for caching), and small random writes don't exist anymore: you get bandwidth / 4K = IOPS (magic!!!).

So... If your workload is sequential - don't use LSFS. If your workload is mostly small random reads and (especially!) writes - absolutely go for it.

In your case 100K 4KB 100% random write IOPS should be achievable.
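To make the arithmetic behind that figure explicit, here is a quick back-of-the-envelope sketch; the per-drive sequential throughput is an assumption for a 5400 RPM SATA disk, and parity handling is simplified.

```python
# Why log-structured writes turn HDD bandwidth into small-block IOPS (illustrative only).
HDD_SEQ_MB_S = 100     # assumed sustained sequential write per 5400 RPM drive
DATA_DRIVES = 4        # 5-drive RAID 5: roughly one drive's worth goes to parity
BLOCK_KB = 4

aggregate_mb_s = HDD_SEQ_MB_S * DATA_DRIVES
iops = aggregate_mb_s * 1024 // BLOCK_KB

print(iops)   # ~102,400 -> in the ballpark of the 100K figure above
```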
Regards,
Anton Kolomyeytsev

Chief Technology Officer & Chief Architect, StarWind Software
