New hyperconverged cluster - need some advice

tiaccadi

Tue Oct 24, 2017 7:59 am

Hello everybody,

I'm planning to build a new hyperconverged infrastructure based on a Hyper-V cluster (Windows Server 2012 R2 Datacenter) and StarWind Virtual SAN.

The idea is to use two old Fujitsu RX300 S7 servers, each configured as below:

2x E5-2640 (6-core with HT)
96 GB of RAM
12x 600 GB SAS 10k on a DS2616 RAID controller (even though I'm not sure this one can handle 12 disks)
2x (or 6x) 1 Gbit ports
2x 10 Gbit ports

Some questions raised in my mind:

1. Will I have enough "power" (in terms of disks and CPUs) to handle both the virtualization layer (roughly 20 virtual machines with low to medium load, plus one with medium to high load, running SQL Server) and the storage layer?
2. I was thinking about building one huge RAID 10 volume on each server using 10 disks (I want at least one hot spare per server); the 24 600 GB disks would therefore give me a usable volume of only 3 TB (3 TB per server, mirrored). Is there something else I could do to save some space?
3. Are there any best practices for sizing the HA devices that I'll create inside the 3 TB volume?
4. Is it OK to use only a single 10 Gbit port per server with a direct connection, leaving the second one for future use (e.g. another node)?
5. Is it better to use LSFS or image-based volumes? Is it possible to mix them (e.g. LSFS for the high-load VM and image files for the low to medium ones)?

That's all for now, but I guess more questions will come.

Thanks everybody in advance!
Ivan (staff)

Tue Oct 24, 2017 8:01 pm

Hello tiaccadi,
1. Will I have enough "power" (in terms of disks and CPUs) to handle both the virtualization layer (roughly 20 virtual machines with low to medium load, plus one with medium to high load, running SQL Server) and the storage layer?
Just looking through the specs, the server looks nice. However, as an engineer, I can't honestly confirm the spec without more detail. Assuming that a low to medium load VM has 2 cores and 4 GB RAM, those 20 VMs will need 40 virtual cores and 80 GB RAM. The server you described has 12 physical cores, i.e. 24 threads, or roughly 48 vCPUs at a common 2:1 overcommit ratio. So after those VMs you have 16 GB RAM and 8 vCPUs left, which should be enough for the SQL Server (either as a role or as a VM). But that's just theorycrafting.
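A quick back-of-the-envelope check of that budgeting (a minimal sketch; the 2 vCPUs / 4 GB per low-to-medium VM and the 2:1 vCPU-per-thread overcommit are the assumptions stated above, not measurements):

```python
# Rough capacity check for one RX300 S7 node (assumptions, not benchmarks).
PHYSICAL_CORES = 12            # 2x E5-2640, 6 cores each
THREADS = PHYSICAL_CORES * 2   # Hyper-Threading
VCPU_BUDGET = THREADS * 2      # assumed 2:1 vCPU-to-thread overcommit -> ~48 vCPUs
RAM_GB = 96

VM_COUNT = 20                  # low to medium load VMs
VCPU_PER_VM = 2                # assumption from the reply above
RAM_PER_VM_GB = 4              # assumption from the reply above

vcpus_left = VCPU_BUDGET - VM_COUNT * VCPU_PER_VM
ram_left = RAM_GB - VM_COUNT * RAM_PER_VM_GB
print(f"Left for SQL Server + StarWind: {vcpus_left} vCPUs, {ram_left} GB RAM")
# -> Left for SQL Server + StarWind: 8 vCPUs, 16 GB RAM
```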
2. I was thinking about building one huge RAID 10 volume on each server using 10 disks (I want at least one hot spare per server); the 24 600 GB disks would therefore give me a usable volume of only 3 TB (3 TB per server, mirrored). Is there something else I could do to save some space?
For spindle drives, we recommend RAID 10 with Write-Back as the write policy and Read-Ahead as the read policy on the RAID controller. As an option, you could test RAID 50/60 (2 spans) to reclaim some capacity, but in that scenario the performance will not match RAID 10.
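To see what the space trade-off looks like, here is a small sketch comparing usable capacity on the same ten 600 GB disks (standard RAID arithmetic, nothing StarWind-specific; remember the cluster then mirrors whichever figure you pick across the two nodes):

```python
DISK_GB = 600
DISKS = 10   # 12 bays minus hot spares, per the post above

raid10 = DISKS // 2 * DISK_GB            # mirror pairs: half the disks hold data
raid50 = 2 * (DISKS // 2 - 1) * DISK_GB  # 2 spans of RAID 5, 1 parity disk per span
raid60 = 2 * (DISKS // 2 - 2) * DISK_GB  # 2 spans of RAID 6, 2 parity disks per span

print(f"RAID 10: {raid10} GB, RAID 50: {raid50} GB, RAID 60: {raid60} GB")
# -> RAID 10: 3000 GB, RAID 50: 4800 GB, RAID 60: 3600 GB
```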
3. Are there any best practices for sizing the HA devices that I'll create inside the 3 TB volume?
You can create a full-size StarWind device on this volume and present it to the Failover Cluster. Do not forget about the Quorum/Witness device.
4. Is it OK to use only a single 10 Gbit port per server with a direct connection, leaving the second one for future use (e.g. another node)?
The best practice here would be to use both 10 GbE NICs, for Sync and iSCSI, to achieve the best performance.
5. Is it better to use LSFS or image-based volumes? Is it possible to mix them (e.g. LSFS for the high-load VM and image files for the low to medium ones)?
I would recommend using image files instead of LSFS devices. Yes, it is possible to mix them, but you need to be aware of over-provisioning on the LSFS device and the other requirements described in our KB article here: https://knowledgebase.starwindsoftware. ... scription/
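For context, the practical effect of that over-provisioning requirement is that the underlying volume must be considerably larger than the LSFS device it hosts. A minimal sketch, with the multiplier left as a placeholder (the actual factor is in the linked KB article):

```python
def min_underlying_volume_gb(lsfs_device_gb: float, overprovision_factor: float) -> float:
    """Space to reserve on the underlying volume for one LSFS device.

    overprovision_factor is a placeholder: check the StarWind KB article
    linked above for the real multiplier before sizing anything.
    """
    return lsfs_device_gb * overprovision_factor

# Example: a 1 TB LSFS device with a hypothetical 3x factor needs ~3 TB underneath.
print(min_underlying_volume_gb(1000, 3.0))  # -> 3000.0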

Thank you!
tiaccadi

Thu Oct 26, 2017 7:14 am

Hello Ivan,

thank you very much for your answers!

One clarification: did I understand correctly that Write-Back as the write policy must be set on StarWind, while Read-Ahead as the read policy must be set on the RAID controller?
What about the write policy on the RAID controller?

Moreover, could you please clarify the "power availability/reliability" requirements? I mean: how badly will a StarWind cluster suffer from a power outage?
Is it BETTER to avoid a non-graceful shutdown, or is it a MUST?
Ivan (staff)

Thu Oct 26, 2017 1:56 pm

Hello tiaccadi,
One clarification: did I understand correctly that Write-Back as the write policy must be set on StarWind, while Read-Ahead as the read policy must be set on the RAID controller?
What about the write policy on the RAID controller?
I mean the Write-Back policy should be set on the RAID controller. As far as I can see, your RAID controller (based on the LSI SAS2108) supports Write-Back mode.
Moreover, could you please clarify the "power availability/reliability" requirements? I mean: how badly will a StarWind cluster suffer from a power outage?
Is it BETTER to avoid a non-graceful shutdown, or is it a MUST?
Of course, I would recommend avoiding non-graceful shutdowns if you have L1 (RAM) cache in Write-Back mode on StarWind devices, because you will get a full synchronization after each one.
Moreover, if you have a total power outage, the StarWind devices will come up as "Not Synchronized" on both nodes, and one of them will have to be "Marked as Synchronized" manually.
tiaccadi

Mon Oct 30, 2017 10:30 am

OK, so Write-Back and Read-Ahead on the RAID controllers.

What about expected performance? 10x 10k disks in RAID 10 should reach more or less:

1450 read IOPS
725 write IOPS

How much will these figures be affected by StarWind VSAN?

And last but not least: with 2.7 TB (10x 600 GB disks) of usable space, how much L1 (RAM) cache should I configure on StarWind?

Thank you very much, Ivan!
Ivan (staff)

Mon Oct 30, 2017 3:57 pm

Hello tiaccadi,
What about expected performance? 10x 10k disks in RAID 10 should reach more or less:

1450 read IOPS
725 write IOPS
That depends on the I/O pattern: spindle drives are good at sequential reads/writes but not so good at random operations.
Basically, you can estimate the performance with this calculator: http://wintelguy.com/raidperf.pl
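That calculator boils down to the classic rule of thumb: random reads scale with the number of spindles, while random writes are divided by the RAID write penalty (2 for RAID 10, since every write lands on two mirrored disks). A minimal sketch, assuming roughly 145 random IOPS per 10k SAS spindle:

```python
def raid_iops(disks: int, iops_per_disk: float, write_penalty: int):
    """Back-of-the-envelope random IOPS for a RAID array.

    write_penalty: 2 for RAID 10, 4 for RAID 5, 6 for RAID 6.
    """
    reads = disks * iops_per_disk
    writes = disks * iops_per_disk / write_penalty
    return reads, writes

reads, writes = raid_iops(disks=10, iops_per_disk=145, write_penalty=2)
print(f"~{reads:.0f} random read IOPS, ~{writes:.0f} random write IOPS")
# -> ~1450 random read IOPS, ~725 random write IOPS
```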
And last but not least: with 2.7 TB (10x 600 GB disks) of usable space, how much L1 (RAM) cache should I configure on StarWind?
We recommend assigning 1 GB of L1 write-back cache per 1 TB of storage. So if the StarWind device is 2.7 TB, I would recommend about 2.7 GB of RAM cache.
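In other words (a trivial worked example of that 1 GB per 1 TB rule of thumb):

```python
def l1_cache_gb(device_tb: float, gb_per_tb: float = 1.0) -> float:
    """StarWind's rule of thumb: ~1 GB of L1 write-back cache per 1 TB of storage."""
    return device_tb * gb_per_tb

print(l1_cache_gb(2.7))  # -> 2.7 GB of RAM cache for a 2.7 TB device
```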

Thank you
tiaccadi

Tue Oct 31, 2017 8:03 am

Hello Ivan,

what I meant was: 10x 10k SAS disks in RAID 10 should deliver

1450 IOPS at 100% random reads and
725 IOPS at 100% random writes

These are the raw disk/RAID figures that I could reach by accessing the volume directly (e.g. using a Hyper-V VM with a VHDX hosted on the Windows volume).

What will happen once I add the StarWind storage layer (single volume, no cluster for the moment)? The disks will perform the same, but the added layer should introduce more latency and perhaps fewer IOPS as seen by the "consumer" (the same VM, but this time hosted on the StarWind volume), shouldn't it?
Or will the additional L1 cache help the writes and actually improve performance?

And last but not least: how will these figures change when I add the second node (which will require synchronizing every write)?
Ivan (staff)

Wed Nov 01, 2017 3:11 am

Hello tiaccadi,
The performance penalty of adding StarWind into the mix on one node will be negligible.
Once you add the second node, your write IOPS will go down slightly (which can be countered with a write-back L1 cache), while your read IOPS will be nearly double what they were (because read operations are processed by both nodes). Exact numbers are hard to pinpoint, but the only major change will be the read increase.
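Applying that to the figures from earlier in the thread (a rough sketch: the "nearly double" read scaling comes from this answer, and the ~10% write overhead is an illustrative assumption, not a benchmark):

```python
# Single-node baseline from the RAID 10 estimate earlier in the thread.
read_iops, write_iops = 1450, 725

ha_reads = read_iops * 2       # reads served by both nodes -> nearly double
ha_writes = write_iops * 0.9   # assumed ~10% sync overhead; illustrative only

print(f"Two-node estimate: ~{ha_reads:.0f} read IOPS, ~{ha_writes:.0f} write IOPS")
```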
And last but not least: how will these figures change when I add the second node (which will require synchronizing every write)?
I would recommend using Least Queue Depth or Round Robin as the MPIO policy. With those MPIO policies you will get better performance.
tiaccadi

Mon Nov 06, 2017 1:06 pm

What about networking? Is it OK to use a configuration similar to the one below in a Hyper-V 2012 R2 cluster?

[Image: network diagram]

iSCSI1 and iSCSI2 will be the two 10 Gbit ports, and they will carry the StarWind replication traffic too.

The virtual switch will use all the other 1 Gbit ports, and I'll add two more virtual NICs for the StarWind heartbeat.

Will that be okay?
Ivan (staff)

Wed Nov 08, 2017 10:39 am

Hello tiaccadi,
iSCSI1 and iSCSI2 will be the two 10 Gbit ports, and they will carry the StarWind replication traffic too.
I would recommend dedicating one 10 GbE port per server to replication (StarWind Sync), and a direct (port-to-port) connection is highly recommended for StarWind synchronization. If you put iSCSI and Sync on the same channel, you could run into performance issues.

Since it will be hyperconverged, you can connect iSCSI directly as well. So, if possible, I would recommend connecting both 10 GbE NICs directly between the servers.
The virtual switch will use all the other 1 Gbit ports, and I'll add two more virtual NICs for the StarWind heartbeat.
Basically, you can team those 1 GbE NICs and create the virtual switch for your VMs on top of the Microsoft teaming.
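Putting the whole recommendation together as data (a sketch only: port names and subnets here are made up for illustration), the per-server NIC layout would look like this, with a small check that Sync and iSCSI never share a link:

```python
# Illustrative NIC plan per server; names and subnets are hypothetical.
nic_plan = {
    "10GbE-1":   {"role": "StarWind Sync", "link": "direct to partner node", "subnet": "172.16.10.0/24"},
    "10GbE-2":   {"role": "iSCSI",         "link": "direct to partner node", "subnet": "172.16.20.0/24"},
    "1GbE team": {"role": "VM virtual switch + heartbeat", "link": "switches", "subnet": "LAN"},
}

# Sync and iSCSI must sit on separate physical ports (and separate subnets).
sync = {k for k, v in nic_plan.items() if v["role"] == "StarWind Sync"}
iscsi = {k for k, v in nic_plan.items() if v["role"] == "iSCSI"}
assert sync and iscsi and sync.isdisjoint(iscsi), "Sync and iSCSI share a port"
print("NIC plan keeps Sync and iSCSI on separate direct links")
```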
tiaccadi

Wed Nov 08, 2017 11:19 am

I'm going to connect both 10 GbE ports of Server1 directly to the corresponding ports of Server2 (I know the diagram shows the NICs connected to switches; excuse me). Something like this:

Server1-Port1 <-> Server2-Port1
Server1-Port2 <-> Server2-Port2

This way I'll have two separate 10 Gbit channels, but I can't quite figure out what you are advising me to do with them :D Do you suggest using one 10 Gbit channel for Sync and the other for iSCSI?

---

Another question: I read that the L2 cache algorithm wasn't well optimized in the past. What's the situation now? Is it worth using?

What about using a single 240 GB eMLC SSD per server (Seagate Nytro XF1230)? Am I wrong, or will StarWind mirror them, so that "local" RAID could be avoided?
What will happen to the HA devices and/or the L2 cache when one SSD breaks?

Thank you!
Ivan (staff)

Wed Nov 08, 2017 3:41 pm

Tiaccadi,
This way I'll have two separate 10 Gbit channels, but I can't quite figure out what you are advising me to do with them :D Do you suggest using one 10 Gbit channel for Sync and the other for iSCSI?
Yes, that's correct. To avoid any issues with the sync channel and performance, we always recommend connecting the 10 GbE channels directly and using separate ports for Sync and iSCSI.
I read that the L2 cache algorithm wasn't well optimized in the past. What's the situation now? Is it worth using?
If your environment will be read-intensive, you can use L2 caching. The StarWind L2 cache currently works in Write-Through mode only.
What about using a single 240 GB eMLC SSD per server (Seagate Nytro XF1230)? Am I wrong, or will StarWind mirror them, so that "local" RAID could be avoided?
Feel free to use this scenario. And yes, it will be "mirrored over the network".
What will happen to the HA devices and/or the L2 cache when one SSD breaks?
The HA device will show as "Not active" on the node that has the failed drive.

Thank you.