Software-based VM-centric and flash-friendly VM storage + free version
-
Benoire
- Posts: 25
- Joined: Mon Jan 08, 2018 8:13 pm
Tue Jan 09, 2018 12:49 am
Hi
I'm currently running VMware vSAN on a 3-node HA/DRS-enabled cluster via my VMUG Advantage subscription. It's a fine-performing product, except that it can't tolerate more than one host loss without crashing every VM, due to the way it works. My interest in StarWind vSAN is the fully replicated storage, rather than the partial copies plus witness I have at the moment.
Before I go all-in, I have been trying it out on my current setup, accepting that the performance won't be there, and I have to say it was quite simple once I got my head around it. Now I'm curious to try it out properly, but I want to plan it out correctly. Having run the VMware solution I already have SSDs as cache layers and HDDs as storage, so I can implement L2 cache fine. I won't do anything until my 'new' Dell R710 arrives and I transfer the contents of my Supermicro setup onto it, so that gives me time to work through the best setup.
So the new setup would be two of the following:
Dell R710 with 60GB RAM
1 x 250GB HDD as a local ESXi datastore (storing the StarWind VM)
1 x 240GB SSD as L2 cache
4 x 300GB SAS data drives, with the ability to add more as required
4 x 1GbE NICs
vSphere 6.5
General rack stuff:
Dell 1920W UPS
UniFi managed switches
Xpenology storage server using DSM 6.1
So now on to the nitty-gritty:
VM Configuration
Is there any best practice for the VM itself? The OSes will be Windows Server 2016 from my university access. How much memory should be assigned to each vSAN VM? VMware allocates approximately 6-8GB per host in hybrid mode; what are people's opinions and experiences?
Disk pass-through
I intend to use RDM to pass the data drives and SSDs through to the StarWind VMs directly for maximum performance, while keeping the plain 250GB drive as a datastore for the vSAN VM and other non-movable VMs.
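Roughly what I had in mind for the pass-through, in PowerCLI (untested, and the VM name and naa.* identifier are just placeholders for my lab):
```powershell
# Untested sketch - VM name and the naa.* device ID are placeholders.
Connect-VIServer -Server "esxi-1.lab.local"

# Attach one of the SAS data drives to the StarWind VM as a physical-mode RDM
$vm = Get-VM -Name "StarWind-1"
New-HardDisk -VM $vm -DiskType RawPhysical `
    -DeviceName "/vmfs/devices/disks/naa.XXXXXXXXXXXXXXXX"
```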
SSD
This will act as L2 cache. Which is better, write-back or write-through? VMware uses a split between both, but I've read that if a write-back L2 cache is not flushed before the host shuts down it can cause all sorts of errors and stop the HA images booting up automatically due to a mismatch? As this is a homelab, limited by the 1GbE network, performance doesn't have to be blistering; just enough to not cause trouble with the OS.
HDD
I was planning on using Microsoft Storage Spaces with a simple (non-resilient) layout to combine the data drives into a single 1.2TB volume to serve the HA image, something like the sketch below. I presume this wouldn't be an issue? I could use parity to get resilience from the data drives, but from what I've read I suspect Storage Spaces in parity mode would perform shockingly badly.
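Untested, and the pool/volume names are just placeholders:
```powershell
# Untested sketch - pool and volume names are placeholders.
# Pool the four SAS drives and carve out one simple (non-resilient) volume.
$disks = Get-PhysicalDisk -CanPool $true
New-StoragePool -FriendlyName "StarWindPool" `
    -StorageSubSystemFriendlyName "Windows Storage*" -PhysicalDisks $disks
New-VirtualDisk -StoragePoolFriendlyName "StarWindPool" -FriendlyName "StarWindData" `
    -ResiliencySettingName Simple -UseMaximumSize

# Bring it online, partition and format it for the StarWind image files
Get-VirtualDisk -FriendlyName "StarWindData" | Get-Disk |
    Initialize-Disk -PartitionStyle GPT -PassThru |
    New-Partition -UseMaximumSize -AssignDriveLetter |
    Format-Volume -FileSystem NTFS -NewFileSystemLabel "StarWindData"
```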
Anyone got any thoughts on this and experiences they wish to share?
Network
Network will be 4 x 1GbE ports per server. I was planning on using two for VM/management and two for heartbeat and synchronization. These are fed into my UniFi switches, all 1GbE, as I'm not sure I need 10GbE, although it would be rather fun! This is essentially how I run the current config, but as it's a different vendor it's worth asking.
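For the storage side I was thinking of something along these lines in PowerCLI (untested; the vmnic numbers, names and IP are placeholders):
```powershell
# Untested sketch - vmnic numbers, names and the IP are placeholders.
Connect-VIServer -Server "esxi-1.lab.local"
$esx = Get-VMHost -Name "esxi-1.lab.local"

# Dedicated vSwitch on the two spare 1GbE uplinks for sync + iSCSI/heartbeat
$vs = New-VirtualSwitch -VMHost $esx -Name "vSwitch-Storage" -Nic vmnic2,vmnic3

# Port group for the StarWind VM's sync/heartbeat NICs
New-VirtualPortGroup -VirtualSwitch $vs -Name "StarWind-Sync"

# VMkernel port so ESXi itself can reach the iSCSI targets
New-VMHostNetworkAdapter -VMHost $esx -VirtualSwitch $vs -PortGroup "iSCSI" `
    -IP 10.10.10.11 -SubnetMask 255.255.255.0
```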
Changing the number of nodes once the trial period is over
I read that once the trial is over and GUI access is lost, you can't add new nodes to a 2-node setup with PowerShell. I'm not sure if I'll need more nodes; the only reason I ran three was that VMware's implementation needs a minimum of three. I presume the PowerShell approach is perfectly fine for a home user?
Controlled UPS shutdown
Now, one of the main reasons for my change was controlled shutdowns under UPS. At the moment the VMware vSAN uses 3 nodes: two copies of the data and a witness. Once you lose a host you're no longer protected from data loss, and shutting down to a single host with VMware vSAN is not possible without killing the VMs. What I want to happen is that on power loss, the secondary host migrates any VMs that need to remain up to the primary host, both hosts shut down the VMs that are not necessary, and the secondary host then enters maintenance mode and powers off when complete, to extend the battery life for as long as possible before the final shutdown.
Using MUMC (Dell's UPS management software) and PowerCLI, I can script disabling HA/DRS and then running a controlled power-off of certain groups of VMs when the power goes out; the rough sketch below shows the sort of thing I mean. The main concerns are: a) that StarWind vSAN can shut down to a single host and still remain functional (without split-brain), and b) that after a controlled shutdown via the UPS software, HA storage via vSAN will come back automatically and without issue on power restore, when both hosts resume (or when host 2 comes back online, if power is restored before the batteries fully deplete).
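Untested, with placeholder cluster/host/VM names; MUMC would kick this off when the UPS goes on battery:
```powershell
# Untested sketch - cluster, host and VM names are placeholders for my lab.
Connect-VIServer -Server "vcenter.lab.local"

# Stop vCenter fighting the shutdown
Set-Cluster -Cluster "Lab" -DrsEnabled:$false -HAEnabled:$false -Confirm:$false

# Gracefully shut down the non-essential guests on both hosts
Get-VM -Location "Lab" |
    Where-Object { $_.PowerState -eq "PoweredOn" -and
                   $_.Name -notin @("StarWind-1", "StarWind-2", "vCenter") } |
    Stop-VMGuest -Confirm:$false

# Once its guests are down, put the secondary host into maintenance mode
# and power it off to stretch the UPS runtime for the primary
Set-VMHost -VMHost "esxi-2.lab.local" -State Maintenance -Evacuate:$false -Confirm:$false
Stop-VMHost -VMHost "esxi-2.lab.local" -Confirm:$false
```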
I appreciate there are a lot of questions here, but I'd be very interested in thoughts and views on the subject, as well as any advice.
Thanks,
Chris
-
Boris (staff)
- Staff
- Posts: 805
- Joined: Fri Jul 28, 2017 8:18 am
Tue Jan 09, 2018 5:17 pm
Benoire wrote:
Using MUMC (Dell's UPS management software) and PowerCLI, I can script disabling HA/DRS and then running a controlled power-off of certain groups of VMs when the power goes out. The main concerns are: a) that StarWind vSAN can shut down to a single host and still remain functional (without split-brain), and b) that after a controlled shutdown via the UPS software, HA storage via vSAN will come back automatically and without issue on power restore, when both hosts resume (or when host 2 comes back online, if power is restored before the batteries fully deplete).
You need to make sure you have an HBA rescan script configured on both hosts. You can get a copy of that script from the configuration guide for a 2-node ESXi hyperconverged setup. The main action required from your side will then be switching on both nodes after power is restored. The nodes will negotiate their priority in terms of synchronized data and start the fast/full synchronization process. If for some reason only one node is online while the other one is broken, you will need to manually mark the devices on it as synchronized for ESXi to see the datastores upon HBA rescan (done manually or through the rescan script).
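Just to illustrate the idea (this is not the script from the guide, and the host name is a placeholder), the rescan itself boils down to something like this in PowerCLI:
```powershell
# Illustration only - not the script from the configuration guide.
# Host name is a placeholder.
Connect-VIServer -Server "esxi-1.lab.local"

# Rescan all HBAs and the VMFS volumes so the iSCSI datastores reappear
Get-VMHostStorage -VMHost "esxi-1.lab.local" -RescanAllHba -RescanVmfs
```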
-
Benoire
- Posts: 25
- Joined: Mon Jan 08, 2018 8:13 pm
Sat Jan 13, 2018 8:44 pm
Boris (staff) wrote:
You need to make sure you have an HBA rescan script configured on both hosts. You can get a copy of that script from the configuration guide for a 2-node ESXi hyperconverged setup. The main action required from your side will then be switching on both nodes after power is restored. The nodes will negotiate their priority in terms of synchronized data and start the fast/full synchronization process. If for some reason only one node is online while the other one is broken, you will need to manually mark the devices on it as synchronized for ESXi to see the datastores upon HBA rescan (done manually or through the rescan script).
Thanks. So from what I can determine, there should be no issues controlling the shutdown process and leaving a single host up, as long as its partner was shut down correctly? On power restoration, as long as both hosts boot up at the same time, everything should start fine, assuming that script is run?
Any comment on the SSD L2 cache? How long does data sit on the SSD before being handed off to the main array? I'm not likely to suffer a power drop-out, and therefore an uncontrolled shutdown, so in the event of a normal shutdown is the SSD flushed before the service is terminated and the host shut off?
-
Boris (staff)
- Staff
- Posts: 805
- Joined: Fri Jul 28, 2017 8:18 am
Mon Jan 15, 2018 4:43 pm
Benoire wrote:
Thanks. So from what I can determine, there should be no issues controlling the shutdown process and leaving a single host up, as long as its partner was shut down correctly? On power restoration, as long as both hosts boot up at the same time, everything should start fine, assuming that script is run?
Correct.
Benoire wrote:
Any comment on the SSD L2 cache?
L2 cache in StarWind operates in Write-Through mode only.
-
Benoire
- Posts: 25
- Joined: Mon Jan 08, 2018 8:13 pm
Wed Jan 17, 2018 12:18 am
Great, thanks. Are there plans to operate the L2 cache in write-back mode?
Any other comments on the setup from your experiences?
A question re: shrinking from 3 nodes to 2 using PowerShell. My trial licence for the app is going to expire soon and I haven't implemented this yet, so I'm curious whether you can shrink a 3-node HA device into a 2-node HA device with PowerShell? I asked sales how much it might be and it was clearly far too much for a home user. I'm looking to consolidate in the future and don't want to have to kill the HA image if removing a node isn't possible.
Also, in an all-flash storage situation, I presume you wouldn't actually need the L2 cache, as the main array would be fast enough?
-
Boris (staff)
- Staff
- Posts: 805
- Joined: Fri Jul 28, 2017 8:18 am
Wed Jan 17, 2018 12:34 pm
Benoire wrote:
Great, thanks. Are there plans to operate the L2 cache in write-back mode?
Earlier, the L2 cache could work in either WB or WT mode. For now, the WB option has been disabled and there are no plans to bring it back.
Benoire wrote:
A question re: shrinking from 3 nodes to 2 using PowerShell. My trial licence for the app is going to expire soon and I haven't implemented this yet, so I'm curious whether you can shrink a 3-node HA device into a 2-node HA device with PowerShell? I asked sales how much it might be and it was clearly far too much for a home user. I'm looking to consolidate in the future and don't want to have to kill the HA image if removing a node isn't possible.
In the current build, there is no StarWindX functionality that would cover this case. Only recreating the devices will help there.
Benoire wrote:
Also, in an all-flash storage situation, I presume you wouldn't actually need the L2 cache, as the main array would be fast enough?
There is no need to speed up SSD disks with the help of SSD disks, so you are completely correct.