StarWind VSAN Free - Starting with single node, add second node later

thx1200
Posts: 10
Joined: Fri Jun 19, 2026 6:42 am

Fri Jun 19, 2026 6:48 am

Hello, I'm trying out StarWind VSAN Free. I've done a bunch of reading but want to get clear on one point. I am building a Windows Failover Cluster that will use StarWind VSAN as the storage system. I want to start with a single node to try it out, then bring a second node online later. It seems there isn't a way to "add a node" (that I can see) in the free version? But I think I can accomplish this as follows:

- Create image file
- Use Add-ImageDevice to attach the drive to the single node
- Later on use Remove-Device to detach the image file from the single node
- Create Failover Cluster with two nodes
- Use Add-HADevice to bring the existing image up on both nodes with data preserved, with CreateImage = $false on Node 1 and CreateImage = $true on Node 2

Is this a correct path? Or am I way off base? :-)
yaroslav (staff)
Staff
Posts: 4350
Joined: Mon Nov 18, 2019 11:11 am

Fri Jun 19, 2026 8:06 am

Welcome to StarWind Forum!
thx1200 wrote: "Use Add-ImageDevice to attach the drive to the single node"
The image is already attached to that node; there's no need to attach it.
Your flow implies having 2 nodes ready in the cluster when you create the device. If so, you can create an HA device > bring down that node > start it again at some point > wait for sync to complete.

What you can do is set up 2 StarWind VSAN instances and use createHA_2. Create a small device. Remove the HA device with the Remove* script (run it from the node that you plan to leave; it will target the partner). Then expand it. This gives you an HA device that you can recreate later.
thx1200
Posts: 10
Joined: Fri Jun 19, 2026 6:42 am

Fri Jun 19, 2026 2:17 pm

Oh thanks, good idea, I can create a two-node setup from the start that way. I like that. So I basically...

Create HA with 2 nodes
Remove Node
(do stuff)
Add real second node later on

I'll play with this some in my lab.
yaroslav (staff)
Staff
Posts: 4350
Joined: Mon Nov 18, 2019 11:11 am

Fri Jun 19, 2026 3:35 pm

Yup.
Good luck with your project!
thx1200
Posts: 10
Joined: Fri Jun 19, 2026 6:42 am

Thu Jun 25, 2026 10:34 pm

I got both nodes online and synced, and everything seems to be working, but I'm running into an issue: the HA isn't behaving as true HA.

When Node 2 reboots, Node 1 detects the down server and storage still works, as a single node with node majority due to the SMB share witness.
When Node 1 reboots, Node 2 suddenly says it's out of sync and storage disappears; all iSCSI connections go down. Once Node 1 finishes rebooting, Node 2 comes back online and they get back into sync.

It's like Node 1 is behaving as an "active" and Node 2 as a "passive" rather than mutually HA to each other.

How do I troubleshoot this? What config might be off here?

As a reminder, I started with a single node, then added a second node later.
thx1200
Posts: 10
Joined: Fri Jun 19, 2026 6:42 am

Thu Jun 25, 2026 10:39 pm

Screenshot of Node 2 while Node 1 is rebooting attached.
Attachments
2026-06-25_183616_008_LOVEBIRD.png
thx1200
Posts: 10
Joined: Fri Jun 19, 2026 6:42 am

Thu Jun 25, 2026 10:45 pm

And Node 1 (for completeness) after reboot...
Attachments
2026-06-25_184341_009_LOVEBIRD.png
yaroslav (staff)
Staff
Posts: 4350
Joined: Mon Nov 18, 2019 11:11 am

Fri Jun 26, 2026 6:35 am

That's expected. You have a 2-way replica configured with node majority. When the one and only replication partner goes down, the remaining partner gets isolated, as it cannot confirm its sync status.
2 ways out of it.
1. Stop the service on both nodes.
2. Open the _HA.swdsk file on each node.
3. Find the <failover> section.
4. Set it to 0.
5. Start the service on both nodes.
Also, make sure that redundant network cards are used for replication node interfaces (i.e., sync and heartbeat). If there are switches or inter-site connections involved in StarWind SYNC and iSCSI, make sure those are redundant too.
thx1200
Posts: 10
Joined: Fri Jun 19, 2026 6:42 am

Fri Jun 26, 2026 1:44 pm

Thanks! Maybe I'm misunderstanding the design, but a two-node setup can be run this way, right? I'm using an SMB file share witness as the tie-breaker to give the cluster quorum, not the heartbeat interface strategy. I have four network connections for VSAN. Two are for sync and two are for data. I have no configured "heartbeat" interfaces.

My expectation, coming from my experience with two-node Windows clusters, is that as long as one server and the file share witness are online, that's enough to keep the cluster alive. Only if both a server and the witness were offline would there be a problem that would require shutting the cluster down.
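The expectation above can be written down as simple vote counting. This is a minimal sketch of a generic strict-majority rule, not StarWind's or Windows' actual implementation; the function name and the voter counts are illustrative assumptions.

```python
def has_quorum(alive_voters: int, total_voters: int) -> bool:
    """Strict majority: more than half of all configured voters must be reachable."""
    return 2 * alive_voters > total_voters


# Two nodes plus a file share witness: 3 voters in total.
print(has_quorum(alive_voters=2, total_voters=3))  # one node + witness still up
print(has_quorum(alive_voters=1, total_voters=3))  # one node alone, witness also lost

# Two nodes and no witness: a lone survivor cannot form a majority.
print(has_quorum(alive_voters=1, total_voters=2))
```

Under this model, a surviving node keeps quorum only if it actually counts the witness as a voter and can reach it.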

Speaking of, how do I tell if the nodes can reach the file share witness? I don't see an obvious indicator in the GUI.
thx1200
Posts: 10
Joined: Fri Jun 19, 2026 6:42 am

Fri Jun 26, 2026 1:51 pm

Looking at the hadisk config, the second node didn't take any of the file share witness config when I ran the "addpartner" script, so that would explain the issue! I'll need to compare the two files and make sure all the settings match...
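For comparing the two files, a plain text diff is usually enough. Below is a rough sketch using Python's standard difflib; the sample strings and the labels "node1" / "node2" are made up for illustration, and the real header files may need a specific text encoding when read from disk.

```python
import difflib


def diff_configs(text_a: str, text_b: str,
                 name_a: str = "node1", name_b: str = "node2") -> list[str]:
    """Return unified-diff lines between two config texts.

    Blank lines and leading/trailing whitespace are ignored so that
    indentation differences do not show up as changes.
    """
    a = [line.strip() for line in text_a.splitlines() if line.strip()]
    b = [line.strip() for line in text_b.splitlines() if line.strip()]
    return list(difflib.unified_diff(a, b, fromfile=name_a, tofile=name_b, lineterm=""))


# Hypothetical fragments standing in for the two nodes' config files.
node1 = "<x>\n  <setting>1</setting>\n</x>"
node2 = "<x>\n  <setting>0</setting>\n</x>"

for line in diff_configs(node1, node2):
    print(line)
```

To use it on real files, read each one into a string first (for example with `pathlib.Path(...).read_text()`) and pass the two strings in. Some differences, such as each node's own name or addresses, are expected and not a problem.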
yaroslav (staff)
Staff
Posts: 4350
Joined: Mon Nov 18, 2019 11:11 am

Fri Jun 26, 2026 3:00 pm

thx1200 wrote: "My expectation, coming from my experience with two-node Windows clusters, is that as long as one server and the file share witness are online, that's enough to keep the cluster alive. Only if both a server and the witness were offline would there be a problem that would require shutting the cluster down."
You are right, it looks like there's an issue with communication with the SMB witness.
thx1200 wrote: "Speaking of, how do I tell if the nodes can reach the file share witness? I don't see an obvious indicator in the GUI."
Look in the service logs for it. There will be an event saying that the SMB witness is not reachable.
If network connections are redundant, go with Heartbeat (a much simpler strategy to handle).
thx1200 wrote: "the second node didn't take any of the file share witness config when I ran the "addpartner" script, so that would explain the issue!"
Yeah, that explains it. Sharp eye!
thx1200
Posts: 10
Joined: Fri Jun 19, 2026 6:42 am

Fri Jun 26, 2026 8:11 pm

Got it working. For future reference, for anybody else in this situation:

Shut down node 2's storage.

Added this section to the new node (example):


<storage id="2" name="\\share\path\witness.dat" type="smb_witness">
	<authentication type="basic">
		<basic login="user" passwd="pass"/>
	</authentication>
	<transport/>
</storage>
Made sure the "id" was changed on the ha disk and updated it in this section like: <storage_ref id="3"/>

Set <witness_type>1</witness_type> because it was 0 (which I presume is a regular non-SMB node).

After starting the StarWind services back up, the failover behaved correctly and smoothly!
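To double-check a node's file after editing, something like the sketch below can list the witness-related entries. It assumes the file parses as well-formed XML and only looks for the three elements mentioned above; the `<device>` wrapper and the ids in the sample are made up for illustration and may not match the real file layout.

```python
import xml.etree.ElementTree as ET


def witness_summary(xml_text: str) -> dict:
    """Collect the witness-related entries found anywhere in the XML text."""
    root = ET.fromstring(xml_text.strip())
    witnesses = [el for el in root.iter("storage") if el.get("type") == "smb_witness"]
    wt = root.find(".//witness_type")
    return {
        "smb_witness_paths": [el.get("name") for el in witnesses],
        "smb_witness_ids": [el.get("id") for el in witnesses],
        "storage_ref_ids": [el.get("id") for el in root.iter("storage_ref")],
        "witness_type": wt.text.strip() if wt is not None and wt.text else None,
    }


# Hypothetical sample; the real file has more content and a different root.
SAMPLE = r"""
<device>
  <storage id="3" name="\\share\path\witness.dat" type="smb_witness">
    <authentication type="basic"><basic login="user" passwd="pass"/></authentication>
    <transport/>
  </storage>
  <storage_ref id="3"/>
  <witness_type>1</witness_type>
</device>
"""

print(witness_summary(SAMPLE))
```

Running it against both nodes' files and comparing the results makes it easy to spot a node that is missing the witness section or still has witness_type set to 0.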
yaroslav (staff)
Staff
Posts: 4350
Joined: Mon Nov 18, 2019 11:11 am

Fri Jun 26, 2026 8:17 pm

Thanks for sharing, and wishing you a refreshing weekend!