Invalid Partner on 2-node HA - Win Server Core 2019

Software-based VM-centric and flash-friendly VM storage + free version

Moderators: anton (staff), art (staff), Max (staff), Anatoly (staff)

james643
Posts: 11
Joined: Wed Apr 07, 2021 1:57 pm

Wed Apr 07, 2021 2:37 pm

I am attempting to set up a POC of a 2-node HA Hyper-V cluster on Server Core 2019 using "CreateHA_2.PS1", but am getting the error below. I'm not clear on what to do with some of the addresses in the script.
(edit: I followed this to stage the servers -- https://www.starwindsoftware.com/resour ... rver-2016/)

Code:

Request to  CL1.DOMAIN.LOCAL ( 192.168.1.111 ) : 3261
-
control HAImage -CreateHeader:"" -DeviceHeaderPath:"My computer\D\starwind\CL1-img_HA.swdsk" -Type:"ImageFile_HA" -file:"CL1-Storage" -size:"1024" -Priority:"#p0=0;#p1=1" -nodeType:"#p0=1;#p1=1" -PartnerTargetName:"#p1=iqn.2008-08.com.starwindsoftware:cl2.DOMAIN.local-san-cl2" -PartnerIP:"#p1=172.16.20.12:sync:3260:1,172.16.10.12:1" -IsAutoSynchEnabled:"1" -AuthChapLogin:"#p1=0b" -AuthChapPassword:"#p1=0b" -AuthMChapName:"#p1=0b" -AuthMChapSecret:"#p1=0b" -AuthChapType:"#p1=none" -Offset:"0" -CacheMode:"wb" -CacheSizeMB:"128" -Replicator:"#p0=0" -WitnessType:"0" -AluaAccessState:"#p0=0;#p1=1"
-
200 Failed: invalid partner info..
Here is my network layout. NIC1 on each host is on the local network where I manage the hosts. The NIC5 pair is iSCSI+HB, connected through an isolated 10Gbit switch. The NIC6 pair is for SYNC, a direct 10Gbit connection. The NICs in each pair successfully jumbo-ping each other.

Code:

NIC1-Host1-Mgmt                                 NIC1-Host2-Mgmt
192.168.1.111        >     1Gb Switch     >     192.168.1.121

NIC5-Host1-iSCSI+HB                             NIC5-Host2-iSCSI+HB
172.16.10.11        >     10Gb Switch     >     172.16.10.12

NIC6-Host1-SYNCH                                NIC6-Host2-SYNCH
172.16.20.11        >     10Gb Direct     >     172.16.20.12

(Both hosts: NIC2 & 3 are dedicated HV switches, NIC4 disabled)
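(For reference, a jumbo-frame ping test on Windows, assuming a 9000-byte MTU so 8972 bytes of payload after the 28 bytes of IP/ICMP headers, looks like this:)

Code:

ping -f -l 8972 172.16.10.12    :: iSCSI+HB path, don't-fragment set
ping -f -l 8972 172.16.20.12    :: SYNC path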
Here are my script parameters:

Code:

param($addr="192.168.1.111", $port=3261, $user="root", $password="starwind",
	$addr2="192.168.1.121", $port2=$port, $user2=$user, $password2=$password,
#common
	$initMethod="Clear",
	$size=1024,
	$sectorSize=4096,
	$failover=0,
#primary node
	$imagePath="My computer\D\starwind",
	$imageName="CL1-img",
	$createImage=$true,
	$storageName="CL1-Storage",
	$targetAlias="SAN-CL1",
	$autoSynch=$true,
	$poolName="pool1",
	$syncSessionCount=1,
	$aluaOptimized=$true,
	$cacheMode="wb",
	$cacheSize=128,
	$syncInterface="#p2=172.16.20.12:3260" -f $addr2,
	$hbInterface="#p2=172.16.10.12",
	$createTarget=$true,
#secondary node
	$imagePath2="My computer\D\starwind",
	$imageName2="CL2-img",
	$createImage2=$true,
	$storageName2="CL2-Storage",
	$targetAlias2="SAN-CL2",
	$autoSynch2=$true,
	$poolName2="pool1",
	$syncSessionCount2=1,
	$aluaOptimized2=$false,
	$cacheMode2=$cacheMode,
	$cacheSize2=$cacheSize,
	$syncInterface2="#p1=172.16.20.11:3260" -f $addr,
	$hbInterface2="#p1=172.16.10.11",
	$createTarget2=$true
	)
yaroslav (staff)
Staff
Posts: 2279
Joined: Mon Nov 18, 2019 11:11 am

Wed Apr 07, 2021 3:02 pm

Welcome to StarWind Forum. How many physical network cards do you use? Please make sure to use at least 2 different network cards to avoid split-brain. See more in the system requirements: https://www.starwindsoftware.com/system-requirements.
See a good example script here: https://forums.starwindsoftware.com/vie ... p+3#p31505.
james643
Posts: 11
Joined: Wed Apr 07, 2021 1:57 pm

Wed Apr 07, 2021 3:41 pm

Each server has a built-in physical adapter with 4x 1Gbit ports; NIC1 on both hosts is port 1 of this adapter and carries the management network.
Each server also has an add-on physical adapter with 2x 10Gbit ports; NIC5 (iSCSI+HB) and NIC6 (Sync) are ports 1 and 2 of this adapter, respectively.

Would you please clarify - are you saying I need to add a second add-on card to each server? As it is now, there are 3 separate networks on separate Ethernet ports: one for iSCSI+HB (plus the management network added as a backup heartbeat, I think, in the params below), one for synchronization, and one for system management.

From the post you linked, these are my new parameters (note the second heartbeat over the management IPs in $hbInterface), and it appears to have worked and synchronized.

Code:

param($addr="192.168.1.111", $port=3261, $user="root", $password="starwind",
   $addr2="192.168.1.121", $port2=$port, $user2=$user, $password2=$password,
#common
   $initMethod="Clear",
   $size=1200,
   $sectorSize=4096,
   $failover=0,
#primary node
   $imagePath="My computer\D\starwind",
   $imageName="masterImg21",
   $createImage=$true,
   $storageName="",
   $targetAlias="targetha21",
   $autoSynch=$true,
   $poolName="pool1",
   $syncSessionCount=1,
   $aluaOptimized=$true,
   $cacheMode="wb",
   $cacheSize=128,
   $syncInterface="#p2=172.16.20.12:3260" -f $addr2,
   $hbInterface="#p2=172.16.10.12:3260,192.168.1.121:3260" -f $addr2,
   $createTarget=$true,
#secondary node
   $imagePath2="My computer\D\starwind",
   $imageName2="partnerImg22",
   $createImage2=$true,
   $storageName2="",
   $targetAlias2="partnerha22",
   $autoSynch2=$true,
   $poolName2="pool1",
   $syncSessionCount2=1,
   $aluaOptimized2=$true,
   $cacheMode2=$cacheMode,
   $cacheSize2=$cacheSize,
   $syncInterface2="#p1=172.16.20.11:3260" -f $addr,
   $hbInterface2="#p1=172.16.10.11:3260,192.168.1.111:3260" -f $addr,
   $createTarget2=$true
   )
james643
Posts: 11
Joined: Wed Apr 07, 2021 1:57 pm

Wed Apr 07, 2021 8:27 pm

I mean that I have now added the management network as a second heartbeat. Since that is on the built-in NIC, rather than the add-on NIC which carries both iSCSI+HB and Sync, I am now configured to avoid split-brain with just the two adapters, correct?
yaroslav (staff)
Staff
Posts: 2279
Joined: Mon Nov 18, 2019 11:11 am

Thu Apr 08, 2021 7:13 am

Hi, yes, you are right. I was saying that at least one StarWind link should go over another network card, be it the heartbeat or sync.
In your case, adding a heartbeat over the management network should almost eliminate the risk of split-brain, and that is what you did with the script.
james643
Posts: 11
Joined: Wed Apr 07, 2021 1:57 pm

Thu Apr 08, 2021 6:10 pm

Great info, thanks. The network requirements wording in the setup guide makes a lot more sense with that context.

So far I have:
Created devices with CreateHA_2.ps1
Connected the iSCSI links on both ends
Validated and created the Failover Cluster (sketch of these two steps below)
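A minimal sketch of the validation/creation steps, assuming the FailoverClusters module is installed; the cluster name and static IP are placeholders:

Code:

# run cluster validation against both nodes
Test-Cluster -Node CL1, CL2
# create the cluster without claiming eligible storage automatically
New-Cluster -Name SWCLUSTER -Node CL1, CL2 -StaticAddress 192.168.1.110 -NoStorage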

Now I am about to extend the storage.

Can you point me to the next steps?
yaroslav (staff)
Staff
Posts: 2279
Joined: Mon Nov 18, 2019 11:11 am

Fri Apr 09, 2021 8:22 am

Make sure to connect the witness disk only locally.
You can extend the device with the ExtendDevice script from C:\Program Files\StarWind Software\StarWind\StarWindX\Samples\powershell.
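The sample boils down to roughly this (a sketch from memory, assuming the StarWindX module's New-SWServer/Get-Device cmdlets and the device's ExtendDevice method; check the shipped script for exact signatures, and note the device name and extend amount here are placeholders):

Code:

Import-Module StarWindX
try {
    # connect to the local StarWind service on the management port
    $server = New-SWServer -host 127.0.0.1 -port 3261 -user root -password starwind
    $server.Connect()
    # look up the device by the name shown in Get-Device output
    $device = Get-Device $server -name "your-device-name"
    # grow the device by 10 (see the sample's comments for the size unit)
    $device.ExtendDevice(10)
    $server.Disconnect()
}
catch { Write-Host $_ -ForegroundColor Red }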
james643
Posts: 11
Joined: Wed Apr 07, 2021 1:57 pm

Wed Apr 14, 2021 8:59 pm

yaroslav (staff) wrote:Make sure to connect the witness disk only locally.
You can extend the device with the ExtendDevice script from C:\Program Files\StarWind Software\StarWind\StarWindX\Samples\powershell.
OK, so I do need a witness in this 2-node setup. I was a bit confused by the witness part earlier, since the documentation seemed to say it is not required with 2 nodes and the setup script didn't create an image for it.

Things seem to be running and synced up. I got the disk and volume extended; now I am trying to add a witness, but it does not show up in Failover Cluster Manager when I try to add a disk under Storage > Disks in order to select it in the quorum settings.

My servers are CL1 (showing 0 votes) and CL2 (1 vote), so I created the witness on CL1 with CreateImageFile.ps1 and attached to it through iscsicpl from CL1 only. It showed up in DiskPart; I brought it online, cleared the read-only flag, set it to GPT, and formatted it NTFS with no drive letter.
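For anyone following along, the equivalent disk prep in PowerShell would be roughly this (disk number 1 is a placeholder; check Get-Disk first):

Code:

# find the newly connected witness disk
Get-Disk | Format-Table Number, FriendlyName, OperationalStatus, PartitionStyle
# bring it online and clear the read-only flag
Set-Disk -Number 1 -IsOffline $false
Set-Disk -Number 1 -IsReadOnly $false
# initialize as GPT, then format NTFS without assigning a drive letter
Initialize-Disk -Number 1 -PartitionStyle GPT
New-Partition -DiskNumber 1 -UseMaximumSize | Format-Volume -FileSystem NTFS -NewFileSystemLabel "Witness"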

Running cluster validation gives me this:
Physical disk eb7d6d5a is visible from only one node and will not be tested. Validation requires that the disk be visible from at least two nodes. The disk is reported as visible at node: CL1.fiveflags.local
james643
Posts: 11
Joined: Wed Apr 07, 2021 1:57 pm

Wed Apr 14, 2021 9:13 pm

Based on that validation finding, I went ahead and connected to the witness disk from CL2. It popped up in the available disks and could be added to the quorum selection, and now CL1 and CL2 each show 1 vote.

The witness disk is now online as Cluster Disk 2 and assigned as the Disk Witness in the quorum settings, but the owner node is CL2...

I am new to clustering, so maybe I've misunderstood some of these concepts or some of your statements, but shouldn't that be owned by CL1?
yaroslav (staff)
Staff
Posts: 2279
Joined: Mon Nov 18, 2019 11:11 am

Thu Apr 15, 2021 6:53 am

Thank you for your question.
It is necessary to create an HA device for the witness (i.e., replicate it) and connect it locally on each server (i.e., only 127.0.0.1 connections are allowed). It is wrong to connect the witness across servers due to potential issues with failover. Please disconnect any IPs other than 127.0.0.1 from the witness.
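A rough PowerShell equivalent, to be run on each node (the "*witness*" match is a placeholder for your witness target's IQN):

Code:

# register the loopback portal and connect only the local witness target
New-IscsiTargetPortal -TargetPortalAddress 127.0.0.1
$t = Get-IscsiTarget | Where-Object { $_.NodeAddress -like "*witness*" }
Connect-IscsiTarget -NodeAddress $t.NodeAddress -IsPersistent $true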
While validating, the cluster produces an MPIO warning for that disk, as it does not have the partner connections.
It does not matter which node owns the witness. Just make sure to move it off the node before rebooting that node. Alternatively, just pause the node and drain the roles.

Let me know if you have more questions.
james643
Posts: 11
Joined: Wed Apr 07, 2021 1:57 pm

Thu Apr 15, 2021 4:43 pm

yaroslav (staff) wrote: Thank you for your question.
It is necessary to create an HA device for the witness
I used CreateImageFile.ps1, so apparently I did not create an HA device. That makes sense. I looked at CreateHAPartnerWitness.ps1 and it appears to be for creating a whole 3-node environment from nothing.

Is there a script sample for just adding a new HA device into the existing environment?
yaroslav (staff)
Staff
Posts: 2279
Joined: Mon Nov 18, 2019 11:11 am

Thu Apr 15, 2021 6:13 pm

Hi,

There is a CreateHA script mentioned herein. You cannot create HA from an existing img file. Since it is just a witness, create a new 1 GB HA device, connect it as described here, and connect it as a witness in the cluster. Once you connect the HA (witness) device, remove the old target and file.
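For the witness, the CreateHA_2.ps1 parameters would only need small changes from the working set posted above; an illustrative delta (the names are placeholders):

Code:

$size = 1024                  # 1 GB, if the script's size unit is MB as the values above suggest
$imageName = "witness21"      # placeholder names for the witness image/target
$targetAlias = "witnessha21"
$imageName2 = "witness22"
$targetAlias2 = "witnessha22"
# sync and heartbeat interfaces stay the same as in the working script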
james643
Posts: 11
Joined: Wed Apr 07, 2021 1:57 pm

Fri Apr 16, 2021 7:38 pm

yaroslav (staff) wrote: There is a CreateHA script mentioned herein. You cannot create HA from an existing img file. Since it is just a witness, create a new 1 GB HA device, connect it as described here, and connect it as a witness in the cluster. Once you connect the HA (witness) device, remove the old target and file.
^^^ THIS!!

I had thought "HA device" in all of these contexts meant a storage image on the StarWind HA servers; now I see that each shared image is itself a StarWind HA device that happens to live on these servers. So much of my confusion was based on that!

Destroyed the cluster, restored the two hosts from pre-StarWind backups, did a bit of RTFM on iSCSI / clustering, and started over. Thanks to documenting all of my steps in PowerShell, I got it all configured in no time, and the first HA test VM is online :D

Is there any documentation on security best practices for locking down these services and connections?
james643
Posts: 11
Joined: Wed Apr 07, 2021 1:57 pm

Fri Apr 16, 2021 8:55 pm

Cluster network settings question: the guide wasn't too clear on which networks can be used for Cluster / Cluster and Client communication. I have Sync, iSCSI, Management, and IPv6 link-local listed in my networks. Currently I only have Management set to allow Cluster and Client; the rest are all set to None. Is that correct?
yaroslav (staff)
Staff
Posts: 2279
Joined: Mon Nov 18, 2019 11:11 am

Sat Apr 17, 2021 5:15 am

Is there any documentation on security best practices for locking down these services and connections?
Yes, you can set CHAP (https://www.starwindsoftware.com/resour ... ocol-chap/) and access rights (https://www.starwindsoftware.com/blog/a ... w-it-works); they can be set only from the GUI, i.e., you need either a trial or perpetual license. You can also configure the management link password as described here: https://forums.starwindsoftware.com/vie ... ord#p28525.
Cluster network settings question: the guide wasn't too clear on which networks can be used for Cluster / Cluster and Client communication. I have Sync, iSCSI, Management, and IPv6 link-local listed in my networks. Currently I only have Management set to allow Cluster and Client; the rest are all set to None. Is that correct?
Please enable Cluster communication over Sync, while Cluster and Client should be enabled over Management. Also, prevent the StarWind Sync network from accepting live migrations: in Failover Cluster Manager, uncheck the appropriate checkbox and move the network down in the priority list, as discussed in the guide you shared.
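A rough PowerShell equivalent for the network roles (the network names are placeholders; list yours with Get-ClusterNetwork):

Code:

# Role values: 0 = None, 1 = Cluster only, 3 = Cluster and Client
(Get-ClusterNetwork -Name "Sync").Role = 1
(Get-ClusterNetwork -Name "Management").Role = 3
(Get-ClusterNetwork -Name "iSCSI").Role = 0
# the live migration network checkboxes are under the role's properties in Failover Cluster Manager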