Unable to connect to Starwind from Xenserver after full reboot of all devices

zendzipr
Posts: 9
Joined: Tue Jan 23, 2018 3:04 am

Mon Jul 02, 2018 5:29 pm

During testing, I powered off all devices (storage nodes, XenServer compute nodes, and switches) to simulate a catastrophic power loss.

The current configuration is a 2-node StarWind cluster with 2 dedicated 10G sync ports, 2 dedicated 10G iSCSI ports, 1 dedicated 1G heartbeat port, and 1 dedicated management port.

After power is restored, the StarWind nodes begin re-syncing as expected; however, the XenServer hosts are unable to connect to the StarWind iSCSI devices until a full synchronization has completed.

I would expect the node that is the synchronization source to be accessible to the XenServer pool; however, none of the paths are available.

Another issue: the initial synchronization, when I created the cluster, ran at 10-12 Gbps, while the current resync is running at only 1 Gbps.

After booting, the XenServer node appears to be attempting to connect to the StarWind node that is not ready.

Any ideas how I can get XenServer to use one of the nodes while the second is synchronizing? As it is now, it appears to take several hours for the nodes to re-sync, and during that time none of the VMs can boot.
Boris (staff)
Staff
Posts: 805
Joined: Fri Jul 28, 2017 8:18 am

Mon Jul 02, 2018 6:01 pm

I would expect the node that is the synchronization source to be accessible to the XenServer pool; however, none of the paths are available.
This is exactly the behavior StarWind is designed to provide. Which StarWind build are you running when you see this issue?
Please post the output of

multipath -ll

from the XenServer nodes, both before and after simulating the disaster.
zendzipr

Mon Jul 02, 2018 6:21 pm

I will need to send the results after the second volume re-synchronizes.

Before it does, though, I can say that the output of multipath -ll is blank, although immediately after reboot it is not. I have attached a screenshot of what the XenServer node looks like after a reboot.

I am using an older version of StarWind. You and I worked on an issue where the newer install would not create the second sync or iSCSI interface via PowerShell. I have tested with newer versions, but it has been a few months.

Currently running 8.0.0.11456

The output of multipath -ll changes over time: first it shows one storage repository, then the second, and finally they all go away and the output is empty.
Attachments
after_boot.png (17.48 KiB)
zendzipr

Mon Jul 02, 2018 8:31 pm

The node is back up and running. Below is the good multipath -ll output:

[root@xs4 ~]# multipath -ll
2ec9fc038c6f797a1 dm-0 STARWIND,STARWIND
size=3.2T features='0' hwhandler='1 alua' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  |- 20:0:0:0 sdb 8:16 active ready running
  |- 21:0:0:0 sdc 8:32 active ready running
  |- 22:0:0:0 sdd 8:48 active ready running
  `- 23:0:0:0 sde 8:64 active ready running
[root@xs4 ~]#
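
For comparing the healthy output above with the output captured while a node is still synchronizing, a small parser can help. The following is only a sketch: the function name summarize_multipath and the SAMPLE constant are made up for illustration, and it assumes the text has the same layout as the multipath -ll output shown above (a map header line containing a dm-N device, followed by path lines in H:C:T:L form).

```python
import re
from collections import defaultdict

# Matches a path line such as "  |- 20:0:0:0 sdb 8:16 active ready running".
PATH_RE = re.compile(
    r"(\d+:\d+:\d+:\d+)\s+(\S+)\s+(\d+:\d+)\s+(\S+)\s+(\S+)\s+(\S+)\s*$"
)
# Matches a map header such as "2ec9fc038c6f797a1 dm-0 STARWIND,STARWIND".
# The optional "(...)" group covers maps that are listed with an alias.
MAP_RE = re.compile(r"^(\S+)\s+(?:\(\S+\)\s+)?(dm-\d+)\s+(\S+)")

# Sample text copied from the healthy output posted in this thread.
SAMPLE = """[root@xs4 ~]# multipath -ll
2ec9fc038c6f797a1 dm-0 STARWIND,STARWIND
size=3.2T features='0' hwhandler='1 alua' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  |- 20:0:0:0 sdb 8:16 active ready running
  |- 21:0:0:0 sdc 8:32 active ready running
  |- 22:0:0:0 sdd 8:48 active ready running
  `- 23:0:0:0 sde 8:64 active ready running
[root@xs4 ~]#
"""


def summarize_multipath(text):
    """Count paths per multipath map, split into "active" and "other"."""
    summary = defaultdict(lambda: {"active": 0, "other": 0})
    current = None
    for line in text.splitlines():
        header = MAP_RE.match(line)
        if header:
            current = header.group(1)
            summary[current]  # register the map even if no paths follow
            continue
        path = PATH_RE.search(line)
        if path and current is not None:
            # Group 4 is the device-mapper path state (for example "active").
            state = "active" if path.group(4) == "active" else "other"
            summary[current][state] += 1
    return dict(summary)


if __name__ == "__main__":
    for name, counts in summarize_multipath(SAMPLE).items():
        print(name, counts)
```

An empty result corresponds to the blank multipath -ll output described earlier, while a map with fewer than four active paths would suggest that only some of the iSCSI sessions came back.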
Boris (staff)

Tue Jul 03, 2018 2:44 pm

zendzipr,

Please submit a support ticket referring to this thread. I will post any solution here once we find out what is wrong.
zendzipr

Thu Jul 05, 2018 4:25 pm

Boris,

Thank you. Request submitted. The ticket number is 56702.
Boris (staff)

Fri Jul 06, 2018 12:34 am

Great. We will keep our communication in the ticket for now and post the final solution here later.
Boris (staff)

Mon Jul 09, 2018 3:51 pm

During a remote session with zendzipr, we were unable to reproduce the issue he had encountered earlier. StarWind VSAN worked as expected and provided the XenServer hosts with access to storage with only one partner node of the HA setup synchronized.