Waiting for automatic synchronisation

aandrewc
Posts: 4
Joined: Thu Apr 15, 2021 3:35 pm

Thu Apr 15, 2021 3:49 pm

Hi
I have a two-node StarWind setup that has been working with no issues for about a year. No changes have been made.
I have noticed that none of my targets are synced between the nodes any more. The active node is showing as synced, and there are active iSCSI connections to a number of Hyper-V hosts.
On the non-active host I initiated a sync of one of the targets, but it is still not syncing. I then restarted the StarWind HA service on the non-active node. After that, when I attempt a manual sync, I get the following message.

"Failed connection with the synchronizer or partner node is invalid." Here is an extract from the log:

4/15 16:03:43.255356 1a40 HA: HANode::synchronizeCurrentNode: Params(partnerNode = 0x000002103EBE0080, syncType = 2, snapshotId = 0x0, ignoreOwnSyncState = 0) ENTERed
4/15 16:03:43.255950 1a40 HA: HANode::synchronizeCurrentNode: Shutdown node before sending synchronization request!
4/15 16:03:43.255981 1a40 HA: HANode::shutdownCurrentNode_internal: Params(closeClientConnections = 1, sendShutdownNotifications = 1, partnerNodeNotificationExcluded = 0x0000000000000000, synchronizeCache = 1, synchronizeCacheSynchronously = 0, useHeaderCache = 0) ENTERed for target 'iqn.2008-08.com.starwindsoftware:192.168.15.3-csv-01'.
4/15 16:03:43.256108 1a40 HA: HANode::setIsNodeActive: Params(active = 0) ENTERed for target 'iqn.2008-08.com.starwindsoftware:192.168.15.3-csv-01'.
4/15 16:03:43.266922 1a40 HA: CHAPartnerNode::SendSynchronizationRequestCommand: ENTERed with params: p_SyncType = 2, p_ulSnapshotID = 0x0, p_bContinue = no, p_bVaaiWriteSame = yes
4/15 16:03:43.267890 1a40 HA: HA_Partner_iScsi_Device::send_custom_control_scsi_command: custom cmd code 0xd, send failed on the target(name: iqn.2008-08.com.starwindsoftware:cv-san01-csv-01, lun: 0x0) side
4/15 16:03:43.267937 1a40 HA: HA_Partner_iScsi_Device::send_custom_control_scsi_command: custom cmd code 0xd, SCSI sense data dump: Error Code: 0x00, Is Sense Valid: No, Segment Number: 0x00 (0)
4/15 16:03:43.267962 1a40 HA: HA_Partner_iScsi_Device::send_custom_control_scsi_command: custom cmd code 0xd, Sense Key: 0x02, Reserved: 0x00, Incorrect Length: No
4/15 16:03:43.267984 1a40 HA: HA_Partner_iScsi_Device::send_custom_control_scsi_command: custom cmd code 0xd, Additional Sense Code (ASC): 0x08, Additional Sense Code Qualifier (ASCQ): 0x00
4/15 16:03:43.268007 1a40 HA: *** CHAPartnerNode::SendSynchronizationRequestCommand: EXITing with failure, SendCustomControlScsiCommand() failed, error code 0, scsi status = 2!
4/15 16:03:43.268023 1a40 HA: *** HANode::synchronizeCurrentNode: Sending of synchronization request failed, error state 21!
4/15 16:03:48.269608 1a40 HA: HANode::synchronizeCurrentNode: Params(partnerNode = 0x000002103EBE0080, syncType = 2, snapshotId = 0x0, ignoreOwnSyncState = 0) ENTERed
4/15 16:03:48.270081 1a40 HA: HANode::synchronizeCurrentNode: Shutdown node before sending synchronization request!
4/15 16:03:48.270096 1a40 HA: HANode::shutdownCurrentNode_internal: Params(closeClientConnections = 1, sendShutdownNotifications = 1, partnerNodeNotificationExcluded = 0x0000000000000000, synchronizeCache = 1, synchronizeCacheSynchronously = 0, useHeaderCache = 0) ENTERed for target 'iqn.2008-08.com.starwindsoftware:192.168.15.3-csv-01'.
4/15 16:03:48.270295 1a40 HA: HANode::setIsNodeActive: Params(active = 0) ENTERed for target 'iqn.2008-08.com.starwindsoftware:192.168.15.3-csv-01'.
4/15 16:03:48.277278 1a40 HA: CHAPartnerNode::SendSynchronizationRequestCommand: ENTERed with params: p_SyncType = 2, p_ulSnapshotID = 0x0, p_bContinue = no, p_bVaaiWriteSame = yes
4/15 16:03:48.278236 1a40 HA: HA_Partner_iScsi_Device::send_custom_control_scsi_command: custom cmd code 0xd, send failed on the target(name: iqn.2008-08.com.starwindsoftware:cv-san01-csv-01, lun: 0x0) side
4/15 16:03:48.278283 1a40 HA: HA_Partner_iScsi_Device::send_custom_control_scsi_command: custom cmd code 0xd, SCSI sense data dump: Error Code: 0x00, Is Sense Valid: No, Segment Number: 0x00 (0)
4/15 16:03:48.278309 1a40 HA: HA_Partner_iScsi_Device::send_custom_control_scsi_command: custom cmd code 0xd, Sense Key: 0x02, Reserved: 0x00, Incorrect Length: No
4/15 16:03:48.278331 1a40 HA: HA_Partner_iScsi_Device::send_custom_control_scsi_command: custom cmd code 0xd, Additional Sense Code (ASC): 0x08, Additional Sense Code Qualifier (ASCQ): 0x00
4/15 16:03:48.278355 1a40 HA: *** CHAPartnerNode::SendSynchronizationRequestCommand: EXITing with failure, SendCustomControlScsiCommand() failed, error code 0, scsi status = 2!
4/15 16:03:48.278370 1a40 HA: *** HANode::synchronizeCurrentNode: Sending of synchronization request failed, error state 21!
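
For readers following the excerpt above, here is a minimal sketch of how the SCSI fields in these lines could be decoded. It only knows the three values that appear in this log, using their standard SCSI names: status 2 is CHECK CONDITION, sense key 0x02 is NOT READY, and ASC/ASCQ 0x08/0x00 is LOGICAL UNIT COMMUNICATION FAILURE. The helper name `decode_sense` and the regular expressions are assumptions made for illustration and are not part of StarWind. Note that the log itself reports "Is Sense Valid: No", so the decoded values are a hint rather than a diagnosis.

```python
import re

# Lookup tables limited to the values that appear in the excerpt above.
# The names are the standard SCSI ones; other values are reported as "unknown".
SCSI_STATUS = {0x02: "CHECK CONDITION"}
SENSE_KEYS = {0x02: "NOT READY"}
ASC_ASCQ = {(0x08, 0x00): "LOGICAL UNIT COMMUNICATION FAILURE"}

STATUS_RE = re.compile(r"scsi status = (\d+)")
SENSE_KEY_RE = re.compile(r"Sense Key: 0x([0-9A-Fa-f]+)")
ASC_RE = re.compile(r"\(ASC\): 0x([0-9A-Fa-f]+),.*\(ASCQ\): 0x([0-9A-Fa-f]+)")


def decode_sense(log_text):
    """Return a sorted list of decoded SCSI findings (hypothetical helper)."""
    findings = set()
    for line in log_text.splitlines():
        m = STATUS_RE.search(line)
        if m:
            # Assumption: the log prints the status as a decimal number.
            status = int(m.group(1))
            name = SCSI_STATUS.get(status, "unknown")
            findings.add(f"status 0x{status:02x}: {name}")
        m = SENSE_KEY_RE.search(line)
        if m:
            key = int(m.group(1), 16)
            name = SENSE_KEYS.get(key, "unknown")
            findings.add(f"sense key 0x{key:02x}: {name}")
        m = ASC_RE.search(line)
        if m:
            asc, ascq = int(m.group(1), 16), int(m.group(2), 16)
            name = ASC_ASCQ.get((asc, ascq), "unknown")
            findings.add(f"ASC/ASCQ 0x{asc:02x}/0x{ascq:02x}: {name}")
    return sorted(findings)


if __name__ == "__main__":
    # Shortened copies of three lines from the excerpt above.
    sample = (
        "custom cmd code 0xd, Sense Key: 0x02, Reserved: 0x00\n"
        "custom cmd code 0xd, Additional Sense Code (ASC): 0x08, "
        "Additional Sense Code Qualifier (ASCQ): 0x00\n"
        "SendCustomControlScsiCommand() failed, error code 0, scsi status = 2!\n"
    )
    for finding in decode_sense(sample):
        print(finding)
```

Read together, these values suggest the partner target rejected the synchronization request because it could not reach its logical unit, which fits a device that is not attached on the partner side.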
yaroslav (staff)
Staff
Posts: 2360
Joined: Mon Nov 18, 2019 11:11 am

Thu Apr 15, 2021 4:05 pm

Were you able to start synchronizing?
Please share all logs with us (from all servers) https://knowledgebase.starwindsoftware. ... collector/. Share them via Google Drive or any other file sharing service.
aandrewc
Posts: 4
Joined: Thu Apr 15, 2021 3:35 pm

Fri Apr 16, 2021 8:41 am

Okay

Thank you for the feedback. I looked this morning and the status is still the same.
I have collected the logs for both of the SANs, and here is the link: https://drive.google.com/drive/folders/ ... sp=sharing
yaroslav (staff)
Staff
Posts: 2360
Joined: Mon Nov 18, 2019 11:11 am

Mon Apr 19, 2021 4:32 am

I have checked the log.
First, try restarting the StarWind Service on the non-synchronized node.
Please remove the replica to 02, the non-synchronized node, and replicate the HA device on node 01 again. Make sure you have backups, though (just as a safety measure). See the scripts at C:\Program Files\StarWind Software\StarWind\StarWindX\Samples\powershell. To remove the replica, use RemoveHAPartner; to add one, use AddHAPartner. I would rather try the scripts on a test StarWind HA device first.
Once the replica is removed and the full sync of CSV-01 is over, update StarWind VSAN to build 14033. See the update procedure at https://knowledgebase.starwindsoftware. ... d-version/. The latest build is available at https://www.starwindsoftware.com/tmplin ... ind-v8.exe
aandrewc
Posts: 4
Joined: Thu Apr 15, 2021 3:35 pm

Mon Apr 19, 2021 3:18 pm

Okay, making some progress now.
Before, I had just restarted the StarWind HA service, and that did not result in synchronization. However, when I restarted all the StarWind services on unit 02, the targets started to sync, and the sync has completed successfully for 2 of the 3 targets. The 3rd target has not synced, and what I am seeing is that on unit 02 the device is no longer attached to the target. Do I need to add the existing device back? (I am very nervous: how do I know that when I add the device back it won't come up with a status of synchronized or, worse still, sync back to unit 01 instead of syncing data from unit 01?)
If I look at unit 01 in Replication Manager, its status is "not connected". Is there a way to proceed with minimal risk, even if it means starting again with unit 02? My main concern is that unit 01 does not get overwritten with the stale data on unit 02.
yaroslav (staff)
Staff
Posts: 2360
Joined: Mon Nov 18, 2019 11:11 am

Mon Apr 19, 2021 3:58 pm

Hi,

If you are using the commercial license, you may have exceeded the allowed HA capacity.
Please check whether you can navigate to the file location and whether the headers look fine: there should be .img, _HA.swdsk, and .swdsk files. The cause can be malware, a bad path to the storage, malfunctioning underlying storage, using more HA storage than the license allows, etc.
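
The file check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three suffixes are taken from this post, the function names are made up for the example, and the folder you pass in is whatever device folder your own setup uses. It does not inspect the header contents, only whether files with the expected endings are present.

```python
from pathlib import Path

# Suffixes named in the post above; matching is case-insensitive.
EXPECTED_SUFFIXES = (".img", "_HA.swdsk", ".swdsk")


def check_device_folder(folder):
    """Map each expected suffix to the matching file names in `folder`."""
    names = sorted(p.name for p in Path(folder).iterdir() if p.is_file())
    found = {suffix: [] for suffix in EXPECTED_SUFFIXES}
    for name in names:
        lower = name.lower()
        # Test the longer "_HA.swdsk" ending first so that an HA header
        # is not also counted as a plain ".swdsk" header.
        if lower.endswith("_ha.swdsk"):
            found["_HA.swdsk"].append(name)
        elif lower.endswith(".swdsk"):
            found[".swdsk"].append(name)
        elif lower.endswith(".img"):
            found[".img"].append(name)
    return found


def missing_suffixes(folder):
    """Return the expected suffixes for which no file was found."""
    return [s for s, files in check_device_folder(folder).items() if not files]
```

If the folder itself cannot be opened, `Path.iterdir()` raises an error such as `FileNotFoundError`, which already points at the "bad path to the storage" case. An empty result from `missing_suffixes` only means the files exist, not that they are intact.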

What you can do is recreate the replica. In Management Console, go to Replication Manager for the healthy device, remove the replica to the affected site, navigate to the device folder on the non-synchronized side, and delete everything there. Then recreate the replica of the healthy device. Please mind the HA device priorities (see more here https://forums.starwindsoftware.com/vie ... ity#p32173).

Please consider updating both hosts to the latest build after full sync and reconnecting iSCSI targets.
aandrewc
Posts: 4
Joined: Thu Apr 15, 2021 3:35 pm

Thu Apr 22, 2021 10:16 am

Just thought I would let you know that this has worked. All back up and running now, thanks for your help.
For the last part, I removed the replica in Replication Manager and added it back in, and to be safe I used a new image on the target. It has now all synced up.
yaroslav (staff)
Staff
Posts: 2360
Joined: Mon Nov 18, 2019 11:11 am

Thu Apr 22, 2021 10:45 am

That's cool. Thank you for the update.