It took some time to find a slot for the maintenance... I shut down all VMware servers that were accessing the VSAN and shut down the StarWind services on both nodes. Then I corrected the cfg files on both nodes. I also added some more heartbeat interfaces by editing the swdsk files on both nodes (I read somewhere that it is better to have several interfaces to prevent split-brain situations; I think it was somewhere in your FAQ, but I can't find it anymore).
<storage id="3" name="iqn.2008-08.com.starwindsoftware:vsan01-vsandatastore1" type="remote" lun="0x0">
  <transport type="iSCSI">
    <links>
      <link id="1" type="data" priority="1" connections="1">
        <peer ip="10.254.251.1" port="3260"/>
      </link>
      <link id="2" type="data" priority="1" connections="1">
        <peer ip="10.254.250.1" port="3260"/>
      </link>
      <link id="3" type="control" priority="1" connections="1">
        <peer ip="192.168.181.50" port="3260"/>
      </link>
      <link id="4" type="control" priority="1" connections="1">
        <peer ip="10.254.250.1" port="3260"/>
      </link>
      <link id="5" type="control" priority="1" connections="1">
        <peer ip="10.254.251.1" port="3260"/>
      </link>
    </links>
  </transport>
</storage>
I added the links with id 4 and 5 on both nodes (these are the directly connected sync interfaces between the nodes, with no switch in between).
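To double-check the edit on each node, I used something like the following. It is only a minimal Python sketch: it assumes the storage section has been copied out of the swdsk file into a string, and that the links with type="control" are the heartbeat links (that is how I understood it). The function name `links_by_type` is my own and not part of any StarWind tool.

```python
import xml.etree.ElementTree as ET

# Storage section as pasted above (copied by hand out of the swdsk file).
SAMPLE = """
<storage id="3" name="iqn.2008-08.com.starwindsoftware:vsan01-vsandatastore1" type="remote" lun="0x0">
  <transport type="iSCSI">
    <links>
      <link id="1" type="data" priority="1" connections="1">
        <peer ip="10.254.251.1" port="3260"/>
      </link>
      <link id="2" type="data" priority="1" connections="1">
        <peer ip="10.254.250.1" port="3260"/>
      </link>
      <link id="3" type="control" priority="1" connections="1">
        <peer ip="192.168.181.50" port="3260"/>
      </link>
      <link id="4" type="control" priority="1" connections="1">
        <peer ip="10.254.250.1" port="3260"/>
      </link>
      <link id="5" type="control" priority="1" connections="1">
        <peer ip="10.254.251.1" port="3260"/>
      </link>
    </links>
  </transport>
</storage>
"""


def links_by_type(xml_text):
    """Group the peer (ip, port) pairs of all <link> elements by their type attribute."""
    root = ET.fromstring(xml_text.strip())
    grouped = {}
    for link in root.iter("link"):
        peer = link.find("peer")
        if peer is None:
            continue  # skip links without a peer entry
        entry = (peer.get("ip"), int(peer.get("port")))
        grouped.setdefault(link.get("type"), []).append(entry)
    return grouped


links = links_by_type(SAMPLE)
for link_type, peers in links.items():
    print(link_type, peers)
```

For the section above this lists two data links and three control links, which matches what I intended to configure.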
After I started the StarWind services on both nodes again, everything came up fine. I tried another extend of the device, which this time ended with a message that it was unsuccessful (a different message from before, when it said the device doesn't support extend). After running the command, a sync was triggered, which took some time. It didn't seem to be a full sync though, as it was done in about 1.5 hours.
The system is up and running and all paths from the VMware servers to the VSAN are active. However, something is strange now: both nodes show status synced, but one of the nodes shows 0% and the other 100%.
Node 1:
...
highavailability="yes"
ha_serialid_string="FD9D5EBB682ACEC2"
ha_synch_status="1"
ha_synch_percent="0"
ha_synch_type="0"
ha_sync_elapsed_time="0"
ha_sync_estimated_time="0"
ha_priority="0"
ha_is_node_removed_from_partners="no"
ha_is_storage_extend_supported="yes"
ha_is_storage_snapshot_supported="no"
ha_is_storage_device_ready="yes"
ha_is_storage_device_readonly="no"
ha_is_SMISHidden="no"
ha_autosynch_enabled="yes"
ha_wait_on_autosynch="0"
ha_auto_sync_priority="1"
ha_maintenance_mode="0"
ha_sync_traffic_share="25"
ha_alua_group_node_state="0"
ha_tracker="no"
ha_tracker_frozen="no"
ha_tracker_snapshots_storage=""
ha_tracker_mount_time="0"
ha_tracker_mount_snapshot=""
ha_tracker_status="-1"
ha_tracker_pending="0"
ha_tracker_replicated="0"
ha_tracker_replicating="0"
ha_tracker_scheduled="0"
ha_node_type="1"
...
ha_partner_nodes_count="1"
ha_failover_config_type="0"
ha_partner_node1_target_name="iqn.2008-08.com.starwindsoftware:192.168.181.51-vsandatastore1"
ha_partner_node1_priority="1"
ha_partner_node1_type="1"
ha_partner_node1_storage_device_type="ImageFile"
ha_partner_node1_sync_channels="10.254.251.2$3260$1;10.254.250.2$3260$1"
ha_partner_node1_heartbeat_channels="192.168.181.51$3260$1;10.254.250.2$3260$1;10.254.251.2$3260$1"
ha_partner_node1_is_exist_sync_valid_connection="1"
ha_partner_node1_is_exist_heartbeat_valid_connection="1"
ha_partner_node1_sync_status="1"
ha_partner_node1_sync_percent="100"
ha_partner_node1_sync_type="0"
ha_partner_node1_sync_elapsed_time="0"
ha_partner_node1_sync_estimated_time="0"
ha_partner_node1_tracker_frozen="no"
ha_partner_node1_tracker_snapshots_storage=""
ha_partner_node1_tracker_mount_time="0"
ha_partner_node1_tracker_mount_snapshot=""
ha_partner_node1_auth_chap_type="None"
ha_partner_node1_auth_chap_login=""
ha_partner_node1_auth_chap_password=""
ha_partner_node1_auth_mutual_chap_name=""
ha_partner_node1_auth_mutual_chap_secret=""
...
Node 2:
...
highavailability="yes"
ha_serialid_string="FD9D5EBB682ACEC2"
ha_synch_status="1"
ha_synch_percent="100"
ha_synch_type="0"
ha_sync_elapsed_time="0"
ha_sync_estimated_time="0"
ha_priority="1"
ha_is_node_removed_from_partners="no"
ha_is_storage_extend_supported="yes"
ha_is_storage_snapshot_supported="no"
ha_is_storage_device_ready="yes"
ha_is_storage_device_readonly="no"
ha_is_SMISHidden="no"
ha_autosynch_enabled="yes"
ha_wait_on_autosynch="0"
ha_auto_sync_priority="1"
ha_maintenance_mode="0"
ha_sync_traffic_share="25"
ha_alua_group_node_state="0"
ha_tracker="no"
ha_tracker_frozen="no"
ha_tracker_snapshots_storage=""
ha_tracker_mount_time="0"
ha_tracker_mount_snapshot=""
ha_tracker_status="-1"
ha_tracker_pending="0"
ha_tracker_replicated="0"
ha_tracker_replicating="0"
ha_tracker_scheduled="0"
ha_node_type="1"
...
ha_partner_nodes_count="1"
ha_failover_config_type="0"
ha_partner_node1_target_name="iqn.2008-08.com.starwindsoftware:vsan01-vsandatastore1"
ha_partner_node1_priority="0"
ha_partner_node1_type="1"
ha_partner_node1_storage_device_type="ImageFile"
ha_partner_node1_sync_channels="10.254.251.1$3260$1;10.254.250.1$3260$1"
ha_partner_node1_heartbeat_channels="192.168.181.50$3260$1;10.254.250.1$3260$1;10.254.251.1$3260$1"
ha_partner_node1_is_exist_sync_valid_connection="1"
ha_partner_node1_is_exist_heartbeat_valid_connection="1"
ha_partner_node1_sync_status="1"
ha_partner_node1_sync_percent="0"
ha_partner_node1_sync_type="0"
ha_partner_node1_sync_elapsed_time="0"
ha_partner_node1_sync_estimated_time="0"
ha_partner_node1_tracker_frozen="no"
ha_partner_node1_tracker_snapshots_storage=""
ha_partner_node1_tracker_mount_time="0"
ha_partner_node1_tracker_mount_snapshot=""
ha_partner_node1_auth_chap_type="None"
ha_partner_node1_auth_chap_login=""
ha_partner_node1_auth_chap_password=""
ha_partner_node1_auth_mutual_chap_name=""
ha_partner_node1_auth_mutual_chap_secret=""
...
Btw, the output is from telnetting to the management interface.
Performance seems to be fine and, as I said, all paths are active. However, it doesn't feel "right" to see it this way. Any kind of advice is appreciated.
Thanks
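To see at a glance where the two dumps differ, I compared them with a small script. This is just a rough Python sketch: it assumes the output of each node was saved as plain text with one key="value" pair per line, and the helper names `parse_status` and `diff_status` are my own. The excerpts below contain only a few of the fields from the dumps above.

```python
import re

# One key="value" pair per line, as in the dumps above.
PAIR = re.compile(r'^(\w+)="(.*)"$')

NODE1 = '''
ha_synch_status="1"
ha_synch_percent="0"
ha_priority="0"
ha_partner_node1_sync_status="1"
ha_partner_node1_sync_percent="100"
'''

NODE2 = '''
ha_synch_status="1"
ha_synch_percent="100"
ha_priority="1"
ha_partner_node1_sync_status="1"
ha_partner_node1_sync_percent="0"
'''


def parse_status(text):
    """Turn key="value" lines into a dict; lines such as "..." are ignored."""
    fields = {}
    for line in text.splitlines():
        match = PAIR.match(line.strip())
        if match:
            fields[match.group(1)] = match.group(2)
    return fields


def diff_status(a, b):
    """Return {key: (value_a, value_b)} for every key whose values differ."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


differences = diff_status(parse_status(NODE1), parse_status(NODE2))
for key, (left, right) in differences.items():
    print(f"{key}: node1={left} node2={right}")
```

On these excerpts only ha_priority, ha_synch_percent and ha_partner_node1_sync_percent differ. The full dumps will of course also differ in the target name and the channel addresses.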
Holger