
HAImage (Non-Active)

Postby Elcore » Tue Jan 15, 2019 3:13 pm

We had a power outage on Saturday, and one of the 2 cluster computers shut down unexpectedly. Since then, the storage for that server has been showing as (Non-Active) in the StarWind Management Console. I have reconnected the iSCSI targets and the drives show up in Windows Disk Management, but the StarWind Management Console doesn't connect them. I need some help getting things back up and running properly again.

Thanks for any help that can be provided.
Elcore
 
Posts: 12
Joined: Tue Jan 15, 2019 2:51 pm

Re: HAImage (Non-Active)

Postby Boris (staff) » Tue Jan 15, 2019 8:03 pm

Please post as much information as possible including logs, screenshots etc. Sensitive information can be posted via direct messages.
Boris (staff)
Staff
 
Posts: 806
Joined: Fri Jul 28, 2017 8:18 am

Re: HAImage (Non-Active)

Postby Elcore » Wed Jan 16, 2019 2:19 pm

Here is the log file and a screen capture. I don't know what else is needed but if you do, please let me know and I will do my best to get you what you need to help me.

Actually the system is not allowing me to upload the log file...

John
Attachments
StarWindCapture.jpg (190.35 KiB)

Re: HAImage (Non-Active)

Postby Boris (staff) » Wed Jan 16, 2019 2:44 pm

Upload the logs to some file sharing service (Dropbox, Google Drive, WeTransfer etc.) and post me a link to the bundle via private messages. Use StarWind Log Collector from https://knowledgebase.starwindsoftware. ... collector/

I need a bit more information on your configuration. Particularly, I am interested in the storage type.
1. What do you use as storage on node 1? Is it a physical RAID configuration or Storage Spaces?
2. Is the partition where the StarWind files reside available on node 1? Does it have a drive letter assigned?
3. Have you tried restarting the StarWind service on node 1? If not, I would suggest doing so.

Feel free to share any information.

Re: HAImage (Non-Active)

Postby Elcore » Wed Jan 16, 2019 3:00 pm

I will work on getting you the logs. In the meantime: the storage on both nodes is RAID using SSDs, and the partition is available on node 1 with a drive letter assigned. I have indeed tried restarting the StarWind service and the server several times, including after attempting changes that did not work. The cluster and storage have been up and running for over 2 years, and suddenly this happened over the weekend due to a power outage. Currently our VMs are all running on node 2, as that server didn't shut down during the power outage.

Re: HAImage (Non-Active)

Postby Boris (staff) » Thu Jan 17, 2019 3:56 pm

Unfortunately, I cannot download the log bundle using the link you sent me via PM. Please change the file access rights to "Anyone with the link" so that I can proceed.

Re: HAImage (Non-Active)

Postby Elcore » Thu Jan 17, 2019 8:01 pm

I have updated the link and changed the access and sent you a new link via PM.

Re: HAImage (Non-Active)

Postby Boris (staff) » Thu Jan 17, 2019 8:47 pm

Show me screenshots of the content of the below folders on host 1:
Code:
D:\Witness\
D:\Storage1\
D:\Storage2\
D:\FileServerStorage\

Re: HAImage (Non-Active)

Postby Boris (staff) » Thu Jan 17, 2019 9:08 pm

Preferably with file extensions enabled.
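
If screenshots are inconvenient, a text listing carries the same information. Below is a minimal Python sketch that prints every file (with its extension and size) in the four folders named above. It assumes Python is available on host 1; the helper names `list_folder` and `report` are made up for this example and are not part of any StarWind tooling.

```python
from pathlib import Path

# Folder paths as given in the post above; adjust if the layout differs.
FOLDERS = [
    r"D:\Witness",
    r"D:\Storage1",
    r"D:\Storage2",
    r"D:\FileServerStorage",
]


def list_folder(folder):
    """Return sorted (file name, size in bytes) pairs for the files directly
    inside `folder`, or None if the folder does not exist."""
    root = Path(folder)
    if not root.is_dir():
        return None
    return sorted((p.name, p.stat().st_size) for p in root.iterdir() if p.is_file())


def report(folders):
    """Build a plain-text listing, one line per folder and per file."""
    lines = []
    for folder in folders:
        entries = list_folder(folder)
        if entries is None:
            lines.append(f"{folder}: MISSING")
            continue
        lines.append(f"{folder}: {len(entries)} file(s)")
        for name, size in entries:
            lines.append(f"  {name}  {size} bytes")
    return lines


if __name__ == "__main__":
    print("\n".join(report(FOLDERS)))
```

The output can be pasted straight into a reply, and full file names (including extensions) are always shown regardless of the Explorer view settings.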

Re: HAImage (Non-Active)

Postby Elcore » Thu Jan 17, 2019 9:25 pm

Here are the screenshots you requested.

John
Attachments
Storage2.jpg (117.24 KiB)
Storage1.jpg (121.76 KiB)
FileServerStorage.jpg (119.84 KiB)

Re: HAImage (Non-Active)

Postby Boris (staff) » Fri Jan 18, 2019 12:28 am

The information you provided in your original post in this thread does not match the logs.
According to you, the issue manifested on Saturday (i.e. January 12), as the nodes went down after a power outage. Yet, according to the logs, the nodes were able to get all disks synchronized on Sunday, Jan 13:
Code:
788   HyperCluster1.electro-core.com   1948230   Information   High Availability Device iqn.2008-08.com.starwindsoftware:hypercluster1.electro-core.com-downloads, current Node Synchronization complete, Synchronizer is Partner Node iqn.2008-08.com.starwindsoftware:hypercluster2-downloads   StarWindService   1/13/2019 11:37:30 AM
773   HyperCluster1.electro-core.com   1948229   Information   High Availability Device iqn.2008-08.com.starwindsoftware:hypercluster1.electro-core.com-downloads, current Node State has changed to "Synchronized"   StarWindService   1/13/2019 11:37:30 AM
787   HyperCluster1.electro-core.com   1948228   Information   High Availability Device iqn.2008-08.com.starwindsoftware:hypercluster1.electro-core.com-downloads, current Node Synchronization started, Synchronizer is Partner Node iqn.2008-08.com.starwindsoftware:hypercluster2-downloads   StarWindService   1/13/2019 11:37:28 AM
774   HyperCluster1.electro-core.com   1948227   Warning   High Availability Device iqn.2008-08.com.starwindsoftware:hypercluster1.electro-core.com-downloads, current Node State has changed to "Synchronizing"   StarWindService   1/13/2019 11:37:28 AM
788   HyperCluster1.electro-core.com   1948226   Information   High Availability Device iqn.2008-08.com.starwindsoftware:hypercluster1.electro-core.com-fileserverstorage, current Node Synchronization complete, Synchronizer is Partner Node iqn.2008-08.com.starwindsoftware:hypercluster2-fileserverstorage   StarWindService   1/13/2019 11:37:27 AM
773   HyperCluster1.electro-core.com   1948225   Information   High Availability Device iqn.2008-08.com.starwindsoftware:hypercluster1.electro-core.com-fileserverstorage, current Node State has changed to "Synchronized"   StarWindService   1/13/2019 11:37:27 AM
902   HyperCluster1.electro-core.com   1948224   0   "The Software Protection service has started.
6.1.7601.17514"   Software Protection Platform Service   1/13/2019 11:36:46 AM

Finally, the devices went out of sync on the same day, Jan 13 at 12:56:16 PM, when the following happened:
Code:
"The process Explorer.EXE has initiated the restart of computer HYPERCLUSTER1 on behalf of user ELECTRO-CORE\Norm for the following reason: Application: Maintenance (Planned)
Reason Code: 0x84040001
Shutdown Type: restart
Comment: "

Also, there was an unexpected shutdown on node 1:
Code:
The previous system shutdown at 1:35:12 PM on 1/13/2019 was unexpected.

This looks pretty much like the event you initially meant. Am I right?
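
For anyone who wants to rebuild this kind of timeline from their own event export, here is a minimal Python sketch. It assumes the export has the same seven columns as the excerpt above (event ID, computer, record number, level, message, source, timestamp), separated by tabs or by runs of two or more spaces. The function names and the column interpretation are assumptions made for this example. Multi-line entries, such as the Software Protection one above, are simply skipped.

```python
import re
from datetime import datetime

# Matches the state-change messages shown in the excerpt above.
STATE_RE = re.compile(r'Device (\S+), current Node State has changed to "([^"]+)"')

# Three lines copied from the excerpt above, used as sample input.
SAMPLE = [
    '773   HyperCluster1.electro-core.com   1948229   Information   High Availability Device iqn.2008-08.com.starwindsoftware:hypercluster1.electro-core.com-downloads, current Node State has changed to "Synchronized"   StarWindService   1/13/2019 11:37:30 AM',
    '774   HyperCluster1.electro-core.com   1948227   Warning   High Availability Device iqn.2008-08.com.starwindsoftware:hypercluster1.electro-core.com-downloads, current Node State has changed to "Synchronizing"   StarWindService   1/13/2019 11:37:28 AM',
    '773   HyperCluster1.electro-core.com   1948225   Information   High Availability Device iqn.2008-08.com.starwindsoftware:hypercluster1.electro-core.com-fileserverstorage, current Node State has changed to "Synchronized"   StarWindService   1/13/2019 11:37:27 AM',
]


def parse_line(line):
    """Split one exported line into its seven columns; return None if it does not fit."""
    fields = re.split(r"\t+|\s{2,}", line.strip())
    if len(fields) != 7:
        return None
    event_id, computer, record, level, message, source, stamp = fields
    try:
        when = datetime.strptime(stamp, "%m/%d/%Y %I:%M:%S %p")
        record_no = int(record)
    except ValueError:
        return None
    return {"event_id": event_id, "computer": computer, "record": record_no,
            "level": level, "message": message, "source": source, "time": when}


def state_changes(lines):
    """Return (time, record number, device, new state) tuples, oldest first."""
    changes = []
    for line in lines:
        rec = parse_line(line)
        if rec is None:
            continue
        match = STATE_RE.search(rec["message"])
        if match:
            changes.append((rec["time"], rec["record"], match.group(1), match.group(2)))
    return sorted(changes)


if __name__ == "__main__":
    for when, record, device, state in state_changes(SAMPLE):
        print(when.isoformat(sep=" "), record, device.rsplit("-", 1)[-1], state)
```

Sorting by time and then by record number puts the entries in chronological order, which makes it easier to see when each device last reached "Synchronized" before the restart.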

Anyway, the present status is as follows: your devices on node 1 are non-active because the HA header files are missing for three out of four disks on node 1. The one for Storage1 is still there, but its structure is totally corrupted, and thus the file is not usable at all. Unfortunately, the StarWind logs only cover the time starting from 1/13 15:59:09.821 (earlier entries were overwritten by log rotation), so we are not really able to determine what exactly happened there. Check your Windows Security log for event 4660 related to Witness_HA.swdsk, Storage1_HA.swdsk, FileServerStorage_HA.swdsk and Storage2_HA.swdsk on node 1 after Jan 13, 12:56:16 PM for more information on what or who deleted the files. Note that these events will only be present if file system auditing had been configured there.
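
One practical note on that search: as far as I know, event 4660 itself records a handle ID rather than the file name, so it usually has to be matched against the preceding event 4663 (object access) with the same handle ID to learn which file was deleted. The Python sketch below illustrates that correlation only. It assumes the Security log has already been exported and converted into a chronological list of dicts; the keys `event_id`, `handle_id`, `object_name`, `account` and `time` are hypothetical names chosen for this example, not a real export format.

```python
# Header file names taken from the post above.
TARGETS = (
    "Witness_HA.swdsk",
    "Storage1_HA.swdsk",
    "FileServerStorage_HA.swdsk",
    "Storage2_HA.swdsk",
)


def find_deletions(events, targets=TARGETS):
    """Return (time, account, object name) for each 4660 deletion of a target file.

    `events` must be in chronological order. Each item is a dict with the
    hypothetical keys described above; 4660 records need no `object_name`.
    """
    wanted = tuple(name.lower() for name in targets)
    last_object_for_handle = {}  # handle ID -> object name from the latest 4663
    hits = []
    for ev in events:
        if ev["event_id"] == 4663:
            # Handle IDs are reused over time, so keep only the most recent name.
            last_object_for_handle[ev["handle_id"]] = ev["object_name"]
        elif ev["event_id"] == 4660:
            name = last_object_for_handle.get(ev["handle_id"])
            if name and name.lower().endswith(wanted):
                hits.append((ev["time"], ev["account"], name))
    return hits
```

An empty result does not prove that nothing was deleted: without file system auditing enabled on the folder beforehand, neither event is logged.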

In the current situation, I recommend the following:
1. Force remove all targets on node 1.
2. Remove all StarWind-related folders with their files from the D drive on node 1.
3. For each of the devices on node 2, select Replication Manager and delete the non-existent replicas.
4. For each of the devices on node 2, create a replica on node 1 using the appropriate sync link and the two heartbeat links.
5. In the iSCSI initiators on both nodes, connect the newly appeared targets.
After connectivity to storage is restored from both nodes and the paths to storage are redundant, I would recommend updating your StarWind installation to the latest build available on our website, as you are still using a rather outdated build.

Re: HAImage (Non-Active)

Postby Elcore » Sat Jan 19, 2019 3:31 pm

I was not there and only assumed that it was due to a power outage based on the information that I was given. I appreciate you taking the time to figure this out and to provide a solution. I will be working on it this morning and will report back.

John

Re: HAImage (Non-Active)

Postby Boris (staff) » Sun Jan 20, 2019 8:41 pm

John,

Let me know if you need any additional assistance with this.

