I appreciate this may, in fact, require a support ticket; however, I would like to run my situation past you:
We have a two-node cluster based on StarWind vSAN Free, running on two Dell R620 boxes with LSI PERC 710i RAID controllers.
One of the nodes has flagged a drive as 'Predicted fail'. To fix this, one has to delete the RAID config from the controller and start again with the physical drives in different RAID slots.
The system disk for the affected node does not share any physical devices with the data volume.
I appreciate that destroying the RAID volume will break StarWind - what was a nice drive full of synced data will be empty.
So I have had a think and come up with a plan of action. Bear in mind I am well past the 30-day period during which the StarWind GUI can change the config.
- Pause the affected node, draining the roles
- Shut down VMs (most of them, DCs would be problematic)
- Disable StarWind on the affected node once everything is synced
- Copy the StarWind data off the data volume onto local removable storage (yes, it will take forever)
- Shut down the paused node
- Perform RAID magic to delete the volume and recreate it with the physical disks in new slots, replacing disks as necessary to remove any hard read faults
- Reboot the node
- Format the drive in Windows
- Copy the data back from local removable storage
- Enable StarWind but don't start the service
- Reboot - this should allow StarWind to come back, and iSCSI will hopefully recover
- Watch StarWind resync as necessary and enjoy another period of trouble-free cluster
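For the two copy steps, I would want to confirm that the files survive the round trip unchanged before I let StarWind near them again. Below is a rough sketch of the check I have in mind, in Python. It is generic file hashing and nothing in it is StarWind-specific; the function names are my own, and it assumes the StarWind service is stopped so the files are not changing while they are read.

```python
import hashlib
import sys
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash one file in chunks so large image files do not need to fit in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def tree_hashes(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_trees(source, copy):
    """Return (missing, extra, changed) relative paths between two folders."""
    src = tree_hashes(source)
    dst = tree_hashes(copy)
    missing = sorted(set(src) - set(dst))   # in source, not in copy
    extra = sorted(set(dst) - set(src))     # in copy, not in source
    changed = sorted(k for k in src.keys() & dst.keys() if src[k] != dst[k])
    return missing, extra, changed


if __name__ == "__main__" and len(sys.argv) == 3:
    missing, extra, changed = compare_trees(sys.argv[1], sys.argv[2])
    print("missing:", missing)
    print("extra:", extra)
    print("changed:", changed)
```

I would run it once after copying off (data volume against the removable disk) and again after copying back, passing the two folders as arguments. The script name and paths here are placeholders, e.g. `python verify_copy.py D:\StarWindData E:\Backup`. Three empty lists would mean the copy matches.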
Thanks for your time
Ben