Well, I think I figured out what is happening, and it has nothing to do with StarWind - lol. The two hosts are linked directly via a pair of 50Gb/sec Mellanox cards. For maximum performance, I pass an SR-IOV virtual function into each StarWind VSA. The problem is this: when I shut down host B, the NIC in host A loses link, and apparently the StarWind VSA on host A can then no longer talk to the Mellanox NIC on its own host, so both paths are down, and vSphere bitches and moans.
Seems like there are three possible ways to fix this:
1. Replace the virtual function NIC with a vmxnet3 NIC. Pro: traffic never leaves the virtual switch, so this avoids the loss of connectivity. Con: reduced performance.
2. Add a 1Gb NIC to each host's virtual switch and make it a failover-only NIC - I don't care about reduced performance in that scenario. Pro: same max performance in the normal case. Con: I need to add a 1Gb link between the two hosts, and it can't be a direct link, or it would have the same issue.
3. Find some way to tell vSphere and/or Server 2016 to ignore the link-down state. Pro: simplest configuration. Con: awful kludge (and I don't even know if that is possible!)