Please help: isolated interfaces cannot connect

zoltan.csoka
Posts: 5
Joined: Sun Oct 14, 2018 8:19 pm

Sun Oct 14, 2018 8:33 pm

Greetings,

I have a 2-node hyper-converged Hyper-V setup with 4x1 Gb NICs and 2x10 Gb NICs. The 10Gb NICs are connected to a dedicated switch and use the 192.168.1.0/24 subnet. The other NICs are on 10.42.30.0/24 and 10.42.31.0/24. If I configure heartbeat and sync on the 1Gb NICs, everything works as expected. If I try to configure them on the 10Gb NICs, they cannot connect. The interfaces themselves work fine; they were previously used for iSCSI connections to a NAS. Jumbo frames are set to 9014, and the nodes can ping each other with an 8970-byte payload without fragmentation. I can also use the links for SMB3 connections, so the network interfaces definitely work for other purposes. On the nodes, netstat shows StarWindService listening on these IPs on port 3260, so the listening side also seems to be working. Firewalls are disabled. iscsicpl can discover the service on the 10Gb IPs.
Could you please help me investigate why StarWind cannot connect for sync or heartbeat? How can I investigate this further? I have already reinstalled StarWind a couple of times and removed and re-added the sync interfaces. Any idea or clue is appreciated.

Kind Regards,

Zoltan
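
For anyone retracing these checks, the "is the listener reachable over the 10Gb link" test can be approximated outside of StarWind with a plain TCP connection that originates from a chosen local address. This is only a minimal sketch: the function name and the example addresses are hypothetical, and a successful TCP connect does not prove that StarWind's own sync or heartbeat session would succeed.

```python
import socket

ISCSI_PORT = 3260  # port the thread reports StarWindService listening on


def can_connect(local_ip, remote_ip, port=ISCSI_PORT, timeout=3.0):
    """Return True if a TCP connection from local_ip to remote_ip:port succeeds."""
    try:
        # Binding to (local_ip, 0) asks the OS to originate the connection
        # from that address, using any free local port.
        with socket.create_connection(
            (remote_ip, port), timeout=timeout, source_address=(local_ip, 0)
        ):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unusable local addresses.
        return False


# Hypothetical usage with made-up addresses on the 192.168.1.0/24 subnet:
#   can_connect("192.168.1.11", "192.168.1.12")
```

Running it once per 10Gb address pair, in both directions, would show whether the failure is already visible at the TCP level or only inside the application.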
Oleg(staff)
Staff
Posts: 568
Joined: Fri Nov 24, 2017 7:52 am

Mon Oct 15, 2018 12:17 pm

Greetings Zoltan,
Are you using the 2x10 Gb NICs in a team? We do not recommend any form of teaming for iSCSI and synchronization channels. Please configure these 2x10 Gb NICs in different subnets.
As for the iSCSI targets: you have 1 Gb connections active and you are trying to connect iSCSI sessions via the 10 Gb interfaces at the same time, am I right?
If so, you should disconnect the 1 Gb sessions and reconnect over the 10 Gb connections. Alternatively, change <iScsiDiscoveryListInterfaces value="0"/> to "1" in the StarWind.cfg file. This allows several network interfaces to be used for iSCSI traffic.
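
The config change suggested here can also be scripted. The sketch below works on the file contents as a string and assumes the element is written as a single self-closing tag with straight quotes, under the name the service log later in the thread uses (iScsiDiscoveryListInterfaces). The rest of StarWind.cfg is not shown in this thread, so the function name and error handling are illustrative only, and the original file should be backed up before anything is written back.

```python
import re

# Matches only the element quoted in this thread; nothing else about the
# layout of StarWind.cfg is assumed.
_PATTERN = re.compile(r'(<iScsiDiscoveryListInterfaces\s+value=")[^"]*("\s*/>)')


def set_discovery_list_interfaces(cfg_text, value="1"):
    """Return cfg_text with the setting changed; fail unless it occurs exactly once."""
    new_text, count = _PATTERN.subn(
        lambda m: m.group(1) + value + m.group(2), cfg_text
    )
    if count != 1:
        raise ValueError(f"expected exactly one match, found {count}")
    return new_text
```

Refusing to continue unless there is exactly one match is a deliberate choice: a silent no-op on a misspelled element would look like a successful edit.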
zoltan.csoka
Posts: 5
Joined: Sun Oct 14, 2018 8:19 pm

Mon Oct 15, 2018 1:07 pm

Hi Oleg,

The 2x10Gb NICs are not teamed; I planned to use them in a failover MPIO configuration for sync. Previously I tried configuring both sync and heartbeat on the 10Gb NICs, and also heartbeat on 1Gb with sync on 10Gb, but the 10Gb links always threw a connection error. The 10Gb NICs are connected to an isolated switch, so that subnet is only reachable from the servers, but I suppose that should not be an issue.
Will the heartbeat always make an iSCSI connection? I did not make any iSCSI connection via iscsicpl at the time the 10Gb networks were throwing the error, but maybe StarWind itself used it somehow.
I will modify the StarWind config as you recommended and give it another try.
Boris (staff)
Staff
Posts: 805
Joined: Fri Jul 28, 2017 8:18 am

Mon Oct 15, 2018 3:20 pm

Keep us updated, Zoltan. Your information will be useful for the community.
zoltan.csoka
Posts: 5
Joined: Sun Oct 14, 2018 8:19 pm

Tue Oct 16, 2018 8:59 am

Dear Boris,

Unfortunately it did not help. I have changed the value:

10/16 1:22:34.676 31c8 conf: 'WUSCEnabled' = 'yes'
10/16 1:22:34.676 31c8 conf: 'SrvWasDisabled' = 'no'
10/16 1:22:34.676 31c8 conf: 'SrvRestoreStartType' = '2'
10/16 1:22:34.676 31c8 conf: 'VaaiExCopyEnabled' = 'yes'
10/16 1:22:34.676 31c8 conf: 'VaaiCawEnabled' = 'yes'
10/16 1:22:34.676 31c8 conf: 'VaaiWriteSameEnabled' = 'yes'
10/16 1:22:34.676 31c8 conf: 'OdxEnabled' = 'no'
10/16 1:22:34.676 31c8 conf: 'OdxOptimalRodSizeMB' = '64'
10/16 1:22:34.676 31c8 conf: 'OdxMaximumRodSizeMB' = '256'
10/16 1:22:34.676 31c8 conf: 'OdxRodTokenDefaultTimeoutSec' = '10'
10/16 1:22:34.676 31c8 conf: 'OdxRodTokenMaximumTimeoutSec' = '30'
10/16 1:22:34.676 31c8 conf: 'Port' = '3260'
10/16 1:22:34.676 31c8 conf: 'Interface' = '0.0.0.0'
10/16 1:22:34.676 31c8 conf: 'BCastEnable' = 'yes'
10/16 1:22:34.676 31c8 conf: 'BCastInterface' = '0.0.0.0'
10/16 1:22:34.676 31c8 conf: 'BCastPort' = '3261'
10/16 1:22:34.676 31c8 conf: 'Login' = 'root'
10/16 1:22:34.676 31c8 conf: 'Password' = '##evVRsIJtRmAEEd2sCslZDg=='
10/16 1:22:34.676 31c8 conf: 'MinBufferSize' = '65536'
10/16 1:22:34.676 31c8 conf: 'AlignmentMask' = '0x0000'
10/16 1:22:34.676 31c8 conf: 'MaxPendingRequests' = '256'
10/16 1:22:34.676 31c8 conf: 'iScsiPingPeriod' = '0'
10/16 1:22:34.676 31c8 conf: 'iScsiDiscoveryListInterfaces' = '1'
10/16 1:22:34.676 31c8 conf: 'ServerIoWorkersCount' = '0'
10/16 1:22:34.676 31c8 conf: 'ServerIoWorkersConcurency' = '0'
10/16 1:22:34.676 31c8 conf: 'CmdExecTimeWarningLimitInSec' = '10'
10/16 1:22:34.676 31c8 conf: 'iScisCmdSendCmdTimeoutInSec' = '10'
10/16 1:22:34.676 31c8 conf: 'iSerListen' = ''
10/16 1:22:34.676 31c8 conf: 'LocalizationDir' = 'Localizations'
10/16 1:22:34.676 31c8 conf: 'DefaultStoragePoolPath' = 'My Computer\V'
10/16 1:22:34.676 31c8 conf: 'ExperimentalLSFS' = 'no'
10/16 1:22:34.676 31c8 conf: 'ClusterName' = ''
10/16 1:22:34.676 31c8 conf: 'ClusterGUID' = ''
10/16 1:22:34.676 31c8 conf: 'ClusterSettingsVersion' = '0'
10/16 1:22:34.676 31c8 conf: 'ClusterNodes' = ''
10/16 1:22:34.676 31c8 conf: 'ClusterSync' = ''
10/16 1:22:34.676 31c8 conf: 'ClusterHeartbeat' = ''
10/16 1:22:34.676 31c8 conf: 'DataBaseRoot' = '.\NotifyDB'
10/16 1:22:34.676 31c8 conf: 'DBRotationDays' = '5'
10/16 1:22:34.676 31c8 conf: 'DBFileSizeDays' = '1'
10/16 1:22:34.676 31c8 conf: 'PerformanceMonitorEnabled' = 'yes'
But it still reports that the sync channel is down. I have attached some screenshots as well.
My understanding was that if the iSCSI server is configured to listen on 0.0.0.0, it accepts connections on all interfaces.
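
Rather than scanning the startup log by eye, the effective settings can be pulled out mechanically. The sketch below relies only on the line format visible in the log above (conf: 'key' = 'value'); the function name is made up for illustration, and the sample lines are copied from that log.

```python
import re

# One log line looks like: 10/16 1:22:34.676 31c8 conf: 'Port' = '3260'
_CONF_LINE = re.compile(r"conf: '([^']*)' = '([^']*)'")


def parse_conf_lines(log_text):
    """Collect every conf: 'key' = 'value' pair from the log text into a dict."""
    settings = {}
    for line in log_text.splitlines():
        match = _CONF_LINE.search(line)
        if match:
            settings[match.group(1)] = match.group(2)
    return settings


sample = """\
10/16 1:22:34.676 31c8 conf: 'Port' = '3260'
10/16 1:22:34.676 31c8 conf: 'Interface' = '0.0.0.0'
10/16 1:22:34.676 31c8 conf: 'iScsiDiscoveryListInterfaces' = '1'
"""
settings = parse_conf_lines(sample)
print(settings["iScsiDiscoveryListInterfaces"])  # prints 1
```

Comparing the resulting dictionaries from both nodes would also show quickly whether the two services were started with different settings.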
Attachments
Replication Node Interfaces.PNG (52.87 KiB)
NICs.PNG (32.19 KiB)
Health status.PNG (8.2 KiB)
zoltan.csoka
Posts: 5
Joined: Sun Oct 14, 2018 8:19 pm

Tue Oct 16, 2018 9:46 am

One more detail: the 10Gb links are on Mellanox cards, and I was wondering whether some config parameters have to be tuned. (Jumbo frames are already set.)
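
As a sanity check on the jumbo frame numbers given earlier in the thread (a 9014 setting and an 8970-byte unfragmented ping), the arithmetic can be written out. This sketch assumes the 9014 value includes the 14-byte Ethernet header, which is a common driver convention but is not confirmed here for these cards; the constant and function names are illustrative.

```python
ETHERNET_HEADER = 14  # bytes; assumed to be counted inside the "9014" setting
IPV4_HEADER = 20      # bytes, without IP options
ICMP_HEADER = 8       # bytes, ICMP echo header


def max_ping_payload(jumbo_setting, includes_ethernet_header=True):
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    mtu = jumbo_setting - ETHERNET_HEADER if includes_ethernet_header else jumbo_setting
    return mtu - IPV4_HEADER - ICMP_HEADER


print(max_ping_payload(9014))  # 9014 - 14 - 20 - 8 = 8972
```

Under that assumption an 8970-byte payload is just below the limit, which is consistent with the ping test passing and with jumbo frames working end to end.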
zoltan.csoka
Posts: 5
Joined: Sun Oct 14, 2018 8:19 pm

Tue Oct 16, 2018 2:30 pm

I have resolved it. I smelled something fishy about the Mellanox cards, so I started investigating their parameters and drivers. As it turned out, the Mellanox driver was version 5.3 on one server and 5.1 on the other. I suppose that caused some kind of communication problem that prevented the sync. I have updated the drivers on both servers to 5.5 and it is syncing now (very fast :-) ).
Sorry for bothering you because of my own stupidity,

Kind Regards,

Zoltan
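
Since the root cause was a driver version mismatch between the two nodes, a small comparison like the one below could flag it early. It is only a sketch: how the version strings are collected from each server is left out, the node names are hypothetical, and the versions are the ones reported in this thread.

```python
def parse_version(text):
    """Turn a dotted version string such as '5.3' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def find_mismatches(versions_by_node):
    """Return {node: version} for nodes that differ from the newest version seen."""
    newest = max(versions_by_node.values(), key=parse_version)
    return {
        node: version
        for node, version in versions_by_node.items()
        if parse_version(version) != parse_version(newest)
    }


# Node names are hypothetical; the versions are those reported in the thread.
print(find_mismatches({"node1": "5.3", "node2": "5.1"}))  # {'node2': '5.1'}
```

Comparing parsed tuples instead of raw strings avoids ordering mistakes such as treating "5.10" as older than "5.9".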
Oleg(staff)
Staff
Posts: 568
Joined: Fri Nov 24, 2017 7:52 am

Tue Oct 16, 2018 3:48 pm

Thank you for keeping us updated :)
In any case, please assign IPs in different subnets for these networks.