"hardware error" with StarWind NVMe-oF initiator and Linux target

Initiator (iSCSI, FCoE, AoE, iSER and NVMe over Fabrics), iSCSI accelerator and RAM disk

Moderators: anton (staff), art (staff), Max (staff), Anatoly (staff)

mattaw
Posts: 7
Joined: Tue Mar 02, 2021 6:13 pm

Mon Mar 29, 2021 1:16 pm

Good morning,

This is probably related to hardware or RoCEv2 network configuration, but I am not sure how to debug it further.

Setup:
-------
Linux Debian 10.8, Supermicro Xeon platform
Linux mythmaster 5.10.0-0.bpo.3-amd64 #1 SMP Debian 5.10.13-1~bpo10+1 (2021-02-11) x86_64 GNU/Linux
Kernel drivers for ConnectX-5
NVMe-oF target from the kernel.
ConnectX-5 CX556A-ECAT, latest firmware, in an x16 PCIe 3 slot

Windows 10 Pro 20H2, AMD Ryzen 3600 X570 platform
Mellanox WinOF-2 2.60
NVMe-oF initiator: the latest from StarWind
ConnectX-5 CX556A-ECAT, latest firmware, in an x4 PCIe 4 slot, running at PCIe 3

The two machines are directly connected to each other, with no switch.
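
As a rough sanity check on the two PCIe links listed above, the theoretical bandwidth of each slot can be estimated from its lane count. This is only a sketch: it assumes the standard PCIe 3.0 figures of 8 GT/s per lane with 128b/130b encoding, ignores packet and protocol overhead, and assumes the network port runs at 100 Gb/s. The function name is made up for illustration.

```python
# Rough PCIe 3.0 bandwidth estimate (sketch; ignores TLP/DLLP protocol overhead).
PCIE3_GT_PER_S = 8.0          # PCIe 3.0 raw rate per lane, in GT/s
PCIE3_ENCODING = 128 / 130    # 128b/130b line encoding efficiency
ASSUMED_PORT_GBPS = 100.0     # assumption: the network link runs at 100 Gb/s


def pcie3_gbps(lanes: int) -> float:
    """Return theoretical one-direction PCIe 3.0 bandwidth in Gb/s (hypothetical helper)."""
    if lanes <= 0:
        raise ValueError("lanes must be positive")
    return lanes * PCIE3_GT_PER_S * PCIE3_ENCODING


if __name__ == "__main__":
    for host, lanes in (("Linux target", 16), ("Windows initiator", 4)):
        bw = pcie3_gbps(lanes)
        verdict = "can" if bw >= ASSUMED_PORT_GBPS else "cannot"
        print(f"{host}: x{lanes} ~ {bw:.1f} Gb/s, {verdict} feed a {ASSUMED_PORT_GBPS:.0f} Gb/s port")
```

Under these assumptions the x4 slot on the Windows side is the narrower of the two links. That would cap throughput, but it does not by itself explain an I/O error.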

Testing:
---------
StarWind rping runs for minutes with -V (verify) without error.
I am unsure how to use StarWind rperf correctly to stress the link, but it seems to work with all the settings I gave it.
RDMA counters on Windows seem to show correct behavior and no dropped RDMA frames; however, I am no expert, and information on how to diagnose this is thin.

Failing test software:
-----------------------
ATTO 4.01.0f1 with Direct I/O enabled (works without Direct I/O)

It runs for several tests and then fails with the following message in the Windows event log:
Example error: The IO operation at logical block address 0x7835ec28 for Disk 3 (PDO name: \Device\000000aa) failed due to a hardware error.
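
To see where on the disk the failing I/O landed, the logical block address in the message can be converted to a byte offset. A minimal sketch, assuming a 512-byte logical sector size (the event does not state it; a 4096-byte namespace would give a different offset). The helper name is hypothetical.

```python
# Convert the LBA from the event log into a byte offset (sketch).
ASSUMED_SECTOR_BYTES = 512  # assumption: sector size is not stated in the error message


def lba_to_byte_offset(lba: int, sector_bytes: int = ASSUMED_SECTOR_BYTES) -> int:
    """Return the byte offset of a logical block (hypothetical helper)."""
    if lba < 0 or sector_bytes <= 0:
        raise ValueError("lba must be >= 0 and sector_bytes must be > 0")
    return lba * sector_bytes


if __name__ == "__main__":
    failing_lba = 0x7835EC28  # taken from the Windows event log entry above
    offset = lba_to_byte_offset(failing_lba)
    print(f"LBA {failing_lba:#x} -> byte offset {offset} ({offset / 2**30:.1f} GiB)")
```

Comparing the offsets of several failures may show whether they cluster in one region or are spread across the disk.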

Linux shows no messages at all, so I assume this is a fault somewhere on the Windows side. Thoughts?
yaroslav (staff)
Staff
Posts: 2279
Joined: Mon Nov 18, 2019 11:11 am

Mon Mar 29, 2021 3:05 pm

Hi,

Please log a call with us by sending an email to support@starwind.com. Use this thread as a reference.
mattaw
Posts: 7
Joined: Tue Mar 02, 2021 6:13 pm

Mon Mar 29, 2021 5:37 pm

I appreciate you taking a look at this, I really do. Case opened and logs uploaded.

Matthew
yaroslav (staff)
Staff
Posts: 2279
Joined: Mon Nov 18, 2019 11:11 am

Mon Mar 29, 2021 6:03 pm

We will need more logs from you. Let us work on this matter in the support case.