MPIO Policy configuration / Loop Back Accelerator

bertram
Posts: 16
Joined: Mon Sep 17, 2018 4:09 am

Mon Sep 17, 2018 4:44 am

Hi,

I have some questions regarding the content of this article: https://knowledgebase.starwindsoftware. ... correctly/
MPIO Policy configuration
We recommend configuring MPIO policy in “Fail Over Only” mode when the sum of your iSCSI connections has less than 10 Gbps throughput.
a) How is the sum of iSCSI connections counted? Only initiator links or really all links (incl. the sync links)?
b) What's the technical reason that you recommend FO if the sum is less than 10 Gbit/s?
c) Why FO at all? I only know a few scenarios where FO could be a preferred policy. In most cases it's a waste of bandwidth.
Note: The MPIO “Fail Over Only” policy can be configured this way only in a hyper-converged scenario, because it allows configuring local iSCSI connections through the loopback IP address (127.0.0.1), which enables the Loop Back Accelerator. The Loop Back Accelerator bypasses the TCP/IP stack, thus increasing transfer performance and decreasing transfer latency.
Although I've read some articles about this Loop Back Accelerator, I still don't understand its purpose. Could you provide a more detailed network plan of a 2-node SAN configuration where it plays a role?
If your case does not fit the scenario above, we recommend configuring the MPIO policy in “Round Robin” mode, thus setting all iSCSI channels to be continuously active.
Again, why do you recommend RR here? Like FO in the part above, it's usually the worst policy when it comes to efficiently saturating MPIO links.

Regards
bertram
Orest (staff)
Staff
Posts: 3
Joined: Tue Sep 18, 2018 4:47 pm

Tue Sep 18, 2018 4:58 pm

Hello Bertram,

Thank you for your question. Regarding the StarWind article:
a) The sum of iSCSI connections includes only the iSCSI initiator links.
b) We recommend assigning the Failover Policy in this case as it provides the best performance for the storage.
c) In fact, the throughput of one loopback connection is estimated at approximately 10 Gbps, so when the data is transferred locally, it gets maximum speed. If the other iSCSI connections are slower than 10 Gbps (e.g. 1 Gbps), assigning partner connections for the device (as Round Robin does) will actually degrade performance, as transfer operations will be bottlenecked by the slower channels waiting for packets to be acknowledged.
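To make the bottleneck effect concrete, here is a back-of-the-envelope sketch (the numbers are my own illustration, not from the KB article) comparing the two policies for a 10 Gbps loopback path plus a 1 Gbps partner link. It simplifies Round Robin to an even split of the data across both paths, with the transfer finishing only when the slower path is done:

```c
/* Back-of-the-envelope comparison: Fail Over Only on a 10 Gbps loopback path
 * vs. Round Robin across that path and a 1 Gbps partner link.
 * Illustrative numbers only; real Round Robin alternates per I/O, but the
 * slow path still gates overall completion.
 */
#include <stdio.h>

int main(void)
{
    const double data_gbit     = 80.0;  /* 10 GB of data to move */
    const double loopback_gbps = 10.0;  /* local 127.0.0.1 path  */
    const double partner_gbps  =  1.0;  /* remote iSCSI link     */

    /* Fail Over Only: all I/O stays on the fast loopback path. */
    double t_failover = data_gbit / loopback_gbps;

    /* Round Robin over 2 paths: each path carries half the data and the
     * transfer finishes only when the slowest path is done. */
    double t_fast = (data_gbit / 2.0) / loopback_gbps;
    double t_slow = (data_gbit / 2.0) / partner_gbps;
    double t_round_robin = (t_fast > t_slow) ? t_fast : t_slow;

    printf("Fail Over Only: %.1f s\n", t_failover);    /* prints  8.0 s */
    printf("Round Robin   : %.1f s\n", t_round_robin); /* prints 40.0 s */
    return 0;
}
```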

Regarding the StarWind Loopback Accelerator: when running two StarWind nodes in hyperconverged mode, the Loopback Accelerator significantly boosts loopback connection performance by sending data through a shorter path (bypassing the TCP/IP stack).
You can also find Microsoft's insights on the performance difference with and without the Loopback Fast Path feature (on which the StarWind Loopback Accelerator is based) here: https://blogs.technet.microsoft.com/win ... fast-path/
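For reference, the Windows feature described in that post is exposed to applications as a per-socket ioctl (SIO_LOOPBACK_FAST_PATH). The sketch below only shows how an application opts a 127.0.0.1 TCP socket into Loopback Fast Path; it is not StarWind code and makes no claims about how the Loopback Accelerator is implemented internally:

```c
/* Minimal Winsock sketch: opting a TCP socket into Loopback Fast Path.
 * Both endpoints of the 127.0.0.1 connection must set this ioctl before
 * the connection is established for the fast path to engage.
 * Build on Windows with: cl fastpath.c ws2_32.lib
 */
#include <winsock2.h>
#include <mstcpip.h>   /* SIO_LOOPBACK_FAST_PATH */
#include <stdio.h>

#pragma comment(lib, "ws2_32.lib")

int main(void)
{
    WSADATA wsa;
    if (WSAStartup(MAKEWORD(2, 2), &wsa) != 0)
        return 1;

    SOCKET s = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    if (s == INVALID_SOCKET) {
        WSACleanup();
        return 1;
    }

    int   enable = 1;
    DWORD bytes  = 0;
    if (WSAIoctl(s, SIO_LOOPBACK_FAST_PATH, &enable, sizeof(enable),
                 NULL, 0, &bytes, NULL, NULL) == SOCKET_ERROR) {
        printf("Loopback Fast Path not available: %d\n", WSAGetLastError());
    } else {
        printf("Loopback Fast Path enabled on socket\n");
    }

    /* ... connect to 127.0.0.1:<port> and transfer data as usual ... */

    closesocket(s);
    WSACleanup();
    return 0;
}
```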

As to your last comment, you are actually right. Least Queue Depth is preferable to Round Robin, and that is what we recommend in our latest documentation. However, it still depends on the environment specifics, as in some cases Round Robin may give better performance than Least Queue Depth. You can find our latest recommendations on multipath configuration in the following guide, as well as in the updated KB: https://www.starwindsoftware.com/resour ... erver-2016
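For illustration only, the behavioral difference can be sketched with a toy model (made-up per-path outstanding-I/O counters, not MPIO internals): Round Robin hands the next I/O to the next path in line no matter how backed up it is, while Least Queue Depth dispatches to the path with the fewest outstanding I/Os, so a slow or congested path naturally receives less new work:

```c
/* Toy model of Least Queue Depth path selection. Plain Round Robin would
 * give the congested 1 Gbps path every third I/O regardless of its backlog.
 */
#include <stdio.h>

#define NUM_PATHS 3

typedef struct {
    const char *name;
    int outstanding_io;   /* I/Os currently queued on this path */
} path_t;

/* Least Queue Depth: pick the path with the fewest outstanding I/Os. */
static int pick_least_queue_depth(const path_t paths[], int count)
{
    int best = 0;
    for (int i = 1; i < count; i++)
        if (paths[i].outstanding_io < paths[best].outstanding_io)
            best = i;
    return best;
}

int main(void)
{
    path_t paths[NUM_PATHS] = {
        { "10G-A",   2 },
        { "10G-B",   3 },
        { "1G-slow", 9 },   /* the slow link keeps a deep queue */
    };

    for (int io = 0; io < 6; io++) {
        int idx = pick_least_queue_depth(paths, NUM_PATHS);
        printf("I/O %d -> %s\n", io, paths[idx].name);
        paths[idx].outstanding_io++;   /* new work deepens that path's queue */
    }
    return 0;
}
```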

Kind regards,
Orest
bertram
Posts: 16
Joined: Mon Sep 17, 2018 4:09 am

Tue Sep 18, 2018 5:51 pm

Hi Orest,

Thank you for your comprehensive explanations.

Regarding the Loop Back Accelerator, I guess my problem was my view of the network design. As far as I understand now, the loopback accelerator is only useful and boosts your (virtual) bandwidth if you run StarWind on one or more Hyper-V servers directly.

It has no benefit or function in the classic (physically isolated) SAN design like the one shown below. Am I right?

[Image: diagram of a classic physically isolated SAN design]
Orest (staff)
Staff
Posts: 3
Joined: Tue Sep 18, 2018 4:47 pm

Tue Sep 18, 2018 7:35 pm

Hello Bertram,

Yes, you're right. When compute and storage nodes are separated, the Loopback Accelerator won't work, as performance will be determined by the iSCSI channel throughput between the nodes. There is actually no loopback connection in such a scenario.