Disable L1 Cache on vSphere VSAN

frankyyy02
Posts: 10
Joined: Wed Aug 14, 2019 9:14 pm

Thu Mar 26, 2020 12:25 am

Hi

I'm working with a StarWind account manager on a trial, and as part of my testing I'm looking to disable the 2GB write-back cache that I set up initially.
We are running two Supermicro E200-8Ds with 64GB of RAM each, both with Intel NVMe drives and 10GbE NICs for Sync and iSCSI/vMotion. The StarWind VSAN is then hosting a 1.2TB thick-provisioned, eager-zeroed disk presented to vSphere for storage.
Most of our tests have performed well overall, but we have seen a small number of WRITE DELAY errors (approx. 10 seconds) in a few cases. I am looking to disable the cache to see if it makes any difference.

I see the documentation at https://knowledgebase.starwindsoftware. ... -l1-cache/ on how to disable the L1 cache, but it appears to be specific to the Windows-hosted VSAN. I'm unsure whether the process is the same in the Linux environment: which service to stop before altering the file, etc.

Any help would be appreciated.

Thanks in advance.
yaroslav (staff)
Staff
Posts: 2277
Joined: Mon Nov 18, 2019 11:11 am

Thu Mar 26, 2020 8:40 am

Hello,

Thanks for your question.
Yes, the same procedure works for the Linux-based VSAN.
Stop the service inside the VM, copy the device files from one side with WinSCP, and modify them as discussed here: https://knowledgebase.starwindsoftware. ... -l1-cache/.

Please make sure to have a copy of the headers saved somewhere (just in case).
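
For reference, here is a minimal sketch of that procedure on the Linux VM. The header path and backup location are hypothetical, and you should verify the exact service unit name on your appliance:

Code: Select all

# Stop the StarWind service on this node only (verify the unit name first,
# e.g. with: systemctl list-units | grep -i starwind)
systemctl stop StarWindVSA

# Back up the device header before editing it (path is hypothetical)
cp /mnt/storage/device01.swdsk /root/device01.swdsk.bak

# Edit the header as described in the KB article, then start the service again
systemctl start StarWindVSA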

There is one more way to fix latency-related issues in StarWind VSAN for vSphere: tweaking the I/O scheduler. Below are examples for the /dev/sdb device.
0. Check the availability of all paths.
1. Stop the StarWindVSA service on one side.
2. Check the storage to identify the disks:

Code: Select all

lsblk
3. Decide on the disk you want to tune.
4. Check the scheduler settings:

Code: Select all

cat /sys/block/sdb/queue/scheduler
5. Set the scheduler settings:

Code: Select all

echo noop > /sys/block/sdb/queue/scheduler
6. Check the scheduler settings after the change:

Code: Select all

cat /sys/block/sdb/queue/scheduler
The result should look like one of the following, depending on the kernel; the scheduler in brackets is the active one:

Code: Select all

bfq mq-deadline [none]
or, on older kernels:

Code: Select all

[noop] deadline cfq
More information is here: https://blog.codeship.com/linux-io-scheduler-tuning/
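
To check the active scheduler for every disk at once, here is a convenience one-liner (assuming the sd* device naming used above):

Code: Select all

for d in /sys/block/sd*/queue/scheduler; do echo "$d: $(cat "$d")"; done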

Another tweak is to set the no-read-ahead policy for the device:

Code: Select all

blockdev --setra 0 /dev/sdb
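To verify the change took effect, read the value back; blockdev reports the read-ahead size in 512-byte sectors, so it should print 0:

Code: Select all

blockdev --getra /dev/sdb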
Let me know if you require any additional assistance.
frankyyy02
Posts: 10
Joined: Wed Aug 14, 2019 9:14 pm

Fri Mar 27, 2020 8:31 am

Thanks for the response.
Unfortunately, when attempting to alter the scheduler, I receive:

Code: Select all

bash: echo: write error: Invalid argument
It appears noop is not available, I guess.

Thanks for the confirmation on the StarWind cache removal. I have now done that and will retest.
yaroslav (staff)
Staff
Posts: 2277
Joined: Mon Nov 18, 2019 11:11 am

Fri Mar 27, 2020 10:07 am

Try none instead of noop.
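
On kernels that use the multi-queue block layer (blk-mq), the legacy noop/deadline/cfq schedulers are replaced by blk-mq equivalents such as none, mq-deadline, and bfq, which is why the write returned "Invalid argument". For example:

Code: Select all

echo none > /sys/block/sdb/queue/scheduler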

Happy that it worked for you! Please, let me know if any assistance is needed.
frankyyy02
Posts: 10
Joined: Wed Aug 14, 2019 9:14 pm

Sat Mar 28, 2020 8:32 pm

yaroslav (staff) wrote: Try none instead of noop.
Ahh yes, totally missed that. Given they're NVMe disks, I'll give it a go after my test cases without cache and let you know if there's any improvement.

Thanks
yaroslav (staff)
Staff
Posts: 2277
Joined: Mon Nov 18, 2019 11:11 am

Sun Mar 29, 2020 2:05 pm

No problem. Please note that the I/O scheduler settings revert every time you reboot the VM.
Please let us know if the changes really helped to improve performance, and make sure to perform the benchmarking as described here: https://www.starwindsoftware.com/best-p ... practices/.
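
If the scheduler change helps and you want it to survive reboots, one common approach is a udev rule. A sketch, assuming the sdb device from the examples above (the rule file name is arbitrary):

Code: Select all

# /etc/udev/rules.d/60-io-scheduler.rules
ACTION=="add|change", KERNEL=="sdb", ATTR{queue/scheduler}="none"

# Apply without rebooting:
#   udevadm control --reload-rules && udevadm trigger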