LSFS Defragmentation Issue


KevinR
Posts: 8
Joined: Fri Sep 25, 2015 3:52 am

Thu Feb 15, 2018 7:43 pm

Further to my post in Blowfish-IT's thread about StarWind generating heavy disk read I/O (with little to no write activity) against LSFS log files even with no workload: I noticed that the defragmentation levels reported by the console seem far too high (>900%?), and the physical space consumed is 8-10 times the size of the data actually stored. This has led to the system exhausting the underlying storage and crashing.
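For anyone who wants to put a number on the "8-10 times" figure, here is a minimal sketch that sums the file sizes under a device's storage folder and divides by the amount of data actually stored. The function names are hypothetical and nothing here is a StarWind tool; it also uses the reported file size (`st_size`), not the allocated size on disk, so treat the result as an estimate.

```python
from pathlib import Path


def directory_size_bytes(directory):
    """Sum the reported sizes of all regular files under `directory` (recursive)."""
    return sum(p.stat().st_size for p in Path(directory).rglob("*") if p.is_file())


def space_amplification(physical_bytes, logical_bytes):
    """Return how many times larger the on-disk footprint is than the stored data."""
    if logical_bytes <= 0:
        raise ValueError("logical_bytes must be positive")
    return physical_bytes / logical_bytes
```

The logical size has to come from somewhere else (for example, the used space reported inside the guest volumes); the sketch does not try to derive it.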

Example vdisk5 LSFS HA device as seen from node A (highlighted statistics):
[Screenshot: IMG_0008.PNG]
Example vdisk5 LSFS HA device as seen from node B (highlighted statistics):
[Screenshot: IMG_0009.PNG]
I haven't seen this problem for a few years now, but it seems to have crept back into the latest build (11818)? While the StarWind process is running, I'm unable to delete any of the LSFS log files for a given device (they're all locked, which I would expect). However, if I restart the StarWind service on a node, I can then select all of the log files and move them: only the ones that StarWind seems to care about remain locked, and the rest are safely purged. This gets rid of hundreds of GB of outdated log files. The same procedure can be repeated for different LSFS devices on different nodes, all with the same result. I have to repeat it every few weeks to keep the consumed storage under control.
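The "move everything and let the locked files stay put" step above can be sketched roughly like this. It is only an illustration: the function name, the glob pattern and the folder arguments are hypothetical, and it is not a StarWind-supplied tool. It moves files rather than deleting them, and it assumes the destination folder is on the same volume, so each move is a plain rename that either succeeds or leaves the file untouched.

```python
from pathlib import Path


def move_unlocked_files(src_dir, dest_dir, pattern="*"):
    """Try to move every file matching `pattern` from src_dir to dest_dir.

    Returns (moved, skipped) lists of file names. A file is skipped when the
    rename raises OSError, e.g. because another process holds it open
    (PermissionError on Windows) or because dest_dir is on another volume.
    """
    src, dest = Path(src_dir), Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    moved, skipped = [], []
    for path in sorted(src.glob(pattern)):
        if not path.is_file():
            continue  # ignore sub-directories
        try:
            path.rename(dest / path.name)  # same-volume rename, no copy
            moved.append(path.name)
        except OSError:
            skipped.append(path.name)  # still in use (or not movable): leave it
    return moved, skipped
```

Calling it with the device's log folder and a holding folder returns the names that were moved and the names that stayed locked; the holding folder can be emptied later once the device is confirmed healthy.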

Anyone else seeing this behavior?

Kevin
Oleg(staff)
Staff
Posts: 568
Joined: Fri Nov 24, 2017 7:52 am

Mon Feb 19, 2018 9:51 am

Hi Kevin,
Could you please collect the logs from your systems and PM them to me so that I can better understand the problem you are facing? Also, please provide more details about your system configuration.
You can collect the logs using this tool.
Thank you!
Oleg(staff)
Staff
Posts: 568
Joined: Fri Nov 24, 2017 7:52 am

Wed Feb 21, 2018 5:07 pm

Hi Kevin,
We investigated the issue with defragmentation of HA devices. The cause was identified and a proper fix was introduced. This fix will be available in the next StarWind build.
For now, you can restart the StarWind service on one server, wait until its devices are fully mounted, and then do the same on the other server. Do not forget to run the FlushCacheAll.ps1 script from the StarWindX PowerShell examples folder before restarting the service.
KevinR
Posts: 8
Joined: Fri Sep 25, 2015 3:52 am

Thu Feb 22, 2018 6:02 am

That's great news that you found the problem Oleg!

When you say to run FlushCacheAll.ps1 before I restart the service, is that necessary even though I have only write-through caches in VSAN?

Any idea when the next build will be released?

Thanks,
Kevin
Oleg(staff)
Staff
Posts: 568
Joined: Fri Nov 24, 2017 7:52 am

Thu Feb 22, 2018 10:40 am

Hi Kevin,
I suggested running FlushCacheAll.ps1 before restarting the service to speed up the process and to ensure a proper restart.
The next build should be released in the middle of next month.
KevinR
Posts: 8
Joined: Fri Sep 25, 2015 3:52 am

Thu Feb 22, 2018 5:37 pm

OK, thanks for the tip and the schedule update, Oleg.

Kevin
Oleg(staff)
Staff
Posts: 568
Joined: Fri Nov 24, 2017 7:52 am

Fri Feb 23, 2018 8:20 am

And thank you, Kevin!