data corruption? unable to mount, cpu core at 100%

Software-based VM-centric and flash-friendly VM storage + free version

Moderators: anton (staff), art (staff), Max (staff), Anatoly (staff)

Post Reply
Quark
Posts: 6
Joined: Thu Feb 07, 2019 9:25 am

Thu Feb 07, 2019 9:51 am

software version 8.0.12658, win64
after unexpected shutdown volume stuck at "Mounting Status: Creating...", "Mounting Progress: 1%"
StarWindService.exe keeps one core busy at 100%, disk usage at 0%
logs keep writing "cf0 General: DDDisk_DeviceInfo: LSFS device 'lsfs2': mounting progress '1'."


2c44 Sp: Creating LSFS device:
Directory: D:\Data\
MetadataDirectory: Data
Prefix:
Exisiting: 1
Read-only: 0
Block size: 4096
Size in blocks: 335544320
Page size: 4194304
Pages per superpage: 0
Threads: 8
Packets: 256
DD disabled: 0
Defrag: 1
Verify: 0

2c44 Sp: xxx CStarPackCoreNew::CStarPackCore::CreateEx (4868) Header version is 65546
2c44 Sp: xxx CStarPackCoreNew::CStarPackCore::CreateEx (5030) Both superblocks are valid
2c44 Sp: xxx CStarPackCoreNew::CStarPackCore::GetActualLastPage (20693) Device updated to the state at page [9052027]
2c44 Sp: Opened header of LSFS device:
Directory: D:\Data\
Prefix: Data
Read-only: 0
Block size: 4096
Map chunk size: 4096
VMap size: 714429441
Page size: 4194304
DD used: 1
Defrag used: 1
VMap outdated: 1
LUMaps outdated: 0
Quark
Posts: 6
Joined: Thu Feb 07, 2019 9:25 am

Thu Feb 07, 2019 10:04 am

after 40 mins disk jumped to 100%, mounting progress 44%, there is no record in logs between 1 and 44%. Progress keep increasing. After 55 it jumped back to 45, now 89.
Quark
Posts: 6
Joined: Thu Feb 07, 2019 9:25 am

Thu Feb 07, 2019 10:20 am

a lot of errors in logs, like "2c44 Sp: *** CStarPackCoreNew::CStarPackCore::RestoreHashDBByMaps (27165) (0x000002AC34990000) Logical error. VLBA[33607151] read bucket key [730673], count bucket key [423757035], count bucket number [636605]"
after disk mounted volume is not accesible: "The file system detected a checksum error and was not able to correct it.".

what to do with that? have cold backup of data directory, but restore is like 24 hours. maybe restore some files only?
Oleg(staff)
Staff
Posts: 568
Joined: Fri Nov 24, 2017 7:52 am

Thu Feb 07, 2019 11:00 am

Can you please collect the logs from the hosts?
You can use this tool.
Quark
Posts: 6
Joined: Thu Feb 07, 2019 9:25 am

Thu Feb 07, 2019 6:15 pm

in attachment
Attachments
MICROFLEX 20190207-202915.7z
logs
(151.64 KiB) Downloaded 396 times
Oleg(staff)
Staff
Posts: 568
Joined: Fri Nov 24, 2017 7:52 am

Fri Feb 08, 2019 9:56 am

Thank you!
We will check and give you an update.
Quark
Posts: 6
Joined: Thu Feb 07, 2019 9:25 am

Fri Feb 15, 2019 8:57 pm

restored from backup, starwind should checkpoint more, 8 days after last sync is very long to start

have this line in logs, what it means?
d3c Sp: *** CStarPackCoreNew::RAMDeduplicationDB::AddHashToBucket (471) The full hash verification routine has been failed! Status (4306)
Oleg(staff)
Staff
Posts: 568
Joined: Fri Nov 24, 2017 7:52 am

Tue Feb 19, 2019 2:33 pm

Sorry for the long process of investigation.
According to the logs, the last LSFS snapshot was corrupted. For these cases you can use the following steps:
1) Stop StarWindService;
2) rename .splumap and .spvmap extension in folders where failed devices located;
3) Start Starwind service;
4) Wait for the mounting progress.

In order to avoid such issues in the future, you can use flat image files instead of LSFS devices.
Quark
Posts: 6
Joined: Thu Feb 07, 2019 9:25 am

Tue Feb 19, 2019 10:04 pm

what will be after this action?
Oleg(staff)
Staff
Posts: 568
Joined: Fri Nov 24, 2017 7:52 am

Wed Feb 20, 2019 1:50 pm

These steps should fix the issue with mounting.
muhunal
Posts: 4
Joined: Tue Aug 27, 2019 10:03 am

Tue Aug 27, 2019 10:08 am

Hi I also have the same issue. here is my log. I did what you said, but... what else I can do?
thanks in advance.

8/27 5:54:22.102758 bdc Cache: StarWind Virtual SAN v8.0.0 (Build 12767, [SwSAN], Win64)
8/27 5:54:22.102858 bdc Cache: Jan 18 2019 13:58:51
8/27 5:54:22.103112 bdc Cache: CacheBase::init: Cache parameters: Mode = 3, Flags = 64, Size = 16 MB, SectorSize = 512, BlockExpiryPeriod = 5000, DiskSizeInSectors = 0, Blocks = 0x00000254CA04F2C0
8/27 5:54:22.103500 166c Cache: CacheBlocksMemory::getMaxMemoryAvailableForCache: Physical memory total: 83869140 KBs, 81903 MBs, 79 GBs
8/27 5:54:22.103611 166c Cache: CacheBlocksMemory::getMaxMemoryAvailableForCache: Physical memory available: 14276192 KBs, 13941 MBs, 13 GBs
8/27 5:54:22.103634 bdc Cache: CacheBase::init: Prefetch Advisor was not created for RAM Cache!
8/27 5:54:22.103645 166c Cache: CacheBlocksMemory::getMaxMemoryAvailableForCache: Virtual memory available: 137368307340 KBs, 134148737 MBs, 131004 GBs
8/27 5:54:22.103928 bdc Cache: CacheBase::init: Worker Thread is started and set to affinity group: 1652.
8/27 5:54:22.104115 bdc Cache: CacheBase::init: Worker Thread is started and set to affinity group: 1656.
8/27 5:54:22.104460 bdc Cache: CacheBase::init: Worker Thread is started and set to affinity group: 1160.
8/27 5:54:22.105691 bdc Common: *** sw_common::Sw_CompletionPort::Open: There are 8 processor cores detected in the system. 2 worker threads will be created for completion of asynchronous IO operations.
8/27 5:54:22.105935 bdc Common: *** sw_common::Sw_CompletionPort::Open: Set IO Completion thread (0x0000000000000688, id=5176) affinity: group = 0x0, mask = 0xf
8/27 5:54:22.106221 bdc Common: *** sw_common::Sw_CompletionPort::Open: Set IO Completion thread (0x00000000000002B0, id=2500) affinity: group = 0x0, mask = 0xf
8/27 5:54:22.190854 bdc General: *** DDDisk_MountCallback: LSFS device 'lsfs1': mounting progress '1'
8/27 7:30:21.398178 bdc General: *** DDDisk_MountCallback: LSFS device 'lsfs1': mounting progress '44'
8/27 7:30:27.780387 bdc General: *** DDDisk_MountCallback: LSFS device 'lsfs1': mounting progress '45'
8/27 7:30:34.235379 bdc General: *** DDDisk_MountCallback: LSFS device 'lsfs1': mounting progress '46'
8/27 7:30:40.204928 bdc General: *** DDDisk_MountCallback: LSFS device 'lsfs1': mounting progress '47'
8/27 7:30:46.094409 bdc General: *** DDDisk_MountCallback: LSFS device 'lsfs1': mounting progress '48'
8/27 7:30:52.082599 bdc General: *** DDDisk_MountCallback: LSFS device 'lsfs1': mounting progress '49'
8/27 7:30:57.975992 bdc General: *** DDDisk_MountCallback: LSFS device 'lsfs1': mounting progress '50'
8/27 7:31:04.310608 bdc General: *** DDDisk_MountCallback: LSFS device 'lsfs1': mounting progress '51'
8/27 7:31:10.719019 bdc General: *** DDDisk_MountCallback: LSFS device 'lsfs1': mounting progress '52'
8/27 7:31:17.600902 bdc General: *** DDDisk_MountCallback: LSFS device 'lsfs1': mounting progress '53'
8/27 7:31:23.658201 bdc General: *** DDDisk_MountCallback: LSFS device 'lsfs1': mounting progress '54'
8/27 7:31:29.845741 bdc General: *** DDDisk_MountCallback: LSFS device 'lsfs1': mounting progress '55'
8/27 7:38:42.705792 bdc General: *** DDDisk_MountCallback: LSFS device 'lsfs1': mounting progress '44'
8/27 7:38:45.845564 bdc General: *** DDDisk_MountCallback: LSFS device 'lsfs1': mounting progress '45'
8/27 7:38:49.259353 bdc General: *** DDDisk_MountCallback: LSFS device 'lsfs1': mounting progress '46'
8/27 7:38:52.841103 bdc General: *** DDDisk_MountCallback: LSFS device 'lsfs1': mounting progress '47'
8/27 7:38:56.452374 bdc General: *** DDDisk_MountCallback: LSFS device 'lsfs1': mounting progress '48'
8/27 7:38:59.962787 bdc General: *** DDDisk_MountCallback: LSFS device 'lsfs1': mounting progress '49'
8/27 7:39:03.138358 bdc General: *** DDDisk_MountCallback: LSFS device 'lsfs1': mounting progress '50'
8/27 7:39:06.201189 bdc General: *** DDDisk_MountCallback: LSFS device 'lsfs1': mounting progress '51'
8/27 7:39:09.526330 bdc General: *** DDDisk_MountCallback: LSFS device 'lsfs1': mounting progress '52'
8/27 7:39:12.841103 bdc General: *** DDDisk_MountCallback: LSFS device 'lsfs1': mounting progress '53'
8/27 7:39:15.870023 bdc General: *** DDDisk_MountCallback: LSFS device 'lsfs1': mounting progress '54'
8/27 7:39:18.706489 bdc General: *** DDDisk_MountCallback: LSFS device 'lsfs1': mounting progress '55'
8/27 7:39:19.168850 bdc Cache: CacheBase::flushAllData: The operation of Cache Flushing has been finished.
8/27 7:39:19.168954 1014 Cache: CacheBase::lazyWriterWorker: Thread has been exited with err: 0
8/27 7:39:23.946660 bdc General: *** DDDisk_Mount: Failed to create SP object, error ( 4306 ).
8/27 10:31:16.085674 3cc conf: TelnetListener::listenConnections: Accepted control connection from 127.0.0.1:50911.
8/27 10:31:19.117036 f44 Srv: Failed: one of experimental feature module file not found.
muhunal
Posts: 4
Joined: Tue Aug 27, 2019 10:03 am

Tue Aug 27, 2019 12:53 pm

With todays update, my proble solved. and I am able to mount my lfsf drive without any modifications. thanks :)
Oleg(staff)
Staff
Posts: 568
Joined: Fri Nov 24, 2017 7:52 am

Tue Aug 27, 2019 1:48 pm

Thank you for your update :)
Post Reply