LSFS volume grew, filled hdd, now datastore EMPTY??

kkuszek
Posts: 17
Joined: Fri Dec 05, 2014 1:40 pm

Tue Mar 10, 2015 1:50 pm

Chiming in here with another LSFS issue.

I am on the free v8 install, hyperconverged on ESXi. I have 2 TB of data that grew to fill the 8 TB volume until it had 12.5 MB of free space, at which point all my virtual machines locked up, became unresponsive, and basically flipped out. I realized what had happened, so I rebooted the Virtual SAN machine and installed the latest build claiming to fix the LSFS growth issue, 7774. After it mounted the datastore (hours later, as usual) it came online and became available to ESXi, but there were no VMFS partitions showing! According to VMware, that datastore was showing as unformatted!!!
The drive containing the spsx files also now reports 4 TB more free space than it had before. Did LSFS cleanup happen and also cause the data loss?
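
For anyone reading along who wants a warning before a thin-provisioned LSFS volume eats the whole drive, here is a minimal sketch of a free-space check that could run as a scheduled task on the storage host. The path, the 15% threshold, and the function names are assumptions made up for illustration; this is not a StarWind tool and it does nothing to recover data.

```python
import shutil


def is_low(total_bytes, free_bytes, min_free_fraction=0.15):
    """Return True when the free share of the volume is below the threshold."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return free_bytes / total_bytes < min_free_fraction


def check_volume(path, min_free_fraction=0.15):
    """Query the volume holding `path` and report (low?, free bytes)."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free
    return is_low(usage.total, usage.free, min_free_fraction), usage.free


if __name__ == "__main__":
    # Hypothetical location: point this at the drive holding the LSFS files.
    low, free = check_volume(".")
    status = "LOW" if low else "ok"
    print(f"free space: {free / 1e9:.1f} GB ({status})")
```

With the numbers from this post (8 TB volume, 12.5 MB free) the check would have flagged the drive long before the virtual machines locked up, assuming it ran often enough to catch the growth.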

I recovered several of my machines to another datastore to get production partially up. However, I did not have recent enough working backups for everything that was lost, and I would prefer not to have to re-create it all.

The last time I lost a datastore, the L2 cache had gone corrupt, so I tried mounting this one without the L2 cache. No change.

Am I up a creek, or is there something I can do? All I care about is not losing my Virtual SAN data again.
Anatoly (staff)
Staff
Posts: 1675
Joined: Tue Mar 01, 2011 8:28 am

Tue Mar 10, 2015 7:20 pm

May I ask whether you created new devices with the 7774 build, or created them on a previous build and ran an update afterwards?
Best regards,
Anatoly Vilchinsky
Global Engineering and Support Manager
www.starwind.com
av@starwind.com
kkuszek
Posts: 17
Joined: Fri Dec 05, 2014 1:40 pm

Tue Mar 10, 2015 8:18 pm

Hi Anatoly,
The devices were created earlier, on 7354.

Here is a shortened timeline:
Originally set up the datastores (no dedupe, thin-provisioned LSFS).
Upgraded incrementally several times: 7354 > 7471 > 7509.
LSFS grew and ate all of the space; virtual machines on iSCSI had issues.
Rebooted the StarWind server; when re-mounted, the datastore showed as empty to ESXi.
Rebooted again, stopped the StarWind services from starting, and installed the 7774 upgrade, hoping it could mount the device, clean up the LSFS bloat, and remedy the situation.
kkuszek
Posts: 17
Joined: Fri Dec 05, 2014 1:40 pm

Thu Mar 12, 2015 6:48 pm

I just spent 2 1/2 hours on the phone with several engineers at VMware.
They dug into the hex of the partition tables and really exhausted many avenues. They were perplexed at how this could have happened and were unable to recover any data or information. The metadata for the partitions is gone, and my data is unrecoverable. Unless StarWind can think of some other form of magic, Virtual SAN has lost my production data... again.
Anatoly (staff)
Staff
Posts: 1675
Joined: Tue Mar 01, 2011 8:28 am

Sun Mar 15, 2015 2:17 pm

Hi! Thanks for the update.

How about jumping on a quick call next week? If you agree, please drop me a PM with your phone number.

Thank you
MichelZ
Posts: 34
Joined: Sun Mar 16, 2014 10:38 am

Mon Mar 16, 2015 5:11 pm

kkuszek wrote: I just spent 2 1/2 hours on the phone with several engineers at VMware.
They dug into the hex of the partition tables and really exhausted many avenues. They were perplexed at how this could have happened and were unable to recover any data or information. The metadata for the partitions is gone, and my data is unrecoverable. Unless StarWind can think of some other form of magic, Virtual SAN has lost my production data... again.
We had something similar happen. StarWind is trying to fix it by "going back in time" on the log device...
So far, we haven't gotten it working though :(
Anatoly (staff)
Staff
Posts: 1675
Joined: Tue Mar 01, 2011 8:28 am

Mon Mar 16, 2015 5:16 pm

Yeah, sorry about that. Just an update for the community about Michael's case:
We are trying to build a new utility which should get the data back.
But from what we discussed, the valid data must be near the end of the LSFS device, so going back in steps of 5000 will not really help.
The best you can do for now, while we are working on the new utility, is to continue from the point where we finished, in steps of 10.
MichelZ
Posts: 34
Joined: Sun Mar 16, 2014 10:38 am

Mon Mar 16, 2015 5:22 pm

Yes, I'm currently doing that.
kkuszek
Posts: 17
Joined: Fri Dec 05, 2014 1:40 pm

Tue Mar 17, 2015 3:54 pm

My current ETA is Thursday. I reached out to Anatoly and am waiting on support.
I restored the existing servers to another datastore and have not deleted or changed anything since the disaster, so I am hopeful, but I guess we will see. I will update this thread with my progress as updates come in, for the benefit of other users.
Anatoly (staff)
Staff
Posts: 1675
Joined: Tue Mar 01, 2011 8:28 am

Thu Mar 19, 2015 9:34 pm

Just checking in here: our R&D guys are working on a universal tool for data recovery.
Demonster
Posts: 4
Joined: Thu Oct 23, 2014 7:01 am

Mon Mar 23, 2015 12:55 pm

Hello.

I have the same issue.
After upgrading from 7509, my 3 LSFS storage targets with Hyper-V VHDX disks are empty or buggy.

How can I send you the logs?
kkuszek
Posts: 17
Joined: Fri Dec 05, 2014 1:40 pm

Mon Mar 23, 2015 2:51 pm

As an update for anyone following: since Anatoly's post above, they have not contacted me with any tool or fix, just the update he posted saying that they are working on it.
MichelZ
Posts: 34
Joined: Sun Mar 16, 2014 10:38 am

Mon Mar 23, 2015 2:53 pm

Yes, it's very sad :(
I will never ever go near an LSFS device again!
kkuszek
Posts: 17
Joined: Fri Dec 05, 2014 1:40 pm

Mon Mar 23, 2015 2:56 pm

Same here.
I do not believe v8 was at all production ready. LSFS was what made me want the product, and it was also my demise. This is only a disaster recovery effort for me at this point. How can I recover my information so I can put it on reliable storage elsewhere?
Anatoly (staff)
Staff
Posts: 1675
Joined: Tue Mar 01, 2011 8:28 am

Mon Mar 23, 2015 7:05 pm

Guys, terribly sorry for that. I mean, I know how you guys feel. I just want to assure you that we are doing our best to make all of these issues go away as soon as we can.