V8 Dedupe & licensing questions

Public beta (bugs, reports, suggestions, features and requests)

Moderators: anton (staff), art (staff), Max (staff), Anatoly (staff)

eliripoll
Posts: 33
Joined: Sat Dec 08, 2012 8:19 pm
Location: United states

Fri May 23, 2014 1:56 pm

Hello Starwind team

Great work on the product. I have been testing the V8 beta for a while and I think it's at a good point to start recommending that some organizations move forward on it, but I still have a couple of concerns/reservations I was hoping I could get some feedback on. Sorry for the long post in advance ;)

1.) A while ago I raised the LSFS/dedupe issue where the .spspx files eventually grow larger than the data they are deduplicating; some recommendations were to use sdelete etc. to zero out free space. Has there been any progress on this in the latest release? One thought I had: if we use statically assigned disks through Hyper-V instead of dynamic ones, does the StarWind dedupe issue improve or persist?

2.) Another one I noticed while working with the beta: every once in a while I would do a clean shutdown, yet upon reboot it would try to do a full sync. It has been reported various times, but with no clear winner of an exact solution. After working on this for a while, I think it may be attributable to the fact that 2012/8 has a very aggressive shutdown process (the WaitToKillServiceTimeout registry key kills everything after 5 seconds, yikes), which is bad for flushing cache. I changed WaitToKillServiceTimeout to 900000 ms and haven't had an issue since; I'm sure this would vary depending on how large a cache you use. I was wondering if you guys could also test and see if that makes a difference on your end. Before this I would have to make sure to stop the service manually first, then shut down. Just wanted to put that out to the community.
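For reference, the change I made can be done from an elevated prompt like this (the 900000 ms value is just what worked in my testing; adjust for your cache size):

```shell
# Give services more time to stop on shutdown so write caches can flush
# (WaitToKillServiceTimeout is a REG_SZ value in milliseconds)
reg add "HKLM\SYSTEM\CurrentControlSet\Control" /v WaitToKillServiceTimeout /t REG_SZ /d 900000 /f
```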

3.) The last one is regarding how StarWind handles the licensed storage amount (can't remember if I asked this already). Currently, if you have a 1TB license and thin-provision a 1TB LUN using dedupe, and your storage is highly duplicated, it may dedupe down to, say, 200GB of replicated storage. If I then want to create another LUN, since I still have 800GB of licensed storage unused, it seems I can't: the software states I have already provisioned 1TB of space, not caring that I'm only using 200GB. And if I try to use Windows to fill up more of the 800GB (to work around the issue), I also can't, because from Windows' perspective I am already at 1TB; it's unaware of the deduplication in the background and will also prevent me from adding more data. So I end up not able to use all the space I could.

This was my experience with V6 and I wondered if it is still the case in V8. It would seem better for StarWind to monitor actual replicated data used for licensing, as opposed to what was merely thin-provisioned; that way an end user can use the purchased license to its fullest. Currently I'm hesitant to recommend a certain license size (i.e. 1TB, 2TB, etc.) because I don't want a scenario where the organization can't use all the space they paid for with deduplication enabled. Has there been any change in this regard?

Thanks for your feedback,

Eli
anton (staff)
Site Admin
Posts: 4008
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands
Contact:

Sun May 25, 2014 9:23 am

Thank you :)

1) StarWind deduplication is not a feature on its own. It's an essential part of our log-structured file system (LSFS) implementation that does many things, but two of them are of interest here: log-structuring (surprise!) to increase random write performance roughly 10x (the so-called "I/O Blender" effect is completely eliminated with StarWind), and ROW (Redirect-on-Write) snapshots, again to increase write performance (the read-modify-write sequence typical of COW, Copy-on-Write, snapshot implementations is eliminated as well). Both technologies are disk space pigs. You cannot compare StarWind dedupe with MSFT dedupe here, as MSFT keeps only the active data they squeeze; we keep the whole history of all transactions plus snapshots, and we space-optimize THAT content, MUCH more data. Long story short: StarWind dedupe will save space compared to "raw" VM sizes, and you'll see these savings only after you delete the snapshots (keeping only the current one) and "trim" the write-transaction history (removing multiple writes to the same addresses stored inside a single snapshot)... or enable VSS snapshots on a volume running MSFT dedupe and see how much space MSFT saves you in that scenario. I mean, compare apples to apples and not apples to oranges :)
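To illustrate the point with a toy sketch (not StarWind's actual on-disk format): a log-structured store retains every version of every block it has ever written, so even with dedupe the "physical" footprint can exceed the live data set until the old history is trimmed.

```python
import hashlib

class ToyLogStore:
    """Toy log-structured, content-deduplicated block store (illustrative only)."""
    def __init__(self):
        self.blocks = {}   # hash -> data: every unique block ever written (the log)
        self.latest = {}   # address -> hash: current version of each address
        self.history = []  # (address, hash) write log, i.e. snapshot history

    def write(self, addr, data):
        h = hashlib.sha256(data).hexdigest()
        self.blocks.setdefault(h, data)  # dedupe: identical blocks stored once
        self.latest[addr] = h
        self.history.append((addr, h))

    def live_bytes(self):
        """What the guest OS sees as used: only the current version of each block."""
        return sum(len(self.blocks[h]) for h in self.latest.values())

    def physical_bytes(self):
        """What the container files actually hold: every retained version."""
        return sum(len(d) for d in self.blocks.values())

    def trim(self):
        """Drop history: keep only blocks referenced by the live view."""
        keep = set(self.latest.values())
        self.blocks = {h: d for h, d in self.blocks.items() if h in keep}
        self.history = [(a, h) for a, h in self.history if h in keep]

store = ToyLogStore()
for i in range(4):                      # overwrite the same address 4 times
    store.write(0, b"version-%d" % i * 128)
grew = store.physical_bytes() > store.live_bytes()  # history inflates the store...
store.trim()                            # ...until old versions are reclaimed
```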

BTW, with V8 we did another thing I'm not aware of anybody else doing ATM: you can keep a limited history of snapshots on a fast and expensive primary storage cluster (going all-flash?) and basically offload all the old snapshots you still need to dedicated snapshot-history nodes equipped with inexpensive high-capacity SATA. That would indeed save space on primary storage and move cold data to an inexpensive secondary tier.

2) There are some minor issues, but they are Windows Server 2008 R2 related (you're right, MSFT flushes file system caches late, so on flat storage many VMs starting to flush generate gigabytes of random writes). But running clustered (a single host is not something we consider a production environment at all) on Windows Server 2012 R2 you should have zero issues. If you keep getting weird behavior with V8 Gold, please let us know and we'll be happy to help.

3) We license usable reported LU capacity. So if we report, say, 4TB, that's 4TB usable. Actual disk space used can be more (with a 3-way replica it would be 12TB used across a three-node setup), it can be MUCH bigger if you enable snapshots (snapshot capacity is NOT licensed, which is very different from other vendors), and it can be MUCH less (if you enable dedupe and use thin-provisioned LSFS volumes).
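In other words, a sketch of the accounting just described (illustrative arithmetic only, not product code): only the reported LU size counts against the license, while replica copies and snapshot space change what is actually consumed on disk.

```python
def licensed_tb(luns):
    """Capacity counted against the license: the sum of reported LU sizes.
    Replica copies and snapshot space are not counted (per the model above)."""
    return sum(luns)

def raw_disk_tb(luns, replicas, snapshot_overhead_tb=0.0, dedupe_ratio=1.0):
    """Actual disk consumed: every replica node holds a full copy, snapshots
    add on top, and dedupe divides the footprint."""
    return sum(luns) * replicas / dedupe_ratio + snapshot_overhead_tb
```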

Pretty much everything has changed in the V6 -> V8 transition, so I would strongly encourage you to take a look @ V8.

Thank you again!
Regards,
Anton Kolomyeytsev

Chief Technology Officer & Chief Architect, StarWind Software

eliripoll

Tue May 27, 2014 2:44 pm

Hi Anton, thanks for the updates, definitely helpful.

I checked around the SWS console for snapshot management; the only reference to snapshots I found was under the device's 'Snapshot Manager', but it had no snapshots to show, and under advanced settings I also didn't see anything regarding snapshots. I also think I may not have worded my question clearly the first time, so let me give an example. I downloaded the latest V8 build on Thursday to test.

1.) I am using LSFS with only one VM running in the LUN, 10GB in size from within Windows (both host and guest see only 10GB used). However, when I check the size of the folder that contains all the .spspx files, it's 23GB, and the VM has only been running for 4-5 days, idle. This is the problem I'm asking about: initially dedupe works as expected, but after a few weeks the store grows bigger than the data it's actually storing. What are the steps to stop that from happening, especially when the VM is doing nothing other than running idle, no databases? My last test with the SWS beta deduped something down to 30GB, then after weeks the .spspx data grew to over 100GB even though the data was only 65GB. Is there some specific configuration that has to be set to prevent this?

3.) For this one I also don't think I relayed my question correctly. I do understand how StarWind does licensing for space currently, but I was pointing out a slight issue with that method when deduplication is incorporated. Let me try to reword it. I downloaded the latest V8 build, non-beta, so that I would have a hard limit. I created an LSFS/dedupe volume of 120GB and copied 11 VHDs that are almost identical. SWS dedupes them down to 20GB. So far so good. Now Windows thinks the drive is full, because it's unaware of SWS dedupe, so I can't add any more VHDs, but SWS has it down to ~20GB on disk. Yet if I try to either create another LUN or increase the size (so I can use the other 100GB of licensed replicated storage), I get "Total storage size exceeds licensed capacity".

So my question is: how would an end user use the other 100GB (or whatever license they purchased) of the replicated size SWS is licensed for?

Windows won't let me use more because it thinks the drive is full, and SWS won't let me because it's only tracking provisioned space, not actual replicated storage.
anton (staff)

Tue May 27, 2014 2:58 pm

1) LSFS takes automatic snapshots. Trim them to remove the old stuff.

2) That limit is removed in the recent builds. We'll stop giving away a capacity-limited HA version for Hyper-V; a limited number of people will get capacity-unlimited ones.
eliripoll

Tue May 27, 2014 6:21 pm

Ah, thanks Anton, OK, understood. That clears up the confusion for #3, and #2 is good.

But for 1.), for LSFS, I understand the features; what I was trying to find out is how to stop SWS from growing bigger than the actual stored data, especially for dedupe devices.

see here also for reference

http://www.starwindsoftware.com/forums/ ... t3465.html

thanks
anton (staff)

Tue May 27, 2014 7:06 pm

You cannot do it. LSFS is a log-structured file system, so it keeps a history of all transactions and automatically fires snapshots. Enabling dedupe helps reduce the on-disk footprint for a running set of VMs. To reduce space for a single ice-cold VM, just use something else (MSFT dedupe, for example).
eliripoll

Tue May 27, 2014 7:35 pm

In my test scenario the VMs are running but idle, just plain OS installs, no SQL/Exchange, etc.
In my case the .spspx dedupe store "for running VMs" grows larger than the actual VM space. If that's by design, would I be correct in understanding that deduplication with LSFS is more about making sure there's room for all the transactions (via deduplication) and not so much about working in a smaller footprint?
If so, at what point does SWS know to let go of old transactions: is there a preset interval, or will it just continue to grow based on physical disk size?
anton (staff)

Tue May 27, 2014 8:56 pm

No, these are not correct assumptions. All you need to do is let LSFS trim the old data (multiple overwrites of the same block, say paging I/O).
eliripoll

Tue May 27, 2014 10:18 pm

OK, yes, that's what I am asking: when does it begin to TRIM? So far it looks like it just keeps growing and growing. Can you elaborate on when it stops growing (begins to TRIM)? Does it first fill up the whole drive and then start overwriting old data, or does it have a set maximum and then start trimming? So far my single 10GB VM is now taking up 55GB of space in .spspx files. I am just trying to find out either:
when does it stop growing (i.e. when does TRIM start taking effect, because it hasn't yet)?
or does it just keep filling up the physical drive and then start trimming when there's no more space available?

I know you're saying let it TRIM; I am asking when it is supposed to start doing that, and what happens if there is no more space on the physical drive (due to all the extra .spspx files)?

Thanks
anton (staff)

Wed May 28, 2014 4:31 pm

Compatible modern OSes like Windows Server 2012 R2 and Windows 8/8.1 (plus some Linuxes) send the UNMAP command automatically, so you don't need to do anything. Older OSes need the sdelete utility executed inside a VM, acting as an agent, to zero out deleted-but-not-overwritten content (on Windows, of course; there are equivalents for Linux as well). Overwritten content is deleted automatically. To initiate space reclaim you need more than 30% junk inside a given "segment" file. The upcoming update will actually "trim" content in any of 3 cases: 1) junk space inside a "segment" exceeds a threshold value, 2) the scheduled time for a "trim" operation arrives, or 3) user intervention. Also, we've discovered some problems with the current implementation, so please make sure you update your build when Alex notifies you. Thanks!
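The 30%-junk rule above can be sketched like this (illustrative only; the threshold and the per-segment junk accounting come from the description in this post, not from actual product code):

```python
JUNK_THRESHOLD = 0.30  # reclaim a segment once more than 30% of it is junk

def segments_to_reclaim(segments, threshold=JUNK_THRESHOLD):
    """Return indices of segment files eligible for trimming.
    `segments` is a list of (junk_bytes, total_bytes) pairs."""
    return [i for i, (junk, total) in enumerate(segments)
            if total > 0 and junk / total > threshold]
```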
eliripoll

Wed May 28, 2014 4:47 pm

Awesome, that's great info. I'll recreate it in a 2012 environment, test again, and look out for the next update.
anton (staff)

Wed May 28, 2014 4:52 pm

Sure. And I'll try to get the guys to release a technical document on using StarWind deduplication...