Deduplication in v8 - now recommended for storing VMware VMFS?

Public beta (bugs, reports, suggestions, features and requests)


-am-
Posts: 2
Joined: Mon Jan 20, 2014 10:27 am

Mon Jan 20, 2014 10:40 am

Anton,

one year ago you explained on this page http://www.starwindsoftware.com/forums/ ... t3028.html why it did not make sense to run VMware VMs / VMFS on dedup-enabled devices with StarWind v6, and you advised waiting for StarWind v8 (whose final version will be released soon).

Would you please outline in which environments (or for which system configurations) you would recommend running VMware VMFS on top of StarWind v8 deduped devices?

Does it still make sense? And if so, are there any limitations sysadmins have to be aware of before they start putting VMs on a StarWind v8 dedup device?
anton (staff)
Site Admin
Posts: 4010
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands

Mon Jan 20, 2014 11:30 am

V8 addresses the issues you've highlighted in your post (for example, data delete and VAAI were added in the interim versions, performance was fixed with a re-design, and now we're completing the UNMAP work and final testing). So yes, you could keep your data on LSFS volumes (which still carry an "experimental" sticker in V8).
Regards,
Anton Kolomyeytsev

Chief Technology Officer & Chief Architect, StarWind Software

-am-
Posts: 2
Joined: Mon Jan 20, 2014 10:27 am

Mon Jan 20, 2014 8:15 pm

Anton, thank you.

Unfortunately block-level sharing of storage is still a complex topic.

To ensure sustained and predictable results, you need to control every part of the entire chain. And that is quite a challenging job if your code has to rely on third-party code, infrastructure, or an entire closed OS (like Windows) that is also a moving target (with every new release).

Good luck, and I'm really looking forward to v8 and the final arrival of LSFS with inline deduplication.
anton (staff)
Site Admin
Posts: 4010
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands

Mon Jan 20, 2014 8:21 pm

Not really. We have a shared database of deduplicated blocks between LUNs and targets. We're not a file system filter like MS dedup; we're a monolithic storage stack.

We don't rely on any third-party code and we don't rely on Windows built-in features. StarWind is developed from scratch.
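
To make the "shared database of deduplicated blocks" idea concrete, here is a hypothetical sketch of a content-addressed block store shared across LUNs. None of these class or function names come from StarWind; it is only an illustration of the concept:

```python
# Hypothetical sketch: one content-addressed block database shared by every
# LUN/target, rather than a per-volume file system filter. Illustrative only.
import hashlib

class SharedDedupStore:
    """A single block database shared across all LUNs and targets."""
    def __init__(self):
        self.blocks = {}   # hash -> raw block data (stored once)
        self.refs = {}     # hash -> reference count across all LUNs

    def write_block(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        if key not in self.blocks:          # new unique block: store it once
            self.blocks[key] = data
            self.refs[key] = 0
        self.refs[key] += 1                 # another LUN/offset points here
        return key                          # the LUN keeps only this reference

class Lun:
    """A LUN maps logical block numbers to references in the shared store."""
    def __init__(self, store: SharedDedupStore):
        self.store = store
        self.table = {}    # logical block number -> block hash

    def write(self, lbn: int, data: bytes):
        self.table[lbn] = self.store.write_block(data)

# Two LUNs writing identical data consume the space of one block:
store = SharedDedupStore()
a, b = Lun(store), Lun(store)
a.write(0, b"x" * 4096)
b.write(0, b"x" * 4096)
assert len(store.blocks) == 1 and store.refs[next(iter(store.refs))] == 2
```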
Regards,
Anton Kolomyeytsev

Chief Technology Officer & Chief Architect, StarWind Software

DavidMcKnight
Posts: 39
Joined: Mon Sep 06, 2010 2:59 pm

Wed Jan 22, 2014 7:04 pm

With the rewrite of dedup and all the other advances in v8, will VMware know it is writing to a dedup volume? If you have a 1000 GB VMware volume and write 500 GB of data, and dedup gets you down to 250 GB of actual data on the datastore, the problem in the past has been VMware thinking you have a volume that is 50% full, not 25%. Is this issue going to be resolved in v8?
Anatoly (staff)
Staff
Posts: 1675
Joined: Tue Mar 01, 2011 8:28 am

Thu Jan 23, 2014 2:11 pm

Good question. I think I'd better explain how it works:
When you create a DD device of, let's say, 100 GB, VMware (or anything else) will see it as a 100 GB disk. If you put 100 GB on it that takes only 50 GB on the storage side, VMware will still see the disk as full. The point is that DD decreases the space used on the server where the DD SAN disk is running, so in that scenario you would be able to create another 100 GB device, which would be able to grow up to 50 GB on the disk.
I hope I was clear here and my post helps you understand the main idea of DD, but if not, just let me know.
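
For what it's worth, a minimal sketch of that accounting, using the 100 GB / 2:1 figures from the example above (the variable names and numbers are purely illustrative):

```python
# Minimal sketch of the accounting described above, assuming a 100 GB DD
# device written completely full with data that dedupes 2:1.

device_size_gb   = 100   # what VMware sees: a plain 100 GB disk
logical_used_gb  = 100   # data the guest has written
dedup_ratio      = 2.0   # 2:1, per the example above
physical_used_gb = logical_used_gb / dedup_ratio

print(f"VMware sees:  {logical_used_gb}/{device_size_gb} GB used (full)")
print(f"Storage sees: {physical_used_gb:.0f} GB actually consumed")
# VMware sees:  100/100 GB used (full)  <- the guest thinks the disk is full
# Storage sees: 50 GB actually consumed <- the savings live on the SAN side
```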
Best regards,
Anatoly Vilchinsky
Global Engineering and Support Manager
www.starwind.com
av@starwind.com
DavidMcKnight
Posts: 39
Joined: Mon Sep 06, 2010 2:59 pm

Thu Jan 30, 2014 4:47 pm

Anatoly (staff) wrote: I think I'd better explain how it works
That's what I was afraid of...

So can you explain to me why you would ever trust DeDup in a production environment?

Say I had a deduped iSCSI volume holding a dozen Windows 7 VMs that were all but identical (a computer lab, for example). Even though I have to configure the iSCSI volume to be big enough to hold a dozen VMs (plus a little extra), the actual disk space taken up on the datastore would be relatively small. Now let's say I have six of these volumes on my StarWind datastore. Because of dedup I am massively oversubscribed on the datastore, and if, for some reason outside of my control, the data on the iSCSI volumes became less and less dedupable, that would start to grow the StarWind .IMG files (or whatever the new extension is). This runs the D: drive (where I store the .IMG files on my StarWind server) out of space. That would cause all sorts of hell.

I know this isn't StarWind's fault, nor VMware's. But the idea of dedup and other types of compression on a networked datastore isn't that new an idea. Without some way for a VMware host to talk to the datastore about compression (dedup or otherwise), it seems to me you are playing with explosives. At some point you're going to blow off your hand.
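
As a rough illustration of that failure mode, here is a sketch with made-up numbers (none of these sizes or ratios are StarWind defaults): six 100 GB LUNs on a 200 GB backing drive are fine at a 10:1 dedup ratio, but overflow as the data becomes less dedupable:

```python
# Rough sketch of the oversubscription risk described above: six 100 GB LUNs
# backed by one physical drive, sized assuming a high dedup ratio. If the
# data grows less dedupable, the image files expand and the drive fills up.
# All numbers are illustrative assumptions.

luns            = 6
lun_size_gb     = 100          # logical size VMware sees per LUN
backing_disk_gb = 200          # the D: drive holding the image files

for dedup_ratio in (10.0, 4.0, 2.0, 1.2):
    physical_gb = luns * lun_size_gb / dedup_ratio
    status = "OK" if physical_gb <= backing_disk_gb else "OUT OF SPACE"
    print(f"ratio {dedup_ratio:>4}:1 -> {physical_gb:6.1f} GB on disk [{status}]")

# ratio 10.0:1 ->   60.0 GB on disk [OK]
# ratio  4.0:1 ->  150.0 GB on disk [OK]
# ratio  2.0:1 ->  300.0 GB on disk [OUT OF SPACE]
# ratio  1.2:1 ->  500.0 GB on disk [OUT OF SPACE]
```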
anton (staff)
Site Admin
Posts: 4010
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands

Fri Jan 31, 2014 4:45 pm

OK, so some theory...

If deduplication is implemented @ the file system level (local or remote does not matter), then the file system driver just adjusts the amount of free space it reports, scaled by the deduplication coefficient.
Say you have a set of 10 files of 1 GB each, but they are all identical, so the actual used space would be 1 GB (+ some metadata). The file system would report ~9 GB free, while a file-parsing utility would report 10 GB used.
A little bit illogical, and it would break some apps (volume total space, free space, and used space would not "play" with per-file used space), but OK.

If deduplication is implemented @ the block level (below the file system), we'll hit the situation you describe. At some point a volume of, say, 1 TB with 1 TB used can occupy, say, 10 GB of real disk space allocated on the storage back end. Hard stop? No! You just need to re-provision the volume (apply the "grow" function) and tell it you now have, say, a 2 TB LUN. Both VMware vSphere and Microsoft Hyper-V can do it on-the-fly.
Bad news: some manual intervention is required. Good news: it's 100% transparent to the file system and there are no weird numbers (volume free space vs. the others).

That's it :)
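
A toy comparison of the two models above, using the same numbers (ten identical 1 GB files on an assumed 10 GB volume, and a 1 TB LUN backed by ~10 GB of real space); the code is purely illustrative:

```python
# Toy comparison of file-system-level vs. block-level dedup reporting,
# using the numbers from the post above. Illustrative only.

volume_gb, files, file_gb = 10, 10, 1   # assumed 10 GB volume, ten 1 GB files
actual_gb = 1                           # all ten are identical: one real copy

# File-system-level dedup: the FS driver inflates the reported free space.
fs_free_gb = volume_gb - actual_gb      # ~9 GB reported free
per_file_used_gb = files * file_gb      # a file walk still sums to 10 GB
print(f"FS-level:    free={fs_free_gb} GB, sum of file sizes={per_file_used_gb} GB")

# Block-level dedup: the volume honestly reports itself full, and the admin
# grows the LUN while the thin back end still has plenty of room.
lun_gb, lun_used_gb = 1024, 1024        # 1 TB LUN, 100% used logically
backend_gb = 10                         # real space allocated on the SAN
if lun_used_gb >= lun_gb:
    lun_gb *= 2                         # "grow": re-provision to a 2 TB LUN
print(f"Block-level: LUN now {lun_gb} GB, back end still holds {backend_gb} GB")
```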
Regards,
Anton Kolomyeytsev

Chief Technology Officer & Chief Architect, StarWind Software
