Share some dedupe numbers?

Software-based VM-centric and flash-friendly VM storage + free version

Moderators: anton (staff), art (staff), Max (staff), Anatoly (staff)

lohelle
Posts: 144
Joined: Sun Aug 28, 2011 2:04 pm

Sun Jan 22, 2012 12:41 pm

I have done a small dedupe-test in my environment. 4KB block size.

Server 1 DNS server, 2008R2, 8.8GB used of 40GB (thin vmdk)
Server 2 DNS server2, 2008R2, 6.8GB used of 40GB (thin vmdk)
Server 3 Web server, 2008R2, 7.4GB used of 40GB (thin vmdk)
Server 4 AD server, 2008R2, 8.4GB used of 40GB (thin vmdk)

Cloned server 1 to datastore, spdata size = 7 GB
Cloned server 2 to datastore, spdata size = 7.2 GB
Cloned server 3 to datastore, spdata size = 8.2 GB
Cloned server 4 to datastore, spdata size = 10.1 GB

All servers were installed from ISO, not cloned or deployed from templates.

31.4GB used on datastore. 10.1GB spdata, 900MB spmetadata, 315MB spbitmap

11.3GB vs 31.4GB is quite good. Could be some nice savings if I put all my
low-IO 2003 servers on one LUN and all 2008 R2's on a different LUN.
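
For reference, the savings quoted above can be worked out from the posted figures. This is only a minimal sketch: the function name `dedupe_summary` is made up for illustration, and it assumes that the per-server "used" sizes are the logical data and that spdata, spmetadata and spbitmap together make up the physical footprint.

```python
def dedupe_summary(logical_gb, stored_parts_gb):
    """Return (stored GB, dedupe ratio, fraction saved)."""
    stored = sum(stored_parts_gb)
    ratio = logical_gb / stored          # e.g. 2.0 means "2:1"
    savings = 1 - stored / logical_gb    # share of space not consumed
    return stored, ratio, savings

# Figures taken from the post above (GB).
used_per_server = [8.8, 6.8, 7.4, 8.4]
store_files = [10.1, 0.9, 0.315]         # spdata, spmetadata, spbitmap

logical = sum(used_per_server)
stored, ratio, savings = dedupe_summary(logical, store_files)
print(f"{stored:.1f} GB stored for {logical:.1f} GB logical")
print(f"ratio {ratio:.2f}:1, {savings:.0%} saved")
```

With these inputs the ratio comes out at roughly 2.8:1, or about 64% saved, with metadata and bitmap included in the stored total.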

I have not tested higher-IO servers yet, but my 8-core 2.3 GHz Opteron server easily saturates the gigabit link used during cloning.

Would be nice with HA + dedupe!
lohelle
Posts: 144
Joined: Sun Aug 28, 2011 2:04 pm

Sun Jan 22, 2012 9:34 pm

Feature requests:
VAAI support (reclaim of space when deleting files in VMFS)
Dedupe support for HA-devices
And a general request:
Change sync-channel settings on HA-target
lohelle
Posts: 144
Joined: Sun Aug 28, 2011 2:04 pm

Mon Jan 23, 2012 9:36 pm

Strange...
I created an NTFS partition on an 800GB dedupe device (4KB block). I installed a Windows Server 2008 R2 Hyper-V VM. Then I installed a second VM the exact same way.
Dedupe size almost doubled.

When cloning in vSphere, by contrast, the size was almost unchanged (and those were different servers, not two copies of the same VM).
anton (staff)
Site Admin
Posts: 4010
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands

Mon Jan 23, 2012 11:28 pm

VAAI should be part of V6 (and we'll have our own intellectual path to return unused space).

Dedupe support for HA is also part of V6.

Not sure what you mean by "change sync-channel..."

Could you please spawn a dedicated thread with requests in Beta or here? (I'll create a shadow copy in any case.)

Thanks!
Regards,
Anton Kolomyeytsev

Chief Technology Officer & Chief Architect, StarWind Software

anton (staff)
Site Admin
Posts: 4010
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands

Mon Jan 23, 2012 11:31 pm

The offsets shifted in some way. What NTFS cluster size are you using?
lohelle
Posts: 144
Joined: Sun Aug 28, 2011 2:04 pm

Tue Jan 24, 2012 7:21 am

4KB, same as dedupe block size.
Created as a GPT disk.

Not important as I will use vSphere only, but just a bit strange.

I'm going to do some more dedupe tests in a few days with 32 SAS disks in RAID 10 + InfiniBand to the ESXi host.
anton (staff)
Site Admin
Posts: 4010
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands

Tue Jan 24, 2012 8:52 am

Well, quite important for us as we'd like to pinpoint and fix the issue :)
lohelle
Posts: 144
Joined: Sun Aug 28, 2011 2:04 pm

Tue Jan 24, 2012 10:43 am

Any tests you would like me to do? Sector sizes and block sizes? (I was referring to the Hyper-V issue.)
anton (staff)
Site Admin
Posts: 4010
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands

Tue Jan 24, 2012 2:50 pm

Could you please clarify (maybe by e-mail) what exactly you did, so we can find out why dedupe did not actually work for the second image? Thank you!
lohelle
Posts: 144
Joined: Sun Aug 28, 2011 2:04 pm

Tue Jan 24, 2012 3:00 pm

I have a new server that will replace node 2 in our HA setup.

I installed StarWind 5.8 and created an 800GB dedupe disk with default options. I then used the MS iSCSI initiator on the same server to connect to the new target.
Converted the disk to GPT and formatted it as NTFS with default options. Checked, and the cluster size is 4K, the same as the block size of the dedupe disk.

I added the Hyper-V role and created a new VM (80GB disk). Installed from the Windows Server 2008 R2 ISO. Default install. Did nothing after the install finished.
I added a second VM with EVERYTHING the same except for the name (added _2)

Disk usage inside both VMs = 10GB
Dedupe disk file after first VM install = 6.7GB
Dedupe disk file after second VM install = 12GB
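
To put these numbers next to the earlier vSphere cloning test, it helps to look at how much each added VM grew the store. A rough sketch in Python follows; `growth_per_step` is a hypothetical helper. Note the assumption that the two series are comparable: the vSphere figures are spdata only, while the Hyper-V figures are the whole dedupe disk file.

```python
def growth_per_step(cumulative_gb):
    """Turn cumulative store sizes into the amount each step added."""
    previous = [0.0] + cumulative_gb[:-1]
    return [round(now - before, 1)
            for before, now in zip(previous, cumulative_gb)]

# Cumulative sizes in GB, as posted in this thread.
vsphere_spdata = [7.0, 7.2, 8.2, 10.1]   # four different servers cloned
hyperv_file = [6.7, 12.0]                # two identical installs

print("vSphere :", growth_per_step(vsphere_spdata))
print("Hyper-V :", growth_per_step(hyperv_file))
```

In the vSphere series every VM after the first adds far less than its own size, whereas the second, identically installed Hyper-V VM adds almost as much as the first one did.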
Vitalii (staff)
Staff
Posts: 44
Joined: Mon Jun 07, 2010 8:49 am

Wed Jan 25, 2012 2:50 pm

Could you try using a 512-byte deduplication block size for that purpose?

It could be a problem with the Hyper-V file format, which does not necessarily align data to a 4K boundary, so the dedupe engine could not match these blocks.
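
The alignment effect described here can be reproduced with a toy fixed-block deduplicator. The Python sketch below is not StarWind's engine; it is a generic illustration that hashes blocks cut at fixed offsets. The 512-byte shift and the random payload are assumptions chosen for the demo, and `block_hashes` and `duplicate_blocks` are hypothetical helper names.

```python
import hashlib
import os

def block_hashes(data, block_size):
    """Hash every block, cutting only at multiples of block_size."""
    return [hashlib.sha256(data[i:i + block_size]).digest()
            for i in range(0, len(data), block_size)]

def duplicate_blocks(stored, incoming, block_size):
    """Count blocks of `incoming` whose hash already exists in `stored`."""
    known = set(block_hashes(stored, block_size))
    return sum(h in known for h in block_hashes(incoming, block_size))

payload = os.urandom(64 * 4096)             # stand-in for identical guest data
first_image = payload                       # data starts on a 4 KB boundary
second_image = os.urandom(512) + payload    # same data, shifted by 512 bytes

for size in (4096, 512):
    hits = duplicate_blocks(first_image, second_image, size)
    total = len(block_hashes(second_image, size))
    print(f"{size}-byte blocks: {hits} of {total} already stored")
```

With 4096-byte blocks the shifted copy shares no blocks with the first image, because every block boundary now cuts the data at a different place. With 512-byte blocks the shift is a whole number of blocks, so everything except the extra leading 512 bytes matches.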
lohelle
Posts: 144
Joined: Sun Aug 28, 2011 2:04 pm

Wed Jan 25, 2012 5:12 pm

512-byte dedupe (200GB disk)
5.5GB usage (just spdata) after 1 VM install, 6.7GB after the second. But there is also about 10GB of metadata and bitmap, plus bad performance and high memory usage.

Tried 4KB block again with a GPT disk. Formatted and explicitly selected 4KB clusters (even though this was chosen automatically last time).
Same results as the previous 4KB results.

Same results with MBR and 4KB block/4KB cluster.
As far as I know, Windows Server 2008 R2 setup should align partitions automatically? (I was thinking about inside the VMs.)
anton (staff)
Site Admin
Posts: 4010
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands

Thu Jan 26, 2012 12:51 am

Yes, it's a VHD alignment issue. Confirmed! We'll check what can be done to fix or work around this.

512-byte dedupe is actually not intended for production (only for experiments). The load is too heavy.
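
One way to see why the load grows so much at 512 bytes is to count how many blocks have to be tracked for the same device. A back-of-the-envelope sketch in Python, assuming the 200GB test disk is 200 GiB and that bookkeeping cost grows with the number of blocks; the real on-disk metadata layout is not described in this thread, and `tracked_blocks` is a hypothetical helper.

```python
GIB = 1024 ** 3

def tracked_blocks(device_bytes, block_size):
    """Number of fixed-size blocks needed to cover the device (rounded up)."""
    return -(-device_bytes // block_size)

device = 200 * GIB                       # assumed size of the 200GB test disk
small = tracked_blocks(device, 512)
large = tracked_blocks(device, 4096)

print(f"512-byte blocks : {small:,}")
print(f"4096-byte blocks: {large:,}")
print(f"factor          : {small // large}x")
```

Going from 4KB to 512-byte blocks means eight times as many blocks to hash and index, which fits the heavy metadata, memory and performance cost reported above.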