
Future of the SAN

Posted: Sun Jan 06, 2013 11:50 am
by robnicholson
We currently run StarWind on a Dell PowerEdge with a 10-disk RAID-10 array of 600GB 15k SAS drives, but we're in the middle of looking at options to expand disk capacity and standardise across a newly acquired company in the group. StarWind is working well for us, but in revisiting the market we came across the Drobo B1200i. This isn't intended as a troll, just a prelude to opening up a discussion on the future of RAID and why I'd like to stay with StarWind (and most likely will right now, because of its ease of adding more JBOD plus the option of a pure-SSD array, which would be really fast).

But I can't deny that the Drobo has some really neat (on paper) features that I'd love to see in StarWind.
  • BeyondRAID: this is the most innovative part IMO and takes hot swap to a level just not possible with traditional RAID. Let's say you're running out of disk space. Simple - you buy 3 x 4TB drives, hot swap the existing 3 x 2TB (or whatever), and capacity instantly increases. With a traditional RAID system this is just not possible, so BeyondRAID allows genuinely incremental upgrades (there's a rough sketch of the capacity maths after this list). I fully appreciate that to implement something like this you throw hardware RAID pretty much out of the window, unless there are some really clever controllers out there. But then again, forgetting hardware RAID and doing it all in software has some attractions as well (like being able to do it better)
  • Automatic reclamation of space: the real "elephant in the room" problem with thin provisioning. Drobo "knows" it's a Windows NTFS volume and can reclaim space from a thin volume. That would make our life a lot easier, as at the moment if we've accidentally copied too much onto a thin-provisioned disk (e.g. Exchange logs), the only way to shrink it is some hairy copying to a new volume and then deleting the old one
  • SSD tier: this is Drobo's get-out-of-jail-free card, as it allows them to ship units using cheap nearline 7.2k SAS drives that have much higher capacity than the non-nearline kind while keeping IOPS reasonable. It's this bit that makes me unsure of the unit, as we might end up with lower IOPS than our existing system - though to be honest I don't think we're very IOPS constrained. But I'd love to see StarWind with a two-tier cache system - RAM caching plus automatically moving frequently used blocks onto SSD (see the sketch after this list for what I mean).
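To illustrate the BeyondRAID point above, here's a rough model of the mixed-drive capacity maths. The rule of thumb (usable space is roughly total capacity minus the largest drive, for single redundancy) is my assumption from Drobo's marketing material, not their actual algorithm:

```python
# Rough model of BeyondRAID-style mixed-drive capacity. The rule of thumb
# (usable = total minus the largest drive, for single redundancy) is an
# assumption from Drobo's marketing, not their actual algorithm.

def usable_tb(drives_tb):
    return sum(drives_tb) - max(drives_tb)

print(usable_tb([2, 2, 2]))   # 4 TB usable from three 2TB drives
print(usable_tb([4, 2, 2]))   # 4 TB mid-upgrade: swap one drive at a time
print(usable_tb([4, 4, 4]))   # 8 TB after hot swapping in all three 4TB drives
```

The appeal is that the upgrade is incremental: swap drives one at a time and the usable figure just grows, with no rebuild-from-scratch.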
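And for the two-tier cache idea in the last bullet, a minimal sketch of the behaviour I have in mind - purely illustrative, not StarWind's implementation: hot blocks sit in a small RAM tier, and blocks evicted from RAM drop into a larger SSD tier instead of going straight back to spindle:

```python
from collections import OrderedDict

# Illustrative two-tier read cache, not StarWind's code: a small LRU RAM
# tier backed by a larger LRU SSD tier. RAM evictions drop to SSD; misses
# in both tiers fall through to spindle.

class TwoTierCache:
    def __init__(self, ram_blocks, ssd_blocks, read_from_disk):
        self.ram = OrderedDict()                  # block_id -> data, LRU order
        self.ssd = OrderedDict()
        self.ram_cap, self.ssd_cap = ram_blocks, ssd_blocks
        self.read_from_disk = read_from_disk      # fallback for cold blocks

    def read(self, block_id):
        if block_id in self.ram:                  # RAM hit: fastest path
            self.ram.move_to_end(block_id)
            return self.ram[block_id]
        if block_id in self.ssd:                  # SSD hit: promote back to RAM
            data = self.ssd.pop(block_id)
        else:                                     # double miss: hit the spindle
            data = self.read_from_disk(block_id)
        self._put_ram(block_id, data)
        return data

    def _put_ram(self, block_id, data):
        self.ram[block_id] = data
        self.ram.move_to_end(block_id)
        if len(self.ram) > self.ram_cap:          # demote coldest RAM block to SSD
            old_id, old_data = self.ram.popitem(last=False)
            self.ssd[old_id] = old_data
            if len(self.ssd) > self.ssd_cap:      # SSD full: coldest block falls out
                self.ssd.popitem(last=False)
```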
As I said, I know it might be bad form to talk about a competitor here, but I'm posting this as feedback from a happy StarWind customer who has been tempted by the bright lights of the big city.

Cheers, Rob.

Re: Future of the SAN

Posted: Sun Jan 06, 2013 4:23 pm
by robnicholson
Interesting what happens when you wander into a new area. Just read up on the TRIM/UNMAP feature in Windows Server 2012 and Windows 8:

http://msdn.microsoft.com/en-us/library ... s.85).aspx

Does StarWind support this yet? We're about to migrate to Windows 2012, so right now this is a pro for the Drobo and a con for StarWind - support would turn it into a pro for StarWind as well.

Cheers, Rob.

Re: Future of the SAN

Posted: Sun Jan 06, 2013 6:07 pm
by anton (staff)
You've hit the nail on the head :) I'm not a big fan of Drobo and their BeyondRAID after I nearly lost my family photos (and my daughter's early MRI scans, which could have been a real issue). Thank God I had an extra copy in the cloud. I'm not alone with this problem, so you can read more here:

http://community.spiceworks.com/topic/2 ... -nightmare

and also Google is your friend. In my case I ended up replacing the Drobo unit with a pair of StarWind-clustered Netgear ReadyNAS boxes, and everything is fine so far.

Back to your questions.

1) RAID sucks. It was never designed for multi-node clustered environments. We're moving away from hardware/software "protective" RAID levels on a single node and moving forward with 3-way replication between 3 nodes (+ async replica at the LUN or VM level if required), so you can use JBOD/RAID0 instead of RAID1/5/6 etc. We believe keeping that many copies of your data is the way to go for both performance and redundancy. Finally, it's cheaper to throw a single 3-4TB SATA spindle into each of 3 boxes than to build RAID1/5/6 using 2-3 spindles on just a pair of nodes (rough numbers in the sketch below). For more than 3 nodes (the majority of SMBs stick with a 2- or 3-box cluster either way - think about ESXi Essentials/Essentials Plus licensing) we'll take a different approach. Think "One Big RAID10": variable stripe length, 2-3-(4?)-way data replicas and automatic data re-balancing as new spindles / storage nodes are added. Plus dedup, of course. And no erasure codes, as I hate them! That's what we'll use to target enterprises that need more than 3 nodes and are not willing to stick with "islands" of storage. V6 does 3-way replication right now and free of charge (you can upgrade from a 2-way replica for free). A fully balanced multi-head cluster is still in its early days...
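To put rough numbers on the cost point (the drive price is an illustrative assumption, not anything from a price list) - same 4TB usable capacity both ways:

```python
# Back-of-envelope for the cost claim above. SATA_4TB is an assumed
# street price, purely illustrative. Both options give 4TB usable.

SATA_4TB = 180  # USD per 4TB SATA drive, assumption

# Option A: 3 nodes, one bare 4TB spindle each, 3-way replication.
a_drives = 3 * 1
# Option B: 2 nodes, RAID1 of 2 x 4TB each, 2-way replication between them.
b_drives = 2 * 2

print(f"A: {a_drives} drives, ${a_drives * SATA_4TB}, 3 copies across 3 nodes")
print(f"B: {b_drives} drives, ${b_drives * SATA_4TB}, 4 copies across 2 nodes")
# A: 3 drives, $540 versus B: 4 drives, $720 - fewer drives, and the
# copies are spread over three nodes instead of two.
```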

2) We're aware of this issue, as the majority of customers tend to place StarWind images on the same logical volumes where other data resides. So the upcoming version will have both UNMAP and zero-page-write used-space reclaim (a concept sketch below), together with other nice features like no more random writes and being totally VM-optimized primary storage. Unlike 1), it's something you'll have this January, so make sure you apply for Beta-1 of V8.
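For readers new to UNMAP, a minimal concept sketch of what reclaim means for a thin-provisioned image (not StarWind's code, just the idea):

```python
# Concept sketch of a thin-provisioned LUN with UNMAP-style reclaim:
# backing space is allocated on first write, and an UNMAP from the guest
# (what NTFS on Server 2012 sends after deletes) frees it again. Not
# StarWind's implementation - just the idea.

class ThinLUN:
    def __init__(self):
        self.blocks = {}                            # LBA -> data, allocated lazily

    def write(self, lba, data):
        self.blocks[lba] = data                     # first write allocates space

    def read(self, lba):
        return self.blocks.get(lba, b"\x00" * 512)  # unallocated reads as zeros

    def unmap(self, lba, count):
        for i in range(lba, lba + count):
            self.blocks.pop(i, None)                # reclaim backing space

lun = ThinLUN()
for lba in range(1000):
    lun.write(lba, b"x" * 512)
print(len(lun.blocks))    # 1000 blocks backed
lun.unmap(0, 900)         # guest deletes files and sends TRIM/UNMAP
print(len(lun.blocks))    # 100 - shrunk without any hairy volume copying
```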

3) Honestly speaking, I don't get all the buzz about tiering with just flash and spindle. And for a reason. Think about a system combining 5% flash and 95% spindle (a pretty common situation). Flash is always fully utilized (100%) and spindle is always underutilized (has free space). So moving data between flash and spindle achieves... what? Saving 5% of your "el cheapo" SATA storage by reporting 5 + 95 instead of just 95% of capacity? Or making you happy with 25% rather than 20% reported free space? See, tiering does not come for free: it wastes IOPS, as data obviously has to be READ and WRITTEN every time it's migrated between tiers (some illustrative arithmetic below). And it leaves you with potential data-inconsistency problems. With plain images you'll have basically contiguous VM images stored on your file system; with tiering enabled you'll have a MESS - you'll never be able to recover the scrambled data in case of a disaster... That's why StarWind will provide you with a flash CACHE and not tiering, as it's FASTER (no data moved, no wasted IOPS) and SAFER (no data moved, no scrambled data). But nothing comes for free, and you'll have less space available - by the size of the flash tier. BTW, tiering does BURN your flash with extra writes! So flash cache will be available with Beta-1 of V8 in January, and I don't have an ETA for real tiering (yes, we are still developing it, even though I don't like it).
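The IOPS tax is easy to put numbers on (both figures below are made-up assumptions, purely for illustration):

```python
# Illustrative arithmetic only - both figures are assumptions. Every
# block migrated between tiers costs a read on the source tier and a
# write on the destination tier, stolen from front-end I/O.

spindle_tier_iops = 2000      # assumed aggregate IOPS of the spindle tier
migrations_per_sec = 300      # assumed background block moves per second

iops_burned = 2 * migrations_per_sec   # one read + one write per move
print(f"left for real work: {spindle_tier_iops - iops_burned} of {spindle_tier_iops} IOPS")
# left for real work: 1400 of 2000 IOPS - and every migration is also an
# extra write burned on the flash.
```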

And I don't buy the idea of a flash cache or all-flash storage sitting at the far end of a slow network wire. It's a dumbass idea to keep fast flash memory (and RAM caches) on the storage itself - then every I/O has to travel down the wire to be processed, whether by spindle, flash or RAM (depending on where the data is located). With a proper design, both RAM and flash caches should reside ON THE HYPERVISOR, so if data is cached, no network (the slowest I/O path, the slowest link) is involved. And that's EXACTLY what StarWind does (other vendors like Mellanox and VMware are catching up with the same idea). I haven't seen anything like it from Drobo so far... They keep their caches behind a 1 GbE uplink... Good luck! Think again about all of this.

Thank you for your feedback.

Re: Future of the SAN

Posted: Mon Jan 07, 2013 11:34 am
by robnicholson
Hi Anton,
anton (staff) wrote: You've hit the nail on the head :) I'm not a big fan of Drobo and their BeyondRAID after I nearly lost my family photos (and my daughter's early MRI scans, which could have been a real issue). Thank God I had an extra copy in the cloud. I'm not alone with this problem, so you can read more here:
To be fair, that's a problem that can occur with any SAN system if you put all your eggs in one basket. Although I accept that anything innovative and new carries a higher risk of problems than a well-established technology. We were in a similar situation when we were considering StarWind - you had to compete with the "Nobody gets fired for buying EqualLogic" mentality.
anton (staff) wrote: RAID sucks. It was never designed for multi-node clustered environments. We're moving away from hardware/software "protective" RAID levels on a single node and moving forward with 3-way replication between 3 nodes (+ async replica at the LUN or VM level if required), so you can use JBOD/RAID0 instead of RAID1/5/6 etc. We believe keeping that many copies of your data is the way to go for both performance and redundancy.
I received the StarWind communication on the tri-node system and went "hmm, that's an interesting idea" - it's certainly something that's in the mix. We might not do it now, partly because our accountant is already worrying about costs (in fact, he always worries about cost :D) but also because we're building a much bigger DR failover scenario between our two primary sites, so loss of a SAN at one site isn't "end of the world", although it's also not "totally business as usual". Shame we couldn't take advantage of your special offer before Christmas. With the hardware upgrades we're planning, we'll certainly have servers "spare" to run StarWind on multiple nodes.
anton (staff) wrote: Finally, it's cheaper to throw a single 3-4TB SATA spindle into each of 3 boxes than to build RAID1/5/6 using 2-3 spindles on just a pair of nodes.
The price/MB of nearline-SAS is very compelling from where I'm sat right now, and with RAID-10 I'm not too worried about the higher failure rate compared to SAS. I am a bit worried about the lower performance from spinning at 7.2k compared to 15k for SAS. The other factor is that you need a much smaller disk enclosure with nearline-SAS, as its capacity can be five times higher (3TB versus 600GB) - but of course fewer disks means fewer spindles, which overall means a double hit on performance (half the spin speed & fewer spindles). It's all so complicated! (I've put some rough spindle maths below.) Which is why the SSD-tier combo does look attractive on paper, and in real-life tests it certainly does appear to bring the IOPS up to an acceptable level.
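Here's the rough spindle maths I keep going around in circles on (the per-drive IOPS figures are the usual rules of thumb, not measurements from our kit):

```python
import math

# Rough spindle maths for ~3TB usable under RAID-10 both ways. Per-drive
# IOPS are rule-of-thumb assumptions (~180 for 15k SAS, ~75 for 7.2k
# nearline), and I'm counting every spindle, which flatters writes.

usable_tb = 3.0

sas_drives = math.ceil(usable_tb * 1000 / 600) * 2    # 600GB drives, x2 for the mirror
nl_drives = math.ceil(usable_tb * 1000 / 3000) * 2    # 3TB drives, x2 for the mirror

print(f"{sas_drives} x 600GB 15k SAS      -> ~{sas_drives * 180} IOPS")
print(f"{nl_drives} x 3TB  7.2k nearline -> ~{nl_drives * 75} IOPS")
# 10 x 600GB -> ~1800 IOPS versus 2 x 3TB -> ~150 IOPS: the double hit
# (half the spin speed AND a fifth of the spindles) in one line.
```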
anton (staff) wrote: it wastes IOPS, as data obviously has to be READ and WRITTEN every time it's migrated between tiers.
I would hope this is done as a background task when the system is idle; otherwise, yes, it would impact IOPS. Intelligent movement of blocks based on their usage just sounds like such a neat idea. Tiered data isn't new - we've had disk->tape schemes for years. So it's just that idea taken to the next level: RAM->SSD->SAS->nearline-SAS->SATA. I'll leave tape off the list as it's dead.

If StarWind did have tiered storage then I'd certainly consider using nearline-SAS, but without it my gut instinct is to stick with 15k SAS or even consider SSD, as it's getting close in price.

Cheers, Rob.

Re: Future of the SAN

Posted: Mon Jan 07, 2013 11:40 am
by robnicholson
The other worry is that I'm becoming obsessed with IOPS ;-) Over the last hour, our 10 x 600GB 15k SAS RAID-10 system on StarWind has averaged 452 IOPS with a maximum of 959. So maybe considering SSD is madness, as the 1GbE backbone would become the bottleneck. Actually we've got 4 x 1GbE, as we split SAN traffic by type (file server, Exchange, app server, everything else), although I recently read up on the idea of using multipath to effectively combine all four NICs. But if we're putting Windows Server 2012 in, it's got NIC teaming down to a tee, so that's another option.

Out of interest, what is the theoretical IOPS through a 1GbE link? I kind of know the MB/s speeds of different disk types but don't know what the equivalent is for the network.

Cheers, Rob.

Re: Future of the SAN

Posted: Wed Jan 09, 2013 12:05 pm
by anton (staff)
1) Well... I'm not aware of ANY other vendor that a) asks me to install a replacement drive just to access the data on a broken RAID set, and b) won't let me mount my content at least in read-only mode with one drive missing (non-functional). I've been replacing drives in my Netgear ReadyNAS box and nothing bad has happened.

2) I'm pretty sure we'll either extend the offer or license the 3-way replica free of charge (to paying customers, of course). I need to talk to the sales dept.

3) I'm talking about the scenario of running StarWind directly on the hypervisor. Any reason why you'd want your SAN / NAS connected to the hypervisor with wires?

4) We'll deliver tiering one day. We'll just start with flash-as-a-cache in the next version.

Re: Future of the SAN

Posted: Wed Jan 09, 2013 12:23 pm
by anton (staff)
For point-to-point configs we've seen ~120K IOPS coming from a single GbE uplink. Intel and MS have squeezed 1M IOPS from 10 GbE with MPIO round-robin, which is not that much slower per gigabit... The problem is you need a pretty deep I/O queue, and that's not always possible or realistic.
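The back-of-envelope version of the theoretical limit, from pure line rate (the per-I/O protocol overhead is an assumed figure; latency and queue depth are ignored):

```python
# Pure line-rate ceiling for one GbE link: divide bandwidth by payload
# plus per-I/O protocol overhead. The 100-byte overhead (iSCSI + TCP/IP
# headers) is an assumption; latency and queue depth are ignored.

LINK_BYTES_PER_SEC = 125_000_000          # 1 Gbps on the wire

for io_bytes in (512, 4096, 65536):
    iops = LINK_BYTES_PER_SEC / (io_bytes + 100)
    print(f"{io_bytes:>6} B I/O -> ~{iops:,.0f} IOPS ceiling")
# 512 B -> ~204K, 4 KB -> ~30K, 64 KB -> ~1.9K: the ~120K figure above
# only shows up with small I/Os and a deep queue.
```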

Sure, going for a 10 GbE backbone is a very good idea. As a point-to-point link it does not require switches, so it comes really cheap ($500 for a pair of 10 GbE NICs with bargain offers from eBay).

NIC teaming on Windows does NOT work for iSCSI!!! Stick with MPIO and a round-robin policy. Teaming should be used for SMB traffic only.