AWS Storage Gateway (aws.amazon.com)
http://www.allthingsdistributed.com/2012/01/The-AWS-Storage-...
http://aws.typepad.com/aws/2012/01/the-aws-storage-gateway-i...
This solution isn't designed to replace fault tolerance on local hardware. It is for near-real-time offsite replication and backup.
Data in S3 is stored in at least three geographically separate locations and snapshots are very fast and very efficient on storage space.
The final major advantage of a solution like this is that if your primary site goes down (floods, tornadoes, etc.), you can bring up all your existing images via EC2 without keeping a bunch of redundant hardware sitting around waiting for disaster to strike.
And what do you pay for this? $125/month plus a per-GB storage cost that is CHEAPER than enterprise storage generally is.
The price of enterprise backup solutions is crazy.
1: http://www.allthingsdistributed.com/2012/01/The-AWS-Storage-...
1: iSCSI can be seen as a USB drive over a network. You plug in, your machine sees it as a drive, you write data. You can unplug, then plug in somewhere else, and as long as the file system is readable on the new machine, you can get your files.
2: The appliance AWS is offering gives you iSCSI volumes, backed by DAS or SAN storage locally, but also backed by S3 storage in AWS.
So, basically, it's like having a drive that is automagically backed up to S3, but S3 does not need to know anything about what is on the drive. It could be VMs, videos, your mail server... anything really.
Not only that, it solves a lot of problems, such as the danger of storing backup snapshots on-site, and it makes archival, deployment and access to S3's CDN functionality easy.
Sold as far as I am concerned!
(Yes I know it's expensive, but it's cheaper than buying something in-house and employing another hairless monkey to manage it).
This would make a wide range of big-storage use-cases ridiculously trivial - those where only ~10% of the data-set is frequently accessed.
I.e. one could lazily scale the expensive local storage with throughput-demand, while the S3 backing store takes care of the long-tail (which can easily be many terabytes long when you're dealing with media files).
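To make the hot/cold split concrete, here is a toy sketch of a small fast tier sitting in front of a large backing store. It is only an illustration of the idea, not how the Storage Gateway itself works: the `TieredStore` class, its method names and the dict standing in for S3 are all made up for this example.

```python
from collections import OrderedDict


class TieredStore:
    """Hypothetical small LRU 'hot' tier in front of a large 'cold' store."""

    def __init__(self, backing, hot_capacity):
        self.backing = backing        # stand-in for the S3-backed long tail
        self.hot = OrderedDict()      # stand-in for expensive local storage
        self.hot_capacity = hot_capacity
        self.misses = 0               # reads that had to go to the cold tier

    def read(self, key):
        if key in self.hot:
            self.hot.move_to_end(key)     # mark as most recently used
            return self.hot[key]
        self.misses += 1
        value = self.backing[key]         # slow path: fetch from the cold tier
        self.hot[key] = value
        if len(self.hot) > self.hot_capacity:
            self.hot.popitem(last=False)  # evict the least recently used item
        return value


# Ten media files in the cold tier, room for only two locally.
cold = {f"file{i}": f"data{i}" for i in range(10)}
store = TieredStore(cold, hot_capacity=2)
for name in ["file0", "file1", "file0", "file2", "file0"]:
    store.read(name)
print(store.misses, list(store.hot))
```

Growing `hot_capacity` is the "lazy scaling" knob: you only add local storage when the miss count shows the hot set no longer fits.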
Implementing a tiered storage system yourself is pretty complex. Using this S3 gateway might be simpler, but it's not trivial (e.g. you'll need VMware ESXi just to get started).
I.e. instead of VMware it'd be more useful for us to hook in with a FUSE-layer or a patched variant of a filesystem such as GlusterFS.
You're of course correct about the pricing. Their current prices cover some middle-ground but would need to be discounted to make it feasible for larger deployments. However, at the low-end (your 20T figure) the price seems already justifiable when you factor in staff and infrastructure costs (rack+power alone make up for half of the difference).
1: The Dell at $7k does not include power, and your 12 2TB drives give you 20TB usable with RAID 6 (losing 2 drives to parity). If you lose more than 2 disks, you are screwed... so you need to back that up somewhere...
2: You also need someone to manage that machine...
3: ESXi, for what you would need here (8GB RAM or so), is free, unless you want support...
I think, in all fairness, that depending on the amount of storage you need or want, it's swings and roundabouts... I like the idea, but I would also like the idea of having a box in-house with a lot of storage (like the big Dell) and only selecting some parts for offsite backup... This is what I do... Most of my stuff is stored locally (RAID 1, Thecus NAS, Drobo) and only important stuff (music and videos I bought, photos I took, etc.) is backed up to the "cloud"...
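For anyone checking the 20TB figure, the RAID 6 arithmetic is just "total drives minus two for parity". A quick sketch (the function name is made up for illustration, and it ignores filesystem overhead and TB vs TiB differences):

```python
def raid6_usable_tb(drives, drive_tb):
    """Usable capacity of a RAID 6 array: two drives' worth goes to parity."""
    if drives < 4:
        raise ValueError("RAID 6 needs at least 4 drives")
    return (drives - 2) * drive_tb


# The 12 x 2TB box discussed above.
print(raid6_usable_tb(12, 2), "TB usable")
```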
In contrast, storing 50TB for 5 years on S3's reduced redundancy storage would cost $250,000. If I ever need to transfer any of that data back to my data center, there'll be a hefty bill for that as well.
Also, you seem to be forgetting that you still need local storage for your images anyway. This is a hands-off backup and disaster recovery product.
For full disclosure, the Storage Gateway's pricing isn't the same as S3's at the moment. They only have one storage tier and it is $0.14 per GB, no discounts. Therefore, 50TB of storage over 5 years is $420k.
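The $420k number can be reproduced with a few lines. This is a back-of-the-envelope sketch that assumes the $0.14 is a per-GB monthly rate and that 1 TB = 1000 GB (both assumptions, though they are the ones that make the figure work out); the function names are just for illustration. The flat $125/month gateway fee mentioned earlier in the thread is shown separately.

```python
GB_PER_TB = 1000        # assumption: decimal terabytes
MONTHS_PER_YEAR = 12


def storage_cost(tb, years, price_per_gb_month):
    """Total storage cost in dollars at a flat per-GB monthly price."""
    return tb * GB_PER_TB * price_per_gb_month * years * MONTHS_PER_YEAR


def gateway_fee(years, fee_per_month=125):
    """Flat per-gateway fee in dollars over the same period."""
    return fee_per_month * years * MONTHS_PER_YEAR


storage = storage_cost(tb=50, years=5, price_per_gb_month=0.14)
fee = gateway_fee(years=5)
print(f"storage: ${storage:,.0f}, gateway fee: ${fee:,.0f}")
```

Transfer-out charges are not included here, since the thread gives no figure for them.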
Having said that, what would it cost to:
* Have not only your primary but also 2 x secondary PS61000E's.
* 2 extra datacentres with connectivity to each other and to your primary site.
* Software to manage asynchronous streaming of data from your primary to 2 x secondaries.
* Software to take consistent backups of these images and store them at all three locations.
* Software to ensure that your secondary sites contain only encrypted data.
* Cold-spare hardware at a secondary site capable of running all your images.