<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Jason's .plan &#187; voxel</title>
	<atom:link href="http://blogs.digitar.com/jjww/tag/voxel/feed/" rel="self" type="application/rss+xml" />
	<link>http://blogs.digitar.com/jjww</link>
	<description>thoughts &#38; musings</description>
	<lastBuildDate>Thu, 18 Mar 2010 06:29:45 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Cloud-scale DBs in the cloud&#8230;just a quickie</title>
		<link>http://blogs.digitar.com/jjww/2010/03/cloud-scale-dbs-in-the-cloud-just-a-quickie/</link>
		<comments>http://blogs.digitar.com/jjww/2010/03/cloud-scale-dbs-in-the-cloud-just-a-quickie/#comments</comments>
		<pubDate>Thu, 18 Mar 2010 05:59:46 +0000</pubDate>
		<dc:creator>Jason</dc:creator>
				<category><![CDATA[DigiTar]]></category>
		<category><![CDATA[Software Development]]></category>
		<category><![CDATA[cloud]]></category>
		<category><![CDATA[rackspace]]></category>
		<category><![CDATA[softlayer]]></category>
		<category><![CDATA[voxel]]></category>

		<guid isPermaLink="false">http://blogs.digitar.com/jjww/?p=134</guid>
		<description><![CDATA[Just a quick set of thoughts&#8230;do cloud-scale DBs save money because they&#8217;re based on commodity/cheap servers? Tonight I did some rough back-of-the-pad calculations, and was kind of surprised&#8230;
Let&#8217;s assume we&#8217;ve got an 11TB working set of data. How could we store this redundantly?
(cloud servers in these examples are dedicated servers at a cloud provider)
Option 1: [...]]]></description>
			<content:encoded><![CDATA[<p>Just a quick set of thoughts&#8230;do cloud-scale DBs save money because they&#8217;re based on commodity/cheap servers? Tonight I did some rough back-of-the-pad calculations, and was kind of surprised&#8230;</p>
<p>Let&#8217;s assume we&#8217;ve got an 11TB working set of data. How could we store this redundantly?</p>
<p>(cloud servers in these examples are dedicated servers at a cloud provider)</p>
<p><strong>Option 1: Two beefy storage servers running MySQL in a master/slave config</strong></p>
<ul>
<li>CPU: 4 cores from your favorite CPU vendor</li>
<li>RAM: 16GB</li>
<li>HDDs: 48x 250 GB SATA
<ul>
<li>Lose 2 for mirrored boot, and 2 for RAID-6 parity</li>
</ul>
</li>
<li>Cost:
<ul>
<li>Buy Your Own Hardware (Sun X4500): $50,000 for the pair</li>
<li>Host It in the Cloud (SoftLayer): $4,700/month for the pair</li>
</ul>
</li>
</ul>
<p><strong>Option 2: 28 commodity servers (2 replica copies for each piece of data) running HBase or Cassandra</strong></p>
<ul>
<li>CPU: 4 cores from your favorite CPU vendor</li>
<li>RAM: 4GB</li>
<li>HDDs: 4x 250 GB SATA
<ul>
<li>Lose 1 for RAID-5 parity (we&#8217;ll mingle boot data and data data on the same drive pool)</li>
</ul>
</li>
<li>Cost:
<ul>
<li>Buy Your Own Hardware (Dell R410): $43,300 for set of 28</li>
<li>Host It in the Cloud (SoftLayer): $12,000/month for the set of 28</li>
</ul>
</li>
</ul>
<p><strong>Option 3: 42 commodity servers (3 replica copies for each piece of data) running HBase or Cassandra</strong></p>
<ul>
<li>CPU: 4 cores from your favorite CPU vendor</li>
<li>RAM: 4GB</li>
<li>HDDs: 4x 250 GB SATA
<ul>
<li>Lose 1 for RAID-5 parity (we&#8217;ll mingle boot data and data data on the same drive pool)</li>
</ul>
</li>
<li>Cost:
<ul>
<li>Buy Your Own Hardware (Dell R410): $64,900 for set of 42</li>
<li>Host It in the Cloud (SoftLayer): $18,000/month for the set of 42</li>
</ul>
</li>
</ul>
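<p>The capacity arithmetic behind the three options can be sketched as follows. This is only a rough check: it assumes decimal units (11TB = 11,000GB), and the function names are made up for illustration.</p>

```python
import math

DRIVE_GB = 250          # per-drive size quoted in all three options
WORKING_SET_GB = 11000  # 11TB working set, decimal units (an assumption)

def usable_gb(total_drives, reserved_drives):
    """Usable capacity of one server after removing boot/parity drives."""
    return (total_drives - reserved_drives) * DRIVE_GB

def servers_for(working_set_gb, per_server_gb, copies):
    """Whole servers needed so that every copy holds the full working set."""
    return math.ceil(working_set_gb / per_server_gb) * copies

beefy_gb = usable_gb(48, 4)      # 2 drives for mirrored boot + 2 for RAID-6 parity
commodity_gb = usable_gb(4, 1)   # 1 drive for RAID-5 parity

print(beefy_gb)                                       # 11000
print(commodity_gb)                                   # 750
print(servers_for(WORKING_SET_GB, beefy_gb, 2))       # 2
print(servers_for(WORKING_SET_GB, commodity_gb, 2))   # 30
print(servers_for(WORKING_SET_GB, commodity_gb, 3))   # 45
print(14 * commodity_gb)                              # 10500
```

<p>By this arithmetic, the 28 and 42 servers quoted above work out to 14 servers per copy, which hold 10.5TB, a little under the 11TB target. Rounding up would give 15 per copy, so treat the server counts as approximate.</p>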
<p>Now, what surprised me here isn&#8217;t the raw cost differential between stuffing your own hardware in your colo and using a cloud provider. And I&#8217;m not picking on SoftLayer&#8230;Rackspace and Voxel show the same cost scaling as SoftLayer (and those two vendors actually come out worse).</p>
<p>What surprised me:</p>
<ul>
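<li>When you buy your own hardware, &#8220;cloud-scale&#8221; databases do cost you less (~$7K) than buying beefy storage servers and running MySQL for the same data set, at least with 2 replica copies (Option 2). With 3 copies (Option 3) the hardware comes out about $15K more.</li>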
<li><strong>However</strong>, when you are at a cloud provider, using cloud-scale databases on &#8220;cheap&#8221; hardware costs you roughly 2.5x (Option 2) to 3.8x (Option 3) as much as using beefy storage cloud servers running MySQL.</li>
</ul>
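<p>Here is a minimal sketch of that comparison, using only the prices listed in the options above. The dictionary keys and the function name are illustrative, not anything from a vendor.</p>

```python
# Figures copied from the three options above (US dollars).
OPTIONS = {
    "option1_mysql_pair": {"buy": 50000, "monthly": 4700},
    "option2_28_nodes": {"buy": 43300, "monthly": 12000},
    "option3_42_nodes": {"buy": 64900, "monthly": 18000},
}

def compare_to_baseline(options, baseline="option1_mysql_pair"):
    """Purchase-price delta and monthly-cost ratio relative to the baseline."""
    base = options[baseline]
    rows = {}
    for name, cost in options.items():
        rows[name] = {
            "buy_delta": cost["buy"] - base["buy"],
            "monthly_ratio": round(cost["monthly"] / base["monthly"], 2),
        }
    return rows

comparison = compare_to_baseline(OPTIONS)
for name, row in comparison.items():
    print(name, row["buy_delta"], row["monthly_ratio"])
# option1_mysql_pair 0 1.0
# option2_28_nodes -6700 2.55
# option3_42_nodes 14900 3.83
```

<p>So on purchased hardware only the 2-copy cluster undercuts the MySQL pair, while at the cloud provider both clusters cost roughly 2.5x to 3.8x as much per month.</p>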
<p>As I said, I&#8217;m not comparing the cost of running Option 1 on your own hardware vs. Option 1 at a cloud provider. Yes, those costs are higher at the cloud provider, but that&#8217;s to be expected (they&#8217;re bundling in bandwidth, colo, power, and most importantly people to manage the hardware and network).</p>
<p>What&#8217;s stunning is that beefy servers at a cloud provider are much more cost efficient. Beefy cloud servers cost you roughly 1/11 of the cost of the hardware every month ($4,700 vs. $50,000), whereas &#8220;cheap&#8221; commodity cloud servers cost you between 1/4 and 1/3 of the cost of the hardware every month ($12,000 vs. $43,300). That&#8217;s a much higher markup on the cheaper volume servers.</p>
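<p>One way to check those fractions is to ask how many months of hosting fees add up to the purchase price of the same hardware. A quick sketch, using the prices from the options above (the function name is just for illustration):</p>

```python
def months_to_match_purchase(buy_price, monthly_price):
    """Months of hosting fees that add up to the hardware purchase price."""
    return buy_price / monthly_price

beefy_months = months_to_match_purchase(50000, 4700)
nodes28_months = months_to_match_purchase(43300, 12000)
nodes42_months = months_to_match_purchase(64900, 18000)

print(round(beefy_months, 1), round(nodes28_months, 1), round(nodes42_months, 1))
# 10.6 3.6 3.6
```

<p>Worked directly from the listed prices, the monthly fee is roughly 1/11 of the hardware price for the beefy pair and roughly 1/3.6 for either commodity cluster, so the commodity markup is about three times steeper.</p>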
<p>Please comment and correct me if I&#8217;m wrong in my analysis&#8230;I would actually like to be.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.digitar.com/jjww/2010/03/cloud-scale-dbs-in-the-cloud-just-a-quickie/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
	</channel>
</rss>
