<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Jason's .plan &#187; DigiTar</title>
	<atom:link href="http://blogs.digitar.com/jjww/category/digitar/feed/" rel="self" type="application/rss+xml" />
	<link>http://blogs.digitar.com/jjww</link>
	<description>thoughts &#38; musings</description>
	<lastBuildDate>Thu, 18 Mar 2010 06:29:45 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Cloud-scale DBs in the cloud&#8230;just a quickie</title>
		<link>http://blogs.digitar.com/jjww/2010/03/cloud-scale-dbs-in-the-cloud-just-a-quickie/</link>
		<comments>http://blogs.digitar.com/jjww/2010/03/cloud-scale-dbs-in-the-cloud-just-a-quickie/#comments</comments>
		<pubDate>Thu, 18 Mar 2010 05:59:46 +0000</pubDate>
		<dc:creator>Jason</dc:creator>
				<category><![CDATA[DigiTar]]></category>
		<category><![CDATA[Software Development]]></category>
		<category><![CDATA[cloud]]></category>
		<category><![CDATA[rackspace]]></category>
		<category><![CDATA[softlayer]]></category>
		<category><![CDATA[voxel]]></category>

		<guid isPermaLink="false">http://blogs.digitar.com/jjww/?p=134</guid>
		<description><![CDATA[Just a quick set of thoughts&#8230;do cloud-scale DBs save money because they&#8217;re based on commodity/cheap servers? Tonight I did some rough back-of-the-pad calculations, and was kind of surprised&#8230;
Let&#8217;s assume we&#8217;ve got an 11TB working set of data, how could we store this redundantly?
(cloud servers in these examples are dedicated servers at a cloud provider)
Option 1: [...]]]></description>
			<content:encoded><![CDATA[<p>Just a quick set of thoughts&#8230;do cloud-scale DBs save money because they&#8217;re based on commodity/cheap servers? Tonight I did some rough back-of-the-pad calculations, and was kind of surprised&#8230;</p>
<p>Let&#8217;s assume we&#8217;ve got an 11TB working set of data. How could we store it redundantly?</p>
<p>(cloud servers in these examples are dedicated servers at a cloud provider)</p>
<p><strong>Option 1: Two beefy storage servers running MySQL in a master/slave config</strong></p>
<ul>
<li>CPU: 4 cores from your favorite CPU vendor</li>
<li>RAM: 16GB</li>
<li>HDDs: 48x 250 GB SATA
<ul>
<li>Lose 2 for mirrored boot, and 2 for RAID-6 parity</li>
</ul>
</li>
<li>Cost:
<ul>
<li>Buy Your Own Hardware (Sun X4500): $50,000 for the pair</li>
<li>Host It in the Cloud (SoftLayer): $4,700/month for the pair</li>
</ul>
</li>
</ul>
<p><strong>Option 2: 28 commodity servers (2 replica copies for each piece of data) running HBase or Cassandra</strong></p>
<ul>
<li>CPU: 4 cores from your favorite CPU vendor</li>
<li>RAM: 4GB</li>
<li>HDDs: 4x 250 GB SATA
<ul>
<li>Lose 1 for RAID-5 parity (we&#8217;ll mingle boot data and data data on the same drive pool)</li>
</ul>
</li>
<li>Cost:
<ul>
<li>Buy Your Own Hardware (Dell R410): $43,300 for set of 28</li>
<li>Host It in the Cloud (SoftLayer): $12,000/month for the set of 28</li>
</ul>
</li>
</ul>
<p><strong>Option 3: 42 commodity servers (3 replica copies for each piece of data) running HBase or Cassandra</strong></p>
<ul>
<li>CPU: 4 cores from your favorite CPU vendor</li>
<li>RAM: 4GB</li>
<li>HDDs: 4x 250 GB SATA
<ul>
<li>Lose 1 for RAID-5 parity (we&#8217;ll mingle boot data and data data on the same drive pool)</li>
</ul>
</li>
<li>Cost:
<ul>
<li>Buy Your Own Hardware (Dell R410): $64,900 for set of 42</li>
<li>Host It in the Cloud (SoftLayer): $18,000/month for the set of 42</li>
</ul>
</li>
</ul>
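<p>As a sanity check on the three options above, here is a rough sketch of the usable-capacity arithmetic. It assumes decimal units (1TB = 1,000GB), that every replica holds a full copy of the data, and that the only drives lost are the boot and parity drives listed above. The <strong>usable_gb</strong> helper is just an illustrative name, not part of any tool.</p>
<pre>
```shell
#!/bin/sh
# usable_gb SERVERS DATA_DISKS_PER_SERVER DISK_GB COPIES
# Raw data capacity across all servers, divided by the number of
# full copies kept of each piece of data.
usable_gb() {
    echo $(( $1 * $2 * $3 / $4 ))
}

# Option 1: 2 servers, 48 disks each minus 2 boot and 2 parity = 44 data disks.
# The master and the slave each hold a full copy.
usable_gb 2 44 250 2    # prints 11000

# Option 2: 28 servers, 4 disks each minus 1 parity = 3 data disks, 2 copies.
usable_gb 28 3 250 2    # prints 10500

# Option 3: 42 servers, 3 data disks each, 3 copies.
usable_gb 42 3 250 3    # prints 10500
```
</pre>
<p>So all three layouts come out at roughly 10.5TB to 11TB of unique data. The commodity options land slightly under 11TB, and a little lower still once the boot data sharing the pool is counted.</p>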
<p>What surprised me here isn&#8217;t the raw cost differential between stuffing your own hardware in your colo and using a cloud provider. And I&#8217;m not picking on SoftLayer&#8230;Rackspace and Voxel work out to the same kind of cost scaling as SoftLayer (and in both of those cases, worse).</p>
<p>What surprised me:</p>
<ul>
<li>When you buy your own hardware, &#8220;cloud-scale&#8221; databases can cost you less than buying beefy storage servers and running MySQL for the same data set (~$7K less with two replicas, though the three-replica set costs about $15K more).</li>
<li><strong>However</strong>, when you are at a cloud provider, using cloud-scale databases on &#8220;cheap&#8221; hardware costs you roughly 3x more than using beefy storage cloud servers running MySQL (about 2.6x with two replicas, 3.8x with three).</li>
</ul>
<p>As I said, I&#8217;m not comparing the cost of running Option 1 on your own hardware vs. Option 1 at a cloud provider. Yes, those costs are higher at the cloud provider, but that&#8217;s to be expected (they&#8217;re bundling in bandwidth, colo, power, and most importantly, people to manage the hardware and network).</p>
<p>What&#8217;s stunning is that beefy servers at a cloud provider are much more cost efficient. Going by the prices above, beefy cloud servers cost you roughly 1/11 of the cost of the hardware every month, whereas &#8220;cheap&#8221; commodity cloud servers cost you between 1/4 and 1/3 of the cost of the hardware every month. That&#8217;s a much higher markup on the cheaper volume servers.</p>
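<p>For anyone who wants to check the ratios, here is a quick sketch that recomputes them from the prices listed in the three options. The <strong>ratio</strong> helper is an illustrative name, and the inputs are only the rough figures quoted in this post.</p>
<pre>
```shell
#!/bin/sh
# ratio A B: print A divided by B, rounded to one decimal place.
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

# Months of cloud rental that add up to the hardware purchase price.
ratio 50000 4700     # Option 1: prints 10.6
ratio 43300 12000    # Option 2: prints 3.6
ratio 64900 18000    # Option 3: prints 3.6

# Monthly cloud cost relative to Option 1.
ratio 12000 4700     # Option 2: prints 2.6
ratio 18000 4700     # Option 3: prints 3.8
```
</pre>
<p>In other words, the beefy pair rents for about 1/11 of its purchase price per month, while either commodity set rents for a bit under 1/3 of its purchase price per month.</p>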
<p>Please comment and correct me if I&#8217;m wrong in my analysis&#8230;I would actually like to be.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.digitar.com/jjww/2010/03/cloud-scale-dbs-in-the-cloud-just-a-quickie/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Viva la storage.</title>
		<link>http://blogs.digitar.com/jjww/2008/11/viva-la-storage/</link>
		<comments>http://blogs.digitar.com/jjww/2008/11/viva-la-storage/#comments</comments>
		<pubDate>Mon, 10 Nov 2008 12:56:22 +0000</pubDate>
		<dc:creator>Jason</dc:creator>
				<category><![CDATA[DigiTar]]></category>
		<category><![CDATA[Solaris]]></category>
		<category><![CDATA[Technology]]></category>

		<guid isPermaLink="false"></guid>
		<description><![CDATA[Coming soon&#8230;  

[Posted with iBlogger from my iPhone]

]]></description>
			<content:encoded><![CDATA[<p>Coming soon&#8230; <img src='http://blogs.digitar.com/jjww/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' /> </p>
<div class="iblogger-footer"><br clear="all"/>
<p style="text-align:right;font-size:10px;">[Posted with <a href="http://illuminex.com/iBlogger/index.html">iBlogger</a> from my iPhone]</p>
</div>
]]></content:encoded>
			<wfw:commentRss>http://blogs.digitar.com/jjww/2008/11/viva-la-storage/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Remember the Alamo&#8230;</title>
		<link>http://blogs.digitar.com/jjww/2008/05/remember-the-alamo/</link>
		<comments>http://blogs.digitar.com/jjww/2008/05/remember-the-alamo/#comments</comments>
		<pubDate>Thu, 29 May 2008 03:54:16 +0000</pubDate>
		<dc:creator>Jason</dc:creator>
				<category><![CDATA[DigiTar]]></category>
		<category><![CDATA[Solaris]]></category>
		<category><![CDATA[Technology]]></category>

		<guid isPermaLink="false"></guid>
		<description><![CDATA[Tomorrow (05/28/2008) I&#39;m giving a talk on moving to open storage (i.e. ethernet, OpenSolaris and SATA&#8230;in no particular order) at the Diocesan Information Systems Conference in San Antonio. It&#39;s a closed event, but here are the slides from the talk&#8230;including the talking notes which cover a lot more than I&#39;ll probably have time for:
PDF
Slideshare
DigiTar, DISC, [...]]]></description>
			<content:encoded><![CDATA[<p>Tomorrow (05/28/2008) I&#39;m giving a talk on moving to open storage (i.e. ethernet, OpenSolaris and SATA&#8230;in no particular order) at the <a href="http://www.discinfo.org/index.html">Diocesan Information Systems Conference</a> in San Antonio. It&#39;s a closed event, but here are the slides from the talk&#8230;including the talking notes which cover a lot more than I&#39;ll probably have time for:</p>
<p><a href="http://blogs.digitar.com/media/2/Ditching%20Fibre%20Channel%20and%20SCSI%20-%20Slides%20&amp;%20Notes.pdf" title="PDF">PDF</a><br />
<a href="http://tinyurl.com/69ewhm" title="Slideshare">Slideshare</a></p>
<div class="posttagsblock"><a href="http://technorati.com/tag/DigiTar" rel="tag">DigiTar</a>, <a href="http://technorati.com/tag/DISC" rel="tag">DISC</a>, <a href="http://technorati.com/tag/OpenSolaris" rel="tag">OpenSolaris</a>, <a href="http://technorati.com/tag/Solaris" rel="tag">Solaris</a></div>
]]></content:encoded>
			<wfw:commentRss>http://blogs.digitar.com/jjww/2008/05/remember-the-alamo/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Democratizing Storage</title>
		<link>http://blogs.digitar.com/jjww/2008/04/democratizing-storage/</link>
		<comments>http://blogs.digitar.com/jjww/2008/04/democratizing-storage/#comments</comments>
		<pubDate>Mon, 21 Apr 2008 20:46:38 +0000</pubDate>
		<dc:creator>Jason</dc:creator>
				<category><![CDATA[DigiTar]]></category>
		<category><![CDATA[Solaris]]></category>

		<guid isPermaLink="false"></guid>
		<description><![CDATA[
As a company that was heavily populated with Linux zealots, it&#8217;s been surreal for us to watch OpenSolaris develop for the past 3 years. While technologies like DTrace and FMA are features we now&#160;use everyday, it was storage that brought Solaris into our environment and continues to drive it deeper into our services stack. Which [...]]]></description>
			<content:encoded><![CDATA[<p><font color="#0000ff"><a href="http://www.opensolaris.org/"><img border="0" alt="Opensolaris_logo_trans" align="right" src="http://blogs.digitar.com/media/2/20080421-opensolaris_logo_trans.png" / / /></a></font></p>
<p>As a company that was heavily populated with Linux zealots, it&rsquo;s been surreal for us to watch OpenSolaris develop for the past 3 years. While technologies like <a href="http://en.wikipedia.org/wiki/DTrace">DTrace</a> and FMA are features we now&nbsp;use every day, it was storage that brought Solaris into our environment and continues to drive it deeper into our services stack. Which raises the question: Why? Isn&rsquo;t DTrace just as cool as <a href="http://en.wikipedia.org/wiki/ZFS">ZFS</a>? Haven&rsquo;t Solaris <a href="http://en.wikipedia.org/wiki/Solaris_Containers">Containers</a> dramatically changed the way we provision and utilize systems? Sure&hellip;but storage is what drives our business and it doesn&rsquo;t seem to me that we&rsquo;re alone.</p>
<p>Everything DigiTar does manipulates or massages messaging in some way. When most people think of what drives our storage requirements they think of quarantining or archiving e-mail. But when you&rsquo;re dealing with messages that can make or break folks&rsquo; businesses, logging the metadata is&nbsp;perhaps the most important thing we do. </p>
<p>Metadata&nbsp;is&nbsp;flooding in every second. It&rsquo;s&nbsp;at the center of&nbsp;everything from proving a message was delivered to ensuring we meet end-to-end processing times and SLAs. If we didn&rsquo;t quarantine&nbsp;any more messages, we&rsquo;d still generate gigabytes&nbsp;of data every day that can&rsquo;t be lost. Without reliable and scalable storage we wouldn&rsquo;t exist.</p>
<p><strong><u>Lost IOPs, Corruption and Linux&hellip;oh my!</u></strong></p>
<p>What got us using OpenSolaris was Linux&rsquo;s (circa 2005) unreliable SCSI and storage subsystems. I/Os erroring out on our SAN array would be silently ignored (not retried) by Linux, creating quiet corruption that would require&nbsp;fail-over events. It didn&rsquo;t affect our customers, but we were going nuts managing it. When we moved to OpenSolaris, we could finally trust that no errors in the logs&nbsp;literally meant no errors. In a lot of ways, Solaris benefits from 15 years of making mistakes in enterprise environments. Solaris anticipates and safely handles all of the crazy edge cases we&rsquo;ve encountered with faulty equipment and software that&rsquo;s gone haywire. </p>
<p>When it comes to storing data, you&rsquo;ll pry OpenSolaris&nbsp;(and&nbsp;ZFS)&nbsp;out of our cold dead hands. We won&rsquo;t deploy databases on anything else.</p>
<p><strong><u>Liberation Day</u></strong></p>
<p>While we moved to Solaris to get our derri<span class="hw">&egrave;</span>res out of a sling, being on OpenSolaris has dramatically changed the way we use and design storage.</p>
<p>When you&rsquo;ve got rock-solid iSCSI, NFS, and&nbsp;I/O multipathing implementations,&nbsp;as well as&nbsp;a file system (ZFS) that loves cheap disks&hellip;and none of it requires licensing&hellip;you can suddenly do anything.&nbsp;Need to handle 3600 non-cached IOPs for under&nbsp;$60K? No problem. Have an existing array but&nbsp;can&rsquo;t justify $10K for snapshotting? No problem. How &rsquo;bout serving line-rate iSCSI with commodity storage and CPUs? No problemo.</p>
<p>That&rsquo;s the really amazing thing about OpenSolaris as a storage platform. It has all of the features of an expensive array and because it allows&nbsp;you&nbsp;to build reliable storage out of commodity components, you can build the storage architecture you need instead of being&nbsp;held hostage by the one you can afford. But features like ZFS don&rsquo;t mandate that you change your architecture.&nbsp;You can pick and choose the pieces that fit your needs and make any existing architecture better too.</p>
<p>So how has OpenSolaris changed the way DigiTar does storage? For one thing, it&rsquo;s enabled us to move almost entirely off of our fibre-channel SAN. We get better performance for less money by putting our database servers directly on Thumpers (<a href="http://www.sun.com/servers/x64/x4500/">Sun Fire X4500</a>) and letting ZFS do its magic. Also, because it&rsquo;s ZFS, we&rsquo;re assured that every block can be verified for correctness via checksumming. By doing application-level fail-over between Thumpers, we get shared-nothing redundancy that has increased our uptime dramatically. </p>
<p>One of the things that always has bugged me about traditional clustering is its reliance on shared storage. That&rsquo;s great if the application didn&rsquo;t trash its data while crashing to&nbsp;the ground. But what if it did? To replicate the level of redundancy we get with two X4500s, we&rsquo;d have to install two completely separate storage arrays&hellip;not to mention also buy two very large beefy servers to run the databases. By using X4500s, we get the same reliability and redundancy for about 85% less cost. That kind of savings means we can deploy 6.8x more storage for the same price footprint and do all sorts of cool things like:</p>
<ul>
<li>Create multiple data&nbsp;warehouses for data mining spam and malware trends.</li>
<li>Develop and deploy new service&nbsp;features whenever we want without considering storage costs.</li>
<li>Be cost competitive with competitors 10x our size.</li>
</ul>
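<p>The jump from &ldquo;85% less cost&rdquo; to &ldquo;6.8x more storage&rdquo; is just the reciprocal of what you still pay. Here is a small sketch of that arithmetic. The <strong>capacity_multiple</strong> name is illustrative, and the 85.3% input is only my assumption about the unrounded savings behind the 6.8x figure.</p>
<pre>
```shell
#!/bin/sh
# capacity_multiple SAVINGS_PERCENT
# If the same storage costs SAVINGS_PERCENT less, the same budget buys
# 100 / (100 - SAVINGS_PERCENT) times as much of it.
capacity_multiple() {
    awk -v s="$1" 'BEGIN { printf "%.1f\n", 100 / (100 - s) }'
}

capacity_multiple 85      # prints 6.7
capacity_multiple 85.3    # prints 6.8 (assumed unrounded savings)
```
</pre>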
<p>Whether you&rsquo;re storing pictures of your kids, or archiving business critical e-mail (or anything in between), it seems to me that being able to store massive amounts of data reliably is as fundamental to computing today as breathing is to living. OpenSolaris allows us as a company to stop worrying about what it&rsquo;s going to cost to store the results of our services, and focus on what&rsquo;s important: developing the services and features themselves. When you stop focusing on the cost of &ldquo;air&rdquo;, you&rsquo;re liberated to actually make life incredible.</p>
<p>I could continue blathering about&nbsp;how free snapshotting (both in terms of cost and performance hit) can allow you to re-organize your backup priorities, or a bunch of other very cool benefits of using OpenSolaris as your storage platform. But you should <a href="http://opensolaris.org/os/downloads/">give it a shot yourself</a>, because OpenSolaris&rsquo; benefits are as varied and unique as your environment. Once you give it a try, I think you&rsquo;ll be hard pressed to go back to&nbsp;vendor lock-in&hellip;but I&rsquo;m probably a bit biased now. <img src="http://blogs.digitar.com/media/2/20080421-smile3.gif" />&nbsp;I think you&rsquo;ll also find a community around OpenSolaris that is by far the friendliest and most mature open source group of folks you&rsquo;ve ever dealt with.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.digitar.com/jjww/2008/04/democratizing-storage/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Back in the sandbox&#8230;ZFS flushing shenanigans revisited.</title>
		<link>http://blogs.digitar.com/jjww/2007/10/back-in-the-sandbox-zfs-flushing-shenanigans-revisted/</link>
		<comments>http://blogs.digitar.com/jjww/2007/10/back-in-the-sandbox-zfs-flushing-shenanigans-revisted/#comments</comments>
		<pubDate>Wed, 31 Oct 2007 18:55:56 +0000</pubDate>
		<dc:creator>Jason</dc:creator>
				<category><![CDATA[DigiTar]]></category>
		<category><![CDATA[Solaris]]></category>

		<guid isPermaLink="false"></guid>
		<description><![CDATA[Nearly a year has passed since our descent into the 9th ring of latency Hades, and I wanted to make an update post on ZFS&#39; interaction with SAN arrays containing battery-backed cache. (For the full details, please check out this older post.)
For one thing, the instructions I previously gave to&#160;ignore cache flushes on the STK [...]]]></description>
			<content:encoded><![CDATA[<p>Nearly a year has passed since our descent into the 9th ring of latency Hades, and I wanted to make an update post on ZFS&#39; interaction with SAN arrays containing battery-backed cache. (For the full details, please <a href="http://blogs.digitar.com/jjww/?itemid=44">check out this older post</a>.)</p>
<p>For one thing, the instructions I previously gave to&nbsp;ignore cache flushes on the STK FLX200/300 series (and similar LSI OEM&#39;d products) don&#39;t seem to work very well on the new generation&nbsp;Sun StorageTek 6x00 arrays. Not to mention it&#39;s kind of nasty to have to modify your array&#39;s NVRAM settings to get good write latency.</p>
<p>But thanks to the brilliant engineers on the ZFS team, you no longer have to modify your array (since circa May &#39;07&nbsp;in the&nbsp;OpenSolaris tree). Simply add this line to your Solaris&nbsp;<strong>/etc/system</strong> file and ZFS will no longer issue SYNCHRONIZE CACHE commands to your array:</p>
<p><span><strong>set zfs:zfs_nocacheflush=1</strong></span>&nbsp;</p>
<p>I can confirm that this works REALLY well on both the older (FLX200/300) and newer (6140/6540) Sun/Engenio arrays! It seems to me that since the new way is a ZFS configuration directive, it should be portable/functional against any array in existence. <strong>Please note that setting this directive will disable cache flushing for ALL zpools on the system, which would be dangerous for any zpools using local disks.</strong> As always, <strong><a href="http://en.wikipedia.org/wiki/Caveat_Emptor">caveat emptor</a></strong>. Your mileage may vary so please do let others know through the comments what works/doesn&#39;t work for you.</p>
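<p>If you want to confirm the directive really is active in your file (and not sitting behind a comment marker), a small check along these lines may help. It is only a sketch: the <strong>nocacheflush_set</strong> function name is made up for illustration, and it relies on <strong>/etc/system</strong> treating lines that begin with <strong>*</strong> as comments.</p>
<pre>
```shell
#!/bin/sh
# nocacheflush_set FILE
# Succeeds (exit status 0) if FILE contains an active line setting
# zfs:zfs_nocacheflush to 1. Commented-out lines start with "*", so
# anchoring the match to the start of the line skips them.
nocacheflush_set() {
    grep -q '^[[:space:]]*set[[:space:]][[:space:]]*zfs:zfs_nocacheflush[[:space:]]*=[[:space:]]*1[[:space:]]*$' "$1"
}

# Typical use against the live file:
#   if nocacheflush_set /etc/system; then echo "cache flushes disabled"; fi
```
</pre>
<p>Keep in mind that <strong>/etc/system</strong> is read at boot, so a freshly added line will not take effect until the next reboot.</p>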
<p>P.S.<br /> We&#39;ve tested the zfs:zfs_nocacheflush directive successfully in Build 72 of OpenSolaris. It should also work in Solaris 10 Update 4, though we haven&#39;t tested that ourselves.</p>
<!-- technorati tags start -->
<p style="text-align:right;font-size:10px;">Technorati Tags: <a href="http://www.technorati.com/tag/Solaris" rel="tag">Solaris</a>, <a href="http://www.technorati.com/tag/Sun" rel="tag">Sun</a>, <a href="http://www.technorati.com/tag/ZFS" rel="tag">ZFS</a></p>
<p><!-- technorati tags end --></p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.digitar.com/jjww/2007/10/back-in-the-sandbox-zfs-flushing-shenanigans-revisted/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Nagios Remote Plug-In Executor (NRPE) under SMF</title>
		<link>http://blogs.digitar.com/jjww/2007/02/nagios-remote-plug-in-executor-nrpe-under-smf/</link>
		<comments>http://blogs.digitar.com/jjww/2007/02/nagios-remote-plug-in-executor-nrpe-under-smf/#comments</comments>
		<pubDate>Fri, 23 Feb 2007 02:00:29 +0000</pubDate>
		<dc:creator>Jason</dc:creator>
				<category><![CDATA[DigiTar]]></category>
		<category><![CDATA[Solaris]]></category>

		<guid isPermaLink="false"></guid>
		<description><![CDATA[NRPE (Nagios Remote Plug-In Executor) is a critical part of a lot of IT environments. In ours it provides to Nagios all sorts of interesting health info local to the host that NRPE is running on. Whether it&#39;s RAM, open connections, hard drive space or something else, NRPE helps alert you to strange happenings that [...]]]></description>
			<content:encoded><![CDATA[<p>NRPE (Nagios Remote Plug-In Executor) is a critical part of a lot of IT environments. In ours it provides Nagios with all sorts of interesting health info local to the host that NRPE is running on. Whether it&#39;s RAM, open connections, hard drive space or something else, NRPE helps alert you to strange happenings that simply interrogating a TCP port remotely won&#39;t reveal. Hence, it&#39;s a deal breaker for moving to OpenSolaris if you can&#39;t have it. Luckily, the benevolent gents at <a href="http://www.blastwave.org/">Blastwave</a> provide a pre-packaged NRPE that&#39;s ready to go (Run: <strong>pkg-get -i nrpe</strong>). Unfortunately, the Blastwave NRPE package leaves the last step of placing it under <strong>init.d</strong> or <strong><a href="http://www.opensolaris.org/os/community/smf/">SMF</a></strong> control as an exercise for the admin. Well, if you&#39;re like me and would like SMF to be able to manage NRPE, then you&#39;re in luck. Below are a manifest and installation instructions that will start, stop and refresh an NRPE daemon (as installed from the Blastwave package).</p>
<p>It&#39;s important to note that this NRPE manifest will expect your NRPE configuration to be in <strong>/opt/csw/etc/nrpe.cfg</strong> and that it will contain the line: <strong>pid_file=/var/run/nrpe.pid</strong>. If your config file is in a different location, just edit <strong>method/nagios-nrpe</strong> in the manifest package to match where your <strong>nrpe.cfg</strong> lives. If for some reason you don&#39;t want to specify <strong>pid_file</strong> in your <strong>nrpe.cfg</strong>, then the <strong>refresh</strong> method will not operate properly. The <strong>start</strong> and <strong>stop</strong> methods will operate whether you specify a <strong>pid_file</strong> value or not. Technically, just restarting the NRPE daemon will accomplish the same thing as the <strong>refresh</strong> method, which just sends a <strong>SIGHUP</strong> to the NRPE daemon. Again, <strong>caveat emptor</strong>. <strong>This manifest and the installation instructions below are provided with absolutely no warranty whatsoever as specified in the BSD license in the manifest header.</strong></p>
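<p>To make the <strong>pid_file</strong> dependency concrete, here is a rough sketch of what a refresh boils down to. This is <em>not</em> the method script shipped in the package; the function names <strong>pid_file_from_cfg</strong> and <strong>refresh_nrpe</strong> are made up for illustration, and it assumes the <strong>pid_file=</strong> line starts at the beginning of a line in <strong>nrpe.cfg</strong>.</p>
<pre>
```shell
#!/bin/sh
# pid_file_from_cfg CFG
# Print the value of the first pid_file= line in an nrpe.cfg-style file.
pid_file_from_cfg() {
    sed -n 's/^pid_file=//p' "$1" | head -n 1
}

# refresh_nrpe CFG
# Send SIGHUP to the PID recorded in the configured pid_file.
# Returns non-zero if no pid_file is configured or it cannot be read,
# which is why refresh cannot work without that line in nrpe.cfg.
refresh_nrpe() {
    pid_file=$(pid_file_from_cfg "$1")
    if [ -z "$pid_file" ] || [ ! -r "$pid_file" ]; then
        return 1
    fi
    kill -HUP "$(cat "$pid_file")"
}

# Typical use:
#   refresh_nrpe /opt/csw/etc/nrpe.cfg
```
</pre>
<p>Start and stop don&#39;t need this lookup, which matches the behavior described above: only the refresh path has to find the running daemon&#39;s PID.</p>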
<p>To install the manifest please follow these steps:</p>
<ol>
<li>
Download the NRPE manifest package <a href="http://blogs.digitar.com/media/2/nrpe_smf.zip">here</a>.</li>
<li>
Unpack the package on your system.</li>
<li>
Change to the root of the unpacked package.</li>
<li>
Run: <strong>cp ./manifest/nagios-nrpe.xml /var/svc/manifest/network/</strong>
</li>
<li>
Run: <strong>cp ./method/nagios-nrpe /lib/svc/method/</strong>
</li>
<li>
Run: <strong>svccfg import /var/svc/manifest/network/nagios-nrpe.xml</strong>
</li>
<li>
You&#39;re done!</li>
</ol>
<p>If everything went smoothly, running <strong>svcadm enable nrpe</strong> should start the daemon without incident. Similarly, <strong>svcadm disable nrpe</strong> should kill it. As mentioned before, there&#39;s also <strong>svcadm refresh nrpe</strong>, which will send a <strong>SIGHUP</strong> to NRPE. That will cause NRPE to re-read its <strong>nrpe.cfg</strong> file. An interesting note on <strong>refresh</strong> is that NRPE will reliably crash on a second <strong>SIGHUP</strong>. If you were using standard <strong>init.d</strong>, this could really hose you, as NRPE would randomly terminate and you wouldn&#39;t know. With SMF however, it doesn&#39;t matter! If NRPE dies when you send it a <strong>SIGHUP</strong>, SMF will loyally restart the daemon for you. Another reason to use SMF with all of your critical services, where an automatic restart won&#39;t risk data corruption! Hope y&#39;all find this of use!</p>
<!-- technorati tags start -->
<p style="text-align:right;font-size:10px;">Technorati Tags: <a href="http://www.technorati.com/tag/SMF" rel="tag">SMF</a>, <a href="http://www.technorati.com/tag/Solaris" rel="tag">Solaris</a>, <a href="http://www.technorati.com/tag/Solaris Express" rel="tag">Solaris Express</a>, <a href="http://www.technorati.com/tag/Sun" rel="tag">Sun</a></p>
<p><!-- technorati tags end --></p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.digitar.com/jjww/2007/02/nagios-remote-plug-in-executor-nrpe-under-smf/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>OpenSolaris &amp; SMF adventures with PowerDNS</title>
		<link>http://blogs.digitar.com/jjww/2007/02/opensolaris-smf-adventures-with-powerdns/</link>
		<comments>http://blogs.digitar.com/jjww/2007/02/opensolaris-smf-adventures-with-powerdns/#comments</comments>
		<pubDate>Thu, 22 Feb 2007 04:46:39 +0000</pubDate>
		<dc:creator>Jason</dc:creator>
				<category><![CDATA[DigiTar]]></category>
		<category><![CDATA[Solaris]]></category>

		<guid isPermaLink="false"></guid>
		<description><![CDATA[One of the quiet parts that powers our logistics infrastructure is PowerDNS. It&#39;s a very powerful way to serve DNS records that you need the ability to update programmatically. Unfortunately, OpenSolaris (or Solaris 10 for that matter) isn&#39;t exactly considered kosher over in PowerDNS-land. Like a lot of OSS projects, PDNS hasn&#39;t kept up with [...]]]></description>
			<content:encoded><![CDATA[<p>One of the quiet parts that powers our logistics infrastructure is <a href="http://www.powerdns.com/">PowerDNS</a>. It&#39;s a very powerful way to serve DNS records that you need the ability to update programmatically. Unfortunately, OpenSolaris (or Solaris 10 for that matter) isn&#39;t exactly considered kosher over in PowerDNS-land. Like a lot of OSS projects, PDNS hasn&#39;t kept up with the times and treats OpenSolaris like a red-headed step-child. If you like red-headed step-children like we do, then you&#39;re in for about 8 hours of greasing, coaxing and pleading to get it compiled right. Well either that&#8230;or you can read on and get it up in about 30 minutes. <img src='http://blogs.digitar.com/jjww/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' />  As a side-bonus, you&#39;ll also have PDNS managed by the coolest way ever invented to replace init.d: <a href="http://www.opensolaris.org/os/community/smf/">SMF</a>. </p>
<h3>Installing PDNS on OpenSolaris/Solaris 10 x64&#8230;</h3>
<p>First thing you&#39;ll need to do is get <a href="http://www.blastwave.org/">Blastwave</a> installed on your Solaris box. You could try and build the unholy abomination that is <a href="http://www.boost.org/">Boost</a> on your own&#8230;but then you&#39;re a braver soul than I. As it&#39;s getting late, please excuse that the steps are brief and bulleted (feel free to <a href="mailto:williamsjj_@_digitar.com">harass</a> me if you have questions): </p>
<ol>
<li>
Make sure your path is set correctly. This path will do nicely: <span><strong>PATH=/usr/sbin:/usr/bin:/opt/csw/bin:/usr/sfw/bin:/usr/ccs/bin</strong></span>
</li>
<li>
You&#39;ll need all the dev tools that come with a standard Solaris 10/OpenSolaris install&#8230;make, gcc, g++, ld etc. (You don&#39;t need <a href="http://developers.sun.com/sunstudio/index.jsp">Studio 11</a> installed. In fact, PDNS will really <strong>NOT</strong> like Studio 11 so please use gcc 3.3 or later).</li>
<li>
Run: <strong>pkg-get -i mysql5client</strong>
</li>
<li>
Run: <strong>pkg-get -i mysql5devel</strong>
</li>
<li>
Run: <strong>pkg-get -i boost_rt</strong>
</li>
<li>
Run: <strong>pkg-get -i boost_devel</strong>
</li>
<li>
Run: <strong>ln -s /opt/csw/mysql5/lib/mysql /usr/lib/mysql</strong> (This will make pathological <strong>configure</strong> scripts work a lot more smoothly.)</li>
<li>
Run: <strong>crle -l /lib:/usr/local/lib:/opt/csw/lib:/usr/lib:/opt/csw/mysql5/lib</strong> (This will help your compiled PDNS binaries find all the libraries they need at runtime. Run <strong>crle</strong> by itself first to see if there are any additional paths on your system that need to be present on this list. <strong><u>Caveat emptor..you run this command at your own risk as it can really bork your system if you don&#39;t know what you&#39;re doing.</u></strong>)
</li>
<li>
Unpack the latest PDNS sources which you can get <a href="http://powerdns.com/en/downloads.aspx">here</a> (these instructions are known to work against 2.9.20).</li>
<li>
From within the PDNS source tree root run: <strong>ggrep -R "u_int8_t" *</strong>
</li>
<li>
Manually change all the <strong>u_int8_t</strong> references that grep finds to <strong>uint8_t</strong>. If you don&#39;t do this, good ol&#39; crotchety PDNS will not compile. (I know I should provide a patch. I&#39;ll try and do that in the next couple of days if possible.)</li>
<li>
From the PDNS source tree root run: <strong>./configure --localstatedir=/var/run --with-pic --enable-shared --with-mysql-includes=/opt/csw/mysql5/include/ CXXFLAGS="-I/opt/csw/include -DSOLARIS" LDFLAGS="-L/opt/csw/lib -lsocket -lnsl"</strong>
</li>
<li>
Run: <strong>make install</strong> (This will use the prefix <strong>/usr/local/</strong> to install everything. The SMF manifest later will expect your <strong>pdns.conf</strong> to be in <strong>/usr/local/etc/</strong> as a result. For sanity purposes on our systems, we also symlink <strong>pdns.conf</strong> into <strong>/etc</strong>.)</li>
<li>
Bingo! Presto! You have a working PDNS server&#8230;hopefully.</li>
</ol>
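<p>Steps 10 and 11 above (finding and renaming <strong>u_int8_t</strong>) can be scripted rather than done by hand. The following is only a sketch, not the patch promised above: the <strong>fix_u_int8</strong> function name is made up, it does a plain substring replacement (so it would also touch any longer identifier that happens to contain <strong>u_int8_t</strong>), and it rewrites each file through a temporary copy rather than relying on an in-place <strong>sed</strong> option that not every <strong>sed</strong> supports. Try it on a fresh copy of the source tree first.</p>
<pre>
```shell
#!/bin/sh
# fix_u_int8 DIR
# Rewrite every u_int8_t to uint8_t in regular files under DIR that
# mention it. The edited text is written to FILE.tmp and then copied
# back over FILE, so the original file keeps its permissions.
fix_u_int8() {
    find "$1" -type f -exec grep -l 'u_int8_t' {} + |
    while IFS= read -r f; do
        if sed 's/u_int8_t/uint8_t/g' "$f" > "$f.tmp"; then
            cat "$f.tmp" > "$f"
        fi
        rm -f "$f.tmp"
    done
}

# Run from the PDNS source tree root:
#   fix_u_int8 .
```
</pre>
<p>Re-running the grep from step 10 afterwards should come back empty if everything was converted.</p>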
<h3>Life support for PDNS&#8230;that is, running PDNS under SMF&#8230;</h3>
<p>Service Management Facility (SMF) is a truly wonderful thing. It completely replaces <strong>init.d</strong> and <strong>inetd</strong>, gives you a standard way of managing both types of services, understands dependencies, restarts dead services&#8230;and washes your car while you sleep. <img src='http://blogs.digitar.com/jjww/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' />  The only hiccough is you&#39;ve got to write a manifest to run PDNS under SMF&#8230;or use the one below. <img src='http://blogs.digitar.com/jjww/wp-includes/images/smilies/icon_biggrin.gif' alt=':-D' class='wp-smiley' />  <strong>Again&#8230;caveat emptor&#8230;this SMF manifest comes with absolutely no warranty at all.</strong> Read the BSD license header at the top of the manifest for a complete description of how much it&#39;s your own darn fault if this manifest totals your system. The DigiTar SMF manifest for PDNS has a couple of neat integration features: </p>
<ul>
<li>
If PDNS is already started when you run <strong>svcadm enable powerdns</strong>, the start method will error out so that SMF places the PDNS service into a maintenance state and leaves an informative message in the PDNS SMF service log.</li>
<li>
If you accidentally delete the <strong>pdns_server</strong> binary, SMF will not let you start the service and will place it into a maintenance state so you know something is wrong.</li>
<li>
Running <strong>svcadm refresh powerdns</strong> will instruct PDNS to scan for new domains that have been added (<strong>pdns_control rediscover</strong>), as well as rescan for changes to records in existing domains (<strong>pdns_control reload</strong>).</li>
</ul>
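<p>For the curious, the refresh behavior boils down to two <strong>pdns_control</strong> calls run back to back. The Python sketch below only illustrates that sequence; it is not the method script shipped in the package. The function name <strong>refresh_pdns</strong> and its <strong>run</strong> parameter are made up for this example, and it assumes <strong>pdns_control</strong> is on your PATH.</p>
<pre><code class="language-python">
import subprocess


def refresh_pdns(run=subprocess.run):
    """Mimic what 'svcadm refresh powerdns' asks PDNS to do.

    'run' is a hypothetical, injectable parameter so the sequence can be
    exercised without a live PDNS server.
    """
    commands = [
        ["pdns_control", "rediscover"],  # pick up newly added domains
        ["pdns_control", "reload"],      # rescan records in existing domains
    ]
    for command in commands:
        # check=True raises CalledProcessError if pdns_control fails
        run(command, check=True)
    return commands
</code></pre>
<p>Passing a stand-in for <strong>run</strong> lets you check the order of the calls on a box that has no PDNS installed.</p>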
<p>OK, enough jabbering. Here&#39;s how you install the SMF manifest: </p>
<ol>
<li>
Download the DigiTar PowerDNS SMF package <a href="http://blogs.digitar.com/media/2/powerdns_smf.zip">here</a>.</li>
<li>
Unpack the package on your system.</li>
<li>
Change to the root of the unpacked package.</li>
<li>
Run: <strong>cp ./manifest/dns-powerdns.xml /var/svc/manifest/site/</strong>
</li>
<li>
Run: <strong>cp ./method/dns-powerdns /lib/svc/method/</strong>
</li>
<li>
Run: <strong>svccfg import /var/svc/manifest/site/dns-powerdns.xml</strong>
</li>
<li>
You&#39;re done!</li>
</ol>
<p>You should now be able to start your PDNS server with a simple <strong>svcadm enable powerdns</strong>. Stopping PDNS is similarly simple: <strong>svcadm disable powerdns</strong>. If you just want to see the state of the PDNS service, try <strong>svcs powerdns</strong>. That&#39;s it! You can sleep well at night knowing if PDNS goes the way of all flesh, SMF will auto-restart it for you. Try a <strong>pkill pdns</strong> and watch the process IDs change. <img src='http://blogs.digitar.com/jjww/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' />  If your PDNS service won&#39;t start, take a look at <strong>svcs -x</strong> to see why. Anywho&#8230;off to the sand man for me. If you have any questions, please feel free to contact me: <a href="mailto:williamsjj_@_digitar.com">williamsjj_@_digitar.com</a> </p>
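<p>If you would rather have a script keep an eye on the service, something along these lines may do. It is a rough sketch, assuming the service answers to the short name <strong>powerdns</strong> as in the commands above; the helper names (<strong>parse_state</strong>, <strong>service_state</strong>, <strong>needs_attention</strong>) are invented for this example. It shells out to <strong>svcs -H -o state</strong>, which prints just the state column with no header.</p>
<pre><code class="language-python">
import subprocess


def parse_state(svcs_output):
    """Return the first non-empty line of 'svcs -H -o state' output."""
    for line in svcs_output.splitlines():
        if line.strip():
            return line.strip()
    return "unknown"


def service_state(fmri="powerdns"):
    """Ask SMF for the service state (for example online, disabled, maintenance)."""
    result = subprocess.run(
        ["svcs", "-H", "-o", "state", fmri],
        capture_output=True, text=True, check=True,
    )
    return parse_state(result.stdout)


def needs_attention(state):
    """SMF will not restart a service that sits in maintenance."""
    return state == "maintenance"
</code></pre>
<p>On a Solaris box, <strong>needs_attention(service_state())</strong> coming back true is your cue to go read <strong>svcs -x</strong>.</p>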
<p><!-- technorati tags start -->
<p style="text-align:right;font-size:10px;">Technorati Tags: <a href="http://www.technorati.com/tag/Solaris" rel="tag">Solaris</a>, <a href="http://www.technorati.com/tag/Solaris Express" rel="tag">Solaris Express</a>, <a href="http://www.technorati.com/tag/Sun" rel="tag">Sun</a>, <a href="http://www.technorati.com/tag/SMF" rel="tag">SMF</a></p>
<p><!-- technorati tags end --></p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.digitar.com/jjww/2007/02/opensolaris-smf-adventures-with-powerdns/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Shenanigans with ZFS flushing and intelligent arrays&#8230;</title>
		<link>http://blogs.digitar.com/jjww/2006/12/shenanigans-with-zfs-flushing-and-intelligent-arrays/</link>
		<comments>http://blogs.digitar.com/jjww/2006/12/shenanigans-with-zfs-flushing-and-intelligent-arrays/#comments</comments>
		<pubDate>Fri, 15 Dec 2006 02:08:09 +0000</pubDate>
		<dc:creator>Jason</dc:creator>
				<category><![CDATA[DigiTar]]></category>
		<category><![CDATA[Solaris]]></category>

		<guid isPermaLink="false"></guid>
		<description><![CDATA[NOTE: ZFS has been enhanced to better address the situation described below by using ZFS configuration directives. This article is still accurate and provides decent background on the problem. However, an update has been posted with the newer, stronger, better way of resolving the problem: Back in the sandbox&#8230;ZFS flushing shenanigans revisited.  

Running operations [...]]]></description>
			<content:encoded><![CDATA[<p><strong>NOTE: ZFS has been enhanced to better address the situation described below by using ZFS configuration directives. This article is still accurate and provides decent background on the problem. However, an update has been posted with the newer, stronger, better way of resolving the problem: <a href="http://blogs.digitar.com/jjww/2007/10/back-in-the-sandbox-zfs-flushing-shenanigans-revisted/">Back in the sandbox&#8230;ZFS flushing shenanigans revisited.</a> <img src='http://blogs.digitar.com/jjww/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </strong></p>
<p></p>
<p>Running operations for a start-up company is interesting&#8230;you learn a lot of things the hard way. Among the things you learn is how nuanced it is to deal with databases and storage under heavy traffic. Before I start my little diatribe, let me profusely thank Richard Elling and Roch Bourbonnais at Sun for saving our bacon. They are stellar engineers and we are more grateful than words can say for their help in resolving our ZFS roadblocks. They&#8217;re another example of the people who make Sun the company we love to work with. Sun is blessed to have them.</p>
<p></p>
<p>About 6 months ago we moved to a new Sun StorageTek FC array, and used the opportunity to move to ZFS. We loved ZFS in development and frankly it kicks the pants off of UFS/SVM (Solaris Volume Manager). It is SO much easier to deal with for volume management, and the block checksums help you quickly eliminate your storage when tracking down corruption. That being said, ZFS really has some interesting quirks. One of them is that it is truly designed to deal with dumb-as-a-rock storage. If you have a box of SATA<br />
disks with firmware flakier than Paris Hilton on a coke binge, then ZFS has truly been designed for you.</p>
<p></p>
<p>As a result, ZFS doesn&#8217;t trust that anything it writes to the <a href="http://blogs.sun.com/realneel/entry/the_zfs_intent_log">ZFS Intent Log (ZIL)</a> made it to your storage, until it flushes the storage cache. After every write to the ZIL, ZFS executes an <strong>fsync()</strong> call to instruct the storage to flush its write cache to the disk. In fact, ZFS won&#8217;t return on a write operation until the ZIL write and flush have completed. If the devices making up your zpool are individual hard drives&#8230;particularly<br />
SATA ones&#8230;this is a great behavior. If the power goes kaput during a write, you don&#8217;t have the problem that the write made it to drive cache but never to the disk.</p>
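<p>To get a feel for what a write that waits on a flush costs, you can time one from user space. The sketch below is only an application-level analogue: ZFS does its own flushing inside the kernel, and the function name <strong>timed_sync_write</strong> and the scratch file are made up for the example. It simply measures how long a write plus <strong>fsync()</strong> takes on whatever storage sits under a temporary directory.</p>
<pre><code class="language-python">
import os
import tempfile
import time


def timed_sync_write(path, payload):
    """Append payload, force it to stable storage, and return seconds taken."""
    start = time.perf_counter()
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        os.write(fd, payload)
        # fsync does not return until the storage stack reports the data flushed
        os.fsync(fd)
    finally:
        os.close(fd)
    return time.perf_counter() - start


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        target = os.path.join(scratch, "sync-test.dat")
        samples = [timed_sync_write(target, b"x" * 4096) for _ in range(20)]
        print("average ms per synced write:", 1000 * sum(samples) / len(samples))
</code></pre>
<p>The number you get depends entirely on the device and any caches in the path, which is the whole point of the story that follows.</p>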
<p></p>
<p>The major problem with this strategy only occurs when you try to layer ZFS over an intelligent storage array with a decent battery-backed cache. Enter our issues with ZFS on StorageTek/Engenio arrays.</p>
<p></p>
<p>Most of these arrays have sizable 2GB or greater caches with 72-hour batteries. The cache gives a huge performance boost, particularly on writes. Since cache is so much faster than disk, the array can tell the writer really quickly, &#8220;I&#8217;ve got it from here, you can go back to what you were doing&#8221;. Essentially, as fast as the data goes into the cache, the array can release the writer. Unlike the drive-based caches, the array cache has a 72-hour battery attached to it. So, if the array loses power and dies, you don&#8217;t lose the writes in the cache. When the array boots back up, it flushes the writes in the cache to the disk. However, ZFS doesn&#8217;t know that it&#8217;s talking to an array, so it assumes that the cache isn&#8217;t trustworthy, and still issues an <strong>fsync()</strong> after every ZIL write. So every time a ZIL write occurs, the write goes into the array write cache, and then the array is immediately instructed to flush the cache contents to the disk. This means ZFS doesn&#8217;t get the benefit of a quick return from the array; instead it has to wait the amount of time it takes to flush the write cache to the slow disks. If the array is under heavy load and the disks are thrashing away, your write return time (latency) can be awful with ZFS. Even when the array is idle, your latency with flushing is typically higher than the latency under heavy load with no flushing. With our array honoring ZFS ZIL flushes, we saw idle latencies of 54ms, and heavy load latencies of 224ms. This crushed the MySQL database running on top of it.
The InnoDB tables are particularly sensitive to this, because they issue 3x more writes than MyISAM tables. Also, since InnoDB tables use disk-based transactions, you can get write-loads that are orders of magnitude greater than MyISAM. If the disk latency gets bad enough, InnoDB will <strong>completely</strong> lock up the MySQL process with deadlocks.</p>
<p></p>
<p>So where does this leave a hapless start-up? Fortunately, you don&#8217;t have to give up ZFS. You have two options to rid yourself of the bane of existence known as write cache flushing: <strong>*** Please check out the update to this article <a href="http://blogs.digitar.com/jjww/2007/10/back-in-the-sandbox-zfs-flushing-shenanigans-revisted/">here</a>! There&#8217;s a better way now! ***</strong></p>
<p></p>
<ul>
<li style="list-style: none"></li>
<li><strong>Disable the ZIL.</strong> The ZIL is the way ZFS maintains consistency until it can get the blocks written to their final place on the disk. That&#8217;s why the ZIL flushes the cache. If you don&#8217;t have the ZIL and a power outage occurs, your blocks may go poof in your server&#8217;s RAM&#8230;&#8217;cause they never made it to the disk Kemosabe.</li>
<li style="list-style: none"></li>
<li><strong>Tell your array to ignore ZFS&#8217; flush commands.</strong> This is pretty safe, and massively beneficial.</li>
<li style="list-style: none"></li>
</ul>
<p></p>
<p>The former option is really a no-go because it opens you up to losing data. The second option really works well and is darn safe. It ends up being safe because if ZFS is waiting for the write to complete, that means the write made it to the array, and if it&#8217;s in the array cache you&#8217;re golden. Whether famine or flood or a loose power cable come, your array will get that write to the disk eventually. So it&#8217;s OK to have the array lie to ZFS and release ZFS almost immediately after the ZIL flush command executes.
On our StorageTek FLX210 this took the idle latencies to 1ms and the heavy load latencies to 9ms. 9 bloody milliseconds! Our InnoDB problems disappeared like sand down a rat hole.</p>
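<p>To put those latencies in perspective, here is some back-of-the-pad arithmetic in Python. It uses only the four numbers quoted in this post and assumes the simplest possible model: a single writer that waits for each write to return before issuing the next one. The function name <strong>serial_writes_per_second</strong> is made up for the example, and real databases with several writer threads will behave differently.</p>
<pre><code class="language-python">
def serial_writes_per_second(latency_ms):
    """Upper bound for one writer that waits for each write to return."""
    return 1000.0 / latency_ms


# Latencies quoted in this post, in milliseconds
measured = {
    "idle": {"flush_honored": 54, "flush_ignored": 1},
    "heavy load": {"flush_honored": 224, "flush_ignored": 9},
}

for condition, ms in measured.items():
    before = serial_writes_per_second(ms["flush_honored"])
    after = serial_writes_per_second(ms["flush_ignored"])
    speedup = ms["flush_honored"] / ms["flush_ignored"]
    print(f"{condition}: {before:.1f} to {after:.1f} writes/s ({speedup:.1f}x)")
</code></pre>
<p>Under that model the ceiling goes from about 18.5 to 1000 serialized writes per second when idle (54x), and from about 4.5 to 111.1 under heavy load (roughly 24.9x).</p>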
<p></p>
<p>So how do you get your array to ignore SCSI flush commands from ZFS? That differs depending on the array, but I can tell you how to do it on an Engenio array. If you&#8217;ve got any of the following arrays, it&#8217;s made by Engenio and this <em>may</em> work for you:</p>
<p></p>
<ul>
<li style="list-style: none"></li>
<li>Sun StorageTek FlexLine 200/300 series</li>
<li style="list-style: none"></li>
<li>Sun StorEdge 6130</li>
<li style="list-style: none"></li>
<li>Sun StorageTek 6140/6540</li>
<li style="list-style: none"></li>
<li>IBM DS4x00</li>
<li style="list-style: none"></li>
<li>many SGI InfiniteStorage arrays (you&#8217;ll need to check to make sure your array is actually OEM&#8217;d from Engenio)</li>
<li style="list-style: none"></li>
<li>(if you have another Engenio OEM&#8217;d array manufacturer, just let me know and I&#8217;ll update the list.)</li>
<li style="list-style: none"></li>
</ul>
<p></p>
<p><strong><u>Before I give you the instructions, I must warn you that the following instructions come with no warranty whatsoever. These instructions are from me alone and have no blessing conferred by, warranty from, acceptability by, or connection with my employer DigiTar. Neither I nor my employer can be held responsible for the consequences resulting from the use of these instructions, and if you use them you absolve us both individually and collectively from any responsibility for the accuracy of these instructions<br />
or the consequences of using these instructions. These instructions are potentially dangerous and may cause massive data loss. <a href="http://en.wikipedia.org/wiki/Caveat_Emptor">Caveat Emptor</a>.</u></strong></p>
<p></p>
<p>Okay, tush-covering mumbo jumbo over. On a StorageTek FLX210 with SANtricity 9.15, the following command script will instruct the array to ignore flush commands issued by Solaris hosts:</p>
<p></p>
<p><font face="Tahoma" size="2"><strong>//Show Solaris ICS option</strong></font></p>
<p>
<b><font face="Tahoma" size="2">show controller[a] HostNVSRAMbyte[0x2, 0x21];</font></b></p>
<p>
<b><font face="Tahoma" size="2">show controller[b] HostNVSRAMbyte[0x2, 0x21];</font></b></p>
<p></p>
<p><b><font face="Tahoma" size="2">//Enable ICS</font></b></p>
<p>
<b><font face="Tahoma" size="2">set controller[a] HostNVSRAMbyte[0x2, 0x21]=0x01;</font></b></p>
<p>
<b><font face="Tahoma" size="2">set controller[b] HostNVSRAMbyte[0x2, 0x21]=0x01;</font></b></p>
<p></p>
<p><b><font face="Tahoma" size="2">// Make changes effective</font></b></p>
<p>
<b><font face="Tahoma" size="2">// Rebooting controllers</font></b></p>
<p>
<b><font face="Tahoma" size="2">show &#8220;Rebooting A controller.&#8221;;</font></b></p>
<p>
<b><font face="Tahoma" size="2">reset controller[a];</font></b></p>
<p></p>
<p><strong><font face="Tahoma" size="2">show &#8220;Rebooting B controller.&#8221;;</font></strong></p>
<p>
<b><font face="Tahoma" size="2">reset controller[b];</font></b></p>
<p></p>
<p>If you notice carefully, I said the script will cause the array to ignore flush commands from <strong>Solaris</strong> hosts. So <em>all</em> Solaris hosts attached to the array will have their flush commands ignored. You can&#8217;t turn this behavior on and off on a per-host basis. To run this script, cut and paste the script into the script editor of the &#8220;Enterprise Management Window&#8221; of the SANtricity management GUI. That&#8217;s it! A key note here is that you should definitely have your server shut down, or at minimum your ZFS zpool exported, before you run this. Otherwise, when your array reboots ZFS will kernel panic the server. In our experience, this will happen even if you only reboot one controller at a time, waiting for one controller to come back online before rebooting the other. For whatever reason, MPXIO, which normally works beautifully to keep a LUN available when losing a controller, fails miserably in this situation. It&#8217;s probably the array&#8217;s fault, but whatever the issue, that&#8217;s the reality. Plan for downtime when you do this.</p>
<p></p>
<p>In the words of the French, c&#8217;est tout&#8230;that&#8217;s all folks. This cleared up all of the ZFS latency problems we&#8217;ve been having. Hopefully, this experience will be helpful to other people. This behavior isn&#8217;t well documented outside of the <a href="http://opensolaris.org/os/community/zfs/discussions/">ZFS mailing lists</a>, which is why we&#8217;re documenting it here for the world to index and find. More importantly, public documentation on Engenio-based arrays is downright abysmal. If you search hard enough, you&#8217;ll<br />
find an IBM Red Book paper that tells you the array can ignore flush commands, but happy hunting if you actually want to know how to enable the behavior.</p>
<p></p>
<p>Just a quick note before closing&#8230;ZFS rocks. It&#8217;s that simple. So much arcane black magic disappears under the skirt of ZFS, but as always you can&#8217;t make it all go away. If anyone has instructions on how to configure non-Engenio arrays to ignore flush commands, please let me know. Stay tuned for a diatribe..er..discussion on the kernel panic behavior of ZFS. G&#8217;night y&#8217;all.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.digitar.com/jjww/2006/12/shenanigans-with-zfs-flushing-and-intelligent-arrays/feed/</wfw:commentRss>
		<slash:comments>18</slash:comments>
		</item>
		<item>
		<title>SunFish Chum&#8230;er&#8230;Odds and ends.</title>
		<link>http://blogs.digitar.com/jjww/2006/08/sunfish-chum-er-odds-and-ends/</link>
		<comments>http://blogs.digitar.com/jjww/2006/08/sunfish-chum-er-odds-and-ends/#comments</comments>
		<pubDate>Tue, 15 Aug 2006 23:44:19 +0000</pubDate>
		<dc:creator>Jason</dc:creator>
				<category><![CDATA[DigiTar]]></category>
		<category><![CDATA[Technology]]></category>

		<guid isPermaLink="false"></guid>
		<description><![CDATA[Currently, we&#39;re putting the N1400Vs into production and there were two odds and ends that came to mind that I wanted to mention:

No client/server settings per port! Hooray! The Alteons (even the 2424s) inherited from the Alteon AD4s and 184s the need to enable client and/or server processing per port. For those who are not [...]]]></description>
			<content:encoded><![CDATA[<p>Currently, we&#39;re putting the N1400Vs into production and there were two odds and ends that came to mind that I wanted to mention:</p>
<ol>
<li><strong>No client/server settings per port!</strong> Hooray! The Alteons (even the 2424s) inherited from the Alteon AD4s and 184s the need to enable client and/or server processing per port. For those who are not familiar, server load balancing basically can be reduced into two operations:</li>
<ul>
<li><strong>Client processing: </strong>When a packet comes in from a web browser to the web switch, its header has a TO field that&#39;s the IP address of the web switch, and a FROM field that&#39;s the IP address of the web browser. Once the web switch gets the packet and decides which back-end server to send it to, it has to replace the packet&#39;s TO with the IP address of the back-end server. If the web switch didn&#39;t change the TO and simply sent the packet on, the server would ignore the packet. Sort of like receiving a letter addressed to somebody you don&#39;t know. So in a nutshell, client processing is simply replacing the web switch&#39;s IP address with the selected back-end server&#39;s IP address in packets from the client.</li>
<li><strong>Server processing:</strong> When the back-end server decides to send a response packet back to the client, the reverse of client processing has to occur. If the web switch were to simply send the packet from the server back to the client without server processing, the client would ignore the packet. Why? Well, the client sent the packet to the IP address of the web switch and expects a reply from that IP address, not the server&#39;s IP. It&#39;s sort of like sending a letter to Aunt Gertie, but getting the reply from Aunt Gertie&#39;s nurse Josie. You don&#39;t know who Josie is, so you toss the reply thinking it&#39;s junk mail. Server processing fixes this by rewriting the FROM in the server&#39;s reply to the IP address of the web switch.</li>
<li>An Alteon is a bit unusual in that instead of one massive SLB processor it has 8&#8230;one per port (this is fixed in the 2424s, but they imitate the older behavior for backward compatibility). So if you have one port connected to your servers and a second port connected to the Internet, you have to enable Client processing on the Internet-facing port and Server processing on the server-facing port. The reason is that the 8 individual processors aren&#39;t bulky enough to do BOTH the client and server processing. As a result, the operation gets split between ports in a way you specify. So you have to remember which kind of processing is which, and set it appropriately on the right ports. This is a MAJOR pain in the butt. If you get client and server processing confused and set a port to the wrong one, load balancing just isn&#39;t gonna work for you today.</li>
<li>The SunFish don&#39;t have this limitation. They just make it work. Concentrate on creating your VIPs and RIPs and the rest is taken care of for you. It&#39;s really a spectacular change for us! It was so easy that it wasn&#39;t until I was driving home that it struck me I hadn&#39;t had to fool with client or server processing at all.</li>
</ul>
<li><strong>XML-over-HTTP!</strong> As I was complaining about the lack of a heads-up-display on the SunFish, I ran into a very cool feature! On most of the pages that list settings or statistics in the SunFish WebUI, there&#39;s a little button labeled &#8220;XML&#8221;. If you click on it, you get the settings or stats you were looking at&#8230;but XML encoded! This means you can write your own scripts to consume the status of the SunFish! All your program needs to be capable of is downloading pages via HTTP, and consuming XML.
The upshot is that this feature enables us to write our own stop-gap heads-up-display. <img src='http://blogs.digitar.com/jjww/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' />  It&#39;s much simpler than messing around with SNMP calls and the like, particularly given our familiarity with consuming web services. This is a terrific feature! Props to the SunFish team for providing an XML interface to the unit. Simply amazing.</li>
</ol>
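<p>If the client/server processing split is still fuzzy, this little Python sketch may help. It is a toy model, not anything that runs on an Alteon or a SunFish: the <strong>Packet</strong> class, the two function names and the addresses are all made up for illustration. Client processing rewrites the TO on the way in, and server processing rewrites the FROM on the way back out.</p>
<pre><code class="language-python">
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    src: str  # the FROM address
    dst: str  # the TO address


def client_processing(packet, real_server_ip):
    """Inbound: swap the web switch's address in TO for the chosen back-end server."""
    return replace(packet, dst=real_server_ip)


def server_processing(packet, virtual_ip):
    """Outbound: swap the server's address in FROM for the web switch's address."""
    return replace(packet, src=virtual_ip)


# Made-up documentation addresses
BROWSER, VIP, SERVER = "198.51.100.7", "203.0.113.10", "10.0.0.21"

request = client_processing(Packet(src=BROWSER, dst=VIP), SERVER)
reply = server_processing(Packet(src=SERVER, dst=BROWSER), VIP)
print(request)
print(reply)
</code></pre>
<p>The browser only ever sees the VIP in the reply, and the server only ever sees packets addressed to itself. Get the two functions on the wrong side and neither end recognizes its mail.</p>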
<p><!-- technorati tags start -->
<p style="text-align:right;font-size:10px;">Technorati Tags: <a href="http://www.technorati.com/tag/N1400V" rel="tag">N1400V</a>, <a href="http://www.technorati.com/tag/Sun" rel="tag">Sun</a></p>
<p><!-- technorati tags end --></p>
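<p>A stop-gap heads-up-display built on the XML button could start out something like the Python sketch below. Treat it as a guess at the shape of the thing: I am not reproducing the real XML layout or page addresses here, so the sample document, its element names and the helper names (<strong>fetch_xml</strong>, <strong>leaf_values</strong>, <strong>build_sample</strong>) are all invented. You would point <strong>fetch_xml</strong> at whatever address the XML button gives you.</p>
<pre><code class="language-python">
import urllib.request
import xml.etree.ElementTree as ET


def fetch_xml(url, timeout=5):
    """Download one of the WebUI's XML pages and return the raw bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def leaf_values(xml_bytes):
    """Flatten an XML document into (tag, text) pairs for its leaf elements."""
    root = ET.fromstring(xml_bytes)
    pairs = []
    for element in root.iter():
        if len(element) == 0:  # no children, so this is a leaf
            pairs.append((element.tag, (element.text or "").strip()))
    return pairs


def build_sample():
    """Build an invented stats document; real element names will differ."""
    root = ET.Element("stats")
    vip = ET.SubElement(root, "vip")
    ET.SubElement(vip, "name").text = "www"
    ET.SubElement(vip, "connections").text = "42"
    return ET.tostring(root)


for tag, value in leaf_values(build_sample()):
    print(tag, value)
</code></pre>
<p>Flattening to tag and value pairs keeps the script indifferent to how deeply the unit nests its stats, which seemed the safest choice when the real layout is unknown.</p>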
]]></content:encoded>
			<wfw:commentRss>http://blogs.digitar.com/jjww/2006/08/sunfish-chum-er-odds-and-ends/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
