Awhile back (~July 2006) we moved our core MySQL clusters to ZFS in order to…among other things…simplify our backup regimen. Nearly two years later, I can honestly say I'm in love with ZFS mostly because of how much its simplified and shored-up our MySQL infrastructure. ZFS and MySQL go together like chips and salsa.
Now, backing up live MySQL servers is a bit like trying to take a picture of a 747's fan blades while it's in flight. There's basically three camps of MySQL backups:
The mysqldump method worked best when our apps were in the development and testing phase…i.e. the datasets were tiny. However, once you get above a few hundred megabytes, the server had better be a non-production one. Frankly, for our infrastructure, the snapshot method was the most appealing. Among other reasons, it didn't require per-server licensing, and was the best performing option for our mixed bag of InnoDB and MyISAM tables. Initially, the plan was to use the snapshot feature in our shiny new STK arrays…however, just about the time our new arrays arrived, ZFS was released in a reasonably stable form. While snapshotting is not unique to ZFS (and it is a widely used approach to MySQL backups), there are a few benefits to relying on ZFS snapshots for MySQL backups:
However, settling on a backup approach was only part of the battle. Frankly, at the time, there was no commercial backup software that would do what we wanted. MySQL was the red-headed step-child of the backup software world…and to a large degree still is (Zmanda not withstanding). So we rolled our own MySQL/ZFS backup program that we called SnapBack. It's not fancy, and you need to pair it with your own scheduler, but it is highly suited to fast and reliable production backups of MySQL in a ZFS environment. We use it for all sorts of tasks from the original purpose of backing up busy masters, to snapping datasets to build new slaves. SnapBack addresses quite a few of the issues we encountered with the existing open source MySQL backup scripts we found:
All this blathering is really just preface to the fact that we're releasing SnapBack to the world. Hopefully it will save other folks some time, and be a useful tool in their MySQL toolbox. Below you'll find the requirements for SnapBack and a download link. If there's interest in enhancing SnapBack, please feel free as we're releasing it under the BSD license. If there's enough interest, we'll try and post it in a format more conducive to community collaboration. So without further ado….
SnapBack Rev. 1524 Requirements
Download: SnapBack Rev. 1524
selim
January 17th, 2008 at 8:18 pm
I bet your behind the Mysql/Sun deal aren't you ??
very interesting article, I'll give it a try and let you know
selim
vnull
January 18th, 2008 at 5:29 am
H, could you add RSS feed to your blog ? It seems very interesting one
jason arneil
January 18th, 2008 at 6:29 am
A little bit of topic, but I wonder if you looked into a performance comparison of running mysql on ZFS as compared to say UFS? Or for you guys was the performance trade off for greater manageability well worth it?
remigiusz sokolowski
January 18th, 2008 at 7:29 am
That is right, that UFS performs better than ZFS in terms of using it with a database, but…
- IO is most important on DSS databases
- with OLTP the most important thing is to keep with proper buffering and then the IO performance seems to be less critical factor
- backup is especially hard to take on OLTP databases, which must work 24×7x365
January 18th, 2008 at 4:21 pm
Selim: I hope it works for you. I wish I was behind the Sun/MySQL deal.
Actually we're pretty excited about it. With PostgreSQL getting all the Sun-love till now, its really a phenomenal announcement. Hopefully, the DTrace probes won't have to wait till 6.0 now.
vnull: Somehow I never noticed the lack of RSS/Atom in the template. It should be fixed now. Thank you.
Jason: UFS is slightly better performing in reads. However, for writes we find ZFS to have superior performance. But, even if ZFS had 30% less performance the trade-off would be worth it for the features. We save so much money on snapshot licenses that adding extra disks to close the performance gap would still be cheaper.
joe
January 20th, 2008 at 10:07 pm
Have you tried it on FreeBSD? FreeBSD can use ZFS now as well (ZVF v6 not v9/10 as with Solaris). I hope it can be made to work!
yeek
January 21st, 2008 at 1:59 pm
It looks to me the Snapback is actually is taking a Snapshot of all the files of MySQL database. This might work well if your instance only contains one database. However, in many cases, we actually have multiple databases in one instance, and we only want to restore one database on the instance?
How does the Snapback work in this case?
January 21st, 2008 at 5:30 pm
joe: I have not, but it'd be great if it worked. The path/name of the ZFS userspace command is configurable in the configuration file. If the syntax for the command is the same on FreeBSD it should work.
yeek: Depends on how you want to arrange your set up. The most straightforward would be to create a ZFS filesystem for each MySQL database directory under the master ZFS FS. Then create a SnapBack configuration for each FS. Alternately, you could copy the DB files you want out of the snapshot back into the writeable filesystem. We run multiple DBs on a single FS and snapshot that. We copy from the snapshot to long term disk-to-disk backup over GigE.
Shane Milton
February 4th, 2008 at 5:50 am
Great script, however I'd be careful trusting the innodb snapshots even with "FLUSH TABLES WITH READ LOCK" I've run into several instances where innodb still had data in memory that didn't seem to make it to the data files. Maybe 1 in 12 snapshots innodb datafiles where corrupt ( i was using a netapp in my case ) just my $.02
Remigusz Sokolowski
February 11th, 2008 at 6:59 am
I think, Shane, that InnoDB would do well even without FLUSH TABLES WITH READ LOCK. Actually I have made something similar and FLUSH [..] was performed only to provide consistent MyISAM tables, InnoDB should simply recover helping itself with own transactional changelog - of course You need to be sure to have not broken D in ACID
February 19th, 2008 at 10:38 am
Hi Shane,
We've experienced a small number of cases as well, where a rollback recovery was required after restoring the backup. I believe there's also some work being done in the 5.1 series code to make InnoDB more consistent on a FLUSH command. We've never experienced a non-recoverable/corrupt dataset though.
Matt Montgomery
March 11th, 2008 at 2:37 pm
Jason Arnel,
See: http://dev.mysql.com/tech-r...
Lars Timmann
April 25th, 2008 at 9:17 am
Still troubles to get things working in 64Bit-mode…
I compiled MySQLdb with:
# CFLAGS="-m64 -B/usr/ccs/bin/amd64" LDFLAGS="-L/lib/64 -R/lib/64 -L/usr/sfw/lib/amd64 -R/usr/sfw/lib/amd64 -static-libgcc -lgcc_s_amd64" /usr/bin/amd64/python2.4 setup.py build
# CFLAGS="-m64 -B/usr/ccs/bin/amd64" LDFLAGS="-L/lib/64 -R/lib/64 -L/usr/sfw/lib/amd64 -R/usr/sfw/lib/amd64 -static-libgcc -lgcc_s_amd64" /usr/bin/amd64/python2.4 setup.py install
It seems to work but I get
# /usr/bin/amd64/python ./mysql_snapback.py mysql_snapback.cfg
Traceback (most recent call last):
File "./mysql_snapback.py", line 44, in ?
import MySQLdb
File "build/bdist.solaris-2.10-i86pc/egg/MySQLdb/__init__.py", line 19, in ?
ImportError: No module named _mysql
But I am far away from python 'til now…