SnapBack: The joys of backing up MySQL with ZFS…


Posted by Jason | Posted in Solaris | Posted on 01-17-2008

A while back (~July 2006) we moved our core MySQL clusters to ZFS in order to…among other things…simplify our backup regimen. Nearly two years later, I can honestly say I'm in love with ZFS, mostly because of how much it's simplified and shored up our MySQL infrastructure. ZFS and MySQL go together like chips and salsa.

Now, backing up live MySQL servers is a bit like trying to take a picture of a 747's fan blades while it's in flight. There are basically three camps of MySQL backups:

 

  • Use mysqldump, which locks your tables longer than a Tokyo stoplight.
  • Quickly quiesce/lock your tables and pull a volume snapshot (a minimal sketch of this approach follows the list).
  • Or…use InnoBase Hot Backup. Outside of being licensed per server, ibbackup works very well and allows InnoDB activity to continue uninterrupted. The only downside is if you have to back up MyISAM tables (lock city!).
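
For anyone who hasn't seen the quiesce-and-snapshot pattern in practice, here's a minimal sketch of the idea in Python. This is not the SnapBack source, just the shape of what it automates; the dataset names, credentials, and zfs path are placeholders.

# Minimal sketch: lock MySQL, snapshot the ZFS datasets, unlock.
# Dataset names, credentials, and the zfs binary path are hypothetical.
import subprocess
import MySQLdb

DATASETS = ["tank/mysql/data", "tank/mysql/logs"]   # placeholder datasets
SNAPNAME = "backup-manual"                           # placeholder snapshot name

conn = MySQLdb.connect(host="localhost", user="root", passwd="secret")
cur = conn.cursor()
try:
    # Quiesce: block writes and flush tables so the on-disk files are consistent.
    cur.execute("FLUSH TABLES WITH READ LOCK")
    for ds in DATASETS:
        # The snapshot itself is nearly instantaneous, so the tables stay
        # locked only for the second or two these commands take.
        subprocess.call(["/usr/sbin/zfs", "snapshot", "%s@%s" % (ds, SNAPNAME)])
finally:
    cur.execute("UNLOCK TABLES")
    conn.close()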

The mysqldump method worked best when our apps were in the development and testing phase…i.e., the datasets were tiny. However, once you get above a few hundred megabytes, the server had better be a non-production one. Frankly, for our infrastructure, the snapshot method was the most appealing. Among other reasons, it didn't require per-server licensing, and it was the best-performing option for our mixed bag of InnoDB and MyISAM tables. Initially, the plan was to use the snapshot feature in our shiny new STK arrays…however, just about the time our new arrays arrived, ZFS was released in a reasonably stable form. While snapshotting is not unique to ZFS (and it is a widely used approach to MySQL backups), there are a few benefits to relying on ZFS snapshots for MySQL backups:

  • ZFS snapshots are bleeding fast. When you're backing up a production master, the shorter lock time is critical. Our master backups typically complete within about 3-4 seconds during high load periods.
  • No network communication for the snapshot to complete. Using ZFS, snapshot commands don't have to travel over a network to a storage controller to be executed. Fewer moving parts mean greater reliability and fewer failure conditions to handle in your backup scripts.
  • 100GB database restores/rollbacks are lightning quick…typically under 30 seconds. (This one is a property of the snapshot approach in general, not ZFS specifically.)
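
To give a feel for the restore side: a rollback is a ZFS metadata operation, so its speed doesn't depend on the size of the dataset. A rough sketch (not part of SnapBack; the SMF service name and dataset names are placeholders):

# Hypothetical restore: stop mysqld, roll the dataset back to a snapshot, restart.
import subprocess

DATASET = "tank/mysql/data"        # placeholder dataset holding the datadir
SNAP = "backup-manual"             # placeholder snapshot to restore

subprocess.call(["svcadm", "disable", "-s", "mysql"])   # stop mysqld (SMF name varies)
# -r rolls back past (and destroys) any snapshots newer than the target.
subprocess.call(["/usr/sbin/zfs", "rollback", "-r", "%s@%s" % (DATASET, SNAP)])
subprocess.call(["svcadm", "enable", "mysql"])          # bring mysqld back up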

However, settling on a backup approach was only part of the battle. Frankly, at the time, there was no commercial backup software that would do what we wanted. MySQL was the red-headed stepchild of the backup software world…and to a large degree still is (Zmanda notwithstanding). So we rolled our own MySQL/ZFS backup program that we called SnapBack. It's not fancy, and you need to pair it with your own scheduler, but it is highly suited to fast and reliable production backups of MySQL in a ZFS environment. We use it for all sorts of tasks, from the original purpose of backing up busy masters to snapping datasets to build new slaves. SnapBack addresses quite a few of the issues we encountered with the existing open source MySQL backup scripts we found:

  • SnapBack understands how to quiesce MySQL for consistent backups of InnoDB tables (usually avoiding InnoDB recovery on a restore). Most of the open source scripts focus exclusively on MyISAM and forget about disabling AUTOCOMMIT.
  • SnapBack records the current master log file name and position in the naming of the snapshot, to aid in creating replication slaves (see the sketch after this list). You can take any SnapBack backup and create a slave from that point in time; you don't need to know that you'll want to at the time you pull the backup.
  • SnapBack is aware that InnoDB logs and table space are usually on different zpools for performance, and can snap both pools in a single backup.
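
As an illustration of the naming idea above: the master's binlog coordinates can be read while the tables are locked and baked into the snapshot name, so a slave built from that snapshot can be pointed at exactly that position. The names and format below are made up; SnapBack's actual naming scheme may differ.

# Hypothetical: embed the master's binlog file/position in the snapshot name.
import subprocess
import MySQLdb

conn = MySQLdb.connect(host="localhost", user="root", passwd="secret")
cur = conn.cursor()
try:
    cur.execute("FLUSH TABLES WITH READ LOCK")
    cur.execute("SHOW MASTER STATUS")
    logfile, position = cur.fetchone()[0:2]    # e.g. 'mysql-bin.000123', 4500192
    snap = "tank/mysql/data@%s-%s" % (logfile, position)
    subprocess.call(["/usr/sbin/zfs", "snapshot", snap])
finally:
    cur.execute("UNLOCK TABLES")
    conn.close()

# A slave restored from that snapshot can then be attached with something like:
#   CHANGE MASTER TO MASTER_HOST='master.example.com', MASTER_USER='repl',
#     MASTER_PASSWORD='...', MASTER_LOG_FILE='mysql-bin.000123', MASTER_LOG_POS=4500192;
#   START SLAVE;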

All this blathering is really just preface to the fact that we're releasing SnapBack to the world. Hopefully it will save other folks some time and be a useful tool in their MySQL toolbox. Below you'll find the requirements for SnapBack and a download link. If there's interest in enhancing SnapBack, please feel free; we're releasing it under the BSD license. If there's enough interest, we'll try to post it in a format more conducive to community collaboration. So without further ado….

SnapBack Rev. 1524 Requirements 

Download: SnapBack Rev. 1524 


Comments (17)

I bet you're behind the MySQL/Sun deal, aren't you??
Very interesting article. I'll give it a try and let you know.
selim

Hi, could you add an RSS feed to your blog? It seems a very interesting one :)

A little bit off topic, but I wonder if you've looked into a performance comparison of running MySQL on ZFS compared to, say, UFS? Or was the performance trade-off well worth the greater manageability for you guys?

It is true that UFS performs better than ZFS for database use, but…
- I/O is most important on DSS databases
- with OLTP, the most important thing is proper buffering, and then raw I/O performance is a less critical factor
- backups are especially hard to take on OLTP databases, which must run 24x7x365

Selim: I hope it works for you. I wish I was behind the Sun/MySQL deal. :-) Actually, we're pretty excited about it. With PostgreSQL getting all the Sun-love till now, it's really a phenomenal announcement. Hopefully, the DTrace probes won't have to wait till 6.0 now. :-)

vnull: Somehow I never noticed the lack of RSS/Atom in the template. It should be fixed now. Thank you.

Jason: UFS performs slightly better on reads. However, for writes we find ZFS to have superior performance. But even if ZFS had 30% less performance, the trade-off would be worth it for the features. We save so much money on snapshot licenses that adding extra disks to close the performance gap would still be cheaper.

Have you tried it on FreeBSD? FreeBSD can use ZFS now as well (ZFS pool v6, not v9/10 as with Solaris). I hope it can be made to work!

It looks to me like SnapBack is actually taking a snapshot of all of the MySQL instance's files. This might work well if your instance only contains one database. However, in many cases we actually have multiple databases in one instance, and we may only want to restore one database on the instance.

How does SnapBack work in this case?

joe: I have not, but it'd be great if it worked. The path/name of the ZFS userspace command is configurable in the configuration file. If the syntax for the command is the same on FreeBSD it should work.

yeek: Depends on how you want to arrange your setup. The most straightforward would be to create a ZFS filesystem for each MySQL database directory under the master ZFS FS, then create a SnapBack configuration for each FS. Alternately, you could copy the DB files you want out of the snapshot back into the writeable filesystem. We run multiple DBs on a single FS and snapshot that. We copy from the snapshot to long-term disk-to-disk backup over GigE.
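
To make the first option concrete, the layout is just one child filesystem per database directory under the datadir, each of which then gets its own SnapBack configuration. Something like this (the dataset and database names are placeholders):

# Hypothetical per-database layout: one ZFS filesystem per database directory.
import subprocess

PARENT = "tank/mysql/data"                       # placeholder dataset mounted at the datadir
for db in ("orders", "customers", "sessions"):   # placeholder database names
    subprocess.call(["/usr/sbin/zfs", "create", "%s/%s" % (PARENT, db)])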

Great script. However, I'd be careful trusting the InnoDB snapshots even with "FLUSH TABLES WITH READ LOCK". I've run into several instances where InnoDB still had data in memory that didn't seem to make it to the data files. Maybe 1 in 12 snapshots had corrupt InnoDB datafiles (I was using a NetApp in my case). Just my $.02.

I think, Shane, that InnoDB would do well even without FLUSH TABLES WITH READ LOCK. I have actually built something similar, and the FLUSH [..] was performed only to provide consistent InnoDB tables; without it, InnoDB should simply recover using its own transactional log – of course, you need to be sure you have not broken the D in ACID.

Hi Shane,

We've experienced a small number of cases as well, where a rollback recovery was required after restoring the backup. I believe there's also some work being done in the 5.1 series code to make InnoDB more consistent on a FLUSH command. We've never experienced a non-recoverable/corrupt dataset though.

Jason Arnel,
See: http://dev.mysql.com/tech-r…

Still having trouble getting things working in 64-bit mode…
I compiled MySQLdb with:
# CFLAGS="-m64 -B/usr/ccs/bin/amd64" LDFLAGS="-L/lib/64 -R/lib/64 -L/usr/sfw/lib/amd64 -R/usr/sfw/lib/amd64 -static-libgcc -lgcc_s_amd64" /usr/bin/amd64/python2.4 setup.py build
# CFLAGS="-m64 -B/usr/ccs/bin/amd64" LDFLAGS="-L/lib/64 -R/lib/64 -L/usr/sfw/lib/amd64 -R/usr/sfw/lib/amd64 -static-libgcc -lgcc_s_amd64" /usr/bin/amd64/python2.4 setup.py install
It seems to work, but I get:
# /usr/bin/amd64/python ./mysql_snapback.py mysql_snapback.cfg
Traceback (most recent call last):
File "./mysql_snapback.py", line 44, in ?
import MySQLdb
File "build/bdist.solaris-2.10-i86pc/egg/MySQLdb/__init__.py", line 19, in ?
ImportError: No module named _mysql

But I haven't done much with Python until now…

Jason, it looks like this page used to originally refer to InnoDB and MyISAM tables, but at some point all the MyISAM references were changed to InnoDB, leaving phrases like “mixture of InnoDB and InnoDB tables”. You might want to correct that.

I only caught the one location of the mistake. Thank you for catching it. It's been corrected.

Jason, are you sure there’s only one location? Or did the update not post? I’m still seeing “best performing option for our mixed bag of InnoDB and InnoDB tables” above, and “ibbackup works very well and allows InnoDB activity to continue uninterrupted. The only downside is if you have to backup InnoDB tables (lock city!)” Neither of these appear to make sense.

I’m actually trying to use this in test, to be eventually rolled out in production if I can get it to work, with MySQL 5.1.44, Solaris 10u9, ZFS pool version 22, ZFS filesystem version 4. After snapshotting, I’m cloning the snapshot, mounting the clone at a new location, backing it up with Bacula, restoring, mounting the restored DB directory via NFS to another machine (a Gentoo Linux machine which currently runs MySQL 5.1.50-r1). The backups and restores look good, but I’m invariably finding corrupted InnoDB tables when I try to load them. I assume it’s working for you; any suggestions about what might be the source of the problem?

…Never mind. I found the problem. It was a stupidly trivial configuration error on my side.

Color me sheepish now.
