One of our clusters uses DRBD between two machines, with GFS2 mounted on top of DRBD in dual-primary mode. I’ve played around with Gluster, Lustre, OCFS2, AFS and many others, and I’ve used NetApps in the past, but I’ve never been extremely happy with any of the distributed or clustered filesystems.
My recent thinking about using SetUID or SetGID mode to deal with particular problems led me to look at a versioning filesystem. Currently that leaves ZFS and BtrFS.
I’ve used ZFS in the past on Solaris, and it is supported natively in FreeBSD. Since we use Debian, there is also Debian’s K*BSD project, which puts the Debian userland on the BSD kernel and would make most of our in-house management processes easy to convert. Using ZFS under Linux requires FUSE, which could introduce performance issues.
The other option we have is BtrFS. BtrFS is less mature, but it can handle in-place migrations from ext3/ext4. That doesn’t help much today since we primarily run XFS, but future machines could use ext4 until BtrFS is deemed stable enough, at which point they could be converted in place.
In testing, XFS and Ext4 have similar performance when well tuned, which means we shouldn’t see any significant difference between them. Granted, this disagrees with some current benchmarks, but those benchmarks didn’t appear to set the filesystems up correctly and didn’t modify the mount parameters to allow more buffers to be used. When dealing with small files, XFS needs a little more RAM, and the number and size of the journal log buffers need to be increased so that more of the log is held in RAM before it is committed to disk. Large file performance is usually deemed superior with XFS, but by properly tuning Ext3 (and by inference, Ext4) we can change its performance characteristics and get about 95% of XFS’s large file performance.
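As a rough illustration of the log buffer tuning mentioned above, here is a minimal Python sketch that builds an XFS mount option string using the `logbufs` and `logbsize` mount options. The specific values, the `noatime` option, and the device and mount point are assumptions for the example, not the settings we actually run.

```python
# Sketch: build an fstab entry that raises the XFS in-memory log buffers.
# logbufs and logbsize are real XFS mount options; the values chosen here
# and the device/mount point below are illustrative assumptions.

VALID_LOGBSIZE_KIB = (16, 32, 64, 128, 256)  # larger sizes need a version 2 log


def xfs_mount_options(logbufs=8, logbsize_kib=256, extra=("noatime",)):
    """Return a comma-separated XFS mount option string."""
    if not 2 <= logbufs <= 8:
        raise ValueError("logbufs must be between 2 and 8")
    if logbsize_kib not in VALID_LOGBSIZE_KIB:
        raise ValueError("logbsize must be one of %s KiB" % (VALID_LOGBSIZE_KIB,))
    options = list(extra) + [f"logbufs={logbufs}", f"logbsize={logbsize_kib}k"]
    return ",".join(options)


def fstab_line(device, mountpoint, options):
    """Format a single /etc/fstab entry for an XFS filesystem."""
    return f"{device} {mountpoint} xfs {options} 0 0"


if __name__ == "__main__":
    # Hypothetical device and mount point.
    print(fstab_line("/dev/sdb1", "/srv/data", xfs_mount_options()))
```

More and larger log buffers cost some extra RAM per mounted filesystem, which is the trade-off for small-file workloads described above.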
Currently we keep two generations of weekly machine backups. That wouldn’t change, but we could add checkpointing and more frequent snapshots, so that a file that was uploaded and then modified or deleted would have a much better chance of being restorable. One of the strengths of versioning filesystems is the ability to take hourly or daily snapshots, which should reduce the data loss if a site is exploited or catastrophically damaged through a mistake.
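To make the hourly/daily idea concrete, here is a small Python sketch of a retention rule that decides which snapshots to keep. The policy (newest snapshot per hour for 24 hours, newest per day for 14 days) and the function name are assumptions for illustration; it only works on timestamps and does not call any ZFS or BtrFS tooling.

```python
from datetime import datetime, timedelta


def snapshots_to_keep(timestamps, now, hourly_hours=24, daily_days=14):
    """Return the set of snapshot timestamps to retain.

    Assumed policy: keep the newest snapshot in each clock hour for the
    last `hourly_hours` hours, and the newest snapshot of each calendar
    day for the last `daily_days` days. Anything else can be pruned.
    """
    keep = set()
    seen_hours = set()
    seen_days = set()
    for ts in sorted(timestamps, reverse=True):  # newest first
        age = now - ts
        hour_bucket = ts.replace(minute=0, second=0, microsecond=0)
        day_bucket = ts.date()
        if age <= timedelta(hours=hourly_hours) and hour_bucket not in seen_hours:
            seen_hours.add(hour_bucket)
            keep.add(ts)
        if age <= timedelta(days=daily_days) and day_bucket not in seen_days:
            seen_days.add(day_bucket)
            keep.add(ts)
    return keep


if __name__ == "__main__":
    now = datetime(2024, 1, 15, 12, 0)
    snaps = [
        now - timedelta(minutes=10),
        now - timedelta(minutes=20),
        now - timedelta(days=2),
        now - timedelta(days=30),
    ]
    for ts in sorted(snapshots_to_keep(snaps, now)):
        print("keep", ts.isoformat())
```

The weekly machine backups would sit alongside this; the snapshots only narrow the window between them.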
So we’ve got three potential solutions, in order of my confidence that each will work:
* FreeBSD ZFS
* Debian/K*BSD ZFS
* Debian BtrFS
This weekend I’ll start putting the two Debian solutions through their paces to see if I feel comfortable with either. I’ve got a chassis swap to do this week and we’ll probably switch that machine from XFS to Ext4 in preparation as well. Most of the new machines we’ve been putting online now use Ext4 due to some of the issues I’ve had with XFS.
Ideally, I would like to start using BtrFS on every machine, but if I need to move things over to FreeBSD, I will face some very tough decisions and migrations.
Never a dull moment.