XFS filesystem benchmarking thoughts prompted by a real-world situation
I suspect I’m going to learn something about XFS today that is going to create a ton of work.
I’m writing a benchmark script (in bash) that uses bonnie++ along with two other real-world scenarios. I’ve thought a lot about how to replicate real-world usage, but benchmarks usually stress a worst-case scenario and rarely match what a production box actually does.
Benchmarks shouldn’t really be published as the end-all, be-all of testing and you have to make sure you’re testing the piece you’re looking at, not your benchmark tool.
I’ve tried to benchmark Varnish, Tux, and Nginx multiple times, and I’ve seen numerous benchmarks that claim one is insanely faster than the others for this workload or that workload, but there are always potential issues in their tests. A benchmark should be published with every bit of information possible so that others can replicate the test in the same way and potentially point out configuration issues.
One benchmark I read lately showed Nginx serving a static file at 35krps while Varnish flatlined at 8krps. My first thought was: is it caching or talking to the backend? There was a snapshot of varnishstat supporting the notion that it was indeed cached, but was the shmlog mounted on a RAM-based tmpfs? Was varnishstat running while the benchmark was?
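The tmpfs question is the kind of detail that is cheap to check and publish alongside the numbers. A minimal sketch, assuming GNU coreutils `stat` and a hypothetical shmlog directory of /var/lib/varnish (the real location depends on how varnishd was started):

```shell
# Print the filesystem type backing a path (GNU coreutils stat).
fs_type() {
    stat -f -c %T "$1"
}

# Hypothetical default path; override SHMLOG_DIR to match the actual install.
SHMLOG_DIR="${SHMLOG_DIR:-/var/lib/varnish}"

if [ -d "$SHMLOG_DIR" ]; then
    echo "shmlog dir $SHMLOG_DIR is on: $(fs_type "$SHMLOG_DIR")"
else
    echo "no such directory: $SHMLOG_DIR"
fi
```

If the reported type is anything other than tmpfs, the shared memory log may be hitting disk during the run, which is worth noting next to the results.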
Benchmarks test particular workloads – workloads you may not see. What you learn from a benchmark is how this load is affected by that setting – so when you start to see symptoms, your benchmarking has taught you what knob to turn to fix things.
Based on my impressions of the filesystem issues we’re running into on a few production boxes, I am convinced lazy-count=0 is a problem. While I did benchmark it and got different results, logic still dictates that lazy-count=1 should be enabled for almost all workloads. Another value I’m looking at is -i size=256, which is the default for XFS. I believe this should be larger, which would really help directories with tens of thousands of files. -b size=8192 might be a good compromise: many of these sites are running small files, but the average filesize is 5120 bytes, slightly over the 4096-byte block, meaning that each file written takes two blocks instead of one. The log size (-l size=) should be increased on heavy-write machines, and I believe the default setting is too low even for normal workloads.
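To make the block arithmetic and the knobs concrete, here is a rough sketch. It only prints a mkfs.xfs command line and never runs it. The device /dev/sdX1 and the values 512 and 64m are placeholders I picked for illustration, not recommendations; mkfs.xfs spells the block size as -b size=N.

```shell
# Sketch only: prints a mkfs.xfs command line, never executes it.

# Blocks needed for a file: ceiling of file_bytes / block_bytes.
# $1 = file size in bytes, $2 = block size in bytes
blocks_needed() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

# Assemble a mkfs.xfs command line from the knobs discussed above.
# $1 = lazy-count, $2 = inode size, $3 = block size, $4 = log size, $5 = device
mkfs_cmd() {
    echo "mkfs.xfs -l lazy-count=$1,size=$4 -i size=$2 -b size=$3 $5"
}

blocks_needed 5120 4096   # prints 2
blocks_needed 5120 8192   # prints 1

# /dev/sdX1 is a hypothetical placeholder device.
mkfs_cmd 1 512 4096 64m /dev/sdX1
```

One thing to verify before settling on 8192: many Linux kernels cannot mount an XFS filesystem whose block size is larger than the memory page size (4096 bytes on most x86 machines), so that option may not be usable on every box.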
With that in mind, I’ve got 600 permutations of filesystem tests, which need to be run four times to check each mount option, which again need to be run three times to check each IO scheduler.
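For a sense of scale, the matrix multiplies out to 600 × 4 × 3 = 7200 runs. A small sketch that enumerates it without running anything; the mount options and scheduler names below are examples I filled in to make the loop concrete, not the actual list being tested:

```shell
# Sketch: enumerate the run matrix; no benchmark is executed.
FS_PERMUTATIONS=600
# Hypothetical example values; the real lists may differ.
MOUNT_OPTS="defaults noatime noatime,nobarrier noatime,logbufs=8"
SCHEDULERS="cfq deadline noop"

# Count the whitespace-separated words in a string.
count_words() {
    set -- $1
    echo $#
}

# Total runs = filesystem permutations * mount options * IO schedulers.
total_runs() {
    echo $(( $FS_PERMUTATIONS * $(count_words "$MOUNT_OPTS") * $(count_words "$SCHEDULERS") ))
}

# One line per (mount option, scheduler) pair; each pair covers all 600 filesystems.
list_pairs() {
    for m in $MOUNT_OPTS; do
        for s in $SCHEDULERS; do
            echo "mount=$m scheduler=$s"
        done
    done
}

list_pairs
echo "total runs: $(total_runs)"   # 600 * 4 * 3 = 7200
```

Printing the matrix first makes it easy to sanity-check the combinations, and to trim them, before committing a machine to the full set.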
I’ll use the same methodology to test ext4, which is going to be a lot easier due to fewer knobs, but I believe XFS is still going to win based on some earlier testing I did.
In a quick test that took a little more than six minutes, I increased deletes from about 5.3k/sec to 13.4k/sec. I suspect this machine will be running tests for the next few days after I write the test script.