Following Chris Samuel’s suggestion, I pulled down version 1.96 of bonnie++ and built it. The machine I am using now is a Scientific Linux-based system with a Scalable Informatics 18.104.22.168 kernel. Scientific Linux is yet another RHEL rebuild; it is the customer-requested distribution for this machine.
SL suffers from the RHEL kernel, which is IMO inappropriate for use as a high performance storage system kernel. Workload patterns our customers wish to test regularly crash the RHEL distro kernels. These kernels are missing features our customers need (like xfs), and have numerous misfeatures (4k stacks, backports, …) which compromise stability under heavy load. I have been informed that Red Hat has started supporting xfs, in large part due to customer demand, but only for specific named accounts. This is unfortunate. Then again, as SGI has been the primary developer of xfs, and SGI’s future had/has been in doubt, it would be prudent to look for alternatives. Ext4 and btrfs are obvious choices, but both are too early for serious consideration for large data storage. Pohmelfs and nilfs2 are also somewhat early.
This doesn’t stop people from running benchmarks, and most of the benchmarks we have seen here are … well … not doing a good job of testing what they purport to test. This was, and is, my bone to pick with bonnie++. More in a moment.
As a benchmarker, one thing you absolutely must do is understand your measurement tool, how it interacts with your system, and, with precision, what you are actually measuring. Far too often we see and read of ‘benchmarks’ which aren’t of the system the authors claim they are. We have seen people try to benchmark I/O using cache-based reads and writes with I/O sizes far smaller than RAM, which exercises only the cache and the eventual file system flush code, not the file system or the I/O system. Yet these ‘benchmarks’ are taken at face value, with results reported, analyzed, and used to compare systems, when all you are really doing is comparing caches. The folks doing this are in good company: we have seen this on popular web sites as well as at national labs. The latter doesn’t make the failed technique any better; it just makes it that much more important to educate about.
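To make the cache effect concrete, here is a minimal sketch (my own illustration, not bonnie++ code) that times a buffered write against one followed by fsync(). With a file far smaller than RAM, the buffered number measures the page cache; only the fsync’d number forces data toward the device. Note that on some systems the temporary directory may itself be tmpfs (RAM-backed), in which case both numbers measure memory.

```python
import os
import time
import tempfile

def timed_write(path, size_mb=64, sync=False):
    """Write size_mb of zeros; optionally fsync so data actually reaches the device."""
    buf = b"\0" * (1024 * 1024)
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(size_mb):
            f.write(buf)              # lands in the page cache, not on disk
        if sync:
            f.flush()
            os.fsync(f.fileno())      # force cached pages out before we stop the clock
    return size_mb / (time.perf_counter() - start)  # apparent MB/s

with tempfile.TemporaryDirectory() as d:
    cached = timed_write(os.path.join(d, "a"), sync=False)
    synced = timed_write(os.path.join(d, "b"), sync=True)
    print(f"buffered: {cached:.0f} MB/s, fsync'd: {synced:.0f} MB/s")
```

On a real disk the buffered figure can be an order of magnitude higher than the fsync’d one, which is exactly the gap between "benchmarking cache" and "benchmarking I/O."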
If I want to benchmark I/O, I want, curiously, I/O to occur. If I want to benchmark computation, I want, again curiously, computation to occur. It’s easy to see whether I/O is occurring: look for blinking lights on the drives. If data is getting out to your drives, chances are you will see it.
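A programmatic version of "watch the drive lights" on Linux is to sample the per-device sectors-written counter in /proc/diskstats before and after the run; if the counter doesn’t move, the benchmark never touched the disk. A sketch (assuming Linux; field layout per the kernel’s iostats documentation):

```python
def sectors_written():
    """Return {device: sectors written} from /proc/diskstats (Linux only)."""
    totals = {}
    with open("/proc/diskstats") as f:
        for line in f:
            fields = line.split()
            # fields: major minor name reads ... ; index 9 is sectors written
            totals[fields[2]] = int(fields[9])
    return totals

before = sectors_written()
# ... run the workload under test here ...
after = sectors_written()
for dev, count in after.items():
    delta = count - before.get(dev, 0)
    if delta:
        print(f"{dev}: {delta} sectors written")
```

If a "write benchmark" finishes and no device shows a meaningful delta, you measured the cache, not the storage.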
Bonnie++ 1.96 doesn’t do cached writes very effectively. I can see this in the disk activity lights, and in the reported performance. In fact, what I am seeing looks something like this (according to bonnie++):