This is at least amusing ...
By joe
So we had the little … I dunno what to call it … fiasco, mebbe? … where we were promised a reasonable comparison between a JackRabbit and a Thumper, and did not get one (a reasonable out-of-the-box comparison, that is; no one I know who promises an accurate comparison purposely de-tunes one platform before comparing). I am not going to dive back into that mess. When we are paid to benchmark, or when we do it on our own, we never, ever start out by ignoring the vendors/authors on what makes their product slow or fast. We ask them how to get the best performance, and we do a baseline measurement out of the box. Even more to the point, we take great pains to try to get the maximum performance out of each box after baselining it.
But that is another discussion. It might be worth suggesting to SC07 as a benchmarking-technique tutorial. Ok, back to the topic.

One of the points we make in our white paper for JackRabbit is that with 4 PCIe slots (2 PCIe x8, 2 PCIe x16), we have quite a few lanes of PCIe I/O bandwidth available: 48 lanes, to be precise. At 0.5 GB/s per lane, this gives us 24 GB/s of sustained I/O bandwidth. Unfortunately, the vast majority of motherboard designs tie the PCIe channels to one processor. It saves the vendor design money and allows the board to be used with a single processor, but it limits your performance to the HyperTransport (HT) bandwidth. Still, this is on the order of 10 GB/s, so it is not terrible. But most I/O is implemented as memory-to-controller DMA, so you are memory-bandwidth bound; let's call 6 GB/s a "realistic" limit on most current motherboards. If you report much above this, you are likely dealing with a huge cache somewhere.

So I read a report on a "do it yourself X4500" in which the author castigates anyone attempting to replicate the functionality of the Thumper (X4500). Their point is that with 3 independent PCI-X buses, Thumper has bandwidth to spare, along with lots of other nice bits of engineering. Ok, let's address that.

JackRabbit has PCIe x8 RAID controllers with hardware RAID calculation. Each controller can drive up to 24 disks. If we give each of these drives a 70 MB/s read speed (common on today's units), that works out to 1.7 GB/s of sustained traffic per controller. The article is absolutely correct in this regard: PCI-X can sustain only about 0.8 GB/s, so you would oversubscribe that limit by a factor of two if you used that design. Worse, if you ganged 3 PCI-X buses together on one controller, you would still be limited to the 0.8 GB/s PCI-X speed. Terrible. This is why Thumper uses 3 independent PCI-X buses, as do most modern motherboards, though the author misses this point. That said, Thumper ought to be capable, in a maxed-out configuration, of pushing something on the order of 2.4 GB/s to disk. Just don't do anything crazy, like put an InfiniBand or 10GbE card in there. You have PCI-X buses: 0.8 GB/s max, no choice. That's your limit for getting data out of the box.

JackRabbit has PCIe x8 controllers. Each controller is capable of 4 GB/s, or 2 GB/s in each direction. Put another way, 24 drives at 70 MB/s use about 84% of a RAID card's per-direction bandwidth when the card is loaded to maximum capacity. With 2 controllers, we have a sustained 8 GB/s available to us, 4 GB/s in each direction. With 48 of the 70 MB/s drives, we can get 3.4 GB/s of bandwidth, which is again 84% of the maximum I/O capacity of the 2 PCIe slots. PCIe works differently from PCI-X: each slot gets its own dedicated lanes, so you cannot oversubscribe a shared bus the way you can (and do) on PCI-X.

Of course, we can add I/O cards as well: PCIe and PCI-X InfiniBand, 10GbE, and so on. That opens up anywhere from 0.4 GB/s to 4 GB/s per chassis out to the network for moving data into or out of the box. We can also add more PCIe RAID controllers. With 4 PCIe controllers we are at 16 GB/s of bandwidth for our 48 drives, 8 GB/s in each direction, so the 3.4 GB/s that 48 drives can provide is about 42% of maximum.

What I am saying is that JackRabbit was designed with headroom in mind. Suppose we can, through the magic of technology, get 100 MB/s out of our drives. Now 48 drives are looking at 4.8 GB/s. That is more than two controllers can carry, but only about 60% of the per-direction bandwidth of four.
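That arithmetic is simple enough to sanity-check. Here is a minimal sketch in Python, using the round numbers from above (0.5 GB/s per PCIe lane bidirectional, ~0.8 GB/s realistic per PCI-X bus, 70 MB/s per drive) rather than anything measured; the helper names are just for illustration.

```python
# Round numbers from the text above; not measured values.
PCIE_LANE_PER_DIR = 0.25   # GB/s per PCIe lane, each direction (0.5 GB/s both ways)
PCIX_BUS          = 0.8    # GB/s, realistic ceiling for one PCI-X bus
DRIVE             = 0.070  # GB/s, ~70 MB/s streaming per drive

def pcie_per_dir(lanes):
    """Per-direction bandwidth (GB/s) of a PCIe slot with the given lane count."""
    return lanes * PCIE_LANE_PER_DIR

def drive_stream(n):
    """Aggregate streaming bandwidth (GB/s) of n drives."""
    return n * DRIVE

print(48 * 2 * PCIE_LANE_PER_DIR)                # 24.0  -> 48 lanes, both directions
print(pcie_per_dir(8))                           # 2.0   -> one x8 RAID card, per direction
print(drive_stream(24))                          # ~1.68 -> 24 drives at 70 MB/s
print(drive_stream(24) / pcie_per_dir(8))        # ~0.84 -> the 84% per-card figure
print(drive_stream(24) / PCIX_BUS)               # ~2.1  -> same drives on one PCI-X bus: ~2x oversubscribed
print(drive_stream(48) / (2 * pcie_per_dir(8)))  # ~0.84 -> 48 drives vs 2 cards
print(drive_stream(48) / (4 * pcie_per_dir(8)))  # ~0.42 -> 48 drives vs 4 cards
```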
For Thumper, going to 100 MB/s drives means its PCI-X backplane is bandwidth-oversubscribed by about a factor of 2, and unlike JackRabbit, there is nothing to be done about it. We can add an additional controller, putting 16 drives on each of 3 controllers, and still have 3-4 GB/s of I/O available to work with for the network (a quick sketch of this arithmetic is at the end of the post). What this boils down to is designed-in headroom. JackRabbit has huge amounts of headroom, designed in. It can scale in performance where it is needed.

ILOM and other bits do matter. JackRabbit does IPMI 2.0 and has N+1 power supplies (Thumper has 2 power supplies). JackRabbit has in-band and out-of-band RAID management. JackRabbit as an appliance plugs in, turns on, and is fast: it talks to and works with Microsoft AD/NT domains, NIS, LDAP, …, and does iSCSI and NAS in a single box, user-controllable and changeable at run time, on the fly.

The cost of a fully configured JackRabbit with 36 TB, and about 1.2-1.5x better performance than the 24 TB Thumper, is somewhat lower than that of the 24 TB Thumper. The 24 TB JackRabbit, with the same performance as the 36 TB JackRabbit just on smaller disks, is much lower still than the equivalent Thumper. This is goodness. Several customers have them, and they seem to like them. I have some responses which are sadly non-quotable, for two reasons: their companies do not allow them to be quoted (under pain of being fired), and they contain lots of … colorful … language. The gist is that JackRabbit is (&($^%$^( fast. Really (&($^%$^( fast.
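And the sketch promised above, for the hypothetical 100 MB/s drives: again just the text's round numbers spelled out in Python, not benchmark output.

```python
# Hypothetical 100 MB/s drives, 48 per chassis; round numbers from the text.
demand = 48 * 0.100                 # 4.8 GB/s of streaming traffic

thumper = 3 * 0.8                   # 3 independent PCI-X buses at ~0.8 GB/s each
print(demand / thumper)             # 2.0 -> oversubscribed by about 2x, with no way to add buses

jackrabbit = 3 * 8 * 0.25           # 3 PCIe x8 RAID cards (16 drives each), per direction
print(demand / jackrabbit)          # 0.8 -> 80% used, headroom to spare on the RAID side
print(16 * 0.25)                    # 4.0 -> the remaining x16 slot leaves ~4 GB/s for the network
```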