SC06 wrap-up: thoughts on what I did not see or hear
By joe
From the last post, you can read some of what I did see and hear. This is about what was missing.
- Applications: The folks from Microsoft showed off Excel running on a cluster. Some of the others showed “trivial” or booth-specific applications. These weren’t real things in most cases; they were smaller “toy” apps or models. Maybe I missed it, but I did not see many applications that demanded supercomputing. I had in 2005, tons of them. Maybe it was the venue, or the timing, or … I dunno. Applications are the reason that SC machines exist.
- Technologies that are out there in performance: Not the current crop of products, or soon-to-be products, but technologies that will give us OOM (order of magnitude) deltas in very little time. A few years ago Clearspeed was this, as were FPGAs last year. FPGAs are rapidly becoming products, and Clearspeed boards are now being bundled with everybody’s machines, blades, … But basically there are N vendors hawking N+3 incompatible FPGA boards, where you cannot move bitfiles between them, so you face a porting effort for each one.
- Visualization: Hamming’s statement comes to mind: compute for insight, not numbers. Visualization is a great way to couple huge amounts of information over a nearly impedance-matched interface. Unfortunately, and again, maybe I missed it, but there was precious little there that was really out there in terms of capabilities. 7-8 years ago I was driving virtual fly-throughs of large proteins and proteases at ACS conferences. The ones I saw here were not even at that level.
- Reasonable use cases and data for benchmarks: Here and in numerous other places I have asked users to give me reasonable use cases for storage and file servers. It always seems to degenerate into “tell me how fast the disks are,” regardless of what the real use case is. This is odd, since the only question really worth asking is how your use case, and the other use cases you actually care about, perform. If you care about raw disk speed in corner-case scenarios, sure, that might be interesting. But does it matter if your use cases never touch those corner cases? (There is a quick sketch of this point at the end of the post.) A number of vendors did put data out, and we tried to compare our little JackRabbit against their offerings. While we are happy that our unit appears to outperform these other units in the test cases, we aren’t sure how much this matters to the use cases, likely because of the almost complete decoupling of use case from benchmark. In other areas, vendor reps happily regurgitated what their marketeers built for them. When pressed hard, they admitted what we already knew: you are not going to get an OOM or more unless you can address all the issues in a code or use case. For example: I have a code that spends 98% of its time in one routine. Even if I could magically make that time go to zero (I cannot), the 2% left over is going to limit performance. 50x isn’t a bad speedup … until you realize, as in the last post, that this is your performance-limiting factor. (A sketch of that arithmetic is at the end as well.)

Well, there is more that I didn’t see or hear, so I will try to fill in additional bits later on.
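First, the promised sketch of the use-case point. This is minimal and illustrative only, not anything we ran on the show floor: the workloads, sizes, and file counts are made up. The point is that the same storage can look very different depending on whether you measure one big streaming write or the many-small-files pattern some real codes actually generate.

```python
# Minimal sketch: "how fast are the disks" vs. a use case.
# All sizes and counts here are arbitrary illustration values.
import os, tempfile, time

def streaming_write_mb_s(path, total_mb=256, block_kb=1024):
    """One big sequential write: what raw-disk-speed benchmarks measure."""
    block = b"\0" * (block_kb * 1024)
    t0 = time.time()
    with open(path, "wb") as f:
        for _ in range(total_mb * 1024 // block_kb):
            f.write(block)
        f.flush()
        os.fsync(f.fileno())
    return total_mb / (time.time() - t0)

def small_files_mb_s(dirpath, nfiles=2000, size_kb=16):
    """Thousands of small file creates: closer to some codes' real I/O."""
    data = b"\0" * (size_kb * 1024)
    t0 = time.time()
    for i in range(nfiles):
        with open(os.path.join(dirpath, f"f{i}"), "wb") as f:
            f.write(data)
    return (nfiles * size_kb / 1024.0) / (time.time() - t0)

with tempfile.TemporaryDirectory() as d:
    print(f"streaming:   {streaming_write_mb_s(os.path.join(d, 'big')):8.1f} MB/s")
    print(f"small files: {small_files_mb_s(d):8.1f} MB/s")
```

If your code looks like the second workload, the first number tells you next to nothing, and vice versa. That is the decoupling I mean.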
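Second, the arithmetic behind the 98%/2% example, which is just Amdahl’s law. The function and numbers below are mine, purely for illustration: if a fraction p of the runtime is accelerated by a factor s, the overall speedup is 1 / ((1 - p) + p / s), so with p = 0.98 you are capped at 1/0.02 = 50x no matter how large s gets.

```python
# Amdahl's law: overall speedup when a fraction p of runtime
# is accelerated by a factor s. Illustration only.
def amdahl(p: float, s: float) -> float:
    return 1.0 / ((1.0 - p) + p / s)

p = 0.98  # 98% of the time in one routine, as in the example above
for s in (10, 50, 100, float("inf")):
    print(f"routine sped up {s:>5}x -> overall {amdahl(p, s):5.2f}x")
# Even as s -> infinity, the leftover 2% caps you at 1/0.02 = 50x.
```

That 50x ceiling is exactly the figure in the bullet above: it is the best case, not a typical result.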