Below you will find pages that utilize the taxonomy term “Hardware”
Posts
With every update, MacOSX becomes harder to build for
Way back in the good old 90s, we had very different versions of various Unix systems: SunOS/Solaris, Irix, AIX, HP/UX, this upstart Linux, and some BSD things floating about. Of course, Windows NT and others were starting to peek out then, and they had a “POSIX subsystem”.
Cross-platform builds were, generally speaking, a nightmare. While POSIX is a spec, writing to it didn’t guarantee that your application would work on a range of machines and OSes.
Posts
#HPC in all the things
I read this announcement this morning. Our friends at Facebook are releasing their reduced precision server side convolution and GEMM operations.
Many years ago, I tried to convince people that HPC moves both down market, into lower cost hardware, and more widely, into more software toolchains. Basically, the decades of experience building very high performance applications and systems will have value downstream for many users over time.
GEMM is a generalized approach to a matrix multiply, which has been well optimized for HPC applications in various scientific libraries over time.
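For anyone who hasn't met it, the BLAS-style GEMM operation computes C ← αAB + βC. Here is a naive pure-Python sketch of that definition. It is only an illustration of what the operation does, not how the tuned or reduced precision kernels mentioned above are written; the function name `gemm` and the list-of-rows matrix layout are my own choices for the example.

```python
def gemm(alpha, a, b, beta, c):
    """Return alpha * (a @ b) + beta * c for matrices given as lists of rows."""
    m, k, n = len(a), len(b), len(b[0])
    if any(len(row) != k for row in a):
        raise ValueError("inner dimensions of a and b must match")
    if len(c) != m or any(len(row) != n for row in c):
        raise ValueError("c must be m x n")
    # Triple loop: each output element is a dot product of a row of a
    # with a column of b, scaled by alpha, plus beta times the old c.
    return [
        [alpha * sum(a[i][p] * b[p][j] for p in range(k)) + beta * c[i][j]
         for j in range(n)]
        for i in range(m)
    ]


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
c = [[1, 1], [1, 1]]
print(gemm(2, a, b, 3, c))
```

With alpha=1 and beta=0 this reduces to a plain matrix multiply, which is why GEMM is described as the generalized form.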
Posts
Looking forward to #SC18 next week and a discussion of all things #HPC
I’m attending SC18 next week. It’s been 3 years since I last attended (2015). Then we (@scalableinfo) had a large booth, lots of traffic, and showed off some of the first commercial NVMe high performance storage systems running BeeGFS over 100GbE.
I am looking forward to talking with as many people as I can, to get their perspectives on things. To see what they are thinking, hear what they are doing, and in which direction they are going.
Posts
Working on benchmarking ML frameworks
Nice machine we have here …
root@hermes:/data/tests# lspci | egrep -i '(AMD|NVidia)' | grep VGA
3b:00.0 VGA compatible controller: <a href="http://www.pny.com/nvidia-quadro-gp100">NVIDIA Corporation GP100GL</a> (rev a1)
88:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] <a href="http://www.tomshardware.com/reviews/amd-radeon-vega-frontier-edition-16gb,5128.html">Vega 10 XTX</a> [Radeon Vega Frontier Edition]

I want to see how tensorflow and many others run on each of the cards. The processor is no slouch either:

root@hermes:/data/tests# lscpu | grep "Model name"
Model name: Intel(R) Xeon(R) Gold 6134 CPU @ 3.
Posts
Oracle finally kills off Solaris and SPARC
This was making the rounds last week. Oracle seems to have a leak in its process, creating labels that trigger event notifications for people, for their packages. Solaris was decimated. More details at the links and at The Layoff. Honestly I had expected them to reach this point. I am guessing that they were contractually obligated for at least 7 years to provide Solaris/SPARC support to US government purchasers. SGI went through a similar thing with IRIX.
Posts
M&A and business things
First up, Tegile was acquired by Western Digital (WDC). This is in part due to WDC’s desire to be a one stop shop vertically integrated supplier for storage parts, systems, etc. This is how all of the storage parts OEMs needed to move, though Seagate failed to execute this correctly, selling off their array business in part to Cray. Toshiba … well … they have some existential challenges right now, and are about to sell off their profitable flash and memory systems business, if they can just get everyone to agree … This comes from the fact that spinning disk, while a venerable technology, has been effectively completely commoditized.
Posts
Cray "acquires" ClusterStor business unit from Seagate
Information at this link. It is being called a “strategic transaction”, though it likely came about as a result of Seagate doing some profound and deep thinking over what business it was in. Seagate has been weathering a storm, and has been working on re-orgs to deal with a declining disk market. They acquired ClusterStor as part of their earlier acquisition of Xyratex. Xyratex was the basis for the Cray storage platforms (post Engenio).
Posts
What reduces risk ... a great engineering and support team, or a brand name ?
I’ve written about approved vendors and the “one throat to choke” concept in the past. The short take from my vantage point as a small, not well known, but highly differentiated builder of high performance storage and computing systems … was that this brand specific focus was going to remove real differentiated solutions from market, while simultaneously lowering the quality and support of products in market. The concept of brand and marketing of a brand is about erecting barriers to market entry against the smaller folk who might have something of interest, and the larger folk who might come in with a different ecosystem.
Posts
I always love these breathless stories of great speed, and how VCs love them ...
Though, when I look at the “great speed”, it is often on par with or less than what Scalable Informatics sustained years before. From the SC13 show in 2013, on the show floor, after blasting through a POC at unheard of speed, and setting long standing records in the STAC-M3 benchmarks …
Article in question is in the Register. Some of the speeds and feeds:
* 200 microsecs latency
* 45GBps read bandwidth
* 15GBps write bandwidth
* 7 million IOPS

But then … a fibre connection.
Posts
pcilist: because sometimes you really, really need to know how your PCIe devices are configured
If you don’t know what I am talking about here, that’s fine. I’ll assume you don’t do hardware, or you call someone else when there is a hardware problem. If you think “well gee, don’t we have lspci? so why do we need this?” then you probably have not really tried to use lspci to find this information, or didn’t know it was available. Ok … what I am talking about.
Posts
Brings a smile to my face ... #BioIT #HPC accelerator
Way way back in the early aughts (2000’s), we had built a set of designs for an accelerator system to speed up things like BLAST, HMMer, and other codes. We were told that no one would buy such things, as the software layer was good enough and people didn’t want black boxes. This was part of an overall accelerator strategy that we had put together at the time, and were seeking to raise capital to build.
Posts
Another article about the supply crisis hitting #SSD, #flash, #NVMe, #HPC #storage in general
I’ve been trying to help Scalable Informatics customers understand these market realities for a while. Unfortunately, to my discredit, I’ve not been very successful at doing so … and many groups seem to assume supply is plentiful and cheap across all storage modalities. Not true. And not likely true for at least the rest of the year, if not longer. This article goes into some depth that I’ve tried to explain to others in phone conversations and private email threads.
Posts
Architecture matters, and yes Virginia, there are no silver bullets for performance
Time and time again, the day job had been asked to discuss how the solutions are differentiated. Time and time again, we showed benchmarks on real workloads that show significant performance deltas. Not 2 or 3 sigma measurements. More often than not, 2x -> 10x better. Yet … yet … we were asked, again and again, how we did it. We pointed to our architecture. But, they complained, isn’t it the same as X (insert your favorite volume vendor here)?
Posts
Another itch scratched
So there you are, with many software RAIDs. You’ve been building and rebuilding them. And somewhere along the line, you lost track of which devices were which. So somehow you didn’t clean up the last build right, and you thought you had a hot spare … until you looked at /proc/mdstat … and said … Oh … So. I wanted to do the detailed accounting, in a simple way. I want the tool to tell me if I am missing a physical drive (e.
Posts
Another fun bit of debugging
Ok … so here you are doing a code build. Your environment is all set. You have ample space. Lots of CPU, lots of RAM. All packages are up to date. You start your make. You have another window open with dstat running, just to kinda, sorta watch the system, while you are doing other things. And while you are working, you realize dstat has stopped scrolling. Strange, why would that be.
Posts
On expectations
This has happened multiple times over the last few months. Just variations on the theme as it were, so I’ll talk about the theme. The day job builds some of the fastest systems for storage and analytics in market. We pride ourselves on being able to make things go very … very fast. If it’s slow, IMO, it’s a bug. So we often get people contacting us with their requirements. These requirements are often very hard for our competitors, and fairly simple for us to address.
Posts
@scalableinfo 60 bay Unison with these: 3.6PB raw per 4U box
Color me impressed … Seagate and their 60TB 3.5inch SAS drive. Yes, the 60 bay Unison units can handle this. That would be 3.6PB per 4U unit. 10x 4U per 48U rack. 36PB raw per rack. 100PB in 3 racks, 30 racks for an exabyte (EB). The issue would be the storage bandwidth wall height. Doing the math, 60TB/(1GB/s) -> 6 x 10^4 seconds to empty/fill such a single unit. We can drive these about 50GB/s in a box, so a single box would be 3600TB/(50GB/s) or 7.
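The fill-time arithmetic above is easy to play with in a few lines of Python. This is a quick sketch using only the capacity and bandwidth figures quoted in the text, and assuming decimal (SI) units, so 1TB = 1000GB; the function name `fill_time_seconds` is just something I made up for the example.

```python
# Decimal (SI) units, as drive vendors quote them: 1 TB = 1000 GB (assumption).
GB_PER_TB = 1000


def fill_time_seconds(capacity_tb, bandwidth_gb_per_s):
    """Seconds needed to stream capacity_tb terabytes at a sustained bandwidth."""
    return capacity_tb * GB_PER_TB / bandwidth_gb_per_s


# Figures taken from the text above.
drive_s = fill_time_seconds(60, 1)       # one 60TB drive at 1GB/s
box_s = fill_time_seconds(60 * 60, 50)   # 60 such drives in a box at 50GB/s

for label, seconds in (("drive", drive_s), ("box", box_s)):
    print(f"{label}: {seconds:.0f} s ({seconds / 3600:.1f} hours)")
```

Swap in your own capacity and sustained bandwidth to see how tall the bandwidth wall is for a given design.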
Posts
Raw Unapologetic Firepower: kdb+ from @Kx
While the day job builds (hyperconverged) appliances for big data analytics and storage, our partners build the tools that enable users to work easily with astounding quantities of data, and do so very rapidly, and without a great deal of code. I’ve always been amazed at the raw power in this tool. Think of a concise functional/vector language, coupled tightly to a SQL database. It’s not quite an exact description; have a look at Kx’s website for a more accurate one.
Posts
Systemd and non-desktop scenarios
So we’ve been using Debian 8 as the basis of our SIOS v2 system. Debian has a number of very strong features that make it a fantastic basis for developing a platform … for one, it doesn’t have significant negative baggage/technical debt associated with poor design decisions early on in the development of the system as others do. But it has systemd. I’ve been generally non-committal about systemd, as it seemed like it should improve some things, at a fairly minor cost in additional complexity.
Posts
Talk from #Kxcon2016 on #HPC #Storage for #BigData analytics is up
See here, which was largely about how to architect high performance analytics platforms, and a specific shout out to our Forte NVMe flash unit, which is currently available in volume starting at $1 USD/GB. Some of the more interesting results from our testing:
* 24GB/s bandwidth largely insensitive to block size.
* 5+ Million IOPs random IO (5+MIOPs) sensitive to block size.
* 4k random read (100%) were well north of 5M IOPs.
Posts
new SIOS feature: compressed ram image for OS
Most people use squashfs which creates a read-only (immutable) boot environment. Nothing wrong with this, but this forces you to have an overlay file system if you want to write. Which complicates things … not to mention when you overwrite too much, and run out of available inodes on the overlayfs. Then your file system becomes “invalid” and Bad-Things-Happen(™). At the day job, we try to run as many of our systems out of ram disks as we can.
Posts
It is 2016 ... why am I fighting with LDAP authentication in linux? Why doesn't it just work?
Ok … very long story that boils down to us trying to help a customer out. I am trying to avoid the “let’s just add another user to /etc/passwd” or similar such thing. And they aren’t quite ready to hook into AD or similar. So we have this issue. I want to enable their nodes to use ldap. I’ve done this before for other customers with older tools (pam_ldap, etc.). But it was somewhat crazy (as in non-trivial), involving gnashing of teeth, gums, etc.
Posts
Not even breaking a sweat: 10GB/s write to single node Forte unit over 100Gb net #realhyperconverged #HPC #storage
TL;DR version: 10GB/s write, 10GB/s read in a single 2U unit over 100Gb network to a backing file system. This is tremendous. The system and clients are using our default tuning/config. Real hyperconvergence requires hardware that can move bits to/from storage/networking very quickly. This is that. These units are available. Now. In volume. And are very reasonably priced (starting at $1USD/GB). Contact us for more details. This is with a file system …
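To put 10GB/s in context against the wire, here is a rough sketch of the unit conversion. It assumes “100Gb” means 100 gigabits per second of raw line rate and ignores protocol and encoding overhead entirely, so treat the result as an upper-level sanity check rather than a measurement; `line_rate_fraction` is a hypothetical helper name.

```python
def line_rate_fraction(payload_gb_per_s, link_gbit_per_s):
    """Fraction of raw link rate consumed by a payload rate.

    payload_gb_per_s is in gigabytes/s, link_gbit_per_s in gigabits/s.
    Protocol overhead is ignored (assumption).
    """
    return payload_gb_per_s * 8 / link_gbit_per_s


fraction = line_rate_fraction(10, 100)
print(f"10GB/s is {fraction:.0%} of a 100Gb link")
```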
Posts
Massive unapologetic storage firepower part 4: On the test track with a Forte unit ... vaaaaROOOOOOMMMMMMM!!!!!
I am trying to help people conceptualize the experience. Here is a video depicting very fast, very powerful cars and their sound signatures.
This is a good start. Take one of those awesome machines, and turn off half the engine. So it is literally running with 1/2 of its power turned off. Remember this. There will be a quiz. As we flippantly noted in the video, this is face-melting performance. Had I any hair left, it would have been blown way back.
Posts
When infinite resources aren't, and why software assumes they are infinite
We’ve got customers with very large resource machines. And software that sees all those resources and goes “gimme!!!!”. So people run. And then more people use it. And more runs. Until the resources are exhausted. And hilarity (of the bad kind) ensues. These are firedrills. I get an open ticket that “there must be something wrong with the hardware”, when I see all the messages in console logs being pulled in from ICL saying “zOMG I am out of ram ….
Posts
M&A: NetApp grabs SolidFire
This one has been in the rumor mill for a while. NetApp has been needing something to play well in the all flash array space, and it now has something. This said, the array space is very much on the decline certainly with respect to dumb JBODs and smart “filer heads”. That design is being retired in favor of smarter and hyperconverged systems. Such as Unison with Ceph, Forte, and related HCI (hyper converged infrastructure) systems.
Posts
Testing a new @scalableinfo Unison #Ceph appliance node for #hpc #storage
Simple test case, no file system … using raw devices, what can I push out to all 60 drives in 128k chunks. Actually this is part of our burn-in test series, I am looking for failures/performance anomalies.
----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw
  0   1  95   5   0   0| 513M    0 | 480B    0 |   0     0 |  10k   20k
  4   2  94   0   0   0|   0     0 | 480B    0 |   0     0 |5238   721
  0   2  98   0   0   0|   0     0 | 480B    0 |   0     0 |4913   352
  0   2  98   0   0   0|   0     0 | 570B   90B|   0     0 |4966   613
  0   2  98   0   0   0|   0     0 | 480B    0 |   0     0 |4912   413
  0   2  98   0   0   0|   0     0 | 584B   92B|   0     0 |4965   334
  0   2  98   0   0   0|   0     0 | 480B    0 |   0     0 |4914   306
  0   2  98   0   0   0|   0     0 | 636B  147B|   0     0 |4969   483
  0   2  98   0   0   0|   0     0 | 570B    0 |   0     0 |4915   377
  8   8  50  32   0   2|7520k 8382M| 578B    0 |   0     0 |  76k  215k
  9   7  30  52   0   3|8332k   12G| 960B  132B|   0     0 | 109k  279k
 10   5  29  53   0   2|4136k   12G| 240B    0 |   0     0 | 109k  277k
 12   6  29  51   0   2|4208k   12G| 240B    0 |   0     0 | 108k  280k
 11   6  31  50   0   2|2244k   12G| 330B   90B|   0     0 | 109k  281k
 11   6  30  50   0   3|2272k   13G| 240B    0 |   0     0 | 110k  281k

Writes around 12.
Posts
Video interview: face melting performance in #hpc #nvme #storage @scalableinfo
Oh no … we didn’t say “face melting” … did we? Oh. Yes. We. Did. The interview is here at the always wonderful InsideHPC.com. You can see the video itself here on YouTube, but read Rich’s transcript. I was losing my voice, and he captured all of the interview in text. Take home messages: Insane IO/Networking/processing performance, small footprint, tiny price, available for orders now.
Posts
There are no silver bullets, 2015 edition
In Feb 2013, I opined (with some measure of disgust) that people were looking at various software packages as silver bullets, these magical bits of a stack which could suddenly transform massive steaming piles of bits (big … uh … “data” ?) into golden nuggets of actionable data. Many of the “solutions” marketed these days are exactly like that … “add our magic bean software to your pipeline and you will gain insight faster.
Posts
A wonderful read on metrics, profiling, benchmarking
Brendan Gregg’s writings are always interesting and informative. I just saw a link on hacker news to a presentation he gave on “Broken Performance Tools”. It is wonderful, and succinctly explains many things I’ve talked about here and elsewhere, but it goes far beyond what I’ve grumbled over. One of my favorite points in there is slide 83. “Most popular benchmarks are flawed” and a pointer to a paper (easy to google for).
Posts
Shiny #HPC #storage things at #SC15
Assuming everything goes as planned (HA!) we should have a number of very cool things at SC15.
* 100Gb [Unison storage system with BeeGFS](https://scalableinformatics.com/unison)
* 100Gb [Unison Ceph](https://scalableinformatics.com/unison) system
* 100Gb connection to a partner/customer booth
* Forte

100Gb is awesome. The first time I ran an iperf bidirectional test, saw 20GB/s … it blew me away. 40/56GbE is old hat now, and 10GbE is in the rapidly receding past.
Posts
Cat peeking out of bag: Schedule of presentations and talks in our booth for SC15 is up
I mentioned previously that we have some new (shiny) things … and it looks like you’ll be able to hear about them at my talk. See the schedule for timing information. This said, please note that we have a terrific line up of people giving talks:
* Fintan Quill from Kx on kdb+ … which is an awesome market leading Big Data Time Series analytics and database tool that runs absolutely balls-out insanely fast on our architecture
* Christian Mohrbacher from Thinkparq on BeeGFS … the primary parallel file system we are leveraging for Unison parallel file system appliances
* Mark Nelson from Inktank/Red Hat on Ceph … the reliable block and object storage system that we’ve built into our Unison Object/Block Storage appliance
* Doug Eadline from Basement Supercomputing on Hadoop, and likely showing a Limulus deskside Hadoop appliance
* Phil Mucci from Minimal Metrics on optimization problems for systems and code.
Posts
M&A: EMC gobbled by Dell
Need to think how this will play out. The Register’s take is here. It seems that this will solve the “shareholder value” problem indicated by Elliott Management (e.g. they wanted more return on their investment). As part of increasing the return and value returned to shareholders, EMC had been in a cost cutting mode. Layoffs have been in process, and likely products trimmed or refocused. Once this goes through (assuming regulators won’t protest), Dell will have
Posts
As the benchmark cooks
We are involved in a fairly large benchmark for a potential customer. I won’t go into many specifics, though I should note that lots of our Unison units are involved. Current architecture has 5 storage nodes (6th was temporarily removed to handle a customer issue). Each Unison node has a pair of 56GbE NICs, as well as our appliance OS, and bunches of other goodness (quite a bit of flash). Total capacity for test is of order 200TB of flash.
Posts
rebuilding our kernel build system for fun and profit
No, really mostly to clean up an accumulation of technical debt that was really bugging the heck out of me. I like Makefiles and I cannot lie. So I like encoding lots of things in them. But it wound up hardwiring a number of things that shouldn’t have been hardwired. And made the builds brittle. When you have 2 released/supported kernels, and a handful of experimental kernels, it gets hard making changes that will be properly reflected across the set.
Posts
Drama at Violin Memory
Violin has had a rather tumultuous time in market. Post IPO, they’ve not had a great time selling. They have an interesting product, but with SanDisk coming out with their kit, and many others in the competitive flash array space, this can’t be a fun time for them. They don’t have a large installed base to protect, and their competitors are numerous and fairly well funded. Add to the mix that, as a post-IPO public company, they no longer have the luxury of not hitting targets … they will get slaughtered in the market.
Posts
On storage unicorns and their likely survival or implosion
The Register has a great article on storage unicorns. Unicorns are not necessarily mythical creatures in this context, but very high valuation companies that appear to defy “standard” valuation norms, and hold onto their private status longer than those in the past. That is, they aren’t in a rush to IPO or get acquired.
The article goes on to analyze the “storage” unicorns, those in the “storage” field. They admix storage, nosql, hyperconverged, and storage as a service.
Posts
Imitation and repetition is a sincere form of flattery
A few years ago, we demonstrated some truly awesome capability in single racks and on single machines. We had one of our units (now at a customer site), specifically the unit that set all those STAC M3 records, showing this:
and a rack of our units (now providing high performance cloud service at a customer site)
for 8k random reads across 0.25 PB of storage on a very fast 40GbE backbone.
Posts
diagnostics
This is something of a hard post to write, for a number of reasons, not the least of which is that the topic comes as something of a surprise to me. I am just going to state it, and then discuss it. The vast majority of people (and companies) out there, who think they know something of hardware/software/system level diagnostics and problem identification (from newbie to “veteran”) are either full of it, or really clueless.
Posts
Booth at BioIT World 15 in Boston
Should be fun, we will have booth (#461) on the side near the thoroughfare for the talks. Our HPC on Wall Street booth looked like this:
[ ](/images/HPConWS-booth-spring2015.jpg)
The display on the monitor is from our FastPath Cadence machine, and is part of the performance dashboard, built upon InfluxDB, Grafana, sios-metrics, and influxdbcli. Here is a blown up view, note the vertical axes for BW (GB/s) and IOPs.
[ ](/images/cadence-dash-spring2015.jpg)
Posts
The world’s fastest hyper-converged appliance is faster and more affordable than ever
This is a very exciting hyper-converged system, representing our next generation of time series, and big data analytical systems. Tremendous internal bandwidths coupled with massive internal parallelism, and minimal latency design on networks. This unit has been designed to focus upon delivering the maximal performance possible in as minimal a footprint … both rack based and cost wise … as possible. You can use these as independent stand alone units, or integrate them into a larger FastPath Unison system. We have our software stack (SIOS) integrated onto each unit, and include our builds of Python + Pandas/SciPy/NumPy, R, and Perl.
Posts
Interesting Q1 so far for day job
Our Q1 is usually quiet, fairly low key. Not this one. Looks like lots of pent up demand. We are deep into record territory, running 200+% of normal, with possibility of more. Another new wrinkle is that our small investment round is mostly complete. This is new territory for us, and you may have noticed I’d backed off posting intensity over the last half year or so while this was going on.
Posts
love/hate relationship with new hardware
One of the dangers of dealing with newer hardware is that, often, it doesn’t work so well. Or the drivers get hosed in mysterious ways. We’ve got some nice shiny new 10GbE cards for a set of Unison systems going into a customer next week. We had some very odd issues with other 10GbE cards, so we rolled over to newer design cards. Younger silicon, younger design. Newer kernel module. I can’t say I am enjoying this experience thus far.
Posts
When the revolution hits in force ...
Our machines will be there, helping power the genomics pipelines to tremendous performance. Performance is an enabling feature. Without it you cannot even begin to hope to perform massive scale analytics. With it, you can dream impossible dreams. This article came out talking about a massive performance analytics pipeline at Nationwide Children’s Hospital in Ohio. This pipeline runs on a cluster attached to Scalable Informatics FastPath Unison storage. This is a very dense, very fast system, interconnected with Mellanox FDR Infiniband, Chelsio 40GbE, and leveraging BeeGFS from thinkparq.
Posts
Finally, a desktop Linux that just works
I’ve been a user of Linux on the desktop, as my primary desktop, for the last 16 years. In that time, I’ve had laptops with Windows flavors (95, XP, 2000, 7), a MacOSX desktop. Before that, my first laptop I had bought (while working on my thesis) was a triple boot job, with DOS, Windows 9x, and OS2. I used the latter for when I was traveling and needed to write; the thesis was written in LaTeX and I could easily move everything back and forth between that and my Indy at home, and my office Indigo.
Posts
Coraid may be going down
According to The Register. No real differentiation (AoE isn’t that good, and the Seagate/Hitachi network drives are going to completely obviate the need for such things). We once used and sold Coraid to a customer. The linux client side wasn’t stable. iSCSI was coming up and was actually quite a bit better. We moved over to it. This was during our build vs buy phase. We weren’t sure if we could build a better box.
Posts
Inventory reduction @scalableinfo
It’s that time of year, when the inventory fairies come out and begin their counting. Math isn’t hard, but the day job would like a faster and easier count this year. So, the day job is working on selling off existing inventory. We have 4 units ready to go out the door to anyone in need of 70-144TB usable storage at 5-6 GB/s per unit. Specs are as follows:
* 16-24 processor cores
* 128 GB RAM
* 48x {2,3,4} TB top mount drives
* 4x rear mount SSDs (OS/metadata cache)
* Scalable OS (Debian Wheezy based Linux OS)
* 3 year warranty

As this is inventory reduction, the more inventory you take, the happier we are (and the less work that the inventory fairies have to do).
Posts
Systemd, and the future of Linux init processing
An interesting thing happened over the last few months and years. Systemd, a replacement init process for Linux, gained more adherents, and supplanted the older style init.d/rc scripting in use by many distributions. Ubuntu famously abandoned init.d style processing in favor of upstart and others in the past, and has been rolling over to systemd. Red Hat rolled over to Systemd. As have a number of others. Including, surprisingly, Debian. For those who don’t know what this is, think of it this way.
Posts
#SC14 day 2: @LuceraHQ tops @scalableinfo hardware ... with Scalable Info hardware ...
Report XTR141111 was just released by STAC Research for the M3 benchmarks. We are absolutely thrilled, as some of our records were bested by newer versions of our hardware with a newer software stack. Congratulations to Lucera, STAC Research for getting the results out, and the good folks at McObject for building the underlying database technology. This result continues and extends Scalable Informatics’ domination of the STAC M3 results. I’ll check to be sure, but I believe we are now the hardware side of most of the published records.
Posts
Starting to come around to the idea that swap in any form, is evil
Here’s the basic theory behind swap space. Memory is expensive, disk is cheap. Only use the faster memory for active things, and aggressively swap out the less used things. This provides a virtual address space larger than physical/logical memory. Great, right? No. Here’s why.
Swap makes the assumption that you can always write/read to persistent memory (disk/swap). It never assumes persistent memory could have a failure. Hence, if some amount of paged data on disk suddenly disappeared, well … Put another way, it increases your failure likelihood, by involving components with higher probability of failure into a pathway which assumes no failure.
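The failure likelihood argument can be made concrete with a small sketch. It treats the memory path as a series system where every component has to work, and assumes failures are independent. The probabilities below are made up purely for illustration; they are not measured failure rates for any real hardware.

```python
def path_failure_probability(component_failure_probs):
    """Probability that at least one component in a series path fails,
    assuming the failures are independent of each other."""
    survive = 1.0
    for p in component_failure_probs:
        survive *= 1.0 - p
    return 1.0 - survive


# Hypothetical failure probabilities over some period, illustration only.
P_RAM = 0.01
P_SWAP_DISK = 0.03

ram_only = path_failure_probability([P_RAM])
ram_plus_swap = path_failure_probability([P_RAM, P_SWAP_DISK])
print(f"RAM only:          {ram_only:.4f}")
print(f"RAM + swap device: {ram_plus_swap:.4f}")
```

Whatever numbers you plug in, adding a component to the path can only raise the combined failure probability, which is the point being made about swap.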
Posts
30TB flash disk, Parallel File System, massive network connectivity
This will be fun to watch run …
Scalable Informatics FastPath Unison for the win!
Posts
SC14 T minus 6 and counting
Scalable’s booth is #3053. We’ll have some good stuff, demos, talks, and people there. And coffee. Gotta have the coffee. More soon, come by and visit us!
Posts
massive unapologetic firepower part 2 ... the dashboard ...
For Scalable Informatics Unison product. The whole system:
[ ](/images/dash-2.png)
Watching writes go by:
[ ](/images/dash-3.png)
Note the sustained 40+ GB/s. This is a single rack sinking this data, and no SSDs in the bulk data storage path. This dashboard is part of the day job’s FastPath product.
Posts
Updated boot tech in Scalable OS (SIOS)
This has been an itch we’ve been working on scratching a few different ways, and it’s very much related to forgoing distro based installers. Ok, first the back story. One of the things that has always annoyed me about installing systems has been the fundamental fragility of the OS drive. It doesn’t matter if it’s RAIDed in hardware/software. It’s a pathway that can fail. And when it fails, all hell breaks loose.
Posts
Interesting bits around EMC
In the last few days, issues around EMC have become publicly known. EMC is the world’s largest and most profitable storage company, and has a federated group of businesses that are complementary to it. The CEO, Joe Tucci, is stepping down next year, and there is a succession “process” going on. Couple this to a fundamental shift in storage, from arrays to distributed tightly coupled server storage, such as Unison, which is problematic for their core business.
Posts
Comcast finally fixed their latency issue
This has been a point of contention for us for years. Our office has multiple network attachments, and Comcast is one of them. This is the main office, not the home office. Latency on the link, as measured by DNS pings, has always been fairly high, in the 2-3ms region, as compared to our other connection (using a different provider and a different technology) which has been consistently 0.5ms for the last 2 years.
Posts
Soon ... 12g goodness in new chassis
This is one of our engineering prototypes that we had to clear space for. A couple of new features I’ll talk about soon, but you should know that these are 12g SAS machines (will do 6g SATA of course as well).
Front of unit:
[ ](/images/IMG_2330.JPG)
Note the new logo/hand bar. The rails are also brand new, and are set to enable easy slide in/out even with 100+ lbs of disk in them.
Posts
M&A: PLX snarfed by ... Avago ?
Ok, didn’t see this acquirer coming, but PLX being bought … yeah, this makes sense. Avago looks like they are trying to become the glue between systems, whether the glue is a data storage fabric, or communications fabric, etc. PLX makes PCIe switches and other kit. PCIe switch and interconnection is the direction that many are converging to. Best end to end latencies, best per-lane performance, no protocol stack silliness to deal with.
Posts
Selling inventory to clear space
[Update 16-June] We’ve sold the 64 bay FastPath Cadence (siFlash based), and now we have a few more 60 bay hybrid Ceph and FhGFS units, as well as a 48 bay front mount siFlash. What’s coming in are many of our next gen 60 bay units, with a new backplane design, and we want to start running benchmarks with them ASAP. As we have limited space in our facility, we gotta make hard choices … Email me (landman@scalableinformatics.
Posts
Massive, unapologetic, firepower: 2TB write in 73 seconds
A 1.2PB single mount point Scalable Informatics Unison system, running an MPI job (io-bm) that just dumps data as fast as the little Infiniband FDR network will allow. Our test case. Write 2TB (2x overall system memory) to disk, across 48 procs. No SSDs in the primary storage. This is just spinning rust, in a single rack. This is performance pr0n, though safe for work.
usn-01:/mnt/fhgfs/test # df -H /mnt/fhgfs/
Filesystem Size Used Avail Use% Mounted on
fhgfs_nodev 1.
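The headline number works out as follows. This is a back-of-envelope sketch using only the figures quoted above (2TB, 73 seconds, 48 procs), and assuming decimal units with 1TB = 1000GB; the function name is mine, and the even per-process split is a simplification.

```python
def aggregate_gb_per_s(total_tb, seconds):
    """Average aggregate bandwidth in GB/s, assuming 1 TB = 1000 GB."""
    return total_tb * 1000 / seconds


agg = aggregate_gb_per_s(2, 73)   # 2TB written in 73 seconds
per_proc = agg / 48               # spread evenly over 48 MPI procs (simplification)

print(f"aggregate: {agg:.1f} GB/s")
print(f"per proc:  {per_proc:.2f} GB/s")
```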
Posts
Doing what we are passionate about
I am lucky. I fully admit this. There are people out there who will tell you that it’s pure skill that they have been in business and been successful for a long time. Others will admit luck is part of it, but will again pat themselves on the back for their intestinal fortitude. Few will say “I am lucky”. Which is a shame, as luck, timing (which you can never really, truly, control), and any number of other factors really are critical to one being able to have the luxury of doing what we are doing.
Posts
We had a record setting, knock the barn doors down year last year
… and believe it or not, I forgot to mention it. This is the first time in company history that we had a backlog going into Q1. Orders being built and tested on the last work day of the year. We grew, not the amount we had originally forecast, but we understand why (and sadly have little control over that aspect). We are working very hard on our appliances … I am blown away as to how perfect a fit they are for folks.
Posts
Yay, latest Java update broke Supermicro remote console
JRE 7 u 51. Self signed Java console applet. Let the hilarity begin. I tried uploading our own cert and key to the unit. No luck. It’s the applet that needs to be re-signed. This is the joyous message that awaits:
Of course, the IPMIview tool sorta kinda works. Though it’s useless for remote support ops. Doesn’t set off the signed issue. Mebbe they ignore signing? Which is worse … the self signed cert, or the sign ignoring app.
Posts
Calxeda restructures
The day job had been talking to and working with Calxeda for a while. They’ve been undergoing some changes over the last few months as they worked to transition from an evangelist to a systems builder. The day job just got a note that they are restructuring. What this specifically means to an outsider, I am not sure, though I could speculate. HP has a vested interest in them. I wouldn’t be surprised to see a rapid asset acquisition.
Posts
Day job at HPC on Wall Street on Monday the 9th
We’ll be showing off 2 appliances, with a change of what we are showing/announcing on one due to something not being ready on the business side. The first one is our little 108 port siRouter box. Think ‘bloody fast NAT’ and SDN in general, you can run other virtual/bare metal apps atop it.
The second will be a massive scale parallel SQL DB appliance. Usable for big data, hadoop like workloads, and other similar workloads more commonly used on other well known platforms.
Posts
... and the positions are now, finally open ...
See the Systems Engineering position here, and the System Build Technician position here. I’ll get these up on the InsideHPC.com site and a few others soon (tomorrow). But they are open now. For the Systems Engineering position, we really need someone in NYC area with a strong financial services background … Doug made me take out the “able to leap tall buildings in a single bound” line, as well as the “must be able to talk customers through complex vi sessions on system configuration files while driving 70 mph on a highway.
Posts
Massive. Unapologetic. Firepower. 24GB/s from siFlash
Oh yes we did. Oh yes. We did. This is the fastest storage box we are aware of, in market. This is so far outside of ram, and outside of OS and RAID level cache …
[root@siFlash ~]# fio srt.fio
...
Run status group 0 (all jobs):
READ: io=786432MB, aggrb=23971MB/s, minb=23971MB/s, maxb=23971MB/s, mint=32808msec, maxt=32808msec
This is 1TB read in 40 seconds or so. 1PB read in 40k seconds (1/2 a day).
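As a rough check of the arithmetic above, here is a minimal Python sketch that recomputes the aggregate bandwidth from the io and mint figures in the fio summary, then extrapolates to 1TB and 1PB. It assumes fio's MB means 2^20 bytes and that 1TB is 2^20 MB (binary units); the function names are hypothetical helpers, not part of fio.

```python
# Figures copied from the fio summary line above.
IO_MB = 786432        # io=786432MB
RUNTIME_MS = 32808    # mint=maxt=32808msec


def bandwidth_mb_per_s(io_mb: float, runtime_ms: float) -> float:
    """Aggregate bandwidth in MB/s from total I/O and wall-clock runtime."""
    return io_mb / (runtime_ms / 1000.0)


def seconds_to_read(size_mb: float, bw_mb_per_s: float) -> float:
    """Time to stream size_mb at a sustained bandwidth of bw_mb_per_s."""
    return size_mb / bw_mb_per_s


bw = bandwidth_mb_per_s(IO_MB, RUNTIME_MS)

# Assumption: binary units (1TB = 1024 * 1024 MB, 1PB = 1024 TB).
ONE_TB_MB = 1024 * 1024
ONE_PB_MB = 1024 * ONE_TB_MB

tb_seconds = seconds_to_read(ONE_TB_MB, bw)
pb_hours = seconds_to_read(ONE_PB_MB, bw) / 3600.0

print(f"{bw:.0f} MB/s, 1TB in {tb_seconds:.1f} s, 1PB in {pb_hours:.1f} h")
```

With these assumptions the result lands in the same ballpark as the "40 seconds or so" and half-a-day figures quoted above; decimal units would shave a few percent off both.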
Posts
Updated DeltaV4 quick benchies
Streaming reads and writes. Far beyond memory/cache/… all spinning disk. Remember, this is our “slow” storage.
[root@dv4-1 ~]# df -h /data
Filesystem Size Used Avail Use% Mounted on
/dev/md2 55T 65G 55T 1% /data
Run status group 0 (all jobs):
WRITE: io=65505MB, aggrb=1467.7MB/s, minb=1467.7MB/s, maxb=1467.7MB/s, mint=44633msec, maxt=44633msec
Run status group 0 (all jobs):
READ: io=65412MB, aggrb=1814.5MB/s, minb=1814.5MB/s, maxb=1814.5MB/s, mint=36050msec, maxt=36050msec
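For anyone who wants to sanity check summary lines like these, a small Python sketch follows. It pulls io, aggrb and mint out of each line with a regular expression and recomputes bandwidth as io divided by runtime. The regex only handles the MB and msec units seen in this output, and `parse_summary` is a hypothetical helper name, not an fio tool.

```python
import re

# Summary lines copied from the output above.
SUMMARY = """\
WRITE: io=65505MB, aggrb=1467.7MB/s, minb=1467.7MB/s, maxb=1467.7MB/s, mint=44633msec, maxt=44633msec
READ: io=65412MB, aggrb=1814.5MB/s, minb=1814.5MB/s, maxb=1814.5MB/s, mint=36050msec, maxt=36050msec
"""

# Assumption: units are always MB, MB/s and msec, as in the lines above.
LINE_RE = re.compile(
    r"(?P<op>READ|WRITE): io=(?P<io>\d+)MB, aggrb=(?P<aggrb>[\d.]+)MB/s,"
    r".*?mint=(?P<mint>\d+)msec"
)


def parse_summary(text):
    """Return {op: {"reported": MB/s from fio, "computed": io / runtime}}."""
    results = {}
    for match in LINE_RE.finditer(text):
        io_mb = int(match.group("io"))
        runtime_s = int(match.group("mint")) / 1000.0
        results[match.group("op")] = {
            "reported": float(match.group("aggrb")),
            "computed": io_mb / runtime_s,
        }
    return results


results = parse_summary(SUMMARY)
for op, r in results.items():
    print(f"{op}: reported {r['reported']:.1f} MB/s, "
          f"computed {r['computed']:.1f} MB/s")
```

The computed values should agree with the reported ones to within rounding, since the io figure in the summary is itself rounded to whole MB.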
Posts
We built that: 10 years in business
[warning: longer post] I mentioned this on twitter (@sijoe). The day job has been in business for 10 years. We’ve not taken outside investment to date, and we’ve not sold the company yet. We’ve been profitable and growing continuously during our lifetime. The preceding 3 years have seen growth, accelerating hard. The company was built starting with a conviction that practitioners and users of HPC systems needed better designs, better systems than were being pushed out by traditional vendors in the early 2000’s.
Posts
the mystery of the week
Customer has had a machine for a while. Generally stable. Followed our advice on doing a reboot recently. Unit started crashing Monday. Then today. Hard to stay up and stable. I asked if anything has changed, and haven’t gotten anything conclusive … mostly “we don’t think so”. About the crashes: Nothing in the logs. Not a thing. No hardware subsystem that has logging enabled (RAID, motherboard, PCIe, IPMI, … ) reports an error.
Posts
2 out of 3 ain't bad
No, not Meatloaf lyrics. A few years ago, I guessed that the HPC market was going to bifurcate or possibly trifurcate. Well, it’s about 3 years on, and bifurcate it did. Accelerators (in the form of GPUs) are everywhere. I was dead on correct in almost every aspect of what I had predicted (privately to VCs, from whom we couldn’t raise a cent in the early/mid 2000’s for this market). Remote cluster/clouds with dropping prices per CPU hour are taking over sections of HPC, and we see some impact upon purchase decisions made by people buying clusters.
Posts
Is this really a good idea?
Looks like HP is looking at ditching its PCs. First off, they are definitely killing off WebOS and the whole Palm business. Ok … WebOS looked interesting. Now having an Android, and an iPhone (about to be retired, which the Android is replacing), I find it hard to put down the iPhone and get excited about Android. I have a sense of … a less polished integration. Some things don’t work very well in Android.
Posts
Day job PR: JRTI and Scalable Informatics Form Strategic Partnership
Will be up on the day job site tomorrow. We are very excited by these developments, and look forward to a productive relationship.
JRTI and Scalable Informatics Form Strategic Partnership to Provide High Performance Storage and CPU & GPU Clusters to Organizations Seeking Exceptional Results Richmond, Virginia (January 18, 2011)-James River Technical, Inc (JRTI), specialists in accelerated and HPC solutions for the higher education, research, government, and commercial market segments, has entered into a reseller agreement with Scalable Informatics (Scalable) to provide Storage and HPC solutions throughout North America.
Posts
What is going on with SGI?
We are hearing about SGI wins on HPCwire and other venues. These should be good, and reflected in the stock price.
[ ](http://ichart.finance.yahoo.com/z?s=SGIC&t=2y&q=l&l=on&z=l&p=s&a=v&p=s)
But they aren’t. SGI’s market cap is 90.6M as of this morning, with 1500+ employees. Trailing 12 month revenue is 415M. They have 85M of debt. About 33.2M in cash. Something has got to give here. As they stopped making their own stuff, COGS increased, as their suppliers made more margin off completed product.
Posts
Tilera
Over at Accelerated Times, an article was posted about the Tilera. Now I haven’t heard much about Tilera, other than pre-releases. [update: look at the comment here] The author focuses on several important aspects: the business model, the money raise, and whether they are where they say they are.
What strikes me is that if they raised a B-round, this usually … usually happens post initial revenue, when you start to see interest and traction.
Posts
Guide to getting OFED 1.2 to build on OpenSuSE
Grab the tarball from the open fabrics alliance (or from here)
Grab the build_new.sh from here, and place it in the OFED-1.2 directory.
As root on your machine:
mv /usr/src/linux-2.6.18.2-34/include/linux/miscdevice.h /usr/src/linux-2.6.18.2-34/include/linux/miscdevice.h.original
ln -s /usr/include/linux/miscdevice.h /usr/src/linux-2.6.18.2-34/include/linux/miscdevice.h
Then run the build_new.sh. Voila. Works. Binary RPMs are here.
Posts
HPC in the critical path
Is high performance computing a critical path technology? Is it a technology that you cannot do without? This is a question some potential partners were discussing this evening. Very interesting question. If HPC is not critical, then demand for it should be quite moderate. If it is not critical, then the market would have basically replacement level growth rates. If end users did not see a value in HPC, they wouldn’t use it, as their time would be spent elsewhere.
Posts
Amusing
The IBM folks have turned the Blue Gene into what they claim is the world’s fastest BLAST engine. Interesting read. They use our A. thaliana data in the Bioinformatics Benchmark System v3 (BBS) to perform their measurement, as well as data from Aaron Darling for mpiBLAST. Our data had been in a mislabeled file for years, and I never took the time to rename the S. lycopersicum for the original Arabidopsis.