Below you will find pages that utilize the taxonomy term “Clouds”
Posts
#HPC in all the things
I read this announcement this morning. Our friends at Facebook are releasing their reduced precision server side convolution and GEMM operations.
Many years ago, I tried to convince people that HPC moves both down market, into lower cost hardware, and more widely into more software toolchains. Basically, the decades of experience building very high performance applications and systems will have value downstream for many users over time.
GEMM is the generalized matrix multiply operation, which has been well optimized for HPC applications in various scientific libraries over time.
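As a sketch of what GEMM actually computes (using numpy here as a stand-in for a tuned BLAS; Facebook's code is reduced precision and far more heavily optimized):

```python
import numpy as np

# GEMM computes C <- alpha * (A @ B) + beta * C
alpha, beta = 2.0, 0.5
A = np.arange(6, dtype=np.float64).reshape(2, 3)  # 2x3 matrix
B = np.ones((3, 2))                               # 3x2 matrix
C = np.zeros((2, 2))                              # accumulator

C = alpha * (A @ B) + beta * C
print(C)  # [[ 6.  6.] [24. 24.]]
```

The alpha/beta scaling is what makes GEMM "generalized": plain matrix multiply is just the alpha=1, beta=0 case.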
Well ... that was fun
So … I’ve had this blog since 2005. I installed it from the original sources, and WordPress made upgrades in the 2.x time frame quite painless.
Or so it seemed.
Slowly, over time, some configuration/settings/whatever got out of whack. And with the last update, from a system originally installed in final form in 2013 or so, something broke.
I am not sure what. But the symptoms were simple … new posts would replace the most recent posts.
On technology zealotry
I’ve encountered this in my career, at many places. Sadly, early in my career, I participated in some of this. You are a zealot for a particular form of tech if you believe it can do no wrong, and decry reports of issues or problems as “attacks”. You are a zealot against a particular form of tech if you cannot see it as a potentially useful and valuable portion of a solution stack, and (often gleefully) amplify reports of issues or problems.
What reduces risk ... a great engineering and support team, or a brand name ?
I’ve written about approved vendors and the “one throat to choke” concept in the past. The short take from my vantage point as a small, not well known, but highly differentiated builder of high performance storage and computing systems … was that this brand specific focus was going to remove real differentiated solutions from the market, while simultaneously lowering the quality and support of products in the market. The concept of a brand, and the marketing of a brand, is about erecting barriers to market entry against the smaller folk who might have something of interest, and the larger folk who might come in with a different ecosystem.
What is old, is new again
Way back in the pre-history of the internet (really DARPA-net/BITNET days), while dinosaur programming languages frolicked freely on servers with “modern” programming systems and data sets, there was a push to go from statically linked programs to more modular dynamic linking. The thinking was that it would save precious memory by not having many copies of libc statically linked into binaries. It would reduce file sizes, as most of your code would be in libraries.
ClusterHQ dies
ClusterHQ is now dead. They were an early container play, building a number of tools around Docker/etc. for the space. Containers are a step between bare metal and VMs. Flocker (ClusterHQ’s product) is open source, and they were looking to monetize it in a different way (not on acquisition, but on support). In this space though, Kubernetes reigns supreme. So competing products/projects need to adapt or outcompete. And it’s very hard to outcompete something like k8s.
A good read on realities behind cloud computing
In this article on the venerable Next Platform site, Addison Snell makes a case against some of the presumed truths of cloud computing. One of the points he makes is specifically something we run into all the time with customers, and yet this particular untruth isn’t really being reported the way our customers look at it. Sure, you are paying for the unused capacity. This is how utility models work. Tenancy is the most important measure to the business providing the systems.
Fully RAMdisk booted CentOS 7.2 based SIOS image for #HPC , #bigdata , #storage etc.
This is something we’ve been working on for a while … a completely clean, as baseline a distro as possible, version of our SIOS RAMdisk image using CentOS (and by extension, Red Hat … just need to point to those repositories). And it’s available to pull down and use as you wish from our download site. Ok, so what does it do? Simple. It boots an entire OS, into RAM. No disks to manage and worry over.
Raw Unapologetic Firepower: kdb+ from @Kx
While the day job builds (hyperconverged) appliances for big data analytics and storage, our partners build the tools that enable users to work easily with astounding quantities of data, and do so very rapidly, and without a great deal of code. I’ve always been amazed at the raw power in this tool. Think of a concise functional/vector language, coupled tightly to a SQL database. It’s not quite an exact description; have a look at Kx’s website for a more accurate one.
new SIOS feature: compressed ram image for OS
Most people use squashfs which creates a read-only (immutable) boot environment. Nothing wrong with this, but this forces you to have an overlay file system if you want to write. Which complicates things … not to mention when you overwrite too much, and run out of available inodes on the overlayfs. Then your file system becomes “invalid” and Bad-Things-Happen(™). At the day job, we try to run as many of our systems out of ram disks as we can.
It is 2016 ... why am I fighting with LDAP authentication in linux? Why doesn't it just work?
Ok … very long story that boils down to us trying to help a customer out. I am trying to avoid the “let’s just add another user to /etc/passwd” approach or similar. And they aren’t quite ready to hook into AD or similar. So we have this issue. I want to enable their nodes to use LDAP. I’ve done this before for other customers with older tools (pam_ldap, etc.). But it was somewhat crazy (as in non-trivial), involving gnashing of teeth, gums, etc.
Not even breaking a sweat: 10GB/s write to single node Forte unit over 100Gb net #realhyperconverged #HPC #storage
TL;DR version: 10GB/s write, 10GB/s read in a single 2U unit over a 100Gb network to a backing file system. This is tremendous. The system and clients are using our default tuning/config. Real hyperconvergence requires hardware that can move bits to/from storage/networking very quickly. This is that. These units are available. Now. In volume. And they are very reasonably priced (starting at $1 USD/GB). Contact us for more details. This is with a file system …
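For a sense of scale (my arithmetic, not a figure from the post): a 100Gb/s link tops out at 12.5 GB/s in each direction, so sustaining 10GB/s to a backing file system is running at roughly 80% of line rate.

```python
# Back-of-envelope check: how close 10 GB/s comes to the
# theoretical limit of a 100 Gb/s network link.
link_gbps = 100               # line rate, gigabits per second
link_gBps = link_gbps / 8     # 12.5 GB/s theoretical maximum
observed_gBps = 10.0          # sustained write rate from the post

fraction = observed_gBps / link_gBps
print(f"{fraction:.0%} of line rate")  # 80% of line rate
```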
Massive unapologetic storage firepower part 4: On the test track with a Forte unit ... vaaaaROOOOOOMMMMMMM!!!!!
I am trying to help people conceptualize the experience. Here is a video depicting very fast, very powerful cars and their sound signatures.
This is a good start. Take one of those awesome machines, and turn off half the engine. So it is literally running with 1/2 of its power turned off. Remember this. There will be a quiz. As we flippantly noted in the video, this is face-melting performance. Had I any hair left, it would have been blown way back.
"Unexpected" cloud storage retrieval charges, or "RTFM"
An article appeared on HN this morning. In it, the author noted that all was not well with the universe, as their backup, using Amazon’s Glacier product, wound up being quite expensive for a small backup/restore. The OP discovered some of the issues with Glacier when they began the restore (not commenting on performance, merely the costing). Basically, to lure you in, they provide very low up front costs. That is, until you try to pull the data back for some reason.
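A toy cost model, with made-up rates purely for illustration (these are not Amazon's actual prices), shows the shape of the problem: storage is priced to look nearly free, while retrieval carries the real cost.

```python
# Hypothetical cold-storage rates, chosen only to illustrate the
# pricing shape; real Glacier pricing differs and has changed over time.
store_per_gb_month = 0.004   # assumed: very cheap to park data
retrieve_per_gb = 0.09       # assumed: much more expensive to pull back

backup_gb = 500
months_stored = 12

storage_cost = backup_gb * store_per_gb_month * months_stored
restore_cost = backup_gb * retrieve_per_gb

print(f"storage for a year: ${storage_cost:.2f}")   # $24.00
print(f"one full restore:   ${restore_cost:.2f}")   # $45.00
```

Under these assumed rates, a single full restore costs nearly twice a whole year of storage, which is exactly the kind of surprise the OP hit.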
Video interview: face melting performance in #hpc #nvme #storage @scalableinfo
Oh no … we didn’t say “face melting” … did we? Oh. Yes. We. Did. The interview is here at the always wonderful InsideHPC.com. You can see the video itself on YouTube, but read Rich’s transcript. I was losing my voice, and he captured all of the interview in text. Take home messages: insane IO/networking/processing performance, small footprint, tiny price, available for orders now.
A wonderful read on metrics, profiling, benchmarking
Brendan Gregg’s writings are always interesting and informative. I just saw a link on Hacker News to a presentation he gave on “Broken Performance Tools”. It is wonderful, and succinctly explains many things I’ve talked about here and elsewhere, but it goes far beyond what I’ve grumbled over. One of my favorite points in there is slide 83: “Most popular benchmarks are flawed”, with a pointer to a paper (easy to google for).
Shiny #HPC #storage things at #SC15
Assuming everything goes as planned (HA!) we should have a number of very cool things at SC15.
* 100Gb [Unison storage system with BeeGFS](https://scalableinformatics.com/unison)
* 100Gb [Unison Ceph](https://scalableinformatics.com/unison) system
* 100Gb connection to a partner/customer booth
* Forte 100Gb is awesome. The first time I ran an iperf bidirectional test, saw 20GB/s … it blew me away. 40/56GbE is old hat now, and 10GbE is in the rapidly receding past.
On storage unicorns and their likely survival or implosion
The Register has a great article on storage unicorns. Unicorns are not necessarily mythical creatures in this context, but very high valuation companies that appear to defy “standard” valuation norms, and hold onto their private status longer than those in the past. That is, they aren’t in a rush to IPO or get acquired.
The article goes on to analyze the “storage” unicorns, those in the “storage” field. They admix storage, nosql, hyperconverged, and storage as a service.
Gmail lossy email system
For months I’ve been noting that my 2 different Gmail accounts (one for work on the business side using Google Apps for Business, and yes, paid for … and one for personal) are not getting all the emails sent to them. I’ve had customers reach out to me here at this site, as well as call me up to ask if I’ve been getting their email. It seems I’m not the only one, though the complaint here appears to be a bad filtering and characterization system.
When the revolution hits in force ...
Our machines will be there, helping power the genomics pipelines to tremendous performance. Performance is an enabling feature. Without it you cannot even begin to hope to perform massive scale analytics. With it, you can dream impossible dreams. This article came out talking about a massive performance analytics pipeline at Nationwide Children’s Hospital in Ohio. This pipeline runs on a cluster attached to Scalable Informatics FastPath Unison storage. This is a very dense, very fast system, interconnected with Mellanox FDR Infiniband, Chelsio 40GbE, and leveraging BeeGFS from thinkparq.
Amusing #fail
I use Mozilla’s thunderbird mail client. For all its faults, it is still the best cross platform email system around. Apple’s mail client is a bad joke and only runs on apple devices (go figure). Linux’s many offerings are open source, portable, and most don’t run well on my Mac laptop. I no longer use Windows apart from running in a VirtualBox environment. And I would never go back to OutLook anyway (used it once, 15 years ago or so … never again).
Inventory reduction @scalableinfo
It’s that time of year, when the inventory fairies come out and begin their counting. Math isn’t hard, but the day job would like a faster and easier count this year. So, the day job is working on selling off existing inventory. We have 4 units ready to go out the door to anyone in need of 70-144TB usable storage at 5-6 GB/s per unit. Specs are as follows:
* 16-24 processor cores
* 128 GB RAM
* 48x {2,3,4} TB top mount drives
* 4x rear mount SSDs (OS/metadata cache)
* Scalable OS (Debian Wheezy based Linux OS)
* 3 year warranty

As this is inventory reduction, the more inventory you take, the happier we are (and the less work that the inventory fairies have to do).
#SC14 day 2: @LuceraHQ tops @scalableinfo hardware ... with Scalable Info hardware ...
Report XTR141111 was just released by STAC Research for the M3 benchmarks. We are absolutely thrilled, as some of our records were bested by newer versions of our hardware with a newer software stack. Congratulations to Lucera, to STAC Research for getting the results out, and to the good folks at McObject for building the underlying database technology. This result continues and extends Scalable Informatics’ domination of the STAC M3 results. I’ll check to be sure, but I believe we are now the hardware side of most of the published records.
SC14 T minus 6 and counting
Scalable’s booth is #3053. We’ll have some good stuff, demos, talks, and people there. And coffee. Gotta have the coffee. More soon, come by and visit us!
massive unapologetic firepower part 2 ... the dashboard ...
For the Scalable Informatics Unison product. The whole system:
![](/images/dash-2.png)
Watching writes go by:
![](/images/dash-3.png)
Note the sustained 40+ GB/s. This is a single rack sinking this data, and no SSDs in the bulk data storage path. This dashboard is part of the day job’s FastPath product.
Updated boot tech in Scalable OS (SIOS)
This has been an itch we’ve been working on scratching a few different ways, and it’s very much related to forgoing distro based installers. Ok, first the back story. One of the things that has always annoyed me about installing systems has been the fundamental fragility of the OS drive. It doesn’t matter if it’s RAIDed in hardware or software. It’s a pathway that can fail. And when it fails, all hell breaks loose.
Comcast disabled port 25 mail on our business account
We have a business account at home. I work enough from home that I can easily justify it. Fixed IP, and I run services, mostly to back up my office services. One of those services is SMTP. I’ve been running an SMTP server, complete with antispam/antivirus/… for years. Handles backup for some domains, but is also primary for this site. This is allowable on business accounts. Or it was allowable. 3 days ago, they seem to have turned that off.
Massive, unapologetic, firepower: 2TB write in 73 seconds
A 1.2PB single mount point Scalable Informatics Unison system, running an MPI job (io-bm) that just dumps data as fast as the little Infiniband FDR network will allow. Our test case. Write 2TB (2x overall system memory) to disk, across 48 procs. No SSDs in the primary storage. This is just spinning rust, in a single rack. This is performance pr0n, though safe for work.
usn-01:/mnt/fhgfs/test # df -H /mnt/fhgfs/
Filesystem   Size  Used Avail Use% Mounted on
fhgfs_nodev  1.
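Sanity-checking the headline numbers from the post (my arithmetic, using the figures stated above):

```python
# 2 TB written in 73 seconds across 48 MPI processes.
total_tb = 2.0
seconds = 73.0
procs = 48

aggregate_gBps = total_tb * 1000 / seconds  # aggregate write rate
per_proc_gBps = aggregate_gBps / procs      # per-process share

print(f"aggregate: {aggregate_gBps:.1f} GB/s")  # ~27.4 GB/s
print(f"per proc:  {per_proc_gBps:.2f} GB/s")   # ~0.57 GB/s
```

That is roughly 27 GB/s of sustained writes to spinning disk in a single rack, or a bit over half a GB/s per MPI process.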
Doing what we are passionate about
I am lucky. I fully admit this. There are people out there who will tell you that it’s pure skill that has kept them in business and successful for a long time. Others will admit luck is part of it, but will, again, pat themselves on the back for their intestinal fortitude. Few will say “I am lucky”. Which is a shame, as luck, timing (which you can never really, truly, control), and any number of other factors really are critical to one being able to have the luxury of doing what we are doing.
We had a record setting, knock the barn doors down year last year
… and believe it or not, I forgot to mention it. This is the first time in company history that we had a backlog going into Q1. Orders were being built and tested on the last work day of the year. We grew, not by the amount we had originally forecast, but we understand why (and sadly have little control over that aspect). We are working very hard on our appliances … I am blown away by how perfect a fit they are for folks.
Calxeda restructures
The day job had been talking to and working with Calxeda for a while. They’ve been undergoing some changes over the last few months as they worked to transition from an evangelist to a systems builder. The day job just got a note that they are restructuring. What this specifically means to an outsider, I am not sure, though I could speculate. HP has a vested interest in them. I wouldn’t be surprised to see a rapid asset acquisition.
Day job at HPC on Wall Street on Monday the 9th
We’ll be showing off 2 appliances, with a change of what we are showing/announcing on one, due to something not being ready on the business side. The first one is our little 108 port siRouter box. Think ‘bloody fast NAT’ and SDN in general; you can run other virtual/bare metal apps atop it.
The second will be a massive scale parallel SQL DB appliance. Usable for big data, hadoop like workloads, and other similar workloads more commonly used on other well known platforms.
Update on IPMI Console Logger
Config now comes from some nice and simple JSON, and it handles multiple machines with aplomb. See the git repository for the latest. The config file example is in there, and you can replicate the n01-ipmi section with more nodes trivially. Coming next is getting config from a trusted web server, along with registering the client to the trusted web server. This prevents things like passwords from showing up in the clear, though you can always create a lower privileged user to access the console for monitoring.
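A hypothetical sketch of the replicate-a-section-per-node idea (the real schema and field names live in the git repository; the keys, hosts, and user below are all illustrative):

```python
import json

# Illustrative multi-node config: one section per machine, keyed by
# node name, mirroring the "replicate the n01-ipmi section" pattern.
config_text = """
{
  "n01-ipmi": { "host": "10.0.0.101", "user": "monitor" },
  "n02-ipmi": { "host": "10.0.0.102", "user": "monitor" }
}
"""

config = json.loads(config_text)
for name, node in sorted(config.items()):
    # a real logger would open an IPMI SOL session per node here
    print(f"{name}: console from {node['host']} as {node['user']}")
```

Using a lower-privileged user (here the hypothetical "monitor") in the config keeps real credentials out of it, in line with the note above.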