Interesting articles on systemd and ZFS
The systemd article is on LWN, and discusses the “tragedy” of it. The ZFS post was linked from Hacker News and discusses risk to ZFS’s future from the perspective of FreeBSD leveraging ZFS on Linux as its upstream. OK, first, on to systemd. For those who don’t know systemd, think of it as the Borg that ate init. And Upstart. And … Basically, it is a replacement infrastructure for running services on Linux.
Reflections on where we've been in HPC, and thoughts on where we are going
Looking back on past reviews from 2013 and a few other posts, and what has changed since then up to 2019 (it’s early, I know), I am struck by a particular thought I’ve expressed for decades now. In 2009 I wrote: “Down market, in this case, means wider use … explicit or implicit … integrated in more business processes. All the while, becoming orders of magnitude less expensive per computational operation, easier to use and interface with.”
Systems that are designed to fail, often do
I’ve been saying this for mumble decades. What I mean by “designed to fail” isn’t specifically that someone wants a system to fail. Rather, through various interactions, wishful thinking, and drinking of one’s own kool-aid, a system is placed on an inexorable path to failure. Without something to divert it in time, failure is the most probable outcome. Watching these failures unfold can strike terror in one’s heart. Especially when you realize that you yourself have not been able to nudge the system onto a sane path.
With every update, MacOSX becomes harder to build for
Way back in the good old 90s, we had very different versions of various unix systems. SunOS/Solaris, Irix, AIX, HP/UX, this upstart Linux, and some BSD things floating about. Of course, Windows NT and others were starting to peek out then, and they had a “POSIX subsystem”. Cross-platform builds were, generally speaking, a nightmare. While POSIX is a spec, writing to it didn’t guarantee that your application would work on a range of machines and OSes.
Opening keynote @Supercomputing #SC18 : #HPC is an enabling technology ...
… OK, the speaker said far more than that. But one of his central theses is that in this “second” machine revolution, we are enabling data-driven decision making, distributed decisions and consensus, as well as expanding beyond the confines of specific expertise in a field. The latter I’ve heard described as cross-fertilization … gather a bunch of smart people “together” and give them a problem spec. Let them run with it.
#HPC in all the things
I read this announcement this morning. Our friends at Facebook are releasing their reduced-precision server-side convolution and GEMM operations. Many years ago, I tried to convince people that HPC moves both down market, into lower-cost hardware, and more widely into more software toolchains. Basically, the decades of experience building very high performance applications and systems will have value downstream for many users over time. GEMM is a generalized approach to a matrix multiply, which has been well optimized for HPC applications in various scientific libraries over time.
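For readers unfamiliar with the operation: GEMM computes C ← αAB + βC. A toy sketch in plain Python shows the math (real libraries like those Facebook released use blocking, vectorization, and reduced-precision arithmetic rather than naive loops):

```python
def gemm(alpha, a, b, beta, c):
    """General matrix multiply: C <- alpha * A @ B + beta * C.

    Matrices are lists of lists; a is n x k, b is k x m, c is n x m.
    Purely illustrative -- no blocking or vectorization.
    """
    n, k, m = len(a), len(b), len(b[0])
    return [
        [alpha * sum(a[i][p] * b[p][j] for p in range(k)) + beta * c[i][j]
         for j in range(m)]
        for i in range(n)
    ]

A = [[1.0, 2.0], [3.0, 4.0]]
B = [[5.0, 6.0], [7.0, 8.0]]
C = [[0.0, 0.0], [0.0, 0.0]]
print(gemm(1.0, A, B, 0.0, C))  # [[19.0, 22.0], [43.0, 50.0]]
```

With α=1 and β=0 this reduces to an ordinary matrix product; the scaling terms are what make the interface general enough to fuse accumulation into one optimized call.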
Looking forward to #SC18 next week and a discussion of all things #HPC
I’m attending SC18 next week. It’s been 3 years since I last attended (2015). Then we (@scalableinfo) had a large booth, lots of traffic, and showed off some of the first commercial NVMe high performance storage systems running BeeGFS over 100GbE. I am looking forward to talking with as many people as I can, to get their perspectives on things. To see what they are thinking, hear what they are doing, and in which direction they are going.
A bug in s3 buckets with no apparent way to request support to deal with it
This is a fun one that I’ve been playing with for the last 5 days or so. I’m helping someone out with backups, and they changed their mind on what they wanted backed up. So I started deleting the backups they didn’t want. One of the machines contained a set of directories for hashdeep, which includes a number of test cases. One set of test cases is deeply linked directories. So, running aws s3 cp /localpath s3://yadda/yadda --recursive copied these and many other files up to the bucket.
Finally posted Tiburon on github
Tiburon specifically solves the problems of stateful vs. stateless boots, rolling forward/backward between images, and consistent booting with immutable images. Coupled with an image generator and a programmatic config environment (as in Nyble and other tools), you have the workings of the non-storage/networking parts of a cloud or cluster manager. The philosophy behind this has to do with the pain associated with config/OS drift, failed upgrades/rollbacks, failed boot drives, etc.
Well ... that was fun
So … I’ve had this blog since 2005. I installed it from original sources. And WP made upgrades quite painless in the 2.x time frame. Or so it seemed. Slowly, over time, some configuration/settings/whatever got out of whack. And with the last update, on a system originally installed in final form in 2013 or so, something broke. I am not sure what. But the symptoms were simple … new posts would replace the most recent posts.