Below you will find pages that utilize the taxonomy term “Software”
Posts
Well ... that was fun
So … I’ve had this blog since 2005. I installed it from original sources. And WP made upgrades in the 2.x time frame, quite painless.
Or so it seemed.
Slowly, over time, some configuration/settings/whatever got out of whack. And with the last update, from a system originally installed in final form in 2013 or so, something broke.
I am not sure what. But the symptoms were simple … new posts would replace the most recent posts.
Posts
On technology zealotry
I’ve encountered this in my career, at many places. Sadly, early in my career, I participated in some of this. You are a zealot for a particular form of tech if you can see it do no wrong, and decry reports of issues or problems as “attacks”. You are a zealot against a particular form of tech if you cannot see it as a potentially useful and valuable portion of a solution stack, and (often gleefully) amplify reports of issues or problems.
Posts
Interesting post on mixed integer programming for diets ... that has some hilarious output
I am a fan of the Julia language. Tremendously powerful analytical environment, compiled, high performance, easy to understand and use, strongly typed, … there’s a long list of reasons why I like it. If you are doing analytics, modeling, computation in other languages, it is definitely worth a look. Think of it as python, compiled, with multiple dispatch and strong typing … and no indent-as-structure problem. My Julia fanboi-ism aside, there was an interesting blog post about using JuMP, a linear programming environment for Julia.
Posts
Put my Riemann Zeta Function sum reduction code on github
Repo is here: https://github.com/joelandman/rzf. There’s a lightning talk to go along with it, and I’ll make sure I can get it together for this as well.
Posts
Disk, SSD, NVMe preparation tools cleaned up and on GitHub
These are a collection of (MIT licensed) tools I’ve been working on for years to automate some of the major functionality one needs when setting up/using new machines with lots of disks/SSD/NVMe. The repo is here: https://github.com/joelandman/disk_test_setup . I will be adding some sas secure erase and formatting tools into this. These tools wrap other lower level tools, and handle the process of automating common tasks you worry about when you are setting up and testing a machine with many drives.
Posts
Aria2c for the win!
I’ve not heard of aria2c before today. Sort of a super wget as far as I could tell. Does parallel transfers to reduce data motion time, if possible. So I pulled it down, built it. I have some large data sets to move. And a nice storage area for them. Ok. Fire it up to pull down a 2GB file. Much faster than wget on the same system over the same network.
Posts
Oracle finally kills off Solaris and SPARC
This was making the rounds last week. Oracle seems to have a leak in its process, creating labels that trigger event notifications for people, for their packages. Solaris was decimated. More details at the links and at The Layoff. Honestly I had expected them to reach this point. I am guessing that they were contractually obligated for at least 7 years to provide Solaris/SPARC support to US government purchasers. SGI went through a similar thing with IRIX.
Posts
Finally got to use MCE::* in a project
There is a set of modules in the Perl universe that I’ve been looking for an excuse to use for a while. They are the MCE set of modules, which purportedly enable easy concurrency and parallelism, exploiting many-core CPUs, and a number of techniques. Sure enough, I had a task to handle recently that required this. I looked at many alternatives, and played with a few, including Parallel::Queue. I thought of writing my own with IPC::Run as I was already using it in the project, but I didn’t want to lose focus on the mission, and re-invent a wheel that already existed elsewhere.
Posts
What reduces risk ... a great engineering and support team, or a brand name ?
I’ve written about approved vendors and the “one throat to choke” concept in the past. The short take from my vantage point as a small, not well known, but highly differentiated builder of high performance storage and computing systems … was that this brand specific focus was going to remove real differentiated solutions from market, while simultaneously lowering the quality and support of products in market. The concept of brand and marketing of a brand is about erecting barriers to market entry against the smaller folk who might have something of interest, and the larger folk who might come in with a different ecosystem.
Posts
On hackerrank and Julia
My new day job has me developing considerably less code than my previous endeavor, so I like to work on problems to keep these particular muscles in steady use. Happily, I get to do more analytics than ever before, so this at least is some compensation for the lower amount of coding. When I work on coding for myself, I’ll play with problems from my research days, or small throw-away ones, like on Hackerrank.
Posts
pcilist: because sometimes you really, really need to know how your PCIe devices are configured
If you don’t know what I am talking about here, that’s fine. I’ll assume you don’t do hardware, or you call someone else when there is a hardware problem. If you think “well gee, don’t we have lspci? so why do we need this?” then you probably have not really tried to use lspci to find this information, or didn’t know it was available. Ok … what I am talking about.
Posts
What is old, is new again
Way back in the pre-history of the internet (really DARPA-net/BITNET days), while dinosaur programming languages frolicked freely on servers with “modern” programming systems and data sets, there was a push to go from statically linking programs to more modular dynamic linking. The thought process was that it would save precious memory, not having many copies of libc statically linked in to binaries. It would reduce file sizes, as most of your code would be in libraries.
Posts
That was fun: mysql update nuked remote access
Update your packages, they said. It will be more secure, they said. I guess it was. No network access to the databases. Even after turning the database server instance to listen again on the right port, I had to go in and redo the passwords and privileges. So yeah, this broke my MySQL instance for a few hours. Took longer to debug as it was late at night and I was sleepy, so I put it off until morning with caffeine.
Posts
An article on Rust language for astrophysical simulation
It is a short read, and you can find it on arxiv. They tackled an integration problem, basically using the code to perform a relatively simple trajectory calculation for a particular N-body problem. A few things leapt out at me during my read. First, the example was fairly simplistic … a leapfrog integrator, and while it is a symplectic integrator, this particular algorithm is not quite high enough order to capture all the features of the N-body interaction they were working on.
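For readers who have not met a leapfrog integrator, here is a minimal sketch in Python (not the article's Rust code, and not its N-body problem). It assumes a 1D harmonic oscillator with unit mass and unit spring constant as a stand-in system; the function names `leapfrog` and `energy` are made up for illustration. It uses the kick-drift-kick form, which is second order in the time step.

```python
import math


def leapfrog(x, v, accel, dt, steps):
    """Advance position x and velocity v using kick-drift-kick leapfrog."""
    a = accel(x)
    for _ in range(steps):
        v_half = v + 0.5 * dt * a      # half kick
        x = x + dt * v_half            # full drift
        a = accel(x)                   # acceleration at the new position
        v = v_half + 0.5 * dt * a      # second half kick
    return x, v


def energy(x, v):
    """Total energy of the assumed unit-mass, unit-spring oscillator."""
    return 0.5 * v * v + 0.5 * x * x


if __name__ == "__main__":
    dt, steps = 0.01, 10_000
    x, v = leapfrog(1.0, 0.0, lambda pos: -pos, dt, steps)
    t = dt * steps
    print("position error vs cos(t):", abs(x - math.cos(t)))
    print("energy drift:", abs(energy(x, v) - 0.5))
```

Because the scheme is symplectic, the energy error stays bounded rather than growing steadily, but the second-order accuracy is exactly the limitation the excerpt points at.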
Posts
Brings a smile to my face ... #BioIT #HPC accelerator
Way way back in the early aughts (2000’s), we had built a set of designs for an accelerator system to speed up things like BLAST, HMMer, and other codes. We were told that no one would buy such things, as the software layer was good enough and people didn’t want black boxes. This was part of an overall accelerator strategy that we had put together at the time, and were seeking to raise capital to build.
Posts
when you eliminate the impossible, what is left, no matter how improbable, is likely the answer
This is a fun one. A customer has quite a collection of all-flash Unison units. A while ago, they asked us to turn on LLDP support for the units. It has some value for a number of scenarios. Later, they asked us to turn it off. So we removed the daemon. Unison ceased generating/consuming LLDP packets. Or so we thought. Fast forward to last week. We are being told that LLDP PDUs are being generated by the kit.
Posts
A new #HPC project on github, nlytiq-base
Another itch I’ve been wanting to scratch for a very long time. I had internal versions of a small version of this for a while, but I wasn’t happy with them. The makefiles were brittle. The builds, while automated, would fail, quite often, for obscure reasons. And I want a platform to build upon, to enable others to build upon. Not OpenHPC which is more about the infrastructure one needs for building/running high performance computing systems.
Posts
#Perl on the rise for #DevOps
Note: I do quite a bit of development in Perl, and have my own biases, so please do take this into consideration. It is one of many languages I use, but it is by and large, my current go-to language. I’ll discuss below. According to TIOBE (yeah, I know), Perl usage is on the rise. The linked article posits that this is for DevOps reasons. The author of the article works at a company that makes money from Perl and Python … they build (actually very good) tools.
Posts
Another itch scratched
So there you are, with many software RAIDs. You’ve been building and rebuilding them. And somewhere along the line, you lost track of which devices were which. So somehow you didn’t clean up the last build right, and you thought you had a hot spare … until you looked at /proc/mdstat … and said … Oh … So. I wanted to do the detailed accounting, in a simple way. I want the tool to tell me if I am missing a physical drive (e.
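The kind of accounting described here can be sketched in a few lines of Python. This is not the author's tool, only an illustration under stated assumptions: it parses text in the usual /proc/mdstat layout, the sample text is invented, and `parse_mdstat` and `missing_members` are hypothetical names. Real mdstat output has more variations (inactive arrays, rebuild progress lines) than this handles.

```python
import re

# Invented sample in /proc/mdstat format: md0 has a spare, md1 is degraded.
SAMPLE_MDSTAT = """\
Personalities : [raid1] [raid6] [raid5] [raid4]
md0 : active raid1 sdc1[2](S) sdb1[1] sda1[0]
      104320 blocks [2/2] [UU]

md1 : active raid5 sdf1[2] sdd1[0]
      2096896 blocks level 5, 64k chunk, algorithm 2 [3/2] [U_U]

unused devices: <none>
"""

ARRAY_RE = re.compile(r"^(md\w+) : (\w+) (\w+) (.+)$")
MEMBER_RE = re.compile(r"(\w+)\[(\d+)\](?:\((\w)\))?")
COUNT_RE = re.compile(r"\[(\d+)/(\d+)\]")


def parse_mdstat(text):
    """Return {array: info} with active, spare and failed member lists."""
    arrays = {}
    current = None
    for line in text.splitlines():
        header = ARRAY_RE.match(line)
        if header:
            name, state, level, members = header.groups()
            current = {"state": state, "level": level, "active": [],
                       "spares": [], "failed": [],
                       "expected": None, "present": None}
            for dev, _slot, flag in MEMBER_RE.findall(members):
                if flag == "S":
                    current["spares"].append(dev)
                elif flag == "F":
                    current["failed"].append(dev)
                else:
                    current["active"].append(dev)
            arrays[name] = current
            continue
        counts = COUNT_RE.search(line)
        if current is not None and counts:
            # [n/m]: n members expected, m currently present
            current["expected"] = int(counts.group(1))
            current["present"] = int(counts.group(2))
    return arrays


def missing_members(info):
    """Number of members the array expects but does not currently have."""
    if info["expected"] is None or info["present"] is None:
        return 0
    return info["expected"] - info["present"]


if __name__ == "__main__":
    for name, info in parse_mdstat(SAMPLE_MDSTAT).items():
        print(name, info["level"], "missing:", missing_members(info),
              "spares:", info["spares"])
```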
Posts
ClusterHQ dies
ClusterHQ is now dead. They were an early container play, building a number of tools around Docker/etc. for the space. Containers are a step between bare metal and VMs. Flocker (ClusterHQ’s product) is open source, and they were looking to monetize it in a different way (not on acquisition, but on support). In this space though, Kubernetes reigns supreme. So competing products/projects need to adapt or outcompete. And it’s very hard to outcompete something like k8s.
Posts
So it seems Java is not free
This article on The Register indicates that Oracle is now working actively to monetize java use. Given the spate of java hacks over the years, and the decidedly non-free nature of the language, I suspect we are going to see replacement development language use skyrocket, as people develop in anything-but-Java going forward. Think about the risks … you have a massive platform that people have been using with a fairly large number of compromises (client side certainly) … and now you need to start paying for the privilege of using the platform.
Posts
strace -p is your friend
So there I was, trying to use a serial port on a node which was connected to a serial port on a switch. Which I needed to properly configure the switch. So I light up minicom and get garbage. Great, a baud rate mismatch, easily fixed. Fix it. Connect again. I get the first 10-12 characters … and then garbage. Hmmm. I’d like to pause our story for a moment, and say I had the key insight at this moment … but that would not be true.
Posts
Finding unpatched "features" in distro packages
I generally expect baseline distro packages to be “old” by some measure. Even for more forward thinking distros, they generally (mis)equate age with stability. I’ve heard the expression “bug for bug compatible” when dealing with newer code on older systems. Something about the devil you know vs the devil you don’t. Ok. In this case, Cmake. A good development tool, gaining popularity over autotools and other things. Base SIOS image is on Debian 8.
Posts
On expectations
This has happened multiple times over the last few months. Just variations on the theme as it were, so I’ll talk about the theme. The day job builds some of the fastest systems for storage and analytics in market. We pride ourselves on being able to make things go very … very fast. If it’s slow, IMO, it’s a bug. So we often get people contacting us with their requirements. These requirements are often very hard for our competitors, and fairly simple for us to address.
Posts
Excellent article on mistakes made for infrastructure ... cloud jail is about right
Article is here at Firstround capital. This goes to a point I’ve made many many times to customers going the cloud route exclusively rather than the internal infrastructure route or hybrid route. Basically it is that the economics simply don’t work. We’ve used a set of models based upon observed customer use cases, and demonstrated this to many folks (customers, VCs, etc.) Many are unimpressed until they actually live the life themselves, have the bills to pay, and then really … really grok what is going on.
Posts
I don't agree with everything he wrote about systemd, but he isn't wrong on a fair amount of it
Systemd has taken the linux world by storm, replacing 20-ish year old init style processing with a more legitimate control plane, and a centralized resource to handle this control. There are many things to like within it, such as the granularity of control. But there are any number of things that are badly broken by default. Actually some of these things are specifically geared towards desktop users (which isn’t a bad thing if you are a desktop linux user, as I am).
Posts
How's this for a nice deskside system ... one of our Cadence boxen
For a partner. They made a request for something we’ve not built in a while … it had been end of lifed. One of our old Pegasus units. A portable deskside supercomputer. In this case, a deskside franken-computer … built out of the spare parts from other units in our lab. It started out as a 24 core monster, but we had a power supply burn out, and take the motherboard with it.
Posts
Fully RAMdisk booted CentOS 7.2 based SIOS image for #HPC , #bigdata , #storage etc.
This is something we’ve been working on for a while … a completely clean, as baseline a distro as possible, version of our SIOS RAMdisk image using CentOS (and by extension, Red Hat … just need to point to those repositories). And it’s available to pull down and use as you wish from our download site. Ok, so what does it do? Simple. It boots an entire OS, into RAM. No disks to manage and worry over.
Posts
Raw Unapologetic Firepower: kdb+ from @Kx
While the day job builds (hyperconverged) appliances for big data analytics and storage, our partners build the tools that enable users to work easily with astounding quantities of data, and do so very rapidly, and without a great deal of code. I’ve always been amazed at the raw power in this tool. Think of a concise functional/vector language, coupled tightly to a SQL database. It’s not quite an exact description; have a look at Kx’s website for a more accurate one.
Posts
Systemd and non-desktop scenarios
So we’ve been using Debian 8 as the basis of our SIOS v2 system. Debian has a number of very strong features that make it a fantastic basis for developing a platform … for one, it doesn’t have significant negative baggage/technical debt associated with poor design decisions early on in the development of the system as others do. But it has systemd. I’ve been generally non-committal about systemd, as it seemed like it should improve some things, at a fairly minor cost in additional complexity.
Posts
And this was a good idea ... why ?
The Debian/Ubuntu update tool is named “apt” with various utilities built around it. For the most part, it works very well, and software upgrades nicely. Sort of like yum and its ilk, but it pre-dates them. This tool is meant for automated (e.g. lights out) updates. No keyboard interaction should be required. Ever. For any reason. However … a recent update to one particular package, in Debian, and in Ubuntu, has resulted in installation/updates pausing.
Posts
new SIOS feature: compressed ram image for OS
Most people use squashfs which creates a read-only (immutable) boot environment. Nothing wrong with this, but this forces you to have an overlay file system if you want to write. Which complicates things … not to mention when you overwrite too much, and run out of available inodes on the overlayfs. Then your file system becomes “invalid” and Bad-Things-Happen(™). At the day job, we try to run as many of our systems out of ram disks as we can.
Posts
It is 2016 ... why am I fighting with LDAP authentication in linux? Why doesn't it just work?
Ok … very long story that boils down to us trying to help a customer out. I am trying to avoid the “let’s just add another user to /etc/passwd” or similar such thing. And they aren’t quite ready to hook into AD or similar. So we have this issue. I want to enable their nodes to use LDAP. I’ve done this before for other customers with older tools (pam_ldap, etc.). But it was somewhat crazy (as in non-trivial), involving gnashing of teeth, gums, etc.
Posts
The joys of automated tooling ... or ... catching changes in upstream projects workflows by errors in yours
We have an automated build process for our boot images. It is actually quite good, allowing us to easily integrate many different capabilities with it. These capabilities are usually encapsulated in various software stacks that provide specific functionality. Most of these stacks follow pretty well defined workflows. For a number of reasons, we find building from source generally easier than package installation, as there are often some, well, effectively random (and often poor) choices in build options/file placement in the package builds.
Posts
Not a fan of device mapper in Linux
Yeah, I know. It brings all manner of capabilities with it. It’s just the cost of these capabilities, when combined with other tools, like, say, Docker, that makes me not want to use it. To wit:
    root@ucp-01:~# ls -alF /var/lib/docker/devicemapper/devicemapper/
    total 52508
    drwx------ 2 root root 80 Jan 29 22:38 ./
    drwx------ 4 root root 80 Jan 29 22:38 ../
    -rw------- 1 root root 107374182400 Jan 29 22:39 data
    -rw------- 1 root root 2147483648 Jan 29 22:39 metadata
    root@ucp-01:~# ls -halF /var/lib/docker/devicemapper/devicemapper/
    total 52M
    drwx------ 2 root root 80 Jan 29 22:38 .
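The listing shows a `data` file with an apparent size of 107374182400 bytes (100 GiB) in a directory whose total allocation is about 52 MB, which is what sparse files look like. A small Python sketch of that gap between apparent and allocated size, assuming a Unix-like system where `os.stat` reports `st_blocks` in 512-byte units; the helper names and the 16 MiB size are illustrative only and have nothing to do with Docker itself:

```python
import os
import tempfile


def make_sparse(path, size):
    """Create a file of the given apparent size without writing any data."""
    with open(path, "wb") as handle:
        handle.truncate(size)


def apparent_and_allocated(path):
    """Return (apparent size in bytes, bytes actually allocated on disk)."""
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        target = os.path.join(workdir, "data")
        make_sparse(target, 16 * 1024 * 1024)
        apparent, allocated = apparent_and_allocated(target)
        print(f"apparent: {apparent} bytes, allocated: {allocated} bytes")
```

On most Linux file systems the allocated figure stays near zero until something writes into the file, at which point the space has to come from somewhere.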
Posts
When infinite resources aren't, and why software assumes they are infinite
We’ve got customers with very large resource machines. And software that sees all those resources and goes “gimme!!!!”. So people run. And then more people use it. And more runs. Until the resources are exhausted. And hilarity (of the bad kind) ensues. These are firedrills. I get an open ticket that “there must be something wrong with the hardware”, when I see all the messages in console logs being pulled in from ICL saying “zOMG I am out of ram ….
Posts
#Perl6 compiler betas are ready
Ok … I am … well … blown away. I had thought Perl6 would be the Duke Nukem Forever of programming languages. Indeed, it has been in active development for more than a decade. But you can download compilers (yes, you heard me right, compilers) for it now. You might say “why perl” or “why perl6” or “why now, because we have #insert(language_x) and it’s wonderful”. Good question, I wasn’t sure why it was relevant, until I started reading some of the code.
Posts
There are no silver bullets, 2015 edition
In Feb 2013, I opined (with some measure of disgust) that people were looking at various software packages as silver bullets, these magical bits of a stack which could suddenly transform massive steaming piles of bits (big … uh … “data” ?) into golden nuggets of actionable data. Many of the “solutions” marketed these days are exactly like that … “add our magic bean software to your pipeline and you will gain insight faster.
Posts
The 1980s called and want their software licensing models back
So here I am, the day before thanksgiving, fighting a battle with a reluctant license server that wants to compute a hash of internal bits on a machine, in order to use to unlock a license key, to let software run. This is not for us, but for a customer. At their site. This is the same model from the 1980s and early 90s. Create a hash from a collection of things (or a dongle you attach to a serial/parallel port).
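To make the model concrete, here is a hedged Python sketch of host-locked licensing. It is not any vendor's actual scheme: the attribute names, the secret, and the functions `host_fingerprint`, `issue_license`, and `license_is_valid` are all hypothetical. It hashes a collection of machine attributes into a fingerprint, then derives a key from that fingerprint.

```python
import hashlib
import hmac


def host_fingerprint(attributes):
    """Hash a dict of machine attributes in a stable (sorted) order."""
    digest = hashlib.sha256()
    for key in sorted(attributes):
        digest.update(f"{key}={attributes[key]}\n".encode("utf-8"))
    return digest.hexdigest()


def issue_license(secret, attributes):
    """Vendor side: derive a license key from the host fingerprint."""
    fingerprint = host_fingerprint(attributes).encode("ascii")
    return hmac.new(secret, fingerprint, hashlib.sha256).hexdigest()


def license_is_valid(secret, license_key, attributes):
    """Customer side: recompute the key on this host and compare."""
    return hmac.compare_digest(license_key, issue_license(secret, attributes))


if __name__ == "__main__":
    secret = b"hypothetical-vendor-secret"
    host = {"hostname": "node01", "mac": "00:11:22:33:44:55"}
    key = issue_license(secret, host)
    print("original host:", license_is_valid(secret, key, host))
    host["mac"] = "00:11:22:33:44:66"   # e.g. a replaced network card
    print("after NIC swap:", license_is_valid(secret, key, host))
```

The second check fails, which is the practical pain of the model: any change to the hashed "internal bits" invalidates the key.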
Posts
A wonderful read on metrics, profiling, benchmarking
Brendan Gregg’s writings are always interesting and informative. I just saw a link on hacker news to a presentation he gave on “Broken Performance Tools”. It is wonderful, and succinctly explains many things I’ve talked about here and elsewhere, but it goes far beyond what I’ve grumbled over. One of my favorite points in there is slide 83. “Most popular benchmarks are flawed” and a pointer to a paper (easy to google for).
Posts
Shiny #HPC #storage things at #SC15
Assuming everything goes as planned (HA!) we should have a number of very cool things at SC15.
* 100Gb [Unison storage system with BeeGFS](https://scalableinformatics.com/unison)
* 100Gb [Unison Ceph](https://scalableinformatics.com/unison) system
* 100Gb connection to a partner/customer booth
* Forte

100Gb is awesome. The first time I ran an iperf bidirectional test, saw 20GB/s … it blew me away. 40/56GbE is old hat now, and 10GbE is in the rapidly receding past.
Posts
sios-metrics core rewritten
This was a long time coming. Something I needed to do, in order to build a far better code capable of using less network, less CPU power, and providing a better overall system. In short, I ripped out the graphite bits and wrote a native interface to InfluxDB. This interface will also be adapted to kdb+ (32 bit edition), and graphite as time allows. In the process, I cleaned up a tremendous amount of code.
Posts
Updated net-tools bits
So far, 3 components, and working to fix a few things in formatting. On github, grab it here. First, lsbond.pl to report about bond details
    root@unison-mgr-1:~/net-tools# ./lsbond.pl
    bond0: mac 0c:c4:7a:48:69:cb state up mode fault-tolerance (active-backup) xmit_hash layer2 0 active slave eth1 polling 100 ms up_delay 200 ms down_delay 200 ms
      slave nics:
        eth1: mac 0c:c4:7a:48:69:cb, link 1, state up, speed 1000, driver igb, version 5.3.2.2 firmware version 1.61,0x8000090e
    bond1: mac 00:12:c0:80:26:76 state up mode fault-tolerance (active-backup) xmit_hash layer2 0 active slave eth3 polling 100 ms up_delay 200 ms down_delay 200 ms
      slave nics:
        eth2: mac 00:12:c0:80:26:76, link 1, state up, speed 10000, driver ixgbe, version 4.
Posts
rebuilding our kernel build system for fun and profit
No, really mostly to clean up an accumulation of technical debt that was really bugging the heck out of me. I like Makefiles and I cannot lie. So I like encoding lots of things in them. But it wound up hardwiring a number of things that shouldn’t have been hardwired. And made the builds brittle. When you have 2 released/supported kernels, and a handful of experimental kernels, it gets hard making changes that will be properly reflected across the set.
Posts
Been there, done that, even have a patent on it
I just saw this about doing a divide and conquer approach to massive scale genomics calculation. While not specific to the code in question, it looked familiar. Yeah, I think I’ve seen something like this before … and wrote the code to do it. It was called SGI GenomeCluster. It was original and innovative at the time, hiding the massively parallel nature of the computation behind a comfortable interface that end users already knew.
Posts
On storage unicorns and their likely survival or implosion
The Register has a great article on storage unicorns. Unicorns are not necessarily mythical creatures in this context, but very high valuation companies that appear to defy “standard” valuation norms, and hold onto their private status longer than those in the past. That is, they aren’t in a rush to IPO or get acquired.
The article goes on to analyze the “storage” unicorns, those in the “storage” field. They admix storage, nosql, hyperconverged, and storage as a service.
Posts
Gmail lossy email system
For months I’ve been noting that my 2 different Gmail accounts (one for work on the business side using Google Apps for business, and yes, paid for … and one for personal) are not getting all the emails sent to them. I’ve had customers reach out to me here at this site, as well as calling me up to ask me if I’ve been getting their email. Seems I’m not the only one, though the complaint here appears to be a bad filter and characterizing system.
Posts
diagnostics
This is something of a hard post to write, for a number of reasons, not the least of which is that the topic comes as something of a surprise to me. I am just going to state it, and then discuss it. The vast majority of people (and companies) out there, who think they know something of hardware/software/system level diagnostics and problem identification (from newbie to “veteran”) are either full of it, or really clueless.
Posts
Interesting Q1 so far for day job
Our Q1 is usually quiet, fairly low key. Not this one. Looks like lots of pent up demand. We are deep into record territory, running 200+% of normal, with possibility of more. Another new wrinkle is that our small investment round is mostly complete. This is new territory for us, and you may have noticed I’d backed off posting intensity over the last half year or so while this was going on.
Posts
influxdb cli queries now with regex
This is the way queries are supposed to work. Note the perl regex in the series name
    unison> select * from /^usn-ramboot.nettotals.kb(in|out)$/ limit 10
    D[23261] Scalable::TSDB::_generate_url; dbquery = 'select * from /^usn-ramboot.nettotals.kb(in|out)$/ limit 10'
    D[23261] Scalable::TSDB::_generate_url; query = 'p=XXXXXXXX&u=scalable&chunked=1&time_precision=s&q=select%20%2A%20from%20%2F%5Eusn-ramboot.nettotals.kb%28in%7Cout%29%24%2F%20limit%2010'
    D[23261] Scalable::TSDB::_generate_url; url = 'http://localhost:8086/db/unison/series?p=XXXXXXX&u=scalable&chunked=1&time_precision=s&q=select%20%2A%20from%20%2F%5Eusn-ramboot.nettotals.kb%28in%7Cout%29%24%2F%20limit%2010'
    D[23261] Scalable::TSDB::_send_chunked_get_query -> reading 0.009837s
    D[23261] Scalable::TSDB::_send_chunked_get_query -> bytes_received = 530B
    D[23261] Scalable::TSDB::_send_chunked_get_query return code = 200
    D[23261] Scalable::TSDB::_send_chunked_get_query cols = [time,sequence_number,usn-ramboot.nettotals.kbin]
    D[23261] Scalable::TSDB::_send_chunked_get_query cols = [time,sequence_number,usn-ramboot.
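The debug output shows the regex query being percent-encoded into the request URL. As an illustration only (this is Python, not the Perl in Scalable::TSDB), the same encoding can be reproduced with the standard library. The function name `build_query_url` is made up, and the URL layout is simply copied from the log line above.

```python
from urllib.parse import quote


def build_query_url(host, port, db, user, password, query):
    """Assemble a series-query URL in the shape seen in the debug log."""
    params = [
        ("p", password),
        ("u", user),
        ("chunked", "1"),
        ("time_precision", "s"),
        ("q", query),
    ]
    # safe="" forces "/", "*", "(", "|", "$" and friends to be escaped
    query_string = "&".join(f"{key}={quote(value, safe='')}"
                            for key, value in params)
    return f"http://{host}:{port}/db/{db}/series?{query_string}"


if __name__ == "__main__":
    print(build_query_url(
        "localhost", 8086, "unison", "scalable", "XXXXXXX",
        "select * from /^usn-ramboot.nettotals.kb(in|out)$/ limit 10",
    ))
```

The regex metacharacters survive the round trip because every reserved character in the query value is escaped before it goes on the wire.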
Posts
InfluxDB cli ready for people to play with
The code is on github. Installation should be simple

    sudo make INSTALLPATH=/path/where/you/want/it

It will install any needed Perl modules for you. I’ve reduced the dependency set to LWP::UserAgent, Getopt::Lucid, JSON::PP, and some text processing. As much as I like Mojolicious, the UserAgent was 1/10th the speed of LWP for the same work. Once it is done, point it over to an InfluxDB database instance:
landman@metal:~/work/development/influxdbcli$ ./influxdb-cli.pl --user scalable --pass XXXXXXX --host 192.
Posts
Real measurement is hard
I had hinted at this last week, so I figure I better finish working on this and get it posted already. The previous bit with language choice wakeup was about the cost of Foreign Function Interfaces, and how well they were implemented. For many years I had honestly not looked as closely at Python as I should have. I’ve done some work in it, but Perl has been my go-to language.
Posts
When the revolution hits in force ...
Our machines will be there, helping power the genomics pipelines to tremendous performance. Performance is an enabling feature. Without it you cannot even begin to hope to perform massive scale analytics. With it, you can dream impossible dreams. This article came out talking about a massive performance analytics pipeline at Nationwide Children’s Hospital in Ohio. This pipeline runs on a cluster attached to Scalable Informatics FastPath Unison storage. This is a very dense, very fast system, interconnected with Mellanox FDR Infiniband, Chelsio 40GbE, and leveraging BeeGFS from thinkparq.
Posts
Where have you been all my life FFI::Platypus?
Oh my … this is goodness I’ve been missing badly in Perl. Just learned about it this morning. Short version. You want to mix programming languages for implementation of some project. One language makes development of some subset of functions very easy, while another language handles another part very well. You need some sort of layer to handle this usually, or a way to sanely map. FFI is the concept behind this … and while there is no mention of CORBA or XDR/RPC type things, this is the logical follow-on to these (in their time) ground breaking technologies.
Posts
Finally, a desktop Linux that just works
I’ve been a user of Linux on the desktop, as my primary desktop, for the last 16 years. In that time, I’ve had laptops with Windows flavors (95, XP, 2000, 7), a MacOSX desktop. Before that, my first laptop I had bought (while working on my thesis) was a triple boot job, with DOS, Windows 9x, and OS2. I used the latter for when I was traveling and needed to write; the thesis was written in LaTeX and I could easily move everything back and forth between that and my Indy at home, and my office Indigo.
Posts
Anatomy of a #fail ... the internet of broken software stacks
So I’ve been trying to diagnose a problem with my Android devices running out their batteries very quickly. And at the same time, I’ve been trying to understand why my address bar on Thunderbird has taken a very long time to respond. I had made a connection earlier today when I had noticed the 50k+ contacts in my contact list, of which maybe 2000 were unique. I didn’t quite understand it.
Posts
Drivers developed largely out of kernel, and infrequently synced
One of the other aspects of what we’ve been doing has been forward porting drivers into newer kernels, fixing the occasional bug, and often rewriting portions to correct interface changes. I’ve found that subsystem vendors seem to prefer to drop code into the kernel very infrequently. Sometimes they are synced only once every few years. Which leads to distro kernels often having terribly broken device support. And often very unstable device support.
Posts
Parallel building debian kernels ... and why it's not working ... and how to make it work
So we build our own kernels. No great surprise, as we put our own patches in, our own drivers, etc. We have a nice build environment for RPMs and .debs. It works, quite well. Same source, same patches, same make file driving everything. We get shiny new and happy kernels out the back end, ready for regression/performance/stability testing. Works really well. But … but … parallel builds (e.g. leveraging more than 1 CPU) work only for the RPM builds.
Posts
Amusing #fail
I use Mozilla’s Thunderbird mail client. For all its faults, it is still the best cross platform email system around. Apple’s mail client is a bad joke and only runs on Apple devices (go figure). Linux’s many offerings are open source, portable, and most don’t run well on my Mac laptop. I no longer use Windows apart from running in a VirtualBox environment. And I would never go back to Outlook anyway (used it once, 15 years ago or so … never again).
Posts
Systemd, and the future of Linux init processing
An interesting thing happened over the last few months and years. Systemd, a replacement init process for Linux, gained more adherents, and supplanted the older style init.d/rc scripting in use by many distributions. Ubuntu famously abandoned init.d style processing in favor of upstart and others in the past, and has been rolling over to systemd. Red Hat rolled over to systemd. As have a number of others. Including, surprisingly, Debian. For those who don’t know what this is, think of it this way.
Posts
Brings a smile to my face
My soon to be 15 year old daughter was engrossed with something on her laptop yesterday. Thinking it was fan-fiction, I asked her what she was writing. She knitted her brow for a moment, and looked up. “It’s code combat, Dad.” she said, quite matter of factly. I must have had a slightly startled expression on my face. I knew she had dabbled with it, and had recommended (/sigh) Python as a language, after she took (and aced) a Java class last year, as Python is inherently simpler.
Posts
Starting to come around to the idea that swap in any form, is evil
Here’s the basic theory behind swap space. Memory is expensive, disk is cheap. Only use the faster memory for active things, and aggressively swap out the less used things. This provides a virtual address space larger than physical/logical memory. Great, right? No. Here’s why.
Swap makes the assumption that you can always write/read to persistent memory (disk/swap). It never assumes persistent memory could have a failure. Hence, if some amount of paged data on disk suddenly disappeared, well … Put another way, it increases your failure likelihood, by involving components with higher probability of failure in a pathway which assumes no failure.
Posts
Mixing programming languages for fun and profit
I’ve been looking for a simple HTML5-ish way to represent our disk drives in our Unison units. I’ve been looking for some simple drawing libraries in javascript to make this higher level, so I don’t have to handle all the low level HTML5 bits. I played with Raphael and a few others (including paper.js). I wound up implementing something in Raphael.
The code that generated this was a little unwieldy … as javascript doesn’t quite have all the constructs one might expect from a modern language.
Posts
And the 0.8.3 InfluxDB no longer works with the InfluxDB perl module
I ran into this a few weeks ago, and am just getting around to debugging it now. Traced the code, set up a debugger and followed the path of execution, and … and … Yup, it’s borked. So, I can submit a patch or 3 against the InfluxDB code, or roll a simpler, more general Time Series Data Base interface that will talk to InfluxDB. And eventually kdb+. Since I wanted to code for that as well, I am looking more seriously at the second option.
Posts
Shellshock is worse than heartbleed
In part because, well, the patches don’t seem to cover all the exploits. For the gory details, look at the CVE list here. Then cut and paste the local exploits. Even with the latest patched source, built from scratch, there are active working compromises. With heartbleed, all we had to do was nuke keys, patch/update packages, restart machines, cross fingers. This is worse, in that the fixes … well … don’t.
Posts
Updated boot tech in Scalable OS (SIOS)
This has been an itch we’ve been working on scratching a few different ways, and it’s very much related to forgoing distro based installers. Ok, first the back story. One of the things that has always annoyed me about installing systems has been the fundamental fragility of the OS drive. It doesn’t matter if it’s RAIDed in hardware/software. It’s a pathway that can fail. And when it fails, all hell breaks loose.
Posts
Solved the major socket bug ... and it was a layer 8 problem
I’d like to offer an excuse. But I can’t. It was one single missing newline. Just one. Missing. Newline. I changed my config file to use port 10000. I set up an nc listener on the remote host.
nc -k -l a.b.c.d 10000
Then I invoked the code. And the data showed up. Without a ()&(&%&$%*&(^ newline. That couldn’t possibly be it. Could it? No. It’s way too freaking simple.
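The failure mode is easy to reproduce. What follows is a minimal sketch in Python, not the code from the post: it assumes a line-oriented receiver, and the names `send_metric` and `read_lines` are made up for illustration. A record that arrives without its trailing newline sits in the receiver's buffer and is never delivered.

```python
import socket


def send_metric(sock, line):
    """Send one record, making sure it ends with a newline."""
    if not line.endswith("\n"):
        line += "\n"
    sock.sendall(line.encode("utf-8"))


def read_lines(sock):
    """Yield complete, newline-terminated records from a socket."""
    buffer = b""
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            break
        buffer += chunk
        while b"\n" in buffer:
            record, buffer = buffer.split(b"\n", 1)
            yield record.decode("utf-8")
    # Whatever is left in buffer never saw a newline, so it is dropped here.
```

Sending through `send_metric` makes the terminator impossible to forget, which is the whole point of wrapping the call.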
Posts
New monitoring tool, and a very subtle bug
I’ve been working on coding up some additional monitoring capability, and had an idea a long time ago for a very general monitoring concept. Nothing terribly original, not quite nagios, but something easier to use/deploy. Finally I decided to work on it today. The monitoring code talks to a graphite backend. Could talk to statsd, or other things. In this case, we are using the InfluxDB plugin for graphite. I wanted an insanely simple local data collector.
Posts
InfluxDB cli is up on github
I know there is a node version, and I did try it before I wrote my own. Actually, the reason I wrote my own was that I tried it and … well … Link is here. And yes, the readme is borked about 1/2 way through. Doesn’t show the formatting of the output quite right. Will try to fix over the weekend, as I move this to a far more feature complete state.
Posts
Have a nice cli for InfluxDB
I tried the nodejs version and … well … it was horrible. Basic things didn’t work. Made life very annoying. So, being a good engineering type, I wrote my own. It will be up on our site soon. Here’s an example
./influxdb-cli.pl --host 192.168.5.117 --user test --pass test --db metrics
metrics> \list series
.----------------------------------.
| series name                      |
+----------------------------------+
| lightning.cpuload.avg1           |
| lightning.cputotals.idle         |
| lightning.cputotals.irq          |
| lightning.
Posts
Be on the lookout for 'pauses' in CentOS/RHEL 6.5 on Sandy Bridge
Probably on Ivy Bridge as well. Short version. The pauses that plagued Nehalem and Westmere are baaaack. In RHEL/CentOS 6.5 anyway. A customer just ran into one. We helped diagnose/work around this a few years ago when a hedge fund customer ran into this … then a post-production shop … then … Basically the problem came in from the C-states. The deeper the sleep state, in some instances, the processor would not come out of it, or get stuck in the lower levels.
Posts
Doing what we are passionate about
I am lucky. I fully admit this. There are people out there who will tell you that it’s pure skill that they have been in business and been successful for a long time. Others will admit luck is part of it, but will again, pat themselves on the back for their intestinal fortitude. Few will say “I am lucky”. Which is a shame, as luck, timing (which you can never really, truly, control), and any number of other factors really are critical to one being able to have the luxury of doing what we are doing.
Posts
You can tell you are a little nuts if ...
… you get really annoyed at the performance of grep on file IO (seriously folks? 32k or page size sized IO? What is this … 1992?) so you rewrite it in 20 minutes in Perl, and increase the performance by 5-8x or so. If I get angry enough, I might just go all out, use direct IO, multiple parallel readers, and some other bits. I’ve got these huge disk pipes, awesome bandwidths, and this tiny little filter tool.
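The large-read idea can be sketched in a few lines. This is a Python illustration, not the Perl rewrite mentioned above: the function name `grep_big_reads` and the 8 MiB default block size are assumptions, and no performance numbers are claimed for it. It reads the file in big unbuffered blocks and carries any partial trailing line over to the next block.

```python
import re


def grep_big_reads(path, pattern, block_size=8 * 1024 * 1024):
    """Return the lines matching pattern, reading the file in large blocks."""
    regex = re.compile(pattern.encode("utf-8"))
    matches = []
    leftover = b""
    # buffering=0 gives raw reads, so each read() asks for a whole block.
    with open(path, "rb", buffering=0) as handle:
        while True:
            block = handle.read(block_size)
            if not block:
                break
            lines = (leftover + block).split(b"\n")
            leftover = lines.pop()  # the last piece may be a partial line
            matches.extend(line for line in lines if regex.search(line))
    # A final line without a trailing newline still counts.
    if leftover and regex.search(leftover):
        matches.append(leftover)
    return matches
```

Direct IO and parallel readers, the other ideas floated above, are deliberately left out of this sketch.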
Posts
how not to write driver Makefiles or configuration scripts
if [uname -r eq ...]
It’s very bad form to insist on very particular versions of an OS/kernel. Not only will you piss off your customer (me), you will cause a great deal of effort to unwind the ill-considered test in order to get even basic functionality. I’ve seen this on network cards, RAID cards, you name it. It increases your support load, decreases the likelihood that you can actually support what’s out there … say for example, someone does a ‘yum update’ and gets an updated kernel.
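One less brittle alternative is to test for a minimum version instead of an exact match, so an updated kernel still passes. The sketch below is in Python purely for illustration; the helper names and the `(2, 6, 32)` floor are hypothetical, not taken from any vendor's script.

```python
import re


def kernel_version(release):
    """Extract (major, minor, patch) from a `uname -r` style string."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return tuple(int(part or 0) for part in match.groups())


def is_supported(release, minimum=(2, 6, 32)):
    """Accept any kernel at or above the (hypothetical) minimum version."""
    return kernel_version(release) >= minimum
```

Better still is to probe at build time for the specific kernel feature or symbol the driver needs, since distro kernels backport heavily and version numbers alone can mislead.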
Posts
Update on IPMI Console Logger
Config now comes from some nice and simple json, and it handles multiple machines with aplomb. See the git repository for the latest. The config file example is in there, and you can replicate the n01-ipmi section with more nodes trivially. Coming next is getting config from a trusted web server, along with registering the client to the trusted web server. This prevents things like passwords from showing up in the clear, though you can always create a lower privileged user to access the console for monitoring.
Posts
Playing with AVX
I finally took some time from a busy schedule to play with AVX. I took my trusty old rzf code (Riemann Zeta function) and rewrote the time expensive inner loop in AVX primitives hooked to my C code. As a reminder, this code is a very simple sum reduction, and can be trivially parallelized. Vectorization isn’t as straightforward, and I found that compiler auto-vectorization doesn’t work well for it.
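To see the shape of the vectorized reduction, here is a rough Python sketch, not the rzf C/AVX code itself. It assumes the sum being reduced is the series of 1/k^n for k from 1 to N, and it mimics the four double precision lanes of a 256-bit AVX register with four independent partial sums that are combined at the end. The function names are illustrative only.

```python
import math


def zeta_scalar(n, terms):
    """Plain sum reduction: one accumulator, one term per iteration."""
    total = 0.0
    for k in range(1, terms + 1):
        total += 1.0 / k ** n
    return total


def zeta_lanes(n, terms, lanes=4):
    """Same sum, with one independent accumulator per simulated lane."""
    partial = [0.0] * lanes
    for base in range(1, terms + 1, lanes):
        for lane in range(lanes):
            k = base + lane
            if k <= terms:  # handle a final, partially filled group
                partial[lane] += 1.0 / k ** n
    # Horizontal add: collapse the lane accumulators into one value.
    return sum(partial)
```

The lane version adds the terms in a different order than the scalar loop, so the two results can differ in the last few bits; that is expected with floating point.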
Posts
started playing with SmartOS for the day job
This is a very cool concept, something that meshes perfectly with our Tiburon based siCluster philosophy. That is, compute nodes should boot diskless, there should be very little state on each node, and stuff that you need to do should be made absolutely as simple as possible. SmartOS is a project of Joyent. Joyent, for those not familiar with them, are a cloud company, building a nice public cloud for end users to build on.
Posts
Fun with primes
A long time ago, in a galaxy far … far … away … I’ve been playing with primes for a while … computing them, etc. Have a neat way to represent any natural number (excluding 0) in terms of the exponents of its prime factors. Lots of reasons for playing with this. Started doing this before joining SGI … many moons ago, and used it as a way to entertain myself on airplanes when the laptop battery ran out.
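One way to picture such a representation, as a small Python sketch with made-up function names and plain trial division (the post does not say how the exponents are actually computed or stored): map each prime factor to its exponent, so that multiplying two numbers becomes adding exponents.

```python
def prime_exponents(n):
    """Return {prime: exponent} for a natural number n >= 1."""
    if n < 1:
        raise ValueError("n must be a natural number, excluding 0")
    exponents = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            exponents[p] = exponents.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:  # whatever remains is itself prime
        exponents[n] = exponents.get(n, 0) + 1
    return exponents


def multiply(a, b):
    """Multiply two numbers in exponent form by adding exponents."""
    return {p: a.get(p, 0) + b.get(p, 0) for p in sorted(set(a) | set(b))}
```

For example, 360 becomes `{2: 3, 3: 2, 5: 1}`, and 1 becomes the empty mapping.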
Posts
I know I shouldn't be ... but I am ...
[update] a bug in my reasoning (thanks Peter!) a Perl Golf addict. Not a recovering addict, but one that is active. What is Perl Golf? Well, as in real golf, you try to provide the minimal number of steps to a solution. In this case, you are to solve the specific puzzle. Detractors of Perl often make snarky comments about Perl’s equivalency to random line noise and other such nonsense. Sure … if it makes you feel good to say that … I am a fan of terse languages, I wrote programs (if you could call them that) in APL … a while ago.
Posts
Guide to getting OFED 1.2 to build on OpenSuSE
Grab the tarball from the open fabrics alliance (or from here)
Grab the build_new.sh from here, place it in the OFED-1.2 directory as root on your machine
mv /usr/src/linux-2.6.18.2-34/include/linux/miscdevice.h /usr/src/linux-2.6.18.2-34/include/linux/miscdevice.h.original
ln -s /usr/include/linux/miscdevice.h /usr/src/linux-2.6.18.2-34/include/linux/miscdevice.h
Then run the build_new.sh. Voila. Works. Binary RPMs are here.
Posts
Is OpenSolaris Open?
As seen on /. IBM is questioning whether or not it is really open.
I think the real question is, does it matter? I really don’t see a need for it. The market is positively crowded with OpenSource Linux, *BSD. OpenSource should not be a repository for declining projects. From an ISV perspective, you have to ask “why”? Precisely what benefit in terms of lower costs and increased revenue does being on Solaris bring?
Posts
New programming workshop: Perl and R for Informatics
The good folks over at BioInformatics.org have a new workshop ready to go on programming in Perl and R. See this link for more details. If there is interest in having this outside of Boston, please let me know.
Posts
The marketing of computer languages
I have noticed a tendency for technologists, programmers, and others to fall in love with their projects, their tools, … . Why this happens, I am not sure. I don’t love my hammer, my circular saw, my computers, the languages I use. They are tools. They are the means to a goal. Sure, I like some tools more than others, but I am also not going to waste my time misusing a tool for a purpose ill suited for it.