Below you will find pages that utilize the taxonomy term “bugs”
Posts
What reduces risk ... a great engineering and support team, or a brand name ?
I’ve written about approved vendors and the “one throat to choke” concept in the past. The short take from my vantage point as a small, not well known, but highly differentiated builder of high performance storage and computing systems … was that this brand specific focus was going to remove real differentiated solutions from market, while simultaneously lowering the quality and support of products in market. The concept of brand and marketing of a brand is about erecting barriers to market entry against the smaller folk who might have something of interest, and the larger folk who might come in with a different ecosystem.
Posts
On hackerrank and Julia
My new day job has me developing considerably less code than my previous endeavor, so I like to work on problems to keep these particular muscles in steady use. Happily, I get to do more analytics than ever before, so this at least is some compensation for the lower amount of coding. When I work on coding for myself, I’ll play with problems from my research days, or small throw-away ones, like on Hackerrank.
Posts
That was fun: mysql update nuked remote access
Update your packages, they said. It will be more secure, they said. I guess it was. No network access to the databases. Even after turning the database server instance to listen again on the right port, I had to go in and redo the passwords and privileges. So yeah, this broke my MySQL instance for a few hours. Took longer to debug as it was late at night and I was sleepy, so I put it off until morning with caffeine.
Posts
when you eliminate the impossible, what is left, no matter how improbable, is likely the answer
This is a fun one. A customer has quite a collection of all-flash Unison units. A while ago, they asked us to turn on LLDP support for the units. It has some value for a number of scenarios. Later, they asked us to turn it off. So we removed the daemon. Unison ceased generating/consuming LLDP packets. Or so we thought. Fast forward to last week. We are being told that LLDP PDUs are being generated by the kit.
Posts
Another fun bit of debugging
Ok … so here you are doing a code build. Your environment is all set. You have ample space. Lots of CPU, lots of RAM. All packages are up to date. You start your make. You have another window open with dstat running, just to kinda, sorta watch the system, while you are doing other things. And while you are working, you realize dstat has stopped scrolling. Strange, why would that be?
Posts
strace -p is your friend
So there I was, trying to use a serial port on a node which was connected to a serial port on a switch. Which I needed to properly configure the switch. So I light up minicom and get garbage. Great, a baud rate mismatch, easily fixed. Fix it. Connect again. I get the first 10-12 characters … and then garbage. Hmmm. I’d like to pause our story for a moment, and say I had the key insight at this moment … but that would not be true.
Posts
Finding unpatched "features" in distro packages
I generally expect baseline distro packages to be “old” by some measure. Even for more forward thinking distros, they generally (mis)equate age with stability. I’ve heard the expression “bug for bug compatible” when dealing with newer code on older systems. Something about the devil you know vs the devil you don’t. Ok. In this case, CMake. A good development tool, gaining popularity over autotools and other things. Base SIOS image is on Debian 8.
Posts
Excellent article on mistakes made for infrastructure ... cloud jail is about right
Article is here at First Round Capital. This goes to a point I’ve made many many times to customers going the cloud route exclusively rather than the internal infrastructure route or hybrid route. Basically it is that the economics simply don’t work. We’ve used a set of models based upon observed customer use cases, and demonstrated this to many folks (customers, VCs, etc.). Many are unimpressed until they actually live the life themselves, have the bills to pay, and then really … really grok what is going on.
Posts
I don't agree with everything he wrote about systemd, but he isn't wrong on a fair amount of it
Systemd has taken the linux world by storm, replacing 20-ish year old init style processing with a more legitimate control plane, and with a centralized resource to handle this control. There are many things to like within it, such as the granularity of control. But there are any number of things that are badly broken by default. Actually some of these things are specifically geared towards desktop users (which isn’t a bad thing if you are a desktop linux user, as I am).
Posts
Systemd and non-desktop scenarios
So we’ve been using Debian 8 as the basis of our SIOS v2 system. Debian has a number of very strong features that make it a fantastic basis for developing a platform … for one, it doesn’t have significant negative baggage/technical debt associated with poor design decisions early on in the development of the system as others do. But it has systemd. I’ve been generally non-committal about systemd, as it seemed like it should improve some things, at a fairly minor cost in additional complexity.
Posts
You can't win
Like that old joke about the patient going to the Doctor for a pain …
Imagine if you will, a patient who, after being told what is wrong, and why it hurts, and what to do about it, continues to do it. And is more intensive about doing it. And then complains when it hurts. This is a rough metaphor for some recent support experiences. We do our best to convince them not to do the things that cause them pain, as in this case, they are self-inflicted.
Posts
That was fun ... no wait ... the other thing ... not fun
Long overdue update of the server this blog runs on. It is no longer running an Ubuntu flavor, but instead running SIOSv2, which is the same appliance operating system that powers our products. This isn’t specifically a case of eating our own dog-food, but more a case that Ubuntu, even the LTS versions, has a specific sell by date, and it is often very hard to update to the newer revs.
Posts
Best practice or random rule ... diagnosing problems and running into annoyances
As often as not, I’ll hear someone talk about a “best practice” that they are implementing or have implemented. Things that run counter to these “best practices” are obviously, by definition, “not best”. What I find sometimes amusing, often alarming, is that the “best practices” are often disconnected from reality in specific ways. This is not a bash on all best practices, some of them are sane, and real. Like not allowing plain text passwords for logins.
Posts
It is 2016 ... why am I fighting with LDAP authentication in linux? Why doesn't it just work?
Ok … very long story that boils down to us trying to help a customer out. I am trying to avoid the “let’s just add another user to /etc/passwd” or similar such thing. And they aren’t quite ready to hook into AD or similar. So we have this issue. I want to enable their nodes to use ldap. I’ve done this before for other customers with older tools (pam_ldap, etc.). But it was somewhat crazy (as in non-trivial), involving gnashing of teeth, gums, etc.
Posts
The joys of automated tooling ... or ... catching changes in upstream projects’ workflows by errors in yours
We have an automated build process for our boot images. It is actually quite good, allowing us to easily integrate many different capabilities with it. These capabilities are usually encapsulated in various software stacks that provide specific functionality. Most of these stacks follow pretty well defined workflows. For a number of reasons, we find building from source generally easier than package installation, as there are often some, well, effectively random (and often poor) choices in build options/file placement in the package builds.
Posts
Not a fan of device mapper in Linux
Yeah, I know. It brings all manner of capabilities with it. It’s just the cost of these capabilities, when combined with other tools, like, say, Docker, that makes me not want to use it. To wit:
root@ucp-01:~# ls -alF /var/lib/docker/devicemapper/devicemapper/
total 52508
drwx------ 2 root root 80 Jan 29 22:38 ./
drwx------ 4 root root 80 Jan 29 22:38 ../
-rw------- 1 root root 107374182400 Jan 29 22:39 data
-rw------- 1 root root 2147483648 Jan 29 22:39 metadata
root@ucp-01:~# ls -halF /var/lib/docker/devicemapper/devicemapper/
total 52M
drwx------ 2 root root 80 Jan 29 22:38 .
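The listing above reports a 107374182400 byte (100 GiB) `data` file inside a directory whose total is only about 52 MB, which is what a sparse file looks like: the apparent size is far larger than the space actually allocated. A minimal Python sketch for checking this on any file follows. The function names `file_sizes` and `is_sparse` are made up for illustration, and it assumes a Linux-style `st_blocks` counted in 512-byte units.

```python
import os
import sys


def file_sizes(path):
    """Return (apparent_bytes, allocated_bytes) for path.

    Assumption: st_blocks is reported in 512-byte units, as on Linux.
    st_blocks is not available on Windows.
    """
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512


def is_sparse(path):
    """True when fewer bytes are allocated on disk than the file claims."""
    apparent, allocated = file_sizes(path)
    return allocated < apparent


if __name__ == "__main__":
    # Usage: python sparse_check.py FILE [FILE ...]
    for name in sys.argv[1:]:
        apparent, allocated = file_sizes(name)
        print(f"{name}: apparent={apparent} allocated={allocated} "
              f"sparse={is_sparse(name)}")
```

Pointing it at the `data` and `metadata` files from the listing would show how much of the 100 GiB and 2 GiB is really backed by disk blocks. It does not say how much those files can grow to later, which is the cost being grumbled about.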
Posts
When infinite resources aren't, and why software assumes they are infinite
We’ve got customers with very large resource machines. And software that sees all those resources and goes “gimme!!!!”. So people run. And then more people use it. And more runs. Until the resources are exhausted. And hilarity (of the bad kind) ensues. These are firedrills. I get an open ticket that “there must be something wrong with the hardware”, when I see all the messages in console logs being pulled in from ICL saying “zOMG I am out of ram ….
Posts
A wonderful read on metrics, profiling, benchmarking
Brendan Gregg’s writings are always interesting and informative. I just saw a link on hacker news to a presentation he gave on “Broken Performance Tools”. It is wonderful, and succinctly explains many things I’ve talked about here and elsewhere, but it goes far beyond what I’ve grumbled over. One of my favorite points in there is slide 83. “Most popular benchmarks are flawed” and a pointer to a paper (easy to google for).
Posts
diagnostics
This is something of a hard post to write, for a number of reasons, not the least of which is that the topic comes as something of a surprise to me. I am just going to state it, and then discuss it. The vast majority of people (and companies) out there, who think they know something of hardware/software/system level diagnostics and problem identification (from newbie to “veteran”) are either full of it, or really clueless.
Posts
love/hate relationship with new hardware
One of the dangers of dealing with newer hardware is often that it doesn’t work so well. Or the drivers get hosed in mysterious ways. We’ve got some nice shiny new 10GbE cards for a set of Unison systems going into a customer next week. We had some very odd issues with other 10GbE cards, so we rolled over to newer design cards. Younger silicon, younger design. Newer kernel module. I can’t say I am enjoying this experience thus far.
Posts
Real measurement is hard
I had hinted at this last week, so I figure I better finish working on this and get it posted already. The previous bit with language choice wakeup was about the cost of Foreign Function Interfaces, and how well they were implemented. For many years I had honestly not looked as closely at Python as I should have. I’ve done some work in it, but Perl has been my go-to language.
Posts
Anatomy of a #fail ... the internet of broken software stacks
So I’ve been trying to diagnose a problem with my Android devices running out their batteries very quickly. And at the same time, I’ve been trying to understand why my address bar on Thunderbird has taken a very long time to respond. I had made a connection earlier today when I had noticed the 50k+ contacts in my contact list, of which maybe 2000 were unique. I didn’t quite understand it.
Posts
Drivers developed largely out of kernel, and infrequently synced
One of the other aspects of what we’ve been doing has been forward porting drivers into newer kernels, fixing the occasional bug, and often rewriting portions to correct interface changes. I’ve found that subsystem vendors seem to prefer to drop code into the kernel very infrequently. Sometimes they are synced only once every few years. Which leads to distro kernels having often terribly broken device support. And often very unstable device support.
Posts
Amusing #fail
I use Mozilla’s Thunderbird mail client. For all its faults, it is still the best cross platform email system around. Apple’s mail client is a bad joke and only runs on Apple devices (go figure). Linux’s many offerings are open source, portable, and most don’t run well on my Mac laptop. I no longer use Windows apart from running in a VirtualBox environment. And I would never go back to Outlook anyway (used it once, 15 years ago or so … never again).
Posts
Systemd, and the future of Linux init processing
An interesting thing happened over the last few months and years. Systemd, a replacement init process for Linux, gained more adherents, and supplanted the older style init.d/rc scripting in use by many distributions. Ubuntu famously abandoned init.d style processing in favor of upstart and others in the past, and has been rolling over to systemd. Red Hat rolled over to systemd. As have a number of others. Including, surprisingly, Debian. For those who don’t know what this is, think of it this way.
Posts
And the 0.8.3 InfluxDB no longer works with the InfluxDB perl module
I ran into this a few weeks ago, and am just getting around to debugging it now. Traced the code, set up a debugger and followed the path of execution, and … and … Yup, it’s borked. So, I can submit a patch or 3 against the InfluxDB code, or roll a simpler more general Time Series Data Base interface that will talk to InfluxDB. And eventually kdb+. Since I wanted to code for that as well, I am looking more seriously at the second option.
Posts
Shellshock is worse than heartbleed
In part because, well, the patches don’t seem to cover all the exploits. For the gory details, look at the CVE list here. Then cut and paste the local exploits. Even with the latest patched source, built from scratch, there are active working compromises. With heartbleed, all we had to do was nuke keys, patch/update packages, restart machines, cross fingers. This is worse, in that the fixes … well … don’t.