It Came From the Live-Boot: A True Linux Horror Story

Today's Linux desktop distros are more accessible to complete newcomers than they have ever been. There was a time not long ago when only truly intrepid computer thrill seekers would dare install Linux. Now, not only can one get Linux installed on most desktop or laptop computer hardware in 15 minutes, but one can hand it off to anyone with the loosest grasp on how to use computers and expect them to be just fine.

All of that said, once in a blue moon, one will experience sheer terror at the hands of a buggy Linux system. No amount of battle-hardiness can keep you completely safe, either.

I know this because not too long ago, a fear-inducing Linux bug came for me. I wanted to share this true Linux story so that you may be informed and entertained. Out of respect for hard-working Linux distro developers who make honest mistakes, I will not name the offending distribution, but to add an air of ominousness, I will note that it has consistently ranked in DistroWatch’s Top 10 for at least a year.

To those of you who place unshakeable confidence in “mainstream” distros: You have been warned. Now, then, let us begin.

Nowhere Safe

One day, I decided to try out a distro I had read a fair amount of buzz about. It was also based on another distro that I really liked. Even better. In the video reviews I watched, it got high marks and looked polished. That was enough for me to want to see it for myself.

When I took it for a spin, everything seemed fine except for one weird thing. No matter where I navigated in the browser, it told me that the website I was visiting wasn’t safe. But since I was trying to browse sites I knew to be trustworthy, I just told the browser to ignore the warnings, and I was on my way. This was the sign I should have taken more seriously but, fatefully, did not.

I wasn’t ready to install this distribution, but maybe I would in the mid-future, I thought. I walked away with a decent enough impression, and chalked the strange browser errors up to a missing update. After all, few distros that aren’t operating on a rolling release model (and not even all of those) will post a new installation image every time there is a package update — they post their major releases every few months to a year and leave it to users to run the updates.

With my test completed, I booted the operating system installed permanently on my computer’s drive, intent on going about my business.

Except the networking was broken.

NetworkManager, the little piece of software on practically every desktop Linux system that manages your network connections, was listing nearby SSIDs as expected when it scanned with my device’s wireless card, but it wouldn’t let me connect. No matter which network I chose, every connection attempt failed. To all these access points, my computer was a ghost. It didn’t even exist.

Bad Update?

Thinking it could be something wrong with my installed OS — maybe I ran a bad update that didn’t take effect until this reboot, I tried reassuring myself — I live-booted a second, reliably stable Linux-based OS to see if I would encounter the same networking problems. But this, too, was to no avail. No matter what settings I tried changing or how many times I restarted NetworkManager, I could not get it to establish a connection with any access points.

Then I tried live-booting a third distro, the one that the original suspect distro was based on. This time, my networking functioned properly, albeit with the same Web browsing warnings.

But for this respite from utterly disconnected oblivion, I would have completely panicked. What was I going to do? Was my machine broken? An issue like this that persisted across boots, both when booted live and when booted from the installed system, suggested that I was staring down the barrel of a hardware issue. And for an OS to cause a hardware malfunction is extremely frightening.

At the same time, though, my machine was fine in all other respects. It was fine before the initial live boot, for one thing. For another thing, I take good care of my devices, so I did not do anything in the week leading up to this incident that could have damaged it — especially in a way that only disrupted this specific functionality and nothing else.

Mystery Solved

In my frenzied troubleshooting, I had failed to consider one last point of compromise: the firmware. When I finally realized this glaring oversight, I hastily booted into the BIOS firmware, looking for the slightest setting that might be out of place. There it was: The hardware clock had been changed. As far as my computer was concerned, it was almost 100 years in the future.

This explained the browser warnings. TLS certificates, the ones that give us those comforting padlock icons in our browser’s address bar, come with expirations. Publicly trusted certificates typically expire after anywhere from a few months to a little over a year, so that a stolen certificate can’t be abused to trick unsuspecting users indefinitely. To judge whether a certificate is still within its validity period, a device naturally uses its own clock as the frame of reference.

To my computer, every certificate that anyone could have realistically generated was expired. My computer had done a full Rip Van Winkle, waking up one boot only to discover that every website it had ever known was, cryptographically speaking, dead.
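The logic behind those warnings can be sketched in a few lines of Python. This is only an illustration of the validity-window comparison, not how any particular browser implements it; the function name `certificate_is_current` and the dates are hypothetical, chosen to mirror a clock that has jumped roughly a century ahead.

```python
from datetime import datetime, timedelta, timezone


def certificate_is_current(not_before, not_after, now):
    """Return True if `now` falls inside the certificate's validity window."""
    return not_before <= now <= not_after


# Hypothetical certificate issued for roughly one year.
issued = datetime(2021, 1, 1, tzinfo=timezone.utc)
expires = issued + timedelta(days=365)

# A correct clock partway through the validity window...
sane_clock = datetime(2021, 6, 1, tzinfo=timezone.utc)
# ...and a clock that believes it is about 100 years later.
skewed_clock = sane_clock.replace(year=sane_clock.year + 100)

print(certificate_is_current(issued, expires, sane_clock))    # True
print(certificate_is_current(issued, expires, skewed_clock))  # False
```

With the skewed clock, the same check fails for any certificate issued in our lifetimes, which is why every site looked unsafe at once.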

I couldn’t find anything online to confirm that a misconfigured hardware clock would foil NetworkManager, but such a thing was conceivable, and I could devise no other explanation. The hardware clock is the meter stick by which many other hardware states are measured. It was not a stretch to think that a device’s NIC was among these. Neither was it far-fetched to think that live-boot #1 could have reset my hardware clock.

An OS is well within its rights to access, and in some cases even change, values in the device’s firmware. Commonly enough, this includes the hardware clock. But for the sanity of all involved, OS developers should do their best to ensure that firmware access is sparing.
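For the curious, here is a rough sketch of how one might compare the hardware clock against the system clock from a running Linux session. It assumes the kernel exposes the first real-time clock through sysfs at `/sys/class/rtc/rtc0/since_epoch`, which may not hold on every machine; the helper names and the one-day tolerance are my own hypothetical choices. The tolerance is deliberately loose so that a hardware clock kept in local time rather than UTC does not trigger a false alarm.

```python
import time
from pathlib import Path

# Assumption: a Linux system exposing its first RTC through sysfs at this path.
RTC_EPOCH_PATH = Path("/sys/class/rtc/rtc0/since_epoch")


def read_rtc_epoch(path=RTC_EPOCH_PATH):
    """Read the hardware clock as seconds since the Unix epoch."""
    return int(path.read_text().strip())


def clock_skew_seconds(rtc_epoch, system_epoch):
    """Positive result means the hardware clock is ahead of the system clock."""
    return rtc_epoch - system_epoch


def looks_suspicious(skew_seconds, tolerance_seconds=24 * 3600):
    """Flag any disagreement larger than the (hypothetical) tolerance."""
    return abs(skew_seconds) > tolerance_seconds


if __name__ == "__main__":
    try:
        skew = clock_skew_seconds(read_rtc_epoch(), int(time.time()))
    except (OSError, ValueError) as err:
        print(f"Could not read the hardware clock: {err}")
    else:
        verdict = "suspicious" if looks_suspicious(skew) else "plausible"
        print(f"Hardware clock differs from system clock by {skew} s ({verdict})")
```

Note that this only detects the two clocks disagreeing. If a live session has already copied a bad hardware time into its system clock, both will be wrong together, and only a look at the date itself will reveal it.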

A Tale of Two Clocks

Why didn’t I notice this before, you ask? In short, as in any good horror story, the monster isn’t right behind you the first time you look.

There is a difference between the hardware clock and the system clock. The two are kept separate so that the hardware clock, which I had by now gained a substantially greater appreciation for, does not get changed at the drop of a hat.

Imagine you are a frequent international traveler. Would you want to risk a catastrophic error like the one I suffered every time you changed time zones and adjusted your clock? I would think not. So, the OS gets around this by simply remembering the offset between its system clock and the hardware clock.

Let’s say you are on Eastern Standard Time. If your hardware clock is set to UTC, then since EST is 5 hours behind UTC, your system clock would just subtract 5 hours from your hardware clock to set itself. A trip to Pacific Standard Time would merely prompt your system to change its clock to 8 hours behind UTC instead of the previous 5. Meanwhile, the hardware clock is unchanged.
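The arithmetic above can be sketched in Python. This is a simplified illustration using fixed offsets, not a depiction of how any OS actually stores its time zone settings; the hardware clock reading is a hypothetical value.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets matching the example: EST is UTC-5, PST is UTC-8.
EST = timezone(timedelta(hours=-5), "EST")
PST = timezone(timedelta(hours=-8), "PST")

# Hypothetical hardware clock reading, kept in UTC.
hardware_clock = datetime(2021, 1, 15, 17, 0, tzinfo=timezone.utc)

# The displayed time is derived by applying the offset.
eastern = hardware_clock.astimezone(EST)
pacific = hardware_clock.astimezone(PST)

print(eastern.strftime("%H:%M %Z"))  # 12:00 EST
print(pacific.strftime("%H:%M %Z"))  # 09:00 PST

# Both displays describe the same instant; the hardware clock never moved.
print(eastern == pacific == hardware_clock)  # True
```

Changing time zones only changes the offset applied for display, which is exactly why the traveler's hardware clock stays untouched.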

Whether the system clock on that fateful live-booted distro was as skewed as the hardware clock and I didn’t notice, or whether nothing was amiss with the system clock by all appearances, I will never know. All I know is that as soon as I fixed my hardware clock, the networking worked completely normally again.

It was as if nothing had happened.

Lesson Learned

If there is a moral to this story, perhaps it is this: that danger can lurk just below the software and you’ll never know until it strikes. Or perhaps it is that there is no component — software, hardware, or firmware — too insignificant to topple an entire system.

Either way, I have a renewed vigilance for the ticking of the hardware clock. I refuse to live in fear, as should you, and I still try distros on occasion. But braving the potential risks, few though they are, does not mean being careless.

As an epilogue to this story, some months after the fact, I thought to post this on the offending distro’s forum, in case the problem still lurked. Unlikely as this was, it was not unheard of.

I posted the kernel log that I had captured while in the throes of this mortal struggle and conceded that my report was late for want of time (and the complacency of relief after I escaped my peril). All the replying users assured me that it was highly improbable that such an ill could come to pass. They thought me mad! (Not quite, but indulge me in my artistic license.)

So, mayhaps, ’twas all a feverish nightmare. But to this day, whenever I boot a live distro, I check the dark, neglected corner for the date and time.

Jonathan Terrasi

Jonathan Terrasi has been an ECT News Network columnist since 2017. In addition to his work as a freelance writer, he is a full-time computer science educator and IT decision-maker. His main interests are information security, with a focus on Linux desktops, and the influence of technology trends on current events. His background also includes providing technical commentary and analysis for the Chicago Committee to Defend the Bill of Rights.
