With the Shell, You Can Go Wild(card) and Follow Your Pipe Dream

There is more to the shell, the terminal’s interpreter software, than commands composed of alphanumeric characters. In addition to those familiar programs, there is a whole host of processing tools hiding behind the symbols of a standard keyboard.

To say nothing of their incredible potency in combination, each one is so powerful on its own that it helps to take a methodical approach to getting familiar with them. With that in mind, I’ll narrow the focus here to two of the more practical symbols: the pipe (“|”) and the wildcard (“*”).

To illustrate the versatility of these two shell tools, I’ll use one in-depth example: locating settings in the configuration directory for a package manager.

Spring Cleaning

The package manager is pretty straightforward: it is the program that determines how your system’s packages (the bundles that programs come in) are installed, updated, and removed. Without it, you would be unable to install new software, and your system would not stay updated or even run, as programs would be missing key libraries needed to execute properly.

So, what is a configuration directory? While programs traditionally read all their settings from a single configuration file, many newer applications have a directory (usually ending in “.d”) that acts as one big configuration, with each individual file as a subsection.

This approach makes the configuration more modular, letting users add or remove components simply by creating and deleting the respective files. However, it makes it harder to scan through the composite directory-wide configuration for particular settings.

Let’s say we are running a Debian-based distribution, like Ubuntu, and we want to search through the configuration for our package manager, “apt”, which splits its configuration across the files in the “/etc/apt/apt.conf.d” directory.

One reason we might need to search these settings could be to determine how many old packages our system is holding onto, especially if our system has been installed for a while.

For troubleshooting purposes, Linux systems keep copies of old packages in case users need to revert to them after unstable updates. With stable systems like Ubuntu, though, a more likely problem is that too many backups of large packages — the Linux kernel being a usual suspect — fill up the small slice of the system where boot data is stored.

To solve this problem, we’ll need to find the setting for how many package versions of programs get cached — a daunting prospect when presented with more than a dozen files in “/etc/apt/apt.conf.d”.

This is where our two new tools come in.

The Wildcard

Represented by the asterisk symbol, the wildcard allows the shell to process multiple files at the same time. When used in conjunction with a command that takes a file as an argument, a lone wildcard is treated by the shell as all the files residing in the current directory (by default, hidden files whose names begin with a dot are left out).

The shell accomplishes this using a process called “expansion”: in the background, before the command runs, the shell replaces the “*” with the names of every matching file in the current directory.
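To see expansion for yourself without touching real settings, here is a minimal sketch that works in a throwaway directory. The file names “alpha”, “beta”, and “.hidden” are made up for illustration. Because “echo” just prints its arguments, it shows exactly what the shell substituted for the “*”.

```shell
# Create a scratch directory with two ordinary files and one hidden file.
# All three names are hypothetical examples.
sandbox=$(mktemp -d)
touch "$sandbox/alpha" "$sandbox/beta" "$sandbox/.hidden"

# The shell swaps * for the matching names before echo ever runs,
# so echo simply prints the list it was handed.
expanded=$(cd "$sandbox" && echo *)
echo "$expanded"
```

Note that the dot-prefixed file does not show up in the result, since a lone wildcard skips hidden files by default.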

In our kernel-caching scenario, we will want to start by examining the contents of every file in the configuration directory. Displaying the files one at a time would be time-consuming, but with the wildcard and the “cat” command, we can print all of them at once. After “cat” and a space, we simply give the path to the target directory with “*” in place of a filename:

$ cat /etc/apt/apt.conf.d/*
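If you would like to try this away from your real settings first, the sketch below builds a stand-in directory. The file names and setting lines are invented for illustration and are not actual “apt” options. It shows that “cat” receives the expanded names in sorted order and prints the files back to back.

```shell
# Stand-in for a configuration directory; names and contents are hypothetical.
confdir=$(mktemp -d)
printf 'Setting::One "a";\n' > "$confdir/00first"
printf 'Setting::Two "b";\n' > "$confdir/10second"

# The wildcard expands to both file names, in sorted order,
# and cat prints their contents one after the other.
combined=$(cat "$confdir"/*)
printf '%s\n' "$combined"
```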

This is a good first step, but it’s hard to skim through so much information in the disorderly command return. That’s where our second tool comes in.

The Pipe

The pipe is a simple shell component that sits between two commands and sends the output of one command to the input of the next. A sequence of commands that includes one or more pipes is called a “pipeline,” and a single pipeline can be as simple as two commands — for example, the time-honored construct of sending the output of information-displaying commands into a viewer program like “less”.

As the name suggests, a viewer program is one whose primary application is to let you view data, and it differs from an editor in that it won’t let you edit anything. When dealing with critical settings like those of a package manager, this limitation is a real benefit, since you can’t change a setting by accident.

We can take the data aggregation of our “cat” command and alloy it with the organization of “less” by constructing a pipeline in which “cat /etc/apt/apt.conf.d/*” and “less” are separated by the “|” (with a space on either side):

$ cat /etc/apt/apt.conf.d/* | less
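Since “less” is interactive, it is awkward to show in a script. As a rough stand-in, this sketch pipes the same kind of “cat” output into “wc -l”, which counts the lines it receives. The directory and its contents are made up for the example.

```shell
# Hypothetical stand-in directory with three lines spread over two files.
confdir=$(mktemp -d)
printf 'line one\nline two\n' > "$confdir/00first"
printf 'line three\n' > "$confdir/10second"

# cat writes every line to standard output; the pipe hands that stream
# to wc, which was given no file name and so reads the pipe instead.
total=$(cat "$confdir"/* | wc -l)
echo "$total"
```

The mechanics are the same as with “less”: the command on the right never sees the files, only the text flowing out of the command on the left.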

Our settings are neatly collected in a viewer, but although our pipe to “less” allows us to scroll through the settings, there’s still a lot to scan. We can pare down the amount of data to look over by applying our wildcard to Linux’s powerful search tool, “grep”.

The “grep” command takes a search term, which in our case is “cache,” and (optionally) a file to be searched as space-separated arguments. By giving “cache” as our search and “/etc/apt/apt.conf.d/*” as our file, we can sift through all the files at once:

$ grep cache /etc/apt/apt.conf.d/*
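Two details are worth knowing before relying on this search. When “grep” is given more than one file, it labels each match with the name of the file it came from, and by default it matches case-sensitively, so “cache” will not find “Cache”. The sketch below demonstrates both using invented setting names (not real “apt” options) and the “-i” flag, which tells “grep” to ignore case.

```shell
# Hypothetical stand-in directory; the setting names are made up.
confdir=$(mktemp -d)
printf 'Example::Cache-Limit "100";\n' > "$confdir/00example"
printf 'Example::cache-dir "/tmp";\nExample::Other "x";\n' > "$confdir/10example"

# Case-sensitive search: only the lowercase "cache" line matches.
exact=$(grep cache "$confdir"/*)

# With -i, case is ignored, so the "Cache" line matches as well.
loose=$(grep -i cache "$confdir"/*)

printf '%s\n' "$exact"
printf '%s\n' "$loose"
```

If a search of your own configuration comes back shorter than expected, trying it again with “-i” is a cheap way to rule out capitalization as the cause.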

Now, we’ve got a manageable amount of information to scan through, but we’re back to reading it in the terminal’s standard output. Having “less” to view these search results would be convenient, and by reintroducing our pipe to “less”, we can do just that:

$ grep cache /etc/apt/apt.conf.d/* | less

This gives us both the reduced output of matched settings and the neat layout of a viewer. Pipelines that feed into “less” are among the most common, but you can pipe any command that prints to standard output into any command that reads from standard input, which is where a program receives what the user types and where many file-reading commands look when no file is named.
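As an illustration of a longer pipeline, the following sketch chains four commands: “cat” gathers the files, “grep” keeps the matching lines, “sort” orders them, and “head -n 1” keeps only the first. The directory and setting lines are hypothetical.

```shell
# Hypothetical stand-in directory; the setting lines are made up.
confdir=$(mktemp -d)
printf 'zeta-cache "1";\n' > "$confdir/00example"
printf 'alpha-cache "2";\nunrelated "3";\n' > "$confdir/10example"

# Each pipe passes the previous command's output to the next command's input:
# gather all lines, keep those containing "cache", sort them, keep the first.
first=$(cat "$confdir"/* | grep cache | sort | head -n 1)
echo "$first"
```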

Homing In

Often in configuration directories, filenames start with a two-digit number, with the program loading them in numerical order. So what if we wanted to search only the first files to be loaded, those whose names begin with “00”?

If a wildcard directly follows a string of characters, with no space in between, it restricts the matches to files whose names begin with that string. If it directly precedes the string, it restricts the matches to files whose names end with it.

In our example, if we re-run our previous command with “00” after the slash but before the wildcard, “grep” will only search through files starting with “00”:

$ grep cache /etc/apt/apt.conf.d/00*

Conversely, if we want to search only files whose names end with the word “upgrade,” we can remove the “00” and follow the wildcard with “upgrade”:

$ grep cache /etc/apt/apt.conf.d/*upgrade
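To check which files a pattern will select before handing it to “grep”, you can give the same pattern to “echo”. The sketch below does this in a scratch directory with invented file names, and also tries a pattern with text on both sides of the wildcard, which should match only names that begin and end with the given strings.

```shell
# Hypothetical stand-in directory; the file names are made up.
confdir=$(mktemp -d)
touch "$confdir/00base" "$confdir/00extra" \
      "$confdir/20auto-upgrade" "$confdir/50unattended-upgrade"

# Names that begin with "00".
starts=$(cd "$confdir" && echo 00*)

# Names that end with "upgrade".
ends=$(cd "$confdir" && echo *upgrade)

# Names that begin with "2" and end with "upgrade".
both=$(cd "$confdir" && echo 2*upgrade)

echo "$starts"
echo "$ends"
echo "$both"
```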

Used separately or in combination, pipes and wildcards can increase your productivity dramatically by giving you the power to narrow or expand the scope of data as needed.

You might even find that these data processing tricks can help you solve a recurring problem in ways you didn’t expect. Either way, taking both of these for a test drive is the best way to see what they can do.

Jonathan Terrasi

Jonathan Terrasi has been an ECT News Network columnist since 2017. In addition to his work as a freelance writer, he is a full-time computer science educator and IT decision-maker. His main interests are information security, with a focus on Linux desktops, and the influence of technology trends on current events. His background also includes providing technical commentary and analysis for the Chicago Committee to Defend the Bill of Rights.

1 Comment

  • LoL

    I was using very similar commands when using UNIX in the military back in the early 90s!

    Everything from creating a non-executable back up directory to sending directory lists to printers. Also useful and fun for sending information directly to other terminal screens on the network……..
