How Lines of Text Work in the Linux Terminal
First rule of Linux combat: know your enemy. So let’s define it. What exactly is a line of text? It’s a sequence of characters—letters, numbers, symbols, and whitespace—that is terminated by a special byte that means “start a new line.” In Linux and Unix, the newline character, also called a linefeed, is used as the end of line indicator. This is a byte with a value of 0x0a in hexadecimal and ten in decimal.
Different operating systems use different byte values to indicate the end of a line. Windows uses a two-byte sequence. In Windows text files, the carriage return character, which is 0x0d in hexadecimal and thirteen in decimal, is followed immediately by the newline character.
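To make the byte values concrete, here is a minimal sketch that prints the bytes of a two-letter line with each style of line ending. It uses od rather than hexdump only because od is available on virtually every system; the sample text "hi" and the variable names are made up for illustration.

```shell
# Unix/Linux line ending: a single linefeed byte (0a).
unix_bytes=$(printf 'hi\n' | od -An -tx1)

# Windows line ending: carriage return (0d) followed by linefeed (0a).
windows_bytes=$(printf 'hi\r\n' | od -An -tx1)

echo "Unix:    $unix_bytes"
echo "Windows: $windows_bytes"
```

The letters "h" and "i" appear as 68 and 69 in both cases. Only the trailing bytes differ.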
The terms “linefeed” and “carriage return” date back to the typewriter. The platen, the cylinder that the paper was wrapped around, was mounted on a moveable carriage. The carriage moved one character’s width to the left each time you hit a key. To start a new line, you pushed a lever that brought the carriage back to its original position, and which rotated the roller and moved the paper upwards by the height of one line. This action was known as the carriage return, and the rotation of the cylinder (and the advancement of the paper) was known as a linefeed.
The lever was replaced by a key when the typewriter became electrified. The key was labeled Carriage Return or just Return. Some early computers such as the BBC Micro still used the name Return on what we now call the Enter key.
You can’t see newline characters, as a rule. You can only see their effect. The newline character forces software that displays or processes text to start a new line.
But What’s the Problem With Long Lines?
Text with few or no newline characters in it will be too wide to read comfortably in the terminal window. That’s annoying, but you can live with it.
A more pernicious issue is having to deal with lines of such length that they pose a problem to the software that needs to process, transmit, or receive the text. This might be caused by internal buffer lengths or other aspects of the software that you cannot adjust.
But there’s a fix for that, called fold.
First Steps with fold
Let’s take a look at a portion of text that has very, very long lines in it. Note that we’re not talking about sentences here. (Although the text comes from Herman Melville’s Moby Dick, so we’ve got the best of both worlds.)
A line of text is everything from the last newline character (or the start of the file if it is the first line in the file) all the way up to the next newline character, regardless of what is in between. The line may contain many sentences. It may wrap round in the terminal window many times. But it is still a single line of text.
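A quick way to check this definition is to count newline characters with wc -l, which is what that option actually counts. The sketch below uses made-up sample text, not the article's file.

```shell
# Three sentences, but only one newline at the very end: one line.
one_line_count=$(printf 'First sentence. Second sentence. Third sentence.\n' | wc -l)

# The same three sentences, each followed by a newline: three lines.
three_line_count=$(printf 'First sentence.\nSecond sentence.\nThird sentence.\n' | wc -l)

echo "single long line: $one_line_count"
echo "three short lines: $three_line_count"
```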
Let’s look at the text in its raw form:
The text is displayed in less:
The text stretches from one edge of the window to the other. The line wraps are ugly, and they break words in the middle.
We’ve got another version of the file with short lines:
The lines in this file are much shorter. Each line is terminated with a newline character.
If we use the hexdump command, we can look at the byte values within the file and see the newline characters. The -C (canonical) option formats the output to show hexadecimal values in the main body of the display with the text equivalents in a column at the side. We’ll pipe the output into less:
Press the forward slash “/” to enter less’s search function. Type “0a” and press Enter. The newline characters will be highlighted in the text. You can scroll through the file and see where they appear. If you need to, you can scroll the output sideways using the Left Arrow and Right Arrow keys.
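For a non-interactive version of the same inspection, the sketch below dumps a small two-line sample and then counts the newline bytes. The sample text and temporary file are assumptions made for the example. The fallback to od is there only because hexdump is not installed on every minimal system.

```shell
# Build a small two-line sample file (made-up text, not the article's file).
sample=$(mktemp)
printf 'short line one\nshort line two\n' > "$sample"

# -C shows offsets, hexadecimal byte values, and the text equivalents.
# Each "0a" in the hex columns is a newline character.
if command -v hexdump >/dev/null 2>&1; then
    hexdump -C "$sample"
else
    # Fallback for systems without hexdump: od prints similar information.
    od -A x -t x1c "$sample"
fi

# Cross-check: delete everything except newlines and count what is left.
newline_count=$(tr -cd '\n' < "$sample" | wc -c)
echo "newline bytes: $newline_count"
```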
Having a newline character at the end of each line can be a limitation in itself. No matter what program or window displays this text, the lines cannot expand to fill a window that is wider than the lines themselves. The line length has been capped by the newline characters.
So there are problems with long lines and short lines alike.
Reducing Long Lines
The fold command has an option -w (width) that lets you specify a new maximum width for a section of text. We’ll display the Moby Dick text with a maximum width of 50 characters:
The text is displayed in the terminal window, with the new maximum line length. The original file isn’t changed. It is only the output from fold that is reformatted.
At first glance, this looks a lot better. But words are still getting split in the middle at the ends of lines. It is definitely easier to read, but some of the awkward word breaks are jarring.
Although it looks like the right-hand margin of the text wavers in and out, all of the line lengths are the same. The lines that appear to be one character shorter than the rest happen to end in a space character.
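The same behavior can be reproduced on a small scale. This sketch folds a short made-up sentence at 12 characters instead of 50, so the effect fits in a few lines, and prints each output line's length beside it.

```shell
text='The quick brown fox jumps over the lazy dog'

# Fold at exactly 12 characters, regardless of word boundaries.
folded=$(printf '%s\n' "$text" | fold -w 12)

# Show each output line with its length; the bars mark the line edges.
printf '%s\n' "$folded" | awk '{ printf "%2d |%s|\n", length($0), $0 }'
```

Every line except the last is exactly 12 characters long, and words such as "brown" and "jumps" are cut wherever the limit happens to fall.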
Splitting Lines at Spaces
We can use the -s (spaces) option to make sure that lines are only split on space characters, and no words are broken across two lines.
The output now has a ragged right-hand margin, but it is easier to read. All of the words finish on the lines they started on.
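As a small-scale sketch of the -s option, the example below folds a made-up sentence at 12 characters and breaks only at spaces. The sed step strips the space that fold leaves at the end of each broken line, purely to make the result easier to compare.

```shell
text='The quick brown fox jumps over the lazy dog'

# -s moves the break back to the last space that fits within the width.
folded_at_spaces=$(printf '%s\n' "$text" | fold -w 12 -s)

# Strip trailing spaces for display; every word stays on a single line.
printf '%s\n' "$folded_at_spaces" | sed 's/ *$//'
```

No line exceeds 12 characters, but the lines now differ in length, which is where the ragged right-hand margin comes from.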
Making Short Lines Longer
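As well as making long lines shorter, we can use fold to help lift the enforced line lengths of shorter lines. fold only ever inserts newline characters and never removes them, so the existing newlines have to be stripped out first, for example by piping the text through tr.
With the original newline characters removed, the text now wraps on or before the allotted maximum length.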
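On its own, fold only adds line breaks, so one way to get this effect is to join the short lines first and then fold the result. The sketch below does the joining with tr. That choice is an assumption made for illustration and may not be the command the article ran. The sample text is made up.

```shell
# Three short lines, each ending in a newline.
short_lines=$(printf 'The quick brown\nfox jumps over\nthe lazy dog\n')

# tr swaps every newline for a space, giving one long line;
# fold then re-wraps that line at the wider 40-character limit.
rewrapped=$(printf '%s\n' "$short_lines" | tr '\n' ' ' | fold -w 40 -s)

printf '%s\n' "$rewrapped"
```

Because tr replaces every newline, blank lines between paragraphs would be lost as well, so this approach suits a single block of text.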
Making Changes Permanent
fold can’t modify the original file. If you want to keep the changes, you’ll have to redirect the output from fold into a new file. We’ll redirect the output into a file called “modified-moby-dick.txt.”
Let’s have a look at our new file:
How does our new file look?
The text is now wrapping neatly at our new line width, which is wider than the original file’s line lengths.
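The redirection step can be sketched as follows. The temporary directory and the sample.txt input are hypothetical stand-ins. Only the output name modified-moby-dick.txt comes from the text above.

```shell
# Work in a throwaway directory with a made-up one-line input file.
workdir=$(mktemp -d)
printf 'The quick brown fox jumps over the lazy dog\n' > "$workdir/sample.txt"

# fold writes to standard output; ">" captures that output in a new file.
fold -w 12 -s "$workdir/sample.txt" > "$workdir/modified-moby-dick.txt"

# The input still has one line; the new file holds the wrapped copy.
wc -l "$workdir/sample.txt" "$workdir/modified-moby-dick.txt"
```

Always redirect to a different file name. If the output file is the same as the input file, the shell empties it before fold gets a chance to read it.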
Using fold With Streams
We can use fold to reformat streams of text. It is not restricted to working solely with files. Let’s see what the raw output from the journalctl tool looks like. The -f (follow) option shows the newest entries in the systemd journal and updates as new entries arrive.
The output wraps at the terminal window’s edge.
It doesn’t look too bad, but for the sake of demonstration, let’s reduce its width slightly. We’re going to pipe the output from journalctl into fold. We’re setting the maximum width to 65 characters, and we’re breaking the lines on spaces only.
The display looks a little less overwhelming and a touch neater too.
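The pipeline can be sketched without a live journal by using a stand-in for the stream. The emit_log_lines function and its two messages are invented for the example. They are not real journal output.

```shell
# Hypothetical stand-in for a live source such as "journalctl -f".
# It prints two long, made-up lines and exits, so the example terminates.
emit_log_lines() {
    printf 'Jan 01 00:00:01 host demo[1]: first message with quite a lot of trailing detail attached to it\n'
    printf 'Jan 01 00:00:02 host demo[1]: second message that also runs well past the sixty-five character mark\n'
}

# fold reads standard input when no file is named.
emit_log_lines | fold -w 65 -s

# With the real journal, the same pipe would be:
#   journalctl -f | fold -w 65 -s
```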
Walls of solid text can seem impenetrable. They’re off-putting and sapping to deal with. When you need to be able to see the wood for the trees, call on fold and impose a bit of order.