Thursday, 27 February 2020

LPIC-1 Using SED


Sed is extremely powerful, and the tasks it can accomplish are limited only by your imagination. This small introduction should whet your appetite for sed, but is not intended to be complete or extensive.

As with many of the text commands we have looked at so far, sed can work as a filter or take its input from a file. Output is to the standard output stream. Sed loads lines from the input into the pattern space, applies sed editing commands to the contents of the pattern space, and then writes the pattern space to standard output. Sed might combine several lines in the pattern space, and it might write to a file, write only selected output, or not write at all.
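
As a small sketch of these input and output modes, the following shows sed reading a named file, acting as a filter on standard input, and writing only selected lines with the -n option and the p command. The temporary file and its three lines are assumptions that mirror the text1 file used in the listings below.

```shell
# Recreate a three-line sample file (assumed contents, mirroring text1).
demo=$(mktemp)
printf '1 apple\n2 pear\n3 banana\n' > "$demo"

# sed reading a named file: an empty script copies every line unchanged.
from_file=$(sed '' "$demo")

# sed as a filter: the same data arrives on standard input instead.
as_filter=$(sed '' < "$demo")

# -n suppresses the automatic write of the pattern space, so the
# p command prints only the lines selected by the /pear/ address.
selected=$(sed -n '/pear/p' "$demo")

printf '%s\n' "$from_file" "$as_filter" "$selected"
rm -f "$demo"
```

The first two commands produce the same text, and the third writes only the line that contains "pear".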

Sed uses regular expression syntax to search for and replace text selectively in the pattern space as well as to control which lines of text should be operated on by sets of editing commands. Regular expressions are covered more fully in the tutorial on searching text files using regular expressions. A hold buffer provides temporary storage for text. The hold buffer might replace the pattern space, be added to the pattern space, or be exchanged with the pattern space. Sed has a limited set of commands, but these combined with regular expression syntax and the hold buffer make for some amazing capabilities. A set of sed commands is usually called a sed script.
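
To make the hold buffer a little more concrete, here is a minimal sketch that is not part of the original listings: a commonly seen one-liner that reverses the order of the input lines. The three sample lines are an assumption chosen to match the fruit data used later.

```shell
# Reverse the order of the input lines using the hold buffer.
#   1!G  on every line except the first, append the hold buffer
#        to the pattern space
#   h    copy the pattern space into the hold buffer
#   $p   on the last line, print the accumulated pattern space
# -n suppresses the default output, so only the final print appears.
reversed=$(printf '1 apple\n2 pear\n3 banana\n' | sed -n '1!G;h;$p')
printf '%s\n' "$reversed"
```

Each new line is placed in front of everything saved so far, so the last line read ends up first in the output.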

Listing 1 shows three simple sed scripts. In the first one, we use the s (substitute) command to replace a lowercase 'a' with an uppercase 'A' on each line. This example replaces only the first 'a' on each line, so in the second example, we add the 'g' (for global) flag to cause sed to change all occurrences. In the third script, we introduce the d (delete) command to delete a line. In our example, we use an address of 2 to indicate that only line 2 should be deleted. We separate commands using a semicolon (;) and then apply the same global substitution that we used in the second script, this time prefixed with the address $ so that 'a' is replaced with 'A' only on the last line.

Listing 1. Beginning sed scripts

ian@Z61t-u14:~/lpi103-2$ sed 's/a/A/' text1
1 Apple
2 peAr
3 bAnana
ian@Z61t-u14:~/lpi103-2$ sed 's/a/A/g' text1
1 Apple
2 peAr
3 bAnAnA
ian@Z61t-u14:~/lpi103-2$ sed '2d;$s/a/A/g' text1
1 apple
3 bAnAnA

In addition to operating on individual lines, sed can operate on a range of lines. The beginning and end of the range are separated by a comma (,), and each can be specified as a line number, a regular expression, or a dollar sign ($) for the end of file. Given an address or a range of addresses, you can group several commands between curly braces ({ and }) to have these commands operate only on lines selected by the range. Listing 2 illustrates two ways of having our global substitution applied to only the last two lines of our file. It also illustrates the use of the -e option to add multiple commands to the script.

Listing 2. Sed addresses

ian@Z61t-u14:~/lpi103-2$ sed -e '2,${' -e 's/a/A/g' -e '}' text1
1 apple
2 peAr
3 bAnAnA
ian@Z61t-u14:~/lpi103-2$ sed -e '/pear/,/bana/{' -e 's/a/A/g' -e '}' text1
1 apple
2 peAr
3 bAnAnA
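
The same grouping can also be written as a single script string instead of several -e options. The following is a sketch under the assumption that the sample file holds the same three lines as text1. The semicolon before the closing brace is included because some versions of sed require it.

```shell
# Recreate the three-line sample (assumed contents, mirroring text1).
sample=$(mktemp)
printf '1 apple\n2 pear\n3 banana\n' > "$sample"

# Same ranges and substitution as Listing 2, each as one script string.
# Single quotes keep the shell from touching the $ address.
by_number=$(sed '2,${s/a/A/g;}' "$sample")
by_regex=$(sed '/pear/,/bana/{s/a/A/g;}' "$sample")

printf '%s\n' "$by_number" "$by_regex"
rm -f "$sample"
```

Both forms select lines 2 through 3 of this file, so they give the same result as the two commands in Listing 2.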

Sed scripts can also be stored in files. In fact, you probably want to do this for frequently used scripts. Recall that earlier we used the tr command to change blanks in text1 to tabs. Let's now do that with a sed script stored in a file. We use the echo command to create the file. The results are shown in Listing 3.

Listing 3. A sed one-liner

ian@Z61t-u14:~/lpi103-2$ echo -e "s/ /\t/g">sedtab
ian@Z61t-u14:~/lpi103-2$ cat sedtab
s/ /    /g
ian@Z61t-u14:~/lpi103-2$ sed -f sedtab text1
1   apple
2   pear
3   banana

There are many handy sed one-liners such as the one in Listing 3.
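
Because the handling of -e and backslash escapes by echo differs between shells, the script file can also be created with printf. This is a minimal sketch rather than the method used in Listing 3; the temporary directory is an assumption, and the file names text1 and sedtab are reused from the listing.

```shell
# Work in a scratch directory so no existing files are overwritten.
work=$(mktemp -d)
printf '1 apple\n2 pear\n3 banana\n' > "$work/text1"

# printf expands \t in its format string, so the script file
# contains a real tab character between the second and third slash.
printf 's/ /\t/g\n' > "$work/sedtab"

# -f tells sed to read its commands from the script file.
tabbed=$(sed -f "$work/sedtab" "$work/text1")
printf '%s\n' "$tabbed"
rm -rf "$work"
```

Each blank in the input is replaced by a tab, as in Listing 3.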

Our final sed example uses the = command to print line numbers and then filters the resulting output through sed again to mimic the effect of the nl command, which numbers lines. The = command in sed prints the current line number followed by a newline character, so the output contains two lines for each input line. Listing 4 uses = to print line numbers, then uses the N command to read a second input line into the pattern space, and finally removes the newline character (\n) between the two lines in the pattern space to merge the two lines into a single line.

Listing 4. Numbering lines with sed

ian@Z61t-u14:~/lpi103-2$ sed '=' text2
1
9   plum
2
3   banana
3
10  apple
ian@Z61t-u14:~/lpi103-2$ sed '=' text2|sed 'N;s/\n//'
19  plum
23  banana
310 apple

Not quite what we wanted! What we would really like is to have our numbers aligned in a column with some space before the lines from the file. In Listing 5, we enter several lines of commands (note the > secondary prompt). Study the example and refer to the explanation below.

Listing 5. Numbering lines with sed - round two

ian@Z61t-u14:~/lpi103-2$ cat text1 text2 text1 text2>text6
ian@Z61t-u14:~/lpi103-2$ ht=$(echo -en "\t")
ian@Z61t-u14:~/lpi103-2$ sed '=' text6|sed "N
> s/^/      /
> s/^.*\(......\)\n/\1$ht/"
     1  1 apple
     2  2 pear
     3  3 banana
     4  9   plum
     5  3   banana
     6  10  apple
     7  1 apple
     8  2 pear
     9  3 banana
    10  9   plum
    11  3   banana
    12  10  apple

Here are the steps that we took:
  1. We first used cat to create a 12-line file from two copies each of our text1 and text2 files. There's no fun in formatting numbers in columns if we don't have differing numbers of digits.
  2. The bash shell uses the tab key for command completion, so it can be handy to have a captive tab character that you can use when you want a real tab. We use the echo command to accomplish this and save the character in the shell variable 'ht'.
  3. We create a stream that contains line numbers followed by data lines as we did before and filter it through a second copy of sed.
  4. We read a second line into the pattern space.
  5. We prefix our line number at the start of the pattern space (denoted by ^) with six blanks.
  6. We then substitute all of the pattern space up to and including the first newline with the six characters immediately before the newline plus a tab character. This aligns our line numbers in the first six columns of the output line. The original line from the text6 file follows the tab character. Note that the left part of the 's' command uses '\(' and '\)' to mark the characters that we want to use in the right part. In the right part, we reference the first such marked set (and only such set in this example) as \1. Note that our command is contained between double quotation marks (") so that substitution occurs for $ht.
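
The steps above can be sketched as a small shell function so that the pipeline can be reused on any input. The function name number_lines and the variable name tab are hypothetical and do not appear in the original listing. The tab is produced with printf instead of echo -en, which is an assumption made for portability.

```shell
# Hypothetical helper (not from Listing 5): number the lines of
# standard input in a right-aligned, six-column field.
number_lines() {
    tab=$(printf '\t')    # a real tab character, like $ht in Listing 5
    # First sed prints each line number on its own line; the second
    # joins number and data (N), pads the number with six blanks, then
    # keeps only the last six characters before the newline plus a tab.
    sed '=' | sed "N
s/^/      /
s/^.*\(......\)\n/\1$tab/"
}

# Ten sample lines (assumed data) so that both one-digit and
# two-digit line numbers appear.
numbered=$(printf 'a\nb\nc\nd\ne\nf\ng\nh\ni\nj\n' | number_lines)
printf '%s\n' "$numbered"
```

For input with no blank lines, the result should match what nl prints with its default options on a GNU system, which is the effect the example sets out to mimic.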
