Sunday, 3 March 2019

comm command in Linux with examples

comm compares two sorted files line by line and writes to standard output the lines that are unique to each file and the lines that are common to both.


Suppose you have two lists of people and need to find the names that appear in one list but not the other, or the names common to both. comm is the command for this. It requires two sorted files, which it compares line by line.
Before going further, let’s look at the syntax of the comm command:

Syntax :

$comm [OPTION]... FILE1 FILE2

◈ comm compares two files, so its syntax requires two filenames as arguments.
◈ With no OPTION, comm produces three-column output: the first column contains lines unique to FILE1, the second column contains lines unique to FILE2, and the third column contains lines common to both files.
◈ comm gives correct results only when both files are already sorted.
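
Because comm expects sorted input, unsorted lists need to be sorted before they are compared. The following is a minimal sketch, assuming GNU sort and comm. The file names raw1.txt, raw2.txt, sorted1.txt and sorted2.txt and their contents are hypothetical, and LC_ALL=C is set only so that sort and comm agree on the ordering.

```shell
# Work in a scratch directory so no existing file is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Two hypothetical, unsorted name lists.
printf '%s\n' Hemant Apaar Deepak > raw1.txt
printf '%s\n' Lucky Hemant Apaar > raw2.txt

# Sort each list under the same locale that comm will use.
LC_ALL=C sort raw1.txt > sorted1.txt
LC_ALL=C sort raw2.txt > sorted2.txt

# Compare the sorted copies (default three-column output).
result=$(LC_ALL=C comm sorted1.txt sorted2.txt)
printf '%s\n' "$result"
```

Sorting into separate files keeps the original lists untouched, which is useful when the originals are still needed in their original order.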

Example: Suppose there are two sorted files, file1.txt and file2.txt, and we use the comm command to compare them.

// displaying contents of file1 //
$cat file1.txt
Apaar
Ayush Rajput
Deepak
Hemant

// displaying contents of file2 //
$cat file2.txt
Apaar
Hemant
Lucky
Pranjal Thakral

Now, run comm command as:

// using comm command for comparing two files //
$comm file1.txt file2.txt
                Apaar
Ayush Rajput
Deepak
                Hemant
        Lucky
        Pranjal Thakral

The above output consists of three columns. The first column has no leading tab and contains names present only in file1.txt; the second column is preceded by one tab and contains names present only in file2.txt; the third column is preceded by two tabs and contains names common to both files.

This is the default output format of the comm command when no option is used.

Options for comm command:

1. -1 : suppress first column (lines unique to first file).
2. -2 : suppress second column (lines unique to second file).
3. -3 : suppress third column (lines common to both files).
4. --check-order : check that the input is correctly sorted, even if all input lines are pairable.
5. --nocheck-order : do not check that the input is correctly sorted.
6. --output-delimiter=STR : separate columns with string STR.
7. --help : display a help message, and exit.
8. --version : output version information, and exit.

Note: Options 4 to 8 are rarely used, but options 1 to 3 are very useful for selecting the output you want.

Using comm with options

1. Using the -1, -2 and -3 options: These three options are easiest to explain with an example:

//suppress first column using -1//
$comm -1 file1.txt file2.txt
        Apaar
        Hemant
Lucky
Pranjal Thakral

//suppress second column using -2//
$comm -2 file1.txt file2.txt
        Apaar
Ayush Rajput
Deepak
        Hemant

//suppress third column using -3//
$comm -3 file1.txt file2.txt
Ayush Rajput
Deepak
        Lucky
        Pranjal Thakral

Note that you can also suppress multiple columns using these options together as:

//...suppressing multiple columns...//

$comm -12 file1.txt file2.txt
Apaar
Hemant

/* using -12 together suppressed both first
and second columns */
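
The same idea answers the question from the introduction about names that appear in one list but not the other. The sketch below recreates file1.txt and file2.txt from the example in a scratch directory, then uses -23 for lines only in the first file and -13 for lines only in the second. The variable names only_in_file1 and only_in_file2 are illustrative, and LC_ALL=C is an assumption made to keep the ordering predictable.

```shell
# Recreate the example files in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'Apaar' 'Ayush Rajput' 'Deepak' 'Hemant' > file1.txt
printf '%s\n' 'Apaar' 'Hemant' 'Lucky' 'Pranjal Thakral' > file2.txt

# -23 suppresses columns 2 and 3, leaving lines unique to file1.txt.
only_in_file1=$(LC_ALL=C comm -23 file1.txt file2.txt)

# -13 suppresses columns 1 and 3, leaving lines unique to file2.txt.
only_in_file2=$(LC_ALL=C comm -13 file1.txt file2.txt)

printf 'Only in file1.txt:\n%s\n' "$only_in_file1"
printf 'Only in file2.txt:\n%s\n' "$only_in_file2"
```

Since only one column remains in each case, the output has no leading tabs and can be passed directly to other commands.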

2. Using the --check-order option: This option checks whether the input files are sorted. If either of the two files is wrongly ordered, the comm command fails with an error message.

$comm --check-order f1.txt f2.txt

The above command produces the normal output if both f1.txt and f2.txt are sorted, and gives an error message if either of the two files is not sorted.
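
In a script, the failure can be detected from the exit status rather than by reading the message. This is a minimal sketch, assuming GNU comm, where --check-order makes the command exit with a non-zero status on wrongly ordered input. The file names unsorted.txt and sorted.txt and the status variable are hypothetical.

```shell
# Create one unsorted and one sorted file in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' Pranjal Kartik > unsorted.txt
printf '%s\n' Apaar Kartik > sorted.txt

# Discard comm's output; only its exit status is of interest here.
if LC_ALL=C comm --check-order unsorted.txt sorted.txt > /dev/null 2>&1; then
    status="inputs are sorted"
else
    status="at least one input is not sorted"
fi
echo "$status"
```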

3. Using the --nocheck-order option: If you don’t want comm to check whether the input files are sorted, use this option. This can be explained with an example.

//displaying contents of unsorted f1.txt//

$cat f1.txt
Pranjal
Kartik

//displaying contents of sorted file f2.txt//

$cat f2.txt
Apaar
Kartik

//now use --nocheck-order option with comm//

$comm --nocheck-order f1.txt f2.txt
        Apaar
        Kartik
Pranjal
Kartik

/*as this option forced comm not to check
the sorted order, it printed no warning, but
the output is not in sorted order and Kartik
is not reported as common to both files*/

4. --output-delimiter=STR option: By default, the columns in the comm command output are separated by tabs, as explained above. You can change this to a string of your choice with the --output-delimiter option, which requires you to specify the string to use as the separator.

Syntax:

$comm --output-delimiter=STR FILE1 FILE2

EXAMPLE:

//...comm command with --output-delimiter=STR option...//

$comm --output-delimiter=+ file1.txt file2.txt
++Apaar
Ayush Rajput
Deepak
++Hemant
+Lucky
+Pranjal Thakral

/*+ before content indicates content of
second column and ++ before content
indicates content of third column*/ 
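
Any other string works the same way. As a rough sketch, assuming GNU comm, the block below recreates the example files in a scratch directory and uses a comma as the delimiter, so the number of leading commas shows which column a line belongs to. The variable name comma_separated is illustrative.

```shell
# Recreate the example files in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'Apaar' 'Ayush Rajput' 'Deepak' 'Hemant' > file1.txt
printf '%s\n' 'Apaar' 'Hemant' 'Lucky' 'Pranjal Thakral' > file2.txt

# No comma: unique to file1.txt; one comma: unique to file2.txt;
# two commas: common to both files.
comma_separated=$(LC_ALL=C comm --output-delimiter=, file1.txt file2.txt)
printf '%s\n' "$comma_separated"
```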
