For those who are not familiar with the uniq command, it is a command-line tool that reports or omits repeated lines. It filters adjacent matching lines from INPUT (or standard input) and writes the result to OUTPUT (or standard output). With no options, each run of matching lines is collapsed to its first occurrence.
Below are a few examples of how to use the uniq command.
1) Omit duplicates
Running uniq without any options collapses adjacent duplicate lines and prints each of them once.
fluser@fvm:~/Documents/files$ cat file1
Hello
Hello
How are you?
How are you?
Thank you
Thank you
fluser@fvm:~/Documents/files$ uniq file1
Hello
How are you?
Thank you
2) Display number of repeated lines
With the -c option, each output line is prefixed with the number of times it occurred consecutively in the input.
fluser@fvm:~/Documents/files$ cat file1
Hello
Hello
How are you?
How are you?
Thank you
Thank you
fluser@fvm:~/Documents/files$ uniq -c file1
2 Hello
2 How are you?
2 Thank you
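The counts above are shown without padding. On many systems (GNU coreutils, for example) uniq -c right-aligns the count with leading spaces, which can get in the way when the output is parsed further. A minimal sketch, assuming a sample built inline with printf rather than a real file1, that strips the padding with sed:

```shell
# Sample input built inline (stands in for file1 from the example above).
sample=$(printf '%s\n' 'Hello' 'Hello' 'How are you?' 'How are you?' 'Thank you' 'Thank you')

# Count adjacent duplicates, then remove any leading spaces before the count.
counts=$(printf '%s\n' "$sample" | uniq -c | sed 's/^ *//')

printf '%s\n' "$counts"
```

The variable names sample and counts are only placeholders for this illustration.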
3) Print only the duplicates
With the -d option, uniq prints only the lines that are repeated, one copy per group of duplicates.
fluser@fvm:~/Documents/files$ cat file1
Hello
Hello
Good morning
How are you?
How are you?
Thank you
Thank you
Bye
fluser@fvm:~/Documents/files$ uniq -d file1
Hello
How are you?
Thank you
4) Ignore case when comparing
By default, uniq is case-sensitive when comparing lines. To ignore case, use the -i option.
fluser@fvm:~/Documents/files$ cat file1
Hello
hello
How are you?
How are you?
Thank you
thank you
fluser@fvm:~/Documents/files$ uniq file1
Hello
hello
How are you?
Thank you
thank you
fluser@fvm:~/Documents/files$ uniq -i file1
Hello
How are you?
Thank you
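Note that -i still compares only adjacent lines, so case variants that sit far apart survive. One possible approach, sketched here with an inline sample instead of file1, is to sort with case folding (sort -f) first. Which spelling of each line is kept depends on the sort order, so the sketch does not rely on it:

```shell
# Inline sample: the case variants are not adjacent.
sample=$(printf '%s\n' 'Hello' 'Thank you' 'hello' 'thank you')

# uniq -i alone merges nothing here, because no two neighbours match.
unsorted=$(printf '%s\n' "$sample" | uniq -i)

# Folding case while sorting brings the variants next to each other first.
merged=$(printf '%s\n' "$sample" | sort -f | uniq -i)

printf 'without sort:\n%s\n' "$unsorted"
printf 'with sort -f:\n%s\n' "$merged"
```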
5) Only print unique lines
To see only the lines that are not repeated, use the -u option.
fluser@fvm:~/Documents/files$ cat file1
Hello
Hello
Good morning
How are you?
How are you?
Thank you
Thank you
Bye
fluser@fvm:~/Documents/files$ uniq -u file1
Good morning
Bye
6) Sort and find duplicates
Sometimes duplicate entries appear in different places in a file. Because uniq compares only adjacent lines, it will not detect these duplicates on its own. In that case, sort the file first and then pipe the result to uniq.
fluser@fvm:~/Documents/files$ cat file1
Adam
Sara
Frank
John
Ann
Matt
Harry
Ann
Frank
John
fluser@fvm:~/Documents/files$ sort file1 | uniq -c
1 Adam
2 Ann
2 Frank
1 Harry
2 John
1 Matt
1 Sara
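A common follow-up is to rank the entries by how often they occur. A minimal sketch, assuming the same names supplied inline instead of file1, that adds a second, reverse numeric sort on the count column:

```shell
# Same names as in the example above, supplied inline.
sample=$(printf '%s\n' Adam Sara Frank John Ann Matt Harry Ann Frank John)

# sort groups identical names, uniq -c counts each group,
# and sort -rn orders the groups from most to least frequent.
ranked=$(printf '%s\n' "$sample" | sort | uniq -c | sort -rn)

printf '%s\n' "$ranked"
```

The order of entries that share the same count is left to sort and may vary with the locale.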
7) Save the output in another file
The output of uniq can be saved to another file by giving the output file name as a second argument, as below.
fluser@fvm:~/Documents/files$ cat file1
Hello
Hello
How are you?
Good morning
Good morning
Thank you
fluser@fvm:~/Documents/files$ uniq -u file1
How are you?
Thank you
fluser@fvm:~/Documents/files$ uniq -u file1 output
fluser@fvm:~/Documents/files$ cat output
How are you?
Thank you
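Shell redirection gives the same result and is perhaps the more familiar idiom. The sketch below works in a scratch directory created with mktemp, and the file names in it are placeholders. Avoid naming the same file as both input and output: with redirection, the shell truncates the file before uniq gets to read it.

```shell
# Work in a scratch directory so no real files are touched.
workdir=$(mktemp -d)
printf '%s\n' 'Hello' 'Hello' 'How are you?' 'Good morning' 'Good morning' 'Thank you' > "$workdir/file1"

# Form 1: output file given as the second argument.
uniq -u "$workdir/file1" "$workdir/output_arg"

# Form 2: standard output redirected by the shell.
uniq -u "$workdir/file1" > "$workdir/output_redirect"

# Both files should hold the same lines.
cmp "$workdir/output_arg" "$workdir/output_redirect" && echo "identical"
```

Remove the scratch directory (rm -rf "$workdir") once it is no longer needed.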
8) Ignore characters
To ignore a few characters at the beginning of each line when comparing, use the -s option followed by the number of characters to skip.
fluser@fvm:~/Documents/files$ cat file1
1apple
2apple
3pears
4banana
5banana
fluser@fvm:~/Documents/files$ uniq -s 1 file1
1apple
3pears
4banana
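A related option, -f, skips whole blank-separated fields instead of single characters, which helps when the leading label varies in width. A small sketch with made-up sample lines:

```shell
# Made-up sample: a numeric label of varying width, then the value.
sample=$(printf '%s\n' '1 apple' '20 apple' '3 pears' '400 banana' '5 banana')

# -f 1 ignores the first field of every line when comparing,
# so only the text after the label decides whether lines match.
by_field=$(printf '%s\n' "$sample" | uniq -f 1)

printf '%s\n' "$by_field"
```

As with -s, the first line of each matching run is the one that is printed.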