Sunday 15 July 2018

Split File Using Awk Command - Unix / Linux

The split command in Unix creates fixed-size output files from a given input file. However, the split command has no option for creating output files based on conditions on the input file data.
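For comparison, here is a minimal sketch of what split does on its own. The file name sample.txt and the prefix part_ are made up for illustration, and the scratch directory is only there so that no existing files are touched.

```shell
# Work in a scratch directory so no existing files are overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# sample.txt is a hypothetical five-line input file.
printf '%s\n' 'line 1' 'line 2' 'line 3' 'line 4' 'line 5' > sample.txt

# -l 2 puts two lines in each piece; part_ is the output name prefix.
split -l 2 sample.txt part_

ls part_*
```

The pieces are cut purely by line count (part_aa, part_ab, part_ac), with no regard to what the lines contain.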


The awk command can split a file into multiple output files based on a condition. Let us see this with an example. The following file, regions.dat, has data about two regions, Asia and Europe:

> cat regions.dat
Asia 123
Asia 456
Europe 789
Europe 956

The requirement is to create a separate file for each region. This means one file for the Asia region and another file for the Europe region.

Split Using Awk Command


The following awk command splits the input file on the region condition and creates two separate files:

> awk '{filename=$1; print $0 > filename;}' regions.dat
> ls 
Asia Europe regions.dat

> cat Asia
Asia 123
Asia 456

> cat Europe
Europe 789
Europe 956

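In the above command, we assigned the first column of the input, which is the region, to the filename variable and then redirected each line to the file with that name. Within a single awk run, the > redirection truncates a file only when it is first opened; later prints to the same name are appended to it.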
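When the input has many distinct keys, awk can run out of open file handles. The sketch below is one possible variation, not part of the original command: it closes each output file after every write, and therefore switches to >> for keys it has already seen. The variable names out and seen and the .txt suffix are illustrative choices, and the sample data is deliberately unsorted.

```shell
# Work in a scratch directory so no existing files are overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Same data as regions.dat, but with the regions interleaved.
printf '%s\n' 'Asia 123' 'Europe 789' 'Asia 456' 'Europe 956' > regions.dat

awk '{
    out = $1 ".txt"                     # hypothetical naming scheme
    # Truncate on the first visit to a key and append on later visits,
    # so closing the file in between does not lose earlier lines.
    if (out in seen) print $0 >> out
    else             print $0 >  out
    seen[out] = 1
    close(out)                          # keep at most one output file open
}' regions.dat

cat Asia.txt
cat Europe.txt
```

Closing after every line costs some speed, so this trade-off is mainly worth it when the number of distinct keys is large.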

Adding Header and Footer Records


We can modify the above command to add header and footer records to each output file. The awk command below writes a header before the first line of each region and a footer after the last one. It assumes that the lines of each region are contiguous in the input, as they are in regions.dat:

> awk '{filename=$1; if(filename != Prev_filename) { if(NR > 1) print "Footer" > Prev_filename; print "Header" > filename; } print $0 > filename; Prev_filename=filename} END{if(NR > 0) print "Footer" > Prev_filename}' regions.dat

> ls 
Asia Europe regions.dat

> cat Asia
Header
Asia 123
Asia 456
Footer

> cat Europe
Header
Europe 789
Europe 956
Footer
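If the lines of a region are not contiguous, comparing each line with the previous one would write several headers and footers into the same file. A minimal sketch of an alternative, assuming the number of distinct regions is small enough for awk to keep all output files open at once: write the header the first time a region is seen and write every footer in the END block. The names out and seen are illustrative.

```shell
# Work in a scratch directory so no existing files are overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Same data as regions.dat, but with the regions interleaved.
printf '%s\n' 'Asia 123' 'Europe 789' 'Asia 456' 'Europe 956' > regions.dat

awk '{
    out = $1
    # First time this region appears: start its file with a header.
    if (!(out in seen)) { print "Header" > out; seen[out] = 1 }
    print $0 > out
}
END {
    # Every file is still open here, so each footer is appended at its end.
    for (out in seen) print "Footer" > out
}' regions.dat

cat Asia
cat Europe
```

Each file gets exactly one header and one footer regardless of the input order.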
