The Unix split command breaks an input file into fixed-size pieces, for example a fixed number of lines or bytes per output file. However, split has no option for choosing the output file based on the contents of the input data.
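To make the limitation concrete, here is a minimal sketch of a fixed-size split. The file name sample.dat and the part_ prefix are made up for illustration; only the standard -l option of split is used.

```shell
# Build a small sample file (hypothetical name: sample.dat).
printf 'Asia 123\nAsia 456\nEurope 789\nEurope 956\n' > sample.dat

# Cut it into pieces of three lines each. split names the pieces
# part_aa, part_ab, ... and pays no attention to what the lines contain.
split -l 3 sample.dat part_

# The first piece mixes both regions; the second holds the leftover line.
cat part_aa
cat part_ab
```

Because the cut points depend only on the line count, the Europe records end up spread over two pieces.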
The awk command can split a file into multiple output files based on a condition on the data. Let's see this with an example. The following file, regions.dat, has data about two regions, Asia and Europe:
> cat regions.dat
Asia 123
Asia 456
Europe 789
Europe 956
The requirement is to create a separate file for each region: one file for the Asia region and another for the Europe region.
Split Using Awk Command
The following awk command splits the input file by region and creates two separate files:
> awk '{filename=$1; print $0 > filename;}' regions.dat
> ls
Asia Europe regions.dat
> cat Asia
Asia 123
Asia 456
> cat Europe
Europe 789
Europe 956
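In the above command, we assign the first column of the input, which is the region, to the filename variable and then print each line to the file with that name. awk creates each output file the first time it is written to and keeps it open, so later lines for the same region are added to the same file.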
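As a variation on the same idea, the sketch below builds the output name from the first column plus a suffix, and uses input whose regions are interleaved to show that this simple split does not need sorted data. The file name mixed.dat and the .txt suffix are assumptions made for this example.

```shell
# Sample input with the regions interleaved (hypothetical file name).
printf 'Asia 123\nEurope 789\nAsia 456\nEurope 956\n' > mixed.dat

# Write each line to "<region>.txt". The parentheses around the
# concatenation keep the redirection target unambiguous across
# awk implementations.
awk '{ print > ($1 ".txt") }' mixed.dat

cat Asia.txt
cat Europe.txt
```

Each output file collects the lines of its own region in their original order.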
Adding Header and Footer Records
We can modify the above command to add header and footer records to each output file. The awk command below writes a Header record before the first line of each region and a Footer record after the last one. It assumes that the lines for each region are contiguous in the input, as they are in regions.dat:
> awk '{ filename = $1; if (filename != prev) { if (prev != "") print "Footer" > prev; print "Header" > filename } print $0 > filename; prev = filename } END { if (prev != "") print "Footer" > prev }' regions.dat
> ls
Asia Europe regions.dat
> cat Asia
Header
Asia 123
Asia 456
Footer
> cat Europe
Header
Europe 789
Europe 956
Footer
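If the lines for a region are not contiguous, one possible approach is to remember which output files have already received a header and to write all footers at the end. The following is only a sketch under that assumption; the file name mixed.dat, the .out suffix, and the seen array are names chosen for this example.

```shell
# Sample input with the regions interleaved (hypothetical file name).
printf 'Asia 123\nEurope 789\nAsia 456\nEurope 956\n' > mixed.dat

awk '
{
    out = $1 ".out"            # hypothetical naming: region plus ".out"
    if (!(out in seen)) {      # first line seen for this region
        print "Header" > out
        seen[out] = 1
    }
    print $0 > out
}
END {
    for (out in seen)          # each output file gets exactly one footer
        print "Footer" > out
}
' mixed.dat

cat Asia.out
cat Europe.out
```

This version keeps every output file open until the end of the input, so with a very large number of distinct regions it may run into the limit on open files.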