Thursday, December 16, 2010

Replace 'grep ... | awk' with awk pattern matching

I've seen so many occurrences of 'grep ... | awk' over the years, with people missing out on awk's pattern matching capabilities, that I decided to blog about it in what will eventually become a series on scripting best practices.

Very often people are tempted to use constructs like this:

grep somepattern filename | awk '{dosomething}'
e.g. print the second field of every line starting with the digit 1:
grep '^1' /etc/hosts | awk '{print $2}'

This runs two processes connected by a pipe, which can be simplified to a single awk process:

awk '/^1/ {print $2}' /etc/hosts

This pays off even more when there are multiple greps in the pipe:

grep '^1' /etc/hosts | grep -v localhost | awk '{print $2}'
vs.
awk '/^1/ && !/localhost/ {print $2}' /etc/hosts
i.e. combining pattern matching with logical expressions is a useful construct.
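
A quick way to convince yourself that the two forms agree is to run both against the same input and compare the results. The sketch below is only an illustration: the temporary file and its four lines are made-up sample data standing in for /etc/hosts, and the variable names are arbitrary.

```shell
# Made-up sample data standing in for /etc/hosts (an assumption for this demo)
sample=$(mktemp)
cat > "$sample" <<'EOF'
127.0.0.1 localhost
10.0.0.5 db1
192.168.1.10 web1
8.8.8.8 dns
EOF

# Three processes: two greps feeding awk
piped=$(grep '^1' "$sample" | grep -v localhost | awk '{print $2}')

# One process: both conditions combined in a single awk pattern
single=$(awk '/^1/ && !/localhost/ {print $2}' "$sample")

rm -f "$sample"

if [ "$piped" = "$single" ]; then
    echo "identical output"
else
    echo "outputs differ"
fi
```

On this sample both variants select the same lines, so the comparison reports identical output.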

This example also shows that an 'if' clause in awk can often be written more concisely as a pattern.
The '/^1/ && !/localhost/' version is nicer than the roughly equivalent code

awk '/^1/ { if ($2 != "localhost") print $2 }' /etc/hosts
though admittedly the two are not exactly equivalent:
the pattern version rejects any line containing the string 'localhost' anywhere,
whereas the 'if' version rejects only lines whose second field is equal to 'localhost'.
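
If the field-level test is what you actually want, the comparison itself can be used as a pattern, so the 'if' still goes away. The following is a minimal sketch on made-up sample data (the file contents and variable names are assumptions for illustration); it contains one line where the two approaches disagree.

```shell
# Made-up sample data; the second line mentions "localhost" only in its third field
hosts_sample=$(mktemp)
cat > "$hosts_sample" <<'EOF'
127.0.0.1 localhost
10.0.0.5 db1 localhost-alias
192.168.1.10 web1
8.8.8.8 dns
EOF

# Whole-line regex: drops every line containing "localhost" anywhere
by_line=$(awk '/^1/ && !/localhost/ {print $2}' "$hosts_sample")

# Field comparison used as a pattern: drops only lines whose second field is exactly "localhost"
by_field=$(awk '/^1/ && $2 != "localhost" {print $2}' "$hosts_sample")

rm -f "$hosts_sample"

printf 'by_line:\n%s\n' "$by_line"
printf 'by_field:\n%s\n' "$by_field"
```

Here the line for db1 is dropped by the whole-line regex but kept by the field comparison, which is exactly the difference described above.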
