Thursday, November 22, 2012

Newline in awk and Perl

When someone switches from awk to Perl, a common beginner's mistake is to forget that awk does some things automatically which you need to code explicitly in Perl.

One example: end of line.
awk automatically drops the newline character at the end of a line. In Perl you need to do that manually by using the chomp function (or some other method).

echo abc prints the string abc followed by a newline character. Because awk strips that newline from the input line, the following string comparison

echo abc | awk '{if($0=="abc") print "yes"}'

prints "yes", whereas the seemingly equivalent Perl code does not:

echo abc | perl -e 'if(<> eq "abc") { print "yes\n" }'

The experienced Perl coder probably does this:

echo abc | perl -e '$a=<>; chomp($a); if($a eq "abc") {print "yes\n" }'

Of course you could change the test and explicitly check for the newline (or better, the input record separator $/ in Perl):

echo abc | perl -e 'if(<> eq "abc$/") {print "yes\n" }'

Or you could use echo -n (or, more portably, printf) to feed just the string without a newline. And of course print in Perl needs an explicit newline too, whereas awk's print adds one automatically.

A bit more complex: check the content of the last field in an input line.
awk:
echo abc xyz | awk '{ if($NF=="xyz") print "yes" }' 
Perl:
echo abc xyz | perl -e '@a=split / /,<>; chomp(@a); if($a[$#a] eq "xyz") {print "yes\n" }'
Without the chomp(), the last field is the string xyz plus a newline, so the comparison fails.

Perl in that sense is more precise and gives the user greater control. On the other hand, awk is an old but well-established UNIX tool whose built-in features can be used to one's advantage.

It is nice to have tools which do things automatically. The drawback is that you get so used to them that over time you forget that these automations exist.

(A real-life example for me: my car has parking sensors. After one year I'm already so used to them that whenever I drive another car I tend to forget that I have to use the good old-fashioned method rather than waiting for the frequency of the beeps.)

