Friday, March 11, 2011

Traps and exit codes in shell scripts

Traps in shell scripts are a convenient way to run cleanup code: above all the removal of temporary files, but also any other do-at-the-end task (compare the END block in awk and Perl).

For general cleanup one sets a trap on signal 0 (also spelled EXIT). This is not a real signal: the shell runs this trap whenever the script exits, either by reaching its end or through an exit command, whatever the exit code.
Other signals might need some action in addition to the cleanup and get traps of their own.
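A minimal sketch of the temporary-file cleanup mentioned above. The variable names are made up for illustration, and the work is done in a subshell only so that the surrounding shell can check afterwards that the trap on 0 really removed the file.

```sh
# Hypothetical sketch; the variable names are illustrative only.
leftover=$(
    tmpfile=$(mktemp) || exit 1
    trap 'rm -f "$tmpfile"' 0      # cleanup runs when this subshell exits
    echo "scratch data" > "$tmpfile"
    echo "$tmpfile"                # report the path so it can be checked below
)

if [ -e "$leftover" ]; then
    echo "temporary file still exists"
else
    echo "temporary file was removed"
fi
```

In a real script the trap would normally be set once near the top, right after the temporary file has been created.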

The questions about signals to be discussed in this article:
  • How to ignore them: sometimes one does not want a script to be interrupted by certain signals
  • How to catch them, react accordingly and exit
  • If exiting after a signal: which exit code should be used
Assuming there is a handler for 0 (run when the script exits), there should also be a handler for signal INT. The setups below show what happens after the script has received INT.

Catch signal INT (Ctrl-C) and ignore it
trap "echo exiting" 0
trap "echo got INT" INT
This script
  • will echo 'got INT' and
  • will resume its operation and will not end
(To ignore the signal completely, without even a message, use an empty action: trap '' INT)

Catch signal INT (Ctrl-C) and exit
trap "echo exiting" 0
trap "echo got INT ; exit 1" INT
This script
  • will echo 'got INT',
  • will echo 'exiting', because the exit command triggers the trap on 0, and
  • will exit with exit code 1 to indicate that this was not a normal ending.
    Instead of 1 there could be any other non-zero number up to 255.

A variant of the script above exits with 0 after catching the signal.
trap "echo exiting" 0
trap "echo got INT ; exit 0" INT
This script
  • will echo 'got INT' and
  • will echo 'exiting' and
  • will exit with exit code 0, so the caller cannot tell the interrupted run from a normal one.
Instead of 'echo ...' there should be some real action in a production script of course.
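
The catch-and-exit setup can be tried without a keyboard. The following is a rough sketch, not a production pattern: a throwaway child shell sends INT to itself as a stand-in for Ctrl-C, and the caller captures what it prints and how it exits. It assumes that INT is not already being ignored when the child starts.

```sh
# Run a child shell that interrupts itself; capture its output and exit code.
status=0
output=$(sh -c '
    trap "echo exiting" 0
    trap "echo got INT; exit 1" INT
    kill -INT $$          # stand-in for the user pressing Ctrl-C
    echo "not reached"
') || status=$?

printf '%s\n' "$output"   # prints: got INT, then exiting
echo "exit code: $status" # prints: exit code: 1
```

In bash and dash the trap on 0 also runs here, although the exit code is 1.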

So if one decides that a signal should not be ignored, one big question remains: does the observer of the script (a calling script or a user) need to know that the script ended because of a signal, and because of which signal in particular? When answering it, keep in mind that scripts often exit with small exit codes when something goes wrong along the way.

  • All (or many) signals are mapped to the same non-zero exit code
    There is little room for variation here. The exit code could be any number. If the script uses a small number (e.g. exit code 1), that might be indistinguishable from other error-induced exit codes in the script. Alternatively one could use a high number (greater than 128) to distinguish endings caused by a signal from other endings in the script. But by mapping all signals to one exit code the script does not give its observer a chance to find out exactly which signal led to its end (this can of course be a deliberate design decision).
    trap "echo got SIGNAL; exit 1" INT QUIT TERM
    (the message 'got SIGNAL' could be used to distinguish this type of exit from other exit 1 reasons in the script)
    trap "echo got SIGNAL; exit 129" INT QUIT TERM

  • Signals should be distinguished from each other i.e. mapped to different exit codes

    Same exit code
    trap "echo got INT; exit 1" INT
    trap "echo got TERM; exit 1" TERM
    All signals lead to the same exit code.
    The echo statement is a differentiator but is probably not present in a real life script.

    Different exit codes (small numbers)
    trap "echo got INT; exit 1" INT
    trap "echo got TERM; exit 2" TERM
    Here signals lead to different exit codes.
    Issue: they are probably not distinguishable from other points of exit in the script.

    Different exit codes (high numbers)
    trap "echo got INT; exit 130" INT
    trap "echo got TERM; exit 143" TERM
    In addition to being different, the exit codes follow the formula 128+signal_number (INT is 2, TERM is 15), which is the convention sh itself uses when it reports a command that was killed by a signal.

So if you want to capture signals in a script, end the script, and also get a meaningful exit code that tells you which signal it was:
  • capture each signal individually
  • explicitly put an 'exit n' into the signal handler
  • choose n to be 128+signal
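
The three recommendations above can be sketched as follows. The helper name status_for is made up for this demo. The signal numbers 2, 3 and 15 are the values POSIX lists for INT, QUIT and TERM. The child shell signals itself only so that the resulting exit code can be shown.

```sh
status_for() {
    # $1 = name of the signal to send (hypothetical demo helper)
    code=0
    sh -c '
        trap "exit 130" INT     # 128 + 2
        trap "exit 131" QUIT    # 128 + 3
        trap "exit 143" TERM    # 128 + 15
        kill -"$1" $$           # the child signals itself
        exit 0                  # only reached if no trap fired
    ' demo "$1" || code=$?
    echo "$code"
}

for sig in INT QUIT TERM; do
    echo "$sig -> $(status_for "$sig")"
done
# prints:
# INT -> 130
# QUIT -> 131
# TERM -> 143
```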

This way a calling script can differentiate:
# ex holds $? saved right after the called script returned
if [ "$ex" -eq 0 ] ; then
  echo "All ok"
elif [ "$ex" -lt 128 ] ; then
  echo "An error occurred in the script"
else
  echo "Script ended due to signal $((ex - 128))"
fi
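
If the caller wants the signal name rather than the number, kill -l can translate an exit code above 128 back into a name. POSIX specifies this form of kill -l, and bash and dash implement it. The sketch below uses a made-up helper name, describe_exit, and a one-line stand-in for the called script. It treats 128 itself as an ordinary error because no signal corresponds to it.

```sh
describe_exit() {
    # $1 = exit code collected by the caller (hypothetical helper name)
    if [ "$1" -eq 0 ]; then
        echo "ok"
    elif [ "$1" -le 128 ]; then     # 128 itself has no signal attached
        echo "error $1"
    else
        # kill -l maps an exit code above 128 back to a signal name
        echo "signal $(kill -l "$1")"
    fi
}

# Stand-in for a real script: it traps TERM with exit 128+15, then signals itself.
ex=0
sh -c 'trap "exit 143" TERM; kill -TERM $$; exit 0' || ex=$?
describe_exit "$ex"     # prints: signal TERM
```

This only works if the called script never uses codes above 128 for ordinary errors. A code such as 255 does not correspond to any signal.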
