Tuesday, February 1, 2011

Scope of sub processes in sh vs. ksh

Lately I ran into a problem: I started a process in the background in a Bourne shell script, but the 'wait' command did not wait and the script ended immediately.
Looking into the issue, it was once again due to the scope of sub processes: changes made in a sub process do not influence the parent script. Moving the shell script to ksh solved the issue for me (that was on a Solaris 10 system).

So here are some thoughts about sub processes in various types of shells.

Code that works for both shells

The script starts a sleep process in the background and waits for it to finish.
The 'date' commands note the start and end times.
sh:
#!/bin/sh
date '+%M:%S'
while : ; do
  sleep 10 &
  break
done
wait
date '+%M:%S'

ksh:
#!/bin/ksh
date '+%M:%S'
while : ; do
  sleep 10 &
  break
done
wait
date '+%M:%S'

Both scripts result in the same time difference of 10 seconds.

sh:
01:24
01:34

ksh:
01:24
01:34

Code that works only for ksh

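The only change to the scripts is that the while loop now sits at the end of a pipeline (echo | while ...).
Why does that matter? Because in sh a while loop at the end of a pipeline runs in a subshell. Any process started in it is a child of that subshell and not of the parent shell, which in this case has no idea that sub processes have been started at all.
So the Bourne shell script ends immediately whereas the Korn shell script waits as before.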
sh:
#!/bin/sh
date '+%M:%S'
echo | while : ; do
  sleep 10 &
  break
done
wait
date '+%M:%S'

ksh:
#!/bin/ksh
date '+%M:%S'
echo | while : ; do
  sleep 10 &
  break
done
wait
date '+%M:%S'

sh results in an immediate response:
01:24
01:24

ksh results in the same behaviour as before, 10 seconds:
01:24
01:34
So a while loop at the end of a command chain (a pipeline) in Bourne shell is to be used with caution. Watch out for cmd | while ... ; do ... ; done
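
One way to sidestep the problem is to keep the 'wait' in the same (sub)shell that started the background job. The following is only a minimal sketch of that idea, not a tested fix for every shell: the marker file name /tmp/wait_demo.$$ and the variable names are made up for illustration, and the sleep is shortened to 2 seconds.

```shell
#!/bin/sh
# Hypothetical marker file, used only to show whether we really waited.
marker=/tmp/wait_demo.$$
rm -f "$marker"

# Group the loop and the wait inside braces. Whether the shell runs the
# group in a subshell (sh, bash) or in the current shell (ksh on
# Solaris 10), the wait is executed by the process that started the job.
echo | {
  while : ; do
    ( sleep 2; echo finished > "$marker" ) &
    break
  done
  wait
}

# The marker only exists if the background job was waited for.
if [ -f "$marker" ]; then
  result=waited
else
  result=not-waited
fi
echo "result=$result"
rm -f "$marker"
```

The script deliberately avoids $(...) and $((...)) so that it should also be acceptable to an old Bourne shell; I have not verified it on every platform.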

Another example is of course the value of variables (this is probably the better-known scenario).
sh:
#!/bin/sh
x=parent
echo | while : ; do
  x=child
  break
done
echo x=$x

ksh:
#!/bin/ksh
x=parent
echo | while : ; do
  x=child
  break
done
echo x=$x

results in

sh:
x=parent

ksh:
x=child
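
If the value is needed in the parent regardless of the shell, one option is not to assign inside the loop at all but to print the value and capture it in the parent. This is a sketch under the assumption that the loop only needs to hand back a single string; backticks are used instead of $(...) so that it should also run in an old Bourne shell.

```shell
#!/bin/sh
x=parent

# The loop may still run in a subshell, but its standard output is
# captured by the parent, so the result does not depend on where it runs.
x=`echo | while : ; do
  echo child
  break
done`

echo "x=$x"
```

With this pattern the assignment always happens in the parent shell, so sh, ksh and bash should all print x=child.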

I found that bash behaves like sh. With ksh one has to pay attention and the situation is trickier, because it depends on which ksh version you are using: the behaviour above relates to ksh on Solaris 10. The differences between Korn shell implementations always make porting ksh scripts an issue. Some Korn shells (I think pdksh on Linux) behave in the same way as sh, but I don't have access to such a system so I cannot check right now.
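
When in doubt about a particular shell, a small probe can show how it treats the last command of a pipeline. This is a rough sketch; the variable names probe and mode are made up, and it only checks the 'read' builtin at the end of a pipe, which I assume behaves like the while loop above.

```shell
#!/bin/sh
probe=parent

# If the last pipeline element runs in the current shell, 'read'
# overwrites probe; if it runs in a subshell, probe stays unchanged.
echo child | read probe

case $probe in
  child) mode="current shell" ;;
  *)     mode="subshell" ;;
esac
echo "last pipeline element runs in: $mode"
```

Run it with the interpreter in question, for example by changing the first line to #!/bin/ksh, and compare the output.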

So any coder who regularly writes for different types of shells, who needs to migrate shell scripts from one type of shell to another, or who (in the case of ksh) migrates shell scripts to other platforms needs to be aware of these sub process scope issues. Otherwise he risks ending up with unwanted behaviour, in the worst case destructive behaviour, since the script might be doing something completely different from what it was intended to do.
