Wednesday, March 16, 2011

'tee' coprocess vs. 'tee' pipe in Korn shell

I had this piece of code which logged the output of a while loop to a file but also showed it on the screen:
while ...
do
...
done | tee logfile

A code change required setting a variable in the while loop and making it known after the loop had ended, so I tried
while ...
do
   x=123
...
done | tee logfile
echo x=$x
but x was empty since in a pipeline the while loop runs in a subshell and thus cannot set variables in the parent process.

There is a solution though in Korn shell using coprocesses.
# Start a coprocess to log input to logfile via tee 
# and report it also to the current tty
(tee logfile >/dev/tty)|&

# Send output of while loop to coprocess
while ...
do
   x=123
...
done >&p
echo x=$x
will report x correctly.

This works fine for my example where the script is run always in a terminal window.

If the script is run in the background or via cron, its (terminal) output needs to be captured somewhere else; anything else does not make sense: the script writer had a reason why things should be written to the terminal, surely not just for fun.
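One way to keep such a script usable both interactively and in the background or cron is to check whether stdout is a terminal and pick the coprocess accordingly; a ksh sketch (the file names are just examples):
if [ -t 1 ] ; then
  # Interactive: log to the file and also echo to the terminal
  (tee logfile >/dev/tty)|&
else
  # Background/cron: no terminal available, just log to the file
  (cat >logfile)|&
fi

while ...
do
   ...
done >&p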

Tuesday, March 15, 2011

How to create a drop-down menu in OpenOffice.org

There are two easy ways to create a drop-down menu in OpenOffice.org Calc (this already worked in version 2.x and has been tested up to 3.2.1).

The difference between the two ways is the source of the selection entries.

Selection entries are to be entered manually

Go to a cell where the drop-down menu should appear.
Do Data -> Validity....
In the popup window choose the Criteria tab (usually selected by default).

In the Allow menu choose List.

It will show a new textfield called Entries where you enter the choices for your drop-down menu line by line.

Selection entries come from a cell range on the spreadsheet

Go to a cell where the drop-down menu should appear.
Do Data -> Validity....
In the popup window choose the Criteria tab (usually selected by default).

In the Allow menu choose Cell range.

It will show a new textfield called Source where you enter the choices for your drop-down menu as a cell range e.g. B2:B10.
Note: the cell range needs to be entered manually and can't be selected on the spreadsheet.

Monday, March 14, 2011

Background processes and file descriptors in shell scripts

Lately I stumbled upon an issue in a shell script which left me puzzling for a while.

Reduced to a simple example it goes like this:
imagine you have two files, wrapper.sh and script.sh, where wrapper.sh is supposed to call script.sh in backticks:
wrapper.sh
#!/bin/sh
x=`script.sh`      # run script.sh and collect its output    
echo x=$x
script.sh
#!/bin/sh
(sleep 60)&        # start a background process
echo pid=$!        # report the pid of the background process
exit 0

The expected output of wrapper.sh was x=pid=12345, appearing more or less immediately after running it.

The unexpected but observed behaviour was that wrapper.sh waited until the background process had finished. This defied the purpose of the script since in the original scenario wrapper.sh should have managed the background process (e.g. sending it signals) after doing some work in between.

Some experimenting with variations of the scripts and some reading finally revealed the clue to the issue.
  • Background (better: forked) processes inherit the file descriptors of their parent process
    i.e. the 'sleep' background process has the same open fds as script.sh
  • Running a command in backticks means to collect its stdout until its stdout is closed
    i.e. wrapper.sh waits until the stdout of script.sh is closed for good.
  • Since the 'sleep' background process writes to the same stdout as script.sh the fd is kept open even after script.sh has finished.
    It does not matter whether 'sleep' actually writes anything or not; the point is that if it did write something it would write to the inherited open stdout.
    ('sleep' is just an example. In the real world it would very likely be another script with some complex tasks to fulfil).
  • The solution is to close stdout of the background process
    #!/bin/sh
    (exec >&-; sleep 60)&  # start a background process but close stdout first
    echo pid=$!            # report the pid of the background process
    exit 0
    

Some hints to explain the situation: the process table shows that wrapper.sh has a defunct sub process (the former script.sh) and that the 'sleep' process is a child of init (pid 1). Also a slightly different sub process (echo sub; sleep 60)& leads to x=pid=12345 sub, showing that wrapper.sh gathered the output of script.sh plus the output of the sub process.

I wonder how many people are paying attention to this; it is an issue which can be easily overlooked. In essence background processes in scripts like script.sh are daemons since script.sh gives up control of the sub process by simply exiting at some point. So who controls the sub processes, in particular where should they write their output to? Rereading the essentials of a daemon process helps and I will definitely pay more attention to this in the future.
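Applying those essentials to the example, a sketch of script.sh which detaches its background process from the inherited file descriptors and gives it a log file of its own could look like this (the log file name is just an example):
#!/bin/sh
# Close the inherited stdin and give the background process its own log file
# so it no longer holds open the stdout that wrapper.sh is collecting
(exec </dev/null >/tmp/bgtask.log 2>&1; sleep 60)&
echo pid=$!
exit 0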

An experiment for the curious:
what happens if stdout is redirected to a file and multiple sub processes are started, each writing to stdout, i.e. the file? Would everything be written to the file? In which order?
#!/bin/sh
exec 1>/tmp/out
(for i in 1 2 3 4 5; do echo aaaaaaaa; sleep 1 ; done)&
(for i in 1 2 3 4 5; do echo bbbbbbbb; sleep 1 ; done)&
echo DONE

Friday, March 11, 2011

Traps and exit codes in shell scripts

Traps in shell scripts are a nice way to provide cleanups, first of all the removal of temporary files but also any other kind of do-at-the-end things (see also the END clause in awk and Perl).

So for general cleanup one would set the trap for signal 0, which isn't a real signal but the indication that the script exited normally with an exit code of 0.
Other signals might do something additionally to the cleanup and would get caught by traps on their own.

The questions about signals to be discussed in this article:
  • How to ignore them: sometimes one does not want a script to be interrupted by certain signals
  • How to catch them, react accordingly and exit
  • If exiting after a signal: which exit code should be used
Assuming that we have a handler for exit 0 (when the script ends normally), there should also be a signal handler for signal INT. I want to discuss various setups and what happens after signal INT has been received by the script.
Catch signal INT (Ctrl-C) and ignore it
#!/bin/sh
trap "echo exiting" 0
trap "echo got INT" INT
...
This script
  • will echo 'got INT' and
  • will resume its operation and will not end

Catch signal INT (Ctrl-C) and exit
#!/bin/sh
trap "echo exiting" 0
trap "echo got INT ; exit 1" INT
...
This script
  • will echo 'got INT' and
  • will exit with exit code 1 to indicate that this was not a normal ending.
    Instead of 1 there could be any number.
There is a special case of the script above if one chooses to exit with 0 after catching a signal.
#!/bin/sh
trap "echo exiting" 0
trap "echo got INT ; exit 0" INT
...
This script
  • will echo 'got INT' and
  • will echo 'exiting' and
  • will exit with exit code 0.
Instead of 'echo ...' there should be some real action in a production script of course.

So if one decides that a signal should not be ignored there is one big question to be answered: does the observer of the script (a calling script or a user) need to know that the script ended due to receiving a signal and because of which signal in particular? This question should be answered with the consideration in mind that scripts often exit with small exit codes due to something going wrong throughout the script.

  • All (or many) signals are mapped to the same non-zero exit code
    There is little room for variation here. The exit code could be any number. If the script uses a small number (e.g. exit code 1) that might be indistinguishable from other error-induced exit codes in the script. Alternatively one could use a high number (greater than 128) to distinguish endings caused by a signal from other endings in the script. But by mapping all signals to one exit code the script does not give its observer a chance to find out exactly which signal led to its end (this can of course be a deliberate design decision).
    trap "echo got SIGNAL; exit 1" INT QUIT TERM
    (the message 'got SIGNAL' could be used to distinguish this type of exit from other exit 1 reasons in the script)
    or
    trap "echo got SIGNAL; exit 129" INT QUIT TERM

  • Signals should be distinguished from each other i.e. mapped to different exit codes
    Same exit code
    trap "echo got INT; exit 1" INT
    trap "echo got TERM; exit 1" TERM
    
    All signals lead to the same exit code.
    The echo statement is a differentiator but is probably not present in a real life script.
    Different exit codes
    (small numbers)
    trap "echo got INT; exit 1" INT
    trap "echo got TERM; exit 2" TERM
    
    Here signals lead to different exit codes.
    Issue: they are probably not distinguishable from other points of exit in the script
    Different exit codes
    (high numbers)
    trap "echo got INT; exit 130" INT
    trap "echo got TERM; exit 143" TERM
    
    In addition to being different from each other, the exit codes have been set with the formula 128+signal_number, which follows the sh convention that a process killed by signal n exits with 128+n.


So if you are interested in capturing signals in scripts, ending the script and also getting a meaningful exit code telling you which signal it was, then
  • capture each signal individually
  • explicitly put an 'exit n' into the signal handler
  • choose n to be 128+signal_number

This way a calling script can differentiate:

script.sh
ex=$?
if [ $ex -eq 0 ] ; then
  # All ok
elif [ $ex -lt 128 ] ; then
  # An error occurred in the script
else
  # Script ended due to signal  $ex-128
fi
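
And on the other side, a called script following this convention could look like the following sketch (the temporary file and the chosen signals are just examples):
#!/bin/sh
tmpfile=/tmp/myscript.$$
trap "rm -f $tmpfile" 0                  # general cleanup on any exit
trap "rm -f $tmpfile; exit 130" INT      # 128 + 2
trap "rm -f $tmpfile; exit 143" TERM     # 128 + 15
...
exit 0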

Thursday, March 10, 2011

Counting lines in shell variables

Very often one stores the output of a command in a variable; sometimes the output is a multiline string and one wants to count the number of lines (all of the below is in Bourne shell).

Example:
A=`who`
NUM_USERS=`echo "$A"|wc -l`
echo "Number of users: $NUM_USERS"
Unfortunately this approach is not correct.

If there are users on the system it works fine and the correct count is reported.
But if there are no users then the code above still reports a user count of 1.

Why?
Because echo of an empty variable still adds a newline to the output which is counted by wc. Look at this example (assuming that you don't have a variable called avTyh).
echo "$avTyh" |wc -l
1

Replacing echo by printf does not help much.
printf "$A"|wc -l
works correctly for empty variables but it counts wrongly for a multiline string since it omits the final newline.

Rather than using an if ... else ... construct there is a more elegant solution. Look at this:
printf "$A${A:+\n}" |wc -l
The parameter substitution ${A:+\n} achieves the following:
If $A is set and is non-null then substitute a newline; otherwise substitute nothing.
So if $A is empty then "$A${A:+\n}" is an empty string plus nothing, which counts as zero lines.
If $A is non-empty then "$A${A:+\n}" is something plus a newline.
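
Wrapped into a small helper this could be reused wherever needed; a minimal Bourne shell sketch (the function name is my own choice):
# Count the lines stored in the first argument (0 for an empty value)
count_lines() {
  printf "$1${1:+\n}" | wc -l
}

A=`who`
echo "Number of users: `count_lines "$A"`"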

Note: in the end I think the strange thing is that in an assignment like A=`who` the string is missing a final newline, which leads to the issue in the first place; I should check why this is the case, maybe in another post.

Friday, March 4, 2011

Signal handling in shell background processes

This article is about my learning experience in signal handling and monitoring sub processes.

Yesterday I got puzzled when a supposedly simple test program did not act as expected.
The shell script trap.sh below sets a trap to catch SIGINT (or signal 2) and exit upon receiving it. It works as expected when run standalone but when invoked as a background process in another test script trapWrapper.sh it failed.




trap.sh:
#!/bin/sh
# Catch signal 2
trap "echo trapped 2;I=1" 2
# Wait until var 'I' is set to something
while : ; do 
  sleep 1; [ -n "$I" ] && break 
done
echo DONE
trapWrapper.sh:
#!/bin/sh
# Run trap.sh in the background
./trap.sh 2>&1  &
pid=$!
# Sleep 5 seconds 
sleep 5
# ... and then kill the background process
kill -2 $pid
# Wait 
wait $pid
# Exit with exit code of background process
exit $?
  Running trapWrapper.sh will wait forever and never end.
When killing it with Ctrl-C it will go away but the trap.sh
process will be left behind and needs to be killed manually.
(the bigger idea behind all this is to have a monitoring script which starts a number of background processes and kills them after a certain timeout period has passed).

So what's the difference when run in background?

The sh man page has the answer:

man sh
...
  Signals
     The INTERRUPT and QUIT signals for an  invoked  command  are
     ignored if the command is followed by &. Otherwise, signals
     have the values inherited by the shell from its parent, with
     the  exception  of  signal 11 (but see also the trap command
     below).

i.e. SIGINT in a background process is ignored (as well as SIGQUIT).

SIGTERM is not mentioned here so the next idea is to enhance trap.sh and adding a signal handler for it and changing trapWrapper.sh so that it sends SIGTERM to the background process.

trap.sh:
#!/bin/sh
# Catch signal 2 (INT) and 15 (TERM)
trap "echo trapped 2;I=1" 2
trap "echo trapped 15;I=1" 15
# Wait until var 'I' is set to something
while : ; do 
  sleep 1; [ -n "$I" ] && break
done
echo DONE
trapWrapper.sh:
#!/bin/sh
# Run trap.sh in the background
./trap.sh 2>&1  &
pid=$!
# Sleep 5 seconds 
sleep 5
# ... and then kill the background process
kill -15 $pid
# Wait 
wait $pid
# Exit with exit code of background process
exit $?
After 5 seconds this will result in what we wanted:
trapped 15
DONE

Something to remember: the supposedly stronger kill with SIGINT (and SIGQUIT would be the same) does not work because the signal is ignored, whereas SIGTERM works fine.

So if you write a script which should act upon SIGINT or SIGQUIT, let it also act upon SIGTERM, just to be safe.
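In practice that can be as simple as listing all three signals in one trap statement; a minimal sketch (the cleanup action is just an example):
trap "echo cleaning up; rm -f /tmp/work.$$; exit 1" INT QUIT TERM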

Note: this might be different in other shells. When you test this interactively you'll see the difference:

sh:
$ trap.sh&
8379
$ ptree 8379
    8360  sh
      8379  /bin/sh ./trap.sh
        8385  sleep 1
$ kill -2 8379
$ ptree 8379
    8360  sh
      8379  /bin/sh ./trap.sh
        8785  sleep 1
$ kill 8379
$ trapped 15
DONE
csh:
% trap.sh&
[1] 8116
% ptree 8116
    33465 -csh
      8116  /bin/sh trap.sh
        8207  sleep 1
% kill -2 8116
% trapped 2
DONE
In sh the background process ignores the signal; in csh the background process accepts SIGINT and exits.
In case you've wondered about ptree: this was tested on a Solaris box.

Wednesday, February 9, 2011

Env var puzzle in csh

Today (meaning: some day in May 2008) I spent some time troubleshooting a strange issue until I finally got to the solution, so I wrote this blog entry as an educational exercise to show the troubleshooting efforts I went through.

This was on a Solaris 10 machine.

The puzzle

I was working in a terminal window (xterm to be precise) where I had been doing some work throughout the day, and I finally got to the point of setting an environment variable and checking its value (my login shell is csh).

foo:/home/andreas 151 % setenv THIS something
foo:/home/andreas 152 % echo T=$THIS
T=
Oh, I just set the variable and then it reports to be empty? How strange.
I checked the environment:
foo:/home/andreas 153 % env | grep THIS
THIS=something
So this looks ok, the var is definitely set, but why does echo report it as empty? (and printf showed the same).

I tested other variable names but they behaved normally: echo showed their values.

I opened a new terminal window and tested this particular variable and others: all worked normal.

How can that be? What was so special about this session?

I trussed echo (for the non-Solaris users: truss is the Solaris command to inspect the behaviour of processes, showing system calls etc.) and I saw that it is called without an argument; its env list shows THIS to be set though:
foo:/home/andreas 157 % truss -aeilf -r all -v all echo $THIS
26445/1:        execve("/usr/bin/echo", 0xFFBFE954, 0xFFBFE95C)  argc = 1
26445/1:         argv: echo
26445/1:         envp: USER=andreash LOGNAME=andreash HOME=/home/andreash
26445/1:          ...
26445/1:          THIS=something
...
Think about this for a while before you proceed to the next section for the solution.

The solution

I finally got to the point where I started to check other things which are set in csh: aliases and variables (not env).

The list of aliases did not show anything particular.

And then I ran set:
foo:/home/andreas 161 % set
CSHRC   /home/andreas/.cshrc
HOSTNAME    foo
...
THIS
...
So here you go: in some of my previous work that day in that particular terminal I must have set the variable THIS to empty and forgotten about it, and despite setting the environment variable later the first setting was still taken into account.
This is one of the traps of csh (others might call it a feature, and of course this was new to me).
The Bourne shell family uses export to distinguish shell variables from environment variables, so this trap does not exist there.

Summary

Here it is again in summary to remember:
in csh when you 'set' a variable a subsequent 'setenv' will not replace the variable in the current context.
set A=aaaaaa
setenv A bbbbbb
echo A=$A
A=aaaaaa

unset A
echo A=$A
A=bbbbbb
Going a little further: if you reverse this and first set a variable via setenv and then via set the second value will prevail
setenv A bbbbbb
set A=aaaaaa
echo A=$A
A=aaaaaa
i.e. variables set with set always take precedence over variables set with setenv, which is very important to remember, yet only the latter are exported to new shells. I guess it would be a wise rule to keep a distinct set of names for set/setenv, otherwise things can get confusing (as in my starting puzzle) or can even go wrong (when programming with false assumptions).

Tuesday, February 8, 2011

Conditional formatting in OpenOffice.org

What is conditional formatting? It means applying certain attributes to a spreadsheet cell (e.g. background, font size, ...) depending on some conditions being met or not. The conditions are based on the content of the cell or - in more complex cases - on other cells.

In OpenOffice.org there are two steps necessary:
  • Create appropriate styles (to be chosen from)
  • Set the conditional formatting for selected cells

This is different from Excel where one selects cells, sets the condition and chooses a format. In my view OpenOffice.org is more flexible because of its simplicity of change: if you want to change the format simply edit the style and the change is applied wherever the style is being used, be it part of the conditional formatting or elsewhere. (I'm not an Excel expert though.)

Create the appropriate styles

  • Press F11 or choose Format -> Styles and Formatting
  • Choose cell styles
  • Move the mouse to the blank middle of the styles window, right-click and choose 'New...'
  • Choose a style name e.g. RedBg, then choose the Background tab and set your colour, press OK or change any other attribute e.g. font type and size in the Font tab.
Do this for as many styles as you need, so that you have e.g. 3 new named styles for various backgrounds like RedBg, GreenBg, AmberBg

Set the conditional formatting

  • Select the cell or cell range
  • Choose Format -> Conditional Formatting...
  • Set the conditions
    • Example based on numeric values of the cells
        condition 1: 'cell value is' less than 10 and select style RedBg
        condition 2: 'cell value is' less than 20 and select style AmberBg
        condition 3: 'cell value is' greater than or equal to 20 and select style GreenBg
      This also shows how to use conditions in ascending order: if the first condition is met then its style is applied. If not, then the second condition will be checked, and so on.
    • Example based on textual values of the cells
        condition 1 'formula is' exact(lower(c5);"no") and select style RedBg
        condition 2 'formula is' exact(lower(c5);"yes") and select style GreenBg
      The background will be set to either red or green depending on no/yes entries in whichever case.
      Important to note: the cell reference c5 should be the lower right hand corner of your cell range
In more complex cases conditional formatting is based on the entries of other cells or on multiple cells; you'd need to construct the conditions as formulas using the appropriate functions.

If you need more than 3 conditions I'm afraid only a macro will get you going.

Wednesday, February 2, 2011

Extracting strings: field delimited vs. regular expression solutions

Imagine that you have the following string describing a directory name
  tmp dir: /var/tmp/somedir
and you want to extract the directory name.

In shell scripts one can often find quick solutions like
... | awk -F: '{print $2}'
or
... | awk -F: '{print $NF}'
each leading to
 /var/tmp/somedir
(note the leading blank).

Now this solution is destined to fail if the directory name may contain a colon.
echo "  tmp dir: /var/tmp/some:dir"  | awk -F: '{print $2}'
 /var/tmp/some

echo "  tmp dir: /var/tmp/some:dir"  | awk -F: '{print $NF}'
dir

The awk solution assumes that the input line is a set of fields with a defined field delimiter which can never be part of the field values (and one could use cut or any other tool which handles field delimited input).

Though this looks nice and easy there's a safer solution which does not rely on this requirement but uses a regular expression to extract the necessary data:
s/[^:]*: //
   Skip everything up to the first colon and a subsequent space
s/[^:]*tmp dir: //
   This is more precise if one knows that there will always be a 
   string 'tmp dir: ' in the input line:
   skip everything up to the first 'tmp dir: '

   Or if you want to save the remainder use something like this
s/[^:]*: \(.*\)/\1/
   (this is for sed)

Both can be used in sed or Perl or any tool that can handle strings and regular expressions and they also have the nice side effect to remove the blank at the beginning of the resulting string.

echo "  tmp dir: /var/tmp/some:dir"  | sed 's/[^:]*: //'
/var/tmp/some:dir

echo "  tmp dir: /var/tmp/some:dir"  | perl -n -e 's/[^:]*: //; print'
/var/tmp/some:dir

The field delimiter solution seems more accessible to users, and people sometimes seem to be afraid of regular expressions, but the regexp solution achieves the same and gives you more freedom (the freedom to possibly have the field delimiter as part of the value, just in case).

And an important thing is of course to test whether solutions still work when your input does not follow the expected format; in the case above, what happens if there is no colon in the input line?
echo "  tmp dir /var/tmp/somedir" | awk -F: '{print $2}'
     <-- There is an empty string here

echo "  tmp dir /var/tmp/somedir" | awk -F: '{print $NF}'
  tmp dir /var/tmp/somedir
     This is a copy of the original input line

echo "  tmp dir /var/tmp/somedir" | sed 's/[^:]*: //'
  tmp dir /var/tmp/somedir
     This is a copy of the original input line

echo "  tmp dir /var/tmp/somedir" | perl -n -e 's/[^:]*: //; print'
  tmp dir /var/tmp/somedir
     This is a copy of the original input line
You need to take this into account when you are using the result of this extraction further on in a script. If you want to use the regexp solution but would rather get an empty string than the original line when the input does not adhere to the format, then do this:
echo "  tmp dir /var/tmp/somedir" | sed -n 's/[^:]*: //p'
     <-- There is an empty string here
     The -n ensures that sed prints only what you want, 
     the p prints the changed input line if there was a successful match,
     no match, no output

echo "  tmp dir /var/tmp/somedir" |perl -n -e 's/[^:]*: // && print'
     In Perl you can simply print if the previous substitution was successful

Tuesday, February 1, 2011

Scope of sub processes in sh vs. ksh

Lately I ran into a problem when I started a process in the background in a Bourne shell script and the 'wait' command did not wait but the script ended immediately.
Looking into the issue, it was again due to the scope of sub processes, where changes made in a sub process do not influence the parent script. Moving the shell script to ksh solved the issue for me (that was on a Solaris 10 system).

So here are some thoughts about sub processes in various types of shells.

Code that works for both shells

The script starts a sleep process in the background and waits for it to finish.
The 'date' commands note the start and end times.
sh:
#!/bin/sh
date '+%M:%S'
while : ; do
  sleep 10&
  break
done
wait
date '+%M:%S'
ksh:
#!/bin/ksh
date '+%M:%S'
while : ; do
  sleep 10&
  break
done
wait
date '+%M:%S'
Both result in the same time difference of 10 seconds:
01:24
01:34

Code that works only for ksh

Why? Because the while loop in sh is a subshell and thus any process started in it will not influence the parent shell which - in this case - has no idea that sub processes have been started at all.
So the Bourne shell script ends immediately whereas the Korn shell script waits as before.
sh:
#!/bin/sh
date '+%M:%S'
echo | while : ; do
  sleep 10&
  break
done
wait
date '+%M:%S'
ksh:
#!/bin/ksh
date '+%M:%S'
echo | while : ; do
  sleep 10&
  break
done
wait
date '+%M:%S'
The sh version results in an immediate response:
01:24
01:24
The ksh version shows the same behaviour as before, 10 seconds:
01:24
01:34
So a while loop at the end of a command chain in Bourne shell is to be used with caution. Watch out for cmd | while ... ; do ... ; done
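A common workaround in Bourne shell, if the data flow allows it, is to avoid the pipe into the loop altogether and feed the loop from a temporary file, so that the loop, its background processes and the 'wait' all stay in the parent shell. A sketch (cmd stands for whatever produced the piped input, /tmp/out.$$ is just an example name):
#!/bin/sh
cmd > /tmp/out.$$          # run the producer first, store its output
while read line ; do
  sleep 10 &               # background processes now belong to the parent shell
  break
done < /tmp/out.$$
wait                       # ... so 'wait' really waits for them
rm -f /tmp/out.$$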

Another example is of course the value of variables (this is a probably better known scenario).
sh:
#!/bin/sh
x=parent 
echo | while : ; do 
  x=child 
  break
done 
echo x=$x  
ksh:
#!/bin/ksh
x=parent
echo | while : ; do
  x=child
  break
done
echo x=$x
results in
x=parent   (sh)
x=child    (ksh)

I found that bash behaves like sh, but with ksh one has to pay attention and the situation is trickier, meaning it depends on which ksh version you are using: the behaviour above relates to ksh on Solaris 10. The differences in Korn shell implementations make porting ksh scripts always an issue. Some Korn shells (I think pdksh on Linux) behave in the same way as sh, but I don't have access to such a system so I cannot check right now.

So any coder who regularly writes for different types of shells, or who needs to migrate shell scripts from one type of shell to another or even (in the case of ksh) to other platforms, needs to be aware of these sub process scope issues, or he risks ending up with unwanted behaviour, in the worst case destructive behaviour, since the script might be doing something completely different from what it was intended to do.

Bugs or no bugs in awk, Perl, ...? Floating point arithmetic issues after all

A couple of years ago I ran into one of my biggest 'blushing red' cases in my professional career.

The starting point was some misbehaviour of awk and Perl which I reported as a bug. There weren't any bugs though, just normal behaviour which I hadn't known (when I should have, that's the blushing red). Here's the story.

You don't have to be a math wizard to tell that the following equation is true:
1.2/3 - 0.4 = 0
but awk/nawk and Perl get it wrong:
nawk 'END{ print 1.2/3-0.4}' </dev/null
-5.55112e-17

perl -e 'print 1.2/3-.4'
-5.55111512312578e-17
and even stranger
nawk 'END{ x=1.2/3; print x,x-0.4}' </dev/null
0.4 -5.55112e-17
i.e. x prints as 0.4 but when you subtract 0.4 the result is not zero.

I found more examples like 1.4/7-0.2, and at first I thought it boiled down to division by odd prime numbers and how they are stored internally (strange thoughts one can have); I also thought it must be a bug in some common C library.

The impact can be severe: if you set x=1.2/3-0.4 and then your code depends on whether x is less than, equal to or greater than zero you might - metaphorically speaking - miss the moon.

Eventually I created a bug report here but I got myself educated by some experts and further reading about floating point arithmetic and why the described behaviour is correct.

The bugs described above aren't bugs but a feature of how floating point works on today's computers, an area which I have neglected and never had any serious issue with in my work life.

The issue manifests itself in numerous ways, not just in division (my starting point); look at these examples where a simple multiplication goes wrong, and pay attention to the 600.93 example:
awk
awk '{printf "%d %d %d\n", 500.93*100, 600.93*100, 700.93*100 }'
50093 60092 70093
Perl
perl -e 'printf("%d %d %d\n", 500.93*100, 600.93*100, 700.93*100)'
50093 60092 70093
C
#include <stdio.h>
int main(void) {
  int x = 600.93*100;   /* the double value is slightly below 60093, truncation yields 60092 */
  printf("%d\n", x);
  return 0;
}

results in 60092
Why? In short because float variables are stored in binary format where certain numbers cannot be stored exactly. Any subsequent operation will also lose some of the information and thus wrong decimal results appear. When and how this information loss happens is hard to foresee unfortunately.

Is there a way around this? That depends on the programming language. Perl offers modules like Math::BigFloat which seem to resolve this; I don't know of a general way.

The issue I'm facing is how to write safe arithmetic in awk and Perl (and who is going to look after existing scripts and rewrite them?).

Look at the awk example again replacing printf by print:
awk '{print 500.93*100, 600.93*100, 700.93*100; }'
50093 60093 70093
so this is correct now. My starting point though was an example where print provided a wrong result, so the issue can manifest itself anywhere; there is no guaranteed way out of it, it seems.
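If one has to compare such results in awk or Perl, a common mitigation is not to test for exact equality (or against zero) but against a small tolerance; a minimal awk sketch, where the tolerance 1e-9 is an arbitrary choice:
awk 'BEGIN {
  x = 1.2/3 - 0.4
  eps = 1e-9                          # tolerance for the comparison
  if (x < eps && x > -eps) print "x counts as zero"
  else                     print "x is really different from zero"
}'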

OpenOffice.org: platform specific code to save a file

In the last couple of days I looked into OpenOffice.org basic code to save data to a temporary file. The code should work on any platform (Unix, Windows, ...).

So I needed to find two things to get to the desired result:

A way to determine the platform type

All I could find was the getGuiType() function which returns
  • 1 for Windows
  • 3 for MacOS
  • 4 for Unix
and a repeated comment that this function exists for downward compatibility, without explaining what the current approach should be.
Note that there is no distinction between OS versions like Windows XP vs. Windows Vista or Solaris vs. Linux.

A location where to store temporary files (platform dependent)

For MacOS and Unix systems /tmp is the obvious place to store temporary files (/var/tmp would probably also be OK, even better a user specific dir like /var/tmp/$USER, but one would need to ensure first that it really exists and is writable).
On Windows that is a little more complex, but the environment variable TEMP seems to be set for all users (different though for each user) and is thus a good choice.


Code snippet

' This code creates a temporary file called 'data.lst'
' with a path which works for each OS
Dim sUrl As String
Dim sFile As String
sFile = "/data.lst"
If GetGuiType = 1 Then
  sUrl = "file:" + Environ("TEMP") + sFile
Else
  sUrl = "file:///tmp" + sFile
End If

' Now create some data and store them into the temp file
Dim myData()    'Array to store some data
' Fill myData with contents assuming there are n data points
' ReDim myData(n)
' myData(0) = ...
' myData(1) = ...
' ...

' Call function SaveDataToFile from the OpenOffice.org Macros library 'Tools -> UCB'
SaveDataToFile(sUrl,myData())

A safer approach would be to use an environment variable (e.g. TEMP) for any OS, check whether this env var exists and its value is a writable directory, and then use it; but safer means (as almost always) more coding. /tmp was invented for the lazy :-)

Is this string a number or an integer? Solutions with expr, ksh and Perl

Taking my experiments with expr posted previously a little further I thought what about writing shell functions which determine whether a given string is a number or an integer?

I already had solutions in ksh for that but wanted to see them with expr (which would be a true Bourne shell solution) and I also did them in Perl.

A number is meant to be a sequence of digits, i.e. 1234.
An integer is either a number or a number prefixed with a minus sign, i.e. 1234 or -1234.
#!/bin/sh

# Test if arg $1 is a number
########################################

isNum1() {
    ksh -c "[[ \"$1\" = +([0-9]) ]]"
    return $?
}
isNum2() {
    [ `expr "$1" : '[0-9][0-9]*$'` = "0" ] && return 1
    return 0
}
isNum3() {
    perl -e 'exit 1 unless($ARGV[0]=~/^\d+$/)' -- "$1"
}

# Test if arg $1 is an integer
########################################
isInt1() {
    ksh -c "[[ \"$1\" = *(-)+([0-9]) ]]"
}
isInt2() {
    [ `expr "$1" : '[0-9][0-9]*$'` = "0" -a `expr "$1" : '-[0-9][0-9]*$'` = "0" ] && return 1
    return 0
}
isInt3() {
    perl -e 'exit 1 unless($ARGV[0]=~/^-?\d+$/)' -- "$1"

    # Here's an alternative, better to read in Perl maybe but two commands and a pipe:
    #   echo "$1" | perl -n -e 'exit 1 unless(/^-?\d+$/)'
}

# Test suite
for i in 204 -13 +88 1-2 4+5 46.09 -7.2 a abc 2x -2x t56 -t5 "2 4"
do
  isNum1 "$i" && echo Num1 $i
  isNum2 "$i" && echo Num2 $i
  isNum3 "$i" && echo Num3 $i
  isInt1 "$i" && echo Int1 $i
  isInt2 "$i" && echo Int2 $i
  isInt3 "$i" && echo Int3 $i
done

Executing the script results in
Num1 204
Num2 204
Num3 204
Int1 204
Int2 204
Int3 204
Int1 -13
Int2 -13
Int3 -13

i.e. only the first entry is a number and the first two entries are correctly identified as integers.
One drawback of expr is that it supports only basic regular expressions, i.e. some useful special characters like '+' or '?' cannot be used, and thus ksh and Perl provide more concise solutions to the above problem.

Using 'expr' in scripts

As many script writers know (and probably hate), the Bourne shell doesn't have built-in arithmetic capabilities, so one has to resort to expr for calculations, the most famous maybe being the loop increment:
i=0
while [ $i -lt 10 ] ; do
  ...
  i=`expr $i + 1`
done
expr has more operators than just basic arithmetic though, and I don't see them used very frequently; I think I haven't used them at all, so, stumbling upon them accidentally, I thought I'd play with expr a little to get a better understanding, and here's the result.

The match operator (string comparison with regular expressions)


The operator to compare a string to a regular expression is the colon (:).
expr will return the number of bytes matched (the curious might look into the xpg4 version of expr which returns the number of characters matched).

Also important: the regular expression is always anchored at the beginning of the string, as if one had used ^.

A few examples.
f="/a/c"

# does $f match ^a ? No.
expr $f : a
0

# does $f match ^/a ? Yes.
expr $f : /a
2

# does $f contain an 'a' ? Yes.
expr $f : '.*a'
4

# does $f contain a 'b' ? No.   (the regexp must be enclosed in simple quotes here)
expr $f : '.*b'
0

# does $f end with a 'c' ? Yes.
expr $f : '.*c$'
4

# does $f end with a 'b' ? No.
expr $f : '.*b$'
0
All of these examples can be used in a decision process to check whether the result is zero (no match) or not.
x=`expr ... : ...`
if [ $x -eq 0 ] ; then
  : # no match
else
  : # match
fi

To make the code a little safer one has to consider that the string to be matched might contain white space. The examples above will fail so one needs to use double quotes.
f="a b c"
expr $f : a
expr: syntax error
# since this translates to    expr a b c : a    which does not compute

# Double quotes around the string do help
expr "$f" : a
1

And even more useful is the extraction regexp \(...\): instead of returning the number of bytes in a match one gets the matched substring (if successful) or an empty string.
f="/ab/cd/efg"

# Extract the filename 
expr "$f" : '.*/\(.*\)'
efg

# Extract the dirname
expr "$f" : '\(.*\)/.*'
/ab/cd

f="abcdefg"

expr "$f" : '.*/\(.*\)'
        <---- # Note: this is an empty string here !!!

expr "$f" : '\(.*\)/.*'
        <---- # Note: this is an empty string here !!!

# Why empty strings? because we were trying to match a slash which is not present in $f
# Why empty strings and not 0? because we requested a string between (...)

# Now what if we wanted to solve the following: 
# if $f contains a slash then the filename is everything after the slash
# if $f does not contain a slash it should be considered a filename
# Rather than using  if ... else ... fi we can use expr, read on.

The 'or' and 'and' operator


The or operator is | and the and operator is &, both have to be escaped always.
They can be used to compare two expressions.

or: the first expression is evaluated. If it is NULL (i.e. empty or non-existing) or 0 the second expression will be evaluated too.
a=111
b=0
c=
e=555

# In the 4  comparisons below the second expression is always valid
expr "$a" \| "$e"
111

expr "$b" \| "$e"
555

expr "$c" \| "$e"
555

expr "$d" \| "$e"
555

# Here we compare a 0 value with an empty value
expr "$b" \| "$c"
0

# Here we compare an empty value to a non-existing one
#   The result of expr is also 0
expr "$c" \| "$d"
0 
and: both expressions are evaluated. If any of them is NULL or 0 then 0 is returned. Otherwise the first expression.
a=111
b=0
c=
e=555

# In the 4  comparisons below the second expression is always valid
expr "$a" \& "$e"
111

expr "$b" \& "$e"
0

expr "$c" \& "$e"
0

expr "$d" \& "$e"
0

Combining 'match' and 'and/or'


One can use these in combination to solve the empty string issue above and assign a default value; it is important to remember that expr evaluates left to right.
f="abcdefg"

expr "$f" : '.*/\(.*\)' \| "$f"
abcdefg

# These are two operations, one 'match' and one 'or':   expr   ... : ... \| ...
# If the match operation fails   "$f" : '.*/\(.*\)'    
# then evaluate the right hand side of 'or' and make this the result of 'expr'.
#   So: no slash found in $f then return the original string.

Shell argument handling: $* vs. $@

I can never recall the difference between $* and $@ in scripts, so here's a little experiment which should help; I hope that I remember this time :-)

First of all there is this little script to show all args of a script, call it args.sh

#!/bin/sh
cnt=0
for arg
do
  cnt=`expr $cnt + 1`
  printf "arg$cnt =\t$arg\n"
done

args.sh 1 "2 3" "4 \"5\" 6"
will show
arg1 = 1
arg2 = 2 3
arg3 = 4 "5" 6
i.e. the script counts and shows three arguments.

In order to test how $* and $@ are passed to sub processes there is a little script test.sh in four variations: $* and $@, each once pure and once surrounded by double quotes:
#!/bin/sh
args.sh $*
#!/bin/sh
args.sh "$*"
#!/bin/sh
args.sh $@
#!/bin/sh
args.sh "$@"
test.sh 1 "2 3" "4 \"5\" 6"
will lead to
arg1 = 1
arg2 = 2
arg3 = 3
arg4 = 4
arg5 = "5"
arg6 = 6
arg1 = 1 2 3 4 "5" 6
arg1 = 1
arg2 = 2
arg3 = 3
arg4 = 4
arg5 = "5"
arg6 = 6

arg1 = 1
arg2 = 2 3
arg3 = 4 "5" 6
i.e. only the last invocation does what we want: it keeps and forwards the original arguments.

All other invocations either split the arguments by space or concatenate all arguments into one big single argument.

Another little trap I fell into: when trying to save the arguments in a variable, none of the invocations is successful

#!/bin/sh
a=$*
b="$*"
c=$@
d="$@"

args.sh $a
args.sh $b
args.sh $c
args.sh $d
#!/bin/sh
a=$*
b="$*"
c=$@
d="$@"

args.sh "$a"
args.sh "$b"
args.sh "$c"
args.sh "$d"
The first script (unquoted $a ... $d) always leads to the complete split:
arg1 = 1
arg2 = 2
arg3 = 3
arg4 = 4
arg5 = "5"
arg6 = 6

The second script (quoted "$a" ... "$d") always leads to a single argument:
arg1 = 1 2 3 4 "5" 6

This is all in Bourne shell (if in doubt).
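Since the Bourne shell has no arrays there is, as far as I know, no way to store a whole argument list in an ordinary variable without losing the word boundaries. The usual workaround is to keep the arguments in the positional parameters and pass "$@" along whenever they are needed; if they have to be replaced, set -- does that with proper quoting. A minimal sketch (args.sh is the helper script from above):
#!/bin/sh
# Forward the original arguments untouched (this can be done repeatedly)
args.sh "$@"

# Replace the positional parameters while keeping word boundaries intact
set -- 1 "2 3" "4 \"5\" 6"
args.sh "$@"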

Saturday, January 29, 2011

What's the difference between Ctrl-C and kill -2 ?

When reading about signals there is often a phrase to be found similar to "sending SIGINT (or pressing Ctrl-C)" which seems to imply that both are the same.
That is only true in simple cases though as outlined below.

Sending signal INT is equivalent to sending signal 2 (on most UNIX versions at least), and signals are sent via the kill command in shells or kill functions in programming languages.

Simple Case: the same

The phrase above refers to the issue that in an interactive shell pressing Ctrl-C sends signal INT to the current foreground process.

In the simple cases below pressing Ctrl-C or 'kill -2' are equivalent.

Example:
trap1.sh:
#!/bin/sh
# Catch signal 0
trap "echo exiting" 0
# Catch signal 2
trap "echo trapped;exit" 2
# Forever loop
while : ; do sleep 1; done
echo DONE

# Run trap1.sh and interrupt with Ctrl-C
% trap1.sh
^Ctrapped
exiting

# Run trap1.sh and determine its pid in another terminal and do  kill -2 pid
% trap1.sh
trapped
exiting
i.e. both times the program gets interrupted, a trap message and a final exit message are printed, and the final DONE is never reached.

Tricky case: different behaviour

Some might have wondered about the example above. Why not replace the while loop with a simple sleep 60? So the script would be this:
trap2.sh:
#!/bin/sh
# Catch signal 0
trap "echo exiting" 0
# Catch signal 2
trap "echo trapped;exit" 2
# Sleep for some time
sleep 60
echo DONE

# Run trap2.sh and interrupt with Ctrl-C: same behaviour as before: the script gets killed
% trap2.sh
^Ctrapped
exiting

# Run trap2.sh and determine its pid in another terminal and do  kill -2 pid
# no success this time, the script continues to run.
# It will end only after 60 seconds sleep time have passed !!!
% trap2.sh
trapped
exiting
So why didn't kill -2 cancel the script?

The signal was sent to the script's process but the spawned 'sleep' process was not affected. A ptree would have shown
12632 /bin/sh trap2.sh
     12633 sleep 60

Ctrl-C on the other hand sends the signal to more than one process: all sub processes of the foreground process are also notified (except background processes). That is the important difference between these two approaches.

Here's a program to test this.
# The script below starts 3 processes as seen in ptree:
#     64473 /bin/sh trap3.sh
#       64474 /bin/sh ./trap3.sh number2
#         64475 sleep 50

trap3.sh:
#!/bin/sh
ID="$1"
[ -z "$ID" ] && ID="number1"
trap "echo $ID exiting" 0
trap "echo $ID trapped 2;exit" 2
[ "$ID" = "number1" ] && ./trap3.sh number2
sleep 50

# Run trap3.sh and interrupt with Ctrl-C
# all processes are killed and both trap.sh processes report their trap messages
% trap3.sh
^Cnumber2 trapped 2
number2 exiting
number1 trapped 2
number1 exiting

# Run trap3.sh and determine its pid in another terminal and do  kill -2 pid
# After 50 seconds you'll see the trap 0 message from the second trap process
# and the trap 2 and trap 0 messages from the first trap process
# So:
# the second trap3.sh process was never interrupted !!!
# the first trap3.sh process waited until its sub process finished (despite receiving sig 2)
% trap3.sh
number2 exiting
number1 trapped 2
number1 exiting
So watch out when testing program signal handling: if your program reacts as expected to a manual Ctrl-C it might not do the same to a kill signal coming from other processes.
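If one wants to mimic Ctrl-C from another terminal, the signal has to be sent to the whole process group rather than to a single pid. On most systems this is done by giving kill the negated process group id; a sketch (12632 taken from the ptree output above, and the exact kill syntax for negative pids may vary slightly between shells):
# Show the process group the script and its sleep belong to
ps -o pid,pgid,args -p 12632
# Send SIGINT to every process in that group (note the leading minus)
kill -s INT -- -12632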

Friday, January 28, 2011

Shell library (part 2)

Here are a few more thoughts on top of what I discussed in my article Shell library (part 1).


Split the library into several files

The number of functions to be put into a library might grow, there is also a chance that you want to group them into categories, and I also like the idea of separating variables from functions, i.e. I would want to have one file global_vars.sh containing all the variables (and exporting them) and one with functions only.

Since variables can be exported and are thus available to sub processes, a lib containing variable settings need only be sourced in the very first script; no sub process needs to do that, they just need to be made aware of it somehow.

One idea to achieve that: set a special variable at the beginning of your library script and check that variable in your calling script
# library script
lib_vars_loaded=1; export lib_vars_loaded

# calling script
#!/bin/sh
if [ -z "$lib_vars_loaded" ] ; then
  # Here comes the code to load the lib
fi

Automatically load all library files

As a follow-up idea to creating a shell library (and of course derived from what ksh and bash offer), one could think of autoloading all files in a lib directory, i.e. the directory consists of (small) files, each containing a shell function, and all files will be automatically loaded by a script.
# This is the path where all library scripts reside
setenv FPATH /opt/shell/lib:$HOME/lib

#!/bin/sh
# Require that FPATH is set 
[ -n "$FPATH" ] || { echo Error: FPATH is not set ; exit 1 ; }
# Replace colon by space in FPATH
dirpath=`echo "$FPATH" | tr : ' ' `
# Loop through 'dirpath' to find the library
for dir in $dirpath ; do
  # Loop through all files in 'dir' (this excludes files starting with '.')
  for file in ${dir}/* ; do
    # Check if the lib is a readable file in the given dir
    [ -r "${file}" -a -f "${file}" ] && . "${file}"
  done
done
Things to think about:
  1. files are sourced in alphabetical order; there should not be a dependency on order. This is particularly interesting if you have variable settings split into multiple files: one var might depend on another which is set in a different file which then has to be loaded first.
  2. if you don't need all functions in your script, why load them? That is probably excessive and you want to load just what you need
  3. if you need a new function, simply put it into a script of its own and add it to the dir; it will be autoloaded, with no risk of breaking an existing lib file with a syntax error, and you know immediately where an error sits

How to prevent a library script to be executed

I found this in a book: put in this line at the top of the library script
#!/bin/echo Error:_this_needs_to_be_sourced_in
...
and accidentally executing the script will lead to an echoed line (with exit code 0).

Shell library (part 1)

When you write a lot of shell scripts you often duplicate code, i.e. you need the same kind of functionality and you rewrite it or copy it from a previous script.

So of course, in analogy to programming languages like C or Java, it would be nice to capture all the good functionality in a kind of shell library where it's available to whatever you are doing, which also has the advantage that if you improve certain code the improvement is available to all of your scripts.

The shell does not have the notion of a library; the closest one can get is scripts which are sourced in by other scripts.

What is a shell library?

I define it as a file which consists of a set of variables and functions which can be sourced in by another script.
A library written for one type of shell very likely does not work with another.
It's not just the difference between the Bourne shell and C shell families; even within the same family (sh, ksh, bash) you cannot simply re-use functions: they are implemented differently and don't follow the same rules (global/local vars in functions, syntax) and you can create unexpected results.

So rule #1: stay with one type of shell.

Since I have to deal a lot with legacy code the remainder of this text will refer to Bourne shell (and unfortunately I cannot use any of the more modern features like FPATH, autoload etc.)

Rule #2: the library should not depend in any way on the directory where it is placed; it might be invoked from any place, which implies e.g. that using relative paths is definitely discouraged.

How do you invoke a library (find it and source it in)?

Assume your library sits in /opt/shell/lib/myfuncs.sh and your new script /opt/shell/mytest.sh should invoke it.
The options are that the script knows
  1. the full path of the library
  2. the relative path
  3. a way to determine the location of the library

Source lib with full path

#!/bin/sh
. /opt/shell/lib/myfuncs.sh
Using a full path is - in my mind - a very inflexible solution. There is a variant to that: set the full path in an environment variable so it can be set before sourcing the script.
# Assuming csh is your working shell
setenv LIBPATH /opt/shell/lib

#!/bin/sh
. $LIBPATH/myfuncs.sh
and of course one should check the setting of the variable and the existence of the lib file (see further down).

Source lib with relative path

Using a relative path has the drawback that the script needs to be executed in a specific place, otherwise the relative path does not work.
#!/bin/sh
. lib/myfuncs.sh
does run if executed in /opt/shell and it assumes that the library sits in a 'lib' directory underneath the directory where the script is executed.
If you are in, say $HOME, then calling /opt/shell/mytest.sh will fail.

Of course you can work around by cd-ing to the right place like this:
#!/bin/sh
cd `dirname $0` || exit 1
. lib/myfuncs.sh
which assumes that the library sits in a 'lib' directory underneath the directory where the script is placed (note the subtle difference to the above 'executed').

Determine the lib path

Well, the script needs to find the library somewhere, and if it does not want to search the whole directory tree it needs to have a starting point. That could be: a set of directories which need to be searched (analogous to the FPATH of ksh), either hardcoded or supplied via an environment variable. Again these directories could be full or relative paths.

So I like to follow the convention of later shells and assume that FPATH is a list of directories where things can be picked up.
# Set FPATH to a production lib and a private lib
setenv FPATH /opt/shell/lib:$HOME/lib

#!/bin/sh
# Require that FPATH is set 
[ -n "$FPATH" ] || { echo Error: FPATH is not set ; exit 1 ; }
# Replace colon by space in FPATH
dirpath=`echo "$FPATH" | tr : ' ' `
# Loop through 'dirpath' to find the library
for dir in $dirpath ; do
  # Check if the lib is a readable file in the given dir
  [ -r "${dir}/myfuncs.sh" ] && . ${dir}/myfuncs.sh && break
done
FPATH can contain relative paths too but that creates more problems I think: in a test env it might find the lib in a relative path, on production systems the path might not exist etc.

Caveats:
  1. this code already uses 2 new variables 'dirpath' and 'dir' (so the lib should not be setting these vars itself, otherwise there are side effects on the for loop)
  2. it is using the 'tr' tool (which needs to be available and in PATH and not be aliased to something else)
  3. the code has grown beyond the normal 'source in' one liner
  4. the code does not work if path names contain white space

Conclusion

Aside from using the fully qualified path directly in the script all solutions require a certain convention (the name of a global variable and how it is used) but allow a certain flexibility so in the end
  1. the library script can reside anywhere
  2. the calling script can reside anywhere
  3. the calling script can be called anywhere
    and all still works: the calling script can find the lib and can also execute the rest of its code.
    (For those who are confused: I'm in a certain directory calling a script in another directory (with relative or absolute path) which sources in a library in yet another place, and everything should work as well as in the very common case where everything is run and placed in the same directory.)

How to copy a Confluence space to another instance of Confluence

Confluence is a professional enterprise wiki and recently I was tasked with transferring contents from one Confluence instance to another.
Problem: I didn't have site admin rights on either instance so the natural XML export/import path was closed and I had to find another solution.

Content in Confluence is organized in so-called spaces; think of them as topics maintained by lists of users with varying degrees of permissions (admin, read, write, export and so on), each space consisting of a set of pages. The task was to transfer a number of spaces from one instance to the other. At first glance not difficult (just transfer the raw wiki markup text), at second glance challenging when you think about attachments, comments etc., but the biggest caveat was that the two instances of Confluence used different access mechanisms (one was using the corporate LDAP and the other an access list of its own), i.e. usernames and passwords were different.
Why caveat?
Because there exists a SOAP based command line interface using Confluence's remote API which provides a copySpace functionality to the same instance or to a different instance but in its current revision it requires that one uses the same username and password on both source and target servers.

The solution: I could enhance the CSOAP source code to get the required functionality: different user name and password on the target server. I introduced two new arguments targetUser and targetPassword. Here's what I did:
  • Downloaded and unpacked the CSOAP package (version 1.5) (scroll to where it says Download JAR)
  • Downloaded and unpacked the source code (it's in Java) using the distribution->confluence-cli-1.5.0.source.zip file.
  • Modified the ConfluenceClient.java file and compiled a new confluence-cli-1.5.jar file (the compilation and creation of the new jar file was a little trickier than it sounds)
  • Replaced the existing confluence-cli-1.5.jar in the downloaded package (in directory release)
  • Run the following command to copy a space:
    java -jar release/confluence-cli-1.5.0.jar --verbose --server https://confluence1.foo.com --user 1234 --password XXXXXX --action copySpace --targetServer https://confluence2.foo.com --targetUser bar@foo.com --targetPassword XXXXXX --space SPACE --newSpace SPACEnew --copyAttachments --copyComments --copyLabels
    (assume that 1234 is my account on the first and bar@foo.com my account on the second Confluence instance)
This was run for all the necessary pages.
As one can see also comments, attachments and labels are copied over.
What the code does is copy the space page by page. The new pages are all created by the same username; also all comments appear to have been written by the same user. I added a little note at the top of each page saying which user originally created it in the first Confluence instance, to keep a little history in the new instance.

Prerequisites for this whole effort:
sufficient access rights to both Confluence instances to extract pages and to create spaces, and a working, reasonably new JDK. I did my development work on Ubuntu with JDK 1.6 and used the resulting jar file on another of our internal servers (on Solaris) sitting closer to both wiki sites in order to speed up the transfer.

As a good web citizen I created an issue and provided my changed code to the author (I did a couple of further enhancements but they weren't production worthy yet so they are not in the code and I was dragged into other things later on).

Note: another caveat was that the Confluence instances used a different set of plugins (plugins are additional functions which can improve the usability of a wiki big time) i.e. if a page author was relying heavily on a particular plugin in instance 1 these pages will be partially broken in the new instance. That was beyond my task and area of influence though (and it turned out to be no issue for the spaces in question).

Create two page per sheet pdf file on Mac

What I had: a pdf file with rather small pages i.e. when printing in original size there was a lot of white boundary around the page.
What I wanted: a new pdf file containing only a subset of pages and with two pages per sheet instead of just one.

Now this didn't sound complex but I found that I needed to take a couple of tries on my Mac until I finally got it going.

First try: Acroread
When using acroread to view the pdf file and Print to print the selected pages with the Save as PDF option I get a complaint:
Saving a PDF file when printing is not supported, choose File > Save.
And File -> Save does not support to save just a selection of pages (this is Adobe Reader 9.4.0).

Second try: Preview
Of course the alternative to acroread is Preview (version 4.2), and its Print function allows selecting pages and saving them to pdf. Only issue: I could not get the Layout changed to two pages per sheet; I tried various settings but I always ended up with one page per sheet.
Which led to the

Third try: Preview via PostScript
Again I was using Preview as before, but this time I chose Save as PostScript.... The resulting file can be viewed again in Preview, where it gets converted to PDF, and it finally showed what I wanted: two pages per sheet, only the pages I wanted. Another Save and I got my new pdf file.

So sometimes supposedly easy things take longer than anticipated, that's where your time goes :-)

And just for my own reference here are the settings again:
I didn't change anything in the top part except for page selection so paper size A4, orientation portrait and scale 100%.
In Preview I kept Automatically rotate each page and ticked Scale each page to fit paper. In Layout I set Pages per Sheet to 2, left the orientation unchanged (from top-left, top-right, bottom-left, bottom-right) and ticked Border Single Hairline just to get a little visual cue and separation.
And also note that the little print preview does not show the layout; you always have to create the new file first and then you'll see whether your changes have any effect.

Note: Print options in Preview differ by file type. When you open an image file in Preview there is indeed an 'Images per page' option but this is absent when opening a pdf file so therefore my need for the detour. I'd welcome any simpler solution.

Java and regular expressions

Lately I needed a solution for the following in Java:
in a String, replace everything from the beginning up to a certain pattern, the String being the content of a text file which had been read in before (imagine you want to remove the header of HTML code up to and including the <body> tag).

I had expected this to be an easy case (since I consider myself to be quite familiar with regular expressions, though I do not program in Java as a main job) but the following
string.replaceFirst(".*<body>","")
did not work. What it did was (just) to remove the line containing <body> .

Looking for an explanation I found this very nice page about flags in regular expressions, and the solution is highlighted there: "By default . does not match line terminators." So one needs to use special pattern flags so that newlines in the string are also matched by the regular expression. Here is the solution:
string.replaceFirst("(?s).*<body>","")
(?s) makes dot match any character including line terminators.
Read the page above to find out more about the other flags (this is very likely also described in the Java documentation but I couldn't find it quickly and as easily described as on the page above).

Command line interface to Movable Type blog (Net::Blogger)

After looking for quite some time and trying to understand the Movable Type API, and very likely not understanding everything, maybe not even much (here is the route which got me nowhere: all XML-RPC links on Movable Type's API redirect to Typepad's developer site. Nothing there did help, neither the JSON stuff nor the scripting libraries. Since I know Python and PHP only very little I focused on the Perl stuff, especially looking into Perl packages like JSON::RPC, RPC::JSON, RPC::XML etc., but couldn't get a grip on how to get something in or out of the blog).
Finally I found an example on the internet which was using Net::Blogger with Movable Type's mt-xmlrpc.cgi, so I went for Net::Blogger, an admittedly old (2006?) package relying on (as I found out during the installation) other old and partially deprecated packages like Error, but in the end it worked.

Installation
Since all was done on my Ubuntu system I first needed to get my Perl installation there in shape. I hadn't done this for a while but it worked as memorized:
  • start cpan in a terminal window and follow the instructions
  • run install CPAN to get an up-to-date version
  • run install Net::Blogger and let CPAN also download and install the necessary dependent packages
This took a while and there were also some hiccups (which I unfortunately forgot, so I can't describe them here and tell people how to circumvent them, sorry), but finally the packages got installed and Net::Blogger is available for me.

Example program
Here is an example Perl script which creates a simple blog entry.
It accesses the mt-xmlrpc.cgi script as outlined in the code; you need your username, your blog id (it is listed e.g. in Tools->Import) and your web services password (you can find it in your profile); the assumption is of course that you own the blog or at least have write permissions. Blogging is a two-step process: first you create the blog entry and then you publish it.

use strict;

use Net::Blogger;

# Your blog provider's path to MT scripts
# (assuming a standard installation of MT)
my $url    = "http://blogs.zzz.com/cgi-bin/mt";
my $proxy  = $url . "/mt-xmlrpc.cgi";

# My credentials
my $user   = "xxx.yyy\@zzz.com";         # Your blog username
my $blogId = 20421;                    # The blog id of your blog

# Create blogger object
my $mt = Net::Blogger->new(engine=>"movabletype");
$mt->Proxy($proxy);

# ... and add my credentials
$mt->Username($user);
$mt->Password("xxxxxxxx");      # Enter your web services password here
$mt->BlogId($blogId);

# Create blog entry
my $entry = $mt->newPost(postbody=>\"<B>hello</B> world") 
   or die "[err] ", $mt->LastError( ), "\n";

# Publish blog entry
my $pub_entry = $mt->metaWeblog()->newPost(
   title=>"hello",
   description=>"world",
   publish=>1,
   );

This is nice for creating a blog entry from the command line, one could use a more polished version which e.g. would take the blog text from a file etc.
What is not clear to me yet: how to add tags to the blog and how to assign proper categories (not sure if this is possible via this interface).

And now?
But after having achieved all of this I want more: I would like to migrate a blog entry from somewhere else with not just the blog text but also attributes like original blog date, files or attachments (called assets in Movable Type speak) of the original blog, tags, categories, comments. This can be achieved via Movable Type's import (maybe not the assets piece, not sure) so I really need a command line import interface, not just a command line blog creation interface.