Tuesday, November 27, 2012

A little script driven two stage dialog with zenity

On Solaris and Linux there is a little utility called zenity which displays a dialog window. It has various options, from simply displaying a string to file selection dialogs and progress bars. I sometimes use it when I write a script but don't want the user to follow the output of the script in a terminal. Especially if the script is going to be used by a larger group of mainly non-technical users, one needs a GUI dialog. zenity is not very fancy in its capabilities, configurability or design, but it serves its purpose.

In this article I want to show how to use zenity several times in a script.

  • At first the users are presented with a list of servers to choose from and
  • secondly there is a list of applications to be selected. The list of applications and the geometry of the zenity window depend on the previous selection.
  • Finally in a third invocation of zenity there is a summary window shown.

    The task has been split into two files.
    One is a config file which contains the definitions of the various lists.
    The second file is the actual script. Since I'm using arrays to define the lists I'm using bash in this example.
    (Note: originally the script ran on Solaris 10 in ksh but the 'echo ... | zenity' construct somehow did not work on Linux.)

    Here is the config file

    #!/bin/echo Error:_this_needs_to_be_sourced_in:
    #
    
    # A set of arrays (bash/ksh syntax)
    # (server and domain names are fake of course)
    
      ALIAS[1]="Japan server"
    MACHINE[1]="server1.japan"
       MENU[1]='Finance
    Logistics
    Sales'
      MSIZE[1]="--height=220 --width=330"
    
      ALIAS[2]="UK server"
    MACHINE[2]="serv2a.uk"
       MENU[2]='Finance
    Sales'
      MSIZE[2]="--height=190 --width=330"
    
      ALIAS[3]="US server"
    MACHINE[3]="newserv.us"
       MENU[3]='Finance
    Logistics
    Marketing
    Sales'
      MSIZE[3]="--height=220 --width=330"
    
    # Set the number of array elements
    numElements=3
    
    The config file defines four arrays:
  • ALIAS is the nickname of the actual server and will be displayed in the first menu
  • MACHINE is the actual machine name
  • MENU is the list of choices in the second zenity window
  • MSIZE is the geometry setting for this menu; the height varies with the number of entries.

    Finally there is a variable numElements which defines the length of the arrays, 3 in this case.
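
    The index-driven loop over these parallel arrays can be sketched with a trimmed-down stand-in for the config file (the values here are illustrative only):

```shell
#!/bin/bash
# Stand-in for the config file; the real values are defined there
ALIAS[1]="Japan server"; MACHINE[1]="server1.japan"
ALIAS[2]="UK server";    MACHINE[2]="serv2a.uk"
numElements=2

# Walk the parallel arrays by index, just like the main script does
list_aliases() {
    ind=1
    while [ $ind -le $numElements ] ; do
        echo "${ALIAS[$ind]} -> ${MACHINE[$ind]}"
        ind=$((ind+1))
    done
}
list_aliases
```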

    Here is the script:

    #!/bin/bash 
    
    # The config file 
    #   Should be in the same directory as the script.
    #   Could of course be passed as an argument too.
    ########################################
    CONFIGFILE=`/usr/bin/dirname "$0"`/zenity.config
    
    # Check if the config file exists
    ########################################
    [ -r "$CONFIGFILE" ] || { zenity --error --text "Config file does not exist: $CONFIGFILE" ; exit 1 ; }
      
    # Source the config file
    ########################################
    . "$CONFIGFILE"   || { zenity --error --text "Error while sourcing config file: $CONFIGFILE" ; exit 1 ; }
    
    # Step 1: select an option out of the list of aliases
    ########################################
    selection=`\
    ind=1
    while [ $ind -le $numElements ] ; do
            echo "${ALIAS[$ind]}"   # Piped to zenity
            ind=$((ind+1))
    done | zenity --list --height=200 --width=350 \
    --column="Choose your Server (highlight and press OK)" \
    --title="Remote Server Launcher" \
    `
       
    # If zenity has been canceled or nothing selected then exit
    ########################################
    [ $? -eq 0 -a "x$selection" != "x" ] || exit 1
      
    # Now map the user selection to a machine name
    # and show the machine specific menu
    ########################################
    ind=1
    while [ $ind -le $numElements ] ; do
            if [ "x$selection" = "x${ALIAS[$ind]}" ] ; then
                    server=${MACHINE[$ind]}
    
                    # Stage 2: select the window size and a specific application
                    ########################################
                    selection=`\
                    echo "${MENU[$ind]}" | zenity --list ${MSIZE[$ind]} \
                    --column="Application (highlight one and press OK)" \
                    --title="Remote Server Launcher" \
                    `
                    # If zenity has been canceled then exit
                    [ $? -eq 0 ] || exit 1
                    break
            fi
            ind=$((ind+1))
    done
    
    # Just for information purposes
    ########################################
    zenity --info --text "You chose:\nServer: ${ALIAS[$ind]} (${MACHINE[$ind]})\nApplication:  $selection  "
    

    A few notes:

  • zenity is also used to display error messages
  • the while loops could be replaced by for ind in {1..$numElements} in ksh but this does not work in bash (bash performs brace expansion before variable expansion)
  • the zenity windows are invoked inside backticks in order to capture their output
  • the script shows various options of zenity and how to use them.
  • after having gone through these choices of course something real should be done rather than just displaying the choices (but that is beyond this article).
  • this array technique allows you to dynamically set the content of zenity dialogs, to set variables which are hidden from the user and to separate the logic from the content (and more could be done, e.g. defining all geometries in the config file)
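
    The capture-and-check pattern around zenity can be tried without a GUI by substituting a plain command for the dialog; here the made-up function pick_first stands in for a 'zenity --list' call that returns the highlighted row:

```shell
#!/bin/bash
# pick_first simulates a user selecting the first entry;
# in the real script this would be 'zenity --list ...'
pick_first() { head -1 ; }

selection=`printf '%s\n' "Japan server" "UK server" "US server" | pick_first`

# If the dialog was canceled or nothing was selected then stop
[ $? -eq 0 -a "x$selection" != "x" ] || exit 1
echo "You picked: $selection"
```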

    Here is the first zenity window (on Ubuntu 11) after having clicked one server.

    Here is the second zenity window with the appropriate menu and geometry

    And finally here is the last zenity window to recap the selections

    Sunday, November 25, 2012

    VirtualBox: command line (ssh) access to guest system

    Today I needed to access my VirtualBox Solaris session from my Mac (the screensaver seemed to hang and I wanted to kill it to regain access).

    I found this very informative posting which contained all the information I needed, and I put its contents into this little script (not really necessary, but a nice little exercise in writing a loop for these kinds of repetitive statements).

    #!/bin/bash
    
    Machine="Solaris 10"    # My guest system
    Adapter=e1000
    GuestPort=22
    HostPort=2222
    Protocol=TCP
    
    # Note: do not put any spaces into the brace enclosed string,
    # it will break brace expansion
    for var in {Protocol,HostPort,GuestPort} ; do
      echo $var
      eval VBoxManage setextradata \"$Machine\"  "VBoxInternal/Devices/$Adapter/0/LUN#0/Config/ssh/$var" \${$var}
    done
    
    # This is just to check if the settings are ok
    VBoxManage getextradata "$Machine"  enumerate
    
    # And this is also only a test to see if the forwarded port is reachable
    # (ping cannot check a TCP port; its -p option sets a packet pattern, not a port)
    nc -z localhost $HostPort
    

    Access to the guest then works like this
    ssh -p 2222 localhost

    Access did not work right away for my VirtualBox Solaris session since it was running at the time when I executed the commands. In order to enable access I had to save and restart the session (of course a reboot would have worked too).

    Given that VBoxManage is in your PATH the script should work on any UNIX system with bash.

    I admit though that this script is probably more confusing than helpful since it kind of hides what is actually going on, the original posting simply lists 3 statements which are easy to read and understand.
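
    For the record, the core trick of the loop, reading a variable whose name is itself stored in another variable, looks like this in isolation (with echo instead of VBoxManage so nothing is modified):

```shell
#!/bin/bash
# The same settings as in the script above
Protocol=TCP
HostPort=2222
GuestPort=22

show_settings() {
    for var in Protocol HostPort GuestPort ; do
        eval val=\$$var        # val gets the value of the variable named in $var
        echo "$var=$val"
    done
}
show_settings
```

bash also offers ${!var} for this kind of indirect expansion, which avoids eval altogether.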

    Saturday, November 24, 2012

    Reusable code in shell scripts or How to create a shell library

    When working a lot with shell scripts (either your own or others') you get to the point where certain pieces of code seem to be repeated numerous times, so eventually one starts to wonder if and how one could build and use a library of reusable code.
    A seasoned programmer could eventually end up with a library of standard functions, or better, a library for various shells (sh, bash, ksh, etc.) and various operating systems (Solaris vs. Linux being the major distinction, but the various releases of each major OS show differences too).

    For a particular project written in sh/ksh on Solaris I built a library and below I'll explain a few of the considerations.

    What is a shell library

    I haven't seen the phrase 'shell library' anywhere else; to me it means a collection of environment variables and functions. Using a shell library thus means sourcing a file which contains this collection, thereby setting up the environment of the executing shell script.

    Why is a shell library useful

    Many shell scripts contain settings of environment variables and definitions of functions at the beginning of the script. When working in projects with multiple scripts where the same or similar settings are being used it does seem to make sense to put these settings into a single place. An important advantage of such an approach: if the setting needs to be changed later it needs to be changed in only one place. This concept is obvious for programmers but I have seen it rarely used in shell scripts.
    When I see pieces of code like HOST=`hostname` and many more of such statements repeated in dozens of scripts (all of them part of a big project) it is time to start using a library.

    What goes in

    That is probably the simplest question: I'm almost tempted to say any piece of code that is used twice or more should be in the library.

    Changes over time

    One of the big questions is how to handle a library over time.
    New things need to be added i.e. the library grows.
    Maybe one has ideas to improve the current code and thus code changes.
    Will the changed library still work in all older invocations (backward compatibility)?

    How to invoke the library?

    Use the dot operator "." as in
    . lib/mylib.sh
    assuming that your library sits in a file called mylib.sh in a sub directory lib.
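
    A minimal end-to-end sketch (the file name and variable are made up): write a one-line library to a temporary file, source it with the dot operator, and the setting is available in the calling shell:

```shell
#!/bin/bash
# Create a throwaway library file (illustrative only)
LIBFILE=`mktemp` || exit 1
echo 'GREETING="hello from the library"' > "$LIBFILE"

# Invoke the library
. "$LIBFILE"

echo "$GREETING"
rm -f "$LIBFILE"
```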

    Location

    In order to invoke the library the calling script needs to locate it. Where should the library reside?
    Assuming that it is part of a project (and thus a collection of scripts which are deployed in conjunction) you need to define a directory (without established standards for shell script libraries you might as well call it lib following the convention of other languages).

    Some examples

    Simplest case: setting a variable

    HOST=`hostname` ; export HOST

    So your scripts need to run the hostname command only once. Of course the underlying assumption here is that the hostname command can be found in the PATH of the user executing the script.

    Extract a variable

    i.e. extract pieces of information out of a larger output.
    Say you have the output of id and you want the username:

    id
    uid=712(joe) gid=100(other) groups=100(other),22(staff)

    The following extracts the string between the first parentheses.

    USER=`id |sed -e 's/).*//' -e 's/.*(//'` ; export USER
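
    The two substitutions can be verified with a fixed sample line instead of live id output (fake uid and names of course):

```shell
#!/bin/bash
sample='uid=712(joe) gid=100(other) groups=100(other),22(staff)'

# First delete from ')' to the end, then delete everything up to '('
user=`echo "$sample" | sed -e 's/).*//' -e 's/.*(//'`
echo "$user"
```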

    Setting a variable for if clauses

    The control flow in scripts very often depends on whether a variable has a certain value or not. You can introduce a (boolean) variable to subsume this logic.

    Imagine that you want to test whether the script is executed by root or not. One could use the USER variable and (always) test like this

    if [ "$USER" = "root" ] ; then ... ; fi

    An alternative could be this setting in your library which creates a new variable isRootUser

    isRootUser=`ID=\`id | sed -e 's/uid=//' -e 's/(.*//'\`; [ $ID -eq 0 ] && echo $ID` ; export isRootUser

    This at first glance complex piece of code simply

  • runs the id command and extracts the uid and sets the variable ID
  • checks whether ID is zero (this would also cover the case that there is a second superuser account with uid 0) and if so then sets the variable isRootUser to ID
  • The variable can then be used as follows:

    if test $isRootUser ; then ... ; fi

    Advantages of this approach:

  • the root check is encapsulated in the setting of isRootUser (if you decide to use a different method to identify the root user you can change it here and change it only once in the library)
  • it runs only once at the invocation of the library (not possibly multiple times in your script)
  • thereafter a very simple check using a variable with a telling name can be used as many times as needed
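
    Again the inner extraction can be checked against canned id output; the nested backticks are unrolled into a function here for readability:

```shell
#!/bin/bash
# Extract the numeric uid from a sample id line
get_uid() {
    echo "$1" | sed -e 's/uid=//' -e 's/(.*//'
}

get_uid 'uid=0(root) gid=0(root)'        # a root line yields 0
get_uid 'uid=712(joe) gid=100(other)'    # a normal user yields the uid
```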

    Common functions

    Maybe this is the more interesting piece and related to other programming languages: defining a set of reusable functions. Due to the nature of shells you have to watch out for the scope and use of variables (local / global / input / return).

    A simple function to print an error message and stop the script:

    die() {
      echo "Error: $*" >&2
      exit 1
    }

    # Usage: 
    #     die Some condition has not been met
    # or: die "Some condition has not been met"
    # or: die "Some condition" has "not been" met

    A wrapper to mkdir including nicer error handling:

    mk_dir() {
      [ -z "${1}" ] && return 1
      [ -d "${1}" ] || mkdir -p "${1}" 2>/dev/null || { echo "Error: cannot mkdir $1"; return 1; }
      return 0
    }

    # Usage: 
    #     mk_dir DIRECTORY
    #        if you are not interested if successful or not
    # or: mk_dir DIRECTORY || return 1
    #        if you want to stop further execution of a function after failure
    # or: mk_dir DIRECTORY || exit 1
    #        if you want to stop the script after failure
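
    Both functions can be exercised safely; since die exits the current shell it is called in a subshell here just to observe its message (the message and directory names are throwaway examples):

```shell
#!/bin/bash
die() {
  echo "Error: $*" >&2
  exit 1
}
mk_dir() {
  [ -z "${1}" ] && return 1
  [ -d "${1}" ] || mkdir -p "${1}" 2>/dev/null || { echo "Error: cannot mkdir $1"; return 1; }
  return 0
}

# die would terminate the script, so run it in a subshell
msg=`(die "disk full") 2>&1`
echo "$msg"

# mk_dir creates nested directories and succeeds if they already exist
tmp=`mktemp -d`
mk_dir "$tmp/a/b" && mk_dir "$tmp/a/b" && echo "mk_dir ok"
rm -rf "$tmp"
```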

    Check if you are dealing with a number (a positive integer or zero) by invoking another shell (in this case: ksh):

    isNum() {
      ksh -c "[[ \"$1\" = +([0-9]) ]]"
      return $?
    }
    # Usage:
    #     isNum $N && echo "yes"
    #        do something if ok
    # or: isNum $N || echo "no"
    #        do something if not ok
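
    If ksh happens to be unavailable, the same check can be written natively in sh/bash with a case pattern (a sketch; the pattern rejects empty strings and anything containing a non-digit):

```shell
#!/bin/bash
isNumSh() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;   # empty or contains a non-digit
    *)           return 0 ;;
  esac
}

isNumSh 42 && echo "42 is a number"
isNumSh 4x || echo "4x is not"
```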

    Have fun building your own libraries.

    Friday, November 23, 2012

    A little exercise about a recent forum question (input field handling in awk and Perl)

    Just recently the following question was posted to the UNIX scripting group on LinkedIn:
    remove all duplicate entries in a colon separated list of strings
    e.g. a: b:c: b: d:a:a: b:e::f should be transformed to a: b:c: d:e::f

    Some of the fields contain spaces which should be preserved in the output of course, and there is an empty field too, which (to me and other authors) indicates that the fields are not necessarily ordered. Here I won't discuss the suggested solutions; I also did not reply to the original posting because I read it a month too late.

    awk
    But when reading the question my brain got working and I could not help trying for myself. The obvious tool of choice for exercises like this is awk, because awk has built-in mechanisms for viewing lines as a sequence of fields with configurable field separators.

    A solution could be

    BEGIN {
      FS=":" ;    # field separator
      ORS=":"     # output record separator
    }
    { for(i=1;i<=NF;i++) {  # for all input fields
        if( f[$i] ) {       # check if array entry for field already exists
         continue;          # if yes: go to next field
        } else {
         print $i;          # if no: print the field content
         f[$i] = 1;          # and record it in array 'f'
        }  }
    }

    which leads to this output:
    a: b:c: d:e::f:

    The script can be shortened by omitting superfluous braces and 'else' to

    BEGIN { FS=":" ; ORS=":" } 
    { for(i=1;i<=NF;i++) { if(f[$i]) continue; f[$i]=1; print $i; } } 

    The script uses a very simple, straightforward logic: loop through all input fields; if a field is new then print it, if not skip it. This is achieved by recording each field in an associative array 'f' when it first occurs.
    Using the field separator FS for splitting the input line and the output record separator ORS when printing (you need to know that 'print' automatically adds ORS) makes this an easy task.

    There is one issue though: this solution adds an extra colon at the very end (compared to the requested output). Whether that matters depends on the context of the request, so one might prefer this code:

    BEGIN { FS=":" } 
    { printf "%s", $1; f[$1]=1; 
      for(i=2;i<=NF;i++) { if(f[$i]) continue; f[$i]=1; printf "%s%s", FS, $i } }

    which uses a slightly different logic: the first field is printed straight away (and recorded), the loop checks the remaining fields 2..NF and prints the field separator as a prefix to the field content. This code also works for the extreme case where there is just one field and no colon.
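
    The second variant can be exercised directly from the shell; printf '%s' feeds the sample line without a trailing newline (more on that below), an explicit "%s" format keeps a literal % in the data from confusing printf, and the spaces inside fields like ' b' survive:

```shell
#!/bin/bash
printf '%s' 'a: b:c: b: d:a:a: b:e::f' | awk '
BEGIN { FS=":" }
{ printf "%s", $1; f[$1]=1
  for(i=2;i<=NF;i++) { if(f[$i]) continue; f[$i]=1; printf "%s%s", FS, $i } }'
echo    # just to terminate the line on the terminal
```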

    Perl
    I then wondered whether this couldn't be done equally well or even shorter in Perl, but my best solution is a little lengthier because I have to use 'split' to get the individual fields.

    $FS=":";
    @s = split($FS,<>);
    for($i=0;$i<=$#s;$i++) {$e=$s[$i]; next if(exists($f{$e})); $f{$e}=1; print $e,$FS }


    I could have used command line options "-a -F:" to avoid the 'split' but I need FS to be defined anyway for the output (I don't know if the split pattern defined by -F can be accessed in Perl).
    I use 'split' to chop up the input line and put it into an array 's'. Then the same logic applies as in awk. Instead of an associative array I'm using a hash 'f' in Perl. The variable 'e' is only used to avoid repeated occurrences of $s[$i]. In the end it's a matter of personal preference which solution you take.


    It should be noted that I tested with

    echo -n "...." | awk '...' or perl -e '...'

    which feeds a string without newline to the pipe which helped to avoid 'chomp' in Perl for removing the newline in the last field.

    Thursday, November 22, 2012

    Create anonymous pdf file

    I often use LibreOffice (previously I used OpenOffice.org; not sure where these two are heading) to create a text, and its Export to PDF... function to create the corresponding pdf file.

    Today I wanted to create an anonymous pdf file. It was the copy of a text where I had omitted all personal references (name, address, links to personal web sites, etc.) and I also wanted the pdf file to be anonymous in the sense that it shouldn't contain any personal trace.

    I'm not sure that I reached 100% anonymity but here is what I found: the Properties section of the pdf file contained a number of references to my name which I had to get rid of one by one.

    File: which is the file name. Originally my initials were part of it but I had saved the file without it already using a neutral file name.

    Title: often I give documents a title in  File -> Properties.. . In the Description tab there is a Title entry where in this instance I had put my name too which I needed to remove.
    (My first try was the LibreOffice Export to PDF... dialog. There is a tab User Interface with a subsection Windows which shows a tick box Display document title, ticked by default. After removing the tick the title disappeared from the top of the pdf window, but it still showed in the pdf properties.)

    Author: I have set my profile in LibreOffice preferences User Data e.g. name, address etc. I use it to include parts of it whenever necessary. In order to exclude this information from showing up in a pdf file I had to change the LibreOffice file's properties.
    Go to File -> Properties.. and choose the General tab. There is a tick box Apply User Data which is ticked by default. Removing the tick prevents user data from being used. The LibreOffice file needs to be saved first and then exported again.

    Location: since I usually create and save files on my system under my account (which contains my full name) it showed as  Macintosh HD:Users:fullname:Documents so I had to find an anonymous place and I chose /Users/Shared.

    Since I'm not a pdf expert there might be other (maybe hidden) references somewhere in the pdf file. I'd love to know.

    Newline in awk and Perl

    When someone switches from awk to Perl, one of the beginner's mistakes is to forget that awk does some things automatically which need to be coded explicitly in Perl.

    One example: end of line.
    awk automatically drops the newline character at the end of a line. In Perl you need to do that manually by using the chomp function (or some other method).

    Since echo abc does print the string abc plus a newline character the following string comparison works well in awk:

    echo abc | awk '{if($0=="abc") print "yes"}'

    will print "yes" whereas the seemingly equivalent in Perl does not:

    echo abc | perl -e 'if(<> eq "abc") { print "yes\n" }'

    The experienced Perl coder probably does this:

    echo abc | perl -e '$a=<>; chomp($a); if($a eq "abc") {print "yes\n" }'

    Of course you could change the test and explicitly check for the newline (or better: the input record separator $/ in Perl)

    echo abc | perl -e 'if(<> eq "abc$/") {print "yes\n" }'

    Or you could use echo -n to feed just a string without newline. And of course print in Perl requires a newline too whereas awk adds it automatically.

    A bit more complex: check the content of the last field in an input line.
    awk:
    echo abc xyz | awk '{ if($NF=="xyz") print "yes" }' 
    Perl:
    echo abc xyz | perl -e '@a=split / /,<>; chomp(@a); if($a[$#a] eq "xyz") {print "yes\n" }'
    Before chomp() the last field equals the string xyz plus a newline and the comparison test will fail.

    Perl in that sense is more precise and gives the user greater control, on the other hand awk is an old but well established UNIX tool whose inbuilt features can be used to one's advantage.

    It is nice to have tools which do things automatically; the drawback is that you get so used to them that over time you forget these automations exist.

    (A real life example for me: my car has parking sensors. After one year I'm already so used to its existence that whenever driving another car I tend to forget that I have to use the good old fashioned method rather than waiting for the frequency of the beeps.)