Thursday, December 16, 2010

Create a comma separated string in shell: a b c -> "(a,b,c)" (no, not sed)

Well, the title does not really explain the real issue but I couldn't find a better one.

Another try: assume you have a list of tokens in a shell script and you want to build a comma-separated list out of those tokens in a loop. The usual problem is one comma too many, either at the beginning or at the end.

For simplicity I use a simple token list.
This example of course could be solved faster with sed.
I add a more complex example at the end.
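
For comparison, the sed variant might look like the following sketch. It assumes the tokens are separated by single spaces and contain no spaces themselves, and the three sed expressions are just one possible way to write it.

```shell
TOKENS="a b c"
# Replace every space with a comma, then wrap the whole line in parentheses.
LIST=$(printf '%s\n' "$TOKENS" | sed -e 's/ /,/g' -e 's/^/(/' -e 's/$/)/')
echo "$LIST"
```

With the token list above this should print (a,b,c). It stops working as soon as a token itself contains a space, which is where the loop comes in.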

A first approach looks like this; it produces a list with an empty first element, so to speak:
TOKENS="a b c"
LIST="("
for token in $TOKENS ; do
   LIST="${LIST},${token}"
done
LIST="${LIST})"
echo "$LIST"
(,a,b,c)
So how does one get rid of the extra empty element at the beginning of the list?

One might add an if-statement:
TOKENS="a b c"
LIST="("
for token in $TOKENS ; do
   if [ "x$LIST" = "x(" ] ; then
      LIST="${LIST}${token}"
   else
      LIST="${LIST},${token}"
   fi
done
LIST="${LIST})"
echo "$LIST"
(a,b,c)
In my search for a shorter, one-line solution I found the following approach using shell parameter substitution:
TOKENS="a b c"
LIST=""
for token in $TOKENS ; do
   LIST="${LIST:-(}${LIST:+,}${token}"
done
LIST="${LIST})"
echo "$LIST"
(a,b,c)
This is admittedly not easy to understand at first glance unless you are very familiar with parameter substitution.
I'm using the complementary behavior of :- and :+ .
${LIST:-(} expands to the current value of LIST if it is set and not empty, and to ( otherwise.
So in the first pass through the loop LIST is still empty and thus ( is put out.
In the following passes LIST is no longer empty and is put out as is.
${LIST:+,} expands to a comma if LIST is set and not empty, and to nothing at all otherwise.
In the first pass through the loop LIST is still empty and nothing is put out.
In the following passes LIST is no longer empty and a comma is put out.

In all cases the token is appended at the end.
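
To see the two expansions in isolation, here is a small sketch that evaluates them once for an empty LIST and once for a non-empty one. The variable names first_prefix, first_sep, later_prefix and later_sep are made up for this illustration.

```shell
# State before the first pass: LIST is empty.
LIST=""
first_prefix="${LIST:-(}"   # LIST is empty, so the default "(" is used
first_sep="${LIST:+,}"      # LIST is empty, so nothing is produced

# State before any later pass: LIST already holds something.
LIST="(a"
later_prefix="${LIST:-(}"   # LIST is non-empty, so its own value is used
later_sep="${LIST:+,}"      # LIST is non-empty, so the alternative "," is used

echo "first pass:  [${first_prefix}] [${first_sep}]"
echo "later pass:  [${later_prefix}] [${later_sep}]"
```

The brackets in the output are only there to make the empty string visible.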

Here is an example with more complex tokens that contain a space and should be surrounded by quotes in the resulting list.
A="sam smith"
B="jane jones"
C="gabe miller"
LIST=""
for token in "$A" "$B" "$C" ; do
   LIST="${LIST:-(}${LIST:+,}'${token}'"
done
LIST="${LIST})"
echo "$LIST"
('sam smith','jane jones','gabe miller')
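
One edge case is worth noting: if there are no tokens at all, the loop body never runs, LIST stays empty, and the final result is just ")". The sketch below wraps the loop in a function and reuses the :- trick for the closing step so that an empty input yields "()". The function name join_list is hypothetical, and in plain POSIX sh the variables list and token are global.

```shell
# join_list is a hypothetical helper name chosen for this sketch.
join_list() {
   list=""
   for token in "$@" ; do
      list="${list:-(}${list:+,}'${token}'"
   done
   # If no token was seen, list is still empty: fall back to "(" before closing.
   printf '%s\n' "${list:-(})"
}

join_list "sam smith" "jane jones" "gabe miller"
join_list
```

The first call should print the same list as above, the second one an empty pair of parentheses.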

2 comments:

  1. Your blog has given me that thing which I never expect to get from all over the websites. Nice post guys!

  2. this is a very handy tidbit. Yet another intricacy that I never knew about... Concise and to the point. Thanks ...
