Wednesday, July 8, 2020

Ansible - convert yaml file to json

Using the Ansible filters from_yaml and to_json it is easy to construct a task to convert a yaml file to JSON.

Here is the Ansible playbook yaml_to_json.yml (in a real world example the names of the files would probably be parameterized).

---

- hosts: localhost

  tasks:
    - shell: cat my.yaml
      register: result

    - set_fact:
        myvar: "{{ result.stdout | from_yaml | to_json }}"

    - copy:
        content: "{{ myvar }}"
        dest:    my.json

Here is an example of a yaml input file which contains some special characters, lists, strings, numbers etc.

---

xx:
 - a:
 - b:
     A: aadf7{fdfd"öög
     B: [ 'asdfadsf*5%@@@"masdf' , 123456 ]

yy:
 n:
   C: 'AAdf7{fdfd"öög'
   D: [ 'ASdfadsf*5%@@@"masdf' , 789.56 ]
 m:

When run via ansible-playbook yaml_to_json.yml, the playbook generates the JSON output file below (shown here pretty-printed with jq .).

{
  "yy": {
    "m": null,
    "n": {
      "C": "AAdf7{fdfd\"\\u00f6\\u00f6g",
      "D": [
        "ASdfadsf*5%@@@\"masdf",
        789.56
      ]
    }
  },
  "xx": [
    {
      "a": null
    },
    {
      "b": {
        "A": "aadf7{fdfd\"\\u00f6\\u00f6g",
        "B": [
          "asdfadsf*5%@@@\"masdf",
          123456
        ]
      }
    }
  ]
}
Note how lists, dictionaries and special characters are converted to their respective JSON representations.

Also note that I am using the Ansible copy module. A shell: echo "{{ myvar }}" > my.json does not work since it does not handle the quoting and the special character substitutions correctly.
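
The escaping seen in the JSON output can be reproduced outside Ansible. The following is a minimal Python sketch, assuming that the to_json filter behaves like Python's json.dumps with its default ensure_ascii=True. The dictionary is a hand-written stand-in for part of the parsed YAML, not something produced by Ansible.

```python
import json

# Hand-written stand-in for part of the parsed YAML input (illustration only).
data = {"yy": {"n": {"C": 'AAdf7{fdfd"öög', "D": ['ASdfadsf*5%@@@"masdf', 789.56]}, "m": None}}

# With the default ensure_ascii=True, non-ASCII characters become \uXXXX escapes
# and double quotes inside strings are escaped with a backslash.
text = json.dumps(data)
print(text)

# Parsing the JSON text again gives back the original data unchanged.
assert json.loads(text) == data
```

The round trip at the end shows that nothing is lost by the escaping: any JSON parser turns the escapes back into the original characters.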

Friday, March 20, 2020

Different 'exit' behaviour between bash and Korn shell in command pipelines

When running command pipelines it is sometimes convenient to break out of a sub process and exit a shell script.

There are differences between shells, though. I want to show this behaviour and its consequences, in particular whether common expectations are met, and also suggest solutions.

Here is my example. It is a simple script: a command pipeline consisting of a printf (printing two lines) followed by a while loop, wrapped by an opening and a closing echo.

echo Before pipeline
printf "text1\ntext2\n" | while read line ; do
  # Do something useful with 'line'
  echo $line
done
echo After pipeline
The output, whichever shell is used, is:
Before pipeline
text1
text2
After pipeline

Now let's modify this script and add an exit into the while loop so that it looks like this

echo Before pipeline
printf "text1\ntext2\n" | while read line ; do
  # Do something useful with 'line'
  echo $line
  exit 1
done
echo After pipeline
As artificial as this example might look, you can simply assume that more complex things happen in the while loop and that under certain conditions one might want to exit from within the loop.

Here are the results for bash and Korn shell (ksh):

bash (#!/bin/bash)
Before pipeline
text1
After pipeline
Exit code: 0

ksh (#!/bin/ksh)
Before pipeline
text1
Exit code: 1

Common behaviour: both shells leave the loop and neither prints the second line 'text2'.
bash continues with the commands after the while loop (and exits with the result of the last 'echo' command).
ksh exits the whole script.

While the behaviour of ksh seems more natural (exit means exit everything), the bash behaviour can be explained by the fact that bash runs the while loop of the pipeline in a subshell with its own scope, as if it were a separate shell script. An exit inside the loop only returns to the parent. ksh, by contrast, runs the last command of a pipeline in the current shell, so exit terminates the script itself. For bash this also means that the parent script can capture the exit code, so the solution is a check after the loop.

echo Before pipeline
printf "text1\ntext2\n" | while read line ; do
  # Do something useful with 'line'
  echo $line
  exit 1
done
[ $? -ne 0 ] && echo ERROR && exit 2
echo After pipeline
The result for bash: exit code 2 and the output
Before pipeline
text1
ERROR
The exit code check line is meaningless for ksh since it will never be reached.

I've seen plenty of bash scripts with similar constructs where the author coded a quick 'exit' line into the while loop but forgot to check the result, probably assuming the ksh behaviour.
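
The parent/child relationship behind the bash behaviour is not specific to shells. As an analogy only (this is not part of the scripts above), here is a minimal Python sketch in which a child process exits with code 1 and the parent keeps running unless it checks the return code explicitly.

```python
import subprocess
import sys

# The child plays the role of the subshell: it prints one line and exits with 1.
child_code = "print('text1'); raise SystemExit(1)"
result = subprocess.run([sys.executable, "-c", child_code],
                        capture_output=True, text=True)

# The parent is still alive here, just like bash after the pipeline.
print(result.stdout, end="")
print("child exit code:", result.returncode)

# The equivalent of the '[ $? -ne 0 ]' check: the parent has to react explicitly.
parent_exit_code = 2 if result.returncode != 0 else 0
print("parent would exit with:", parent_exit_code)
```

Without the last check the parent would simply carry on, which is the trap described above for bash.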

Tuesday, February 4, 2020

Circular shift of lists in python

Scenario

Assume I have a list in Python
a = [ -2, -1, 0, 1, 2 ]
(an admittedly simple list here with integer numbers).
I want to shift the entries of the list to the left or right so that the entries moving out of the list re-enter it at the other end, e.g. I want to shift by 2 to the left to get this list:
[0, 1, 2, -2, -1]

Solutions

There are solutions using numpy but I want to present another easy solution just using slices.
shift = 2
x = a[shift:]       # This is [ 0, 1, 2 ]
y = a[:shift]       # This is [ -2, -1 ]
print( x + y )

[0, 1, 2, -2, -1]
I am
  • setting my shift value to 2
  • creating a slice from index 2 to the end of the list
  • creating a slice from the beginning of the list up to and including index 1 (i.e. shift - 1)
  • adding the two slices to get the shifted result

    This also works for negative shift values (shift to the right), which is a particularly nice feature of the Python [:] operator.

    shift = -1
    x = a[shift:]       # This is [ 2 ]
    y = a[:shift]       # This is [ -2, -1, 0, 1 ]
    print( x + y ) 
    
    [2, -2, -1, 0, 1]
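
One caveat: if the absolute value of the shift exceeds the list length, the plain slices return the list unshifted (a[6:] is empty and a[:6] is the whole list). Here is a minimal sketch of a helper that first reduces the shift modulo the length. The name rotate_left is made up for this illustration, and collections.deque.rotate from the standard library is shown for comparison.

```python
from collections import deque

def rotate_left(items, shift):
    """Return a new list circularly shifted left by 'shift' (negative = right)."""
    if not items:
        return []
    shift %= len(items)          # e.g. 6 -> 1 and -1 -> 4 for a list of length 5
    return items[shift:] + items[:shift]

a = [-2, -1, 0, 1, 2]
print(rotate_left(a, 2))     # [0, 1, 2, -2, -1]
print(rotate_left(a, -1))    # [2, -2, -1, 0, 1]
print(rotate_left(a, 6))     # [-1, 0, 1, 2, -2]

# deque.rotate(n) shifts to the right for positive n, hence the minus sign.
d = deque(a)
d.rotate(-2)
print(list(d))               # [0, 1, 2, -2, -1]
```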
    

    Add-on: calculate the new index of a shifted element

    Starting with
    a = [ -2, -1, 0, 1, 2 ]
    
    the element -2 has index 0. Shifting by 2 to the left gives
    [0, 1, 2, -2, -1]
    
    puts element -2 at index 3.

    How can I calculate that?

    
    def index_after_shift( old_index, shift ):
      return ( old_index - shift ) % len(a) 
    
    # Examples
    for shift in [2, -1 , 6 ]:
      for ind in [ 0,1,2,3,4]:
        print( ind, shift, index_after_shift( ind, shift ) )
      print()
    
    0 2 3
    1 2 4
    2 2 0
    3 2 1
    4 2 2
    
    0 -1 1
    1 -1 2
    2 -1 3
    3 -1 4
    4 -1 0
    
    0 6 4
    1 6 0
    2 6 1
    3 6 2
    4 6 3
    
    The function index_after_shift calculates the new index.
    A shift by 2 to the left, i.e. shift = 2, means that we subtract 2 from the current index to get the new one, hence old_index - shift. This is easy to see for indexes 2, 3, 4, ... which become 0, 1, 2, ....
    What do we do with smaller indexes?
    Here we are using the modulo function which will do the necessary calculation for us e.g.
    shift = 2
    old_index = 1
    # length of a is 5
    ( old_index - shift ) % len(a)
    # = ( 1 - 2 ) % 5
    # = -1 % 5
    # = 4
    
    The modulo has converted the negative number resulting from the subtraction into a positive one.

    A shift to the right means we have to add something to the index to get to our new index (since the variable shift is negative for right shifts, we subtract a negative number in the function, which results in the addition of a positive number).
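
As a cross-check, the formula can be compared against the slice-based shift for every index. This is a small sketch; unlike the function above it takes the list length as a parameter (a change made here only so that it does not depend on the global a).

```python
def index_after_shift(old_index, shift, length):
    """New index of an element after a circular left shift by 'shift'."""
    return (old_index - shift) % length

a = [-2, -1, 0, 1, 2]
for shift in [2, -1, 6]:
    k = shift % len(a)               # reduce the shift so slicing works for 6 as well
    shifted = a[k:] + a[:k]
    for old_index, value in enumerate(a):
        new_index = index_after_shift(old_index, shift, len(a))
        # The element must be found at the computed position.
        assert shifted[new_index] == value
print("formula agrees with slicing")
```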

    Thursday, January 30, 2020

    Modulo operation in programming languages - differently implemented

    Many programming languages support an operation which they call the modulo operator and which is often denoted by the symbol
      %
    Often it is also referred to as the remainder after division.

    Only recently I found out that this operation does not behave as one might think. The results differ depending on which programming language you are using.

    Before I go into the details I would like to illustrate the difference which shows when using negative numbers.

    Example

    I want to calculate these two expressions:
     17 % 10
    -17 % 10
    
    The result for the second expression will be different depending on the programming language.

    C

    When you are using this statement in C
    printf("%d  %d\n", 17 % 10, -17 % 10 );
    
    you get
    7  -7
    

    Python

    When you are using this statement in python
    print( '{0}  {1}'.format( 17 % 10,  -17 % 10 ) )
    
    you get
    7  3
    

    Explanation

    modulo as remainder by division

    Some programming languages implement modulo as the remainder of a division that truncates towards zero. Thus -17 % 10 is what is left of -17 after taking away the largest multiple of 10 that does not go past it (here -10), so you are left with -7.

    modulo as mathematically correct number

    Other programming languages implement modulo in the mathematically correct sense.
    Mathematically, the modulo is defined as the number which needs to be added to a multiple of the divisor to get the original number.
    x = m % n
    There must be an integer 'a' so that
    a * n + x = m
    and this condition should be met:
    0 <= x < n
    
    In our case:
    x = 3
    a = -2
    =>
    a * 10 + x = -2 * 10 + 3 = -17
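
Both conventions can be reproduced in Python itself, which makes the difference easy to try out. In this short sketch the % operator gives the mathematical result, while math.fmod follows the C convention. The helper name c_style_mod is made up for this illustration.

```python
import math

def c_style_mod(m, n):
    """Remainder with the sign of the dividend (truncated division), as in C."""
    r = abs(m) % abs(n)
    return -r if m < 0 else r

for m in (17, -17):
    print(m, "% 10 ->", m % 10, "  C style ->", c_style_mod(m, 10))

# math.fmod follows the C convention too, but returns a float.
print(math.fmod(-17, 10))        # -7.0

# divmod pairs with Python's %, so a * n + x = m with 0 <= x < n.
a, x = divmod(-17, 10)
print(a, x)                      # -2 3
assert a * 10 + x == -17
```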
    

    Conclusion

    Since I am not a programming languages expert I can only refer to the interesting Wikipedia article about modulo operations.
    This subject is worth knowing if any of your programming efforts involve some number operations.
    My personal "watch out" topic is awk programming, where the behaviour is non-mathematical, as in C.