Wednesday, November 20, 2019

Ansible - dictionaries vs. lists

When working with more complex variables in Ansible (often based on YAML input files), you always have to be aware of whether a variable is a list or a dictionary. The applicable filters and methods differ, and using the wrong ones leads to errors or unexpected results.

Example input YAML

Here I am defining two variables x and y with some sub elements.
At first glance there doesn't seem to be much difference, and if you are designing a YAML file for whatever purpose, both solutions might seem interchangeable. The difference lies in their usage, as we will see below.
x:
  b1:
    c1: 1
    c2: "aaa"
    c3:
  b2:
    c2: "bbb"
    c3: 5

y:
  - b1:
      c1: 1
      c2: "aaa"
      c3:
  - b2:
      c2: "bbb"
      c3: 5
Call the file dict.yml.

How to check the variable type

Newer Ansible versions provide a filter called type_debug, which I find incredibly useful when in doubt.
(Unfortunately it was not available in Ansible 1.x; it would have saved me a lot of time.)
- hosts: localhost

  tasks:
  - include_vars: dict.yml

  - debug:
      msg: "x: {{x | type_debug}} / y: {{y |type_debug}}"
will show
TASK [debug] ************************************************************************************
ok: [localhost] => {
    "msg": "x: dict / y: list"
}
i.e. I have a dictionary and a list.
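
type_debug is handy for printing, but for a condition a Jinja2 test is usually easier to use. The following is a minimal sketch, assuming the same dict.yml as above; it relies on the built-in Jinja2 tests mapping, string and iterable.

```yaml
- hosts: localhost

  tasks:
  - include_vars: dict.yml

  # 'mapping' is true for dictionaries only
  - debug:
      msg: "x is a dictionary"
    when: x is mapping

  # rule out dictionaries and strings first: both are iterable too,
  # so 'iterable' alone does not identify a list
  - debug:
      msg: "y is a list"
    when: y is not mapping and y is not string and y is iterable
```

The order of the checks matters, because the Jinja2 sequence and iterable tests also accept dictionaries and strings.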

How to access the elements

The elements of
  • a dictionary are accessed by name, e.g. x['b1']
  • a list are accessed by position, e.g. y[0]

Something like x[0] or y['b1'] is undefined, and debug reports VARIABLE IS NOT DEFINED!
The playbook below also shows the variable type of the sub elements.
    - hosts: localhost
    
      tasks:
      - include_vars: dict.yml
      
      - debug:
          msg: "{{x['b1'] | type_debug}}"
    
      - debug:
          var: x['b1']
    
      - debug:
          msg: "{{y[0] | type_debug}}"
    
      - debug:
          var: y[0]
    
    will show
    TASK [debug] ************************************************************************************
    ok: [localhost] => {
        "msg": "dict"
    }
    
    TASK [debug] ************************************************************************************
    ok: [localhost] => {
        "x['b1']": {
            "c1": 1,
            "c2": "aaa",
            "c3": null
        }
    }
    
    TASK [debug] ************************************************************************************
    ok: [localhost] => {
        "msg": "dict"
    }
    
    TASK [debug] ************************************************************************************
    ok: [localhost] => {
        "y[0]": {
            "b1": {
                "c1": 1,
                "c2": "aaa",
                "c3": null
            }
        }
    }
    
    
You should also note the difference in the results. Both are dictionaries, but in the x case we get a simple dictionary with three elements, whereas in the y case we get a dictionary with a single element b1, whose value is itself a dictionary.
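
If you are not sure whether an access like x[0] or y['b1'] is valid, the default filter can catch the undefined result. This is only a sketch, assuming the same dict.yml; the fallback texts are made up for illustration.

```yaml
- hosts: localhost

  tasks:
  - include_vars: dict.yml

  # x is a dictionary without a key 0, so the lookup is undefined
  # and the default text is used instead
  - debug:
      msg: "{{ x[0] | default('x has no element 0') }}"

  # y is a list and cannot be indexed by name, so this is undefined too
  - debug:
      msg: "{{ y['b1'] | default('y has no element b1') }}"
```

This hides the problem rather than solving it, so it is best suited for debugging or for input that is optional by design.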

How to loop through sub elements

You can loop easily through the elements by supplying the variable to with_items. The distinction is in what you get as an item. with_items provides
  • dictionary elements as strings (the key names), so you need to access the sub elements via {{x[item]}}
  • list elements as dictionaries
    - hosts: localhost
    
      tasks:
      - include_vars: dict.yml
      
      - debug:
          msg: "{{item}}: {{item|type_debug}} / {{x[item]}}: {{x[item]|type_debug}}"
        with_items: "{{x}}"
    
      - debug:
          msg: "{{item}}: {{item|type_debug}}"
        with_items: "{{y}}"
    
    TASK [debug] ************************************************************************************
    ok: [localhost] => (item=b1) => {
        "msg": "b1: AnsibleUnsafeText / {u'c3': None, u'c2': u'aaa', u'c1': 1}: dict"
    }
    ok: [localhost] => (item=b2) => {
        "msg": "b2: AnsibleUnsafeText / {u'c3': 5, u'c2': u'bbb'}: dict"
    }
    
    TASK [debug] ************************************************************************************
    ok: [localhost] => (item={u'b1': {u'c3': None, u'c2': u'aaa', u'c1': 1}}) => {
        "msg": "{u'b1': {u'c3': None, u'c2': u'aaa', u'c1': 1}}: dict"
    }
    ok: [localhost] => (item={u'b2': {u'c3': 5, u'c2': u'bbb'}}) => {
        "msg": "{u'b2': {u'c3': 5, u'c2': u'bbb'}}: dict"
    }
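
As an alternative to looking up x[item] by hand, the dict2items filter (available since Ansible 2.6) converts a dictionary into a list of key/value entries that can be passed to loop. A minimal sketch, assuming the same dict.yml:

```yaml
- hosts: localhost

  tasks:
  - include_vars: dict.yml

  # dict2items turns x into a list of {key: ..., value: ...} entries,
  # so every item carries both the key name and the sub dictionary
  - debug:
      msg: "{{ item.key }}: c2 is {{ item.value.c2 }}"
    loop: "{{ x | dict2items }}"

  # y is already a list; each item is a one-element dictionary, so the
  # same filter is applied to the item to get at its key and value
  - debug:
      msg: "{{ (item | dict2items | first).key }}: c2 is {{ (item | dict2items | first).value.c2 }}"
    loop: "{{ y }}"
```

With this approach both loops produce the key name and the c2 value, although the y case needs the extra step because the key name is hidden inside each list element.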
    

How to access the bottommost elements

Say we want to access the value of c2 of b1 for both x and y. In both cases the dictionary sub elements can be reached with either the bracket or the dot notation. In the y case you also need to supply the list position, which works with the dot notation too.
    - hosts: localhost
    
      tasks:
      - include_vars: dict.yml
      
      - debug:
          var: x['b1']['c2']
    
      - debug:
          var: x.b1.c2
    
      - debug:
          var: y[0]['b1']['c2']
    
      - debug:
          var: y.0.b1.c2
    
    will lead to
    TASK [debug] ************************************************************************************
    ok: [localhost] => {
        "x['b1']['c2']": "aaa"
    }
    
    TASK [debug] ************************************************************************************
    ok: [localhost] => {
        "x.b1.c2": "aaa"
    }
    
    TASK [debug] ************************************************************************************
    ok: [localhost] => {
        "y[0]['b1']['c2']": "aaa"
    }
    
    TASK [debug] ************************************************************************************
    ok: [localhost] => {
        "y.0.b1.c2": "aaa"
    }
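
The y examples above only work if you know that b1 sits at position 0. If the position is not known, the element can be selected by its key instead. The following is a sketch using the Jinja2 filters selectattr, map and first, assuming the same dict.yml:

```yaml
- hosts: localhost

  tasks:
  - include_vars: dict.yml

  # keep only the list elements that have a key b1, then pull out b1.c2
  # and take the first match
  - debug:
      msg: "{{ y | selectattr('b1', 'defined') | map(attribute='b1.c2') | first }}"
```

This is meant to return the same c2 value as y[0]['b1']['c2'], without hard-coding the list position.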
    

Conclusion

Filters like join, unique, setaddr, map etc. only make sense for the correct variable type. If you always know what you are dealing with, you will create working playbooks faster. It might also influence your design decisions when you want to map data into a fitting YAML structure.
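
To illustrate how the same filter behaves differently on the two types, here is a small sketch with join, assuming the same dict.yml. The exact output formatting may differ between Ansible versions.

```yaml
- hosts: localhost

  tasks:
  - include_vars: dict.yml

  # a dictionary iterates over its keys, so this joins the key names
  - debug:
      msg: "{{ x | join(', ') }}"

  # a list iterates over its elements, so this joins the string form of
  # each one-element dictionary, which is rarely what you want
  - debug:
      msg: "{{ y | join(', ') }}"

  # extract the key name of every list element first, then join
  - debug:
      msg: "{{ y | map('list') | map('first') | join(', ') }}"
```

None of these calls fails, which is exactly why a wrong assumption about the type can go unnoticed.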

Why did I write this article

When I am writing Ansible playbooks, they are often based on input YAML files (sometimes my own, sometimes from others), and these files often contain complex structures which need to be parsed and interpreted correctly.
I parse complex structures in intermediate steps, using set_fact to create new variables that contain sub structures of the original one. A common mistake I make is that at certain points in my code I am not sure whether the variable in use is a list or a dictionary. When I then apply a method or filter, this can lead to an error or, worse, to a valid but incorrect result (e.g. an "empty" variable), which leads to further content errors downstream and a puzzling end result.
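
One way to catch such mistakes early is to check the type right after each intermediate step. The following is a minimal sketch, assuming the same dict.yml; the variable names x_b1 and y_first are made up for illustration, and fail_msg requires Ansible 2.7 or later.

```yaml
- hosts: localhost

  tasks:
  - include_vars: dict.yml

  # intermediate variables holding sub structures of the original ones
  - set_fact:
      x_b1: "{{ x['b1'] }}"
      y_first: "{{ y[0] }}"

  # stop the play immediately if a variable is not of the expected type
  - assert:
      that:
        - x_b1 is mapping
        - y_first is mapping
        - y is not mapping
      fail_msg: "unexpected variable type after set_fact"
```

The assert costs one extra task, but it fails at the point where the assumption breaks instead of somewhere downstream.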
So I thought that, for my own sake and the sake of the reader, a little summary article would help, in particular since I find the Ansible documentation not always as helpful as it could be.