
Parse Nested Custom Yaml Tags

I have some YAML with application-specific tags (from an AWS CloudFormation template, to be exact) that looks like this: example_yaml = 'Name: !Join [' ', ['EMR', !Ref 'Environmen

Solution 1:

You are close. The problem is that you are calling the method construct_yaml_seq(). That method is the registered constructor for the normal YAML sequence (the one that eventually makes a Python list). It in turn calls the construct_sequence() method to handle the node that gets passed in, and that is what you should call as well.

Because you are returning a string, which cannot be part of a recursive data structure, you don't need the two-step creation process (first yielding an empty object, then filling it in) that the construct_yaml_seq() method follows. That two-step creation process is why you encountered a generator.
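To see where the generator comes from, here is a minimal sketch. The names DemoLoader, inspect_join and observed are made up for this illustration, and the subclass is used only so that the global SafeLoader is not modified. Passing a SafeLoader subclass to yaml.load() is as safe as calling safe_load().

```python
import types
import yaml


class DemoLoader(yaml.SafeLoader):
    """Hypothetical subclass, so the global SafeLoader is left alone."""


observed = {}


def inspect_join(loader, node):
    # Calling the registered sequence constructor directly only creates a
    # generator object; nothing has been constructed at this point.
    result = loader.construct_yaml_seq(node)
    observed['is_generator'] = isinstance(result, types.GeneratorType)
    # construct_sequence() returns the list itself.
    return loader.construct_sequence(node, deep=True)


DemoLoader.add_constructor('!Join', inspect_join)
data = yaml.load("Name: !Join [' ', ['a', 'b']]", Loader=DemoLoader)
print(observed['is_generator'], data)
```

With a current PyYAML this should report that the direct call produced a generator, while the value stored under Name is a plain nested list.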

construct_sequence() returns a simple list, but as you want the nodes underneath the !Join to be available when you start processing, make sure to specify the deep=True parameter. Otherwise the second list element will still be an empty list at that point. And because construct_yaml_seq() doesn't specify deep=True, you did not get the pieces in time in your function (otherwise you could have actually used that method).
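The effect of deep=True can be observed with a small experiment. This is only a sketch: make_probe, ShallowLoader and DeepLoader are hypothetical names, and two separate loader subclasses are used so that each run starts from a fresh state.

```python
import yaml


def make_probe(deep, record):
    """Build a !Join constructor that records the nested list as seen
    inside the constructor, before loading has finished."""
    def probe(loader, node):
        args = loader.construct_sequence(node, deep=deep)
        record.append(list(args[1]))  # snapshot of the inner list right now
        return args
    return probe


class ShallowLoader(yaml.SafeLoader):
    """Hypothetical loader whose !Join constructor uses deep=False."""


class DeepLoader(yaml.SafeLoader):
    """Hypothetical loader whose !Join constructor uses deep=True."""


shallow_seen, deep_seen = [], []
ShallowLoader.add_constructor('!Join', make_probe(False, shallow_seen))
DeepLoader.add_constructor('!Join', make_probe(True, deep_seen))

doc = "Name: !Join [' ', ['EMR', 'prod']]"
shallow_result = yaml.load(doc, Loader=ShallowLoader)
deep_result = yaml.load(doc, Loader=DeepLoader)

print(shallow_seen, deep_seen)
print(shallow_result == deep_result)
```

Inside the constructor the shallow variant sees an empty inner list, while the deep variant sees both items. After loading completes, both results are the same, because the inner list is filled in later. That is too late for code that wants to join the items right away.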

import yaml
from pprint import pprint


def aws_join(loader, node):
    join_args = loader.construct_sequence(node, deep=True)
    # you can comment out the next line
    assert join_args == [' ', ['EMR', '{Environment}', '{Purpose}']]
    delimiter = join_args[0]
    joinables = join_args[1]
    return delimiter.join(joinables)

def aws_ref(loader, node):
    value = loader.construct_scalar(node)
    placeholder = '{'+value+'}'
    return placeholder

yaml.add_constructor('!Join', aws_join, Loader=yaml.SafeLoader)
yaml.add_constructor('!Ref', aws_ref, Loader=yaml.SafeLoader)

example_yaml = "Name: !Join [' ', ['EMR', !Ref 'Environment', !Ref 'Purpose']]"

pprint(yaml.safe_load(example_yaml))

which gives:

{'Name': 'EMR {Environment} {Purpose}'}

You should not use load(): it is documented to be potentially unsafe and, above all, it is not necessary here. Register with the SafeLoader and call safe_load().
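As a small illustration of what the safe loader protects against, the sketch below feeds safe_load() a document with a Python-specific tag. The sample document and the variable names are made up for this example, and the tagged call is deliberately harmless.

```python
import yaml

# A document carrying a Python-specific tag. An unsafe loader may act on
# tags like this one; os.getcwd is chosen here because it is harmless.
untrusted = "value: !!python/object/apply:os.getcwd []"

try:
    yaml.safe_load(untrusted)
    rejected = False
except yaml.constructor.ConstructorError:
    # SafeLoader has no constructor for python/* tags and refuses the input.
    rejected = True

print(rejected)
```

Custom tags such as !Join and !Ref only work with the safe loader because constructors were registered for them explicitly. Everything else that is not plain YAML is rejected.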

Solution 2:

You need to change aws_join to:

def aws_join(loader, node):
    delimiter = loader.construct_scalar(node.value[0])
    value = loader.construct_sequence(node.value[1])
    return delimiter.join(value)

Then you will get the output:

{'Name': 'EMR {Environment} {Purpose}'}
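For completeness, here is a self-contained sketch of this second approach. The aws_ref constructor is taken from Solution 1. The CfnLoader subclass is a hypothetical name, introduced only so that the global SafeLoader stays unchanged.

```python
import yaml


class CfnLoader(yaml.SafeLoader):
    """Hypothetical subclass, used so the global SafeLoader stays unchanged."""


def aws_ref(loader, node):
    # Same placeholder format as in Solution 1.
    return '{' + loader.construct_scalar(node) + '}'


def aws_join(loader, node):
    # node.value holds the two child nodes of the !Join sequence:
    # a scalar node (the delimiter) and a sequence node (the items).
    delimiter_node, items_node = node.value
    delimiter = loader.construct_scalar(delimiter_node)
    items = loader.construct_sequence(items_node)
    return delimiter.join(items)


CfnLoader.add_constructor('!Ref', aws_ref)
CfnLoader.add_constructor('!Join', aws_join)

example_yaml = "Name: !Join [' ', ['EMR', !Ref 'Environment', !Ref 'Purpose']]"
result = yaml.load(example_yaml, Loader=CfnLoader)
print(result)
```

This works without deep=True because the inner sequence is constructed directly from its node, and its items are scalars or !Ref values whose constructors return finished strings.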
