ruamel.yaml deletes unused anchors


I have the following yaml file:

anchors:
  kubernetes:
  - kubelet: &1GiKubeReserved
      kubeReserved:
        cpu: 80m
        memory: 1Gi
        pid: 20k
  - kubelet: &2GiKubeReserved
      kubeReserved:
        cpu: 160m
        memory: 2Gi
        pid: 20k
cluster:
  name: test
  kubernetes:
    kubelet:
      <<: *1GiKubeReserved

Loading the above file with ruamel.yaml and dumping it again removes the unused 2GiKubeReserved anchor:

-  - kubelet: &2GiKubeReserved
+  - kubelet:

Snippet of the Python code:

from ruamel.yaml import YAML

yaml = YAML()
file = 'example.yaml'
with open(file, 'r+', encoding="utf-8") as f:
    data = yaml.load(f)
    f.seek(0)
    yaml.dump(data, f)
    f.truncate()

Is there a way to preserve these unused anchors?

In this case I'm expecting to have no diff.

Answered by Anthon

ruamel.yaml stores an anchor in an Anchor() instance, which is attached to the loaded object as its .anchor attribute. Apart from the anchor value, that Anchor() instance has an always_dump attribute that is False by default, so an anchor that no alias refers to is not written out on dump. You can set always_dump at load time (as shown in this answer), or walk recursively over the data structure after loading and change the attribute:

import sys
import ruamel.yaml

yaml_str = """\
anchors:
  kubernetes:
  - kubelet: &1GiKubeReserved
      kubeReserved:
        cpu: 80m
        memory: 1Gi
        pid: 20k
  - kubelet: &2GiKubeReserved
      kubeReserved:
        cpu: 160m
        memory: 2Gi
        pid: 20k
cluster:
  name: test
  kubernetes:
    kubelet:
      <<: *1GiKubeReserved
"""

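def set_all_anchors_to_dump(data):
    """Recursively mark every anchor in the loaded data as always_dump."""
    # round-trip loaded nodes carry an Anchor() instance in .anchor
    if hasattr(data, 'anchor') and not data.anchor.always_dump:
        data.anchor.always_dump = True
    # descend into mappings and sequences to reach nested anchors
    if isinstance(data, dict):
        for k, v in data.items():
            set_all_anchors_to_dump(v)
    elif isinstance(data, list):
        for elem in data:
            set_all_anchors_to_dump(elem)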
    
yaml = ruamel.yaml.YAML()
yaml.preserve_quotes = True
data = yaml.load(yaml_str)
# print('>>>', data['anchors']['kubernetes'][1]['kubelet'].anchor.always_dump)
set_all_anchors_to_dump(data)
yaml.dump(data, sys.stdout)

which gives:

anchors:
  kubernetes:
  - kubelet: &1GiKubeReserved
      kubeReserved:
        cpu: 80m
        memory: 1Gi
        pid: 20k
  - kubelet: &2GiKubeReserved
      kubeReserved:
        cpu: 160m
        memory: 2Gi
        pid: 20k
cluster:
  name: test
  kubernetes:
    kubelet:
      <<: *1GiKubeReserved