Processing a large dataset of nested data


I have a rather large dataset that is structured in an unusual way. It looks something like this:

foo:
- name: "some name"
  location: "some location"
  type: "someType"

  bar:
    - name: "A bar element"
      location: "location here"
      type: "someOtherType"
      attachments:
        - type: "attachmentTypeA"
          name: "Attachment name"
        - type: "attachmentTypeB"
          name: "Attachment name"

  baz:
    - name: "another name"
      location: "another location"
      type: "anotherType"

      qux:
        - name: "My name here"
          location: "My location here"
          type: "SomeOtherTypeHere"

          xyzzy:
            - name: "Another name here"
              location: "Another location here"
              type: "anotherTypeHere"

              bar:
                - name: "Some name here"
                  location: "Some location here"
                  type: "typeHere"
                  attachments:
                    - type: "attachmentTypeA"
                      name: "attachment name here"
                    - type: "attachmentTypeA"
                      name: "attachment name here"
                    - type: "attachmentTypeB"
                      name: "attachment name here"

                - name: "Another name here"
                  location: "Another location here"
                  type: "anotherTypeHere"
                  attachments:
                    - type: "attachmentTypeA"
                      name: "attachment name here"
                    - type: "attachmentTypeC"
                      name: "attachment name here"
                    - type: "attachmentTypeD"
                      name: "attachment name here"
    - name: "Another baz listing"
      location: "Baz location"
      type: "bazTypeHere"

So basically, you have "foo" at the top level (and there can be more than one foo, but always at the top level). In general, the structure is:

foo > baz > qux > xyzzy > bar

However, any of the sub-elements can sit at the root, or directly under foo, provided they stay in that order. So this is valid:

foo
  qux
    xyzzy
      bar
        attachments
      bar
        attachments

As is this:

foo
  baz
    qux
      xyzzy
        bar
          attachments
        bar
          attachments
  qux
    xyzzy
      bar
        attachments
      bar
        attachments
  xyzzy
    bar
      attachments
    bar
      attachments

And so on. It's wacky, I know, but that's the dataset I inherited. I looked at the examples, in particular DeserializeObjectGraph and LoadingAYamlStream. The DeserializeObjectGraph approach gets out of hand when the data is laid out like this, and I finally gave up on it because it became too hairy. The stream approach seems like a better fit, but I'm running into trouble.

I am loading up the YAML as follows:

        // Needs: using System.IO; using YamlDotNet.RepresentationModel;
        string contents = File.ReadAllText(fileName);
        var input = new StringReader(contents);
        var yaml = new YamlStream();
        yaml.Load(input);

As you can see, nothing fancy there. I'm just trying to get a "tree" of objects that I can then iterate through. I tried using the AllNodes property of the root node, but I can't for the life of me figure out how to iterate through the nodes recursively in a way that makes sense. I will also confess that I am a C# newbie who is still learning (old C guy here), so bear with me!

Can anyone suggest an approach, or possibly some code or even pseudocode that might be able to help me out?
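
For reference, here is a minimal sketch of the kind of recursive walk being asked about. It assumes the YamlStream above is the one from YamlDotNet's RepresentationModel namespace and that every mapping key in the data is a plain scalar (as in the sample). The YamlWalker class, the Walk method, and the path format are hypothetical names chosen for illustration, not part of any library. The idea is to branch on the three node kinds (mapping, sequence, scalar) rather than flattening everything through AllNodes, so the nesting is preserved.

```csharp
using System;
using System.IO;
using YamlDotNet.RepresentationModel;

// Hypothetical helper, for illustration only.
static class YamlWalker
{
    // Visits every node below `node` and prints each scalar
    // together with the path that leads to it.
    public static void Walk(YamlNode node, string path)
    {
        switch (node)
        {
            case YamlMappingNode mapping:
                foreach (var entry in mapping.Children)
                {
                    // Assumption: keys are plain scalars such as "foo" or "name".
                    var key = ((YamlScalarNode)entry.Key).Value;
                    Walk(entry.Value, path + "/" + key);
                }
                break;

            case YamlSequenceNode sequence:
                for (int i = 0; i < sequence.Children.Count; i++)
                {
                    Walk(sequence.Children[i], path + "[" + i + "]");
                }
                break;

            case YamlScalarNode scalar:
                Console.WriteLine(path + " = " + scalar.Value);
                break;
        }
    }

    public static void Main(string[] args)
    {
        // args[0] is the path to the YAML file.
        using (var input = new StringReader(File.ReadAllText(args[0])))
        {
            var yaml = new YamlStream();
            yaml.Load(input);

            // A stream can hold several documents; walk each root node.
            foreach (var document in yaml.Documents)
            {
                Walk(document.RootNode, "");
            }
        }
    }
}
```

Because the recursion does not care which key it is descending into, the same walk should cope with foo > baz > qux > xyzzy > bar as well as the shortened variants where levels are skipped. Any per-level handling (for example treating "attachments" differently) could be added by checking the key inside the mapping case.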
