Using libyaml to parse a tree-like structure


I am new to YAML and want to parse the following YAML file:

basket :
 size : 10
 type : organic
 fruit1:
  mango : 5
  type : farm-fresh
 fruit2:
  peach : 43
  manufacturer : xyz
 color : brown
 design : netted
 ...

The YAML file will follow the above format, with arbitrary key names and values (string, float, int, etc.). I want to store each of these values in a struct that holds the key and the value as character arrays.

struct Input {
 char key[100];
 char value[100];
}; 

An array of this struct holds the values from the YAML file.

So the data from the YAML file should be stored as:

 //Input[x].key                  //Input[x].value
basket.size                       10
basket.fruit1.mango               5
basket.fruit2.manufacturer        xyz
basket.color                      brown
basket.design                     netted

I wrote an application to parse the YAML file, and I get the individual nodes/leaves as string output. For the YAML file above, I get node values such as basket, size, 5, 43, etc. I followed the approach described here. It is one of the good resources I have found for learning YAML so far.

This approach is not very useful to me, since it gives me no relation between the parent nodes and the leaves, or vice versa.

Does libyaml provide a way to maintain this relationship in a tree and then return a value in response to a query? I am bound to libyaml by the project requirements, but any other suggestions would also be welcome.

Best answer

The resource you linked describes several ways of parsing YAML. Contrary to what the tutorial says, token-based parsing is not useful at all unless you are implementing a syntax highlighter. For all other cases, you want to use event-based parsing, so I'll assume you tried to use that.

Does libyaml provide a way to maintain this relationship in a tree

Event-based parsing does maintain the tree structure (I am not sure what exactly you mean by relationship in a tree): you get …Start and …End events for sequences and mappings, which describe the input structure. It is quite straightforward to build a list of struct Input by walking over the event stream:

#include <yaml.h>
#include <string.h>
#include <stdio.h>
#include <stdbool.h>
#include <assert.h>

struct Input {
  char key[100];
  char value[100];
};

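// Builds one entry. snprintf truncates over-long strings instead of
// writing past the end of the 100-byte fields.
struct Input gen(const char *key, const char *value) {
  struct Input ret;
  snprintf(ret.key, sizeof ret.key, "%s", key);
  snprintf(ret.value, sizeof ret.value, "%s", value);
  return ret;
}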

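// Recursively walks one node (mapping or scalar) of the event stream.
// cur_key holds the dotted path built so far; len is its current length.
// Appends one struct Input per scalar value and advances *target.
void append_all(yaml_parser_t *p, struct Input **target,
        char cur_key[100], size_t len) {
  yaml_event_t e;
  // yaml_parser_parse returns 0 on a parse error; this sketch just aborts
  if (!yaml_parser_parse(p, &e)) assert(false);
  switch (e.type) {
    case YAML_MAPPING_START_EVENT:
      yaml_event_delete(&e);
      if (!yaml_parser_parse(p, &e)) assert(false);
      while (e.type != YAML_MAPPING_END_EVENT) {
        // assume scalar key
        assert(e.type == YAML_SCALAR_EVENT);
        // yaml_char_t is unsigned char, so cast before using <string.h>
        const char *name = (const char *)e.data.scalar.value;
        const size_t name_len = strlen(name);
        const size_t start = len + (len != 0 ? 1 : 0); // room for '.'
        assert(start + name_len < 100); // path must fit into cur_key
        if (len != 0) cur_key[len] = '.';
        memcpy(cur_key + start, name, name_len + 1);
        yaml_event_delete(&e);
        append_all(p, target, cur_key, start + name_len);
        cur_key[len] = '\0'; // remove key part
        if (!yaml_parser_parse(p, &e)) assert(false);
      }
      break;
    case YAML_SCALAR_EVENT:
      *(*target)++ = gen(cur_key, (const char *)e.data.scalar.value);
      break;
    default: assert(false); // sequences, aliases etc. are not handled
  }
  yaml_event_delete(&e);
}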

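int main(void) {
  yaml_parser_t p;
  yaml_event_t e;
  if (!yaml_parser_initialize(&p)) {
    fprintf(stderr, "could not initialize the YAML parser\n");
    return 1;
  }
  FILE *f = fopen("foo.yaml", "r");
  if (f == NULL) {
    perror("foo.yaml");
    yaml_parser_delete(&p);
    return 1;
  }
  yaml_parser_set_input_file(&p, f);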
  // skip stream start and document start
  yaml_parser_parse(&p, &e);
  yaml_event_delete(&e);
  yaml_parser_parse(&p, &e);
  yaml_event_delete(&e);

  char cur_key[100] = {'\0'};
  struct Input input[100];
  struct Input *input_end = input;
  append_all(&p, &input_end, cur_key, 0);

  // skip document end and stream end
  yaml_parser_parse(&p, &e);
  yaml_event_delete(&e);
  yaml_parser_parse(&p, &e);
  yaml_event_delete(&e);

  yaml_parser_delete(&p);
  fclose(f);

  // print out input items
  for (struct Input *cur = input; cur < input_end; ++cur) {
    printf("%s = %s\n", cur->key, cur->value);
  }
}
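
As for returning a value in response to a query: libyaml's event API has no lookup facility of its own, but once the flat list exists, a linear search over it is enough for small inputs. The following is a minimal sketch under that assumption; `find_value` is a hypothetical helper name, not a libyaml function, and the struct is repeated only so the snippet compiles on its own.

```c
#include <stddef.h>
#include <string.h>

struct Input {
  char key[100];
  char value[100];
};

/* Hypothetical helper: returns the value stored under the exact dotted key,
   or NULL if no entry in [begin, end) has that key. */
const char *find_value(const struct Input *begin, const struct Input *end,
                       const char *key) {
  for (const struct Input *cur = begin; cur < end; ++cur) {
    if (strcmp(cur->key, key) == 0) return cur->value;
  }
  return NULL;
}
```

With the list built above, `find_value(input, input_end, "basket.fruit1.mango")` would return the string stored for that key, and a key that is not present yields NULL. For larger files, sorting the array and using `bsearch` would be a natural next step.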