Easily parsable output from rrdtool

I'm working with a large number of RRD files that I have to query frequently, mostly by reading all the data and passing it on.

Currently, I use rrdtool fetch <filename> CF --start XXX --end YYY, but since it only returns data for one consolidation function (CF) at a time, I first have to run a separate query to find the CFs (that is, run and parse rrdtool info <filename>) and then run rrdtool fetch for each CF found. The output is trivial to parse, though.
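For reference, a minimal Python sketch of parsing that fetch output. It assumes the usual layout: a header line naming the data sources, a blank line, then one "timestamp: value value ..." row per step. The function name parse_fetch and the sample text are illustrative, not part of rrdtool.

```python
import math


def parse_fetch(text):
    """Parse `rrdtool fetch` output into (ds_names, rows).

    rows is a list of (timestamp, [values]) tuples; unknown values become NaN.
    """
    ds_names = None
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if ds_names is None:
            # Assumption: the first non-empty line is the header of DS names.
            ds_names = line.split()
            continue
        stamp, _, values = line.partition(":")
        # float() accepts "nan", "-nan" and "NaN", so unknowns map to NaN.
        rows.append((int(stamp), [float(v) for v in values.split()]))
    return ds_names or [], rows


# Made-up sample in the assumed layout.
sample = "                          amp\n\n1372683600: 1.6800000000e+01\n1372685400: -nan\n"
names, rows = parse_fetch(sample)
print(names, len(rows), math.isnan(rows[1][1][0]))
```

The same function would have to be called once per CF, which is exactly the overhead described above.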

Alternatively, there is rrdtool xport DEF:XX=<filename>:DS:CF ... XPORT:XX:XX ..., with one DEF/XPORT pair for each series I want. On the upside, this can give me all the data in one go, but I still need a fairly good idea of what data I want beforehand. Also, it only outputs XML (always a hassle to parse).
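To give an idea of how much work the XML route is, here is a hedged Python sketch using only the standard library. It assumes the xport document has the shape xport/meta/legend/entry for the column names and xport/data/row with one t (timestamp) and one or more v (value) children per row. Some rrdtool versions may omit the t element unless asked for it, in which case this sketch raises an error rather than guessing. The function name parse_xport and the sample document are illustrative.

```python
import xml.etree.ElementTree as ET


def parse_xport(xml_text):
    """Return (legends, rows) from `rrdtool xport` XML output.

    rows is a list of (timestamp, [values]) tuples.
    """
    root = ET.fromstring(xml_text)
    legends = [e.text or "" for e in root.findall("./meta/legend/entry")]
    rows = []
    for row in root.findall("./data/row"):
        stamp = row.findtext("t")
        if stamp is None:
            # Assumption: every row carries its own <t> timestamp.
            raise ValueError("row has no <t> element")
        values = [float(v.text) for v in row.findall("v")]
        rows.append((int(stamp), values))
    return legends, rows


# Made-up sample in the assumed layout.
sample = (
    "<xport><meta><legend><entry>amp</entry></legend></meta>"
    "<data>"
    "<row><t>1372683600</t><v>1.6800000000e+01</v></row>"
    "<row><t>1372685400</t><v>NaN</v></row>"
    "</data></xport>"
)
legends, rows = parse_xport(sample)
print(legends, rows[0])
```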

I have a feeling I'm missing something very obvious, as it simply can't be such a big hassle to get a list of timestamp → numbers out of a file... Any clues?

There are 3 answers below.

Answer 1 (best answer):

While there are patches around that add JSON support, there is currently no way around one of the following:

  • Parsing at least two different output formats (rrdtool info's ASCII and then either XML from rrdtool xport or tabular data from rrdtool fetch).
  • Dumping the entire contents of the file to XML via rrdtool dump and then re-implementing quite a bit of librrd's internals.
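
The first option can be sketched roughly as follows in Python: pull the distinct CFs out of the rrdtool info text, then build one rrdtool fetch argument list per CF. This is only an outline under the assumption that the info output contains lines of the form rra[N].cf = "NAME"; the helper names consolidation_functions and fetch_commands are made up for this example, and nothing is executed.

```python
import re


def consolidation_functions(info_text):
    """Return the distinct CFs named in `rrdtool info` output, in file order."""
    found = re.findall(r'^rra\[\d+\]\.cf = "(\w+)"', info_text, re.MULTILINE)
    return list(dict.fromkeys(found))  # de-duplicate, keep first-seen order


def fetch_commands(filename, info_text, start, end):
    """Build one `rrdtool fetch` argument list per CF (nothing is run here)."""
    return [
        ["rrdtool", "fetch", filename, cf, "--start", str(start), "--end", str(end)]
        for cf in consolidation_functions(info_text)
    ]


# Made-up info excerpt with two AVERAGE archives and one MAX archive.
info = 'rra[0].cf = "AVERAGE"\nrra[1].cf = "AVERAGE"\nrra[2].cf = "MAX"\n'
for cmd in fetch_commands("/tmp/pb_1_amp.rrd", info, 1372600000, 1372685403):
    print(" ".join(cmd))
```

Each argument list could then be handed to subprocess.run and its output parsed as tabular text.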

Answer 2:

If you want the 'table of contents', use rrdtool info; if you want the whole content, use rrdtool dump.
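
A rough Python sketch of reading rows out of rrdtool dump output, for illustration only. It assumes each archive starts with an rra element containing a cf element, and that each database row is preceded by a comment ending in "/ <unix timestamp>", for example <!-- 2013-07-01 15:30:00 CEST / 1372685400 --> <row><v>1.68e+01</v></row>. If a given rrdtool version lays the dump out differently, the patterns need adjusting. All names here are made up for the example.

```python
import re

# Assumed layout: "<!-- date / 1372685400 --> <row><v>...</v></row>"
ROW_RE = re.compile(r"<!--[^>]*?/\s*(\d+)\s*-->\s*<row>(.*?)</row>")
VALUE_RE = re.compile(r"<v>\s*([^<\s]+)\s*</v>")
CF_RE = re.compile(r"<cf>\s*(\w+)\s*</cf>")


def parse_dump(dump_text):
    """Return a list of (cf, rows) pairs, one per <rra> section.

    rows is a list of (timestamp, [values]) tuples.
    """
    archives = []
    # Everything before the first <rra> is header data and is skipped.
    for chunk in dump_text.split("<rra>")[1:]:
        cf = CF_RE.search(chunk)
        rows = [
            (int(stamp), [float(v) for v in VALUE_RE.findall(body)])
            for stamp, body in ROW_RE.findall(chunk)
        ]
        archives.append((cf.group(1) if cf else None, rows))
    return archives


# Made-up sample in the assumed layout.
sample = (
    "<rrd><step>1800</step>"
    "<rra><cf>AVERAGE</cf><database>\n"
    "<!-- 2013-07-01 15:00:00 CEST / 1372683600 --> <row><v>1.6800000000e+01</v></row>\n"
    "<!-- 2013-07-01 15:30:00 CEST / 1372685400 --> <row><v>NaN</v></row>\n"
    "</database></rra></rrd>"
)
print(parse_dump(sample))
```

Relying on the comments avoids recomputing timestamps from the step and pdp_per_row values, at the cost of depending on text that is not part of the XML data proper.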

Answer 3:

I've written a parser that turns the output of rrdtool info /tmp/pb_1_amp.rrd into a nested array. So from:

filename = "/tmp/pb_1_amp.rrd"
rrd_version = "0003"
step = 1800
last_update = 1372685403
header_size = 1208
ds[amp].index = 0
ds[amp].type = "GAUGE"
ds[amp].minimal_heartbeat = 3200
ds[amp].min = 0.0000000000e+00
ds[amp].max = 1.0000000000e+02
ds[amp].last_ds = "5.6"
ds[amp].value = 1.6800000000e+01
ds[amp].unknown_sec = 0
rra[0].cf = "AVERAGE"
rra[0].rows = 576
rra[0].cur_row = 385
rra[0].pdp_per_row = 1
rra[0].xff = 5.0000000000e-01
rra[0].cdp_prep[0].value = NaN
rra[0].cdp_prep[0].unknown_datapoints = 0
rra[1].cf = "AVERAGE"
rra[1].rows = 672
rra[1].cur_row = 159
rra[1].pdp_per_row = 6
rra[1].xff = 5.0000000000e-01
rra[1].cdp_prep[0].value = 1.6999833333e+01
rra[1].cdp_prep[0].unknown_datapoints = 0
rra[2].cf = "AVERAGE"
rra[2].rows = 732
rra[2].cur_row = 639
rra[2].pdp_per_row = 24
rra[2].xff = 5.0000000000e-01
rra[2].cdp_prep[0].value = 1.6999833333e+01
rra[2].cdp_prep[0].unknown_datapoints = 0
rra[3].cf = "AVERAGE"
rra[3].rows = 1460
rra[3].cur_row = 593
rra[3].pdp_per_row = 144
rra[3].xff = 5.0000000000e-01
rra[3].cdp_prep[0].value = 6.6083527778e+02
rra[3].cdp_prep[0].unknown_datapoints = 0

to:

Array
(
    [filename] => /tmp/pb_1_amp.rrd
    [rrd_version] => 0003
    [step] => 1800
    [last_update] => 1372685403
    [header_size] => 1208
    [ds] => Array
        (
            [amp] => Array
                (
                    [index] => 0
                    [type] => GAUGE
                    [minimal_heartbeat] => 3200
                    [min] => 0.0000000000e+00
                    [max] => 1.0000000000e+02
                    [last_ds] => 5.6
                    [value] => 1.6800000000e+01
                    [unknown_sec] => 0
                )

        )

    [rra] => Array
        (
            [0] => Array
                (
                    [cf] => AVERAGE
                    [rows] => 576
                    [cur_row] => 385
                    [pdp_per_row] => 1
                    [xff] => 5.0000000000e-01
                    [cdp_prep] => Array
                        (
                            [0] => Array
                                (
                                    [value] => NaN
                                    [unknown_datapoints] => 0
                                )

                        )

                )

            [1] => Array
                (
                    [cf] => AVERAGE
                    [rows] => 672
                    [cur_row] => 159
                    [pdp_per_row] => 6
                    [xff] => 5.0000000000e-01
                    [cdp_prep] => Array
                        (
                            [0] => Array
                                (
                                    [value] => 1.6999833333e+01
                                    [unknown_datapoints] => 0
                                )

                        )

                )

            [2] => Array
                (
                    [cf] => AVERAGE
                    [rows] => 732
                    [cur_row] => 639
                    [pdp_per_row] => 24
                    [xff] => 5.0000000000e-01
                    [cdp_prep] => Array
                        (
                            [0] => Array
                                (
                                    [value] => 1.6999833333e+01
                                    [unknown_datapoints] => 0
                                )

                        )

                )

            [3] => Array
                (
                    [cf] => AVERAGE
                    [rows] => 1460
                    [cur_row] => 593
                    [pdp_per_row] => 144
                    [xff] => 5.0000000000e-01
                    [cdp_prep] => Array
                        (
                            [0] => Array
                                (
                                    [value] => 6.6083527778e+02
                                    [unknown_datapoints] => 0
                                )

                        )

                )

        )

)

It's in PHP but it should be easy to port to any other language. Here's the code:

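$store = array();
foreach ($lines as $line) {
    // Skip blank or malformed lines that carry no "key = value" pair.
    if (strpos($line, ' = ') === false) {
        continue;
    }
    // Limit the split to 2 parts so a value containing ' = ' stays intact.
    list($raw_key, $raw_val) = explode(' = ', $line, 2);

    // "rra[0].cdp_prep[0].value" becomes rra, 0, cdp_prep, 0, value.
    $keys      = preg_split('/[.\[\]]/', $raw_key, -1, PREG_SPLIT_NO_EMPTY);
    $key_count = count($keys);
    $pointer   = &$store;

    // Walk down the nested array, creating missing levels on the way.
    foreach ($keys as $key_num => $key) {
        if (!array_key_exists($key, $pointer)) {
            $pointer[$key] = array();
        }
        $pointer = &$pointer[$key];
        if ($key_num + 1 === $key_count) {
            // Leaf reached: store the value without surrounding quotes.
            $pointer = trim(trim($raw_val), '"');
        }
    }
    unset($pointer); // drop the reference before handling the next line
}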

It assumes the rrdtool info output has already been split on newlines (\n) into the array $lines. Hope this helps.
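
As an illustration of how such a port might look, here is a hedged Python sketch of the same idea. It is not a line-by-line translation: the function name parse_info is made up, values are kept as strings (as in the PHP version), and numeric indexes such as 0 stay string keys ("0") because Python dicts do not convert them the way PHP arrays do.

```python
import re


def parse_info(lines):
    """Turn `rrdtool info` lines into a nested dict of strings."""
    store = {}
    for line in lines:
        raw_key, sep, raw_val = line.partition(" = ")
        if not sep:
            continue  # blank or malformed line
        # "rra[0].cdp_prep[0].value" -> ["rra", "0", "cdp_prep", "0", "value"]
        keys = [k for k in re.split(r"[.\[\]]", raw_key) if k]
        if not keys:
            continue
        node = store
        for key in keys[:-1]:
            node = node.setdefault(key, {})  # create missing levels
        node[keys[-1]] = raw_val.strip().strip('"')
    return store


# Sample lines taken from the info output shown earlier.
sample = [
    'filename = "/tmp/pb_1_amp.rrd"',
    "step = 1800",
    'ds[amp].type = "GAUGE"',
    'rra[0].cf = "AVERAGE"',
    "rra[0].cdp_prep[0].value = NaN",
]
info = parse_info(sample)
print(info["ds"]["amp"]["type"], info["rra"]["0"]["cf"])
```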