Log data reduction for variable bandwidth data link

168 Views Asked by At

I have an embedded system which generates samples (16bit numbers) at 1 milli second intervals. The variable uplink bandwidth can at best transfer a sample every 5ms, so I am looking for ways to adaptively reduce the data rate while minimizing the loss of important information -- in this case the minimum and maximum values in a time interval.

A scheme which I think should work involves sparse coding and a variation of lossy compression. Like this:

  1. The system will internally store the min and max values during a 10ms interval.
  2. The system will internally queue a limited number (say 50) of these data pairs.
  3. No loss of min or max values is allowed but the time interval in which they occur may vary.
  4. When the queue gets full, neighboring data pairs will be combined starting at the end of the queue so that the converted min/max pairs now represent 20ms intervals.
  5. The scheme should be iterative so that further interval combining to 40ms, 80ms etc is done when necessary.
  6. The scheme should be linearly weighted across the length of the queue so that there is no combining for the newest data and maximum necessary combining of the oldest data.

For example with a queue of length 6, successive data reduction should cause the data pairs to cover these intervals:

initial: 10 10 10 10 10 10  (60ms, queue full)
 70ms:   10 10 10 10 10 20
 80ms:   10 10 10 10 20 20
 90ms:   10 10 20 20 20 20
100ms:   10 10 20 20 20 40
110ms:   10 10 20 20 40 40
120ms:   10 20 20 20 40 40
130ms:   10 20 20 40 40 40
140ms:   10 20 20 40 40 80

New samples are added on the left, data is read out from the right.

This idea obviously falls into the categories of lossy-compression and sparse-coding.

I assume this is a problem that must occur often in data logging applications with limited uplink bandwidth therefore some "standard" solution might have emerged.

I have deliberately simplified and left out other issues such as time stamping.

Questions:

  1. Are there already algorithms which do this kind of data logging? I am not looking for the standard, lossy picture or video compression algos but something more specific to data logging as described above.
  2. What would be the most appropriate implementation for the queue? Linked list? Tree?
3

There are 3 best solutions below

1
On

The term you are looking for is "lossy compression" (See: http://en.wikipedia.org/wiki/Lossy_compression ). The optimal compression method depends on various aspects such as the distribution of your data.

0
On

As i understand you want to transmit min() and max() of all samples in a timeperiod.

eg. you want transmit min/max every 10ms with taking samples every 1ms?

if you do not need the individual samples you simply compare them after each sampling

i=0; min=TYPE_MAX; max=TYPE_MIN;// First sample will always overwrite the initial values
while true do
    sample = getSample();
    if min>sample then
        min=sample

    if max<sample then
        max=sample

    if i%10 == 0 then
        send(min, max);
        // if each period should be handled seperatly: min=TYPE_MAX; max=TYPE_MIN;
done

you can also save bandwidth with sending data only on changes (depends on sample data: if they dont change very quick you will save a lot)

0
On

Define a combination cost function that matches your needs, e.g. (len(i) + len(i+1)) / i^2, then iterate the array to find the "cheapest" pair to replace.