I have an embedded system which generates samples (16-bit numbers) at 1 millisecond intervals. The variable uplink bandwidth can at best transfer one sample every 5ms, so I am looking for ways to adaptively reduce the data rate while minimizing the loss of important information -- in this case the minimum and maximum values in a time interval.
A scheme that I think should work involves sparse coding and a variant of lossy compression, like this:
- The system internally stores the min and max values seen during each 10ms interval.
- The system internally queues a limited number (say 50) of these min/max pairs.
- No loss of min or max values is allowed, but the time interval in which they occur may vary.
- When the queue gets full, neighboring pairs are combined, starting at the old end of the queue, so that the combined min/max pairs then represent 20ms intervals (a sketch of this combine step follows the list).
- The scheme should be iterative, so that further combining into 40ms, 80ms, etc. intervals happens when necessary.
- The combining should be linearly weighted across the length of the queue, so that the newest data is not combined at all and the oldest data gets the maximum necessary combining.
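The combine step itself loses no extreme values: the merged pair keeps the smaller min, the larger max, and the summed interval. A minimal sketch in C (the type and function names here are my own illustration, not from any library):

```c
#include <stdint.h>

/* One queue entry: the extremes observed over one time interval. */
typedef struct {
    int16_t  min;         /* smallest sample seen in the interval */
    int16_t  max;         /* largest sample seen in the interval  */
    uint16_t interval_ms; /* width of the interval: 10, 20, 40, ... */
} minmax_pair_t;

/* Combine two neighboring pairs into one pair covering both intervals.
 * No min/max information is lost; only the time resolution degrades. */
static minmax_pair_t combine(minmax_pair_t a, minmax_pair_t b)
{
    minmax_pair_t c;
    c.min = (a.min < b.min) ? a.min : b.min;
    c.max = (a.max > b.max) ? a.max : b.max;
    c.interval_ms = (uint16_t)(a.interval_ms + b.interval_ms);
    return c;
}
```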
For example, with a queue of length 6, successive data reduction should cause the data pairs to cover these intervals (each label is the total time covered by the queued pairs; not every 10ms step is shown):
initial: 10 10 10 10 10 10 (60ms, queue full)
 70ms: 10 10 10 10 10 20
 80ms: 10 10 10 10 20 20
100ms: 10 10 20 20 20 20
120ms: 10 10 20 20 20 40
140ms: 10 10 20 20 40 40
150ms: 10 20 20 20 40 40
170ms: 10 20 20 40 40 40
180ms: 10 20 20 40 40 80
New samples are added on the left; data is read out from the right.
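To make the queue mechanics concrete, here is a minimal sketch of one possible implementation, assuming a fixed array with index 0 holding the newest pair, and a merge policy that combines the adjacent pair of smallest interval width, preferring pairs nearer the old end on ties. The names (`QUEUE_LEN`, `reduce`, `push`) and the exact policy are my own illustration; it reuses `minmax_pair_t` and `combine()` from the sketch above:

```c
#include <stdint.h>

#define QUEUE_LEN 6 /* 50 in the real system; 6 matches the example above */

/* Uses minmax_pair_t and combine() from the previous sketch.
 * Index 0 is the newest entry, index count-1 the oldest. */
static minmax_pair_t queue[QUEUE_LEN];
static int count = 0;

/* Free one slot by merging an adjacent pair of equal interval width,
 * choosing the smallest width and, on ties, the pair nearest the old
 * end.  This is only one plausible policy; the choice of pair controls
 * the weighting profile across the queue. */
static void reduce(void)
{
    int best = -1;
    for (int i = 0; i + 1 < count; i++) {
        if (queue[i].interval_ms != queue[i + 1].interval_ms)
            continue; /* merge equal widths only, to keep 10/20/40/80 steps */
        if (best < 0 || queue[i].interval_ms <= queue[best].interval_ms)
            best = i; /* '<=' prefers the candidate nearest the old end */
    }
    if (best < 0)
        best = count - 2; /* fallback: merge the two oldest entries */
    queue[best] = combine(queue[best], queue[best + 1]);
    for (int j = best + 1; j + 1 < count; j++) /* close the freed gap */
        queue[j] = queue[j + 1];
    count--;
}

/* Insert the newest 10ms min/max pair, reducing first if the queue is full. */
static void push(minmax_pair_t p)
{
    if (count == QUEUE_LEN)
        reduce();
    for (int i = count; i > 0; i--) /* shift everything toward the old end */
        queue[i] = queue[i - 1];
    queue[0] = p;
    count++;
}
```

Pushing one 10ms pair at a time into a full queue then yields a progression like the table above; the exact rows reached depend on how ties in the merge policy are broken.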
This idea obviously falls into the categories of lossy compression and sparse coding.
I assume this problem must occur often in data logging applications with limited uplink bandwidth, so some "standard" solution might have emerged.
I have deliberately simplified the description and left out other issues such as time stamping.
Questions:
- Are there existing algorithms which do this kind of data logging? I am not looking for standard lossy picture or video compression algorithms, but for something more specific to data logging as described above.
- What would be the most appropriate implementation for the queue? Linked list? Tree?
The term you are looking for is "lossy compression" (see http://en.wikipedia.org/wiki/Lossy_compression). The optimal compression method depends on various aspects, such as the distribution of your data.